Class PatternSearch


  • public final class PatternSearch
    extends java.lang.Object
    A static utility class that performs a pattern search on a given column of a table. The pattern syntax is simple and follows the SQL standard.

    It works as follows: the '%' character represents any sequence of characters, including an empty one. The '_' character represents exactly one character.

    Therefore, the pattern 'Toby%' will find all rows that start with the string 'Toby' followed by any sequence of characters. The pattern 'T% Downer%' will find all names that start with 'T' and contain ' Downer' somewhere after it. The pattern '_at' will find all three-letter words ending with 'at'.

    NOTE: An 'ab%' type search is faster than a '%bc' type search. If the start of the search pattern is unknown, the entire contents of the column must be accessed.
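
The wild card semantics described above can be illustrated with a small stand-alone matcher. This is a minimal sketch, not the code of this class: the class name LikeSketch and the method like are hypothetical, and escape characters are deliberately ignored.

```java
// Hypothetical illustration of '%' and '_' semantics; not part of PatternSearch.
final class LikeSketch {

    // Returns true if the whole of 'value' matches 'pattern'.
    static boolean like(String pattern, String value) {
        return like(pattern, 0, value, 0);
    }

    private static boolean like(String p, int pi, String s, int si) {
        while (pi < p.length()) {
            char c = p.charAt(pi);
            if (c == '%') {
                // Consecutive '%' tokens are equivalent to a single one.
                while (pi < p.length() && p.charAt(pi) == '%') {
                    pi++;
                }
                if (pi == p.length()) {
                    return true; // a trailing '%' matches whatever is left
                }
                // Try every possible length for the sequence '%' stands for.
                for (int k = si; k <= s.length(); k++) {
                    if (like(p, pi, s, k)) {
                        return true;
                    }
                }
                return false;
            }
            if (si >= s.length()) {
                return false; // pattern needs a character but the value ended
            }
            if (c != '_' && c != s.charAt(si)) {
                return false; // literal mismatch ('_' accepts any one character)
            }
            pi++;
            si++;
        }
        return si == s.length();
    }
}
```

With this sketch, like("Toby%", "Toby Downer") and like("_at", "cat") are true, while like("_at", "chat") is false because '_' stands for exactly one character.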

    • Field Summary

      Fields 
      Modifier and Type Field Description
      private static char ONE_CHAR  
      private static char ZERO_OR_MORE_CHARS
      Statics for the tokens.
    • Constructor Summary

      Constructors 
      Constructor Description
      PatternSearch()  
    • Method Summary

      All Methods Static Methods Concrete Methods 
      Modifier and Type Method Description
      static boolean fullPatternMatch​(java.lang.String pattern, java.lang.String str, char escape_char)
      Matches a pattern against a string and returns true if it matches or false otherwise.
      private static boolean isWildCard​(char ch)
      Returns true if the given character is a wild card (unknown).
      static void main​(java.lang.String[] args)  
      static boolean patternMatch​(java.lang.String pattern, java.lang.String expression, char escape_char)
      This is the recursive pattern match method.
      (package private) static boolean regexMatch​(TransactionSystem system, java.lang.String pattern, java.lang.String value)
      Matches a string against a regular expression pattern.
      (package private) static IntegerVector regexSearch​(Table table, int column, java.lang.String pattern)
      Matches a column of a table against a constant regular expression pattern.
      (package private) static IntegerVector search​(Table table, int column, java.lang.String pattern)
      This is the search method.
      (package private) static IntegerVector search​(Table table, int column, java.lang.String pattern, char escape_char)
      This is the search method.
      static boolean testSearch​(java.lang.String pattern, java.lang.String expression, boolean result)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • PatternSearch

        public PatternSearch()
    • Method Detail

      • testSearch

        public static boolean testSearch​(java.lang.String pattern,
                                         java.lang.String expression,
                                         boolean result)
      • main

        public static void main​(java.lang.String[] args)
      • isWildCard

        private static boolean isWildCard​(char ch)
        Returns true if the given character is a wild card (unknown).
      • fullPatternMatch

        public static boolean fullPatternMatch​(java.lang.String pattern,
                                               java.lang.String str,
                                               char escape_char)
        Matches a pattern against a string and returns true if it matches or false otherwise. This matches patterns that do not necessarily start with a wild card unlike the 'patternMatch' method.
      • patternMatch

        public static boolean patternMatch​(java.lang.String pattern,
                                           java.lang.String expression,
                                           char escape_char)
        This is the recursive pattern match method. It recurses on each wild card expression in the pattern, which is slightly more efficient than an algorithm that recurses on each character. However, patterns such as "_%_a" will result in many recursive calls.

        Returns true if the pattern matches.

        NOTE: "_%_" is less efficient than "__%" but produces the same result.

        NOTE: This method requires that a wild card character is the first character of the pattern.

        ISSUE: Pattern optimiser. Wild cards of the form "%__" should be optimised to "__%", and "%__%_%_%" to "____%". The optimised forms are identical in result and more efficient. This optimisation could be performed by the client while parsing the LIKE statement.

        HACKING ISSUE: Badly formed wild cards may consume excessive server-side resources.
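
The optimisation described in the ISSUE note can be sketched as a rewrite of each run of adjacent wild cards: every '_' in the run is moved to the front and all '%' characters collapse into a single trailing '%'. This is an illustrative sketch only. The class name WildcardNormalizer and the method normalize are hypothetical, and the sketch assumes the pattern contains no escape character.

```java
// Hypothetical sketch of the wild card optimisation; not part of PatternSearch.
final class WildcardNormalizer {

    // Rewrites each run of '%' and '_' so that all '_' come first,
    // followed by at most one '%'. Literal characters are copied unchanged.
    // Assumes no escape character is in use.
    static String normalize(String pattern) {
        StringBuilder out = new StringBuilder(pattern.length());
        int i = 0;
        while (i < pattern.length()) {
            char c = pattern.charAt(i);
            if (c != '%' && c != '_') {
                out.append(c);
                i++;
                continue;
            }
            int singles = 0;        // number of '_' in this run
            boolean anyRun = false; // whether the run contains a '%'
            while (i < pattern.length()
                    && (pattern.charAt(i) == '%' || pattern.charAt(i) == '_')) {
                if (pattern.charAt(i) == '_') {
                    singles++;
                } else {
                    anyRun = true;
                }
                i++;
            }
            for (int k = 0; k < singles; k++) {
                out.append('_');
            }
            if (anyRun) {
                out.append('%');
            }
        }
        return out.toString();
    }
}
```

For the examples in the note, normalize("%__") gives "__%" and normalize("%__%_%_%") gives "____%". A pattern without adjacent wild cards, such as "Toby%", is returned unchanged.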

      • search

        static IntegerVector search​(Table table,
                                    int column,
                                    java.lang.String pattern)
        This is the search method. It requires a table to search, a column of the table, and a pattern. It returns the rows in the table that match the pattern, if any. Pattern searching only works successfully on columns of type Types.DB_STRING. The search works by first reducing the candidates to all cells that begin with the literal text at the start of the pattern. For example, the pattern "Toby% ___ner" first reduces the search to all rows between "Toby" and "Tobz", which improves efficiency.
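
The "Toby" to "Tobz" reduction can be sketched as follows: take the literal text before the first wild card, then form an exclusive upper bound by incrementing its last character. This is a minimal sketch under stated assumptions, not the implementation used here. The class PrefixRange and its methods are hypothetical, escape characters are ignored, and plain String.compareTo ordering is assumed, which may differ from the ordering the column actually uses.

```java
// Hypothetical sketch of the prefix range reduction; not part of PatternSearch.
final class PrefixRange {

    // Returns the literal text before the first wild card ("" if none).
    static String literalPrefix(String pattern) {
        int i = 0;
        while (i < pattern.length()
                && pattern.charAt(i) != '%' && pattern.charAt(i) != '_') {
            i++;
        }
        return pattern.substring(0, i);
    }

    // Returns the exclusive upper bound for values starting with 'prefix',
    // or null when this sketch cannot form one (empty prefix, or a last
    // character that cannot be incremented).
    static String upperBound(String prefix) {
        if (prefix.isEmpty()) {
            return null; // no known start: the whole column must be scanned
        }
        char last = prefix.charAt(prefix.length() - 1);
        if (last == Character.MAX_VALUE) {
            return null; // edge case not handled by this sketch
        }
        return prefix.substring(0, prefix.length() - 1) + (char) (last + 1);
    }

    // True if 'value' lies in the range [prefix, upperBound(prefix)).
    static boolean inRange(String value, String prefix) {
        String upper = upperBound(prefix);
        if (upper == null) {
            return true; // no reduction possible
        }
        return value.compareTo(prefix) >= 0 && value.compareTo(upper) < 0;
    }
}
```

Here literalPrefix("Toby% ___ner") is "Toby" and upperBound("Toby") is "Tobz". Only values inside that range would then need the full pattern match.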
      • search

        static IntegerVector search​(Table table,
                                    int column,
                                    java.lang.String pattern,
                                    char escape_char)
        This is the search method. It requires a table to search, a column of the table, and a pattern. It returns the rows in the table that match the pattern, if any. Pattern searching only works successfully on columns of type Types.DB_STRING. The search works by first reducing the candidates to all cells that begin with the literal text at the start of the pattern. For example, the pattern "Toby% ___ner" first reduces the search to all rows between "Toby" and "Tobz", which improves efficiency.
      • regexMatch

        static boolean regexMatch​(TransactionSystem system,
                                  java.lang.String pattern,
                                  java.lang.String value)
        Matches a string against a regular expression pattern. We use the regex library as specified in the DatabaseSystem configuration.
      • regexSearch

        static IntegerVector regexSearch​(Table table,
                                         int column,
                                         java.lang.String pattern)
        Matches a column of a table against a constant regular expression pattern. We use the regex library as specified in the DatabaseSystem configuration.