Aho–Corasick

The Aho-Corasick algorithm is a string searching algorithm invented by Alfred V. Aho and Margaret J. Corasick. Similar to the extended version of the Rabin-Karp algorithm, it is capable of matching elements of a dictionary (set of words) within an input text. The idea behind it is to build a finite state machine that enables matching all strings of the dictionary simultaneously. The algorithm is linear in the length of the strings, plus the length of the searched text, plus the number of output matches. If n is the length of the searched text, m is the sum of the length of all words in the dictionary, and z is the total number of occurrences of words in the text.

Therefore, the time complexity of the Aho-Corasick algorithm is O(n + m + z). In this small section, we've looked at three other famous string matching algorithms. Without going into much detail about them, we've seen their applicability on different problems other than the one the Boyer-Moore algorithm solves. In particular, we've noted that there are algorithms specialized for the finding of a set of patterns in a text.

In 1979, Zvi Galil introduced an important optimization, called the Galil rule, that speeds up the comparisons done at each shift by skipping sections that are known to match. Using the Galil rule, the Boyer-Moore algorithm achieves linear time complexity in the worst case.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset