This is the third in a series of computer science lessons about string matching algorithms and how to implement them. String matching is not just about text processing, it has applications in the fields of artificial intelligence, cybersecurity and computational biology, to name but a few. String matching algorithms include the naive algorithm, KMP, Boyer Moore and Rabin Karp. Some of these are particularly suited to specialist applications like DNA sequencing, data mining and network intrusion detection systems.
In the previous two lessons of this series, you learned about a naive approach to the problem of searching for one string of characters within another. In this lesson you will be introduced to the Knuth, Morris, Pratt algorithm, known more simply as the KMP algorithm.
Some of the key observations that led to the development of the KMP algorithm are illustrated in this lesson, and you will begin to see how the KMP algorithm can address the limitations of the naive approach. You will learn that a crucial feature of the KMP algorithm is that it utilises advanced information about the composition of the pattern it is attempting to find. Specifically, it knows something about the way sequences of characters are repeated within the target pattern, and it can apply this information whenever a partial match occurs.
In the lesson that follows this one, you will learn more about the pattern preprocessing phase of the KMP algorithm. That is, you’ll learn how to capture and store relevant information about the structure of any pattern so that it can be referred to during a search.
Chapters:
00:00 Discovery of the KMP algorithm
01:27 Search for a pattern of unique characters
05:50 Search for a pattern of non-unique characters