This is the fifth in a series of computer science lessons about string matching algorithms and how to implement them. String matching is not just about text processing, it has applications in the fields of artificial intelligence, cybersecurity and computational biology, to name but a few. String matching algorithms include the naive algorithm, KMP, Boyer Moore and Rabin Karp. Some of these are particularly suited to specialist applications like DNA sequencing, data mining and network intrusion detection systems.
In the previous lesson you learned about the pattern preprocessing phase of the KMP algorithm and saw how to construct a partial match table by inspection. In this lesson, you’ll learn how to write a program to implement the pattern preprocessing phase of the KMP algorithm, a program that is able to generate LPS information for any pattern, programmatically. You will first see a brut force approach to the implementation, which includes several nested loops and therefore does not scale well with larger patterns. You will also see the optimal approach that was developed by Knuth, Morris and Pratt. The programs are described with pseudocode first and demonstrated in VB.NET. The lesson also includes equivalent code in C# and Python.
Before continuing, make sure you are familiar with the nature of the LPS information captured during the pattern processing phase. You should already be able to construct a partial match table for a given pattern by inspection. Please review the previous lessons if needs be. It is also assumed that you already have some programming experience.
in the previous lesson.
Chapters:
00:00 Introduction
00:40 Sample pattern
02:01 The brut force approach
04:04 The LPS array
04:46 The KMP approach
10:21 The KMP insight
10:50 Resetting the len variable
14:18 Brut force code in VB.NET
18:40 Brut force code in C#
18:54 Brut force code in Python
19:10 Optimal code in VB.NET
21:00 Optimal code in C#
21:08 Optimal code in Python