Algorithms on Strings Course Syllabus
Full curriculum breakdown — modules, lessons, estimated time, and outcomes.
Overview: This intermediate course covers essential string algorithms used in bioinformatics, text search, and pattern matching. Through hands-on programming assignments and video lectures, learners will explore suffix trees, the Burrows-Wheeler Transform, suffix arrays, and the Knuth-Morris-Pratt algorithm. The course spans approximately 14 hours of content, divided into four core modules and a final project, providing practical experience in algorithm implementation and application to real-world problems such as genomic sequence analysis and text compression.
Module 1: Suffix Trees
Estimated time: 4 hours
- Construct and traverse suffix trees for string representation
- Search for longest repeated substrings using suffix trees
- Apply suffix trees to pattern matching in text and DNA sequences
- Analyze time and space complexity of suffix tree algorithms
Module 2: Burrows-Wheeler Transform and Suffix Arrays
Estimated time: 4 hours
- Apply the Burrows-Wheeler Transform for text compression
- Reverse the BWT to reconstruct original strings
- Build and query suffix arrays for exact pattern matching
- Use BWT and suffix arrays in approximate string search applications
Module 3: Knuth-Morris-Pratt Algorithm
Estimated time: 3 hours
- Implement the KMP algorithm for linear-time pattern matching
- Construct prefix functions from pattern strings
- Optimize string search using failure function preprocessing
Module 4: Advanced Pattern Matching and Applications
Estimated time: 3 hours
- Apply string algorithms to genomic data analysis
- Solve real-world pattern matching problems in bioinformatics
- Optimize algorithm performance for large-scale string processing
Module 5: Final Project
Estimated time: 4 hours
- Design a pattern matching system using suffix arrays or BWT
- Implement and test the algorithm on biological sequence data
- Submit a working solution with documentation and performance analysis
Prerequisites
- Familiarity with basic data structures (arrays, trees, hash tables)
- Intermediate programming experience in Python or Java
- Understanding of algorithmic complexity and big-O notation
What You'll Be Able to Do After
- Implement core string algorithms including KMP, suffix trees, and BWT
- Perform efficient exact and approximate pattern matching in large texts
- Apply string algorithms to problems in bioinformatics and genomics
- Design and optimize solutions for text compression and search tasks
- Strengthen algorithmic problem-solving skills for technical interviews and research