Algorithms for DNA Sequencing Course Syllabus

Full curriculum breakdown: modules, lessons, estimated time, and outcomes.

This course balances biological context with computational technique, working with real DNA sequencing data in Python. The four core modules take approximately 13 hours, and the final project adds an estimated 3 more. Learners progress from foundational to advanced topics in bioinformatics, pairing theory with hands-on implementation, so that those from either a computer science or a biology background can build interdisciplinary skills in genome analysis and algorithm application.

Module 1: DNA Sequencing, Strings, and Matching

Estimated time: 4 hours

  • Overview of DNA sequencing technologies
  • Genome representation as strings
  • Understanding sequencing errors and quality scoring (FASTQ format)
  • Implementation of naive exact string matching in Python
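To give a sense of the kind of code Module 1 leads to, here is a minimal sketch of naive exact matching, plus a helper that decodes a FASTQ quality character. The function names `naive_match` and `phred33_to_q` are illustrative and not taken from the course materials, and the quality helper assumes the common Phred+33 encoding.

```python
def naive_match(pattern, text):
    """Return every 0-based offset where pattern occurs exactly in text."""
    occurrences = []
    # Try each alignment of the pattern against the text, left to right.
    for i in range(len(text) - len(pattern) + 1):
        if text[i:i + len(pattern)] == pattern:
            occurrences.append(i)
    return occurrences


def phred33_to_q(char):
    """Convert one FASTQ quality character to its integer score (assumes Phred+33)."""
    return ord(char) - 33


if __name__ == "__main__":
    print(naive_match("AG", "AGCTTAGATAG"))  # [0, 5, 9]
    print(phred33_to_q("I"))                 # 40
```

The naive approach checks every alignment, so its worst-case cost grows with the product of the pattern and text lengths, which is what motivates the preprocessing and indexing techniques in Module 2.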

Module 2: Preprocessing, Indexing, and Approximate Matching

Estimated time: 3 hours

  • Application of the Boyer-Moore algorithm
  • Building k-mer indices and hash tables for genome search
  • Understanding approximate matches using the pigeonhole principle
  • Introduction to Hamming distance and edit distance
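The indexing and approximate-matching topics above can be sketched together. The example below builds a k-mer index, computes Hamming distance, and uses the pigeonhole principle: if a pattern may contain at most d mismatches and is split into d + 1 pieces, at least one piece must match exactly. All function names are hypothetical, and the sketch handles substitutions only (not insertions or deletions).

```python
from collections import defaultdict


def build_kmer_index(text, k):
    """Map every length-k substring of text to the list of offsets where it occurs."""
    index = defaultdict(list)
    for i in range(len(text) - k + 1):
        index[text[i:i + k]].append(i)
    return index


def hamming_distance(a, b):
    """Count mismatched positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def approximate_match(pattern, text, max_mismatches):
    """Offsets where pattern matches text with at most max_mismatches substitutions."""
    n_parts = max_mismatches + 1
    part_len = len(pattern) // n_parts
    if part_len == 0:
        raise ValueError("pattern too short for this many mismatches")
    index = build_kmer_index(text, part_len)
    hits = set()
    for p in range(n_parts):
        start = p * part_len
        piece = pattern[start:start + part_len]
        # An exact hit for one piece suggests a candidate alignment for the whole pattern.
        for pos in index.get(piece, []):
            candidate = pos - start
            if candidate < 0 or candidate + len(pattern) > len(text):
                continue
            window = text[candidate:candidate + len(pattern)]
            # Verify the full alignment, since the other pieces may still mismatch.
            if hamming_distance(pattern, window) <= max_mismatches:
                hits.add(candidate)
    return sorted(hits)


if __name__ == "__main__":
    print(approximate_match("ACGA", "ACGTACGTTTACGA", 1))  # [0, 4, 10]
```

Using a set for the hits avoids reporting the same alignment twice when more than one piece matches exactly at that position.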

Module 3: Edit Distance, Assembly, and Overlaps

Estimated time: 3 hours

  • Dynamic programming for edit distance calculation
  • Local and global sequence alignment
  • Principles of shotgun sequencing and read overlaps
  • Construction and analysis of overlap graphs
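As an illustration of the Module 3 topics, the sketch below fills the standard dynamic programming table for edit distance and builds a small overlap graph from suffix/prefix overlaps between reads. The names `edit_distance`, `overlap`, and `overlap_graph` are illustrative, and the overlap helper assumes every read is at least `min_length` characters long.

```python
from itertools import permutations


def edit_distance(x, y):
    """Minimum number of substitutions, insertions, and deletions turning x into y."""
    rows, cols = len(x) + 1, len(y) + 1
    table = [[0] * cols for _ in range(rows)]
    # Converting a prefix to the empty string costs one edit per character.
    for i in range(rows):
        table[i][0] = i
    for j in range(cols):
        table[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = table[i - 1][j - 1] + (x[i - 1] != y[j - 1])
            table[i][j] = min(diagonal, table[i - 1][j] + 1, table[i][j - 1] + 1)
    return table[-1][-1]


def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that is a prefix of b, or 0 if shorter than min_length."""
    start = 0
    while True:
        # Look for the first min_length characters of b somewhere in a.
        start = a.find(b[:min_length], start)
        if start == -1:
            return 0
        # The rest of a from that point must be a prefix of b.
        if b.startswith(a[start:]):
            return len(a) - start
        start += 1


def overlap_graph(reads, min_length=3):
    """Return {(a, b): overlap_length} for every ordered pair of distinct reads that overlap."""
    graph = {}
    for a, b in permutations(reads, 2):
        length = overlap(a, b, min_length)
        if length > 0:
            graph[(a, b)] = length
    return graph


if __name__ == "__main__":
    print(edit_distance("ACGT", "AGT"))               # 1
    print(overlap_graph(["ACGGT", "GGTAC", "TACCA"]))
```

Comparing all ordered pairs of reads is quadratic in the number of reads, so this is only practical for small inputs.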

Module 4: Algorithms for Assembly

Estimated time: 3 hours

  • Shortest common superstring and greedy algorithms
  • Introduction to de Bruijn graphs and their application in genome assembly
  • Eulerian paths and practical genome assembly considerations
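The de Bruijn graph idea above can be sketched in a few lines. In this hedged example each k-mer contributes one edge from its left (k-1)-mer to its right (k-1)-mer, and an Eulerian path (found here with Hierholzer's algorithm, one common choice) visits every edge once and spells out a candidate sequence. The function names are hypothetical, and the sketch assumes a non-empty, error-free input whose graph actually has an Eulerian path. Real assemblers must also cope with sequencing errors, repeats, and uneven coverage.

```python
from collections import defaultdict


def de_bruijn_graph(text, k):
    """Build a multigraph with one edge per k-mer: left (k-1)-mer -> right (k-1)-mer."""
    graph = defaultdict(list)
    for i in range(len(text) - k + 1):
        kmer = text[i:i + k]
        graph[kmer[:-1]].append(kmer[1:])
    return graph


def eulerian_path(graph):
    """Return a list of nodes that walks every edge exactly once (Hierholzer's algorithm)."""
    in_degree = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_degree[target] += 1
    nodes = set(graph) | set(in_degree)
    # A path (rather than a cycle) starts at the node with one more outgoing edge than incoming.
    start = next(
        (n for n in nodes if len(graph.get(n, [])) - in_degree.get(n, 0) == 1),
        None,
    )
    if start is None:
        start = next(iter(graph))  # balanced graph: any node with an edge works
    remaining = {node: list(targets) for node, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def spell(path):
    """Concatenate the first node with the last character of each following node."""
    return path[0] + "".join(node[-1] for node in path[1:])


if __name__ == "__main__":
    walk = eulerian_path(de_bruijn_graph("ACGTCA", 3))
    print(spell(walk))  # ACGTCA
```

When the input contains repeats, more than one Eulerian path may exist, so the spelled sequence is guaranteed to have the same k-mers as the original but not necessarily to equal it.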

Module 5: Final Project

Estimated time: 3 hours

  • Apply string matching and indexing techniques to real sequencing data
  • Implement alignment and edit distance algorithms
  • Perform genome assembly using de Bruijn or overlap graphs

Prerequisites

  • Basic familiarity with Python programming
  • Introductory knowledge of algorithms and data structures
  • Some exposure to biological concepts (helpful but not required)

What You'll Be Able to Do After the Course

  • Understand the core principles of DNA sequencing and its computational challenges
  • Implement and apply string matching and alignment algorithms to genomic data
  • Calculate and interpret Hamming and edit distances for sequence comparison
  • Build and use k-mer indexing, suffix arrays, and overlap graphs for genome analysis
  • Perform genome assembly using de Bruijn graphs and evaluate results