Building a Machine Learning Pipeline from Scratch Course Syllabus

Full curriculum breakdown — modules, lessons, estimated time, and outcomes.

Overview: This hands-on, project-driven course introduces building end-to-end machine learning pipelines from scratch. You'll learn to turn experimental code into production-grade systems using software engineering best practices, all within a browser-based interactive environment that requires no setup. In approximately 4 hours of content, the course covers designing, structuring, testing, and extending ML pipelines. Each module pairs foundational concepts with immediate coding exercises to reinforce learning.

Module 1: Course Goals & Structure

Estimated time: 0.2 hours

  • Intended audience and prerequisites
  • Course goals and learning outcomes
  • Structure and navigation
  • Strengths of the interactive format

Module 2: Getting Started

Estimated time: 0.3 hours

  • Why use ML pipelines over notebooks
  • Defining ML training pipelines
  • Understanding pipeline components
  • Completing the Getting Started quiz

Module 3: Structuring the ML Pipeline

Estimated time: 0.5 hours

  • System architecture for ML pipelines
  • Directory layout and code organization
  • Dependency management
  • Project scaffolding

Module 4: Directed Acyclic Graphs (DAGs)

Estimated time: 0.3 hours

  • DAG fundamentals in pipeline orchestration
  • Topological sorting of tasks
  • Implementing a DAG for workflow control
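To give a feel for the topological sorting topic in this module, here is a minimal sketch using Kahn's algorithm. It is not taken from the course materials: the function name `topological_order` and the stage names in `PIPELINE` are assumptions chosen for illustration.

```python
from collections import deque


def topological_order(dependencies):
    """Return tasks in an order where every task follows its dependencies.

    `dependencies` maps each task name to the list of tasks it depends on.
    Raises ValueError if the graph contains a cycle.
    """
    # Collect every task, including ones that only appear as dependencies.
    tasks = set(dependencies)
    for deps in dependencies.values():
        tasks.update(deps)

    # Count unmet dependencies per task and record who depends on whom.
    remaining = {task: len(set(dependencies.get(task, ()))) for task in tasks}
    dependents = {task: [] for task in tasks}
    for task, deps in dependencies.items():
        for dep in set(deps):
            dependents[dep].append(task)

    # Start with tasks that have no dependencies (sorted for a stable order).
    ready = deque(sorted(t for t, n in remaining.items() if n == 0))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        for child in sorted(dependents[task]):
            remaining[child] -= 1
            if remaining[child] == 0:
                ready.append(child)

    # Any task left unscheduled is part of a cycle.
    if len(order) != len(tasks):
        raise ValueError("dependency graph contains a cycle")
    return order


# Hypothetical pipeline stages, for illustration only.
PIPELINE = {
    "load_data": [],
    "preprocess": ["load_data"],
    "train": ["preprocess"],
    "evaluate": ["train", "preprocess"],
    "report": ["evaluate"],
}
```

Calling `topological_order(PIPELINE)` yields an order in which each stage runs only after the stages it depends on. Python 3.9+ also ships `graphlib.TopologicalSorter` in the standard library, which may be preferable to a hand-written version outside a learning context.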

Module 5: Building the ML Library

Estimated time: 0.8 hours

  • Object-oriented programming for ML components
  • Using OmegaConf for configuration management
  • Designing abstract base classes for datasets, models, and reports
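The abstract base class idea listed above might look roughly like the following sketch. The class and method names (`Dataset.load`, `Model.fit`, `Model.predict`, `ToyDataset`, `MajorityClassModel`) are assumptions for illustration, not the course's actual interfaces; a report base class would presumably follow the same pattern.

```python
from abc import ABC, abstractmethod


class Dataset(ABC):
    """Interface every dataset in the (hypothetical) library implements."""

    @abstractmethod
    def load(self):
        """Return a (features, labels) pair."""


class Model(ABC):
    """Interface every model in the (hypothetical) library implements."""

    @abstractmethod
    def fit(self, features, labels):
        """Train on the given data and return self."""

    @abstractmethod
    def predict(self, features):
        """Return one prediction per feature row."""


class ToyDataset(Dataset):
    """Tiny in-memory dataset, for illustration only."""

    def load(self):
        return [[0.0], [1.0], [2.0], [3.0]], [0, 1, 1, 1]


class MajorityClassModel(Model):
    """Baseline that always predicts the most frequent training label."""

    def __init__(self):
        self.majority = None

    def fit(self, features, labels):
        self.majority = max(set(labels), key=labels.count)
        return self

    def predict(self, features):
        return [self.majority for _ in features]
```

Because the pipeline code only depends on the `Dataset` and `Model` interfaces, swapping in a new dataset or model means adding a subclass rather than editing the pipeline itself. Attempting to instantiate `Dataset()` directly raises `TypeError`, which catches incomplete implementations early.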

Module 6: The Pipeline Core

Estimated time: 0.8 hours

  • Command-line interface parsing with argparse
  • Experiment tracking integration
  • Logging and docstrings for maintainability
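As a rough illustration of how command-line parsing and logging can fit together in a pipeline entry point, consider the sketch below. The flag names (`--config`, `--dataset`, `--log-level`) and the logger name are assumptions, not the course's actual CLI.

```python
import argparse
import logging

logger = logging.getLogger("pipeline")


def build_parser():
    """Create the argument parser for the (hypothetical) pipeline CLI."""
    parser = argparse.ArgumentParser(
        description="Run a training pipeline (illustrative sketch)."
    )
    parser.add_argument(
        "--config", default="config.yaml", help="Path to a configuration file."
    )
    parser.add_argument(
        "--dataset", required=True, help="Name of the dataset to train on."
    )
    parser.add_argument(
        "--log-level",
        default="INFO",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        help="Verbosity of log output.",
    )
    return parser


def main(argv=None):
    """Parse arguments, configure logging, and return the parsed namespace."""
    args = build_parser().parse_args(argv)
    logging.basicConfig(
        level=getattr(logging, args.log_level),
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    logger.info(
        "Running pipeline on dataset %s with config %s", args.dataset, args.config
    )
    return args
```

Accepting `argv` as a parameter keeps `main` easy to call from tests, for example `main(["--dataset", "iris"])`; when `argv` is `None`, argparse falls back to the real command line.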

Module 7: Extending the Pipeline

Estimated time: 0.5 hours

  • Adding support for new datasets
  • Extending to new model types
  • Hands-on extension to a second dataset

Module 8: Testing

Estimated time: 0.5 hours

  • Unit testing principles
  • Using pytest for function validation
  • System testing of pipeline components
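A unit test in the pytest style could look like the sketch below. The helper `train_test_split_indices` is a hypothetical function invented for this example; the test functions rely only on plain `assert` statements and the `test_` naming convention that pytest uses for discovery.

```python
def train_test_split_indices(n_samples, test_fraction):
    """Hypothetical helper: split range(n_samples) into train and test indices."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    n_test = int(round(n_samples * test_fraction))
    indices = list(range(n_samples))
    # The last n_test indices form the test set; the rest form the train set.
    return indices[: n_samples - n_test], indices[n_samples - n_test:]


def test_split_sizes():
    train, test = train_test_split_indices(10, 0.2)
    assert len(train) == 8
    assert len(test) == 2


def test_split_has_no_overlap():
    train, test = train_test_split_indices(10, 0.3)
    assert set(train).isdisjoint(test)
    assert sorted(train + test) == list(range(10))
```

Saved in a file whose name starts with `test_`, these functions are collected and run by the `pytest` command. Each test checks one property of the helper, which makes a failure easier to trace than a single large end-to-end check.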

Prerequisites

  • Familiarity with Python programming
  • Basic understanding of machine learning concepts
  • Experience with Jupyter notebooks (helpful but not required)

What You'll Be Able to Do After

  • Design and structure production-ready ML pipelines
  • Orchestrate workflows using Directed Acyclic Graphs (DAGs)
  • Build reusable and modular ML components
  • Implement logging, configuration, and CLI interfaces
  • Write and run tests for ML pipeline functions