Data Engineering Foundations Specialization Course Syllabus
Full curriculum breakdown — modules, lessons, estimated time, and outcomes.
This specialization is organized into six modules that progress from core concepts to a hands-on capstone. It opens with the data engineering landscape and the data lifecycle, then builds practical skills with relational databases and SQL, NoSQL stores such as MongoDB, and ETL pipelines built with shell tooling, Apache Airflow, and Kafka-style stream simulations. A module on data warehouses, data lakes, and business intelligence connects pipelines to analytics, and a final project ties everything together in an end-to-end pipeline. The estimated time commitment is about 68 hours in total; per-module estimates are listed below so you can pace yourself.
Module 1: Introduction to Data Engineering
Estimated time: 6 hours
- Data engineer roles and responsibilities
- Understanding the data lifecycle
- Foundations of data architecture
- Case studies in real-world data engineering
Module 2: Introduction to Relational Databases (RDBMS)
Estimated time: 12 hours
- SQL basics and syntax
- Entity-Relationship (ER) diagrams
- Database normalization principles
- Working with indexes and querying tables
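The course teaches SQL on a full RDBMS; as a quick taste of the basics covered above, here is a minimal sketch using Python's built-in sqlite3 module (the table, index, and data are illustrative, not taken from the course):

```python
import sqlite3

# In-memory database: no server needed for a quick SQL demo.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a table and an index on a queried column (illustrative schema).
cur.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
cur.execute("CREATE INDEX idx_dept ON employees (dept)")

# Insert rows, then query with grouping and ordering.
cur.executemany(
    "INSERT INTO employees (name, dept) VALUES (?, ?)",
    [("Ada", "engineering"), ("Grace", "engineering"), ("Edgar", "research")],
)
cur.execute("SELECT dept, COUNT(*) FROM employees GROUP BY dept ORDER BY dept")
print(cur.fetchall())  # [('engineering', 2), ('research', 1)]
conn.close()
```

The same CREATE, INSERT, and SELECT statements carry over to server-based systems such as PostgreSQL or MySQL with only minor dialect differences.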
Module 3: Introduction to NoSQL Databases
Estimated time: 12 hours
- Types of NoSQL databases: document, key-value, column, graph
- Working with JSON data structures
- Using MongoDB for non-relational data storage
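Document databases like MongoDB store records as JSON-style documents with nested fields and arrays. As a minimal sketch of the structures this module works with, using only Python's standard json module (the field names are illustrative):

```python
import json

# A document-style record: nested objects and arrays, the shape
# MongoDB stores in a collection (hypothetical fields).
doc = {
    "user": "ada",
    "roles": ["admin", "engineer"],
    "profile": {"city": "London", "active": True},
}

# Serialize to a JSON string and parse it back.
text = json.dumps(doc)
restored = json.loads(text)

print(restored["profile"]["city"])  # London
print(len(restored["roles"]))      # 2
```

Unlike a normalized relational schema, related data here lives inside one document, which is the trade-off the module explores.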
Module 4: ETL and Data Pipelines with Shell, Airflow, and Kafka
Estimated time: 18 hours
- Data ingestion techniques
- Transformation processes in ETL
- Scheduling pipelines with Apache Airflow
- Stream processing with Kafka simulations
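The extract-transform-load stages above can be sketched as plain Python functions. In the course these stages would be wired into an Airflow DAG and fed by shell scripts or Kafka-style streams; the sketch below (hypothetical data and field names) only shows the shape of a single batch ETL run:

```python
import csv
import io

def extract(raw_csv: str) -> list[dict]:
    # Ingest: parse raw CSV text into a list of row dictionaries.
    return list(csv.DictReader(io.StringIO(raw_csv)))

def transform(rows: list[dict]) -> list[dict]:
    # Transform: normalize names and cast numeric fields.
    return [
        {"name": r["name"].strip().title(), "amount": float(r["amount"])}
        for r in rows
    ]

def load(rows: list[dict], target: list) -> None:
    # Load: append to the target store (a list stands in for a warehouse table).
    target.extend(rows)

warehouse: list[dict] = []
raw = "name,amount\n alice ,10.5\n BOB ,2\n"
load(transform(extract(raw)), warehouse)
print(warehouse)  # [{'name': 'Alice', 'amount': 10.5}, {'name': 'Bob', 'amount': 2.0}]
```

An orchestrator such as Airflow adds what this sketch lacks: scheduling, retries, and dependency tracking between the three stages.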
Module 5: Data Warehouses, Lakes, and Business Intelligence
Estimated time: 10 hours
- Introduction to data warehouses and data lakes
- Role of ETL in data integration
- Connecting pipelines to business intelligence systems
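A data warehouse serves aggregated views to BI tools. As a minimal sketch, with an in-memory sqlite3 database standing in for the warehouse (the schema and figures are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A tiny fact table, as an ETL job might load it into a warehouse.
cur.execute("CREATE TABLE sales (region TEXT, amount REAL)")
cur.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 75.0)],
)

# The aggregated view a BI dashboard would query.
cur.execute("SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region")
print(cur.fetchall())  # [('east', 150.0), ('west', 75.0)]
conn.close()
```

BI systems typically issue exactly this kind of GROUP BY aggregation against warehouse tables that pipelines keep up to date.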
Module 6: Final Project
Estimated time: 10 hours
- Design a complete data pipeline from source to analysis
- Use both relational and NoSQL databases
- Implement a scheduled ETL workflow using Airflow
Prerequisites
- Basic computer literacy
- Familiarity with command-line interface (CLI)
- No prior programming experience required
What You'll Be Able to Do After Completing This Course
- Explain core data engineering concepts and roles
- Write and execute SQL queries on relational databases
- Store and query data using NoSQL databases like MongoDB
- Build and schedule ETL pipelines using Apache Airflow
- Process streaming data using Kafka-inspired simulations