Data Engineering, Big Data, and Machine Learning on GCP Course Syllabus
Full curriculum breakdown: modules, lessons, estimated time, and outcomes.
Overview
This course spans six modules totaling roughly 74 hours of study. Modules 1 through 4 follow Google Cloud's data engineering track: modernizing data lakes and data warehouses, building batch pipelines with Dataproc and Dataflow, building resilient streaming systems with Pub/Sub and Dataflow, and applying machine learning with BigQuery ML and Vertex AI. Module 5 prepares learners for the Google Cloud Professional Data Engineer certification exam, and Module 6 is a hands-on final project that combines batch, streaming, and machine learning components into an end-to-end pipeline. Plan for 8 to 17 hours per module; familiarity with Python and basic cloud computing concepts is assumed (see Prerequisites below).
Module 1: Modernizing Data Lakes and Data Warehouses with Google Cloud
Estimated time: 8 hours
- Differentiate between data lakes and data warehouses
- Explore use cases for data lakes and data warehouses
- Examine available GCP solutions for storage
- Discuss the role of a data engineer
- Analyze benefits of a successful data pipeline to business operations
Module 2: Building Batch Data Pipelines on Google Cloud
Estimated time: 17 hours
- Review data loading methods: EL, ELT, and ETL
- Run Hadoop on Dataproc and leverage Cloud Storage
- Optimize Dataproc jobs for performance and cost
- Build data processing pipelines using Dataflow
- Manage and monitor data pipeline performance
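The extract-transform-load pattern covered in this module can be sketched in plain Python. This is a toy in-memory pipeline, not Dataflow itself; the field names are invented for illustration, and in the course these stages map onto Cloud Storage (extract), Dataflow or Dataproc (transform), and BigQuery (load):

```python
# Toy batch ETL pipeline: extract -> transform -> load, all in memory.

def extract(rows):
    """Extract: parse raw CSV-style lines into records."""
    return [dict(zip(("user", "amount"), line.split(","))) for line in rows]

def transform(records):
    """Transform: cast types and drop malformed rows."""
    out = []
    for r in records:
        try:
            out.append({"user": r["user"], "amount": float(r["amount"])})
        except (KeyError, ValueError):
            continue  # skip rows that fail validation
    return out

def load(records, warehouse):
    """Load: append cleaned records to the warehouse (a list here)."""
    warehouse.extend(records)
    return warehouse

raw = ["alice,10.5", "bob,notanumber", "carol,3"]
warehouse = []
load(transform(extract(raw)), warehouse)
print(warehouse)  # the malformed 'bob' row is dropped
```

An ELT variant would simply call `load` before `transform`, deferring the cleanup to the warehouse; that ordering difference is the heart of the EL/ELT/ETL comparison in this module.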
Module 3: Building Resilient Streaming Analytics Systems on Google Cloud
Estimated time: 12 hours
- Design streaming data pipelines using Pub/Sub and Dataflow
- Implement real-time analytics solutions
- Ensure reliability and scalability in streaming systems
- Monitor and troubleshoot streaming data pipelines
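The windowed aggregation idea behind Pub/Sub and Dataflow streaming pipelines can be illustrated with a toy tumbling-window counter in plain Python. The event timestamps, keys, and window size below are invented; a real pipeline would use Beam's windowing primitives rather than this sketch:

```python
from collections import defaultdict

def tumbling_window_counts(events, window_secs=60):
    """Group (timestamp, key) events into fixed windows and count per key.

    A tumbling window assigns each event to exactly one window:
    window_start = timestamp - (timestamp % window_secs).
    """
    counts = defaultdict(int)
    for ts, key in events:
        window_start = ts - (ts % window_secs)
        counts[(window_start, key)] += 1
    return dict(counts)

events = [(5, "click"), (42, "click"), (61, "view"), (70, "click")]
print(tumbling_window_counts(events))
# events at t=5 and t=42 fall in window [0, 60); t=61 and t=70 in [60, 120)
```

Real streaming systems must also handle late and out-of-order events, which is where Dataflow's watermarks and triggers come in.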
Module 4: Smart Analytics, Machine Learning, and AI on Google Cloud
Estimated time: 12 hours
- Explore Google’s AI and machine learning tools
- Implement machine learning models using BigQuery ML
- Apply Vertex AI for model development and deployment
- Integrate AI solutions into data pipelines
- Understand ethical considerations in AI and machine learning
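BigQuery ML trains models with SQL rather than a separate ML framework. A minimal sketch of what such a statement looks like follows; the dataset, table, and column names are invented for illustration, and actually running it requires a GCP project and the google-cloud-bigquery client:

```python
# Hypothetical BigQuery ML statement: train a logistic regression model
# inside the warehouse. Dataset, table, and column names are invented.
create_model_sql = """
CREATE OR REPLACE MODEL `my_dataset.churn_model`
OPTIONS (model_type = 'logistic_reg', input_label_cols = ['churned']) AS
SELECT tenure_months, monthly_spend, churned
FROM `my_dataset.customers`
"""

# With the google-cloud-bigquery client this would be submitted as:
#   from google.cloud import bigquery
#   client = bigquery.Client()
#   client.query(create_model_sql).result()
print(create_model_sql.strip().splitlines()[0])
```

Keeping training in SQL means the model lives next to the data, which is the main workflow advantage this module explores before moving to Vertex AI for custom models.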
Module 5: Preparation for Google Cloud Professional Data Engineer Certification
Estimated time: 10 hours
- Review key data engineering concepts on GCP
- Practice exam-style questions and scenarios
- Identify core requirements for certification
Module 6: Final Project
Estimated time: 15 hours
- Design an end-to-end data pipeline on GCP
- Incorporate batch and streaming components
- Apply machine learning using BigQuery ML or Vertex AI
Prerequisites
- Familiarity with Python programming
- Basic understanding of cloud computing concepts
- Experience with data processing or analytics workflows
What You'll Be Able to Do After This Course
- Understand the roles and responsibilities of a data engineer
- Design and build data processing systems on Google Cloud Platform (GCP)
- Build end-to-end data pipelines using GCP tools and services
- Analyze data and carry out machine learning tasks on GCP
- Prepare for the Google Cloud Professional Data Engineer certification