Applied Data Science Capstone Course Syllabus
Full curriculum breakdown: modules, lessons, estimated time, and outcomes.
Overview: This capstone course is a hands-on opportunity to apply the end-to-end data science methodology to a real-world problem. Over approximately 5 weeks on a flexible schedule, learners work through the key stages of a data science project: data collection, wrangling, visualization, modeling, and final presentation. Each module builds on the previous one, culminating in a comprehensive project that showcases practical skills in Python, machine learning, and data storytelling. The six modules below are estimated at about 5 hours each, or roughly 30 hours in total.
Module 1: Introduction and Data Collection
Estimated time: 5 hours
- Understand the project context and objectives
- Identify relevant data sources for analysis
- Access data using APIs
- Extract data via web scraping with BeautifulSoup
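As a rough illustration of the two collection techniques listed above, the following sketch decodes a JSON body of the kind an API might return and pulls rows out of an HTML table with BeautifulSoup. The payload, the HTML snippet, the `records` table id, and the field names are all invented for this example; in a real project both inputs would be fetched over the network (for instance with an HTTP client), which is omitted here so the sketch runs offline.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical API response body; a real one would come from an HTTP request.
API_BODY = (
    '{"results": ['
    '{"id": 1, "site": "A", "success": true}, '
    '{"id": 2, "site": "B", "success": false}]}'
)

# Hypothetical HTML page containing a small table to scrape.
HTML_PAGE = """
<table id="records">
  <tr><th>Id</th><th>Site</th></tr>
  <tr><td>3</td><td>C</td></tr>
  <tr><td>4</td><td>D</td></tr>
</table>
"""


def parse_api_records(body):
    """Decode a JSON body and return its list of records."""
    return json.loads(body)["results"]


def scrape_table_rows(html):
    """Return each data row of the 'records' table as a dict keyed by header."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", id="records")
    headers = [th.get_text(strip=True).lower() for th in table.find_all("th")]
    rows = []
    for tr in table.find_all("tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:  # the header row has no <td> cells, so it is skipped
            rows.append(dict(zip(headers, cells)))
    return rows


api_records = parse_api_records(API_BODY)
scraped_records = scrape_table_rows(HTML_PAGE)
```

Note that scraped cell values arrive as strings, while JSON values keep their types; reconciling the two is typically part of the wrangling stage.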
Module 2: Data Wrangling and Exploration
Estimated time: 5 hours
- Clean and preprocess raw data
- Handle missing values and data inconsistencies
- Perform exploratory data analysis (EDA)
- Apply statistical methods and visualizations to uncover patterns
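The wrangling steps above might look something like the following minimal sketch in Pandas. The DataFrame contents and the column names (`site`, `payload_kg`, `success`) are made up for illustration, and median imputation is only one of several reasonable ways to handle missing numeric values.

```python
import pandas as pd

# Hypothetical raw records with a missing value, inconsistent labels,
# and a duplicated row.
raw = pd.DataFrame(
    {
        "site": ["A", "a ", "B", "B", None],
        "payload_kg": [500.0, None, 1200.0, 1200.0, 800.0],
        "success": [1, 0, 1, 1, 0],
    }
)


def clean(df):
    """Return a cleaned copy of the raw records."""
    out = df.copy()
    # Normalize inconsistent text labels (stray whitespace, mixed case).
    out["site"] = out["site"].str.strip().str.upper()
    # Drop rows whose key field is missing.
    out = out.dropna(subset=["site"]).copy()
    # Fill missing numeric values with the column median.
    out["payload_kg"] = out["payload_kg"].fillna(out["payload_kg"].median())
    # Remove exact duplicate rows and renumber the index.
    return out.drop_duplicates().reset_index(drop=True)


cleaned = clean(raw)

# A first exploratory summary: mean success rate per site.
success_rate_by_site = cleaned.groupby("site")["success"].mean()
```

From here, `cleaned.describe()` and simple group-wise summaries such as `success_rate_by_site` are a common starting point for exploratory analysis.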
Module 3: Data Visualization and Dashboarding
Estimated time: 5 hours
- Create informative visualizations using Matplotlib and Seaborn
- Build interactive maps with Folium
- Develop dynamic dashboards using Plotly Dash
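For the static-visualization part of this module, a minimal Matplotlib sketch could look like the one below. The site labels and success rates are illustrative values only, and the non-interactive `Agg` backend is chosen here as an assumption so the script runs without a display. Seaborn, Folium, and Plotly Dash are not shown.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; no window is opened
import matplotlib.pyplot as plt

# Hypothetical summary data: success rate per site.
sites = ["A", "B", "C"]
success_rates = [0.5, 1.0, 0.75]

fig, ax = plt.subplots(figsize=(5, 3))
bars = ax.bar(sites, success_rates, color="steelblue")
ax.set_ylim(0, 1)
ax.set_xlabel("Site")
ax.set_ylabel("Success rate")
ax.set_title("Success rate by site (illustrative data)")
fig.tight_layout()
# fig.savefig("success_rate_by_site.png")  # uncomment to write the chart to disk
```

Labeling both axes and stating in the title that the data is illustrative keeps the chart readable on its own, which matters once figures are lifted into a report or slide deck.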
Module 4: Machine Learning and Model Evaluation
Estimated time: 5 hours
- Build classification models using scikit-learn
- Train and test models including Decision Trees, K-Nearest Neighbors, and Support Vector Machines
- Evaluate model performance using accuracy, precision, recall, and F1-score
- Compare models and refine for better performance
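The train, evaluate, and compare loop described above can be sketched roughly as follows. This is not the course's own project code: the dataset is synthetic (generated with `make_classification`), the hyperparameters are arbitrary starting values, and ranking models by F1-score is just one possible selection rule.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data standing in for the project dataset.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# KNN and SVM are distance-based, so they are paired with feature scaling.
models = {
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "svm": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
}

# Fit each model on the training split and score it on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred),
        "recall": recall_score(y_test, pred),
        "f1": f1_score(y_test, pred),
    }

# Pick the model with the highest F1-score on the test split.
best_model = max(scores, key=lambda name: scores[name]["f1"])
```

Refinement would typically continue from here with cross-validated hyperparameter search (for example `GridSearchCV`) rather than repeated tuning against the same test split.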
Module 5: Final Report and Presentation
Estimated time: 5 hours
- Compile findings into a comprehensive report
- Summarize methodology, results, and insights
- Prepare a stakeholder-ready presentation
Module 6: Final Project
Estimated time: 5 hours
- Deliverable 1: Complete data collection and cleaning script
- Deliverable 2: Interactive dashboard and visualization notebook
- Deliverable 3: Final report and model evaluation summary
Prerequisites
- Proficiency in Python programming
- Familiarity with data analysis using Pandas
- Understanding of machine learning concepts and techniques
What You'll Be Able to Do After This Course
- Apply the full data science lifecycle to real-world problems
- Collect and clean data from APIs and web sources
- Create interactive visualizations and dashboards
- Build, compare, and evaluate classification models
- Demonstrate end-to-end project skills with a shareable portfolio piece