Project preview
In this course, you’ll learn how asset-aware orchestrators make data pipelines easier to manage. You’ll use Dagster, an open-source orchestrator, to build a sample data pipeline.
Using data from NYC OpenData, you’ll build a data pipeline that:
- Extracts the data, stored in Parquet files, from NYC OpenData
- Loads it into a DuckDB database
- Transforms and prepares it for analysis
- Creates a visualization using the transformed data
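To make the shape of these four steps concrete, here is a minimal plain-Python sketch of the asset-aware idea. It is not Dagster's API and not the course's code: the `toy_asset` decorator, the `materialize_all` helper, the asset names, and the sample rows are all hypothetical, and in-memory lists stand in for the Parquet files and the DuckDB database. The one idea it borrows is that each asset names its upstream assets as function parameters, so the run order can be derived from the functions themselves.

```python
import inspect
from graphlib import TopologicalSorter

# Toy registry mapping asset name -> function (hypothetical, not Dagster's API).
ASSETS = {}


def toy_asset(fn):
    """Register a function as an asset; its parameter names are its upstream assets."""
    ASSETS[fn.__name__] = fn
    return fn


@toy_asset
def raw_trips():
    # Stand-in for extracting Parquet files from NYC OpenData (made-up rows).
    return [
        {"zone": "A", "distance": 1.5},
        {"zone": "B", "distance": 3.0},
        {"zone": "A", "distance": 2.5},
    ]


@toy_asset
def loaded_trips(raw_trips):
    # Stand-in for loading the rows into a database table.
    return list(raw_trips)


@toy_asset
def trips_by_zone(loaded_trips):
    # Transform: total distance per zone.
    totals = {}
    for row in loaded_trips:
        totals[row["zone"]] = totals.get(row["zone"], 0.0) + row["distance"]
    return totals


@toy_asset
def zone_chart(trips_by_zone):
    # Visualize: a text bar chart, one '#' per rounded unit of distance.
    return "\n".join(
        f"{zone} {'#' * round(total)}"
        for zone, total in sorted(trips_by_zone.items())
    )


def materialize_all():
    """Run every registered asset after the assets it depends on."""
    # Map each asset to the set of upstream assets named in its signature.
    graph = {
        name: set(inspect.signature(fn).parameters)
        for name, fn in ASSETS.items()
    }
    results = {}
    for name in TopologicalSorter(graph).static_order():
        upstream = {dep: results[dep] for dep in graph[name]}
        results[name] = ASSETS[name](**upstream)
    return results


if __name__ == "__main__":
    print(materialize_all()["zone_chart"])
```

Nothing in the sketch states the order in which the steps run; it falls out of the declared dependencies. That is the property this course refers to as asset-aware orchestration.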
If you get stuck or want to jump ahead, check out the finished project on GitHub.