MOC - Data Engineering
Overview
About
This note indexes all notes related to Data Engineering: building systems for collecting, storing, and analyzing data at scale.
Core Areas
Data Pipelines
- ETL/ELT processes
- Workflow orchestration (Airflow, Prefect)
- Stream processing
Data Storage
- Data warehouses and lakes
- Distributed storage systems
- Data formats (Parquet, Avro)
Infrastructure
- Container orchestration
- Infrastructure as Code
- Monitoring and observability
Data Quality
- Data validation
- Testing pipelines
- Documentation and lineage
Related MOCs
Parent/Broader MOCs
- MOC - Computer Science - Theoretical foundations
- MOC - Development - Software engineering context
Sibling MOCs (Same Level)
- MOC - Data Science - Analytics and ML that consume pipelines
- MOC - Databases - Storage layer technologies
- MOC - Cloud - Infrastructure and managed services
Child/Specialized MOCs
- MOC - Geospatial - Spatial data engineering
Language-Specific MOCs
- MOC - Python - Primary data engineering language (Airflow, dbt, Polars)
- MOC - R - Analytics and pipeline integration (targets, arrow)
Notes
NOTE
Currently, there are individual notes with the #Topic/Data Engineering tag.
Appendix
Note created on 2025-12-31 and last modified on 2025-12-31.
(c) No Clocks, LLC | 2025