Projects with this topic
This project is an end-to-end implementation of a data lakehouse on AWS. It processes raw IoT sensor data from mobile applications and physical devices into a curated dataset designed for training machine learning models.
Sparkify, a music streaming startup, has grown its user base and song database and is moving its data processing to the cloud. This project implements a high-performance ETL (Extract, Transform, Load) pipeline that extracts raw JSON logs and metadata from Amazon S3, stages the data in Amazon Redshift, and transforms it into a star schema optimized for song play analysis and business intelligence.
This architecture lets the analytics team explore user behavior, popular songs, and artist performance through high-concurrency, low-latency queries.
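The extract, stage, and transform flow described for the Sparkify pipeline can be sketched in a few lines. This is only an illustration under loud assumptions: an in-memory SQLite database stands in for Amazon Redshift, a hard-coded list stands in for JSON log files in Amazon S3, and every table name, column name, and sample record (`staging_events`, `songplays`, `users`, `songs`, and so on) is hypothetical rather than taken from the project.

```python
import json
import sqlite3

# Hypothetical raw log lines, standing in for JSON files stored in S3.
RAW_EVENTS = [
    '{"user_id": 1, "level": "free", "song": "Song A", "artist": "Artist X", "ts": 1000, "page": "NextSong"}',
    '{"user_id": 2, "level": "paid", "song": "Song B", "artist": "Artist Y", "ts": 1001, "page": "NextSong"}',
    '{"user_id": 1, "level": "free", "song": null, "artist": null, "ts": 1002, "page": "Home"}',
    '{"user_id": 1, "level": "free", "song": "Song B", "artist": "Artist Y", "ts": 1003, "page": "NextSong"}',
]


def run_etl(raw_events):
    """Stage raw events, then populate a small star schema (assumed layout)."""
    conn = sqlite3.connect(":memory:")  # SQLite stands in for Redshift here
    cur = conn.cursor()

    # Staging table mirrors the raw log structure.
    cur.execute(
        "CREATE TABLE staging_events "
        "(user_id INTEGER, level TEXT, song TEXT, artist TEXT, ts INTEGER, page TEXT)"
    )
    # Dimension tables.
    cur.execute("CREATE TABLE users (user_id INTEGER PRIMARY KEY, level TEXT)")
    cur.execute(
        "CREATE TABLE songs (song_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "title TEXT, artist TEXT, UNIQUE (title, artist))"
    )
    # Fact table: one row per song play.
    cur.execute(
        "CREATE TABLE songplays (songplay_id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "ts INTEGER, user_id INTEGER, song_id INTEGER)"
    )

    # Extract and load: parse each JSON line and insert it into staging.
    rows = [json.loads(line) for line in raw_events]
    cur.executemany(
        "INSERT INTO staging_events "
        "VALUES (:user_id, :level, :song, :artist, :ts, :page)",
        rows,
    )

    # Transform: fill the dimensions, then the fact table.
    # Rows are processed in timestamp order, so the latest level per user wins.
    cur.execute(
        "INSERT OR REPLACE INTO users "
        "SELECT user_id, level FROM staging_events ORDER BY ts"
    )
    cur.execute(
        "INSERT INTO songs (title, artist) "
        "SELECT DISTINCT song, artist FROM staging_events WHERE page = 'NextSong'"
    )
    cur.execute(
        "INSERT INTO songplays (ts, user_id, song_id) "
        "SELECT e.ts, e.user_id, s.song_id "
        "FROM staging_events e "
        "JOIN songs s ON s.title = e.song AND s.artist = e.artist "
        "WHERE e.page = 'NextSong'"
    )
    conn.commit()
    return conn


conn = run_etl(RAW_EVENTS)

# Example analytical query against the star schema: plays per song.
top_songs = conn.execute(
    "SELECT s.title, COUNT(*) AS plays "
    "FROM songplays p JOIN songs s ON s.song_id = p.song_id "
    "GROUP BY s.title ORDER BY plays DESC, s.title"
).fetchall()
print(top_songs)
```

In a real Redshift deployment the staging load would typically be a `COPY` from S3 rather than row-by-row inserts; the sketch only shows how the staged rows fan out into one fact table and its dimensions.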