Projects with this topic
A project built around a simulated music streaming app (Spotify, for example) that generates live user events, for practicing real-time streaming of large volumes of data.
Stack Exchange releases "data dumps" of all its publicly available content roughly every three months via archive.org.
This project is both an example of and a framework for building ETL pipelines for this data with Apache Spark and Java.