Procella: YouTube's super-system for analytics data storage
Linear Digressions
English - July 06, 2020 02:29 - 29 minutes - 13.6 MB - ★★★★★ - 350 ratingsTechnology data science machine learning linear digressions Homepage Download Apple Podcasts Google Podcasts Overcast Castro Pocket Casts RSS feed
This is a re-release of an episode that originally ran in October 2019.
If you’re trying to manage a project that serves up analytics data for a few very distinct uses, you’d be wise to consider having custom solutions for each use case that are optimized for the needs and constraints of that use cases. You also wouldn’t be YouTube, which found themselves with this problem (gigantic data needs and several very different use cases of what they needed to do with that data) and went a different way: they built one analytics data system to serve them all. Procella, the system they built, is the topic of our episode today: by deconstructing the system, we dig into the four motivating uses of this system, the complexity they had to introduce to service all four uses simultaneously, and the impressive engineering that has to go into building something that “just works.”