Show Notes

(01:35) Ville recalled his education, earning degrees in Computer Science from the University of Helsinki in Finland.
(04:35) Ville walked through his time working at a startup called Gurusoft that planned to commercialize self-organizing maps, a peculiar type of artificial neural network.
(07:17) Ville reflected on his four years as a researcher at Nokia, working on big data infrastructure, analytics, and ML open-source projects (such as Disco and Ringo).
(11:56) Ville shared the story of co-founding Bitdeli, a startup that built a novel scriptable data platform, with his brother and not finding product-market fit.
(13:58) Ville walked through AdRoll’s acquisition of Bitdeli in June 2013.
(15:49) Ville discussed the engineering challenges associated with his work at AdRoll, namely AdRoll Prospecting and traildb.io.
(19:33) Ville mentioned the product and leadership/management lessons from his time as AdRoll’s Head of Data, leading various data/ML efforts.
(24:43) Ville explained his decision to join the ML Infrastructure team at Netflix in 2017.
(27:26) Ville discussed the motivation behind the creation of Netflix’s human-centric ML infrastructure, Metaflow, which was open-sourced in 2019.
(30:21) Ville unpacked the key design principles that summarize the philosophy of Metaflow, which is influenced by the unique culture at Netflix.
(35:00) Ville talked about his well-known diagram on the data infrastructure’s hierarchy of needs.
(37:33) Ville examined the technical details behind Metaflow’s integration with AWS to make it easy for users to move back and forth between their local and remote modes of development and execution.
(40:58) Ville described the challenges of finding Metaflow’s early adopters, first internally at Netflix and later externally at other companies.
(45:13) Ville went over the strategy around prioritizing features for Metaflow’s future roadmap.
(52:22) Ville shared the story behind the founding of Outerbounds, which he co-founded with Savin Goyal and Oleg Avdeev.
(55:03) Ville shared his thoughts on working with Metaflow’s contributors in a way that can generate valuable product feedback for Outerbounds.
(58:30) Ville shared valuable hiring lessons for attracting the right people who are excited about Outerbounds’ mission.
(01:01:28) Ville shared the upcoming initiatives that he is most excited about for Outerbounds.
(01:04:05) Ville walked through his writing process for an upcoming technical book with Manning called “Effective Data Science Infrastructure,” a hands-on guide to assembling infrastructure for data science and machine learning applications.
(01:06:34) Ville unpacked his great O’Reilly article that digs deep into the fundamentals of ML as an engineering discipline.
(01:11:03) Closing segment.

Ville’s Contact Info

LinkedIn
Twitter
GitHub

Outerbounds

Website | Twitter | LinkedIn | GitHub | YouTube
Metaflow GitHub | Metaflow Docs
Slack Community
Careers
Metaflow Resources for Data Science
Metaflow Resources for Engineering

Mentioned Content

Talks

SF Data Mining Meetup: TrailDB: Processing Trillions of Events at AdRoll (July 2016)
QConSF 2018: Human-Centric Machine Learning Infrastructure @Netflix (Feb 2019)
AWS re:Invent 2019: More Data Science with Less Engineering: ML Infrastructure at Netflix (Dec 2019)
Scale By The Bay 2019: Human-Centric ML Infrastructure at Netflix (Jan 2020)
AICamp: Metaflow: The ML Infrastructure at Netflix (Aug 2021)

Articles

Open-Sourcing Metaflow, a Human-Centric Framework for Data Science (Netflix Tech Blog, Dec 2019)
Unbundling Data Science Workflows with Metaflow and AWS Step Functions (Netflix Tech Blog, July 2020)
MLOps and DevOps: Why Data Makes It Different (O’Reilly, Oct 2021)

People

Michael Jordan (Distinguished Professor in EECS and Statistics at UC Berkeley)
Matthew Honnibal and Ines Montani (Creators of open-source NLP library spaCy)
Hadley Wickham (Chief Scientist at RStudio and Adjunct Professor of Statistics at Rice University)

Book

“The Mom Test” (by Rob Fitzpatrick)

Notes

My conversation with Ville was recorded back in October 2021. Since then, many things have happened at Outerbounds. I’d recommend:

Visiting Outerbounds’ new website with Metaflow resources for Data Science and Engineering
Watching Ville’s recent talk at Data Council Austin about the Modern Stack for ML Infrastructure
Buying Ville’s newly released book “Effective Data Science Infrastructure”

About the show

Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests coming from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.

Datacast is produced and edited by James Le. Get in touch with feedback or guest suggestions by emailing [email protected].

Subscribe by searching for Datacast wherever you get podcasts or click one of the links below:

Listen on Spotify | Listen on Apple Podcasts | Listen on Google Podcasts

If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.

For inquiries about sponsoring the podcast, email [email protected].
