When data leakage turns into a flood of trouble (Practical AI #109)

Changelog Master Feed

English - October 20, 2020 14:10 - 48 minutes - 44.5 MB - ★★★★ - 28 ratings

Rajiv Shah teaches Daniel and Chris about data leakage and its major impact on machine learning models. It's the kind of topic that we don't often think about, but it can ruin our results. Raj discusses how to use activation maps and image embeddings to find leakage, so that information from our test set does not find its way into our training set.

Changelog++ members get a bonus 1 minute at the end of this episode and zero ads. Join today!

Sponsors:

DigitalOcean – DigitalOcean’s developer cloud makes it simple to launch in the cloud and scale up as you grow. They have an intuitive control panel, predictable pricing, team accounts, worldwide availability with a 99.99% uptime SLA, and 24/7/365 world-class support to back that up. Get your $100 credit at do.co/changelog.
Changelog++ – You love our content and you want to take it to the next level by showing your support. We’ll take you closer to the metal with no ads, extended episodes, outtakes, bonus content, a deep discount in our merch store (soon), and more to come. Let’s do this!
Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com.

Featuring:

Rajiv Shah – Twitter, GitHub, LinkedIn, Website
Chris Benson – Twitter, GitHub, LinkedIn, Website
Daniel Whitenack – Twitter, GitHub, Website

Show Notes:

Rajiv Shah | University of Illinois at Chicago
Rajiv Shah | DataRobot Blog
DataRobot

