Optimized Web Crawling
Linear Digressions
English - October 28, 2018 23:56 - 21 minutes - 9.86 MB - ★★★★★ - 350 ratings
Technology, data science, machine learning, linear digressions
Previous Episode: Better Know a Distribution: The Poisson Distribution
Next Episode: Optimized Optimized Web Crawling
Got a fun optimization problem for you this week! It’s a two-for-one: how do you optimize the web crawling logic of an operation like Google search so that the results are, on average, as up-to-date as possible, and how do you optimize your solution of choice so that it’s maintainable by software engineers in a huge distributed system? We’re following an excellent post from the Unofficial Google Data Science blog going through this problem.
Relevant links: http://www.unofficialgoogledatascience.com/2018/07/by-bill-richoux-critical-decisions-are.html