Arfon Smith from GitHub and Felipe Hoffa and Will Curran from Google joined the show to talk about BigQuery: the big picture behind Google Cloud’s push to host public datasets, the collaboration between the two companies to expand GitHub’s public dataset, new query capabilities that were not possible before, example queries, and more!

Sponsors

Toptal – Take control of your career and join the best at Toptal. Email Adam at [email protected] for a personal introduction to our friends at Toptal.

Linode – Our cloud server of choice! This is what we built our new CMS on. Use the code changelog20 to get 2 months free!

Full Stack Fest 2016 – Early Bird tickets available until July 15. Use the code THECHANGELOG after July 15 to save 75 EUR (before taxes).

Featuring

Arfon Smith – Twitter, GitHub, Website

Felipe Hoffa – Twitter, GitHub

Will Curran – Website

Adam Stacoviak – Twitter, GitHub, LinkedIn, Website

Jerod Santo – Twitter, GitHub

Notes and Links

This show was produced in collaboration with GitHub and Google to announce the big expansion to GitHub’s public dataset on BigQuery.

The Changelog #144: GitHub Archive and Changelog Nightly with Ilya Grigorik
GitHub announcement
Google Cloud Blog announcement
Google Open Source Blog announcement
Felipe Hoffa - GitHub on BigQuery: Analyze all the code
GitHub public dataset — This 3TB+ dataset comprises the largest released source of GitHub activity to date. It contains a full snapshot of the content of more than 2.8 million open source GitHub repositories including more than 145 million unique commits, over 2 billion different file paths, and the contents of the latest revision for 163 million files, all of which are searchable with regular expressions.
NOAA Global Surface Summary of the Day Weather Data
USA Name Data
Google BigQuery
Gist: BigQuery Examples from Arfon Smith
Shawn Pearce (Google) - the unsung hero who did all the hard work getting the data pipeline working for this new dataset
Email [email protected] to talk with Will and BigQuery’s public dataset team
