In this episode of Drill to Detail, Mark is joined by Wes McKinney to talk about the origins of pandas, the open-source Python package for data analysis, and his subsequent work as a contributor to the Kudu (incubating) and Parquet projects within the Apache Software Foundation. He has also contributed to Arrow, an in-memory data structure specification for use by engineers building data systems and the de-facto standard for columnar in-memory processing and interchange.

Python Data Analysis Library
"Ibis on Impala: Python at Scale for Data Science"
Drill To Detail Ep.3 'Apache Kudu And Cloudera's Analytic Platform' With Special Guest Mike Percy
Apache Arrow homepage
"Apache Arrow and the "10 Things I Hate About pandas""
"Apache Arrow vs. Parquet and ORC: Do we really need a third Apache project for columnar data representation?"
"Some comments to Daniel Abadi's blog about Apache Arrow"
Wes McKinney homepage