110 - Natural Questions, with Tom Kwiatkowski and Michael Collins

NLP Highlights

English - April 06, 2020 18:52 - 43 minutes - 39.8 MB

In this episode, Tom Kwiatkowski and Michael Collins talk about Natural Questions, a benchmark for question answering research. We discuss how the dataset was collected to reflect naturally occurring questions, the criteria used for identifying short and long answers, how this dataset differs from other QA datasets, and how easy it might be to game the benchmark with superficial processing of the text. We also contrast the holistic design of Natural Questions with the alternative approach of deliberately targeting specific linguistic phenomena of interest when building a QA dataset.

Dataset: https://ai.google.com/research/NaturalQuestions
Paper: https://www.mitpressjournals.org/doi/full/10.1162/tacl_a_00276