![NLP Highlights artwork](https://is1-ssl.mzstatic.com/image/thumb/Podcasts127/v4/a8/49/90/a849903a-65af-d8fc-07a7-c0d1bbf826a6/mza_4767231250788281707.jpg/100x100bb.jpg)
67 - GLUE: A Multi-Task Benchmark and Analysis Platform, with Sam Bowman
NLP Highlights
English - August 27, 2018 18:06 - 39 minutes - 36 MB - ★★★★★ - 22 ratings - Science
Previous Episode: 66 - Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods, with Jieyu Zhao
Next Episode: 68 - Neural models of factuality, with Rachel Rudinger
Paper by Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, and Samuel R. Bowman.
Sam joins us to talk about GLUE. We discuss the motivation behind setting up a benchmark framework for natural language understanding, how the authors defined "NLU" and chose the tasks for the benchmark, the hand-crafted diagnostic dataset constructed for GLUE, and the insights gained from the experiments run so far. We also share some musings on the utility of general-purpose sentence vectors and on leaderboards.
https://www.semanticscholar.org/paper/GLUE%3A-A-Multi-Task-Benchmark-and-Analysis-Platform-Wang-Singh/a2054eff8b4efe0f1f53d88c08446f9492ae07c1