Show Notes:

(2:00) Arthur talked about his undergraduate years studying Psychology at North Carolina State University.

(3:28) Arthur mentioned his time working as a research assistant at the LACElab at NCSU, which does human factors and cognition research.

(5:08) Arthur discussed his decision to pursue a graduate degree in Cognitive Neuroscience at the University of Oregon right after college.

(6:35) Arthur went over his Master's thesis (“Navigation performance in virtual environments varies with fractal dimension of landscape”) in more detail.

(10:30) Arthur unpacked his popular blog series on Medium, “Simple Reinforcement Learning in TensorFlow.”

(12:56) Arthur recalled his decision to join Unity to work on its reinforcement learning problems.

(14:31) Arthur explained his choice to do the Ph.D. part-time while working full-time.

(16:24) Arthur discussed the problems with existing reinforcement learning simulation platforms and how the Unity Machine Learning Agents Toolkit addresses them.

(18:30) Arthur went over the challenges of maintaining and continuously iterating on the Unity ML-Agents Toolkit.

(20:36) Arthur emphasized the benefit of training agents with an additional curiosity-based intrinsic reward, inspired by a paper from UC Berkeley researchers (check out the Unity blog post, and see the curiosity sketch after this list).

(22:33) Arthur talked about the challenges of implementing such curiosity-based techniques.

(25:15) Arthur unpacked the introduction of the Obstacle Tower, a high-fidelity, 3D, third-person, procedurally generated environment released in the latest version of the toolkit (read his blog post “On ‘solving’ Montezuma’s Revenge”).

(29:15) Arthur discussed the Obstacle Tower Challenge, a contest that offers researchers and developers the chance to compete to train the best-performing agents on the Obstacle Tower environment.

(32:49) Referring to his fun tutorial “GANs explained with a classic sponge bob squarepants episode,” Arthur walked through the theory behind the Generative Adversarial Network algorithm (see the GAN objective after this list).

(34:30) Arthur expanded on his post “RL or Evolutionary Strategies? Nature has a solution: Both” (see the evolution-strategies sketch after this list).

(38:36) Arthur shared a couple of approaches for balancing the bias/variance tradeoff in reinforcement learning models, referring to his article “Making sense of the bias/variance tradeoff in Deep RL” (see the n-step return sketch after this list).

(41:19) Arthur talked about successor representations and their applications in deep learning, psychology, and neuroscience (read his post “The present in terms of the future: Successor representations in RL,” and see the tabular sketch after this list).

(42:38) Arthur reflected on the benefits of his Psychology and Neuroscience background for his research career.

(44:21) Arthur shared his advice for graduate students who want to make a dent in the AI/ML research community.

(45:30) Closing segment.
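For the curiosity discussion at (20:36): a minimal sketch of a curiosity-style intrinsic reward in the spirit of the Intrinsic Curiosity Module from the UC Berkeley paper (Pathak et al., 2017) that the Unity blog post builds on. The layer sizes, names, and beta weighting here are illustrative assumptions, and it is written in PyTorch purely for brevity; it is not the toolkit's actual implementation.

```python
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, EMB_DIM = 16, 4, 32  # hypothetical sizes

# Encoder maps raw observations to a compact embedding.
encoder = nn.Sequential(nn.Linear(OBS_DIM, EMB_DIM), nn.ReLU())
# Forward model predicts the next embedding from the current one plus
# the (one-hot) action.
forward_model = nn.Linear(EMB_DIM + ACT_DIM, EMB_DIM)

@torch.no_grad()
def intrinsic_reward(obs, action, next_obs):
    """Curiosity bonus = forward-model prediction error: transitions the
    agent cannot yet predict earn a larger bonus, rewarding exploration."""
    phi, phi_next = encoder(obs), encoder(next_obs)
    pred_next = forward_model(torch.cat([phi, action], dim=-1))
    return 0.5 * (pred_next - phi_next).pow(2).sum(dim=-1)

# The bonus is added to the environment reward before the RL update, e.g.
#   total_reward = extrinsic_reward + beta * intrinsic_bonus
# (training the encoder/forward model themselves needs separate losses).
```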
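For the GAN walkthrough at (32:49): the standard minimax objective (Goodfellow et al., 2014) underlying the theory his tutorial explains. The SpongeBob episode-to-algorithm mapping is his, so this just states the formal two-player game.

```latex
% GAN minimax game: the discriminator D is trained to score real samples
% high and generated samples low, while the generator G is trained to
% fool D with samples G(z) drawn from noise z.
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_z}\big[\log\big(1 - D(G(z))\big)\big]
```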
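For the RL-versus-evolution discussion at (34:30): a minimal sketch of a simple evolution-strategies update in the style of Salimans et al. (2017), the gradient-free alternative his post contrasts with standard RL. The population size, step sizes, and toy fitness function are illustrative assumptions.

```python
import numpy as np

def es_step(theta, fitness, npop=50, sigma=0.1, alpha=0.01,
            rng=np.random.default_rng(0)):
    """One ES update: perturb the parameters, score each perturbation,
    and move theta toward the perturbations that scored well."""
    eps = rng.standard_normal((npop, theta.size))  # population of noise
    rewards = np.array([fitness(theta + sigma * e) for e in eps])
    advantages = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    grad_estimate = eps.T @ advantages / (npop * sigma)
    return theta + alpha * grad_estimate

# Toy usage: climb toward a fixed target without ever taking a gradient.
target = np.array([1.0, -2.0, 0.5])
theta = np.zeros(3)
for _ in range(200):
    theta = es_step(theta, lambda t: -np.sum((t - target) ** 2))
```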
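For the bias/variance discussion at (38:36): one common knob for this tradeoff is the n-step return, sketched below. Small n leans on the critic's (biased) value estimate; large n leans on the (high-variance) sampled rewards. The trajectory numbers are made up for illustration.

```python
def n_step_return(rewards, values, t, n, gamma=0.99):
    """Bootstrapped n-step return G_t^(n) from a recorded trajectory,
    where values[k] is the critic's estimate V(s_k)."""
    T = len(rewards)
    horizon = min(t + n, T)
    g = sum(gamma ** (k - t) * rewards[k] for k in range(t, horizon))
    if horizon < T:  # bootstrap from the critic if we stopped early
        g += gamma ** (horizon - t) * values[horizon]
    return g

rewards = [0.0, 0.0, 1.0, 0.0, 5.0]
values = [0.5, 0.7, 1.2, 2.0, 4.0]
print(n_step_return(rewards, values, t=0, n=1))  # biased toward the critic
print(n_step_return(rewards, values, t=0, n=5))  # pure Monte Carlo return
```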
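For the successor-representation discussion at (41:19): a minimal tabular sketch, assuming a small discrete state space. M[s, s'] stores the expected discounted future occupancy of s' when starting from s, and is learned with a TD-style update.

```python
import numpy as np

N_STATES, GAMMA, LR = 5, 0.95, 0.1
M = np.zeros((N_STATES, N_STATES))  # successor representation matrix

def sr_update(s, s_next):
    """TD update: the SR of s should equal 'I am in s now' plus the
    discounted SR of wherever I go next."""
    one_hot = np.eye(N_STATES)[s]
    td_error = one_hot + GAMMA * M[s_next] - M[s]
    M[s] += LR * td_error

# Given reward weights w per state, values factor as V(s) = M[s] @ w,
# so the reward can change without relearning the environment dynamics.
```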

His Contact Info:

Twitter | GitHub | Medium | LinkedIn | Google Scholar | Unity Blog

His Recommended Resources:

DeepMind | Google Brain | Being and Time (by Martin Heidegger)

About the show

Datacast features long-form, in-depth conversations with practitioners and researchers in the data community to walk through their professional journeys and unpack the lessons learned along the way. I invite guests from a wide range of career paths — from scientists and analysts to founders and investors — to analyze the case for using data in the real world and extract their mental models (“the WHY and the HOW”) behind their pursuits. Hopefully, these conversations can serve as valuable tools for early-stage data professionals as they navigate their own careers in the exciting data universe.

Datacast is produced and edited by James Le. For inquiries about sponsoring the podcast, email [email protected].

Subscribe by searching for Datacast wherever you get podcasts, or click one of the links below:

Listen on Spotify | Listen on Apple Podcasts | Listen on Google Podcasts

If you’re new, see the podcast homepage for the most recent episodes to listen to, or browse the full guest list.
