Joining us today is Manohar Paluri, Senior Director at Facebook AI. Manohar discusses the biggest challenges facing the field of computer vision, and the commonalities and differences between first- and third-person perception. He dives into the complexity of first-person perception and how to overcome the privacy and ethical issues of egocentric technology. He also breaks down the mechanisms underlying AI models based on decision trees compared to those based on real-world data, and how they result in two different ideals: transparency or accuracy.

Key Points From This Episode:

Talking to Manohar Paluri, his background in IT, and how he wound up at Facebook AI.

Manohar's advice on the pros and cons of doing a Ph.D.

Why computer vision is so complex for machines but so simple for humans.

Why the term “computer vision” is not a limiting definition in terms of the sensors used.

How computer vision and perception differ.

The two problems facing computer vision: recognizing entities and augmenting perception.

Personalized data, generalized learning ability, and adaptability: the three problems that are responsible for the low number of entities that computer vision recognizes.

Managing the direction Manohar's organization is going: egocentric vision, predicting the impact of modeling, and finding the balance between transparency and accuracy.

Find out what the differences are between first- and third-person perception: intention, positioning, and long-form reasoning.

The similarity between first- and third-person perception: both are trying to understand the world.

Which sensors are required to predict intention: gaze and hand-object interaction.

What the privacy and ethical issues are with regard to egocentric technologies.

Why Manohar believes striking a balance between accuracy and transparency will set the standard.

The three prospects in AI that excite Manohar the most: the next computing platform, bringing different modalities together, and improved access to technology.

Tweetables:

“What I tell many of the new graduates when they come and ask me about ‘Should I do my Ph.D. or not?’ I tell them that ‘You’re asking the wrong question’. Because it doesn’t matter whether you do a Ph.D. or you don’t do a Ph.D., the path and the journey is going to be as long for anybody to take you seriously on the research side.” — Manohar Paluri [0:02:40]

“Just to give you a sense, there are billions of entities in the world. The best of the computer vision systems today can recognize in the order of tens of thousands or hundreds of thousands, not even a million. So abandoning the problem of core computer vision and jumping into perception would be a mistake in my opinion. There is a lot of work we still need to do in making machines understand this billion entity taxonomy.” — Manohar Paluri [0:11:33]

“We are in the research part of the organization, so whatever we are doing, it’s not like we are building something to launch over the next few months or a year, we are trying to ask ourselves how does the world look like three, five, ten years from now and what are the technological problems?” — Manohar Paluri [0:20:00]

“So my hope is, once you set a standard on transparency while maintaining the accuracy, it will be very hard for anybody to justify why they would not use such a model compared to a more black-box model for a little bit more gain in accuracy.” — Manohar Paluri [0:32:55]

Links Mentioned in Today’s Episode:

Manohar Paluri on LinkedIn

Facebook AI Research Website

Facebook AI Website: Ego4D