Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents

Yannic Kilcher Videos (Audio Only)

English - March 10, 2022 11:07 - 1 hour - 89.4 MB - ★★★★★ - 1 rating

#deepmind #rl #society

This is an in-depth paper review, followed by an interview with the paper's authors!

Society is governed by norms. Many of them are clearly useful, such as washing your hands before cooking. However, plenty of social norms are essentially arbitrary, such as which hairstyles are acceptable or which words are rude. These are called "silly rules". This paper uses multi-agent reinforcement learning to investigate why such silly rules exist. The results point to a plausible mechanism: silly rules drastically speed up the agents' acquisition of the skill of enforcing rules, and that skill generalizes well. A society with silly rules is therefore better at enforcing rules in general, and adapts faster when genuinely useful norms arise.

OUTLINE:

0:00 - Intro

3:00 - Paper Overview

5:20 - Why are some social norms arbitrary?

11:50 - Reinforcement learning environment setup

20:00 - What happens if we introduce a "silly" rule?

25:00 - Experimental Results: how silly rules help society

30:10 - Isolated probing experiments

34:30 - Discussion of the results

37:30 - Start of Interview

39:30 - Where does the research idea come from?

44:00 - What is the purpose behind this research?

49:20 - Short recap of the mechanics of the environment

53:00 - How much does such a closed system tell us about the real world?

56:00 - What do the results tell us about silly rules?

1:01:00 - What are these agents really learning?

1:08:00 - How many silly rules are optimal?

1:11:30 - Why do you have separate weights for each agent?

1:13:45 - What features could be added next?

1:16:00 - How sensitive is the system to hyperparameters?

1:17:20 - How to avoid confirmation bias?

1:23:15 - How does this play into progress towards AGI?

1:29:30 - Can we make real-world recommendations based on this?

1:32:50 - Where do we go from here?

Paper: https://www.pnas.org/doi/10.1073/pnas...

Blog: https://deepmind.com/research/publica...

Abstract:

The fact that humans enforce and comply with norms is an important reason why humans enjoy higher levels of cooperation and welfare than other animals. Some norms are relatively easy to explain; they may prohibit obviously harmful or uncooperative actions. But many norms are not easy to explain. For example, most cultures prohibit eating certain kinds of foods and almost all societies have rules about what constitutes appropriate clothing, language, and gestures. A computational model focused on learning shows that apparently pointless rules can have an indirect effect on welfare. They can help agents learn how to enforce and comply with norms in general, improving the group’s ability to enforce norms that have a direct effect on welfare.

Authors: Raphael Köster, Dylan Hadfield-Menell, Richard Everett, Laura Weidinger, Gillian K. Hadfield, Joel Z. Leibo

Links:

TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick

YouTube: https://www.youtube.com/c/yannickilcher

Twitter: https://twitter.com/ykilcher

Discord: https://discord.gg/4H8xxDF

BitChute: https://www.bitchute.com/channel/yann...

LinkedIn: https://www.linkedin.com/in/ykilcher

BiliBili: https://space.bilibili.com/2017636191

If you want to support me, the best thing to do is to share the content :)
