#deepmind #rl #society




This is an in-depth paper review, followed by an interview with the paper's authors!


Society is ruled by norms, and most of these norms are very useful, such as washing your hands before cooking. However, there are also plenty of social norms that are essentially arbitrary, such as which hairstyles are acceptable or which words are rude. These are called "silly rules". This paper uses multi-agent reinforcement learning to investigate why such silly rules exist. The results suggest a plausible mechanism: silly rules drastically speed up the agents' acquisition of the skill of enforcing rules. Because that skill generalizes well, a society that has silly rules becomes better at enforcing rules in general, and so adapts faster to genuinely useful norms.




OUTLINE:
0:00 - Intro
3:00 - Paper Overview
5:20 - Why are some social norms arbitrary?
11:50 - Reinforcement learning environment setup
20:00 - What happens if we introduce a "silly" rule?
25:00 - Experimental Results: how silly rules help society
30:10 - Isolated probing experiments
34:30 - Discussion of the results
37:30 - Start of Interview
39:30 - Where does the research idea come from?
44:00 - What is the purpose behind this research?
49:20 - Short recap of the mechanics of the environment
53:00 - How much does such a closed system tell us about the real world?
56:00 - What do the results tell us about silly rules?
1:01:00 - What are these agents really learning?
1:08:00 - How many silly rules are optimal?
1:11:30 - Why do you have separate weights for each agent?
1:13:45 - What features could be added next?
1:16:00 - How sensitive is the system to hyperparameters?
1:17:20 - How to avoid confirmation bias?
1:23:15 - How does this play into progress towards AGI?
1:29:30 - Can we make real-world recommendations based on this?
1:32:50 - Where do we go from here?




Paper: https://www.pnas.org/doi/10.1073/pnas...


Blog: https://deepmind.com/research/publica...




Abstract:


The fact that humans enforce and comply with norms is an important reason why humans enjoy higher levels of cooperation and welfare than other animals. Some norms are relatively easy to explain; they may prohibit obviously harmful or uncooperative actions. But many norms are not easy to explain. For example, most cultures prohibit eating certain kinds of foods, and almost all societies have rules about what constitutes appropriate clothing, language, and gestures. Using a computational model focused on learning, we show that apparently pointless rules can have an indirect effect on welfare. They can help agents learn how to enforce and comply with norms in general, improving the group’s ability to enforce norms that have a direct effect on welfare.




Authors: Raphael Köster, Dylan Hadfield-Menell, Richard Everett, Laura Weidinger, Gillian K. Hadfield, Joel Z. Leibo




Links:
TabNine Code Completion (Referral): http://bit.ly/tabnine-yannick
YouTube: https://www.youtube.com/c/yannickilcher
Twitter: https://twitter.com/ykilcher
Discord: https://discord.gg/4H8xxDF
BitChute: https://www.bitchute.com/channel/yann...
LinkedIn: https://www.linkedin.com/in/ykilcher
BiliBili: https://space.bilibili.com/2017636191




If you want to support me, the best thing to do is to share the content :)
