
Content provided by Daniel Filan. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Daniel Filan or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://uk.player.fm/legal.

25 - Cooperative AI with Caspar Oesterheld

3:02:09
 

Imagine a world where there are many powerful AI systems, working at cross purposes. You could suppose that different governments use AIs to manage their militaries, or simply that many powerful AIs have their own wills. At any rate, it seems valuable for them to be able to cooperate and minimize pointless conflict. How do we ensure that AIs behave this way, and what do we need to learn about how rational agents interact to make that clearer? In this episode, I'll be speaking with Caspar Oesterheld about some of his research on this very topic.

Patreon: patreon.com/axrpodcast

Ko-fi: ko-fi.com/axrpodcast

Episode art by Hamish Doodles: hamishdoodles.com

Topics we discuss, and timestamps:

- 0:00:34 - Cooperative AI

- 0:06:21 - Cooperative AI vs standard game theory

- 0:19:45 - Do we need cooperative AI if we get alignment?

- 0:29:29 - Cooperative AI and agent foundations

- 0:34:59 - A Theory of Bounded Inductive Rationality

- 0:50:05 - Why it matters

- 0:53:55 - How the theory works

- 1:01:38 - Relationship to logical inductors

- 1:15:56 - How fast does it converge?

- 1:19:46 - Non-myopic bounded rational inductive agents?

- 1:24:25 - Relationship to game theory

- 1:30:39 - Safe Pareto Improvements

- 1:30:39 - What they try to solve

- 1:36:15 - Alternative solutions

- 1:40:46 - How safe Pareto improvements work

- 1:51:19 - Will players fight over which safe Pareto improvement to adopt?

- 2:06:02 - Relationship to program equilibrium

- 2:11:25 - Do safe Pareto improvements break themselves?

- 2:15:52 - Similarity-based Cooperation

- 2:23:07 - Are similarity-based cooperators overly cliqueish?

- 2:27:12 - Sensitivity to noise

- 2:29:41 - Training neural nets to do similarity-based cooperation

- 2:50:25 - FOCAL, Caspar's research lab

- 2:52:52 - How the papers all relate

- 2:57:49 - Relationship to functional decision theory

- 2:59:45 - Following Caspar's research

The transcript: axrp.net/episode/2023/10/03/episode-25-cooperative-ai-caspar-oesterheld.html

Links for Caspar:

- FOCAL at CMU: www.cs.cmu.edu/~focal/

- Caspar on X, formerly known as Twitter: twitter.com/C_Oesterheld

- Caspar's blog: casparoesterheld.com/

- Caspar on Google Scholar: scholar.google.com/citations?user=xeEcRjkAAAAJ&hl=en&oi=ao

Research we discuss:

- A Theory of Bounded Inductive Rationality: arxiv.org/abs/2307.05068

- Safe Pareto improvements for delegated game playing: link.springer.com/article/10.1007/s10458-022-09574-6

- Similarity-based Cooperation: arxiv.org/abs/2211.14468

- Logical Induction: arxiv.org/abs/1609.03543

- Program Equilibrium: citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=e1a060cda74e0e3493d0d81901a5a796158c8410

- Formalizing Objections against Surrogate Goals: www.alignmentforum.org/posts/K4FrKRTrmyxrw5Dip/formalizing-objections-against-surrogate-goals

- Learning with Opponent-Learning Awareness: arxiv.org/abs/1709.04326


