The counting argument for scheming (Sections 4.1 and 4.2 of "Scheming AIs")
This is sections 4.1 and 4.2 of my report “Scheming AIs: Will AIs fake alignment during training in order to get power?”
Text of the report here: https://arxiv.org/abs/2311.08379
Summary of the report here: https://joecarlsmith.com/2023/11/15/new-report-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Audio summary here: https://joecarlsmithaudio.buzzsprout.com/2034731/13969977-introduction-and-summary-of-scheming-ais-will-ais-fake-alignment-during-training-in-order-to-get-power
Chapters
1. The counting argument for scheming (Sections 4.1 and 4.2 of "Scheming AIs") (00:00:00)
2. 4. Arguments for/against scheming that focus on the final properties of the (00:00:32)
3. 4.1 Contributors to reward vs. extra criteria (00:01:08)
4. 4.2 The counting argument (00:03:42)
63 episodes
All episodes



Arguments for/against scheming that focus on the path SGD takes (Section 3 of "Scheming AIs") (29:03)
How useful for alignment-relevant work are AIs with short-term goals? (Section 2.2.4.3 of "Scheming AIs") (9:21)
Is scheming more likely if you train models to have long-term goals? (Sections 2.2.4.1-2.2.4.2 of "Scheming AIs") (9:01)
Full audio for "Scheming AIs: Will AIs fake alignment during training in order to get power?" (6:13:17)
Introduction and summary of "Scheming AIs: Will AIs fake alignment during training in order to get power?" (56:32)