Can We Scale Human Feedback For Complex AI Tasks? AI Safety Fundamentals: Alignment podcast

Tech Society Philosophy Blue Dot Impact

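Content provided by BlueDot Impact. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by BlueDot Impact or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://zh.player.fm/legal.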

AI Safety Fundamentals: Alignment
Can We Scale Human Feedback for Complex AI Tasks?

1M ago 20:06

Reinforcement learning from human feedback (RLHF) has emerged as a powerful technique for steering large language models (LLMs) toward desired behaviours. However, relying on simple human feedback doesn’t work for tasks that are too complex for humans to accurately judge at the scale needed to train AI models. Scalable oversight techniques attempt to address this by increasing the abilities of humans to give feedback on complex tasks.

This article briefly recaps some of the challenges faced with human feedback, and introduces the approaches to scalable oversight covered in session 4 of our AI Alignment course.
Source:
https://aisafetyfundamentals.com/blog/scalable-oversight-intro/
Narrated for AI Safety Fundamentals by Perrin Walker

60 episodes
