Artwork

内容由TWIML and Sam Charrington提供。所有播客内容(包括剧集、图形和播客描述)均由 TWIML and Sam Charrington 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal
Player FM -播客应用
使用Player FM应用程序离线!

Coercing LLMs to Do and Reveal (Almost) Anything with Jonas Geiping - #678

48:27
 
分享
 

Manage episode 410063625 series 2355587
内容由TWIML and Sam Charrington提供。所有播客内容(包括剧集、图形和播客描述)均由 TWIML and Sam Charrington 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal

Today we're joined by Jonas Geiping, a research group leader at the ELLIS Institute, to explore his paper: "Coercing LLMs to Do and Reveal (Almost) Anything". Jonas explains how neural networks can be exploited, highlighting the risk of deploying LLM agents that interact with the real world. We discuss the role of open models in enabling security research, the challenges of optimizing over certain constraints, and the ongoing difficulties in achieving robustness in neural networks. Finally, we delve into the future of AI security, and the need for a better approach to mitigate the risks posed by optimized adversarial attacks.

The complete show notes for this episode can be found at twimlai.com/go/678.

  continue reading

710集单集

Artwork
icon分享
 
Manage episode 410063625 series 2355587
内容由TWIML and Sam Charrington提供。所有播客内容(包括剧集、图形和播客描述)均由 TWIML and Sam Charrington 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal

Today we're joined by Jonas Geiping, a research group leader at the ELLIS Institute, to explore his paper: "Coercing LLMs to Do and Reveal (Almost) Anything". Jonas explains how neural networks can be exploited, highlighting the risk of deploying LLM agents that interact with the real world. We discuss the role of open models in enabling security research, the challenges of optimizing over certain constraints, and the ongoing difficulties in achieving robustness in neural networks. Finally, we delve into the future of AI security, and the need for a better approach to mitigate the risks posed by optimized adversarial attacks.

The complete show notes for this episode can be found at twimlai.com/go/678.

  continue reading

710集单集

すべてのエピソード

×
 
Loading …

欢迎使用Player FM

Player FM正在网上搜索高质量的播客,以便您现在享受。它是最好的播客应用程序,适用于安卓、iPhone和网络。注册以跨设备同步订阅。

 

快速参考指南