
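Content provided by The Nonlinear Fund. All podcast content (including episodes, graphics, and podcast descriptions) is uploaded and provided directly by The Nonlinear Fund or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://zh.player.fm/legal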

LW - AI #70: A Beautiful Sonnet by Zvi

1:12:27
 
 

Link to original article
Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: AI #70: A Beautiful Sonnet, published by Zvi on June 28, 2024 on LessWrong.

They said it couldn't be done. No, not Claude Sonnet 3.5 becoming the clear best model. No, not the Claude-Sonnet-empowered automatic meme generators. Those were whipped together in five minutes. They said I would never get quiet time and catch up. Well, I showed them! That's right. Yes, there is a new best model, but otherwise it was a quiet week. I got a chance to incorporate the remaining biggest backlog topics. The RAND report is covered under Thirty Eight Ways to Steal Your Model Weights. Last month's conference in Seoul is covered in You've Got Seoul. I got to publish my thoughts on OpenAI's Model Spec last Friday.

Table of Contents

Be sure to read about Claude 3.5 Sonnet here. That is by far the biggest story.

1. Introduction.
2. Table of Contents.
3. Language Models Offer Mundane Utility. I am increasingly persuaded.
4. Language Models Don't Offer Mundane Utility. EU's DMA versus the AiPhone.
5. Clauding Along. More people, mostly impressed.
6. Fun With Image Generation. They are coming for our memes. Then Hollywood.
7. Copyright Confrontation. The RIAA does the most RIAA thing.
8. Deepfaketown and Botpocalypse Soon. Character.ai addiction. Am I out of touch?
9. They Took Our Jobs. More arguments that the issues lie in the future.
10. The Art of the Jailbreak. We need to work together as a team.
11. Get Involved. AISI, Apollo, Astra, Accra, BlueDot, Cybersecurity and DOE.
12. Introducing. Forecasting, OpenAI Mac App, Otto, Dot, Butterflies, Decagon.
13. In Other AI News. OpenAI equity takes steps forward. You can sell it.
14. Quiet Speculations. A distinct lack of mojo.
15. You've Got Seoul. Delayed coverage of the Seoul summit from last month.
16. Thirty Eight Ways to Steal Your Model Weights. Right now they would all work.
17. The Quest for Sane Regulations. Steelmanning restraint.
18. SB 1047. In Brief.
19. The Week in Audio. Dwarkesh interviews Tony Blair, and many more.
20. Rhetorical Innovation. A demolition, and also a disputed correction.
21. People Are Worried About AI Killing Everyone. Don't give up. Invest wisely.
22. Other People Are Not As Worried About AI Killing Everyone. What even is ASI?
23. The Lighter Side. Eventually the AI will learn.

Language Models Offer Mundane Utility

Training only on (x,y) pairs, define the function f(x), compose and invert it without in-context examples or chain of thought.

AI Dungeon will let you be the DM and take the role of the party, if you prefer.

Lindy 'went rogue' and closed a customer on its own. They seem cool with it?

Persuasive capability of the model is proportional to the log of the model size, says paper. Author Kobi Hackenburg paints this as reassuring, but the baseline is that everything scales with the log of the model size. He says this is mostly based on 'task completion' and staying on topic improving, and current frontier models are already near perfect at that, so he is skeptical we will see further improvement. I am not. I do believe the result that none of the models was 'more persuasive than human baseline' in the test, but that is based on uncustomized messages on generic political topics. Of course we should not expect above human performance there for current models.

75% of knowledge workers are using AI, but 78% of the 75% are not telling the boss.

Build a team of AI employees to write the first half of your Shopify CEO speech from within a virtual office, then spend the second half of the speech explaining how you built the team. It is so weird to think 'the best way to get results from AI employees I can come up with is to make them virtually thirsty so they will have spontaneous water cooler conversations.' That is the definition of scratching the (virtual) surface.
Do a bunch of agent-based analysis off a si...

1704 episodes

