
Content provided by Jeremie Harris. All podcast content (including episodes, graphics, and podcast descriptions) is uploaded and provided directly by Jeremie Harris or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://zh.player.fm/legal

#2: Large Language Models Can Self-Improve

33:38
 

Google recently announced a significant breakthrough: a new Language Model Self-Improvement (LMSI) system that makes it possible for large language models to improve their own performance on many tasks without using any additional labeled data. In this post, and its accompanying podcast, we’ll take a look at LMSI to understand why it’s such a big deal.

When applying LMSI to a 540B parameter PaLM model, the Google researchers achieved state-of-the-art results across a variety of arithmetic reasoning, commonsense reasoning, and natural language inference tasks.

The LMSI system allows a language model to self-improve in three steps:

  1. First, you give the system some questions like “Stefan goes to a restaurant with his family. They order an appetizer that costs $10 and 4 entrees that are $20 each. If they tip 20% of the total, what is the total amount of money that they spend?”
  2. Then, you ask the language model to explain the answer to the question in 32 different ways. For example, one explanation could be “The appetizer costs $10. The entrees cost 4 * $20 = $80. The tip is 20% of the total, so it is 20% of the $90 they have spent. The tip is 0.2 * 90 = $18. The total they spent is $90 + $18 = $108. The answer is 108.”
  3. Finally, the system selects the explanations whose final answer is the most common one and fine-tunes the language model on them. For example, if 16 of the 32 explanations give $108 as the answer and the rest give a mix of other answers, the system keeps those 16 explanations as training data.
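The filtering in step 3 can be sketched in a few lines of Python. This is a minimal illustration of the majority-vote idea, not Google's implementation: the function names and the answer-extraction logic are assumptions made for the example, and in practice the explanations would be sampled from the model itself at nonzero temperature.

```python
from collections import Counter

def select_self_consistent_explanations(explanations, extract_answer):
    """Keep only the sampled explanations whose final answer matches
    the majority answer; these become self-generated fine-tuning data."""
    answers = [extract_answer(e) for e in explanations]
    majority_answer, _ = Counter(answers).most_common(1)[0]
    keep = [e for e, a in zip(explanations, answers) if a == majority_answer]
    return majority_answer, keep

# Hypothetical sampled explanations (32 in the paper, 3 here for brevity),
# each ending in the "The answer is X." format used in the prompt above.
samples = [
    "The entrees cost 4 * $20 = $80 ... The answer is 108.",
    "The total before tip is $90 ... The answer is 108.",
    "They spend $10 + $80 = $90 ... The answer is 90.",
]
answer, training_set = select_self_consistent_explanations(
    samples, lambda e: e.rsplit("The answer is ", 1)[-1].rstrip(".")
)
# Two of the three samples agree on 108, so those two are kept.
```

The key design point is that no ground-truth label is ever consulted: agreement among the model's own samples stands in for correctness.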

This approach lets an LMSI-augmented language model significantly improve its own performance and achieve state-of-the-art results on reasoning problems.

The authors found that the LMSI system makes language models substantially more capable. When they fine-tuned a small language model with LMSI, it answered questions better than language models nine times its size that didn’t use LMSI.

Industry Context

With only unlabeled text-based questions, large language models like PaLM fine-tuned with the LMSI system were able to outperform existing state-of-the-art methods that rely on more complex reasoning strategies and/or ground-truth labels. Small language models fine-tuned using LMSI were also able to outperform models nine times their size that did not use LMSI.

This result shows that we are still discovering ways to improve large language models without increasing model or dataset size, and that language models can be improved without any labeled data. Because LMSI lets small models outperform larger models that don’t use it, it also lowers the cost barrier for malicious uses of these capabilities.
