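Content provided by Demetrios. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Demetrios or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://zh.player.fm/legal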

PyTorch's Combined Effort in Large Model Optimization // Michael Gschwind // #274

57:44

Dr. Michael Gschwind is a Director / Principal Engineer for PyTorch at Meta Platforms. At Meta, he led the rollout of GPU Inference for production services.

// MLOps Podcast #274 with Michael Gschwind, Software Engineer, Software Executive at Meta Platforms.

// Abstract

Explore PyTorch's role in boosting model performance, on-device AI processing, and collaborations with tech giants like ARM and Apple. Michael shares his journey from gaming console accelerators to AI, emphasizing the power of community and innovation in driving advancements.

// Bio

Dr. Michael Gschwind is a Director / Principal Engineer for PyTorch at Meta Platforms. At Meta, he led the rollout of GPU Inference for production services. He led the development of MultiRay and Textray, the first deployment of LLMs at a scale exceeding a trillion queries per day shortly after rollout. He created the strategy and led the implementation of PyTorch donation optimization with Better Transformers and Accelerated Transformers, bringing Flash Attention, PT2 compilation, and ExecuTorch into the mainstream for LLMs and GenAI models. Most recently, he led the enablement of large language models for on-device AI on mobile and edge devices.

// MLOps Swag/Merch

https://mlops-community.myshopify.com/

// Related Links

Website: https://en.m.wikipedia.org/wiki/Michael_Gschwind

--------------- ✌️Connect With Us ✌️ -------------

Join our Slack community: https://go.mlops.community/slack

Follow us on Twitter: @mlopscommunity

Sign up for the next meetup: https://go.mlops.community/register

Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/

Connect with Michael on LinkedIn: https://www.linkedin.com/in/michael-gschwind-3704222/?utm_source=share&utm_campaign=share_via&utm_content=profile&utm_medium=ios_app

Timestamps:

[00:00] Michael's preferred coffee

[00:21] Takeaways

[01:59] Please like, share, leave a review, and subscribe to our MLOps channels!

[02:10] Gaming to AI Accelerators

[11:34] Torch Chat goals

[18:53] PyTorch benchmarking and competitiveness

[21:28] Optimizing MLOps models

[24:52] GPU optimization tips

[29:36] Cloud vs On-device AI

[38:22] Abstraction across devices

[42:29] PyTorch developer experience

[45:33] AI and MLOps-related antipatterns

[48:33] When to optimize

[53:26] Efficient edge AI models

[56:57] Wrap up
