Player FM - Internet Radio Done Right
71 subscribers
Checked 3d ago
four 年前已添加!
内容由Databricks提供。所有播客内容(包括剧集、图形和播客描述)均由 Databricks 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal。
Player FM -播客应用
使用Player FM应用程序离线!
使用Player FM应用程序离线!
Data Brew by Databricks
标记全部为未/已播放
Manage series 2814833
内容由Databricks提供。所有播客内容(包括剧集、图形和播客描述)均由 Databricks 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal。
Welcome to Data Brew by Databricks with Denny and Brooke! In this series, we explore various topics in the data and AI community and interview subject matter experts in data engineering/data science. So join us with your morning brew in hand and get ready to dive deep into data + AI! For this first season, we will be focusing on lakehouses – combining the key features of data warehouses, such as ACID transactions, with the scalability of data lakes, directly against low-cost object stores.
40集单集
标记全部为未/已播放
Manage series 2814833
内容由Databricks提供。所有播客内容(包括剧集、图形和播客描述)均由 Databricks 或其播客平台合作伙伴直接上传和提供。如果您认为有人在未经您许可的情况下使用您的受版权保护的作品,您可以按照此处概述的流程进行操作https://zh.player.fm/legal。
Welcome to Data Brew by Databricks with Denny and Brooke! In this series, we explore various topics in the data and AI community and interview subject matter experts in data engineering/data science. So join us with your morning brew in hand and get ready to dive deep into data + AI! For this first season, we will be focusing on lakehouses – combining the key features of data warehouses, such as ACID transactions, with the scalability of data lakes, directly against low-cost object stores.
40集单集
所有剧集
×In this episode, Chang She, CEO and Co-founder of LanceDB, discusses the challenges of handling multimodal data and how LanceDB provides a cutting-edge solution. He shares his journey from contributing to Pandas to building a database optimized for images, video, vectors, and subtitles. Highlights include: - The limitations of traditional storage systems like Parquet for multimodal AI. - How LanceDB enables efficient querying and processing of diverse data types. - The growing importance of multimodal AI in enterprise applications. - Future trends in AI, including a shift from single models to holistic AI systems. - Predictions and "spicy takes" on AI advancements in 2025.…
In this episode, Michele Catasta, President of Replit, explores how AI-driven agents are transforming software development by making coding more accessible and automating application creation. Highlights include: - The difference between AI agents and copilots in software development. - How AI is democratizing coding, enabling non-programmers to build applications. - Challenges in AI agent development, including error handling and software quality. - The growing role of AI in entrepreneurship and business automation. - Why 2025 could be the year of AI agents and what’s next for the industry.…
In this episode, Brandon Cui, Research Scientist at MosaicML and Databricks, dives into cutting-edge advancements in AI model optimization, focusing on Reward Models and Reinforcement Learning from Human Feedback (RLHF). Highlights include: - How synthetic data and RLHF enable fine-tuning models to generate preferred outcomes. - Techniques like Policy Proximal Optimization (PPO) and Direct Preference Optimization (DPO) for enhancing response quality. - The role of reward models in improving coding, math, reasoning, and other NLP tasks. Connect with Brandon Cui: https://www.linkedin.com/in/bcui19/…
In this episode, Andrew Drozdov, Research Scientist at Databricks, explores how Retrieval Augmented Generation (RAG) enhances AI models by integrating retrieval capabilities for improved response accuracy and relevance. Highlights include: - Addressing LLM limitations by injecting relevant external information. - Optimizing document chunking, embedding, and query generation for RAG. - Improving retrieval systems with embeddings and fine-tuning techniques. - Enhancing search results using re-rankers and retrieval diagnostics. - Applying RAG strategies in enterprise AI for domain-specific improvements.…
In this episode, Yev Meyer, Chief Scientist at Gretel AI, explores how synthetic data transforms AI and ML by improving data access, quality, privacy, and model training. Highlights include: - Leveraging synthetic data to overcome AI data limitations. - Enhancing model training while mitigating ethical and privacy risks. - Exploring the intersection of computational neuroscience and AI workflows. - Addressing licensing and legal considerations in synthetic data usage. - Unlocking private datasets for broader and safer AI applications.…
In this episode, Julia Neagu, CEO & co-founder of Quotient AI, explores the challenges of deploying Generative AI and LLMs, focusing on model evaluation, human-in-the-loop systems, and iterative development. Highlights include: - Merging reinforcement learning and unsupervised learning for real-time AI optimization. - Reducing bias in machine learning with fairness and ethical considerations. - Lessons from large-scale AI deployments on scalability and feedback loops. - Automating workflows with AI through successful business examples. - Best practices for managing AI pipelines, from data collection to validation.…
In this episode, Sharon Zhou, Co-Founder and CEO of Lamini AI, shares her expertise in the world of AI, focusing on fine-tuning models for improved performance and reliability. Highlights include: - The integration of determinism and probabilism for handling unstructured data and user queries effectively. - Proprietary techniques like memory tuning and robust evaluation frameworks to mitigate model inaccuracies and hallucinations. - Lessons learned from deploying AI applications, including insights from GitHub Copilot’s rollout. Connect with Sharon Zhou and Lamini: https://www.linkedin.com/in/zhousharon/ https://x.com/realsharonzhou https://www.lamini.ai/…
In this episode, Shashank Rajput, Research Scientist at Mosaic and Databricks, explores innovative approaches in large language models (LLMs), with a focus on Retrieval Augmented Generation (RAG) and its impact on improving efficiency and reducing operational costs. Highlights include: - How RAG enhances LLM accuracy by incorporating relevant external documents. - The evolution of attention mechanisms, including mixed attention strategies. - Practical applications of Mamba architectures and their trade-offs with traditional transformers.…
In this episode, Jure Leskovec, Co-founder of Kumo AI and Professor of Computer Science at Stanford University, discusses Relational Deep Learning (RDL) and its role in automating feature engineering. Highlights include: - How RDL enhances predictive modeling. - Applications in fraud detection and recommendation systems. - The use of graph neural networks to simplify complex data structures.…
Our fifth season dives into large language models (LLMs), from understanding the internals to the risks of using them and everything in between. While we're at it, we'll be enjoying our morning brew. In this session, we interviewed Chengyin Eng (Senior Data Scientist, Databricks), Sam Raymond (Senior Data Scientist, Databricks), and Joseph Bradley (Lead Production Specialist - ML, Databricks) on the best practices around LLM use cases, prompt engineering, and how to adapt MLOps for LLMs (i.e., LLMOps).…
We will dive into LLMs for our fifth season, from understanding the internals to the risks of using them and everything in between. While we’re at it, we’ll be enjoying our morning brew. In this session, we interviewed Omar Khattab - Computer Science Ph.D. Student at Stanford, creator of DSP (Demonstrate–Search–Predict Framework), to discuss DSP, common applications, and the future of NLP.…
We will dive into LLMs for our fifth season, from understanding the internals to the risks of using them and everything in between. While we’re at it, we’ll be enjoying our morning brew. In this session, we interviewed Yaron Singer, CEO of Robust Intelligence, Professor of Computer Science at Harvard University, and guest of Data Brew Season 3 (our first repeat guest!). In this session, we discuss generative AI, the trends toward embracing LLMs, and how the surface area for vulnerabilities in generative AI is much bigger.…
We are back and we will dive into LLMs from understanding the internals to the risks of using them and everything in between. While we’re at it, we’ll be enjoying our morning brew. In this session, we interviewed David Talby who is the CTO at John Snow Labs; they help healthcare & life science companies put AI to good use. David's interests include natural language processing, applied artificial intelligence in healthcare, and responsible AI.…
For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Shayna Powless and Eli Ankou, professional cyclist for L39ion of Los Angeles and defensive tackle for the Buffalo Bills, respectively, provide valuable insight on how professional athletes leverage data to improve their performance and how they combine their passion for sports with the Dreamcatcher Foundation. See more at databricks.com/data-brew…
For our fourth season, we focus on connected health and how data & AI augment and improve our daily health. While we’re at it, we’ll be enjoying our morning brew. Matt Willis, Marin County Public Health Officer, shares the three pillars of public health: education, access, and policy, and the critical role data plays in addressing the COVID-19 pandemic & opioid epidemic. See more at databricks.com/data-brew…
欢迎使用Player FM
Player FM正在网上搜索高质量的播客,以便您现在享受。它是最好的播客应用程序,适用于安卓、iPhone和网络。注册以跨设备同步订阅。