
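Content provided by Thomas Wang. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Thomas Wang or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://zh.player.fm/legal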

Ep11. Designing Data-Intensive Applications - Partitioning

33:46
 

In this episode we discuss our study notes on the partitioning chapter of Designing Data-Intensive Applications.

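🔴 This episode leans toward technical topics, and we use a lot of English for technical terms. Some listeners have told us that mixing Chinese and English makes the show harder to follow, so we ask for the understanding of anyone who minds. If anything is inaccurate or out of date, corrections are welcome.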

# Show Notes

  • 📕 Designing Data-Intensive Applications
  • What is partitioning?
    • A partition is a division of a logical database or its constituent elements into distinct independent parts.
  • Main reason: scalability - the query load can be distributed across many processors.
  • YouTube / Vitess scaling story
    • Single MySQL → add read replicas → writes can't keep up → partition
  • How to partition?
  • Partitioning by Key Range (e.g., Bigtable)
    • Assign a continuous range of keys to each partition
    • Pros: range scans are easier, data locality
    • Con: certain access patterns can lead to hot spots (e.g., timestamp keys, where all current writes hit the same partition)
    • Con: finding split points and managing rebalancing is hard
  • Partitioning by Hash
    • Good hash function: uniformly distribute keys
    • Con: no easy range queries
  • Cassandra does KKV (partition key, sort key, value): only the partition key is hashed, and rows within a partition are kept sorted by the sort key
  • Hot spots: 3% of Twitter's Servers Dedicated to Justin Bieber
  • Secondary indexes: Local index
    • Efficient writes, expensive reads (a query must scatter/gather across all partitions)
    • Example: Elasticsearch
  • Secondary indexes: Global index
  • Rebalancing partitions
    • Move load from one node to another
  • Fixed number of partitions
    • New node steals partitions from every existing node
  • Notion: 480 partitions
  • Dynamic partitioning
    • 📈: split partition into 2
    • 📉: merge 2 partitions into 1
  • Fixed number of partitions per node
  • Operations: fully automatic (dangerous) / semi-automatic / fully manual (tedious)
  • Request Routing
    • 3 approaches: nodes talk to each other, separate routing tier, smart client
    • Separate coordination service such as ZooKeeper
  • Notes by xg
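
The trade-off between "Partitioning by Key Range" and "Partitioning by Hash" in the notes above can be made concrete with a small sketch. This is only an illustration under stated assumptions: the split points in `BOUNDARIES`, the partition count, the choice of MD5 as the hash, and all function names are hypothetical and do not come from the episode or from any particular database.

```python
import bisect
import hashlib

# Hypothetical split points (an assumption for illustration).
# Partition 0 holds keys < "g", partition 1 holds ["g", "n"),
# partition 2 holds ["n", "t"), partition 3 holds keys >= "t".
BOUNDARIES = ["g", "n", "t"]
NUM_HASH_PARTITIONS = 4  # fixed partition count, also an assumption


def range_partition(key: str) -> int:
    """Key-range partitioning: binary-search the sorted split points."""
    return bisect.bisect_right(BOUNDARIES, key)


def hash_partition(key: str) -> int:
    """Hash partitioning: a stable hash spreads keys but discards key order."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_HASH_PARTITIONS


def partitions_for_range_scan(start: str, end: str) -> list[int]:
    """With key ranges, a scan over [start, end] touches only adjacent partitions."""
    return list(range(range_partition(start), range_partition(end) + 1))


if __name__ == "__main__":
    for key in ["apple", "grape", "melon", "zebra"]:
        print(key, range_partition(key), hash_partition(key))
    print(partitions_for_range_scan("h", "m"))
```

With key ranges, a scan from "h" to "m" stays inside one partition. With hashing, neighbouring keys may land on any partition, so the same scan would generally have to ask all of them, which is the "no easy range queries" cost noted above.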
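
The "Fixed number of partitions" strategy above, where a new node steals partitions from existing nodes, can be sketched as follows. This is a minimal sketch, not how any specific system implements rebalancing: the function name `add_node`, the node names, and the rule of always taking from the most loaded node are assumptions made for illustration.

```python
from collections import Counter


def add_node(assignment: dict[int, str], new_node: str) -> dict[int, str]:
    """Return a new partition -> node map after new_node joins the cluster.

    The number of partitions never changes. The new node takes whole
    partitions from the currently most loaded nodes until it holds its
    fair share; every other partition stays where it is.
    """
    result = dict(assignment)
    load = Counter(result.values())  # partitions per existing node
    fair_share = len(result) // (len(load) + 1)
    for _ in range(fair_share):
        # Pick the most loaded existing node (ties broken by name).
        donor = max(load, key=lambda node: (load[node], node))
        # Hand over that node's lowest-numbered partition.
        partition = min(p for p, node in result.items() if node == donor)
        result[partition] = new_node
        load[donor] -= 1
    return result


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]
    before = {p: nodes[p % len(nodes)] for p in range(12)}
    after = add_node(before, "node-d")
    moved = [p for p in before if before[p] != after[p]]
    print(Counter(after.values()), moved)
```

Only the stolen partitions move between nodes. Keys never change partition, which is the main appeal of fixing the partition count up front.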

# Contact


