
Content provided by Confluent, founded by the original creators of Apache Kafka®. All podcast content (including episodes, graphics, and podcast descriptions) is uploaded and provided directly by Confluent or its podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://zh.player.fm/legal

Top 6 Worst Apache Kafka JIRA Bugs

1:10:58
 

Entomophiliac Anna McDonald (Principal Customer Success Technical Architect, Confluent) has seen her fair share of Apache Kafka® bugs. For her annual holiday roundup of the most noteworthy Kafka bugs, Anna tells Kris Jenkins about some of the scariest, most surprising, and most enlightening corner cases that make you ask, “Ah, so that’s how it really works?”
She shares a lot of interesting details about how batching works, the replication protocol, how Kafka’s networking stack dances with Linux’s, and which Scala class is the most important to read if you’re only going to read one.
In particular, Anna gives Kris details about a bug that he’s been thinking about lately: the sticky partitioner (KAFKA-10888). When a Kafka producer sends several records to the same partition at around the same time, the partition can get overloaded. As a result, if too many records get processed at once, they can get stuck, causing an unbalanced workload. Anna goes on to explain that the fix required keeping track of the number of offsets/messages written to each partition, and then batching to force more balanced distributions.
She found another bug that occurs when the Kafka server triggers TCP congestion control under some conditions (KAFKA-9648). Anna explains that when a Kafka server restarts and then runs the preferred replica leader election, the many resulting leader changes trigger cluster metadata updates. All clients then try to connect to the server at the same time, leaving lots of TCP requests waiting in the TCP SYN queue.
The third bug she talks about (KAFKA-9211) may cause TCP delays after upgrading…. Oh, that’s a nasty one. She goes on to tell Kris about a rare bug (KAFKA-12686) in Partition.scala where there’s a race condition between the handling of an AlterIsrResponse and a LeaderAndIsrRequest. This scenario arises when an AlterIsrResponse is delayed while lots of ISR and leadership changes are occurring due to broker restarts.
Bugs five (KAFKA-12964) and six (KAFKA-14334) are no better, but you’ll have to plug in your headphones and listen in to explore the ghoulish adventures of Anna McDonald as she gives a nightmarish peek into her world of JIRA bugs. It’s just what you might need this holiday season!
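As a rough illustration of the sticky partitioner fix described above (tracking how much has been written to each partition and switching to keep the distribution balanced), here is a toy Python sketch. It is not the Kafka client's implementation: the class name `CountingStickyPartitioner`, the `records_per_stick` quota, and the round-robin choice of the next partition are all assumptions made for the example.

```python
from collections import Counter


class CountingStickyPartitioner:
    """Toy model: stay on one partition until a fixed quota of records is written.

    Hypothetical illustration only; names and the switching rule are assumptions,
    not the Kafka producer's actual logic.
    """

    def __init__(self, num_partitions: int, records_per_stick: int) -> None:
        if num_partitions < 1 or records_per_stick < 1:
            raise ValueError("num_partitions and records_per_stick must be >= 1")
        self.num_partitions = num_partitions
        self.records_per_stick = records_per_stick
        self.current = 0                # partition we are currently stuck to
        self.written_on_current = 0     # records written since the last switch
        self.totals = Counter()         # records written per partition overall

    def partition(self) -> int:
        """Return the partition for the next record."""
        # Switch only once the quota for the current partition is used up,
        # so every partition receives the same amount before we move on.
        if self.written_on_current >= self.records_per_stick:
            self.current = (self.current + 1) % self.num_partitions
            self.written_on_current = 0
        self.written_on_current += 1
        self.totals[self.current] += 1
        return self.current


def simulate(num_records: int, num_partitions: int, records_per_stick: int) -> dict:
    """Send num_records through the toy partitioner and return per-partition counts."""
    partitioner = CountingStickyPartitioner(num_partitions, records_per_stick)
    for _ in range(num_records):
        partitioner.partition()
    return dict(partitioner.totals)


if __name__ == "__main__":
    print(simulate(num_records=30, num_partitions=3, records_per_stick=2))
```

Because the switch depends on how much was written rather than on when a batch happens to complete, a slow partition cannot accumulate extra records in this model. The episode itself covers how the real fix works.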

Chapters

1. Intro (00:00:00)

2. Kafka JIRA-10888: The sticky partitioner (00:03:55)

3. Kafka JIRA-9648: SYN cookies with evil frosting (00:17:36)

4. Kafka JIRA-9211: TCP delays after upgrading (00:26:37)

5. Kafka JIRA-12686: Attack of the overloaded cluster (00:31:34)

6. Kafka JIRA-12964: A Killer From the Past Strikes When You Least Expect It (00:45:09)

7. Kafka JIRA-14334: Whoops! I forgot to buy you a gift by Christmas (00:54:50)

8. It's a wrap! (01:09:11)

265 episodes
