Streaming Audio: Apache Kafka® & Real-Time Data
Confluent, founded by the original creators of Apache Kafka®
Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community, ranging from Kafka connectors and distributed systems to data mesh, data integration, and modern data architectures built with Confluent and cloud Kafka as a service. Join our hosts as they stream through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®, Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.
Top 10 Streaming Audio: Apache Kafka® & Real-Time Data Episodes
Goodpods has curated a list of the 10 best Streaming Audio: Apache Kafka® & Real-Time Data episodes, ranked by the number of listens and likes each episode has garnered from our listeners. If you are listening to Streaming Audio: Apache Kafka® & Real-Time Data for the first time, there's no better place to start than with one of these standout episodes. If you are a fan of the show, vote for your favorite Streaming Audio: Apache Kafka® & Real-Time Data episode by adding your comments to the episode page.
Ask Confluent #8: Guozhang Wang on Kafka Streams Standby Tasks
Streaming Audio: Apache Kafka® & Real-Time Data
12/18/18 • 22 min
Gwen is joined in studio by special guest Guozhang Wang, Kafka Streams pioneer and engineering lead at Confluent. He’ll talk to us about standby tasks and how one deserializes message headers. In "Ask Confluent," Gwen Shapira (Data Architect, Confluent) and guests respond to a handful of questions and comments from Twitter, YouTube and elsewhere.
Streaming Call of Duty at Activision with Apache Kafka ft. Yaroslav Tkachenko
Streaming Audio: Apache Kafka® & Real-Time Data
01/27/20 • 46 min
Call of Duty: Modern Warfare is the most played Call of Duty multiplayer of this console generation with over $1 billion in sales and almost 300 million multiplayer matches. Behind the scenes, Yaroslav Tkachenko (Software Engineer and Architect, Activision) gets to be on the team behind it all, architecting, designing, and implementing their next-generation event streaming platform, including a large-scale, near-real-time streaming data pipeline using Kafka Streams and Kafka Connect.
Learn about how his team ingests huge amounts of data, what the backend of their massive distributed system looks like, and the automated services involved for collecting data from each pipeline.
Ask Confluent #16: ksqlDB Edition
Streaming Audio: Apache Kafka® & Real-Time Data
12/12/19 • 30 min
Vinoth Chandar has led various infrastructure projects at Uber and is one of the main drivers behind the ksqlDB project. In this episode hosted by Gwen Shapira (Engineering Manager, Cloud-Native Apache Kafka®), Vinoth and Gwen discuss what ksqlDB is, the kinds of applications that you can build with it, vulnerabilities, and various ksqlDB use cases. They also talk about which version of Apache Kafka is currently the best choice for performance improvements that don’t cause breaking changes to existing Kafka configuration and functionality.
EPISODE LINKS
- Read about ksqlDB on the blog
- Learn more about ksqlDB
- ksqlDB Demo | The Event Streaming Database in Action
- Follow ksqlDB on Twitter
- What’s New in Apache Kafka 2.3
- What is Apache Kafka?
- Watch the video version of this podcast
- Join the Confluent Community Slack
- Fully managed Apache Kafka as a service! Try free.
Apache Kafka 2.7 - Overview of Latest Features, Updates, and KIPs
Streaming Audio: Apache Kafka® & Real-Time Data
12/21/20 • 10 min
Apache Kafka® 2.7 is here! Here are the key Kafka Improvement Proposals (KIPs) and updates in this release, presented by Tim Berglund.
KIP-497 adds a new inter-broker API to alter in-sync replicas (ISRs). Every partition leader maintains the ISR list, the set of replicas that are caught up with it. KIP-497 is also related to the removal of ZooKeeper.
KIP-599 throttles the rate of creating topics, deleting topics, and creating partitions. This KIP adds a new feature called the controller mutation rate.
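To make the idea of a mutation rate limit more concrete, here is a minimal token-bucket sketch of rate throttling in general. It is an illustration only, not Kafka's controller code: the `TokenBucket` class, its parameters, and the choice to charge one token per partition are all hypothetical.

```python
class TokenBucket:
    """Generic token bucket: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate: float, burst: float, now: float = 0.0) -> None:
        self.rate = rate      # tokens refilled per second (hypothetical quota)
        self.burst = burst    # maximum tokens the bucket can hold
        self.tokens = burst   # start with a full bucket
        self.last = now       # timestamp of the last refill, in seconds

    def try_acquire(self, cost: float, now: float) -> bool:
        """Accept a request costing `cost` tokens, or reject it if too few remain."""
        # Refill for the time elapsed since the last call, capped at `burst`.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate=2.0, burst=10.0)
    # Assumption for illustration: creating a topic costs one token per partition.
    print(bucket.try_acquire(cost=8, now=0.0))  # True: 10 tokens available
    print(bucket.try_acquire(cost=8, now=1.0))  # False: only 4 tokens after refill
    print(bucket.try_acquire(cost=8, now=3.0))  # True: refilled to 8 tokens
```

Passing the clock in as `now` keeps the sketch deterministic; a real limiter would read a monotonic clock instead.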
KIP-612 adds the ability to limit the connection creation rate on brokers, while KIP-651 supports the PEM format for SSL certificates and private keys.
The release of Kafka 2.7 furthermore includes end-to-end latency metrics and sliding windows.
Find out what’s new with the Kafka broker, producer, and consumer, and what’s new with Kafka Streams in today’s episode of Streaming Audio!
EPISODE LINKS
- Read about what’s new in Apache Kafka 2.7
- Check out the Apache Kafka 2.7 release notes
- Watch the video version of this podcast
- Join the Confluent Community Slack
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Kafka streaming in 10 minutes on Confluent Cloud
- Use 60PDCAST to get an additional $60 of free Confluent Cloud usage (details)
Mastering DevOps with Apache Kafka, Kubernetes, and Confluent Cloud ft. Rick Spurgeon and Allison Walther
Streaming Audio: Apache Kafka® & Real-Time Data
12/22/20 • 46 min
How do you use Apache Kafka®, Confluent Platform, and Confluent Cloud for DevOps? Integration Architects Rick Spurgeon and Allison Walther share how, including a custom tool they’ve developed for this very purpose.
First, Rick and Allison share their perspective on what it means to be a DevOps engineer: mixing development and operations skills to deploy, manage, monitor, audit, and maintain distributed systems. DevOps is multifaceted and can be compared to glue, in that you’re joining software, services, databases, Kafka, and more to integrate end-to-end solutions.
Using the Confluent Cloud Metrics API (actionable operational metrics), you can pull a wide range of metrics about your cluster, a topic, or a partition, such as bytes, records, and requests. The Metrics API is unique in that it is queryable. You can ask the API a question such as “What's the max retained bytes per hour over 10 hours for my topic or my cluster?” and get the answer directly.
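As a rough illustration of what "queryable" could look like, the sketch below builds a JSON query body for the question quoted above. The field names (`aggregations`, `filter`, `granularity`, `intervals`), the metric name, the `MAX` aggregation, and the cluster ID `lkc-example` are assumptions modeled on the general shape of such query documents, not a verified schema, and they may differ between API versions. Check the Confluent Cloud Metrics API reference before sending anything.

```python
import json
from datetime import datetime, timedelta, timezone


def build_retained_bytes_query(cluster_id: str, end: datetime, hours: int = 10) -> dict:
    """Build a query body asking for max retained bytes per hour.

    Every field and metric name below is an assumption for illustration.
    """
    start = end - timedelta(hours=hours)
    # ISO-8601 interval "start/end" covering the last `hours` hours.
    interval = f"{start.isoformat()}/{end.isoformat()}"
    return {
        "aggregations": [
            {"metric": "io.confluent.kafka.server/retained_bytes", "agg": "MAX"}
        ],
        "filter": {"field": "resource.kafka.id", "op": "EQ", "value": cluster_id},
        "granularity": "PT1H",  # one data point per hour
        "intervals": [interval],
    }


if __name__ == "__main__":
    end = datetime(2020, 12, 22, 10, 0, tzinfo=timezone.utc)
    # Print the body only; no request is sent in this sketch.
    print(json.dumps(build_retained_bytes_query("lkc-example", end), indent=2))
```

The function only assembles the payload, so it can be inspected or unit-tested without credentials; sending it would additionally require an API key and the endpoint from the official documentation.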
To make writing operators much easier, Rick and Allison also discuss Crossplane, KUDO, and Shell-operator, and how to use these tools.
EPISODE LINKS
- Confluent Cloud Metrics API
- Shell Operator
- DevOps for Apache Kafka
- The Kubernetes Universal Declarative Operator
- Introducing the AWS Controllers for Kubernetes (ACK)
- Manage any infrastructure your applications need directly from Kubernetes with Crossplane
- DevOps for Apache Kafka with Kubernetes and GitOps
- Spring Your Microservices into Production with Kubernetes and GitOps
- Join the Confluent Community Slack
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Kafka streaming in 10 minutes on Confluent Cloud
- Use 60PDCAST to get an additional $60 of free Confluent Cloud usage (details)
Tales From the Frontline of Apache Kafka DevOps ft. Jason Bell
Streaming Audio: Apache Kafka® & Real-Time Data
12/02/20 • 60 min
Kafka Summit SF 2018 Panel | Microsoft, Slack, Confluent, University of Cambridge
Streaming Audio: Apache Kafka® & Real-Time Data
10/18/18 • 34 min
Neha Narkhede (Co-founder and CTO, Confluent) leads a panel discussion at Kafka Summit SF 2018 with Kevin Scott (CTO, Microsoft), Julia Grace (Head of Infrastructure Engineering, Slack), Martin Kleppmann (Researcher, U. of Cambridge), and Jay Kreps (Co-founder and CEO, Confluent).
Real-Time Data Transformation and Analytics with dbt Labs
Streaming Audio: Apache Kafka® & Real-Time Data
02/22/23 • 43 min
dbt is known as being part of the Modern Data Stack for ELT processes. Being in the MDS, dbt Labs believes in having the best of breed for every part of the stack. Oftentimes folks are using an EL tool like Fivetran to pull data from the database into the warehouse, then using dbt to manage the transformations in the warehouse. Analysts can then build dashboards on top of that data, or execute tests.
It’s possible for an analyst to adapt this process for use with a microservice application using Apache Kafka® and the same method to pull batch data out of each and every database; however, in this episode, Amy Chen (Partner Engineering Manager, dbt Labs) tells Kris about a better way forward for analysts willing to adopt the streaming mindset: Reusable pipelines using dbt models that immediately pull events into the warehouse and materialize as materialized views by default.
dbt Labs is the company that makes and maintains dbt. dbt Core is the open-source data transformation framework that allows data teams to operate with software engineering’s best practices. dbt Cloud is the fastest and most reliable way to deploy dbt.
Inside the world of event streaming, there is a push to expand data access beyond the programmers writing the code, and towards everyone involved in the business. Over at dbt Labs they’re attempting something of the reverse: to get data analysts to adopt the best practices of software engineers, and more recently, of streaming programmers. They’re improving the process of building data pipelines while empowering businesses to bring more contributors into the analytics process, with an easy-to-deploy, easy-to-maintain platform. It offers version control to analysts who traditionally don’t have access to git, along with the ability to easily automate testing, all in the same place.
In this episode, Kris and Amy explore:
- How to revolutionize testing for analysts with two of dbt’s core functionalities
- What streaming in a batch-based analytics world should look like
- What can be done to improve workflows
- How to democratize access to data for everyone in the business
EPISODE LINKS
- Learn more about dbt Labs
- An Analytics Engineer’s Guide to Streaming
- Panel discussion: If Streaming Is the Answer, Why Are We Still Doing Batch?
- All Current 2022 sessions and slides
- Watch the video version of this podcast
- Kris Jenkins’ Twitter
- Streaming Audio Playlist
- Join the Confluent Community
- Learn more with Kafka tutorials, resources, and guides at Confluent Developer
- Live demo: Intro to Event-Driven Microservices with Confluent
- Use PODCAST100 to get an additional $100 of free Confluent Cloud usage (details)
Kafka Streams in Action with Bill Bejeck
Streaming Audio: Apache Kafka® & Real-Time Data
09/27/18 • 49 min
Tim Berglund interviews Bill Bejeck about the Kafka Streams API and his new book, Kafka Streams in Action.
FAQ
How many episodes does Streaming Audio: Apache Kafka® & Real-Time Data have?
Streaming Audio: Apache Kafka® & Real-Time Data currently has 270 episodes available.
What topics does Streaming Audio: Apache Kafka® & Real-Time Data cover?
The podcast is about Open Source, Cloud, Data, How To, Podcasts, Technology and Education.
What is the most popular episode on Streaming Audio: Apache Kafka® & Real-Time Data?
The episode titled 'Benchmarking Apache Kafka Latency at the 99th Percentile ft. Anna Povzner' is the most popular.
What is the average episode length on Streaming Audio: Apache Kafka® & Real-Time Data?
The average episode length on Streaming Audio: Apache Kafka® & Real-Time Data is 37 minutes.
How often are episodes of Streaming Audio: Apache Kafka® & Real-Time Data released?
Episodes of Streaming Audio: Apache Kafka® & Real-Time Data are typically released every 7 days.
When was the first episode of Streaming Audio: Apache Kafka® & Real-Time Data?
The first episode of Streaming Audio: Apache Kafka® & Real-Time Data was released on Jun 20, 2018.