
InfluxDB 3 & Rust
11/08/23 • 56 min
InfluxDB 3.0 Rewrite
InfluxDB, a time series database, underwent a major rewrite to create InfluxDB 3.0, also known as IOx. The rewrite was driven by the need for strict control over memory management and for high performance. The project started as a research effort and gradually gained traction within the company. The team chose to build on Apache Software Foundation projects such as Apache Arrow and Apache DataFusion. In April 2022, InfluxDB 3.0 was officially announced, aiming to improve performance, scalability, and cost-effectiveness for users.
IOx Database Engine
The new database engine, IOx, is designed to handle various types of observability and monitoring data, including metrics, traces, and logs. It aims to provide a single store for all these signals, eliminating the need for separate databases. However, querying the data efficiently is still a challenge that the team is working on. The goal is to make IOx the go-to solution for storing and querying observational data, not only for server infrastructure monitoring but also for sensor data use cases.
Challenges and Considerations
Working with logs, tracing, and structured events in time series databases poses challenges. The dynamic and inconsistent nature of schemas in logs and tracing use cases can make extracting structured fields difficult. Time series databases also have limitations in handling tracing front ends and require an index to map trace IDs to individual traces. While metrics, logs, and traces are the gold standard for observability, there is room for improvement in terms of usability and performance.
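As a rough illustration of the trace ID index mentioned above, here is a minimal Python sketch. It is not how IOx works: the Span and SpanStore names, the fields, and the in-memory layout are all assumptions made for this example. It only shows why a secondary index helps, since spans arrive ordered by time while a tracing front end asks for them by trace ID.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical span record; field names are illustrative only."""
    timestamp_ns: int
    trace_id: str
    span_id: str
    name: str


class SpanStore:
    """Toy time-ordered span store with a secondary trace ID index."""

    def __init__(self):
        self._rows = []                      # spans in arrival (time) order
        self._by_trace = defaultdict(list)   # trace_id -> positions in _rows

    def append(self, span):
        # Record the row position under the span's trace ID, then store the span.
        self._by_trace[span.trace_id].append(len(self._rows))
        self._rows.append(span)

    def trace(self, trace_id):
        # Look up row positions through the index instead of scanning every row.
        positions = self._by_trace.get(trace_id, [])
        spans = (self._rows[i] for i in positions)
        return sorted(spans, key=lambda s: s.timestamp_ns)


store = SpanStore()
store.append(Span(100, "t1", "a", "GET /checkout"))
store.append(Span(105, "t2", "b", "GET /health"))
store.append(Span(110, "t1", "c", "SELECT orders"))

names = [s.name for s in store.trace("t1")]
print(names)
```

Without the index, assembling one trace would mean scanning every span in the time range, which is the limitation the paragraph above describes.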
Flux and DataFusion
Flux, a scripting language developed for InfluxDB 2.0, addresses user requests for more complex query logic and integration with third-party systems. InfluxDB 3.0 incorporates a parser written in Rust that translates SQL queries into a DataFusion query plan, so queries benefit from DataFusion's performance optimizations. However, bringing Flux to InfluxDB 3.0 proved challenging because of Flux's large surface area and the team's limited time and resources. Updating the Flux engine to use the 3.0 native API could potentially resolve these issues.
InfluxDB Development and Open Source Licensing
InfluxData is focused on improving the core query engine of InfluxDB and enhancing its capabilities and performance. They have created a separate community fork of Flux to allow collaboration on its development. Paul Dix, the co-founder, believes that true open source should be about freedom and expresses his intention to keep InfluxDB 3 as a permissively licensed project. He discusses the recent license change by HashiCorp and the growing distrust in the developer community towards VC-backed open source projects. Putting InfluxDB into a foundation may not be feasible due to the lack of multiple contributors.
Previous Episode

Trust and Validation in AI
Here are 5 key takeaways from this episode that you don't want to miss:
1️⃣ The People Problem: Laura Santamaria raises an important concern about verifying AI-generated outputs and tackling the challenge of the "people problem" in AI development.
2️⃣ Verifying Data Authenticity: JJ discusses the challenge of proving that a data blob originated from a specific model and how this issue is being addressed by companies like IBM through pile cleaning and legal penalties.
3️⃣ AI Misconceptions: We debunk some common misconceptions about AI, including the belief that it is an all-knowing fact machine.
4️⃣ Trusted AI: IBM's approach to building trusted models, with dedicated engineers responsible for cleaning and verifying data, is explained. Plus, we discover IBM's partnerships with Hugging Face to leverage the open-source ecosystem.
5️⃣ The Impact of AI: We delve into the potential positive and negative implications of AI, and how the rapid advancement of this technology presents challenges with trust and validation.
💡 Fun Fact: Did you know that 95% of open-source language models are trained on a data set called "the pile," which contains pirated and copyrighted material? Discover why this has implications for copyright and patent laws!
As always, the conversation in this episode is engaging and eye-opening. JJ Asghar provides insightful perspectives and sheds light on the future of AI development. Don't miss out on the valuable information shared!
Questions We Covered
1. How can the problem of untrusted data in AI models be effectively addressed?
2. Should companies like OpenAI and Microsoft be required to provide their data sets for verification purposes? Why or why not?
3. What are the potential risks and challenges associated with using AI technology without proper regulation?
4. Should AI creations be eligible for copyright protection? Why or why not?
5. How can we ensure the accuracy and trustworthiness of AI-generated data, especially when it comes to extracting information from sources like PDFs?
6. What are some potential positive impacts of AI technology, and how can we maximize its benefits while minimizing its negative implications?
7. How can the rapid advancement of AI technology be balanced with the need for trust and validation?
8. In what ways do copyright and patent laws need to evolve to accommodate AI technology?
9. What are the implications of China having its own set of laws and approaches to technology that may differ from other countries?
10. How can individuals navigate and better understand the AI space in order to make informed decisions and contributions?
Next Episode

From Kubernetes to Cloud Run: Chainguard's Journey
Exploring Cloud Migrations & Infrastructure Strategies with Jason Hall of Chainguard
In this episode of the Cloud Native Compass podcast, hosts David Flanagan and Laura Santamaria chat with Jason Hall, Principal Engineer at Chainguard. They delve into Chainguard's migration from Kubernetes and Knative to Cloud Run, discussing the reasons behind the move, cost considerations, managing technical debt, and best practices for infrastructure management. The conversation also covers the benefits of using Cloud Run, their strategic use of BigQuery for event logging, and insights into least access security models. Tune in to learn more about navigating cloud-native environments and optimizing infrastructure.
Creators & Guests
- David Flanagan - Host
- Laura Santamaria - Host
- Jason Hall - Guest
- (00:00) - Introduction
- (00:52) - Jason Does Stuff
- (01:32) - Chainguard's Migration Journey
- (02:18) - Challenges with Kubernetes and Knative
- (04:33) - Adopting Cloud Run
- (12:15) - Multi-Region Deployment with Cloud Run
- (19:26) - Security and Authorization Practices
- (27:29) - Operational Decisions and Cost Considerations
- (33:07) - Debunking Kubernetes Myths
- (33:24) - The Illusion of Free Services
- (33:42) - Scaling Challenges and Solutions
- (37:00) - Convincing Leadership to Address Technical Debt
- (39:41) - Developer Environments in the Cloud
- (43:18) - Cloud Run vs. BigQuery Debate
- (47:20) - Security and Logging Best Practices
- (52:56) - Future Plans and Focus Areas
- (54:45) - Final Thoughts and Farewells