
The Data Warehouse for Distributed Clouds - Yellowbrick
06/29/21 • 37 min
In this episode, we speak with Mark Cusack, CTO at Yellowbrick. Yellowbrick is a data warehouse platform, built from the ground up for performance and cost, that can be deployed across clouds and on-prem.
Top 3 Value Bombs:
- Yellowbrick DW was recently named a contender in Cloud Data Warehouses by Forrester Research, and it is able to achieve 100X the performance at 1/5th the price of many competitors.
- As data production increases exponentially at the “edge”, the need to pre-process the data and keep it in place is becoming critical. The distributed cloud model helps solve this growing problem.
- Yellowbrick was created from the ground up with a focus on performance and cost. A few of its technical features include a custom Linux-based OS kernel, reading data directly from primary storage into the CPU cache, and custom network drivers.
Previous Episode

What You Should Know Before Getting Started With Data Science with DATA SCIENCE I N F I N I T Y
In this episode, we speak with Andrew Jones, who has spent 13 years in Data Science at companies including Amazon and, more recently, Sony PlayStation, where he developed and prototyped Machine Learning based features for the PlayStation 5, several of which have been patented by Sony. Since then, he has created the DATA SCIENCE I N F I N I T Y community to support folks on their data science journey.
Top 3 Value Bombs:
- 85% of AI projects fail, and one of the reasons is going too complex too soon. When solving problems with data science, you should always start with the business problem.
- Having a strong understanding of these foundational data science models will help you solve the majority of data science problems: linear regression, logistic regression, decision trees, and random forest.
- Learning is a journey, not a destination :)
Find out more here: https://data-science-infinity.teachable.com/
Next Episode

Launch, Monitor, and Share Data Pipelines In a Matter of Minutes
In this episode, we speak with Blake Burch, co-founder of Shipyard, a data orchestrator tool that allows you to create powerful workflows in a matter of minutes.
Top 3 Value Bombs:
- Data tests often cover only the assumptions we already know. Many unknowns can crop up and cause issues that tests do not catch. Start analyzing job metadata to alert on potential anomalies.
- Store your raw data to allow the most flexibility when it comes to re-transforming the data.
- Don’t settle for scattershot troubleshooting. Have a clear lineage of how your data is being used, from the source to the various consumers.