Data Engineering Weekly - DEW #129: DoorDash's Generative AI, Europe data salary, Data Validation with Great Expectations, Expedia's Event Sourcing

05/27/23 • 31 min

Data Engineering Weekly

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives.

For DEW #129, we selected the following articles:

DoorDash identifies Five big areas for using Generative AI

Generative AI has taken the industry by storm, and every company is trying to determine what it means for them. DoorDash writes about its exploration of Generative AI and how it could be applied to boost its business. It identifies five areas:

  1. The assistance of customers in completing tasks
  2. Better tailored and interactive discovery [Recommendation]
  3. Generation of personalized content and merchandising
  4. Extraction of structured information
  5. Enhancement of employee productivity

https://doordash.engineering/2023/04/26/doordash-identifies-five-big-areas-for-using-generative-ai/

Mikkel Dengsøe: Europe data salary benchmark 2023

Fascinating findings on data salaries across European countries. The key findings are:

  1. Germany-based roles pay less.
  2. London and Dublin-based roles have the highest compensation. The Dublin sample is skewed toward senior roles, with 55% of reported salaries being senior, so the Dublin figure says more about the sample than about jobs in Dublin paying more than jobs in London.
  3. Jobs at the 75th percentile in Amsterdam, London, and Dublin pay nearly 50% more than those in Berlin.

https://medium.com/@mikldd/europe-data-salary-benchmark-2023-b68cea57923d

Trivago: Implementing Data Validation with Great Expectations in Hybrid Environments

The article by Trivago discusses the integration of data validation with Great Expectations. It presents a well-balanced case study that emphasizes the significance of data validation and the necessity for sophisticated statistical validation methods.

https://tech.trivago.com/post/2023-04-25-implementing-data-validation-with-great-expectations-in-hybrid-environments.html
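To make the distinction between rule-based and statistical validation concrete, here is a minimal sketch in plain Python. It does not use the Great Expectations API and is not taken from the Trivago article; the function names, the sample data, and the three-sigma threshold are all assumptions chosen for illustration.

```python
from statistics import mean, stdev
from typing import Optional, Sequence


def check_not_null(values: Sequence[Optional[float]]) -> bool:
    """Rule-based check: every value in the batch must be present."""
    return all(v is not None for v in values)


def check_mean_within(
    values: Sequence[float],
    reference: Sequence[float],
    max_sigma: float = 3.0,  # hypothetical threshold, tune per dataset
) -> bool:
    """Statistical check: the batch mean must lie within max_sigma
    reference standard deviations of the reference mean."""
    ref_mean = mean(reference)
    ref_std = stdev(reference)
    return abs(mean(values) - ref_mean) <= max_sigma * ref_std


# Hypothetical data: a trusted reference sample and two new batches.
reference_prices = [100.0, 102.0, 98.0, 101.0, 99.0]
good_batch = [100.5, 99.5, 101.0]
bad_batch = [150.0, 160.0, 155.0]

for name, batch in [("good", good_batch), ("bad", bad_batch)]:
    ok = check_not_null(batch) and check_mean_within(batch, reference_prices)
    print(f"{name} batch passes validation: {ok}")
```

The rule-based check catches missing values, while the statistical check catches a batch that is complete but whose distribution has drifted away from the reference.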

Expedia: How Expedia Reviews Engineering Is Using Event Streams as a Source Of Truth

“Events as a source of truth” is a simple but powerful idea: persist the state of a business entity as a sequence of state-changing events. How do you build such a system? Expedia writes about its review stream system to demonstrate how it adopted the event-first approach.

https://medium.com/expedia-group-tech/how-expedia-reviews-engineering-is-using-event-streams-as-a-source-of-truth-d3df616cccd8
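The idea of deriving state from a sequence of events can be sketched in a few lines of Python. This is an illustrative toy, not Expedia's implementation: the `ReviewEvent` schema, the event kinds, and the in-memory list standing in for an event stream are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class ReviewEvent:
    """One state-changing event for a review (hypothetical schema)."""
    review_id: str
    kind: str  # "submitted", "edited", or "removed"
    text: Optional[str] = None


@dataclass(frozen=True)
class ReviewState:
    """Current state of a review, derived purely from its events."""
    review_id: str
    text: Optional[str] = None
    removed: bool = False


def apply(state: ReviewState, event: ReviewEvent) -> ReviewState:
    """Return the new state that results from applying one event."""
    if event.kind in ("submitted", "edited"):
        return ReviewState(state.review_id, event.text, state.removed)
    if event.kind == "removed":
        return ReviewState(state.review_id, state.text, True)
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(review_id: str, events: Iterable[ReviewEvent]) -> ReviewState:
    """Rebuild a review's state by folding over its events in order."""
    state = ReviewState(review_id)
    for event in events:
        if event.review_id == review_id:
            state = apply(state, event)
    return state


# An append-only log standing in for the event stream (assumption).
log = [
    ReviewEvent("r1", "submitted", "Great hotel"),
    ReviewEvent("r2", "submitted", "Too noisy"),
    ReviewEvent("r1", "edited", "Great hotel, slow check-in"),
    ReviewEvent("r2", "removed"),
]

print(replay("r1", log))
print(replay("r2", log))
```

Because the log is never mutated, replaying a prefix of it (for example `log[:1]`) reconstructs the state as it was at that point in time, which is one of the main attractions of treating events as the source of truth.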


This is a public episode. If you would like to discuss this with other subscribers or get access to bonus episodes, visit www.dataengineeringweekly.com

Next Episode

DEW #131: dbt model contract, Instacart ads modularization in LakeHouse Architecture, Jira to automate Glue tables, Server-Side Tracking

Welcome to another episode of Data Engineering Weekly. Aswin and I select 3 to 4 articles from each edition of Data Engineering Weekly and discuss them from the author’s and our perspectives.

For DEW #131, we selected the following articles:

Ramon Marrero: DBT Model Contracts - Importance and Pitfalls

dbt introduced model contracts with the 1.5 release. The implementation drew some criticism, such as The False Promise of dbt Contracts. I found the argument made in that piece surprising, especially the comment below.

“As a model owner, if I change the columns or types in the SQL, it's usually intentional.” My immediate reaction was: hmm, not really.

However, as with the first iteration of any system, the dbt model contract implementation has pros and cons. I’m sure it will evolve as adoption increases. The author did an amazing job of writing a balanced view of dbt model contracts.

https://medium.com/geekculture/dbt-model-contracts-importance-and-pitfalls-20b113358ad7
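A model contract, at its core, compares the columns and data types a model declares against what the model actually produces. The sketch below shows that comparison in plain Python. It is not dbt's implementation or API; the function name, the dictionary representation of a schema, and the sample columns are assumptions for illustration.

```python
from typing import Dict, List


def contract_violations(declared: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Compare a declared schema (column -> type) with the actual one
    and return a human-readable list of differences."""
    problems: List[str] = []
    for column, dtype in declared.items():
        if column not in actual:
            problems.append(f"missing column: {column}")
        elif actual[column] != dtype:
            problems.append(
                f"type mismatch on {column}: expected {dtype}, got {actual[column]}"
            )
    for column in actual:
        if column not in declared:
            problems.append(f"unexpected column: {column}")
    return problems


# Hypothetical contract and the schema a changed SQL model now produces.
declared_schema = {"order_id": "int", "amount": "numeric", "created_at": "timestamp"}
actual_schema = {"order_id": "int", "amount": "text", "note": "text"}

for problem in contract_violations(declared_schema, actual_schema):
    print(problem)
```

A check like this is what turns an accidental column rename or type change into a build failure instead of a silent break for downstream consumers, which is why "it's usually intentional" is a risky assumption.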

Instacart: How Instacart Ads Modularized Data Pipelines With Lakehouse Architecture and Spark

Instacart writes about its journey of building its ads measurement platform. A few things stand out for me in the blog.

The event store is moving from S3/Parquet storage to Delta Lake storage, a sign of lakehouse format adoption across the board.

Instacart's adoption of the Databricks ecosystem alongside Snowflake.

The move to rewrite SQL into a composable Spark SQL pipeline for better readability and testing.

https://tech.instacart.com/how-instacart-ads-modularized-data-pipelines-with-lakehouse-architecture-and-spark-e9863e28488d

Timo Dechau: The extensive guide for Server-Side Tracking

The blog is an excellent overview of server-side event tracking. The author highlights how event tracking is always closer to the UI flow than to the business flow, and everything that can go wrong with frontend event tracking. A must-read article if you’re passionate about event tracking like me.

Credit Saison: Using Jira to Automate Updations and Additions of Glue Tables

This schema change could’ve been a JIRA ticket!

I found the article to be an excellent example of workflow automation on top of the familiar ticketing system, JIRA. The blog narrates the challenges with Glue Crawler and how selectively applying database change management through JIRA helped overcome the technical debt of running a custom crawler for 6+ hours.

https://medium.com/credit-saison-india/using-jira-to-automate-updations-and-additions-of-glue-tables-58d39adf9940

