
From Statecraft to Codebreaking: The Big Data Origin Story with Chris Wiggins, Chief Data Scientist at The New York Times
12/18/24 • 45 min
If you’re a history buff in the data world, you know that there’s a complex interplay between data, statecraft, and machine learning. The history of data visualization is entwined with societal governance and technological advancement, from the use of statistics for statecraft in the 18th century to the innovations of World War II that gave rise to computation and data science as we know it. And because subjective design choices underpin data gathering and analysis, deciding what data to collect and how to use it is inherently political, which is critical to understanding both historical and contemporary data practices.
In modern applications of data science, AI technologies such as deep reinforcement learning and generative AI models are reshaping the field by enabling computers to process and interact with unstructured data in unprecedented ways. Satyen and Chris discuss Chris’s book How Data Happened, the origins of data science, the role of Alan Turing in the creation of digital computing, and the challenges generative AI brings around model interoperability.
*Satyen’s narration was created using AI
--------
“In the last two years, one of the major techniques for advancing the most eye-popping products has been RLHF, Reinforcement Learning from Human Feedback. There's innumerable subjective design choices happening there, which eventually become encoded in a product. But, the presentation of it as though it's somehow unbiased and free from any subjective design choices is illusory.” – Chris Wiggins
--------
Time Stamps
*(01:36): How did Chris come to write How Data Happened?
*(10:33): World War II as the springboard for data science and digital computing
*(18:37): The tension between objectivity and subjectivity in data today
*(25:36): What is Reinforcement Learning from Human Feedback (RLHF)?
*(36:03): How has Gen AI impacted data science?
*(44:53): Satyen’s takeaways
--------
Sponsor
This podcast is presented by Alation.
Learn more:
Subscribe to the newsletter: https://www.alation.com/podcast/
Alation’s LinkedIn Profile: https://www.linkedin.com/company/alation/
Satyen’s LinkedIn Profile:
https://www.linkedin.com/in/ssangani/
--------
Links
Previous Episode

The Art of Data Leadership: Lessons from Taylor Culver
What does it take for data leaders to deliver real business value? In this episode, Taylor Culver, founder of XenoDATA, shares practical strategies for success, including:
Focus on the right problems: Taylor explains the importance of refining problem statements for actionable, data-driven solutions.
Engage like a salesperson: Actively listening to stakeholders and identifying pain points is key to building impactful use cases.
Adopt a product management mindset: Taylor emphasizes weaving governance and architecture into customer-centric data strategies.
While the path of the data leader is fraught with obstacles, success is possible. Taylor offers time-tested strategies to help data and business leaders alike make a measurable impact.
--------
“What data leaders should just own is the path to me is probably going to be fraught with failure, but I need to be able to pivot and I need to be agile. I can very much serve myself by adhering to a common set of principles, which I'm going to practice consistently and continually adapt and adjust in the way I engage with my stakeholders and identify their problems and lean in or lean out on data management techniques or delivering certain solutions. It comes down to intent. Do you genuinely want to help people in your business solve problems with data? Do you genuinely want to grow? Do you genuinely recognize that there's not a magic bullet to doing this? Those are the data leaders who will be successful despite facing adversity.” – Taylor Culver
--------
Time Stamps
*(07:01): The data-business people problem
*(17:50): How data leaders can tackle business problems in 3 steps
*(26:22): Is data a strategic function or an enablement function?
*(33:50): Strategy: Data offense vs. data defense
*(40:45): Data is a people business: the value of trust
*(46:20): Our takeaways
Next Episode

Using AI to Revolutionize CX with Michael Olaye, EVP & Managing Director at Hero Digital
Any digital marketing leader will tell you that data and marketing strategies go hand-in-hand. In this episode, Michael Olaye, EVP and Managing Director of Hero Digital, shares his journey and practical strategies for success, drawing from his career path that began with door-to-door job hunting and led to spearheading major digital initiatives.
Michael emphasizes the central role of data in digital marketing, from informing internal business decisions to enhancing customer experiences, and discusses the dual focus of AI in driving internal efficiency while offering robust public-facing tools.
He highlights the critical interplay between data governance and AI ethics, stressing the importance of businesses being 'AI ready.' By exploring customer journeys and leveraging data for innovation, Michael demonstrates how insights can shape product development and business strategies.
As a forward-thinker, he shares his enthusiasm for emerging technologies like learning agents and multimodal models, envisioning a transformative future for business operations. Through candid anecdotes and expert advice, Michael delivers actionable insights on harnessing data and AI to drive innovation and customer satisfaction.
--------
“Some clients do not know that they're sitting on gold, they do not know that. They have tons of data that they've never done anything with and then they focus on the most simplistic things: media, SEO, social media content, website content. Then you come in and you're like, ‘Hey, we can help your customer service be more efficient by understanding how the data, how long it takes a call to go through. We can help you process products more better by understanding the transaction from seeing something online to going in store, to buying it, to returning it.’ Looking at those data sets and seeing patterns or bringing them together to see journeys, that's where the secret lies.” – Michael Olaye
--------
Time Stamps
*(03:37): How Michael uses data for customer experience
*(11:39): AI in marketing today: The role of data
*(20:54): The dangers of bad data in AI
*(23:30): How do you find high-value data?
*(30:17): Understanding data and the brand-loyalty debate
*(38:16): David’s takeaways
--------
David’s LinkedIn Profile:
https://www.linkedin.com/in/davidwchao/
--------
Links