Data Skeptic

Kyle Polich

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.

4 Listeners


Top 10 Data Skeptic Episodes

Goodpods has curated a list of the 10 best Data Skeptic episodes, ranked by the number of listens and likes each episode has garnered from our listeners. If you are listening to Data Skeptic for the first time, there's no better place to start than with one of these standout episodes. If you are a fan of the show, vote for your favorite Data Skeptic episode by adding your comments to the episode page.

Data Skeptic - iNaturalist

06/24/24 • 37 min

Have you ever participated in citizen science? Do you want to? One of the most popular platforms for crowdsourcing biodiversity data is iNaturalist. In addition to being a great science tool, the iNaturalist app can help you identify the organisms you encounter every day. We talked to Executive Director Scott Loarie about how scientists use iNaturalist. We also discussed what makes iNaturalist’s AI species recognition so good, and how citizen scientists are constantly providing high-quality training data. Listen in and learn how this fun-to-use tool works, where it's headed, and how you can get involved.

1 Listener

Data Skeptic - arXiv Publication Patterns

10/23/23 • 28 min

Today, we are joined by Rajiv Movva, a PhD student in Computer Science at Cornell Tech. His research interests lie at the intersection of responsible AI and computational social science. He joins us to discuss the findings of his work analyzing LLM publication patterns.

He described the dataset he used for the survey and the criteria for selecting the papers to analyze. Rajiv shared some of the trends he observed in his analysis. For one, he observed an increase in LLM research. He also shared the proportions of papers published by universities, organizations, and industry leaders in LLMs such as OpenAI and Google. He mentioned that the majority of the papers center on the social impact of LLMs. He also discussed other exciting applications of LLMs, such as in education.

1 Listener

Data Skeptic - Analysis of Unstructured Data

06/28/24 • 27 min

Robbie Moon from the Georgia Tech Scheller College of Business joins us to discuss the analysis of unstructured data and the application of NLP methodologies towards financial data.

1 Listener

Data Skeptic - Learn to Code

06/18/24 • 49 min

Do you code, or are you interested in learning to code? Join us today and hear from three individuals who are at very different stages of their coding journeys. Becky Hansis-O’Neill (also our co-host this season) shares her experiences as a newbie who wants to learn more. Dr. Malia Gehan, a self-taught developer interested in studying plant phenotypes, explains why and how she and her colleagues learned to code and developed PlantCV. Finally, Dr. John Wilmes discusses his work as a professional mathematician and Machine Learning Research Engineer. Whether you are thinking about learning to code or are already an expert, we’re sure you will see a bit of yourself in this episode.

1 Listener

Data Skeptic - Memory in Chess

02/12/24 • 48 min

On today’s show, we are joined by our co-host, Becky Hansis-O’Neill. Becky is a Ph.D. student at the University of Missouri, St. Louis, where she studies bumblebees and tarantulas to understand their learning and cognition.

She joins us to discuss the paper Perception in Chess. The paper aimed to understand how chess players perceive the positions of chess pieces on a chess board. She discussed the paper's findings, including situations where grandmasters had better recall of chess positions than beginners and situations where they did not.

Becky and Kyle discussed the use of chess engines for cheating. They also discussed how chess players use chunking. Becky discussed some approaches to studying chess cognition, including eye tracking, EEG, and MRI.

## Paper in Focus

Perception in chess

## Resources

Detecting Cheating in Chess with Ken Regan

1 Listener

Data Skeptic - NCAA Predictions on Spark

05/11/19 • 23 min

In this episode, Kyle interviews Laura Edell at MS Build 2019. The conversation covers a number of topics, notably her NCAA Final 4 prediction model.

1 Listener

1 Comment

Data Skeptic - Proposing Annoyance Mining

06/09/15 • 30 min

A recent episode of the Skeptics' Guide to the Universe included a slight rant by Dr. Novella and the rogues about a shortcoming in operating systems. This episode explores why such a (seemingly obvious) flaw might make sense from an engineering perspective, and how data science might be the solution.

In this solo episode, Kyle proposes the concept of "annoyance mining": the idea that, with proper logging and enough user feedback, data scientists could be given a dataset from which to automatically detect bugs, flaws, annoyances, and potential improvements in software and other systems.

As system complexity grows, an abstraction like this may be required to maintain an effective development cycle. This episode is a bit of a soapbox for Kyle as he explores why and how we might track the right amount of data to make software and systems better suited to their users.

Data Skeptic - [MINI] Structured and Unstructured Data

08/21/15 • 13 min

Today's mini-episode explains the distinction between structured and unstructured data, and debates which of these categories best describes recipes.

Data Skeptic - [MINI] The CAP Theorem

06/17/16 • 10 min

A distributed system cannot simultaneously guarantee consistency, availability, and partition tolerance. Most system architects need to think carefully about how to balance the needs of their application across these competing objectives. Linh Da and Kyle discuss the CAP Theorem using the analogy of a phone tree for alerting people about a school snow day.

Data Skeptic - Prequisites for Time Series

05/21/21 • 8 min

Today's experimental episode uses sound to describe some basic ideas from time series.

This episode includes lag, seasonality, trend, noise, heteroskedasticity, decomposition, smoothing, feature engineering, and deep learning.
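A few of the ideas listed above (trend, seasonality, noise, lag, and smoothing) can be illustrated with a small sketch. This is not taken from the episode; the function names, the synthetic series, and all parameter values are assumptions chosen purely for illustration.

```python
import math
import random


def make_series(n=48, period=12, slope=0.5, amplitude=3.0, noise_sd=0.2, seed=0):
    """Synthetic series (hypothetical): linear trend + sine seasonality + Gaussian noise."""
    rng = random.Random(seed)
    return [
        slope * t
        + amplitude * math.sin(2 * math.pi * t / period)
        + rng.gauss(0, noise_sd)
        for t in range(n)
    ]


def lag_pairs(series, k):
    """Pair each value with the value k steps earlier (k >= 1)."""
    return list(zip(series[k:], series[:-k]))


def moving_average(series, window):
    """Smooth by averaging each run of `window` consecutive values."""
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]


if __name__ == "__main__":
    series = make_series()
    # A window equal to the seasonal period averages out the seasonal
    # component, leaving a rough estimate of the trend.
    trend_estimate = moving_average(series, 12)
    print(trend_estimate[:3])
    print(lag_pairs(series, 1)[:2])
```

Choosing a window equal to the seasonal period is one simple way to separate trend from seasonality; it is the first step of a basic decomposition, not a complete one.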

FAQ

How many episodes does Data Skeptic have?

Data Skeptic currently has 560 episodes available.

What topics does Data Skeptic cover?

The podcast covers mathematics, technology, and science.

What is the most popular episode on Data Skeptic?

The episode titled 'Analysis of Unstructured Data' is the most popular.

What is the average episode length on Data Skeptic?

The average episode length on Data Skeptic is 31 minutes.

How often are episodes of Data Skeptic released?

Episodes of Data Skeptic are typically released every 7 days.

When was the first episode of Data Skeptic?

The first episode of Data Skeptic was released on May 23, 2014.
