Data Skeptic
Kyle Polich
4 Listeners
Top 10 Data Skeptic Episodes
Goodpods has curated a list of the 10 best Data Skeptic episodes, ranked by the number of listens and likes each episode has garnered from our listeners. If you are listening to Data Skeptic for the first time, there's no better place to start than with one of these standout episodes. If you are a fan of the show, vote for your favorite Data Skeptic episode by adding your comments to the episode page.
iNaturalist
Data Skeptic
06/24/24 • 37 min
Have you ever participated in citizen science? Do you want to? One of the most popular platforms for crowdsourcing biodiversity data is iNaturalist. In addition to being a great science tool, the iNaturalist app can help you identify the organisms you encounter every day. We talked to Executive Director Scott Loarie about how scientists use iNaturalist. We also discussed what makes iNaturalist’s AI species recognition so good, and how citizen scientists are constantly providing high-quality training data. Listen in and learn how this fun-to-use tool works, where it's headed, and how you can get involved.
1 Listener
arXiv Publication Patterns
Data Skeptic
10/23/23 • 28 min
Today, we are joined by Rajiv Movva, a PhD student in Computer Science at Cornell Tech. His research lies at the intersection of responsible AI and computational social science. He joins us to discuss the findings of his work analyzing LLM publication patterns.
He shared the dataset he used for the survey and the criteria for selecting the papers to analyze. Rajiv also shared some of the trends he observed in his analysis. For one, he observed an increase in LLM research. He also shared the proportions of papers published by universities, organizations, and industry leaders in LLMs such as OpenAI and Google. He mentioned that the majority of the papers are centered on the social impact of LLMs. He also discussed other exciting applications of LLMs, such as in education.
1 Listener
Analysis of Unstructured Data
Data Skeptic
06/28/24 • 27 min
Robbie Moon from the Georgia Tech Scheller College of Business joins us to discuss the analysis of unstructured data and the application of NLP methodologies towards financial data.
1 Listener
Learn to Code
Data Skeptic
06/18/24 • 49 min
Do you code or are you interested in learning to code? Join us today and hear from three individuals who are at very different stages of their coding journeys. Becky Hansis-O’Neill (also our co-host this season) shares her experiences as a newbie who wants to learn more. Dr. Malia Gehan, a self-taught developer interested in studying plant phenotypes, explains why and how she and her colleagues learned to code and developed PlantCV. Finally, Dr. John Wilmes discusses his work as a professional mathematician and Machine Learning Research Engineer. Whether you are thinking about learning to code or are already an expert, we’re sure you will see a bit of yourself in this episode.
1 Listener
Memory in Chess
Data Skeptic
02/12/24 • 48 min
On today’s show, we are joined by our co-host, Becky Hansis-O’Neill. Becky is a Ph.D. student at the University of Missouri, St. Louis, where she studies bumblebees and tarantulas to understand their learning and cognition.
She joins us to discuss the paper Perception in Chess, which aimed to understand how chess players perceive the positions of pieces on a chess board. She discussed the paper's findings, including situations where grandmasters had better recall of chess positions than beginners and situations where they did not.
Becky and Kyle discussed the use of chess engines for cheating. They also discussed how chess players use chunking. Becky discussed some approaches to studying chess cognition, including eye tracking, EEG, and MRI.
1 Listener
NCAA Predictions on Spark
Data Skeptic
05/11/19 • 23 min
In this episode, Kyle interviews Laura Edell at MS Build 2019. The conversation covers a number of topics, notably her NCAA Final 4 prediction model.
1 Listener
Proposing Annoyance Mining
Data Skeptic
06/09/15 • 30 min
A recent episode of the Skeptics' Guide to the Universe included a slight rant by Dr. Novella and the rogues about a shortcoming in operating systems. This episode explores why such a (seemingly obvious) flaw might make sense from an engineering perspective, and how data science might be the solution.
In this solo episode, Kyle proposes the concept of "annoyance mining": the idea that with proper logging and enough feedback, data scientists could be given the right dataset from which to automatically detect bugs, flaws, and annoyances in software and other systems, along with potential improvements that could make those systems better.
As system complexity grows, it seems that an abstraction like this might be required in order to maintain an effective development cycle. This episode is a bit of a soapbox for Kyle as he explores why and how we might track an appropriate amount of data to make software and systems better suited to their users.
[MINI] Structured and Unstructured Data
Data Skeptic
08/21/15 • 13 min
Today's mini-episode explains the distinction between structured and unstructured data, and debates which of these categories best describes recipes.
[MINI] The CAP Theorem
Data Skeptic
06/17/16 • 10 min
Distributed computing cannot guarantee consistency, availability, and partition tolerance all at once. Most system architects need to think carefully about how to balance the needs of their application across these competing objectives. Linh Da and Kyle discuss the CAP Theorem using the analogy of a phone tree for alerting people about a school snow day.
Prerequisites for Time Series
Data Skeptic
05/21/21 • 8 min
Today's experimental episode uses sound to describe some basic ideas from time series.
This episode covers lag, seasonality, trend, noise, heteroskedasticity, decomposition, smoothing, feature engineering, and deep learning.
FAQ
How many episodes does Data Skeptic have?
Data Skeptic currently has 560 episodes available.
What topics does Data Skeptic cover?
The podcast is about Mathematics, Podcasts, Technology and Science.
What is the most popular episode on Data Skeptic?
The episode titled 'Analysis of Unstructured Data' is the most popular.
What is the average episode length on Data Skeptic?
The average episode length on Data Skeptic is 31 minutes.
How often are episodes of Data Skeptic released?
Episodes of Data Skeptic are typically released every 7 days.
When was the first episode of Data Skeptic?
The first episode of Data Skeptic was released on May 23, 2014.