Machine Learning with Coffee

Gustavo Lujan

Machine Learning with Coffee is a podcast where we share ideas about Machine Learning and related areas such as artificial intelligence, business intelligence, business analytics, data mining and big data. The objective is to promote a healthy discussion on the current state of this fascinating world of Machine Learning. We will be sharing our experience and tricks, talking about the latest developments and interviewing experts, all in a very laid-back, friendly manner. So, what are you waiting for? Grab a coffee and join us.

Top 10 Machine Learning with Coffee Episodes

Goodpods has curated a list of the 10 best Machine Learning with Coffee episodes, ranked by the number of listens and likes each episode has garnered from our listeners. If you are listening to Machine Learning with Coffee for the first time, there's no better place to start than with one of these standout episodes. If you are a fan of the show, vote for your favorite Machine Learning with Coffee episode by adding your comments to the episode page.

17 Anomaly Detection: Clustering

Machine Learning with Coffee

12/22/20 • 27 min

We present 3 clustering algorithms which will help us detect anomalies: DBSCAN, Gaussian Mixture Models and K-means. These 3 algorithms are very popular and basic but have stood the test of time. Each of them has many variations which try to overcome some of the disadvantages of the original implementation.
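As a rough illustration of the K-means idea, the sketch below clusters a handful of 2-D points and scores each point by its distance to the nearest centroid, so the farthest point is flagged as the anomaly. The data, the starting centroids and the function names are made up for this example and are not taken from the episode.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points, starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids

def anomaly_scores(points, centroids):
    """Score each point by its distance to the closest centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]

# Hypothetical data: two tight groups plus one stray point.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
          (5.0, 5.0), (5.1, 4.9), (4.8, 5.2),
          (9.0, 0.0)]
centroids = kmeans(points, centroids=[(0.0, 0.0), (6.0, 6.0)])
scores = anomaly_scores(points, centroids)
outlier = points[scores.index(max(scores))]
print(outlier)
```

Note that the stray point still pulls its centroid toward itself, which is one of the disadvantages that variations of the basic algorithm try to address.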

11 Inferential Statistics

Machine Learning with Coffee

05/10/20 • 16 min

We talk about the importance of inferential statistics in Data Science. Inferential statistics are a set of techniques used to make generalizations about a population from a sample. One of the tools used in inferential statistics is hypothesis testing. In this episode we provide a couple of examples on when and why to use 1-sample t-tests and 2-sample t-tests. We also argue that the mean or average of a sample means nothing if we do not also consider the variation of the data.
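To make the last point concrete, here is a small sketch of a 1-sample t statistic computed for two invented samples that share the same mean but differ in variation. The numbers and the reference value of 9.5 are assumptions chosen only for illustration.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """t statistic for testing whether the population mean equals mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)          # sample standard deviation (n - 1)
    return (mean - mu0) / (sd / math.sqrt(n))

# Two hypothetical samples, both with a mean of 10.
tight = [10.1, 9.9, 10.2, 9.8, 10.0, 10.0]
spread = [14.0, 6.0, 12.0, 8.0, 13.0, 7.0]

t_tight = one_sample_t(tight, mu0=9.5)
t_spread = one_sample_t(spread, mu0=9.5)
print(t_tight, t_spread)
```

The same difference between the sample mean and the reference value gives a large t statistic for the tight sample and a small one for the spread-out sample, which is why the mean alone is not enough.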

06 How to Become a Data Scientist

Machine Learning with Coffee

03/15/20 • 29 min

We talk about what it takes to become a Data Scientist. We also discuss 4 prerequisites before preparing yourself to become a Data Scientist. Finally, we provide recommendations on 3 online courses that, if mastered, will put you above 90% of all Data Scientists out there.

15 Adaboost: Adaptive Boosting

Machine Learning with Coffee

09/28/20 • 18 min

Adaboost is one of the classic machine learning algorithms. Just like Random Forest and XGBoost, Adaboost belongs to the ensemble models; in other words, it aggregates the results of simpler classifiers to make robust predictions. The main difference is that Adaboost is an adaptive algorithm, which means that it learns from the instances misclassified by previous models, assigning more weight to those errors and focusing its attention on those instances in the next round.
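The reweighting step can be sketched as follows, assuming the standard binary formulation with labels in {-1, +1} and instance weights that sum to 1. The labels and predictions are invented, and the function name is ours.

```python
import math

def adaboost_round(weights, y_true, y_pred):
    """One AdaBoost reweighting step for labels in {-1, +1}.

    Assumes the weights sum to 1 and the weighted error is strictly
    between 0 and 1.
    """
    err = sum(w for w, t, p in zip(weights, y_true, y_pred) if t != p)
    alpha = 0.5 * math.log((1 - err) / err)      # vote of this weak classifier
    # Misclassified instances (t * p == -1) are scaled up, correct ones down.
    scaled = [w * math.exp(-alpha * t * p)
              for w, t, p in zip(weights, y_true, y_pred)]
    total = sum(scaled)
    return alpha, [w / total for w in scaled]

# Hypothetical round: the weak classifier gets only the last instance wrong.
y_true = [1, 1, -1, -1, 1]
y_pred = [1, 1, -1, -1, -1]
weights = [0.2] * 5

alpha, new_weights = adaboost_round(weights, y_true, y_pred)
print(alpha, new_weights)
```

After the update, the single misclassified instance carries half of the total weight, so the next weak classifier is pushed to get it right.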

09 Regularization to Deal with Overfitting

Machine Learning with Coffee

04/19/20 • 15 min

In this episode we talk about regularization, an effective technique to deal with overfitting by reducing the variance of the model. Two techniques are introduced: ridge regression and lasso. The latter is effectively a feature selection algorithm.
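One way to see why lasso selects features while ridge does not is the simplified case of orthonormal features, where each coefficient can be treated on its own. Under that assumption, ridge minimizes ½(w − z)² + (λ/2)w² and lasso minimizes ½(w − z)² + λ|w|, with z the unregularized coefficient. The sketch below applies both closed-form solutions to made-up coefficients.

```python
def ridge_shrink(z, lam):
    """Ridge solution per coefficient: proportional shrinkage, never exactly zero."""
    return z / (1.0 + lam)

def lasso_shrink(z, lam):
    """Lasso solution per coefficient: soft-thresholding, small values become zero."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical unregularized coefficients and penalty strength.
ols = [3.0, 0.4, -1.5, -0.2]
lam = 0.5

ridge = [ridge_shrink(z, lam) for z in ols]
lasso = [lasso_shrink(z, lam) for z in ols]
print(ridge)
print(lasso)
```

Ridge keeps every feature with a smaller coefficient, while lasso sets the two weakest coefficients to exactly zero, which is the feature selection effect.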

20 Perceptron: Machine Learning Begins

Machine Learning with Coffee

03/15/21 • 15 min

We introduce the concept of a perceptron as the basic component of a neural network. We talk about how important it is to understand the concept of backpropagation applied to a single neuron.
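A minimal sketch of backpropagation on a single neuron is shown below. It assumes a sigmoid activation and a cross-entropy loss, which are our choices for the example and may differ from the setup discussed in the episode. With that pairing, the chain rule reduces the gradient with respect to the pre-activation to (output − target).

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def train_neuron(data, lr=0.5, epochs=2000):
    """Train one sigmoid neuron with two inputs by gradient descent."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            out = sigmoid(w1 * x1 + w2 * x2 + b)      # forward pass
            delta = out - target                       # dLoss/dz via the chain rule
            w1 -= lr * delta * x1                      # dz/dw1 = x1
            w2 -= lr * delta * x2                      # dz/dw2 = x2
            b -= lr * delta                            # dz/db = 1
    return w1, w2, b

# The logical AND function, which is linearly separable.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w1, w2, b = train_neuron(and_gate)
predictions = [round(sigmoid(w1 * x1 + w2 * x2 + b)) for (x1, x2), _ in and_gate]
print(predictions)
```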

14 XGBoost: The Winner of Many Competitions

Machine Learning with Coffee

07/26/20 • 13 min

XGBoost is an open-source software library which has won several Machine Learning competitions on Kaggle. It is based on the principles of gradient boosting, which in turn builds on the ideas of Leo Breiman, the creator of Random Forest. The theory behind gradient boosting was later formalized by Jerome H. Friedman. Gradient boosting combines weak learners, just as Random Forest does. XGBoost is an engineering implementation which includes a clever penalization of trees and a proportional shrinking of leaf nodes.
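The sketch below shows plain gradient boosting for squared loss with one-split regression stumps and a shrinkage factor. It is not XGBoost's implementation and leaves out its tree penalization. The data, the number of rounds and the shrinkage value are assumptions for illustration.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump (least squares) on 1-D inputs."""
    best = None
    for threshold in sorted(set(xs))[:-1]:      # keep both sides non-empty
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    _, threshold, lmean, rmean = best
    return lambda x: lmean if x <= threshold else rmean

def gradient_boost(xs, ys, rounds=20, shrinkage=0.3):
    """Fit each new stump to the residuals of the current ensemble."""
    base = sum(ys) / len(ys)
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + shrinkage * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + shrinkage * sum(s(x) for s in stumps)

def mse(predict, xs, ys):
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Hypothetical 1-D data with three rough plateaus.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0, 6.0, 6.2]

model = gradient_boost(xs, ys)
mean_y = sum(ys) / len(ys)
baseline_mse = mse(lambda x: mean_y, xs, ys)
boosted_mse = mse(model, xs, ys)
print(baseline_mse, boosted_mse)
```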

13 Random Forest

Machine Learning with Coffee

07/12/20 • 23 min

Random Forest is one of the best off-the-shelf algorithms. In this episode we try to understand the intuition behind Random Forest and how it leverages the capabilities of Decision Trees by aggregating them using a very smart trick called “bagging”. Variable importance and out-of-bag error are two of the nice capabilities of Random Forest, which allow us to find the most important predictors and to estimate the generalization error, respectively.
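The sampling behind bagging, and the reason an out-of-bag error is available at all, can be sketched in a few lines. Each tree is trained on a bootstrap sample drawn with replacement, and the rows it never saw can be used to test it. The sample size and seed are arbitrary choices for this example.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n row indices with replacement, as done for each tree in bagging."""
    return [rng.randrange(n) for _ in range(n)]

rng = random.Random(0)   # fixed seed so the run is repeatable
n = 1000

in_bag = bootstrap_indices(n, rng)
# Rows never drawn are "out of bag" for this tree and can serve as its test set.
out_of_bag = sorted(set(range(n)) - set(in_bag))
oob_fraction = len(out_of_bag) / n
print(oob_fraction)
```

The probability that a given row is never drawn is (1 − 1/n)^n, which approaches 1/e, so roughly 37% of the rows are out of bag for each tree.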

10 Logistic Regression

Machine Learning with Coffee

04/26/20 • 22 min

Logistic regression is a very robust machine learning technique which can be used in three modes: binary, multinomial and ordinal. We talk about assumptions and some misconceptions. For example, people believe that because logistic regression fits only a linear separator, it cannot fit a complex boundary. In fact, a linear separator in an expanded feature space can correspond to a complex boundary in the original space. Also, people normally use either linear regression or multinomial logistic regression when they should be using ordinal logistic regression.
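The point about the expanded space can be illustrated without any training. In the sketch below, the weights and bias are picked by hand rather than fitted, and the squared features are our choice of expansion. The model is linear in the expanded features, yet its decision boundary in the original space is the unit circle.

```python
import math

def expand(x1, x2):
    """Map the original inputs to an expanded feature space (squared terms)."""
    return (x1 * x1, x2 * x2)

# Hand-picked (not fitted) parameters: z = 1 - x1^2 - x2^2.
WEIGHTS = (-1.0, -1.0)
BIAS = 1.0

def predict_proba(x1, x2):
    """Logistic model that is linear in the expanded features."""
    z = sum(w * f for w, f in zip(WEIGHTS, expand(x1, x2))) + BIAS
    return 1.0 / (1.0 + math.exp(-z))

inside = predict_proba(0.2, 0.1)     # point inside the unit circle
outside = predict_proba(1.5, -1.0)   # point outside the unit circle
print(inside, outside)
```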

08 Linear Regression: The Return of the Queen

Machine Learning with Coffee

04/04/20 • 21 min

In this episode I will try to convince you that Linear Regression is one of the most powerful Machine Learning algorithms. We will talk about common misconceptions, especially the belief that Linear Regression is not able to model non-linear relationships. We also discuss how the myth of normality encourages many people to completely discard Linear Regression on non-normal data, when in reality the normality assumption concerns the residuals, not the data itself. Finally, I provide advice on how to check and, most importantly, how to fix any violated assumption in Linear Regression.
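As a small illustration of the first misconception, the sketch below fits an ordinary least-squares line to a quadratic relationship, once on the raw input and once on a squared feature. The model stays linear in its coefficients in both cases. The data are synthetic and noise-free, generated from y = 1 + 2x² purely for the example.

```python
def fit_line(zs, ys):
    """Ordinary least squares for y = intercept + slope * z."""
    n = len(zs)
    z_mean = sum(zs) / n
    y_mean = sum(ys) / n
    slope = (sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys))
             / sum((z - z_mean) ** 2 for z in zs))
    intercept = y_mean - slope * z_mean
    return intercept, slope

# Synthetic, noise-free data following y = 1 + 2 * x^2.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [1 + 2 * x * x for x in xs]

# Regressing on the raw input misses the relationship entirely.
raw_intercept, raw_slope = fit_line(xs, ys)

# Regressing on the transformed feature x^2 recovers it.
intercept, slope = fit_line([x * x for x in xs], ys)
print(raw_slope, intercept, slope)
```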

FAQ

What is the most popular episode on Machine Learning with Coffee?

The episode titled '14 XGBoost: The Winner of Many Competitions' is the most popular.