Part 1: Subset Selection

Photo by Birmingham Museums Trust on Unsplash

Linear regression models assume a linear relationship between the predictors and the response variable. They estimate the coefficients of the following expression.
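
In the standard multiple linear regression setting with p predictors (an assumption here; the symbols below are illustrative rather than taken from the post), that expression takes the form:

```latex
Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + \epsilon
```

Here \beta_0 is the intercept, \beta_1 through \beta_p are the coefficients to be estimated, and \epsilon is the error term.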


Find out which Coursera course is best suited for you this year

Photo by John Schnobrich on Unsplash

When it comes to picking the best Data Science course on Coursera, it is easy to get confused by the number of available options. In this blog, I’ll share the best Data Science courses you can take on Coursera, based on your current knowledge and experience.

Coursera is an American provider of massive open online courses (MOOCs). Founded in 2012 by Stanford University computer science professors Andrew Ng and Daphne Koller, Coursera works with universities and other organizations to offer online courses, certifications, and degrees in a variety of subjects.

Before we begin, I want you to know that…


Also, what are the LOOCV and k-Fold Cross-Validation techniques?

Photo by Ashkan Forouzani on Unsplash

One of the most crucial tools in modern statistics and machine learning is resampling: repeatedly drawing samples (subsets) from a training set and refitting a model on each sample to learn more about it, such as the variability of its fit.

One of the reasons we use resampling is that we can analyze the results of a model fitted over many samples of the same dataset. This allows us to obtain some additional knowledge that would not have been obtained by fitting the model just once. …
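
As a rough illustration of the idea, here is a minimal sketch of k-fold cross-validation in plain Python. The helper names (`fit_line`, `k_fold_mse`) and the toy data are assumptions made up for this example, not code from the post. LOOCV falls out as the special case where k equals the number of observations.

```python
import random

def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def k_fold_mse(xs, ys, k, seed=0):
    """Return the held-out mean squared error for each of the k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k roughly equal parts
    scores = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in indices if i not in held]
        # Refit the model on everything except the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in held_out]
        scores.append(sum(errors) / len(errors))
    return scores

# Toy data (made up for illustration): y is roughly 2x + 1 plus noise.
rng = random.Random(42)
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

five_fold = k_fold_mse(xs, ys, k=5)
loocv = k_fold_mse(xs, ys, k=len(xs))  # LOOCV is k-fold with k = n
print("5-fold MSE per fold:", five_fold)
print("LOOCV mean MSE:", sum(loocv) / len(loocv))
```

The spread of the per-fold errors is exactly the kind of extra knowledge that a single fit cannot provide.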


All you need to know about KNN.

Photo by Robert Katzki on Unsplash

“A man is known for the company he keeps.”

A perfect opening line, I must say, for presenting K-Nearest Neighbors (KNN). Yes, that's how simple the concept behind KNN is. It classifies a data point based on its few nearest neighbors. How many neighbors? That is for us to decide.

Looks like you already know a lot of what there is to know about this simple model. Let’s dive in for a much closer look.

Before moving on, it’s important to know that KNN can be used for both classification and regression problems. …
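
To make the idea concrete, here is a minimal sketch of a KNN classifier in plain Python. The function name `knn_predict`, the Euclidean distance choice, and the toy points are assumptions for illustration only.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the labels of the k closest points.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # The most common label among the neighbors wins.
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy data (made up): two well-separated clusters labelled "A" and "B".
points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
labels = ["A", "A", "A", "B", "B", "B"]

print(knn_predict(points, labels, (2, 2), k=3))
print(knn_predict(points, labels, (6, 5), k=3))
```

For regression, the same neighbor search applies, but the prediction would be the average of the neighbors' target values instead of a vote.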


A gentle introduction to Logistic Regression

Photo by Paweł Czerwiński on Unsplash

Logistic Regression is one of the fundamental models used in Machine Learning. It is a classification technique, best suited for predicting a categorical response variable.

Why Logistic Regression?

While linear regression works well with a continuous or quantitative output variable, Logistic Regression is used to predict a categorical or qualitative output variable.
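
A minimal sketch of how this works for a single predictor: a linear score is passed through the sigmoid function to get a probability, which is then thresholded into a class. The coefficients below are hypothetical values chosen for illustration, not fitted from any data.

```python
import math

def sigmoid(z):
    """Map any real number to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, intercept, slope):
    """Probability of the positive class for a single predictor x."""
    return sigmoid(intercept + slope * x)

def predict_class(x, intercept, slope, threshold=0.5):
    """Turn the probability into a 0/1 label using a cutoff."""
    return 1 if predict_proba(x, intercept, slope) >= threshold else 0

# Hypothetical coefficients: the decision boundary sits at x = 4.
intercept, slope = -4.0, 1.0
for x in (2.0, 4.0, 6.0):
    print(x, round(predict_proba(x, intercept, slope), 3),
          predict_class(x, intercept, slope))
```

The output of the linear part is unbounded, which suits quantitative targets; the sigmoid squeezes it into a probability, which suits categorical ones.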

For example, target values like price, sales, and temperature are quantitative in nature and can thus be predicted using a linear model such as linear regression.

But what if we have to predict whether an email is spam or not, whether a credit card…


Read this before building a Machine Learning model.

Photo by Paweł Czerwiński on Unsplash

Some concepts just get mixed up in our minds, and then it gets hard to recall what’s what. I had a similar experience with Bias and Variance: I kept forgetting the difference between the two. And the fact that you are here suggests that you too are muddled by the terms.

So let’s understand what Bias and Variance are, what the Bias-Variance Trade-off is, and why they play an unavoidable role in Machine Learning.

The Bias

Let me ask you a question. Why do humans get biased when they do? Or what motivates them to show some bias every now and then?

I’m…


A complete study: Model Interpretation → Hypothesis Testing → Feature Selection

Photo by Steve Johnson on Unsplash

Linear Regression, one of the most popular and widely discussed models, is a natural gateway into Machine Learning (ML). Such a simple, straightforward approach to modeling is worth learning as one of your first steps into ML.

Before moving forward, let us recall that Linear Regression can be broadly classified into two categories.

  • Simple Linear Regression: The simplest form of Linear Regression, used when a single input variable predicts the output variable.

If you are new to regression, then I strongly suggest you first read about Simple Linear Regression from the link below…


Everything you need to know about Simple Linear Regression

Photo by Glenn Carstens-Peters on Unsplash

When we go about understanding Machine Learning models, one of the first things we generally come across is Simple Linear Regression. It’s the first step into Machine Learning and this post will help you understand all you need to know about it. Let’s start with understanding what Regression is.

What is Regression?

The term regression was first coined in the 19th century to describe the phenomenon that the heights of descendants of tall ancestors tend to regress (or return) towards the average height. In other words, regression is the tendency to return to moderation (the mean). Interesting, right?

In statistics, the term is…


How to step into Data Science as a complete beginner

Photo by Uriel Soberanes on Unsplash

Data Science, often called the sexiest job of the 21st century, has become a dream job for many of us. But for some, it looks like a challenging maze and they don’t know where to start. If you are one of them, then continue reading.

In this post, I’ll discuss how you can start your Data Science journey from scratch. I’ll explain the following steps in detail.

  1. Learn the basics of programming with Python
  2. Learn basic Statistics and Mathematics
  3. Learn Python for Data Analysis
  4. Learn Machine Learning
  5. Practice with projects

Learn the basics of programming with Python

If you are from an IT background…


As the world moved towards technology and digitalization, it gave rise to enormous amounts of data. Initially, the challenge was to store the data. But once storage became cheap and feasible, the focus shifted to processing this data. That marked the beginning of what we today call Data Science.

Let's understand Data Science by first understanding Data.

What is Data?

Usually, when we hear the term “Data”, we think of numbers, tables, charts, etc. But there’s a much broader spectrum of recorded things that can be classified as Data. For instance, the GDP of a country, annual sales of…

Sangeet Aggarwal

Data Analyst at Optum (a United Health Group Company) | Data Science & Machine Learning Enthusiast | Happy to share knowledge as I acquire it
