Data science and machine learning come with a lot of jargon. It often takes a while to really understand concepts that are masked by formal definitions and mathematical equations. As Einstein allegedly said,

“You do not really understand something unless you can explain it to your grandmother.”

Maximum likelihood is one of the two fundamental approaches to estimating parameters in machine learning (the other being least squares estimation), and many newer methods are inspired by these two.

Despite this, many courses and books explaining maximum likelihood fail to make it obvious that the technique is…
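To make the idea of maximum likelihood a little more concrete, here is a minimal sketch of my own (not taken from the post itself). It assumes a hypothetical coin-flip dataset and a Bernoulli model, and simply picks the heads-probability that makes the observed flips most likely. The function names and the data are illustrative assumptions.

```python
import math


def bernoulli_log_likelihood(p, flips):
    """Log-likelihood of heads-probability p given a list of 0/1 outcomes."""
    heads = sum(flips)
    tails = len(flips) - heads
    return heads * math.log(p) + tails * math.log(1 - p)


def mle_by_grid_search(flips, steps=999):
    """Return the grid point in (0, 1) with the highest log-likelihood."""
    # Candidates are 0.001, 0.002, ..., 0.999 for the default steps=999.
    candidates = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(candidates, key=lambda p: bernoulli_log_likelihood(p, flips))


# Hypothetical data: 7 heads (1) and 3 tails (0) in 10 flips.
flips = [1, 1, 0, 1, 0, 1, 1, 1, 0, 1]

p_hat = mle_by_grid_search(flips)
closed_form = sum(flips) / len(flips)  # known closed-form MLE for a Bernoulli model

print(p_hat, closed_form)
```

For this model the grid search should land on the same value as the closed-form estimate, the observed fraction of heads. The grid search is only there to show that "maximum likelihood" literally means "try parameter values and keep the one under which the data is most probable".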

They don’t come more easily (and for free) than this…

Perhaps the best-known resource for learning deep learning is Andrew Ng’s series of five courses on Coursera. Those courses are still a great resource for anyone learning the fundamentals of the field, but they are now a few years old (their launch was announced in August 2017). In this post, I will give you three main reasons why you should instead **start** with the MIT course that I am going to tell you about.

Before I try to convince you to start your deep learning journey from there, here is…

As someone who has worked in data science for over a decade, I find it frustrating to see people prophesying that the field will go extinct within 10 years. The typical reason given is that emerging AutoML tools will eliminate the need for practitioners to develop their own algorithms.

I find such opinions especially frustrating because they dissuade beginners from taking data science seriously enough to excel in it. Frankly, such prophecies do a disservice to the data science community, especially when demand in the field is only going to increase further!

Why would any sane person…

Until recently, I struggled with consistently staying productive. I have achieved phenomenal successes close to deadlines, but I never managed to keep momentum. Thanks to being in academia, I attended numerous courses and training workshops in the last 14 years to improve. However, at best, most of those have been only incrementally helpful. I even bought books on procrastination that I procrastinated on.

Despite these struggles, I always had a way of pulling things together at the last minute by cutting back on leisure, sleep, and social activities as and when required. This was possible because I was living alone. …

This post will help you understand Bayesian inference at an intuitive level with the help of a simple case study. I hope that once you have read this article, you will be very clear on how the well-known Bayes’ theorem is used, what the terms in the theorem mean (prior, posterior, likelihood), and how this approach compares with other approaches to decision making (pessimist, optimist, frequentist). For those who are interested, I have provided simulation results for the given case study and a link to R code for further exploration…
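As a quick illustration of how the prior, the likelihood, and the posterior fit together, here is a minimal Python sketch. It is not the case study or the R code from the post; the two hypotheses, the numbers, and the `posterior` helper are assumptions made up purely for illustration.

```python
def posterior(prior, likelihood):
    """Apply Bayes' theorem over a discrete set of hypotheses.

    prior:      dict mapping hypothesis -> P(hypothesis)
    likelihood: dict mapping hypothesis -> P(data | hypothesis)
    Returns a dict mapping hypothesis -> P(hypothesis | data).
    """
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data), the normalizing constant
    return {h: value / evidence for h, value in unnormalized.items()}


# Hypothetical setup: a coin is either fair or biased towards heads,
# and we have just observed a single head.
prior = {"fair": 0.5, "biased": 0.5}        # belief before seeing data
likelihood = {"fair": 0.5, "biased": 0.8}   # P(head | hypothesis)

post = posterior(prior, likelihood)
print(post)
```

Observing a head shifts belief towards the "biased" hypothesis because that hypothesis makes the observation more likely. Feeding the posterior back in as the next prior is how the update would be repeated for further observations.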

To minimize bomber losses to enemy fire during World War II, the US military wanted to armor the planes in the places where armor was most needed (identified as the points where the planes were most damaged on return).

The challenge was to figure out the right amount of armor to put on. Too much would make the plane heavy, leading to higher fuel consumption and difficulty in maneuvering. Too little might not be sufficient to protect the plane.

To help with this, the military approached Abraham Wald (a Hungarian Jewish mathematician who would later pioneer statistical sequential analysis). Wald gave…

Despite being one of the few fundamental concepts in data science, the Central Limit Theorem (CLT) is still often misunderstood.

Questions around such fundamental statistical concepts do pop up in data science interviews. Yet, you’d be surprised how often aspiring data scientists invest their learning time in the latest trends and new algorithms, miss a trick by not revisiting basic concepts, and then get stumped at interviews.

This post will help you better understand the CLT at an intuitive level. It will also help you better appreciate its importance and the key assumptions behind its use.

In a somewhat formal…
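For readers who prefer to see the CLT in action, here is a small simulation sketch of my own (an illustration, not code from the post). It assumes a skewed source distribution, Exponential(1), whose mean and standard deviation are both 1, and checks how the means of repeated samples behave. The sample size, number of trials, and seed are arbitrary choices.

```python
import random
import statistics


def sample_means(n, trials, rng):
    """Draw `trials` samples of size n from Exponential(1); return their means."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50
means = sample_means(n=n, trials=2000, rng=rng)

# The CLT suggests the sample means cluster around the population mean (1.0)
# with a spread close to sigma / sqrt(n), even though the source is skewed.
print("mean of sample means:", statistics.fmean(means))
print("std of sample means: ", statistics.stdev(means))
print("sigma / sqrt(n):     ", 1 / n ** 0.5)
```

Plotting a histogram of `means` would show the familiar bell shape. Re-running with a smaller `n` is a simple way to see how the approximation degrades when samples are small.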

The data science field is evolving at an unprecedented pace. While the field is certainly not going extinct in the foreseeable future, your skillset may well become obsolete if you cease learning and upskilling.

Data science continues to enjoy the spotlight as more organizations wish to use data to stay competitive. This is promising for each one of us. However, the rising demand also means that an ever-increasing number of people are getting into data science.

With ubiquitous learning opportunities at everyone’s disposal, you must continue to learn and grow to stay competitive in such an environment.

This post will introduce…

If you have ever wanted to analyze data and add to your skillset a toolkit used by professional data scientists, then read on.

Rather than presenting a dry list of commands, this tutorial will use a specific case study as a motivating example to teach you everything you need to know about ggplot2, the de facto standard for creating high-quality graphics in R. It is a third-party package that is part of the tidyverse ecosystem. While you could plot in R using base graphics, you will most likely end up using ggplot2 for any actual project. …

*Had the correct scatterplot or data table been constructed, no one would have dared to risk the Challenger in such cold weather.* (Edward Tufte)

It was supposed to be a landmark day in modern history.

The first civilian selected to go into space (a high school teacher named Christa McAuliffe) was on board. There was even a possibility of a televised conversation between her and President Reagan during the annual State of the Union address, due 10 hours later on the same day. Instead, the space shuttle exploded 73 seconds after launch, killing all seven astronauts on board.

A day before…