How to Handle Missing Data in Python

Step-by-step follow-along | Data Series | Episode 15.2

Mazen Ahmed
5 min read · Dec 13, 2023

In the previous episode we discussed the different methods we can use to handle missing data, along with the advantages and disadvantages of each. In this episode we implement these methods in Python.

You can view and use the code and data in this episode here: Link

Objective

Implement various methods for handling missing data.

The methods we cover include:

1. Dropping rows with missing values
2. Dropping variables with missing values
3. Imputing continuous and categorical data
4. Predicting missing values
5. KNN Imputation
6. Multiple Imputation
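
As a rough preview of the first three methods, here is a minimal sketch on a made-up toy DataFrame. The column names (`age`, `city`, `id`) and the values are hypothetical and are not the dataset used in this episode. The sketch relies only on standard pandas calls (`dropna`, `fillna`, `mean`, `mode`), and it assumes mean imputation for the continuous column and mode imputation for the categorical one, which is one common choice among several.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, not the dataset used in this episode
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, 31.0],
    "city": ["Leeds", "York", None, "York"],
    "id": [1, 2, 3, 4],
})

# 1. Drop every row that contains at least one missing value
rows_dropped = df.dropna(axis=0)

# 2. Drop every variable (column) that contains at least one missing value
cols_dropped = df.dropna(axis=1)

# 3. Simple imputation: mean for the continuous column,
#    most frequent value (mode) for the categorical column
imputed = df.copy()
imputed["age"] = imputed["age"].fillna(imputed["age"].mean())
imputed["city"] = imputed["city"].fillna(imputed["city"].mode()[0])

print(rows_dropped.shape, cols_dropped.shape)
print(imputed)
```

Dropping rows keeps only the two complete records, while dropping columns keeps only `id`, which illustrates how quickly either approach can discard information on a small dataset.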

Libraries

We start by importing some general Python libraries, such as pandas for importing and manipulating our data and seaborn for producing graphs.

We set the random seed to 4 to ensure we get reproducible results.

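import warnings

import numpy as np
import pandas as pd
import seaborn as sns

# Fix NumPy's global random seed so results are reproducible
np.random.seed(4)
# Silence library warnings to keep the output readable
warnings.filterwarnings("ignore")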
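
To see what the seed buys us, the short sketch below (an illustration, not part of the episode's code) draws random numbers twice after seeding with the same value and checks that the two draws match. The variable names `first` and `second` are made up for this example.

```python
import numpy as np

np.random.seed(4)
first = np.random.rand(3)

# Re-seeding with the same value restarts the same sequence
np.random.seed(4)
second = np.random.rand(3)

print(np.array_equal(first, second))  # True
```

Note that this seeds NumPy's global generator only. Python's built-in `random` module keeps its own separate state and is not affected by this call.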
