Feature Importance in Python

Step-by-step follow-along | Data Series | Episode 14.2

Mazen Ahmed
6 min read · Nov 21, 2023

In the previous episode we discussed feature importance and how it is derived from some of the main data science models. In this episode we implement Python code that derives feature importance from several of these models.

You can view and use the code and data in this episode here: Link

Objective

To derive feature importance in four ways: from logistic regression coefficients, from Gini impurity in gradient boosted trees, from feature permutation applied to a neural network, and from SHAP applied to a neural network.

Specifically, we determine the importance of ph, hardness, solids, chloramines, sulfate, conductivity, organic carbon, trihalomethanes and turbidity in predicting water potability (whether or not the water is drinkable).
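
Of the approaches listed in the objective, feature permutation is the easiest to illustrate without any modelling library. The snippet below is a minimal sketch of the idea, not the code used later in this series: it shuffles one column at a time and measures how much accuracy drops. The helper names (`accuracy`, `permutation_importance`, `toy_model`) and the toy data are hypothetical, and a simple threshold rule stands in for the neural network.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle one column, leaving every other column untouched
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            X_perm = [row[:col] + [value] + row[col + 1:]
                      for row, value in zip(X, shuffled)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data (assumption: NOT the water potability dataset).
# The label depends only on the first column.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]


def toy_model(row):
    """Hypothetical stand-in for a trained model: uses only column 0."""
    return int(row[0] > 0.5)


importances = permutation_importance(toy_model, X, y)
print(importances)
```

Because the toy model never reads the second column, shuffling it leaves accuracy unchanged and its importance is zero, while shuffling the first column causes a large drop.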

Libraries

We start by importing some general Python libraries: pandas to import and manipulate our data, and seaborn to produce graphs. We also filter out any warnings that may pop up.

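import warnings

import pandas as pd   # data loading and manipulation
import seaborn as sns  # plotting

# Suppress all warnings so they do not clutter the output
warnings.filterwarnings("ignore")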
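
Ignoring every warning keeps the output tidy, but it can also hide messages worth reading. As a hedged alternative, the sketch below silences a single warning category inside a limited block using the standard library's `warnings.catch_warnings`. The function `noisy_step` is a hypothetical stand-in for any library call that emits a warning.

```python
import warnings


def noisy_step():
    """Hypothetical stand-in for a library call that emits a warning."""
    warnings.warn("this API will change", FutureWarning)
    return 1


# Suppress only FutureWarning, and only inside this block.
with warnings.catch_warnings(record=True) as caught_inside:
    warnings.simplefilter("ignore", category=FutureWarning)
    result = noisy_step()

# For comparison: with the "always" filter the same call is recorded.
with warnings.catch_warnings(record=True) as caught_outside:
    warnings.simplefilter("always")
    noisy_step()

print(len(caught_inside), len(caught_outside))
```

The filters set inside each `with` block are restored on exit, so the rest of the notebook keeps whatever warning settings it had before.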

Data Exploration
