Feature Importance in Python
Step-by-step follow-along | Data Series | Episode 14.2
In the previous episode we discussed feature importance and how it is derived from some of the main data science models. In this episode we implement Python code that derives feature importance from several of these models.
You can view and use the code and data in this episode here: Link
Objective
To derive feature importance from logistic regression coefficients, from Gini impurity in gradient boosted trees, from feature permutation applied to a neural network, and from SHAP applied to a neural network.
In this episode we determine the importance of pH, hardness, solids, chloramines, sulfate, conductivity, organic carbon, trihalomethanes and turbidity in predicting water potability (whether the water is drinkable or not).
Libraries
We start by importing some general Python libraries: pandas to import and manipulate our data, and seaborn to produce graphs. We also filter out any warnings that may pop up.
import pandas as pd
import seaborn as sns
import warnings

# Suppress warning messages to keep the notebook output clean
warnings.filterwarnings("ignore")
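Before working with the real dataset, the first technique in the objective, reading feature importance off logistic regression coefficients, can be sketched in miniature. This is not the article's code: it uses synthetic data standing in for the water-potability features (the column names and the signal structure are assumptions), and it uses absolute values of standardised coefficients as a rough importance measure.

```python
# A minimal sketch with synthetic data (NOT the article's dataset):
# larger absolute coefficients on standardised features suggest
# greater influence on the predicted class.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
features = ["ph", "Hardness", "Solids", "Chloramines", "Sulfate",
            "Conductivity", "Organic_carbon", "Trihalomethanes", "Turbidity"]

# Synthetic stand-in data: potability driven mostly by Chloramines,
# partly by ph (an assumed toy signal, chosen for illustration only)
X = pd.DataFrame(rng.normal(size=(500, len(features))), columns=features)
y = (X["Chloramines"] + 0.5 * X["ph"]
     + rng.normal(scale=0.5, size=500) > 0).astype(int)

# Standardise so coefficient magnitudes are comparable across features
X_scaled = StandardScaler().fit_transform(X)
model = LogisticRegression().fit(X_scaled, y)

importance = pd.Series(np.abs(model.coef_[0]),
                       index=features).sort_values(ascending=False)
print(importance)
```

Because the toy labels depend mainly on Chloramines, that feature should rank at or near the top; the same pattern (standardise, fit, sort absolute coefficients) applies to the real data later in the episode.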