Sharing some programming knowledge.

Pandas: Data Cleaning (Dealing With Missing Values)

Check the Null Data

# check which cells in your DataFrame are null (returns a DataFrame of booleans)
df.isnull()

# does each column contain any null? (returns a boolean Series, one entry per column)
df.isnull().any()

# count the number of nulls in each column (returns a Series)
df.isnull().sum()
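To see what these three calls return, here is a minimal sketch on a small made-up DataFrame. The column names and values are assumptions chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are made up for this example.
df = pd.DataFrame({
    "first_column": [1.0, np.nan, 3.0],
    "second_column": ["a", "b", None],
    "third_column": [10, 20, 30],
})

null_mask = df.isnull()          # boolean DataFrame with the same shape as df
has_null = df.isnull().any()     # boolean Series: does each column contain a null?
null_counts = df.isnull().sum()  # integer Series: number of nulls per column

print(null_mask)
print(has_null)
print(null_counts)
```

Both `NaN` and `None` are treated as null here, so `first_column` and `second_column` each report one missing value while `third_column` reports none.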

Delete the Null Data

# remove all rows that contain null data (returns a new DataFrame)
df.dropna()

# remove all columns that contain null data (returns a new DataFrame)
df.dropna(axis=1)

# use inplace=True to modify the original DataFrame instead of returning a new one
df.dropna(inplace=True)
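Dropping every row that has any null can discard a lot of data. The sketch below, on made-up data, shows two gentler options that `dropna` also offers: `how="all"` and `subset`. The column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are made up for this example.
df = pd.DataFrame({
    "first_column": [1.0, np.nan, 3.0, np.nan],
    "second_column": [np.nan, np.nan, 6.0, 8.0],
})

# Default behaviour: drop rows containing at least one null.
complete_rows = df.dropna()

# Drop only the rows in which every value is null.
not_all_null = df.dropna(how="all")

# Drop rows only when the null is in one of the listed columns.
first_present = df.dropna(subset=["first_column"])

print(complete_rows)
print(not_all_null)
print(first_present)
```

Without `inplace=True`, each call returns a new DataFrame and leaves `df` itself unchanged.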

Impute the Null Data

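df.fillna(5, inplace=True)  # fill all null data with the number 5
# fill the first column's null data with its mean value; assign the result back,
# because inplace=True on a selected column may not update df in recent pandas versions
df['first_column'] = df['first_column'].fillna(df['first_column'].mean())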
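Different columns often need different fill values. One way to do that, sketched here on made-up data, is to pass `fillna` a dictionary that maps column names to fill values and keep the returned DataFrame. The column names and the `"unknown"` placeholder are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are made up for this example.
df = pd.DataFrame({
    "first_column": [1.0, np.nan, 3.0],
    "second_column": ["a", None, "c"],
})

# Mean of the non-null values in first_column (nulls are skipped by default).
first_mean = df["first_column"].mean()

# Fill each column with its own value; the result is a new DataFrame.
filled = df.fillna({
    "first_column": first_mean,
    "second_column": "unknown",
})

print(filled)
```

Keeping the result in a new variable, or assigning it back to `df`, avoids relying on `inplace=True` applied to a selected column, which may not modify the original DataFrame in recent pandas versions.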