May 30, 2021
Hello, everyone. Year after year, day after day, the remarkable 2020 is about to pass in a flash, and the editor would like to wish you all a Happy New Year in advance. Today I will share several ways to deal with missing values when analyzing data with pandas. Recommended lessons: Python Automation Office, Python3 Advanced: Data Analysis and Visibility.
pandas provides a complete set of methods for handling missing values in data, including isnull(), notnull(), dropna() and fillna().
isnull() is used to find out where the missing values are. It returns a Boolean mask with the same shape as the data, in which True marks a missing value. Here is an example:
import pandas as pd
import numpy as np
data = pd.DataFrame({'name':['W3CSCHOOL',np.nan,'JAVA','PYTHON'],'age':[18,np.nan,99,None]})
data
Executing the above code gives the following data:
name age
0 W3CSCHOOL 18.0
1 NaN NaN
2 JAVA 99.0
3 PYTHON NaN
Here we can see that whether we use np.nan or None when creating the DataFrame, the missing value is displayed as NaN. Calling isnull() on this data gives the following mask:
data.isnull()
name age
0 False False
1 True True
2 False False
3 False True
notnull() is the opposite of isnull(): it finds the non-missing values and marks them with True. Here is an example:
data.notnull()
name age
0 True True
1 False False
2 True True
3 True False
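In practice the two masks are usually summarized or used as filters rather than read cell by cell. The following is a minimal sketch of that idea, reusing the example data from above; the variable names (missing_per_column, rows_with_missing, known_age) are only illustrative.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'name': ['W3CSCHOOL', np.nan, 'JAVA', 'PYTHON'],
    'age': [18, np.nan, 99, None],
})

# True counts as 1, so summing the isnull() mask counts
# the missing values in each column.
missing_per_column = data.isnull().sum()
print(missing_per_column)

# any(axis=1) flags every row that has at least one missing value.
rows_with_missing = data[data.isnull().any(axis=1)]
print(rows_with_missing)

# A notnull() mask on one column keeps only the rows where it is present.
known_age = data[data['age'].notnull()]
print(known_age)
```

Summing the mask is a quick way to see how much data is missing before deciding whether to drop or fill it.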
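dropna(), as the name suggests, drops the missing values.
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)
Parameters:
axis: 0 drops rows that contain missing values, 1 drops columns.
how: 'any' drops a row or column if it contains any missing value; 'all' drops it only if all of its values are missing.
thresh: keep only the rows or columns that have at least this many non-missing values.
subset: the labels to check for missing values, for example a list of column names when dropping rows.
inplace: if True, modify the original DataFrame instead of returning a new one.
Here are some examples: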
data.dropna(axis=1,thresh=3)
name
0 W3CSCHOOL
1 NaN
2 JAVA
3 PYTHON
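The result above follows from how thresh counts values: a column is kept only if it has at least that many non-missing entries. A small check, assuming the same example data, uses count() to show the number of non-missing values per column next to the dropna() result.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'name': ['W3CSCHOOL', np.nan, 'JAVA', 'PYTHON'],
    'age': [18, np.nan, 99, None],
})

# count() returns the number of non-missing values in each column.
non_missing = data.count()
print(non_missing)

# With axis=1 and thresh=3, only columns with at least
# 3 non-missing values are kept.
kept = data.dropna(axis=1, thresh=3)
print(list(kept.columns))
```

Here name has three non-missing values and age has only two, so age is the column that gets dropped.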
data.dropna(axis=0,how='all')
name age
0 W3CSCHOOL 18.0
2 JAVA 99.0
3 PYTHON NaN
data.dropna(subset = ['name'])
name age
0 W3CSCHOOL 18.0
2 JAVA 99.0
3 PYTHON NaN
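The examples above pass how or subset explicitly. For comparison, here is a short sketch of the default behavior (how='any' on rows) and of thresh applied to rows, again assuming the same example data; complete_rows and at_least_one are illustrative names.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'name': ['W3CSCHOOL', np.nan, 'JAVA', 'PYTHON'],
    'age': [18, np.nan, 99, None],
})

# Default call: drop every row that has any missing value.
complete_rows = data.dropna()
print(complete_rows)

# thresh=1 keeps rows that have at least one non-missing value.
at_least_one = data.dropna(thresh=1)
print(at_least_one)

# inplace defaults to False, so the original data is unchanged.
print(len(data))
```

Because dropna() returns a new DataFrame by default, the result has to be assigned to a variable if it is needed later.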
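The purpose of fillna() is to fill in the missing values.
DataFrame.fillna(value=None, method=None, axis=None, inplace=False, limit=None, downcast=None)
Parameters:
value: the value used to fill the gaps; it can be a scalar, or a dict or Series that gives a different value for each column.
method: 'ffill' fills a gap with the last valid value before it, 'bfill' fills it with the next valid value after it.
axis: the axis along which to fill.
inplace: if True, modify the original DataFrame instead of returning a new one.
limit: the maximum number of missing values to fill.
downcast: how to downcast the data type after filling, if possible.
Here are some examples: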
data.fillna(0)
name age
0 W3CSCHOOL 18.0
1 0 0.0
2 JAVA 99.0
3 PYTHON 0.0
data.fillna(method='ffill')
name age
0 W3CSCHOOL 18.0
1 W3CSCHOOL 18.0
2 JAVA 99.0
3 PYTHON 99.0
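Filling every column with the same value is not always appropriate, since a 0 in the name column is hardly meaningful. The sketch below, assuming the same example data, passes a dict to fillna() so that each column gets its own fill value; the placeholder 'UNKNOWN' and the choice of the column mean are only illustrative. It also uses the ffill() method, which does the same forward fill as method='ffill' (recent pandas versions deprecate the method argument in favor of ffill() and bfill()).

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    'name': ['W3CSCHOOL', np.nan, 'JAVA', 'PYTHON'],
    'age': [18, np.nan, 99, None],
})

# A dict maps each column to its own fill value:
# a placeholder string for 'name', the column mean for 'age'.
filled = data.fillna({'name': 'UNKNOWN', 'age': data['age'].mean()})
print(filled)

# ffill() propagates the last valid value forward,
# like fillna(method='ffill').
forward = data.ffill()
print(forward)
```

mean() skips missing values by default, so the fill value for age here is the mean of 18 and 99.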