df.dropna() removes missing data from a DataFrame, that is, it drops the rows or columns that contain NaN values.
Official function description:
DataFrame.dropna(axis=0, how='any', thresh=None, subset=None, inplace=False)

Remove missing values.

See the User Guide for more on which values are considered missing, and how to work with missing data.

Returns: DataFrame
    DataFrame with NA entries dropped from it.
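To make "which values are considered missing" concrete, here is a minimal sketch. The sample frame and its index labels are made up for illustration; the point is that NaN, None and NaT count as missing, while an empty string does not.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: one "empty-looking" value per row.
values = pd.DataFrame(
    {"value": [np.nan, None, pd.NaT, "", "text"]},
    index=["NaN", "None", "NaT", "empty string", "text"],
)

# isna() marks the entries that dropna() treats as missing.
missing_mask = values["value"].isna()
print(missing_mask)

# The empty string is an ordinary value, so its row is kept.
kept = values.dropna()
print(kept.index.tolist())
```

If empty strings should also be dropped, one option is to convert them to NaN first, for example with `replace("", np.nan)`, before calling dropna().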
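Parameters:
axis: {0 or 'index', 1 or 'columns'}, default 0. Whether to drop rows (0) or columns (1) that contain missing values.
how: {'any', 'all'}, default 'any'. With 'any', a row or column is dropped if any of its values is missing; with 'all', only if all of its values are missing.
thresh: int, optional. Keep only the rows or columns that have at least this many non-NA values.
subset: label or list of labels, optional. Labels along the other axis to consider, e.g. when dropping rows, the columns in which to look for missing values.
inplace: bool, default False. If True, modify the DataFrame itself instead of returning a new one.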
Examples:
>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({"name": ['Alfred', 'Batman', 'Catwoman'],
...                    "toy": [np.nan, 'Batmobile', 'Bullwhip'],
...                    "born": [pd.NaT, pd.Timestamp("1940-04-25"),
...                             pd.NaT]})
>>> df
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
Drop the rows where at least one element is missing:
>>> df.dropna()
     name        toy       born
1  Batman  Batmobile 1940-04-25
Drop the columns where at least one element is missing:
>>> df.dropna(axis=1)
       name
0    Alfred
1    Batman
2  Catwoman
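Drop the rows where all elements are missing (no row in this DataFrame is entirely empty, so nothing is dropped):
>>> df.dropna(how='all')
       name        toy       born
0    Alfred        NaN        NaT
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT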
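Because the example above leaves the DataFrame unchanged, the following sketch uses a modified frame (hypothetical data, not from the original example) in which one row and one column are entirely missing, so that how='all' actually removes something.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of the example data: row 1 and the "born"
# column contain nothing but missing values.
df_all = pd.DataFrame({
    "name": ["Alfred", None, "Catwoman"],
    "toy": [np.nan, np.nan, "Bullwhip"],
    "born": [pd.NaT, pd.NaT, pd.NaT],
})

# Rows are dropped only if every value in them is missing.
rows_kept = df_all.dropna(how="all")
print(rows_kept.index.tolist())

# The same rule applied to columns via axis=1.
cols_kept = df_all.dropna(axis=1, how="all")
print(cols_kept.columns.tolist())
```

Row 0 still has a name, so it survives even though its other two values are missing.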
Keep only the rows with at least 2 non-NA values:
>>> df.dropna(thresh=2)
       name        toy       born
1    Batman  Batmobile 1940-04-25
2  Catwoman   Bullwhip        NaT
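One way to read thresh is as a lower bound on the number of non-NA values per row. The sketch below rebuilds the example DataFrame and reproduces the same selection with an explicit boolean mask; the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman"],
    "toy": [np.nan, "Batmobile", "Bullwhip"],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
})

# Count the non-missing values in each row.
non_na_per_row = df.notna().sum(axis=1)
print(non_na_per_row.tolist())  # [1, 3, 2]

# Keeping rows with a count of at least 2 selects the same rows
# as df.dropna(thresh=2).
by_mask = df[non_na_per_row >= 2]
by_thresh = df.dropna(thresh=2)
print(by_mask.equals(by_thresh))
```

Alfred has only one non-NA value (the name), so his row is the only one removed.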
Define in which columns to look for missing values:
>>> df.dropna(subset=['name', 'born'])
     name        toy       born
1  Batman  Batmobile 1940-04-25
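The subset argument restricts the check to the listed columns, while the remaining columns are still returned. As a rough sketch of that behaviour, the same rows can be selected with a mask built from just those columns (names such as `mask` are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman"],
    "toy": [np.nan, "Batmobile", "Bullwhip"],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
})

# A row passes if "name" and "born" are both present;
# the "toy" column is ignored by the check.
mask = df[["name", "born"]].notna().all(axis=1)
by_mask = df[mask]
by_subset = df.dropna(subset=["name", "born"])
print(by_subset.index.tolist())  # [1]

# Checking only "toy" keeps Catwoman even though "born" is NaT.
toy_only = df.dropna(subset=["toy"])
print(toy_only.index.tolist())  # [1, 2]
```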
Modify the original DataFrame in place:
>>> df.dropna(inplace=True)
>>> df
     name        toy       born
1  Batman  Batmobile 1940-04-25
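As a point of comparison, the sketch below contrasts inplace=True with the common alternative of assigning the result to a new name. The variable names (`original`, `df_inplace`, `df_new`) are illustrative, and the call to reset_index is an optional extra step, not part of dropna itself.

```python
import numpy as np
import pandas as pd

original = pd.DataFrame({
    "name": ["Alfred", "Batman", "Catwoman"],
    "toy": [np.nan, "Batmobile", "Bullwhip"],
    "born": [pd.NaT, pd.Timestamp("1940-04-25"), pd.NaT],
})

# With inplace=True the object itself is changed.
df_inplace = original.copy()
df_inplace.dropna(inplace=True)
print(len(original), len(df_inplace))  # 3 1

# Without inplace, dropna() returns a new DataFrame and leaves
# `original` untouched; reset_index(drop=True) renumbers the rows from 0.
df_new = original.dropna().reset_index(drop=True)
print(df_new.index.tolist())  # [0]
```

Note that the surviving row keeps its old index label (1) unless the index is reset explicitly.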
That's all.