
python - Pandas: fill in missing values by Mode having 'index out of bounds' error

Question:

Suppose I have the following DataFrame:

import numpy as np
import pandas as pd

Sample = pd.DataFrame({
    'Gender': ['Male','Male','Male','Male','Female','Female','Male','Male'],
    'Married': ['No','Yes','Yes','Yes','No','No','Yes','Yes'],
    'Dependents': ['1','1','1','0','3+','3+','1','1'],
    'Education': ['Not Graduate','Graduate','Graduate','Graduate','Not Graduate','Not Graduate','Graduate','Graduate'],
    'ApplicantIncome': [3596,3717,4166,2400,3333,6000,1234,4567],
    'Credit_History': ['1',np.nan,'0','1',np.nan,'1',np.nan,'0']})

   ApplicantIncome  Credit_History  Dependents     Education  Gender  Married
0             3596               1           1  Not Graduate    Male       No
1             3717             NaN           1      Graduate    Male      Yes
2             4166               0           1      Graduate    Male      Yes
3             2400               1           0      Graduate    Male      Yes
4             3333             NaN          3+  Not Graduate  Female       No
5             6000               1          3+  Not Graduate  Female       No
6             1234             NaN           1      Graduate    Male      Yes
7             4567               0           1      Graduate    Male      Yes

I would like to fill each NaN in Credit_History with the mode of its ['Gender','Married','Dependents','Education'] group.

I wrote the code below:

Sample['Credit_History'] = Sample.groupby(['Gender','Married','Dependents','Education']).transform(
    lambda x: x.fillna(x.mode()[0]))['Credit_History']

It raised an out-of-bounds error:

IndexError: ('index out of bounds', 'occurred at index ApplicantIncome')

Any idea how to fix the code above? Thanks!

Answer:

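If the mode of the whole column is good enough, a one-liner will do: Sample['Credit_History'] = Sample['Credit_History'].fillna(Sample['Credit_History'].mode()[0]). Note that mode() returns a Series, so take its first element with [0] before passing it to fillna. This fills with the overall mode of the column, not the mode of each group.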

Don't forget to import numpy.
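
Since the question asks for the mode within each group, a per-group version can be sketched as follows. The error is reported at ApplicantIncome, which suggests the lambda was applied to every non-key column and that mode() came back empty for one of them, so x.mode()[0] had nothing to index. This sketch selects only Credit_History before calling transform and guards against an empty mode. The helper name fill_with_group_mode and the choice to leave a group untouched when it has no mode are assumptions made for illustration, not part of the original answer.

```python
import numpy as np
import pandas as pd

Sample = pd.DataFrame({
    'Gender': ['Male', 'Male', 'Male', 'Male', 'Female', 'Female', 'Male', 'Male'],
    'Married': ['No', 'Yes', 'Yes', 'Yes', 'No', 'No', 'Yes', 'Yes'],
    'Dependents': ['1', '1', '1', '0', '3+', '3+', '1', '1'],
    'Education': ['Not Graduate', 'Graduate', 'Graduate', 'Graduate',
                  'Not Graduate', 'Not Graduate', 'Graduate', 'Graduate'],
    'ApplicantIncome': [3596, 3717, 4166, 2400, 3333, 6000, 1234, 4567],
    'Credit_History': ['1', np.nan, '0', '1', np.nan, '1', np.nan, '0'],
})

group_cols = ['Gender', 'Married', 'Dependents', 'Education']


def fill_with_group_mode(s):
    """Fill NaN in one group's values with that group's mode (hypothetical helper)."""
    modes = s.mode()  # NaN is ignored by default
    if modes.empty:
        # Assumption: a group with no usable value is left as-is.
        return s
    return s.fillna(modes.iloc[0])


# Select the single column first so the function never sees ApplicantIncome.
filled = Sample.groupby(group_cols)['Credit_History'].transform(fill_with_group_mode)
Sample['Credit_History'] = filled
print(Sample)
```

When a group has several equally frequent values, mode() returns all of them in sorted order and this sketch takes the first; a different tie-breaking rule may suit your data better.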
