Fill Cell Containing Nan With Average Of Value Before And After

I would like to fill missing values in a pandas dataframe with the average of the cells directly before and after the missing value. So if it was [1, NaN, 3], the NaN value would b

Solution 1:

Consider this dataframe

import numpy as np
import pandas as pd

df = pd.DataFrame({'val': [1, np.nan, 4, 5, np.nan, 10]})

    val
0   1.0
1   NaN
2   4.0
3   5.0
4   NaN
5  10.0

You can use fillna() together with shift() to get the desired output. This works when every NaN has a valid value directly before and after it:

df.val = df.val.fillna((df.val.shift() + df.val.shift(-1))/2)

You get

    val
0   1.0
1   2.5
2   4.0
3   5.0
4   7.5
5  10.0
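Here is a minimal, self-contained sketch of the same shift() idea, wrapped in a helper so it can be reused on any column. The helper name `fill_isolated_nans` and the `gappy` series are my own illustration, not part of the original answer. It also shows the limitation: a NaN next to another NaN, or in the first or last position, is left as NaN.

```python
import numpy as np
import pandas as pd


def fill_isolated_nans(s: pd.Series) -> pd.Series:
    """Replace each NaN with the mean of its immediate neighbours.

    Hypothetical helper for illustration. A NaN whose previous or next
    value is missing (or does not exist) stays NaN.
    """
    # shift() looks one row back, shift(-1) looks one row ahead.
    neighbour_mean = (s.shift() + s.shift(-1)) / 2
    return s.fillna(neighbour_mean)


df = pd.DataFrame({'val': [1, np.nan, 4, 5, np.nan, 10]})
df['val'] = fill_isolated_nans(df['val'])
print(df)

# Two adjacent NaNs: each one has a NaN neighbour, so neither is filled.
gappy = pd.Series([1.0, np.nan, np.nan, 4.0])
print(fill_isolated_nans(gappy))
```

If runs of consecutive NaNs are possible in your data, the ffill/bfill approach in Solution 2 may be a better fit.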

Solution 2:

Use spies006's example df.

df = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3],
                   'b': [6, -3, np.nan, 12, 8, 11, -5, -5],
                   'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})

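# Use DataFrame.where to keep the non-null cells and fill each NaN with the
# average of the nearest valid values above and below it.
# (fillna(method=...) is deprecated in recent pandas; ffill()/bfill() do the same.)
df.where(df.notnull(), other=(df.ffill() + df.bfill()) / 2)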
Out[2517]: 
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   4.5  1.0
3  -2  12.0  1.0
4   4   8.0  1.5
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0
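To see how this behaves on less tidy data, here is a small sketch under my own assumptions: the helper name `fill_with_nearest_valid_mean` and the `run` and `edge` columns are made up for illustration. A run of NaNs is filled with one flat value (the mean of the valid values on either side of the run), and a NaN in the first row stays NaN because there is nothing to forward-fill from.

```python
import numpy as np
import pandas as pd


def fill_with_nearest_valid_mean(frame: pd.DataFrame) -> pd.DataFrame:
    """Fill each NaN with the mean of the nearest valid values above and below.

    Hypothetical helper for illustration; leading or trailing NaNs stay NaN.
    """
    surrounding_mean = (frame.ffill() + frame.bfill()) / 2
    # Keep existing values, take the mean only where the cell is null.
    return frame.where(frame.notnull(), other=surrounding_mean)


df = pd.DataFrame({'b': [6, -3, np.nan, 12, 8],         # isolated NaN
                   'run': [1, np.nan, np.nan, 4, 5],    # two NaNs in a row
                   'edge': [np.nan, 2, 3, 4, 5]})       # NaN in the first row
filled = fill_with_nearest_valid_mean(df)
print(filled)
# 'b'    -> 4.5 at index 2
# 'run'  -> 2.5 at both index 1 and index 2
# 'edge' -> index 0 is still NaN
```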

Solution 3:

This will work as long as there is no NaN in the first or last row, which is implied by your imputation methodology, and as long as no two NaNs are adjacent in the same column (those would stay NaN).

>>> data = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3],
...                      'b': [6, -3, np.nan, 12, 8, 11, -5, -5],
...                      'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})
>>> data
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   NaN  1.0
3  -2  12.0  1.0
4   4   8.0  NaN
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0



>>> nan_cols = data.columns[data.isnull().any(axis=0)]
>>> for col in nan_cols:
...     for i in range(len(data)):
...         if pd.isnull(data.loc[i, col]):
...             data.loc[i, col] = (data.loc[i-1, col] + data.loc[i+1, col]) / 2
...
>>> data
    a     b   id
0  10   6.0  1.0
1   6  -3.0  1.0
2  -3   4.5  1.0
3  -2  12.0  1.0
4   4   8.0  1.5
5  12  11.0  2.0
6   3  -5.0  2.0
7   3  -5.0  4.0
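If you cannot guarantee that the first and last rows are free of NaNs, a guarded version of the loop might look like the sketch below. The function name `fill_by_loop` and the `edge` frame are my own assumptions for illustration. It uses positional access so it does not depend on the index labels, works on a copy, and simply skips the first and last rows.

```python
import numpy as np
import pandas as pd


def fill_by_loop(frame: pd.DataFrame) -> pd.DataFrame:
    """Loop-based neighbour averaging that leaves edge NaNs alone.

    Hypothetical helper for illustration; returns a filled copy.
    """
    out = frame.copy()
    last = len(out) - 1
    for col in out.columns[out.isnull().any(axis=0)]:
        col_pos = out.columns.get_loc(col)
        # Start at 1 and stop before the last row: both neighbours must exist.
        for i in range(1, last):
            if pd.isnull(out.iat[i, col_pos]):
                before = out.iat[i - 1, col_pos]
                after = out.iat[i + 1, col_pos]
                out.iat[i, col_pos] = (before + after) / 2
    return out


data = pd.DataFrame({'a': [10, 6, -3, -2, 4, 12, 3, 3],
                     'b': [6, -3, np.nan, 12, 8, 11, -5, -5],
                     'id': [1, 1, 1, 1, np.nan, 2, 2, 4]})
filled = fill_by_loop(data)
print(filled)

# NaNs in the first and last rows are skipped instead of raising an error.
edge = pd.DataFrame({'x': [np.nan, 2.0, np.nan, 4.0, np.nan]})
edge_filled = fill_by_loop(edge)
print(edge_filled)
```

For large frames the vectorized approaches in Solutions 1 and 2 will usually be much faster than a Python-level loop like this.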
