
python 2.7 - saving results of pandas ols regression to dataframe

Question:

I am running a regression on a grouped dataframe like so:

import pandas as pd
from pandas.stats.api import ols

df=pd.read_csv(r'C:\path_to_file.csv') #path to original file

#groupby POINTID
list1=[]
for i, grp in df.groupby('POINTID'):
    result = ols(y=grp['Date'], x=grp['SWIR32']) #run regression
    #turn regression parameters to a dataframe
    frame=pd.DataFrame({'POINTID':i, 'R2': result.r2, 'pvalue': result.p_value[1], 'rmse': result.rmse})
    list1.append(frame)
final_frame=pd.concat(list1)

but this returns:

ValueError: If using all scalar values, you must pass an index

When I change the dataframe creation line to this:

frame=pd.DataFrame({'R2': result.r2, 'pvalue': result.p_value[1] , 'rmse': result.rmse}, index=i)

this is returned:

TypeError: len() of unsized object

Essentially I just want the POINTID, r2, RMSE and p-value saved to one dataframe.

Answer:

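Use pd.Series instead of pd.DataFrame. A Series can be built from a dict of scalar values without an index; concatenating the Series along axis=1 and transposing then gives one row per POINTID.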

import pandas as pd
from pandas.stats.api import ols

df=pd.read_csv(r'C:\path_to_file.csv') #path to original file

#groupby POINTID
list1=[]
for i, grp in df.groupby('POINTID'):
    result = ols(y=grp['Date'], x=grp['SWIR32']) #run regression
    #turn regression parameters into a Series
    frame=pd.Series({'POINTID':i, 'R2': result.r2, 'pvalue': result.p_value[1], 'rmse': result.rmse})
    list1.append(frame)
final_frame=pd.concat(list1, axis=1).T
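
For readers on a recent pandas: pandas.stats.api.ols was removed in pandas 0.20, so the code above only runs on older versions. The sketch below is not the original approach; it illustrates the same "collect one record per group" pattern using only numpy and pandas. The synthetic data, the helper name fit_line, and the use of np.polyfit are assumptions made for illustration. The p-value is left out because it needs a statistics library (statsmodels is a common choice), and the RMSE here is the plain root-mean-square residual, which may differ from the degrees-of-freedom-adjusted value the old ols object reported.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the CSV in the question (values are made up).
df = pd.DataFrame({
    'POINTID': [1, 1, 1, 1, 2, 2, 2, 2],
    'SWIR32':  [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0],
    'Date':    [1.0, 3.0, 5.0, 7.0, 2.0, 1.0, 4.0, 3.0],
})

def fit_line(x, y):
    """Hypothetical helper: least-squares line y = slope*x + intercept.

    Returns (R2, RMSE), where RMSE has no degrees-of-freedom correction.
    """
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    ss_res = float(np.sum(resid ** 2))
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    rmse = float(np.sqrt(ss_res / len(y)))
    return r2, rmse

# Collect one plain dict per group, then build the frame once at the end.
rows = []
for pointid, grp in df.groupby('POINTID'):
    r2, rmse = fit_line(grp['SWIR32'].to_numpy(), grp['Date'].to_numpy())
    rows.append({'POINTID': pointid, 'R2': r2, 'rmse': rmse})

final_frame = pd.DataFrame(rows)
print(final_frame)

# The original DataFrame call also works if the index is list-like:
# a dict of scalars needs an index, and that index must be a list, not a scalar.
one_row = pd.DataFrame({'POINTID': 1, 'R2': 0.5}, index=[0])
```

Building the frame from a list of dicts avoids both errors from the question and lets each column keep its own dtype, so POINTID stays an integer column.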