
r - Aggregate by year and month for a POSIX variable

Question:

I have a dataset of the following form.

        country            datetime       x
1 United States 2008-01-01 00:00:00 5962.06
2 United States 2008-01-02 00:00:00 6002.74
3 United States 2008-01-03 00:00:00 6040.98
4 United States 2008-01-04 00:00:00 6031.44
5 United States 2008-01-05 00:00:00 6029.91
6 United States 2008-01-06 00:00:00 6025.24

For me the time of day (hours, minutes, seconds) and the day of the week are irrelevant; I want to aggregate the values of variable "x" by country, year and month. Is there a straightforward way of doing this?

Answer:

The easiest way is possibly to use strftime to format your datetime as a character vector that contains only the year and month.

Assuming your column datetime is of class POSIXct, and that your data.frame is called dat:

dat$shortdate <- strftime(dat$datetime, format="%Y/%m")
dat
        country   datetime       x shortdate
1 United States 2008-01-01 5962.06   2008/01
2 United States 2008-01-02 6002.74   2008/01
3 United States 2008-01-03 6040.98   2008/01
4 United States 2008-01-04 6031.44   2008/01
5 United States 2008-01-05 6029.91   2008/01
6 United States 2008-01-06 6025.24   2008/01
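
For anyone who wants to reproduce this step, here is a minimal sketch that rebuilds the six sample rows and derives the year/month key. The UTC time zone is an assumption (the question does not state one); it is passed explicitly so the result does not depend on the session settings.

```r
# Rebuild the sample rows from the question; tz = "UTC" is an assumption.
dat <- data.frame(
  country  = "United States",
  datetime = as.POSIXct(sprintf("2008-01-%02d 00:00:00", 1:6), tz = "UTC"),
  x        = c(5962.06, 6002.74, 6040.98, 6031.44, 6029.91, 6025.24)
)

# Keep only the year and month as a character key.
dat$shortdate <- strftime(dat$datetime, format = "%Y/%m", tz = "UTC")

# All six rows fall in the same month, so there is a single key.
unique(dat$shortdate)
```

Because shortdate is a plain character vector, it can be used as a grouping variable by any aggregation function.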

Then it's a simple matter to use your favourite aggregation method to summarise the data. For example, using plyr (add country inside .() to group by country as well):

library(plyr)
ddply(dat, .(shortdate), summarize, mean_x=mean(x))

  shortdate   mean_x
1   2008/01 6015.395
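
If you would rather not depend on plyr, the same summary can probably be had from base R. The sketch below swaps in aggregate() with a formula and also groups by country, as the question asks. The sample data is rebuilt so the block runs on its own, and the UTC time zone is an assumption.

```r
# Rebuild the sample rows from the question; tz = "UTC" is an assumption.
dat <- data.frame(
  country  = "United States",
  datetime = as.POSIXct(sprintf("2008-01-%02d 00:00:00", 1:6), tz = "UTC"),
  x        = c(5962.06, 6002.74, 6040.98, 6031.44, 6029.91, 6025.24)
)
dat$shortdate <- format(dat$datetime, "%Y/%m", tz = "UTC")

# Mean of x for every country / year-month combination.
res <- aggregate(x ~ country + shortdate, data = dat, FUN = mean)
res
```

With more countries or months in the data, res gets one row per combination that actually occurs.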

Answer:

Given Andrie's better solution, this is mainly an illustration of POSIXlt. It uses the same assumptions about the classes of your variables as noted above (with the data.frame called dfrm here) and mean as the aggregating function:

aggregate(dfrm$x, list(dfrm$country, as.POSIXlt(dfrm$datetime)$year, 
                       as.POSIXlt(dfrm$datetime)$mon), FUN=mean)
         Group.1 Group.2 Group.3        x
1  United States     108       0 6015.395

Note that the POSIXlt year value counts years since 1900 and the month value runs from 0 to 11. So one can add 1900 to recover the calendar year, use the month value plus 1 as an index into the R constant vector 'month.abb', and add nice column labels:

aggregate(dfrm$x, list(Country=dfrm$country, 
                       Year=1900+as.POSIXlt(dfrm$datetime)$year, 
                       Month=month.abb[1+as.POSIXlt(dfrm$datetime)$mon]), 
          FUN=mean)
         Country Year Month        x
1  United States 2008   Jan 6015.395
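
To make the POSIXlt offsets concrete, here is a small sketch on a single timestamp taken from the sample data. The UTC time zone is an assumption.

```r
# One timestamp from the question's data; tz = "UTC" is an assumption.
lt <- as.POSIXlt("2008-01-03 00:00:00", tz = "UTC")

lt$year   # years since 1900
lt$mon    # months since January (0 to 11)

# Convert to a calendar year and an English month abbreviation.
year  <- 1900 + lt$year
month <- month.abb[lt$mon + 1]
```

month.abb is a built-in constant holding English abbreviations, so this label does not change with the session locale.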

Answer:

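You can use zoo::as.yearmon. The month label in the output follows the session locale ("ene" below is the Spanish abbreviation for January):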

library(zoo)
aggregate(x ~ country * as.yearmon(datetime), FUN=mean, data=dat)

 as.yearmon(datetime)       country        x
1             ene 2008 United States 6015.395