R - Partitioning units based on two variables

Question:

I am trying to partition the observations in a data frame into 36 groups, based on two variables. More specifically, I want to cut each of the two variables into six intervals and then assign each observation to one of the 36 possible combinations.

My attempt is below, which works. But is there a faster way to do this that avoids the double for loops?

Also, this isn't necessary, but how could I visualize the total number of observations in each group in a 6 by 6 grid? I know table() would produce a list of the 36 possible groups and their totals, but not in grid format.

set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1, x2)

# Interval labels for x1, parsed into numeric lower/upper bounds
labs1 <- levels(cut(x1, 6))
ints1 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs1)),
               upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs1)))

# Same for x2
labs2 <- levels(cut(x2, 6))
ints2 <- cbind(lower = as.numeric(sub("\\((.+),.*", "\\1", labs2)),
               upper = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", labs2)))

# All 36 combinations of an x1 interval with an x2 interval
tmp <- expand.grid(labs1, labs2)
groups <- cbind(lower1 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,1])),
                upper1 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,1])),
                lower2 = as.numeric(sub("\\((.+),.*", "\\1", tmp[,2])),
                upper2 = as.numeric(sub("[^,]*,([^]]*)\\]", "\\1", tmp[,2])))

# For each observation, find the group whose bounds contain both values
for (i in 1:1000){
  for (j in 1:36){
    if (x1[i] >= groups[j,1] & x1[i] <= groups[j,2] &
        x2[i] >= groups[j,3] & x2[i] <= groups[j,4]){
      data$group[i] <- j
    }
  }
}

Answer:

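You can combine apply(), which iterates over the rows of your data.frame, with which(), which tests each row against all rows of your groups matrix at once. Note that if a row matches no group, or more than one (for example a value lying exactly on a shared boundary), which() returns a result whose length is not 1, and apply() then returns a list rather than a vector: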

data$group <- apply(data, 1, FUN=function(dataRow) 
  which(
    dataRow[1] >= groups[,1] & 
    dataRow[1] <= groups[,2] & 
    dataRow[2] >= groups[,3] & 
    dataRow[2] <= groups[,4]))
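
As a possible alternative that avoids both the loops and the label parsing, the bin index can be taken directly from the factor that cut() returns, and table() called with two factors gives the counts as a 6 by 6 grid. This is a minimal sketch rather than a drop-in replacement: the names bin1, bin2 and grid are introduced here for illustration, and it uses cut()'s own half-open intervals (a, b] instead of the rounded bounds parsed from the labels, so observations very close to a boundary may land in a different group than in the loop version.

```r
set.seed(123)
x1 <- rnorm(1000)
x2 <- rnorm(1000)
data <- data.frame(x1, x2)

# Bin each variable into 6 equal-width intervals. cut() returns a factor,
# and as.integer() on that factor gives the bin index 1..6.
bin1 <- cut(data$x1, 6)   # bin1, bin2: illustrative names, not from the question
bin2 <- cut(data$x2, 6)

# Combine the two bin indices into a single id in 1..36. The x1 bin varies
# fastest, which matches the row order of expand.grid(labs1, labs2).
data$group <- as.integer(bin1) + 6L * (as.integer(bin2) - 1L)

# Two-way contingency table: rows are x1 bins, columns are x2 bins.
grid <- table(x1 = bin1, x2 = bin2)
print(grid)
```

Passing two factors to table() is what produces the grid layout; table(data$group) on its own would give the flat list of 36 counts mentioned in the question.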