for loop - R: create a data.frame conditional on colnames and row entries of an existing df


I have a follow-up question.

I am creating a data.frame conditional on the column names and specific row entries of an existing data.frame. Below is how I resolved it using a for loop (thanks to @roland's suggestion; my real data violated the requirements of @eddi's answer), but it has been running on my actual data set (200 x 500,000+, rows x cols) for more than 2 hours now...

(The following generated data.frames are similar to my actual data.)

    set.seed(1)

    # a: one events value per year
    a <- data.frame(year=c(1986:1990),
                    events=round(runif(5,0,5),digits=2))

    # b: one events value per year/region/state combination
    b <- data.frame(year=c(rep(1986:1990,each=2,length.out=40),1986:1990),
                    region=c(rep(c("x","y"),10),rep(c("y","z"),10),rep("y",5)),
                    state=c(rep(c("ny","pa","nc","fl"),each=10),rep("al",5)),
                    events=round(runif(45,0,5),digits=2))

    # d: 0/1 indicator matrix, columns named by year
    d <- matrix(rbinom(200,1,0.5),10,20,
                dimnames=list(c(1:10), rep(1986:1990,each=4)))

    # e: region and state for each row of d
    e <- data.frame(id=sprintf("%02d",1:10), as.data.frame(d),
                    region=c("x","y","x","z","z","y","y","z","y","y"),
                    state=c("pa","al","ny","nc","nc","nc","fl","fl","al","al"))

    # replace each cell: a 0 gets the yearly value from a,
    # a 1 gets the year/state/region value from b
    for (i in seq_len(nrow(d))) {
      for (j in seq_len(ncol(d))) {
        d[i,j] <- ifelse(d[i,j]==0,
                         a$events[a$year==colnames(d)[j]],
                         b$events[b$year==colnames(d)[j] &
                                  b$state==e$state[i] &
                                  b$region==e$region[i]])
      }
    }

Is there a better/faster way to do this?

A simpler way (I think; it does not involve melting, dcasting, and merging) is as follows:

First, a and b are arrays, and should be indexed by year (for a) and by year/state/region (for b):

    at = a$events; names(at) = a$year
    bt = tapply(b$events,list(b$year,b$state,b$region),function(x) min(x))
    # note: I used min(x) in tapply to be on the safe side, so the function returns a scalar

    # create the result of the more complex case (lookup in b)
    ids = cbind(colnames(d)[col(d)],
                as.character(e$state[row(d)]),
                as.character(e$region[row(d)])
               )
    vals = bt[ids]; dim(vals) = dim(d)

    # and compute the desired result with ifelse
    result = ifelse(d==0,at[colnames(d)[col(d)]],vals)
    # and that's it!
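The lookup in bt works because R allows an array to be indexed by a character matrix with one column per dimension, where each row names one cell through the dimnames. A minimal sketch of that mechanism follows; the names b_toy, bt_toy, ids_toy and looked_up, and the toy values, are made up for illustration and are not part of the question's data.

```r
# Toy lookup table (hypothetical values, for illustration only)
b_toy <- data.frame(year   = c(1986, 1986, 1987, 1987),
                    state  = c("ny", "pa", "ny", "pa"),
                    region = c("x", "x", "x", "x"),
                    events = c(1.5, 2.5, 3.5, 4.5))

# tapply builds a 3-d array whose dimnames are the year, state and region values
bt_toy <- tapply(b_toy$events,
                 list(b_toy$year, b_toy$state, b_toy$region),
                 min)

# Each row of the character matrix addresses one cell: (year, state, region)
ids_toy <- cbind(c("1987", "1986"),
                 c("ny",   "pa"),
                 c("x",    "x"))

# One vectorized lookup instead of one logical subset per cell
looked_up <- bt_toy[ids_toy]
print(looked_up)
```

Combinations that never occur in the data end up as NA cells in the tapply result, so a missing year/state/region shows up as NA in the output rather than as an error.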

This should be faster (by avoiding the nested loops), but I haven't profiled it. Let me know how it works on the full data.
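Since the answer was not profiled, one way to check it is to run both versions on the sample data, confirm they agree, and time them. The sketch below assumes the generated data from the question; the wrappers loop_fill and vec_fill are hypothetical names introduced here, and e is reduced to the id, region and state columns because only region and state are used. Timings on a 10 x 20 matrix are tiny, so any real comparison would need a larger subset of the actual data.

```r
set.seed(1)
a <- data.frame(year = 1986:1990,
                events = round(runif(5, 0, 5), digits = 2))
b <- data.frame(year = c(rep(1986:1990, each = 2, length.out = 40), 1986:1990),
                region = c(rep(c("x", "y"), 10), rep(c("y", "z"), 10), rep("y", 5)),
                state = c(rep(c("ny", "pa", "nc", "fl"), each = 10), rep("al", 5)),
                events = round(runif(45, 0, 5), digits = 2))
d <- matrix(rbinom(200, 1, 0.5), 10, 20,
            dimnames = list(1:10, rep(1986:1990, each = 4)))
# Assumption: only region and state are needed, so the copy of d is left out of e
e <- data.frame(id = sprintf("%02d", 1:10),
                region = c("x", "y", "x", "z", "z", "y", "y", "z", "y", "y"),
                state = c("pa", "al", "ny", "nc", "nc", "nc", "fl", "fl", "al", "al"))

# Nested-loop version, written to return a new matrix instead of overwriting d
loop_fill <- function(d, a, b, e) {
  out <- d
  for (i in seq_len(nrow(d))) {
    for (j in seq_len(ncol(d))) {
      yr <- colnames(d)[j]
      out[i, j] <- if (d[i, j] == 0) {
        a$events[a$year == yr]
      } else {
        b$events[b$year == yr & b$state == e$state[i] & b$region == e$region[i]]
      }
    }
  }
  out
}

# Vectorized version, following the answer's approach
vec_fill <- function(d, a, b, e) {
  at <- a$events
  names(at) <- a$year
  bt <- tapply(b$events, list(b$year, b$state, b$region), min)
  ids <- cbind(colnames(d)[col(d)],
               as.character(e$state[row(d)]),
               as.character(e$region[row(d)]))
  vals <- bt[ids]
  ifelse(d == 0, at[colnames(d)[col(d)]], vals)
}

t_loop <- system.time(res_loop <- loop_fill(d, a, b, e))
t_vec  <- system.time(res_vec  <- vec_fill(d, a, b, e))

print(all.equal(res_loop, res_vec))  # do the two versions agree?
print(rbind(loop = t_loop, vectorized = t_vec)[, "elapsed"])
```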

