
How To Optimise Filtering And Counting For Every Row In A Large R Data Frame

I have a data frame, such as the following:

  name day wages
1  Ann   1   100
2  Ann   1   150
3  Ann   2   200
4  Ann   3   150
5  Bob   1   100
6  Bob   1   200
7  Bob   1   150

Solution 1:

Going simply from your example output, here's something a bit fancier using data.table:

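library(data.table)
DT <- as.data.table(df)
setkey(DT, name, day)  # sort by name, then day

# Count wages >= 175 per (name, day), then flag each day whose own count plus the next row's count is positive
DT[, list(gt175 = sum(wages >= 175)), list(name, day)][, list(day = day, gt175 = as.integer(gt175 + c(tail(gt175, -1), 0) > 0)), list(name)]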

The chained call is a little convoluted, but it should be fast, since both grouping steps run inside data.table rather than in a per-row R loop.
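For readers who find the chained call hard to follow, here is a minimal sketch of the same logic split into two steps. It assumes the seven example rows from the question. The names `counts`, `n_high` and `result` are illustrative and not part of the original answer. It also swaps the `c(tail(x, -1), 0)` idiom for data.table's `shift()` with `type = "lead"`, which should be equivalent here.

```r
library(data.table)

# Example data reconstructed from the question.
df <- data.frame(
  name  = c("Ann", "Ann", "Ann", "Ann", "Bob", "Bob", "Bob"),
  day   = c(1, 1, 2, 3, 1, 1, 1),
  wages = c(100, 150, 200, 150, 100, 200, 150)
)

DT <- as.data.table(df)
setkey(DT, name, day)  # sorts by name, then day

# Step 1: per (name, day), count the wages at or above the threshold.
# `n_high` is an illustrative column name.
counts <- DT[, .(n_high = sum(wages >= 175)), by = .(name, day)]

# Step 2: within each name, add the count from the next row
# (0 after the last row) and flag the row if the total is positive.
counts[, gt175 := as.integer(n_high + shift(n_high, type = "lead", fill = 0L) > 0),
       by = name]

result <- counts[, .(name, day, gt175)]
print(result)
```

Note that "next row" means the next recorded day for that name, which is not necessarily day + 1 if some days are missing from the data. Despite the column name `gt175`, the comparison is `>=`, as in the original answer.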
