Excel - Record how long a variable was above a level in R
I am working on converting a project that I have programmed in Excel to R. The reason for doing this is that the code includes lots of logic and data, which means that Excel's performance is poor. So far I have coded around 50% of the project in R, and I am extremely impressed with the performance.
The code I have does the following:
- Loads 5-minute time-series data of a stock and adds a day-of-year column, labeled doy in the example below.
The OHLC data looks like this:
                 date     open     high      low    close doy
1 2015-09-21 09:30:00 164.6700 164.7100 164.3700 164.5300 264
2 2015-09-21 09:35:00 164.5300 164.9000 164.5300 164.6400 264
3 2015-09-21 09:40:00 164.6600 164.8900 164.6000 164.8900 264
4 2015-09-21 09:45:00 164.9100 165.0900 164.9100 164.9736 264
5 2015-09-21 09:50:00 164.9399 165.0980 164.8200 164.8200 264
- Converts the data to a table called df:
df <- tbl_df(dia_5)
- Using plyr with a hint of TTR, filters through the data, creating a set of 10 new variables in a new data frame called data. See below:
data <- structure(list(
  doy = c(264, 265, 266, 267, 268, 271, 272, 11, 12, 13),
  date = structure(c(1442824200, 1442910600, 1442997000, 1443083400,
                     1443169800, 1443429000, 1443515400, 1452504600,
                     1452591000, 1452677400),
                   class = c("POSIXct", "POSIXt"), tzone = ""),
  or_high = c(164.71, 162.96, 163.38, 161.37, 163.91, 162.06, 160.22, 164.5, 165.23, 165.84),
  or_low = c(164.37, 162.62, 162.98, 161.06, 163.57, 161.66, 159.7, 164.06, 164.84, 165.4),
  hod = c(165.56, 163.36, 163.38, 162.24, 164.43, 162.06, 160.96, 164.5, 165.78, 165.84),
  lod = c(165.22, 163.1, 162.98, 161.95, 164.24, 161.66, 160.75, 164.06, 165.56, 165.4),
  close = c(164.92, 163.02, 162.58, 161.85, 162.94, 159.84, 160.19, 163.83, 165.02, 161.38),
  range = c(0.340000000000003, 0.260000000000019, 0.400000000000006, 0.29000000000002,
            0.189999999999998, 0.400000000000006, 0.210000000000008, 0.439999999999998,
            0.219999999999999, 0.439999999999998),
  `a-val` = c(NA, NA, NA, NA, NA, NA, NA, 0.0673439999999994, 0.0659639999999996, 0.0729499999999996),
  `a-up` = c(NA, NA, NA, NA, NA, NA, NA, 164.567344, 165.295964, 165.91295),
  `a-down` = c(NA, NA, NA, NA, NA, NA, NA, 163.992656, 164.774036, 165.32705)),
  .Names = c("doy", "date", "or_high", "or_low", "hod", "lod", "close", "range",
             "a-val", "a-up", "a-down"),
  row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 78L, 79L, 80L),
  class = "data.frame")
The next part gets complicated. I need to analyse the high and low prices of each 5-minute bar of the day in relation to the a-up, a-down and close values seen in the table. I am looking to compute a score for the day depending on the time spent above the a-up level or below the a-down level.
The way I got this done in Excel was to index each 5-minute high and low price of the time series and then use logic to score the activity in each 5-minute time slice. If the low > a-up level, the bar was given a 1, and a -1 if the high < a-down level. The scoring is such that if the price stays > a-up level or < a-down level for more than 30 minutes, the score is 2 or -2. I achieved this using a running 5-period sum of these results: if I had more than 5 ones, I knew the price had stayed > a-up level (and likewise for a-down), giving a score of 2.
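The per-bar scoring described above might be sketched in base R roughly as follows. This is only an illustration under assumptions, not the original Excel logic: the function names (score_bars, longest_run, score_day) are hypothetical, the levels are passed in as plain numbers, and "more than 30 minutes" is taken to mean an unbroken run of at least 7 five-minute bars. It also swaps the running 5-period sum for a run-length count via rle().

```r
# Hypothetical helper: +1 if the whole bar is above a_up,
# -1 if the whole bar is below a_down, 0 otherwise.
score_bars <- function(high, low, a_up, a_down) {
  ifelse(low > a_up, 1L, ifelse(high < a_down, -1L, 0L))
}

# Length of the longest unbroken run of `value` in x (0 if it never occurs).
# NA scores (e.g. when a level is missing) are ignored.
longest_run <- function(x, value) {
  r <- rle(x)
  hits <- r$lengths[!is.na(r$values) & r$values == value]
  if (length(hits) == 0L) 0L else max(hits)
}

# Day score per side. Assumption: min_bars = 7 stands for "more than 30 minutes"
# on 5-minute bars. 2 / -2 = level held, 1 / -1 = touched but failed, 0 = never there.
score_day <- function(high, low, a_up, a_down, min_bars = 7L) {
  s <- score_bars(high, low, a_up, a_down)
  up_run <- longest_run(s, 1L)
  down_run <- longest_run(s, -1L)
  list(
    up_score = if (up_run >= min_bars) 2L else if (up_run > 0L) 1L else 0L,
    down_score = if (down_run >= min_bars) -2L else if (down_run > 0L) -1L else 0L
  )
}

# Made-up bars: seven consecutive bars above 10.5, then one bar below 9.8.
high <- c(10.2, 10.8, 10.9, 11.0, 11.1, 11.2, 11.3, 11.4, 10.4, 9.7)
low  <- c(10.0, 10.6, 10.7, 10.8, 10.9, 11.0, 11.1, 11.2, 10.1, 9.5)
res <- score_day(high, low, a_up = 10.5, a_down = 9.8)
```

Because everything here is vectorised over one day's bars, the same function could be applied per day and per instrument, for example with split() and lapply(), or with a grouped summarise.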
For the day's scoring I need to know the following:
- Did the price stay above or below a level for > 30 minutes, or did it fail, spending < 30 minutes there?
- If the price went above and below both levels in one day, which level did it break first?
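For the second point, one possible approach is to compare the index of the first bar that breaks each level. The sketch below is only an assumption about what "break" means: a bar whose high exceeds a_up counts as an upside break, and a bar whose low falls under a_down counts as a downside break. The function name first_break is hypothetical.

```r
# Hypothetical helper: report which level was broken first within one day.
# Assumption: "break" means high > a_up (upside) or low < a_down (downside).
first_break <- function(high, low, a_up, a_down) {
  up_idx <- which(high > a_up)[1]      # NA if the up level was never broken
  down_idx <- which(low < a_down)[1]   # NA if the down level was never broken
  if (is.na(up_idx) && is.na(down_idx)) return("none")
  if (is.na(down_idx)) return("up")
  if (is.na(up_idx)) return("down")
  if (up_idx < down_idx) "up" else if (down_idx < up_idx) "down" else "same bar"
}

# Made-up bars: the up level (10.5) breaks on bar 3, the down level (9.8) on bar 4.
high <- c(10.2, 10.4, 10.7, 10.3, 9.9)
low  <- c(10.0, 10.1, 10.4, 9.7, 9.6)
first_level <- first_break(high, low, a_up = 10.5, a_down = 9.8)
```

which() drops NA comparisons, so days where the levels are missing simply return "none".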
So, after that long-winded intro, the question: does anyone out there have an idea of the best way to go about coding this? I don't need specific code, just packages that may accomplish this. As mentioned above, my reason for switching to R is speed, so whatever code is used must be efficient. When I have this coded, I intend on programming a loop so that I can analyse several hundred instruments.
Thanks.