regex - R: Reading multiline data patterns from file


Here is the file pattern:

    metastring: time1, a,b,c,d,f
    144135 42435 345425 2342423
    263766 35553 353453 3534553
    355345 52454 525252 2423465
    245466 45645 355345 6454556
    355662 26397 353577 3558676
    metastring: time2, a,c,d,f
    224234 23423 324234 4242324
    312323 13123 312312 1312321
    246456 63564 646544 4456456
    244424 53556 546456 4645645

The metastrings consist of a time stamp and a list of names (a, b, c, d, ...) that refer to the strings of numbers below them (e.g. "a" refers to the first number string of the block). The number strings are fixed-width, but how many of them there are is not constant; it depends on the metastring. I want either a data.frame structured like this:

    time1 a 144135 42435 345425 2342423
    time1 b 263766 35553 353453 3534553
    time1 c 355345 52454 525252 2423465
    time1 d 245466 45645 355345 6454556
    time1 f 355662 26397 353577 3558676
    time2 a 224234 23423 324234 4242324
    time2 c 312323 13123 312312 1312321
    time2 d 246456 63564 646544 4456456
    time2 f 244424 53556 546456 4645645

or to be able to read a single block at a time, by matching the metastring format and then reading the lines between two metastrings. I can't find a way to do it, since gsubfn's read.pattern seems to read the file one line at a time, so it can't look further than the metastring.

To get a data frame in return, here's a possibility that uses readLines() and some post-processing on the strings. In the code, replace textConnection(text) with the name of your file.

    ## read the file
    dat <- readLines(textConnection(text))
    ## find the 'metastring' lines
    meta <- grepl("metastring", dat, fixed = TRUE)
    ## split the 'metastring' lines and
    ## create the first two columns
    f2cols <- do.call(
        "rbind",
        lapply(
            strsplit(dat[meta], "(.*: )|, ?"),
            function(x) cbind(text1 = x[2], text2 = tail(x, -2))
        )
    )
    ## create the final data frame
    cbind(f2cols, read.table(text = dat[!meta]))
    #   text1 text2     V1    V2     V3      V4
    # 1 time1     a 144135 42435 345425 2342423
    # 2 time1     b 263766 35553 353453 3534553
    # 3 time1     c 355345 52454 525252 2423465
    # 4 time1     d 245466 45645 355345 6454556
    # 5 time1     f 355662 26397 353577 3558676
    # 6 time2     a 224234 23423 324234 4242324
    # 7 time2     c 312323 13123 312312 1312321
    # 8 time2     d 246456 63564 646544 4456456
    # 9 time2     f 244424 53556 546456 4645645
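To see where the first two columns come from, it can help to look at what strsplit() returns for a single metastring line. This is only a small sketch on one header line from the sample data; the names `line`, `parts`, `time_stamp` and `row_names` are made up for illustration.

```r
## one header line, copied from the sample data
line <- "metastring: time1, a,b,c,d,f"

## the pattern removes everything up to ": " and splits on "," or ", "
parts <- strsplit(line, "(.*: )|, ?")[[1]]
parts
# [1] ""      "time1" "a"     "b"     "c"     "d"     "f"

## element 1 is the empty string left of the "metastring: " prefix,
## element 2 is the time stamp, the rest are the names of the number strings
time_stamp <- parts[2]
row_names  <- tail(parts, -2)
```

This is why x[2] and tail(x, -2) are used when building the two text columns: the time stamp is recycled once per name.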

Data:

text <- "metastring: time1, a,b,c,d,f\n144135 42435 345425 2342423\n263766 35553 353453 3534553\n355345 52454 525252 2423465\n245466 45645 355345 6454556\n355662 26397 353577 3558676\nmetastring: time2, a,c,d,f\n224234 23423 324234 4242324\n312323 13123 312312 1312321\n246456 63564 646544 4456456\n244424 53556 546456 4645645" 
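If you would rather work with one block at a time (the second option in the question), one possible approach is to number the blocks with cumsum() over the metastring indicator and split() the lines on that number. This is a sketch under assumptions, not part of the original answer: `txt` is a shortened sample in the same layout, and the names `is_meta`, `block_id` and `blocks` are made up for illustration.

```r
## shortened sample in the same layout as the data above (two blocks)
txt <- "metastring: time1, a,b\n144135 42435\n263766 35553\nmetastring: time2, a\n224234 23423"

con <- textConnection(txt)
dat <- readLines(con)
close(con)

## TRUE for header lines; the running sum gives every line its block number
is_meta  <- grepl("metastring", dat, fixed = TRUE)
block_id <- cumsum(is_meta)

## turn each block (header line plus its number lines) into a data frame
blocks <- lapply(split(dat, block_id), function(b) {
    header <- strsplit(b[1], "(.*: )|, ?")[[1]]
    data.frame(
        time = header[2],            # time stamp, recycled for every row
        name = tail(header, -2),     # one name per number string
        read.table(text = b[-1])     # the number strings themselves
    )
})

## label each block by its time stamp so it can be picked out by name
names(blocks) <- vapply(blocks, function(d) as.character(d$time[1]), character(1))
blocks[["time2"]]
```

Each element of `blocks` is then a small data frame for one metastring, and do.call("rbind", blocks) should give back a combined table. For a real file, pass the file name to readLines() instead of the text connection.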
