regex - How do I extract a range of lines from a text file using sed -n but in Python?

- March 15, 2011

Say I have a 10 GB file with 20,000 lines filled with digits of pi.

123123
12312312
123123
123123
12312312
123123

How can I extract lines 10,000 to 20,000 using the Unix command sed -n?

I'd like each line to keep its newline character, and to export the result to a file using the code below.

So far, I have the following:

com = "sed -n \' " + str(window[0]) + "," + str(window[1]) + "p\' " + "sample.txt" + ">" + "output.txt"
os.system(com)

But it is throwing concatenation errors.

How should I phrase the sed -n command in Python in the program below?

inputfilename = "sample.txt"

import itertools
import linecache


def sliding_window(window_size, step_size, last_window_start):
    for i in xrange(0, last_window_start, step_size):
        yield (i, i + window_size)
    yield (last_window_start, total_pi_digits)


def picrop(window_size, step_size):
    f = open(inputfilename, 'r')
    first_line = f.readline().split()
    total_pi_digits = int(first_line[0])
    last_window_start = total_pi_digits-(total_pi_digits%window_size)
    lastcounter = (total_pi_digits//window_size)*(window_size/step_size)
    flags = [False for i in range(lastcounter)]
    first_line[0] = str(window_size)
    second_line = f.readline().split()
    offset = int(round(float(second_line[0].strip('\n'))))
    first_line = " ".join(first_line)
    f.close()

    with open(inputfilename, 'r') as f:
        header = f.readline()
        for counter, window in enumerate(sliding_window(window_size,step_size,last_window_start)):
            with open('picrop_{}.txt'.format(counter), 'w') as output:
                if (flags[counter] == False):
                    flags[counter] = True
                    headerline = float(linecache.getline(inputfilename, window[1]+1)) - offset
                    output.write(str(window_size) + " " + str("{0:.4f}".format(headerline)) + " " + 'l' + '\n')
                com = "sed -n \' " + str(window[0]) + "," + str(window[1]) + "p\' " + "sample.txt" + ">" + "output.txt"
                os.system(com)


picrop(1000,500)
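For the literal question of running sed -n from Python, one possible sketch is to pass the command as an argument list to subprocess instead of concatenating a shell string. This is only an illustration: the function names sed_range_args and sed_extract are hypothetical, and it assumes a sed executable is available on the PATH.

```python
import subprocess


def sed_range_args(first, last, path):
    """Build the argument list for: sed -n 'first,lastp' path (hypothetical helper)."""
    return ["sed", "-n", "{},{}p".format(first, last), path]


def sed_extract(first, last, src, dst):
    """Run sed and write lines first..last (1-based, inclusive) of src to dst."""
    with open(dst, "w") as out:
        # A list of arguments needs no shell quoting, and passing the open
        # file as stdout takes the place of the '>' redirect.
        subprocess.check_call(sed_range_args(first, last, src), stdout=out)
```

With these assumed helpers, sed_extract(10000, 20000, "sample.txt", "output.txt") would correspond to sed -n '10000,20000p' sample.txt > output.txt. Because no shell is involved, the quotes and the redirect do not have to be assembled by string concatenation.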

You can yield each line of the file:

def lines(filename):
    # Read lazily so that the whole file is never held in memory
    with open(filename) as f:
        for line in f:
            yield line

And you can slice the sequence using islice:

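from itertools import islice

with open('picrop.txt', 'w') as output:
    # islice is zero-based with an exclusive stop:
    # indices 9999..19999 are lines 10,000 to 20,000
    for line in islice(lines('sample.txt'), 9999, 20000):
        output.write(line)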
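One detail worth spelling out is that islice counts from zero and excludes its stop index, whereas sed addresses are 1-based and inclusive. A minimal sketch of a helper that takes sed-style line numbers might look like the following; the name extract_lines is hypothetical, and it assumes first is at least 1.

```python
from itertools import islice


def extract_lines(src, dst, first, last):
    """Copy lines first..last of src to dst (1-based, inclusive, like sed -n 'first,lastp')."""
    with open(src) as infile, open(dst, "w") as outfile:
        # Shift the start by one for zero-based indexing; the exclusive
        # stop index already equals the 1-based number of the last line.
        outfile.writelines(islice(infile, first - 1, last))
```

Under those assumptions, extract_lines('sample.txt', 'picrop.txt', 10000, 20000) would write the same lines as sed -n '10000,20000p'. Each line keeps its own newline character, and the file is read one line at a time.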
