Python - Extract timestamp from a large string


I have a large string like this:

send ok http/1.1 200 ok  access-control-allow- l-allow-methods: get,post,delete access-control-allow-headers: x-requested-with, phant-private-key content-type: text/plain x-rate-limit-limit: 300 x-rate-limit-remaining: 297 x-rate-limit-reset: 1452931335.777 date: sat, 16 jan 2016 07:50:17 gmt  set-cookie: serverid=; expires=thu, 01-jan-197 0 00:00:01 gmt; path=/ cache-control: private transfer-encoding: chunked 

It contains a timestamp like sat, 16 jan 2016 07:50:17 gmt. This timestamp may be of any time. I want to get that substring out of the whole string. I know this is a basic question, but how can I do it in Python?

Note: the string will not always contain a substring like date:.

Use

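import re
# RFC 1123-style date, e.g. "sat, 16 jan 2016 07:50:17 gmt"; raw string keeps the backslashes literal
datepattern = re.compile(r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}")
matcher = datepattern.search(string_to_match_against)
print(matcher.group(0))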

With the example

string_to_match_against = """ send ok http/1.1 200 ok  access-control-allow- l-allow-methods: get,post,delete access-control-allow-headers: x-requested-with, phant-private-key content-type: text/plain x-rate-limit-limit: 300 x-rate-limit-remaining: 297 x-rate-limit-reset: 1452931335.777 date: sat, 16 jan 2016 07:50:17 gmt  set-cookie: serverid=; expires=thu, 01-jan-197 0 00:00:01 gmt; path=/ cache-control: private transfer-encoding: chunked """ 

this prints

sat, 16 jan 2016 07:50:17 gmt 
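Note that search returns None when nothing matches, so calling group(0) on the result would raise an AttributeError for input without a date. A minimal sketch that guards against this, assuming the same pattern as above; extract_date is a hypothetical helper name, not part of the original answer:

```python
import re

# Same RFC 1123-style pattern as in the answer, written as a raw string.
DATE_PATTERN = re.compile(r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}")


def extract_date(text):
    """Return the first date-like substring in text, or None if there is none."""
    match = DATE_PATTERN.search(text)
    return match.group(0) if match else None


# Shortened version of the header string from the question.
print(extract_date("x-rate-limit-reset: 1452931335.777 date: sat, 16 jan 2016 07:50:17 gmt set-cookie: serverid="))
print(extract_date("no timestamp here"))
```

The caller can then test the result against None instead of wrapping the lookup in a try/except.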

It looks like you are trying to match an HTTP header, and (according to "HTTP Pocket Reference", O'Reilly, 2000) three date formats are possible in the Date header:

  1. RFC 1123 (Mon, 06 May 1996 04:57:00 GMT) - the one in your example
  2. RFC 1036 (Monday, 06-May-96 04:57:00 GMT)
  3. ANSI C asctime() (Mon May  6 04:57:00 1996)

RFC 1123 is the recommended format, but if you wish to match any of the three possibilities, you need to design a regex that selects between the three using alternation:

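import re
# Raw strings keep the regex backslashes literal
pat1123 = r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}"
pat1036 = r"\w+?, \d{2}-\w{3}-\d{2} \d{2}:\d{2}:\d{2} \w{3}"
# asctime() pads a single-digit day with an extra space, so allow one or more spaces before it
patc = r"\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}"
# Non-capturing groups, tried left to right at each position
datepattern = re.compile("(?:%s)|(?:%s)|(?:%s)" % (pat1123, pat1036, patc))
matcher = datepattern.search(string_to_match_against)
print(matcher.group(0))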

Note that this approach does not rely on anything being present except the date you want to extract (we don't need the date: text). If more than one such date occurs, it finds the first. Use datepattern.findall if more than one is desired.
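As a rough illustration of the findall variant: because the combined pattern uses only non-capturing groups, findall returns each full match as a string. The sample input below is made up for the sketch, and the asctime pattern here tolerates the extra space that asctime() puts before a single-digit day:

```python
import re

# The three date patterns, written as raw strings.
pat1123 = r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}"
pat1036 = r"\w+?, \d{2}-\w{3}-\d{2} \d{2}:\d{2}:\d{2} \w{3}"
# Allow one or more spaces before the day, since asctime() pads single digits.
patc = r"\w{3} \w{3} +\d{1,2} \d{2}:\d{2}:\d{2} \d{4}"
datepattern = re.compile("(?:%s)|(?:%s)|(?:%s)" % (pat1123, pat1036, patc))

# Hypothetical input containing one date in each of the three formats.
sample = (
    "Sat, 16 Jan 2016 07:50:17 GMT; "
    "Thursday, 01-Jan-70 00:00:01 GMT; "
    "Mon May  6 04:57:00 1996"
)

# findall returns every non-overlapping match, in order of appearance.
dates = datepattern.findall(sample)
for date in dates:
    print(date)
```

If the patterns contained capturing groups, findall would return the group contents instead of the whole match, which is why the alternation is wrapped in (?:...).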

