python - Extract timestamp from large string
I have a large string like this:
send ok http/1.1 200 ok access-control-allow- l-allow-methods: get,post,delete access-control-allow-headers: x-requested-with, phant-private-key content-type: text/plain x-rate-limit-limit: 300 x-rate-limit-remaining: 297 x-rate-limit-reset: 1452931335.777 date: sat, 16 jan 2016 07:50:17 gmt set-cookie: serverid=; expires=thu, 01-jan-197 0 00:00:01 gmt; path=/ cache-control: private transfer-encoding: chunked
It contains strings like sat, 16 jan 2016 07:50:17 gmt
The string may contain any time. I want to get that string out of the whole string. I know this is a basic question, but how can I do this in Python?
Note that not every string will contain the substring date:
Use:
import re

# Raw string so the regex backslashes are not treated as string escapes.
datepattern = re.compile(r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}")
matcher = datepattern.search(string_to_match_against)
print(matcher.group(0))
With the example:
string_to_match_against = """ send ok http/1.1 200 ok access-control-allow- l-allow-methods: get,post,delete access-control-allow-headers: x-requested-with, phant-private-key content-type: text/plain x-rate-limit-limit: 300 x-rate-limit-remaining: 297 x-rate-limit-reset: 1452931335.777 date: sat, 16 jan 2016 07:50:17 gmt set-cookie: serverid=; expires=thu, 01-jan-197 0 00:00:01 gmt; path=/ cache-control: private transfer-encoding: chunked """
we print:
sat, 16 jan 2016 07:50:17 gmt
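One caveat with the snippet above: search returns None when nothing matches, so calling group(0) on the result would raise an AttributeError. A minimal sketch of a guarded version follows; the helper name extract_date and the two sample strings are made up for illustration and are not part of the original answer.

```python
import re

# Same pattern as in the answer, written as a raw string.
DATE_PATTERN = re.compile(r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}")


def extract_date(text):
    """Return the first matching date substring in text, or None if there is none.

    extract_date is a hypothetical helper name used only for this sketch.
    """
    match = DATE_PATTERN.search(text)
    return match.group(0) if match else None


# Illustrative inputs: one with a date, one without.
with_date = "x-rate-limit-reset: 1452931335.777 date: sat, 16 jan 2016 07:50:17 gmt set-cookie: serverid=;"
without_date = "http/1.1 200 ok content-type: text/plain"

print(extract_date(with_date))
print(extract_date(without_date))
```

Returning None leaves it to the caller to decide what a missing date means, which seems reasonable given that not every input is guaranteed to carry one.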
It looks like you are trying to match an HTTP header and, according to "HTTP Pocket Reference" (O'Reilly, 2000), three date formats are possible in the Date header:
- RFC 1123 (Mon, 06 May 1996 04:57:00 GMT), the one in your example
- RFC 1036 (Monday, 06-May-96 04:57:00 GMT)
- ANSI C asctime() (Mon May 6 04:57:00 1996)
RFC 1123 is the recommended format, but if you wish to match any of the three possibilities, you need to design a regex that can select between the three using alternation:
import re

# One pattern per date format; raw strings keep the regex backslashes intact.
pat1123 = r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}"
pat1036 = r"\w+?, \d{2}-\w{3}-\d{2} \d{2}:\d{2}:\d{2} \w{3}"
patc = r"\w{3} \w{3} \d+? \d{2}:\d{2}:\d{2} \d{4}"

# Combine the three alternatives with non-capturing groups.
datepattern = re.compile("(?:%s)|(?:%s)|(?:%s)" % (pat1123, pat1036, patc))
matcher = datepattern.search(string_to_match_against)
print(matcher.group(0))
Note that this approach does not rely on anything being present except the date to extract (we don't need the date: text). If more than one such date occurs, it finds the first. Use datepattern.findall if more than one is desired.
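To illustrate the findall variant, here is a small sketch that reuses the three patterns from the answer. The sample text and its header names are invented for the example; it simply places one date of each format in a single string. Because the combined pattern uses only non-capturing groups, findall returns the full matched substrings.

```python
import re

# The three patterns from the answer, as raw strings.
PAT_1123 = r"\w{3}, \d{2} \w{3} \d{4} \d{2}:\d{2}:\d{2} \w{3}"
PAT_1036 = r"\w+?, \d{2}-\w{3}-\d{2} \d{2}:\d{2}:\d{2} \w{3}"
PAT_C = r"\w{3} \w{3} \d+? \d{2}:\d{2}:\d{2} \d{4}"

date_pattern = re.compile("(?:%s)|(?:%s)|(?:%s)" % (PAT_1123, PAT_1036, PAT_C))

# Hypothetical sample text with one date in each of the three formats.
sample = (
    "date: sat, 16 jan 2016 07:50:17 gmt "
    "expires: monday, 06-may-96 04:57:00 gmt "
    "last-modified: mon may 6 04:57:00 1996"
)

# findall returns every non-overlapping match, in order of appearance.
all_dates = date_pattern.findall(sample)
for found in all_dates:
    print(found)
```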