python - How do I scrape the favorite_count of a tweet on someone's timeline?
If I run the code without the final line, getval(tweet['retweeted_status']['favorite_count']),
the scrape works. When I add that line, I get the error message KeyError: 'retweeted_status'.
Does anyone know what I'm doing wrong?
    q = "david_cameron"
    results = twitter_user_timeline(twitter_api, q)
    print len(results)

    # Show one sample search result by slicing the list...
    # print json.dumps(results[0], indent=1)

    csvfile = open(q + '_timeline.csv', 'w')
    csvwriter = csv.writer(csvfile)
    csvwriter.writerow(['created_at', 'user-screen_name', 'text',
                        'coordinates lng', 'coordinates lat', 'place',
                        'user-location', 'user-geo_enabled', 'user-lang',
                        'user-time_zone', 'user-statuses_count',
                        'user-followers_count', 'user-created_at'])

    for tweet in results:
        csvwriter.writerow([tweet['created_at'],
                            getval(tweet['user']['screen_name']),
                            getval(tweet['text']),
                            getlng(tweet['coordinates']),
                            getlat(tweet['coordinates']),
                            getplace(tweet['place']),
                            getval(tweet['user']['location']),
                            getval(tweet['user']['geo_enabled']),
                            getval(tweet['user']['lang']),
                            getval(tweet['user']['time_zone']),
                            getval(tweet['user']['statuses_count']),
                            getval(tweet['user']['followers_count']),
                            getval(tweet['user']['created_at']),
                            getval(tweet['retweeted_status']['favorite_count']),
                            ])

    print "done"
According to the API documentation at https://dev.twitter.com/overview/api/tweets, the retweeted_status attribute may or may not exist.
If it does not exist, you cannot access it. You can either make a safe lookup, using the in operator to check for existence first:
    retweeted_favourite_count = tweet['retweeted_status']['favorite_count'] if 'retweeted_status' in tweet else None
or assume it is there and handle the case when it is not:
    try:
        retweeted_favourite_count = tweet['retweeted_status']['favorite_count']
    except KeyError:
        retweeted_favourite_count = 0
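To see how the two approaches behave side by side, here is a minimal sketch in Python 3 syntax. The two sample tweets and the helper names count_with_in and count_with_try are made up for illustration; real timeline payloads carry many more fields.

```python
# Hypothetical, trimmed-down tweets; real API payloads have many more fields.
original_tweet = {"text": "An original tweet", "favorite_count": 3}
retweet = {
    "text": "RT @someone: hello",
    "favorite_count": 0,
    "retweeted_status": {"text": "hello", "favorite_count": 42},
}


def count_with_in(tweet):
    """Check that the key exists before indexing into it."""
    if "retweeted_status" in tweet:
        return tweet["retweeted_status"]["favorite_count"]
    return None


def count_with_try(tweet):
    """Index directly and fall back to 0 when a key is missing."""
    try:
        return tweet["retweeted_status"]["favorite_count"]
    except KeyError:
        return 0


for tweet in (original_tweet, retweet):
    print(count_with_in(tweet), count_with_try(tweet))
```

The only behavioural difference in this sketch is the fallback value: None in the first helper, 0 in the second. Which one suits the CSV depends on whether "not a retweet" should be distinguishable from "retweet with zero favorites".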
Then pass retweeted_favourite_count to the writerow call.
Also, the CSV header row is missing a column name for the retweeted favourite count.
Updated example:

    for tweet in results:
        # Note: the next statement is one long line, not two rows.
        retweeted_favourite_count = tweet['retweeted_status']['favorite_count'] if 'retweeted_status' in tweet else None
        csvwriter.writerow([tweet['created_at'],
                            getval(tweet['user']['screen_name']),
                            getval(tweet['text']),
                            getlng(tweet['coordinates']),
                            getlat(tweet['coordinates']),
                            getplace(tweet['place']),
                            getval(tweet['user']['location']),
                            getval(tweet['user']['geo_enabled']),
                            getval(tweet['user']['lang']),
                            getval(tweet['user']['time_zone']),
                            getval(tweet['user']['statuses_count']),
                            getval(tweet['user']['followers_count']),
                            getval(tweet['user']['created_at']),
                            # Insert the precomputed value here instead:
                            getval(retweeted_favourite_count),
                            ])
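The header fix and the extra column can be tried out without touching the Twitter API. The sketch below is Python 3 and rests on assumptions: the sample tweets are invented, only three columns are kept for brevity, the column name 'retweeted_status-favorite_count' is just a suggestion that follows the existing 'user-screen_name' naming, and an in-memory buffer stands in for the real CSV file.

```python
import csv
import io

# Hypothetical sample data standing in for the timeline results.
results = [
    {"created_at": "Mon Jan 05 10:00:00 +0000 2015", "text": "An original tweet"},
    {
        "created_at": "Tue Jan 06 11:30:00 +0000 2015",
        "text": "RT @someone: hello",
        "retweeted_status": {"favorite_count": 42},
    },
]

buffer = io.StringIO()  # in-memory stand-in for the real CSV file
csvwriter = csv.writer(buffer)

# The header gains one column name so it matches the extra value per row.
csvwriter.writerow(["created_at", "text", "retweeted_status-favorite_count"])

for tweet in results:
    if "retweeted_status" in tweet:
        retweeted_favourite_count = tweet["retweeted_status"]["favorite_count"]
    else:
        retweeted_favourite_count = ""  # leave the cell empty for non-retweets
    csvwriter.writerow(
        [tweet["created_at"], tweet["text"], retweeted_favourite_count]
    )

csv_text = buffer.getvalue()
print(csv_text)
```

Writing an empty string for non-retweets is a choice made for this sketch; the getval helper from the question may already handle None in its own way.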
You could also replace the line:
    getval(tweet['retweeted_status']['favorite_count'])
with what Padriac Cunningham suggested:
    getval(tweet.get('retweeted_status', {}).get('favorite_count', None))
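The chained lookup can be checked in isolation. This is a small Python 3 sketch with made-up input dictionaries. The helper names are hypothetical, and the second variant is an addition not found in the answer above: it also covers a payload where the key is present but set to None.

```python
def retweeted_favorite_count(tweet):
    """Return the retweeted tweet's favorite_count, or None if absent."""
    # The empty-dict default lets the second .get() run even when
    # 'retweeted_status' is missing from the tweet.
    return tweet.get("retweeted_status", {}).get("favorite_count")


def retweeted_favorite_count_null_safe(tweet):
    """Same lookup, but also tolerates 'retweeted_status': None."""
    # 'or {}' replaces a None value with an empty dict before the second .get().
    return (tweet.get("retweeted_status") or {}).get("favorite_count")


plain = {"text": "An original tweet"}
retweet = {"retweeted_status": {"favorite_count": 7}}
null_status = {"retweeted_status": None}

print(retweeted_favorite_count(plain))
print(retweeted_favorite_count(retweet))
print(retweeted_favorite_count_null_safe(null_status))
```

The default passed to the first .get() only applies when the key is missing entirely. If the key exists with a None value, the plain chained version raises AttributeError, which is the case the second helper guards against.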