python 2.7 - Why does Beautiful Soup return a filename instead of the full link?
Using the simple code below, I'm facing the following problem: why does Beautiful Soup return file names rather than full link addresses?
from bs4 import BeautifulSoup
import urllib2

url = 'http://www.gks.ru/bgd/free/b00_25/isswww.exe/stg/d000/i000650r.htm'
data = urllib2.urlopen(url).read()
page = BeautifulSoup(data, 'lxml')

# Print the href attribute of every anchor tag on the page
for link in page.find_all('a'):
    l = link.get('href')
    print l
All I'm getting as output is:
i000660r.htm
i000670r.htm
i000680r.htm
i000690r.htm
i000700r.htm
i000706r.htm
i000707r.htm
i000708r.htm
i000709r.htm
000710.htm
000711.htm
000712.htm
000713.htm
000714.htm
000715.htm
Problem solved: the links on the page are relative, so I concatenated each one with the root of the URL. Thanks.
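For anyone landing here later, a minimal sketch of that kind of fix. It is not the exact code I ended up with: instead of concatenating strings by hand it uses urljoin from the standard library, which resolves a relative href against the URL of the page it came from. The helper name absolutize and the short hrefs list are only for illustration; in the real script the hrefs would come from link.get('href') in the loop above. The import is written so that it runs on both Python 2.7 and Python 3.

```python
try:
    from urlparse import urljoin  # Python 2.7
except ImportError:
    from urllib.parse import urljoin  # Python 3

# Page URL from the question; relative hrefs are resolved against it.
base_url = 'http://www.gks.ru/bgd/free/b00_25/isswww.exe/stg/d000/i000650r.htm'

# A few of the relative hrefs from the output above (illustrative subset).
hrefs = ['i000660r.htm', 'i000670r.htm', '000710.htm']


def absolutize(base, links):
    """Resolve each relative link against the URL of the page it came from.

    Hypothetical helper, named here only for the example.
    """
    return [urljoin(base, href) for href in links]


full_links = absolutize(base_url, hrefs)
for full in full_links:
    print(full)
```

Compared with plain concatenation, urljoin should also cope with hrefs that are already absolute or that start with a slash, so it is probably the safer choice if the page mixes link styles.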