06-07-2020, 04:33 AM   #5
hiperlink
Hi Kovid,

I just experimented a bit with mechanize, and it happily fetched the proper page this way:

Code:
import mechanize

br = mechanize.Browser()
# A subpage of the main domain, with fewer articles than the front page
br.open("https://www.es.hu/rovat/kritika")
# "A H" matches part of the title of the article whose URL contains "//"
resp = br.follow_link(text_regex=r"A H", nr=0)
resp.geturl()
# 'https://www.es.hu/cikk/2020-06-05//a-het-konyvei.html'
So mechanize follows the URL, double slash included, without any trouble. Where does it get normalized, then?
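
To make clear what kind of normalization I mean, here is a minimal sketch. It assumes the double slash gets collapsed by something like posixpath.normpath applied to the path part of the URL; that is only a guess on my side, not something I have located in the calibre source. The helper name collapse_slashes is made up for illustration.

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit

url = "https://www.es.hu/cikk/2020-06-05//a-het-konyvei.html"


def collapse_slashes(u):
    # Hypothetical helper: normalize only the path component of the URL.
    parts = urlsplit(u)
    return urlunsplit(parts._replace(path=posixpath.normpath(parts.path)))


# Splitting and re-joining the URL leaves the double slash alone ...
roundtrip = urlunsplit(urlsplit(url))
# ... while normalizing the path collapses it into a single slash.
normalized = collapse_slashes(url)

print(roundtrip)
print(normalized)
```

If something along these lines runs on the article URL before the download, the server is asked for the single-slash path, which is not the URL that mechanize fetched above.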