MobileRead Forums - View Single Post - Print friendly url unrelated to regular url (and javascript)

Barty · 12-01-2011, 02:17 PM

I don't know if there's a better way to do this, but this seems to work:

Code:

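    def print_version(self, url):
        # Download and parse the regular article page
        soup = self.index_to_soup(url)
        # The print link looks like javascript:printPage(<article id>)
        regex = re.compile(r'javascript:printPage\((\d+?)\)', re.I)
        atag = soup.find('a', attrs={'href': regex})
        if atag is not None:
            m = regex.search(atag['href'])
            if m:
                # Build the print-friendly URL from the captured article id
                url = 'http://www.christianitytoday.com/ct/article_print.html?id=' + m.group(1)
        # Falls back to the original url if no print link was found
        return url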

This loads the original article page and finds the article ID by parsing the page for the print link. If no such link is found, the original URL is returned unchanged.

Note: add

Code:

import re

to the start of your recipe
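
If you want to check the ID extraction on its own without running a whole recipe, here is a minimal sketch of just that step. It reuses the same regex as print_version; the helper name print_url_from_href, the constants, and the article id 12345 are made up for illustration and are not part of calibre or the site.

```python
import re

# Same pattern as in print_version: captures the digits inside printPage(...)
PRINT_LINK = re.compile(r'javascript:printPage\((\d+?)\)', re.I)
PRINT_BASE = 'http://www.christianitytoday.com/ct/article_print.html?id='


def print_url_from_href(href, fallback):
    """Hypothetical helper: return the print URL for a printPage(...) href,
    or the fallback URL if the href does not match."""
    m = PRINT_LINK.search(href)
    if m:
        return PRINT_BASE + m.group(1)
    return fallback


# 12345 is a made-up article id, used only for illustration
print(print_url_from_href('javascript:printPage(12345)', 'http://example.com/article'))
```

Any href that doesn't match the pattern just gives back the fallback, which mirrors how print_version returns the original url when it can't find the print link.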
