Quote:
Originally Posted by dgvirtual
I was wondering if someone could help me make a recipe for a news source http://www.lrytas.lt. The website divides longer articles into pages, but you can access the whole article via print version. However, I do not know how to produce the print version.
....
|
Try this,
it is not optimized, but seems to work.
Regards, Mauro
Code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
__license__ = 'GPL v3'
__author__ = "mauropiccolo"
import re
class AdvancedUserRecipe1382294260(BasicNewsRecipe):
title = u'http://www.lrytas.lt/'
oldest_article = 7
max_articles_per_feed = 100
auto_cleanup = True
feeds = [(u'Energetika',u'http://www.lrytas.lt/rss/?tema=47')]
def print_version(self, url):
soup = self.index_to_soup(url)
a = soup.find("a", attrs={"href":re.compile('^/print\.asp')})
if a:
url = 'http://www.lrytas.lt'+a["href"]
return url