03-03-2008, 09:24 PM   #6
moz
OK, trying to write a profile, but really struggling. I get one of two things: a blank document, or the script hangs.

from libprs500.ebooks.lrf.web.profiles import DefaultProfile
import re

class SMH(DefaultProfile):

    title = 'SMH'
    max_recursions = 2
    oldest_article = 1
    no_stylesheets = True

    preprocess_regexps = \
        [ (re.compile(i[0], re.IGNORECASE | re.DOTALL), i[1]) for i in
            [
                # Remove links to homepage
                (r'<P>[ <a href="/">SMH</a> ]</P>', lambda match : ''),
                # and business pages
                (r'<p><a href="http://business.smh.com.au.*', lambda match : ''),
            ]
        ]

    def get_feeds(self):
        return [ ('SMH', 'http://smh.com.au/text') ]
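
A possible explanation for the blank document, as a standalone sketch that uses only Python's re module and not libprs500. The SAMPLE markup is invented for illustration, and the "adjusted" patterns are only one guess at the intent; they assume the business link sits inside a single <p>...</p>, which may not hold for the real page. Two things stand out in the posted patterns: the unescaped [ ... ] is read as a character class, not as literal brackets, and .* combined with re.DOTALL runs to the end of the document. The sketch does not explain the hang.

```python
import re

FLAGS = re.IGNORECASE | re.DOTALL

# Hypothetical page fragment, invented for illustration only.
SAMPLE = (
    '<P>[ <a href="/">SMH</a> ]</P>\n'
    '<p><a href="http://business.smh.com.au/">Business</a></p>\n'
    '<p>Article text that should survive.</p>\n'
)

# The two patterns exactly as posted.
posted = [
    r'<P>[ <a href="/">SMH</a> ]</P>',
    r'<p><a href="http://business.smh.com.au.*',
]

# One possible adjustment (an assumption, not a confirmed fix):
# escape the literal brackets and stop at the first closing </p>.
adjusted = [
    r'<P>\[ <a href="/">SMH</a> \]</P>',
    r'<p><a href="http://business\.smh\.com\.au.*?</p>',
]

def strip_all(patterns, html):
    """Apply each pattern in turn, deleting whatever it matches."""
    for pattern in patterns:
        html = re.compile(pattern, FLAGS).sub('', html)
    return html

posted_result = strip_all(posted, SAMPLE)
adjusted_result = strip_all(adjusted, SAMPLE)

print(repr(posted_result))
print(repr(adjusted_result))
```

On this sample, the posted patterns leave the homepage link in place and delete everything from the business link to the end of the text, while the adjusted ones remove only the two link paragraphs.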