05-02-2008, 03:26 AM   #75
Necator
Device: PRS-505
Hi, although I am a newbie, I have jumped into Python in order to read my local newspaper. As expected, I need some advice.

1. I failed to point libprs500 at the print_version URL, so the content still comes from the article URL.

Article URL: http://www.radikal.com.tr/haber.php?haberno=253962
Print version URL: http://www.radikal.com.tr/yazici.php?haberno=253962

I tried this, which failed:
    def print_version(self, url):
        return url.replace('http://www.radikal.com.tr/haber.php?haberno=', 'http://www.radikal.com.tr/yazici.php?haberno=')
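
The replacement itself can be checked in a plain Python session, outside libprs500. The sketch below is only an illustration: the names `ARTICLE_PREFIX`, `PRINT_PREFIX` and `to_print_url` are made up for this example, and the second URL is a hypothetical feed link, not one taken from the Radikal feed. It shows that `str.replace` leaves a URL untouched, without raising any error, when the prefix does not match exactly, which would be one possible reason for still getting the article page.

```python
# Prefixes copied from the two example URLs in this post.
ARTICLE_PREFIX = 'http://www.radikal.com.tr/haber.php?haberno='
PRINT_PREFIX = 'http://www.radikal.com.tr/yazici.php?haberno='


def to_print_url(url):
    """Return the print-version URL; unmatched URLs come back unchanged."""
    return url.replace(ARTICLE_PREFIX, PRINT_PREFIX)


# The article URL from this post is rewritten as intended.
article = 'http://www.radikal.com.tr/haber.php?haberno=253962'
print(to_print_url(article))

# Hypothetical feed link without the "www." host: the prefix does not
# match, so the URL is returned exactly as it was passed in.
feed_link = 'http://radikal.com.tr/haber.php?haberno=253962'
print(to_print_url(feed_link) == feed_link)
```

If the first call gives the yazici.php address, the string handling is fine, and the next things to compare are the exact links in the feed and the indentation of the method inside the recipe class.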

2. So I take the feed content from the article page instead. To get the main news body out of the HTML I removed the tables, but now I cannot cut the news body out from the rest of the page. I copied the recipe from the manual (The New York Times), which again ended in failure:
    html_description = True
    html2lrf_options = ['--ignore-tables']
    remove_tags_before = dict(name='img', attrs='src')
    remove_tags_after = dict(id='footer')
    remove_tags = [dict(attrs={'class': ['articleTools', 'post-tools', 'side_tool']}),
                   dict(id=['footer', 'table', 'navigation', 'archive', 'side_search', 'blog_sidebar', 'side_tool', 'side_index']),
                   dict(name=['script', 'noscript'])]

What am I doing wrong? Please point me in the right direction. Thanks in advance.