Old 03-10-2008, 09:02 PM   #61
Deputy-Dawg
Groupie
Posts: 153
Karma: 799
Join Date: Dec 2007
Device: sony prs505
Kovid,
In the attached .zip file is the user profile for one of my local newspapers. It used to work. Now all it gets is the TOC, with no articles. What is strange is that the print file addresses are still the same, and the error messages when I run it in a terminal do not contain anything that resembles the URL of the print files. I have enclosed a copy of one such run.

My question is: has the newspaper changed something, or has something changed in libprs500?
Attached Files
File Type: zip For Kovid.zip (2.4 KB, 591 views)
Old 03-10-2008, 09:09 PM   #62
kovidgoyal
creator of calibre
Posts: 45,146
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You need to fix the print_version function; the way the feed links to articles seems to have changed.
Old 03-10-2008, 09:47 PM   #63
Deputy-Dawg
Groupie
Posts: 153
Karma: 799
Join Date: Dec 2007
Device: sony prs505
That's what I thought had happened, but the link to the print version of

http://www.nwaonline.net/articles/20...datefiling.txt

is

http://www.nwaonline.net/articles/20...datefiling.prt

which is what I would expect the function as written to return. The only difference I can see, if it is a difference at all (I am a bit hazy on how it behaved before), is that the print version opens in a new window. I don't think that's an issue, inasmuch as I have seen others where the print version opened in a new window. Darned if I can put my hands on it, though.
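
For reference, the extension swap described in this post can be sketched as a plain function. This is only a minimal sketch, not the profile from the attached zip: the name to_print_url and the sample URL are hypothetical, and it assumes the article URL ends in .txt and that the print page lives at the same path with a .prt extension.

```python
def to_print_url(url):
    """Return the print-version URL for an article URL ending in .txt.

    Hypothetical helper: assumes the print page is served at the same
    path with a .prt extension, as in the two links quoted above.
    """
    if url.endswith('.txt'):
        return url[:-len('.txt')] + '.prt'
    # Anything else is returned unchanged so it stands out in the log.
    return url


# Made-up URL in the same shape as the ones quoted in the post.
print(to_print_url('http://example.com/articles/2008/03/10/news/story.txt'))
```

If the URL that reaches the function does not end in .txt at all, it passes through untouched, which is one way a working function can appear to have stopped working.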
Old 03-10-2008, 09:56 PM   #64
kovidgoyal
creator of calibre
Posts: 45,146
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The format of the feed itself has changed. Use

Code:
url_search_order = ['link', 'guid']
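
To illustrate what a search order like this expresses, the sketch below picks the article URL from the first listed field that is present in a feed item. It is an illustration only, not libprs500's own code: pick_article_url and the sample item are hypothetical, and the feed item is modelled as a plain dict.

```python
def pick_article_url(item, search_order):
    """Return the first non-empty value among the fields in search_order."""
    for field in search_order:
        value = item.get(field)
        if value:
            return value.strip()
    return None


# Hypothetical feed item: the usable article address is in "link",
# while "guid" holds an identifier that is not a fetchable URL.
item = {
    'guid': 'story-12345',
    'link': 'http://example.com/articles/story.txt',
}

print(pick_article_url(item, ['link', 'guid']))  # tries link first
print(pick_article_url(item, ['guid', 'link']))  # tries guid first
```

With the fields in the wrong order for a given feed, the value handed on to print_version is not the article address, so every later fetch fails even though print_version itself is unchanged.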
Old 03-10-2008, 10:31 PM   #65
Deputy-Dawg
Groupie
Posts: 153
Karma: 799
Join Date: Dec 2007
Device: sony prs505
Thanks again! That fixed it. But... what sort of landmarks should I have been looking for in the source file if a similar problem occurs again? I guess what I am asking for is a more generalized solution.
Old 03-10-2008, 10:40 PM   #66
kovidgoyal
creator of calibre
Posts: 45,146
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Well, the log has a bunch of error messages about not being able to fetch .prt URLs. That's your clue: it means either that the print_version function no longer works or that the feed format has changed, causing the URL being fed to print_version to be wrong. You can check that by sticking a
Code:
print url
into print_version
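
A version-neutral way to do the same check is to write the incoming URL to stderr before transforming it, so it shows up next to the fetch errors in the log. This is a sketch under assumptions: DebugProfile is a stand-in class rather than a real libprs500 profile, and the .txt to .prt swap is taken from the earlier posts.

```python
import sys


class DebugProfile(object):
    """Stand-in for a user profile; only print_version is modelled."""

    def print_version(self, url):
        # Show exactly what the feed handed over before transforming it.
        sys.stderr.write('print_version got: %s\n' % url)
        return url.replace('.txt', '.prt')


profile = DebugProfile()
profile.print_version('http://example.com/articles/story.txt')
```

If the logged value is not an article address, the problem lies in how the URL is extracted from the feed rather than in print_version.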
Old 03-10-2008, 11:21 PM   #67
Deputy-Dawg
Groupie
Posts: 153
Karma: 799
Join Date: Dec 2007
Device: sony prs505
Great minds in the same gutter, well almost. What I did was to put

Code:
return url
in and checked the error log. A little sloppier, but it works. But by the time I came back to report that I had determined what was going on, you had posted the fix. I suppose I should spend a bit of time on an in-depth review of DefaultProfile to see just what other goodies are there. Again, thanks!

Last edited by Deputy-Dawg; 03-10-2008 at 11:26 PM.
Old 03-10-2008, 11:56 PM   #68
kovidgoyal
creator of calibre
Posts: 45,146
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You should probably hold off for a bit. I'm in the process of re-writing web2lrf to make it much more powerful.
Old 03-11-2008, 09:00 AM   #69
balok
Ugly alien
Posts: 144
Karma: 225
Join Date: Sep 2007
Location: Québec, QC
Device: tricorder
Quote:
Originally Posted by kovidgoyal
I'm in the process of re-writing web2lrf to make it much more powerful.
What kind of changes, or new features, should we expect? Will it handle current custom profiles, or will they need to be rewritten?
Old 03-11-2008, 11:34 AM   #70
kovidgoyal
creator of calibre
Posts: 45,146
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
It will handle current profiles, but in any case the old web2lrf code will remain for a long time, so no need to worry.

It will be multithreaded, handle many different feed formats, and have a much more powerful and easy-to-use preprocessing engine, so you don't have to use regexps unless you want to. Eventually, it should be smart enough that if you give it just the URL to a feed, it will go and fetch a reasonably sanitized version of the articles.

EDIT: Oh and I forgot that it will have links at the end of each article back to the table of contents

Last edited by kovidgoyal; 03-11-2008 at 11:40 AM.
Old 03-12-2008, 08:17 AM   #71
balok
Ugly alien
Posts: 144
Karma: 225
Join Date: Sep 2007
Location: Québec, QC
Device: tricorder
Quote:
Originally Posted by kovidgoyal
It will handle current profiles, but in any case the old web2lrf code will remain for a long time, so no need to worry.

It will be multithreaded, handle many different feed formats, and have a much more powerful and easy-to-use preprocessing engine, so you don't have to use regexps unless you want to. Eventually, it should be smart enough that if you give it just the URL to a feed, it will go and fetch a reasonably sanitized version of the articles.

EDIT: Oh and I forgot that it will have links at the end of each article back to the table of contents
All of that sounds really cool. A link to the table of contents, in particular, seems like a no-brainer, but I never thought of it. It would be nice if the link brought you to the contents of the current RSS feed (and not the first-level table of contents). That way, if you're reading, say, international news, you can stay in that section.
Old 03-12-2008, 12:30 PM   #72
kovidgoyal
creator of calibre
Posts: 45,146
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Quote:
Originally Posted by balok
All of that sounds really cool. A link to the table of contents, in particular, seems like a no-brainer, but I never thought of it. It would be nice if the link brought you to the contents of the current RSS feed (and not the first-level table of contents). That way, if you're reading, say, international news, you can stay in that section.
There are "up one level", "up two levels", "next" and "previous" links.
Old 03-19-2008, 02:08 PM   #73
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Originally Posted by balok
Deputy-Dawg, are you really 74? I've never met a person over 50 who can handle a computer beyond pointing and clicking with difficulty. You must have been a professor or an engineer during your working career.
You need to get out more.

dale
Old 05-02-2008, 03:06 AM   #74
Necator
Junior Member
Posts: 5
Karma: 10
Join Date: Apr 2008
Device: PRS-505
Hi, I am having some difficulty with
1. making libprs500 see the printable version URL correctly
2. removing the tables.
I would appreciate it if you could guide me.

1.
Article URL : http://www.radikal.com.tr/haber.php?haberno=XXXXX
Printable URL: http://www.radikal.com.tr/yazici.php?haberno=XXXXX

I tried using this:
Code:
def print_version(self, url):
    return url.replace('http://www.radikal.com.tr/haber.php?haberno=', 'http://www.radikal.com.tr/yazici.php?haberno=')

However, it still downloads content from the article URL.

2. The article page has three rows of tables and I want the one in the middle.
Here is an example article: http://www.radikal.com.tr/haber.php?haberno=253962

I copied some lines from The New York Times and added --ignore-tables; unfortunately, it did no good:
Code:
html_description = True
html2lrf_options = ['--ignore-tables']
remove_tags_before = dict(name='img', attrs='src')
remove_tags_after = dict(id='footer')
remove_tags = [dict(attrs={'class': ['articleTools', 'post-tools', 'side_tool']}),
               dict(id=['footer', 'table', 'navigation', 'archive', 'side_search', 'blog_sidebar', 'side_tool', 'side_index']),
               dict(name=['script', 'noscript'])]

What am I doing wrong? Thanks.
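
One thing that may be worth checking in the settings above is the difference between filtering on an id and filtering on a tag name. The sketch below is a toy model, not libprs500 or BeautifulSoup code: spec_matches is hypothetical, and it assumes the dicts are interpreted like BeautifulSoup-style keyword filters, where name is compared with the tag name and any other key is compared with the attribute of that name.

```python
def spec_matches(tag_name, tag_attrs, spec):
    """Return True if a tag satisfies every condition in spec.

    Assumed semantics: 'name' is compared with the tag name, 'attrs' is a
    dict of attribute conditions, and any other key is compared with the
    attribute of that name. A list value means "any of these".
    """
    def ok(actual, wanted):
        if isinstance(wanted, (list, tuple)):
            return actual in wanted
        return actual == wanted

    for key, wanted in spec.items():
        if key == 'name':
            if not ok(tag_name, wanted):
                return False
        elif key == 'attrs':
            for attr, value in wanted.items():
                if not ok(tag_attrs.get(attr), value):
                    return False
        elif not ok(tag_attrs.get(key), wanted):
            return False
    return True


# A layout <table> that carries no id attribute (hypothetical markup).
name, attrs = 'table', {'width': '100%'}

print(spec_matches(name, attrs, dict(id=['footer', 'table'])))  # False
print(spec_matches(name, attrs, dict(name='table')))            # True
```

Under these assumed semantics, listing 'table' among the id values only matches an element whose id attribute is "table"; it does not match table tags as such.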
Old 05-02-2008, 03:26 AM   #75
Necator
Junior Member
Posts: 5
Karma: 10
Join Date: Apr 2008
Device: PRS-505
Hi, although I am a newbie, I have jumped into Python in order to read my local newspaper. And, as expected, I need some advice.

1. I failed to get libprs500 to use the print_version URL, so the content comes from the article URL.

Article URL: http://www.radikal.com.tr/haber.php?haberno=253962
Print_version URL: http://www.radikal.com.tr/yazici.php?haberno=253962

I tried this, which failed:
Code:
def print_version(self, url):
    return url.replace('http://www.radikal.com.tr/haber.php?haberno=', 'http://www.radikal.com.tr/yazici.php?haberno=')

2. So I get the feed from the article URL. To get the main news body out of the HTML, I removed the tables, but now I cannot separate the news body from the rest of the page. I copied the recipe from the manual (The New York Times), which again ended in failure:
Code:
html_description = True
html2lrf_options = ['--ignore-tables']
remove_tags_before = dict(name='img', attrs='src')
remove_tags_after = dict(id='footer')
remove_tags = [dict(attrs={'class': ['articleTools', 'post-tools', 'side_tool']}),
               dict(id=['footer', 'table', 'navigation', 'archive', 'side_search', 'blog_sidebar', 'side_tool', 'side_index']),
               dict(name=['script', 'noscript'])]

What am I doing wrong? Please guide me. Thanks anyway.
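
One way to narrow down the first problem is to run the replacement on its own, outside the profile. The sketch below uses the two prefixes from the post; the second test URL is a hypothetical variant, included only to show what happens when the incoming URL does not start with the expected prefix.

```python
ARTICLE_PREFIX = 'http://www.radikal.com.tr/haber.php?haberno='
PRINT_PREFIX = 'http://www.radikal.com.tr/yazici.php?haberno='


def print_version(url):
    """Map an article URL to its printable counterpart by prefix replacement."""
    return url.replace(ARTICLE_PREFIX, PRINT_PREFIX)


# The article URL quoted in the post is rewritten as intended.
print(print_version('http://www.radikal.com.tr/haber.php?haberno=253962'))

# Hypothetical variant: if the feed supplies the URL in a slightly different
# form (here without "www."), the prefix never matches and the URL comes
# back unchanged, so the article page is fetched instead of the print page.
print(print_version('http://radikal.com.tr/haber.php?haberno=253962'))
```

Since the replacement behaves as expected on the quoted URL, a possible (unconfirmed) cause is that the URL reaching print_version has a different form, which can be checked by printing the url inside print_version as suggested earlier in the thread.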


All times are GMT -4.

