MobileRead Forums - View Single Post

eksor · 02-03-2010, 03:48 PM

Quote:

Originally Posted by posativ

Hi,

I hope I've chosen the right subforum.
I am not really familiar with all these ebook standards...

I would like to write a little Python script that, starting from a given Wikipedia article, downloads all the Wikipedia entries it mentions and links to, down to a recursion depth of, let's say, 1 or 2.

My output would be some HTML files.

How can I convert them to e.g. LRF, so that I can click on a link in one LRF file to open the related article in another LRF file?

I think that LRF files are self-contained: the whole bunch of images, HTML/XML files and so on is compressed into a single file (à la CHM), with no possibility of linking to external LRF files.

calibre (http://calibre-ebook.com/) is written in Python, I think, and already has the web2disk and ebook-convert command-line utilities, which should do what you want.
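As a rough sketch of how the conversion step could be scripted from Python: this assumes calibre is installed, that ebook-convert is on the PATH, and that it is called in its basic form, ebook-convert input_file output_file. The helper names build_convert_command and convert are made up for illustration and are not part of calibre.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(html_file, out_format="lrf"):
    """Return the argument list for: ebook-convert <input> <output>.

    The output file sits next to the input, with the extension swapped.
    (Hypothetical helper, not a calibre API.)
    """
    src = Path(html_file)
    dst = src.with_suffix("." + out_format.lower().lstrip("."))
    return ["ebook-convert", str(src), str(dst)]


def convert(html_file, out_format="lrf"):
    """Run ebook-convert on one file; raises if calibre is missing or conversion fails."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is calibre installed?")
    subprocess.run(build_convert_command(html_file, out_format), check=True)
```

Calling convert("article.html") would then try to produce article.lrf in the same folder. Whether links between the downloaded pages survive the conversion is a separate question that this sketch does not address.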

The bad thing is that I tried that with mixed results; blame Wikipedia's layout, not calibre (I would trash my PRS-700 without that marvelous software). To be fair to Wikipedia, I think that recursive downloading of articles is discouraged in the TOS or something similar. And I can understand that: it overloads the servers, and things like that.
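If you do write your own downloader, something along these lines would keep the recursion depth small and pause between requests. It is only a sketch under a few assumptions: article links are taken to be hrefs that start with /wiki/ and contain no colon (my guess at how to skip File:, Category: and similar pages), fetch is a function you supply that returns the HTML for a URL, and the names WikiLinkParser and crawl are invented for this example.

```python
import time
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class WikiLinkParser(HTMLParser):
    """Collect hrefs that look like article links (assumed: /wiki/Name, no colon)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if href.startswith("/wiki/") and ":" not in href:
            self.links.append(href)


def crawl(start_url, fetch, max_depth=1, delay=1.0):
    """Breadth-first download; returns {url: html} for pages up to max_depth links away."""
    pages = {}
    frontier = [start_url]
    for depth in range(max_depth + 1):
        next_frontier = []
        for url in frontier:
            if url in pages:
                continue  # already downloaded
            pages[url] = fetch(url)
            time.sleep(delay)  # pause between requests to go easy on the servers
            if depth == max_depth:
                continue  # deepest level: keep the page, do not follow its links
            parser = WikiLinkParser()
            parser.feed(pages[url])
            for href in parser.links:
                link = urldefrag(urljoin(url, href)).url  # absolute URL, no #fragment
                if link not in pages:
                    next_frontier.append(link)
        frontier = next_frontier
    return pages
```

With max_depth=1 this fetches the starting article plus every article it links to, which for a long Wikipedia page can already be hundreds of requests, so a generous delay is probably wise.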

If Plucker format is fine for you, you can try Plucker or SunriseXP; these two work very well. I was able to read the whole Solar System article (60-70 MB) on my PPC.

Regards.