How can I download a website and view it in a reader?


shousa
12-29-2007, 11:16 PM
I am massively interested in Ancient Rome, Greece and Japan, and want to download webpages (only material that is legally OK to copy for non-commercial use) to read on an ebook reader.

The problem is that these sites consist of multiple linked webpages (hundreds of 'em).

My understanding is that whether I copy en masse with HTTrack Website Copier or (not that I would do this) one page at a time, a key problem would likely remain: all the pages are referenced by a root directory and its subdirectories, e.g. file:///D:/My%20Web%20Sites/Rome/index.html

Could an ebook reader handle this, and if so, which one? Or is another conversion process needed? Can ebook readers only handle a single HTML file (meaning I cannot click a link on one page to go to another page in a deeper directory)?
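
On the directory question: whether a downloaded copy stays usable when it is moved depends on whether the links inside the pages are relative (like chapter1/page2.html) or absolute (like the file:///D:/... address above). Here is a rough Python sketch for checking one page. It is only an illustration using the standard library, not part of HTTrack or any reader, and the names LinkCollector and classify_links are made up for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def classify_links(html_text):
    """Split links into relative ones and absolute ones."""
    collector = LinkCollector()
    collector.feed(html_text)
    relative, absolute = [], []
    for href in collector.links:
        parsed = urlparse(href)
        # A link with a scheme (http:, file:) or a leading slash depends on
        # where the copy lives; anything else resolves against the current page.
        if parsed.scheme or href.startswith("/"):
            absolute.append(href)
        else:
            relative.append(href)
    return relative, absolute
```

If most links come back as relative, the copy should keep working from any folder, and a converter that follows links has something it can resolve.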

Could someone please help me?

RWood
12-30-2007, 12:24 AM
I have used libprs500 to capture web sites like the NY Times. It outputs a great LRF file for the Sony Reader.

shousa
12-30-2007, 06:30 PM
Sorry, from my testing libprs500 does not work at all for this.

Please reread my issue above.

It seems basic, but... CAN ANYONE DO IT?

kovidgoyal
12-30-2007, 06:52 PM
No, the Sony Reader cannot handle this, so you first have to convert the downloaded HTML into an LRF file. Look at the tools web2lrf, web2disk and html2lrf; all of them support links from one HTML file to other files.

shousa
12-31-2007, 12:26 AM
I am running your web2lrf now on a "big" site. Is it possible to specify on the command line how "deep" the program is to go when copying a website?
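
On the depth question: I am not quoting any actual web2lrf option here, but what "depth" usually means in a recursive download can be sketched in a few lines of Python. The start page is depth 0, the pages it links to are depth 1, and so on. Everything named below (crawl, get_links, the toy SITE table and its page names) is made up for illustration and is not taken from libprs500.

```python
from collections import deque


def crawl(start, get_links, max_depth):
    """Breadth-first walk from `start`, following links at most `max_depth` hops.

    `get_links` is a caller-supplied function (hypothetical here) that returns
    the list of pages linked from a given page.
    """
    seen = {start}
    order = [start]
    queue = deque([(start, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links beyond the depth limit
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                order.append(link)
                queue.append((link, depth + 1))
    return order


# Toy link graph standing in for a real site (made-up page names).
SITE = {
    "index.html": ["rome.html", "greece.html"],
    "rome.html": ["rome/caesar.html", "index.html"],
    "greece.html": ["greece/athens.html"],
    "rome/caesar.html": ["rome/caesar/gaul.html"],
}


def links_of(page):
    return SITE.get(page, [])


print(crawl("index.html", links_of, 1))
```

With a limit of 1 only the start page and the pages it links to directly are collected; raising the limit pulls in one more layer of the site each time, which is why a small change in depth can make a big difference in download size.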

If this works I will buy the Sony ebook reader for sure.

Regardless of whether it works or not, I have much respect for you and your efforts. Sincere thanks.

shousa
12-31-2007, 02:17 AM
Just reporting my findings, which I admit should have been obvious from the beginning: the Sony Reader requires LRF files, so anything more than a "small" set of web pages will not work, because all the pages get saved to ONE FILE. I tested with kovidgoyal's converter and then the viewer. The viewer dies in the middle of the 10MB file, and that file is not even complete: the actual website is more than 70MB, but the converter stopped at 10MB.

Put simply, the LRF file will be too big for the Sony Reader to handle.
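
If the size of the single file really is the problem (that is an assumption, not something confirmed here), one possible workaround would be to split the downloaded pages into several smaller volumes and convert each one separately. A rough Python sketch of the grouping step follows. The function name and the byte budget are made up for illustration, and it does not call any converter.

```python
def split_into_volumes(pages, max_bytes):
    """Group (name, size_in_bytes) pairs, in order, into volumes under max_bytes.

    A single page larger than max_bytes still gets a volume of its own.
    """
    volumes, current, current_size = [], [], 0
    for name, size in pages:
        # Start a new volume when adding this page would exceed the budget.
        if current and current_size + size > max_bytes:
            volumes.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        volumes.append(current)
    return volumes
```

Each resulting list could then be fed to the converter as its own book, for example one volume per section of the site, though links between volumes would no longer work.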

So in my view the Sony Reader is totally inadequate for reading webpages (unless they are very, very small) because of its IMHO flawed design concept, the atrocious LRF format. End of story. It seems the boys at Sony never thought about webpages being read on an ebook reader!

Words utterly fail me.

Does the Hanlin V3, with its native HTML support, handle links between webpages? (I doubt it, but I am asking anyway.)

It seems no ebook reader actually supports linked webpages, just a single webpage at a time?!

I must be wrong... tell me I am wrong!

JSWolf
12-31-2007, 09:17 AM
Not wrong.

kovidgoyal
12-31-2007, 11:06 AM
I have successfully viewed 10MB LRF files. It's not so much the file size as the file structure that sometimes causes LRF rendering to fail. And I agree that Sony could have done a much better job with LRF, in both the file spec and the rendering algorithms.