Convert offline websites into a single PDF?


magogo
05-08-2007, 08:14 AM
Is there any tool that can do that and let me specify the link depth? If so, I could read web content on my reader. I know RSS is possible; I just wonder whether there is a tool for normal web content. Thanks.

Hadrien
05-08-2007, 09:05 AM
Instead of converting a whole website, I usually extract the right information using a content extractor like Dapper, and then generate a PDF out of the RSS output of Dapper.

You can hardly turn a full website into a PDF as-is: you need to remove most of the layout, links, etc. A tool that converted "a full website" would have to be a content extractor too. That's why using a content extraction service or tool, and then converting its output into PDF, gives you the same kind of result.
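
To make the "remove most of the layout" step concrete, here is a rough sketch using only Python's standard library. It is not how Dapper works; the class name `TextExtractor`, the helper `extract_text`, and the list of tags to skip are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores layout-type blocks (hypothetical helper)."""

    # Tags whose contents are treated as layout or code rather than content.
    SKIP = {"script", "style", "nav", "header", "footer"}

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a skipped block
        self.chunks = []      # text fragments kept so far

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self.skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML page, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

The resulting plain text could then be passed to whatever PDF or LRF converter you prefer; real pages will usually need a smarter rule for deciding what counts as content.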

HarryT
05-08-2007, 09:05 AM
Given that there's no facility on the Reader for linking between documents, you'd need a tool which converted the web site into a single "linear" document with the links jumping you around within it, i.e. something which "stitched together" the individual pages of the web site "end to end" into one single "page". That wouldn't actually be terribly difficult to do - I might have a go at doing something about it if I have some free time.
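
A minimal sketch of that "stitching" idea, assuming the pages have already been downloaded and reduced to HTML body fragments. The function names `anchor_for` and `stitch` and the `page-...` id scheme are made up for this example, and the regex only handles simple double-quoted `href` values without a fragment.

```python
import re


def anchor_for(page_name):
    # Turn "docs/intro.html" into a fragment id such as "page-docs-intro-html".
    return "page-" + re.sub(r"[^A-Za-z0-9]+", "-", page_name).strip("-")


def stitch(pages):
    """Join pages end to end and point inter-page links at in-document anchors.

    pages: dict mapping page name -> HTML body fragment, in reading order.
    """
    href = re.compile(r'(href=")([^"#]*)(")')

    def rewrite(match):
        target = match.group(2)
        if target in pages:
            # Link to another stitched page: jump within the single document.
            return f"{match.group(1)}#{anchor_for(target)}{match.group(3)}"
        # External or unknown link: leave it untouched.
        return match.group(0)

    parts = []
    for name, body in pages.items():
        body = href.sub(rewrite, body)
        parts.append(f'<div id="{anchor_for(name)}">\n{body}\n</div>')
    return "<html><body>\n" + "\n<hr>\n".join(parts) + "\n</body></html>"
```

The single HTML file this produces could then be fed to an HTML-to-PDF or HTML-to-LRF converter, provided that converter preserves internal anchors.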

kovidgoyal
05-08-2007, 09:44 AM
html2lrf

RWood
05-08-2007, 10:01 AM
Also, recent versions of Adobe Acrobat will let you capture a full web site, or just a set number of levels down, into a PDF. I do this for later analysis, but those are letter-sized pages. I see no reason you could not do the same with pages sized for your reader.

magogo
05-11-2007, 06:56 AM
Thanks, guys. I found that Acrobat does a really good job on online/offline websites (see attached pic). The hyperlinks also work on the Sony Reader. Dapper is a good idea (though I am not sure whether you can use it for offline content).

geekraver
05-11-2007, 10:52 AM
web2book/rss2book can also do this.

magogo
05-12-2007, 11:05 AM
html2lrf

I tried that. It's great. Much better than Acrobat, because it generates native lrf format. Thanks.