Old 07-23-2007, 07:51 PM   #1
sic
Addict
Posts: 202
Karma: 692
Join Date: Oct 2006
Device: SONY reader
HTML merge tool needed

Hi

Do you guys know a good tool to merge a bunch of HTML files into a single document?

What I want to do is download manuals or electronic books that are available online and convert them into LRF, PDF or RTF.
The conversion works best on a single file.

An example of what I'm trying to convert is here:
http://www.zeroc.com/doc/Ice-3.2.0/manual/

I guess there are HTML merge utilities out there... I just don't know where to find them.

thanks
Old 07-23-2007, 09:02 PM   #2
JSWolf
Resident Curmudgeon
Posts: 73,897
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
If you have, say, an HTML file that links to another, which links to another, and so on, you can use HTML2LRF to convert it to LRF easily; it will follow the links and pick up the other files it needs.
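
To make the "follow the links" idea concrete, here is a minimal Python sketch of collecting the pages linked from a table-of-contents page, in order. It is not how HTML2LRF does it internally; `LinkCollector` and `linked_pages` are names made up for this illustration, and it assumes the chapters live in the same directory as the TOC page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def linked_pages(toc_html, toc_url):
    """Return absolute URLs linked from the TOC page, in order, without duplicates.

    Assumption: only pages in the same directory as the TOC page are wanted.
    """
    prefix = toc_url.rsplit("/", 1)[0] + "/"
    collector = LinkCollector()
    collector.feed(toc_html)
    seen, pages = set(), []
    for href in collector.hrefs:
        # Resolve relative links and drop any "#fragment" part.
        url, _fragment = urldefrag(urljoin(toc_url, href))
        if url.startswith(prefix) and url != toc_url and url not in seen:
            seen.add(url)
            pages.append(url)
    return pages
```

The returned list gives the order in which the pages would need to be fetched and stitched together.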
Old 07-24-2007, 11:15 AM   #3
sic
Thanks, I know...
I tried Kovid's tools but I'm not always happy with the results.
I guess it'll take some shell scripting (sed/awk/cat) to get the job done...
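
For what it's worth, the merge step itself can also be sketched in a few lines of Python instead of sed/awk/cat. This is only a rough sketch under assumptions: `body_of` and `merge_html` are hypothetical helper names, the regex-based body extraction is crude (a real HTML parser would cope better with messy pages), and stylesheets and images are ignored.

```python
import re
from html import escape

# Crude match for everything between <body ...> and </body>.
BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def body_of(page):
    """Return the contents of <body>, or the whole text if no body tag is found."""
    match = BODY_RE.search(page)
    return match.group(1) if match else page


def merge_html(pages, title="Merged document"):
    """Wrap the bodies of several HTML pages in one document, in the given order."""
    parts = ["<html><head><title>%s</title></head><body>" % escape(title)]
    for page in pages:
        parts.append(body_of(page).strip())
        parts.append("<hr/>")  # visible separator between source pages
    parts.append("</body></html>")
    return "\n".join(parts)
```

To use it, read each downloaded file into a string, pass the list of strings to `merge_html` in reading order, and write the result to a single file for the converter.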
Old 07-24-2007, 11:40 AM   #4
kovidgoyal
creator of calibre
Posts: 43,843
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
May I ask why not?
Old 07-24-2007, 11:51 AM   #5
sic
Hi Kovid

can you please try this:
web2lrf http://www.zeroc.com/doc/Ice-3.2.0/manual/

I guess it's the style sheet,
or it could even be that the pages are not valid XHTML...

thanks
Old 07-24-2007, 12:37 PM   #6
kovidgoyal
The correct commandline for that site should be
Code:
web2lrf --url http://www.zeroc.com/doc/Ice-3.2.0/manual/toc.html
Old 07-24-2007, 12:39 PM   #7
sic
yes... my bad.
Old 07-24-2007, 01:18 PM   #8
kovidgoyal
The site converted fine for me. What was the problem?
Old 07-24-2007, 07:05 PM   #9
sic
The results are ugly.
E.g. formatting such as the distinction between code examples and "normal" text was lost.

Another problem, which is not the fault of the html2lrf tool, is that it would be nice to remove the header and footer, such as the Previous/Next links, from every page.
These are needed when the document is presented online, but are only noise once it is converted for offline viewing.
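
As a rough illustration of the kind of cleanup meant here, the sketch below removes navigation links from a page before conversion. It is a guess at the markup, not something taken from the Ice manual or from web2lrf: it assumes the links are plain `<a>` elements whose entire text is "Previous", "Prev" or "Next", and `strip_nav_links` is a made-up name.

```python
import re

# Matches an <a ...> element whose whole text is Previous, Prev or Next.
# Assumption: the navigation links contain no nested tags or images.
NAV_LINK_RE = re.compile(
    r"<a\b[^>]*>\s*(?:previous|prev|next)\s*</a>",
    re.IGNORECASE,
)


def strip_nav_links(page):
    """Remove Previous/Next navigation links, leaving all other links alone."""
    return NAV_LINK_RE.sub("", page)
```

Separators such as " | " between the removed links are left behind, so real pages would probably need one more pattern for the surrounding header/footer container.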
Old 07-24-2007, 08:11 PM   #10
kovidgoyal
The code examples are in a monospace font; what other formatting do you mean? As for stripping the headers/footers, that's easily done by creating a profile for web2lrf.
Old 07-24-2007, 08:26 PM   #11
sic
thanks

how do I create a profile? where can I find some info on that?

did you use the web2lrf script to get the monospace?
Old 07-24-2007, 10:29 PM   #12
kovidgoyal
Yeah, the command line I posted before gave me monospaced code samples. Unfortunately, at the moment the only way to create new profiles is by editing the source. I'll add an easier way when I get the time. If you're interested, look at the web2lrf thread, where I've posted a link to some example profiles.
Old 07-25-2007, 03:34 AM   #13
Moonraker
Addict
Posts: 314
Karma: 1002965
Join Date: Mar 2006
Location: UK
Device: ILiad. Gen 3, PocketBook 360, Kobo Aura HD, Kindle Oasis 2
I use SoftSnow's HTML Merger.

You can get it here:

http://softsnow.griffin3.com/merger/merger.shtml
Old 07-25-2007, 12:09 PM   #14
sic
thanks Kovid

I had a look at the Python source.
Cool stuff.
I'm not a Python guru, I'm still learning it... but I think I can figure it out.
Old 11-07-2007, 06:04 PM   #15
kiltme
Junior Member
Posts: 2
Karma: 10
Join Date: Nov 2007
Device: axim x51v
Quote:
Originally Posted by Moonraker View Post
I use SoftSnow's HTML Merger.

You can get it here:

http://softsnow.griffin3.com/merger/merger.shtml
I've tried to install it on a couple of different machines and it crashes on startup. It is kind of old (2003), so I don't know if it's a DLL problem or what.

Have you gotten it to work recently? I remember running it years ago on my old computer.