Old 02-13-2011, 05:26 PM   #1
kenth
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2010
Device: none
Issues converting Web HTML docs to ebooks

Hi,

I'm trying to convert a couple of real page-turners: 1) the Python Library reference and 2) PostgreSQL online documentation to Kindle eBooks.

The Python docs start as reStructuredText & are converted by Sphinx to HTML. The HTML source is available at http://docs.python.org/archives/pyth...s-html.tar.bz2. I added library/index.html as an eBook and converted this for my Kindle.

The Postgres docs start as SGML & are converted by Jade to HTML. The HTML source is available at ftp://ftp9.us.postgresql.org/pub/mir...-9.0.3.tar.bz2. I added doc/src/sgml/html/index.html as an eBook and converted that.

The conversions were suboptimal in two different ways. The PostgreSQL book was out of order: the appendixes appeared first, and the HTML footers were on each page. The Python library book still had all (or at least most) of the HTML tags in place, so you were reading raw HTML on the Kindle.

Any thoughts on converting these to nice ebooks with a TOC, etc.? Obviously, reading the SGML or reST source (before conversion to HTML) would be best. Any other thoughts?

Thanks. Kent
Old 02-13-2011, 09:03 PM   #2
almagary
Wears funny hat (cloth)
Posts: 28
Karma: 26
Join Date: Dec 2010
Location: Limbo
Device: Kobo WiFi, Kobo Touch
"Add book" turning HTML into ZIP rather than EPUB

I've been pasting longer HTML docs into Word, creating a TOC, then saving as Filtered HTML, then using Calibre Add-Book to create EPUBs for reading on my Kobo WiFi. Suddenly, perhaps with Calibre 0.7.43 or .44, all the HTMLs get added as ZIPs. That requires an extra conversion, ZIP to EPUB, and litters directories with extra files. I found that even a one-paragraph HTML doc gets ZIPped when added by Calibre.

Any way to have Calibre Add-Book default to EPUB?

I am not sophisticated with HTML and really only know how to use Word to create simple HTML.

Calibre is otherwise terrific and I'm donating $50 right now!
Old 02-13-2011, 10:46 PM   #3
itimpi
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
I am confused. Adding a book to Calibre does not convert the format; it merely stores the format you add. In addition, when you add an HTML file, Calibre has stored it (and all linked files) as a ZIP file for as long as I can remember.

Once you have added the HTML (ZIP) file, it can be converted to the format of your choice. This can be done explicitly after the add, or implicitly as part of the transfer of files to the reader device. It is not (and never has been) done as part of the add process.
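
For anyone who would rather script the explicit conversion step than use the GUI, here is a minimal sketch that drives calibre's `ebook-convert` command-line tool from Python. It assumes calibre is installed and `ebook-convert` is on the PATH; the helper names and the file paths are hypothetical and only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(source: Path, target: Path, profile: str = "kindle") -> list[str]:
    """Return the argument list for calibre's ebook-convert CLI.

    The output format is taken from the extension of `target`.
    """
    return [
        "ebook-convert",
        str(source),
        str(target),
        "--output-profile", profile,
    ]


def convert(source: Path, target: Path) -> None:
    """Run the conversion, failing early if ebook-convert is not installed."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is calibre installed?")
    subprocess.run(build_convert_command(source, target), check=True)


# Hypothetical usage (paths are placeholders):
# convert(Path("html/index.html"), Path("docs.mobi"))
```

The command is built separately from the call that runs it, so it can be printed and checked before anything is converted.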
Old 02-13-2011, 11:54 PM   #4
almagary
Wears funny hat (cloth)
Posts: 28
Karma: 26
Join Date: Dec 2010
Location: Limbo
Device: Kobo WiFi, Kobo Touch
My bad; I was addled. I had used 2epub.com to convert some HTML docs into EPUBs which I then added to Calibre. A few days later I had completely forgotten that intermediate step.

BTW 2epub.com is convenient, quick, reliable. Maybe Calibre should incorporate that function.
Old 02-13-2011, 11:55 PM   #5
kovidgoyal
creator of calibre
Posts: 43,842
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Umm, you do know that 2epub.com uses calibre to do the actual conversions, don't you?
Old 02-14-2011, 01:36 AM   #6
almagary
Wears funny hat (cloth)
Posts: 28
Karma: 26
Join Date: Dec 2010
Location: Limbo
Device: Kobo WiFi, Kobo Touch
I do now and I guess I'd better say no more!
Old 02-14-2011, 01:39 AM   #7
Anansii
Junior Member
Posts: 1
Karma: 10
Join Date: Feb 2011
Device: Droid, iTouch, Dell Latitude XP
So it reads HTML files, packs them into a ZIP file, tells you it just added a ZIP file, but then if you go ahead and convert it, it actually does produce an EPUB file. I just love the little things that don't get mentioned in the manual because, after all, EVERYBODY knows about them, don't they? Queep - thanks folks, I was going nuts, but you've solved the problem I just registered with this thing to answer.
Old 02-14-2011, 04:01 AM   #8
kenth
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2010
Device: none
Hi guys,

Thanks for the thoughts. Anything specific I should try to get a web HTML doc into a nice ebook format? I thought that using the Python Library Reference might be helpful, as kovidgoyal is obviously quite familiar with Python/reST/Sphinx and knows what magic would need to happen, or whether it's a road not yet ready to be taken.

Please keep the thoughts coming. This is one problem I've not been able to solve & I'd appreciate any help I could get.

Kent

Last edited by kenth; 02-14-2011 at 04:13 AM. Reason: typo
Old 02-16-2011, 05:10 AM   #9
Archon
Zealot
Posts: 110
Karma: 5176
Join Date: Dec 2010
Device: Mac OSX, iPad, iPod, & Nook
"It's nice to be nice to the nice" - Frank Burns M*A*S*H

Quote:
Anything specific I should try to get a web HTML doc into a nice ebook format?
The word "nice" is subjective, but I would suggest you flatten the web site so it can be read linearly as a book instead of as a web site. Web sites are not designed to be books; they are optimized to load pages quickly on Gramma's dial-up and consequently have quite an 'exploded' structure. To make it 'nice' will require some manual manipulation on your part.

So, on ideas to make it a "nice" ebook, I would think you need to take the site and flatten it manually so that instead of having an exploded structure it is linear like a normal book. If the html files are all sequentially numbered in the order you want them in, like pages in a book, you can easily stitch them together as one file and then add chapter markers and a TOC on the first page. Or, their filenames can be adjusted according to an index you create or modify and then possibly loaded into calibre by adding that index instead.

I would use textutil to concatenate the smaller files into one large file for each chapter, or for the whole book, depending on how large it is. Then import it into calibre or add it to your Kindle.

textutil is a command-line tool that ships with Mac OS X; it is not available on Linux. If you are using Linux or Windows, I am sure there is some utility that does the same thing.

http://www.unix.com/man-page/All/1/TEXTUTIL/
http://oreilly.com/pub/a/mac/2005/11/22/cli-tools.html
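
On platforms without textutil, the stitching idea above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the pages are already in reading order, each page's content sits inside a single `<body>` element, and a plain list of anchor links is good enough as a first-page TOC. The function names are made up for this example.

```python
import html
import re

# Matches the contents of the first <body>...</body> element, if any.
_BODY_RE = re.compile(r"<body[^>]*>(.*?)</body>", re.IGNORECASE | re.DOTALL)


def extract_body(page: str) -> str:
    """Return the markup inside <body>, or the whole page if there is no body tag."""
    match = _BODY_RE.search(page)
    return match.group(1).strip() if match else page.strip()


def stitch(pages: list[tuple[str, str]], book_title: str) -> str:
    """Join (chapter_title, html_source) pairs, in order, into one HTML document.

    Each chapter gets an <h1> with an id, and a linked TOC is placed at the top.
    """
    toc_items = []
    chapters = []
    for number, (title, source) in enumerate(pages, start=1):
        anchor = f"chapter-{number}"
        safe_title = html.escape(title)
        toc_items.append(f'<li><a href="#{anchor}">{safe_title}</a></li>')
        chapters.append(f'<h1 id="{anchor}">{safe_title}</h1>\n{extract_body(source)}')
    return (
        "<html><head><title>" + html.escape(book_title) + "</title></head><body>\n"
        "<ul>\n" + "\n".join(toc_items) + "\n</ul>\n"
        + "\n".join(chapters)
        + "\n</body></html>\n"
    )
```

Reading the files from disk in sorted filename order and writing the result to a single .html file is left out; that single file is what would then be added to calibre.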

Happy Wednesday
Archon

Last edited by Archon; 02-16-2011 at 05:14 AM.
Old 03-03-2011, 03:28 PM   #10
kenth
Junior Member
Posts: 3
Karma: 10
Join Date: Oct 2010
Device: none
Followup with Resolutions

I successfully (well, sort of) loaded these ebooks onto my Kindle by downloading the original documentation source & going from there.

For the PostgreSQL documentation, I built the documentation as a single HTML file & successfully converted this. Since building the docs involved installing lots of tools, I just created a new FreeBSD virtual machine for the purpose, built my HTML file & called it a day. Looks nice on the Kindle.

For the Python standard library, it was a bit more difficult. The version of Sphinx used for the 2.x documentation is old & supports neither the singlehtml nor the epub output format. However, the 3.x docs use Sphinx 1.0.7, so I punted on the 2.7 docs & just did a "make epub" in the 3.2 doc subdirectory. I loaded the resulting EPUB into calibre with no problem.

All in all, not so bad. The new Sphinx output formats (as of v1.0 I believe) make conversion a piece of cake. Hopefully this is the direction the world is moving.
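
For projects whose Makefile has no epub target, the same Sphinx builders can be invoked directly through `sphinx-build -b <builder>`. The sketch below only assembles and runs that command; it assumes Sphinx 1.0 or later is installed, and the helper name and directory names are placeholders rather than anything from the Python source tree.

```python
import subprocess
from pathlib import Path

# Builders discussed in this thread; both need Sphinx 1.0 or later.
SUPPORTED_BUILDERS = {"epub", "singlehtml"}


def sphinx_build_command(builder: str, source_dir: Path, output_dir: Path) -> list[str]:
    """Return the sphinx-build argument list for the given builder."""
    if builder not in SUPPORTED_BUILDERS:
        raise ValueError(f"unsupported builder: {builder!r}")
    return ["sphinx-build", "-b", builder, str(source_dir), str(output_dir)]


def build_docs(builder: str, source_dir: Path, output_dir: Path) -> None:
    """Run sphinx-build and raise if it exits with an error."""
    subprocess.run(sphinx_build_command(builder, source_dir, output_dir), check=True)


# Hypothetical usage (directory names are placeholders):
# build_docs("epub", Path("Doc"), Path("epub-out"))
```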

Kent