Old 08-08-2008, 11:25 PM   #1
maggotb0y
Connoisseur
Posts: 84
Karma: 1166
Join Date: Apr 2007
Location: New Jersey, Outside of Philadelphia
Device: Sony Reader
Converting BookDesigner to epub (bd2epub.pl)

I've created a perl script that will convert a BookDesigner html0 file to an ePub book. This is an early attempt and still needs lots of work, but I'm posting it for those that are interested (backups are your friend, etc, etc). Right now, this requires that info-zip's zip.exe be somewhere on the path, and this is Windows only at the moment. I hope to remove both of these requirements at some point.

I could use some help wrapping this up, I'll happily take any advice from anyone who has some suggestions.

I've attached the Perl script (when it's a little more polished I'll compile and post the .exe) so you'll need Perl to run it. For those who are just interested in seeing the output, I've attached "Ivanhoe" by Sir Walter Scott in a bookDesigner generated Sony Reader file, and the .epub version as well. Eventually I'll work on a more capability-intensive test, but I've already noticed some things that the ePub engine renders better. For example, a line ending in a dash with an endquote can lead to a line with only a dash and endquote in Sony's format. The ePub file seems not to suffer from this.
Attached Files
File Type: pl bd2epub.pl (11.5 KB, 857 views)
File Type: lrf Ivanhoe.lrf (822.6 KB, 940 views)
File Type: epub Ivanhoe.epub (574.9 KB, 1059 views)
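
For readers curious about the packaging step that the script hands to zip.exe, here is a minimal sketch in Python rather than Perl. It is not taken from bd2epub.pl; the function name build_epub and the directory layout are assumptions made for illustration. The one rule it demonstrates comes from the ePub container format: the "mimetype" entry goes first in the archive and is stored without compression.

```python
import zipfile
from pathlib import Path


def build_epub(src_dir, out_path):
    """Zip the files under src_dir into out_path as an ePub container.

    Hypothetical helper: out_path should live outside src_dir so the
    archive does not try to include itself.
    """
    src = Path(src_dir)
    with zipfile.ZipFile(out_path, "w") as zf:
        # "mimetype" must be the first entry and must not be compressed.
        zf.writestr(
            zipfile.ZipInfo("mimetype"),
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        for path in sorted(src.rglob("*")):
            rel = path.relative_to(src).as_posix()
            # Skip directories and any mimetype file already in the source tree.
            if path.is_file() and rel != "mimetype":
                zf.write(path, rel, compress_type=zipfile.ZIP_DEFLATED)
```

Because this uses only the standard library, a port along these lines would not need an external zip program on the path.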
Old 08-08-2008, 11:46 PM   #2
Timoleon
Time Enough at Last
Posts: 382
Karma: 551316
Join Date: Feb 2008
Location: New England
Device: iPad 3, iPhone 5, Kindle 3, Fire, Sony PRS-350
Well done!
Old 08-10-2008, 09:00 AM   #3
wallcraft
reader
Posts: 6,979
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3 and Fire
Quote:
Originally Posted by maggotb0y View Post
For those who are just interested in seeing the output, I've attached "Ivanhoe" by Sir Walter Scott in a bookDesigner generated Sony Reader file, and the .epub version as well.
The ePub version is entirely in italics under FBReader. This is because of a missing </div> in Ivanhoe-title.html (after "Prior."):

Code:
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;</div>
<div class="epigraph"><em><div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Now fitted the halter, now traversed the cart, And often took leave, — but seemed loath to depart!*</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;* The motto alludes to the Author returning to the stage * repeatedly after having taken leave.</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Prior.</em>
</div>
Should be:
Code:
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;</div>
<div class="epigraph"><em>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Now fitted the halter, now traversed the cart, And often took leave, — but seemed loath to depart!*</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;* The motto alludes to the Author returning to the stage * repeatedly after having taken leave.</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Prior.</div></em>
</div>
Adobe DE does not care, because it treats each HTML file separately, but FBReader merges all the HTML into one ebook on startup.

I also suggest adding an option to remove the "justify" of each paragraph. Or, better still, promote the justify and the initial spaces of each paragraph (added by BD) into the CSS file (with an option to remove the justify).
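
A nesting problem like the one above can be caught before packaging. The following is a rough sketch only, using Python's standard html.parser; the names NestingChecker and check_nesting are made up for this example, and the check is far simpler than a real validator.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class NestingChecker(HTMLParser):
    """Track open tags on a stack and record closes that do not line up."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # (tag, line) pairs still waiting for a close
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.open_tags.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        line = self.getpos()[0]
        if tag not in [name for name, _ in self.open_tags]:
            self.problems.append(f"line {line}: </{tag}> has no matching open tag")
            return
        # Anything opened after the matching tag was left unclosed.
        while self.open_tags:
            name, opened = self.open_tags.pop()
            if name == tag:
                break
            self.problems.append(
                f"line {line}: </{tag}> found while <{name}> (line {opened}) is still open"
            )


def check_nesting(html):
    """Return a list of human-readable nesting problems (empty if none)."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    for name, opened in checker.open_tags:
        checker.problems.append(f"line {opened}: <{name}> is never closed")
    return checker.problems
```

Run against the first snippet above, a check of this kind should flag the </em> that arrives while the last "justify" div is still open; the corrected snippet should produce no messages.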
Old 08-11-2008, 03:07 AM   #4
HarryT
eBook Enthusiast
Posts: 62,779
Karma: 40397151
Join Date: Nov 2006
Location: UK
Device: PW2, iPad Retina Mini, iPhone 4, MS Surface Pro, Onyx T68, N7,
If you export a BD file as HTML, you get "<DIV align=justify>" on every "normal" paragraph. The first thing I do with BD files is a global search and replace, changing "align=justify" to an empty string.
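
The same search and replace can be scripted. This is a small sketch in Python, assuming the exported HTML has been read into a string; the name strip_justify is hypothetical. It is a plain text substitution, so it would also alter the attribute if it happened to appear in the book's visible text.

```python
import re

# Matches align=justify with or without quotes, in any letter case,
# together with the whitespace in front of it.
ALIGN_JUSTIFY = re.compile(
    r'\s+align\s*=\s*(["\']?)justify\1(?=[\s/>])', re.IGNORECASE
)


def strip_justify(html):
    """Return html with every align=justify attribute removed."""
    return ALIGN_JUSTIFY.sub("", html)
```

For example, strip_justify('<DIV align=justify>text</DIV>') gives '<DIV>text</DIV>'.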
Old 10-27-2008, 01:47 PM   #5
Mr. Goodbar
Guru
Posts: 918
Karma: 452
Join Date: Jul 2006
Location: Atlanta, GA
Device: Sony 950, Kindle Graphite DX, iRex iLiad
Quote:
Originally Posted by HarryT View Post
If you export a BD file as HTML, you get "<DIV align=justify>" on every "normal" paragraph. The first thing I do with BD files is a global search and replace, changing "align=justify" to an empty string.
I'm missing something. How do you export a BD file? I can't seem to find that option anywhere and my apologies for a stupid question.
Old 10-27-2008, 01:49 PM   #6
HarryT
eBook Enthusiast
Posts: 62,779
Karma: 40397151
Join Date: Nov 2006
Location: UK
Device: PW2, iPad Retina Mini, iPhone 4, MS Surface Pro, Onyx T68, N7,
Just "File/Save As..." on the menu. You'll get a dialog with a list of formats to save in.
Old 10-27-2008, 02:29 PM   #7
Mr. Goodbar
Guru
Posts: 918
Karma: 452
Join Date: Jul 2006
Location: Atlanta, GA
Device: Sony 950, Kindle Graphite DX, iRex iLiad
Ok, now I really feel stupid. Thanks for the help.
Old 03-17-2009, 04:38 PM   #8
Croker
Connoisseur
Posts: 73
Karma: 630
Join Date: Sep 2008
Location: Liverpool, UK
Device: Kindle Paperwhite, obv.
Quote:
Originally Posted by maggotb0y View Post
I've created a perl script that will convert a BookDesigner html0 file to an ePub book. This is an early attempt and still needs lots of work, but I'm posting it for those that are interested (backups are your friend, etc, etc). Right now, this requires that info-zip's zip.exe be somewhere on the path, and this is Windows only at the moment. I hope to remove both of these requirements at some point.

I could use some help wrapping this up, I'll happily take any advice from anyone who has some suggestions.

I've attached the Perl script (when it's a little more polished I'll compile and post the .exe) so you'll need Perl to run it. For those who are just interested in seeing the output, I've attached "Ivanhoe" by Sir Walter Scott in a bookDesigner generated Sony Reader file, and the .epub version as well. Eventually I'll work on a more capability-intensive test, but I've already noticed some things that the ePub engine renders better. For example, a line ending in a dash with an endquote can lead to a line with only a dash and endquote in Sony's format. The ePub file seems not to suffer from this.
Hi! I could really do with some help on this one, as I like to use Book Designer to tidy up my files, but can't convert the html0 format to ePub, which is what I want to use. I've installed Perl and downloaded the script here, but I have no idea what to do next to actually convert the file. Can someone talk me through it?

I suspect that the part that I've highlighted in bold in the quote above is where I'm going wrong, as I have no idea what it means! I've found Info-Zip's Zip.exe, but how do I put it "somewhere on the path"??

Basically, I need to know very basic stuff, too, such as actually where to run the script, etc. I'd appreciate it if anyone can give me a quick step-by-step guide so I can get on with converting my books!

Also, a slightly OT question: a while back, I created an LRF file using Book Designer, embedding a font during the creation. As a result (and I'm sure you can all guess what's coming next), the page turn speed for that particular book is very slow in comparison to "normal" LRF or ePub files.

Would I be right in thinking that, if I stick to using ePub files, and install the font I want to use directly to my PRS-505, the page turn speed will be back to normal?

Thanks for any help you can provide!
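
On the "somewhere on the path" question: the path is the list of directories in the PATH environment variable, which the system searches when a program is started by name. The sketch below (Python, with hypothetical helper names) shows one way to check whether a program such as zip can be found that way. It says nothing about how bd2epub.pl itself looks for zip.exe.

```python
import os
import shutil


def find_on_path(program, path=None):
    """Return the full path of program if a PATH directory holds it, else None.

    path lets a caller test an alternative directory list; by default the
    current PATH environment variable is searched.
    """
    return shutil.which(program, path=path)


def path_directories():
    """Return the directories currently listed in PATH, in search order."""
    return [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
```

If find_on_path("zip") returns None, either copy zip.exe into one of the folders listed by path_directories() or add its folder to PATH.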
Old 03-17-2009, 04:49 PM   #9
wallcraft
reader
Posts: 6,979
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3 and Fire
One option would be to use Calibre. If you want to import the .html0 file you first need to rename it .html, but then the Calibre GUI should be enough to get the file converted to ePub. I don't know if the html0 file is a better starting point than a .html exported from Book Designer or not.
Old 03-17-2009, 04:53 PM   #10
Croker
Connoisseur
Posts: 73
Karma: 630
Join Date: Sep 2008
Location: Liverpool, UK
Device: Kindle Paperwhite, obv.
Quote:
Originally Posted by wallcraft View Post
One option would be to use Calibre. If you want to import the .html0 file you first need to rename it .html, but then the Calibre GUI should be enough to get the file converted to ePub. I don't know if the html0 file is a better starting point than a .html exported from Book Designer or not.
Thanks, I didn't know that - I'll give it a whirl. I'd still like to know how to use the script, though, for comparison purposes.

Do you have any advice on my question regarding fonts, though?

EDIT: Hmmm. I turned the .html0 file into an html file, added it to Calibre, and then converted it to ePub, but I got the following:

Job: **Convert book: The Beach**
**tuple**: ('SplitError', u'Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB')
**Traceback**:
Traceback (most recent call last):
File "parallel.py", line 942, in worker
File "parallel.py", line 900, in work
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_any.py", line 148, in any2epub
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_html.py", line 393, in convert
File "calibre\ebooks\epub\split.pyo", line 480, in split
File "calibre\ebooks\epub\split.pyo", line 73, in __init__
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 141, in split_to_size
SplitError: Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB

**Log**:
Found OPF file in archive
Creating EPUB file...
[DEBUG] Processing HTMLFile:0:a:c:\users\dom\appdata\local\temp\calibre_0.4.129_adi8af_any2epub1\content\The Beach.html...
[INFO] Parsing calibre_0.4.129_adi8af_any2epub1\content\The Beach.html
[INFO] Rationalizing fonts...
[DEBUG] Done rationalizing
[DEBUG] Saving stylesheets...
[INFO] Splitting The Beach.html (803 KB)
[INFO] Splitting on page breaks...
[INFO] Looking for large trees...
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #1 (1 KB)
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #2 (0 KB)
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #3 (0 KB)
[DEBUG] Splitting...
[DEBUG] Splitting...
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #4 (2 KB)
[DEBUG] Splitting...
[DEBUG] Splitting...
('SplitError', u'Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB')
Traceback (most recent call last):
File "parallel.py", line 942, in worker
File "parallel.py", line 900, in work
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_any.py", line 148, in any2epub
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_html.py", line 393, in convert
File "calibre\ebooks\epub\split.pyo", line 480, in split
File "calibre\ebooks\epub\split.pyo", line 73, in __init__
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 141, in split_to_size
SplitError: Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB

So that didn't work. I'll have to try opening the BD file, saving it as an html, and then trying to convert that instead.

DOUBLE EDIT: Saving the BD file as html and then trying to convert the resulting file produces the same error as above, too. Any ideas?

Last edited by Croker; 03-17-2009 at 05:10 PM. Reason: Added error data and further info.
Old 03-17-2009, 05:15 PM   #11
kovidgoyal
creator of calibre
Posts: 25,450
Karma: 4961459
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
In calibre, set the profile to None (the resulting epub file will work everywhere except on the Sony readers).
Old 03-17-2009, 05:22 PM   #12
Croker
Connoisseur
Posts: 73
Karma: 630
Join Date: Sep 2008
Location: Liverpool, UK
Device: Kindle Paperwhite, obv.
Right, I'll try that, but I actually own a Sony Reader, so I'm at a bit of a loss to see how that helps me specifically!

Cheers for the tip off, though. I'll see if I can actually produce an ePub first, and then worry about getting it on to my Reader later!

EDIT: Aha! That's worked and kept all my formatting, etc, which is great. I've just tried reading it in ADE, and it worked fine. Thanks, Kovid, I really appreciate your help.

However, is it the case that this ePub file I now have cannot be used on my Reader?

Last edited by Croker; 03-17-2009 at 05:27 PM.
Old 03-17-2009, 05:39 PM   #13
wallcraft
reader
Posts: 6,979
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3 and Fire
Quote:
Originally Posted by Croker View Post
SplitError: Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB
Calibre splits up big HTML files on page breaks for the PRS-505. Does it make sense that this 800 KB html file has no chapters or other opportunities for page breaks? If so, then pick a few places for breaks and manually add them in Book Designer. If not, then Calibre's automatic chapter detection is likely failing. This should be obvious in Adobe Digital Editions (no chapters in the TOC). I think Calibre uses h1 and h2 as the default chapter detection marks, but this is customizable.
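
A quick way to see whether an HTML file has any h1 or h2 headings at all is to count them. The sketch below is Python with made-up names (HeadingCounter, count_headings). It does not reproduce Calibre's chapter detection; it only tells you whether the default marks mentioned above are present in the file.

```python
from collections import Counter
from html.parser import HTMLParser


class HeadingCounter(HTMLParser):
    """Count how often each of the given heading tags is opened."""

    def __init__(self, tags=("h1", "h2")):
        super().__init__()
        self.tags = set(tags)
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag names in lower case.
        if tag in self.tags:
            self.counts[tag] += 1


def count_headings(html, tags=("h1", "h2")):
    """Return a dict such as {"h1": 1, "h2": 12} for the given HTML text."""
    parser = HeadingCounter(tags)
    parser.feed(html)
    parser.close()
    return dict(parser.counts)
```

An empty result for a file like "The Beach.html" would suggest the chapter titles are marked up some other way, for example as styled div elements.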
Old 03-17-2009, 05:43 PM   #14
Croker
Connoisseur
Posts: 73
Karma: 630
Join Date: Sep 2008
Location: Liverpool, UK
Device: Kindle Paperwhite, obv.
Quote:
Originally Posted by wallcraft View Post
Does it make sense that this 800 KB html file has no chapters or other opportunities for page breaks? If so, then pick a few places for breaks and manually add them in Book Designer. If not, then Calibre's automatic chapter detection is likely failing. This should be obvious in Adobe Digital Editions (no chapters in the TOC). I think Calibre uses h1 and h2 as the default chapter detection marks, but this is customizable.
There are page breaks in the file already, though, and I'd already created a TOC in the Book Designer file, before converting it to html.

Anyway, when I followed Kovid's advice above, and set profile to "none" in Calibre, it produced an ePub with the formatting (and working TOC) intact. The only thing missing was the cover image, which I'd dragged and dropped on to the first page when originally creating the BD file (any advice on how I can get the image back into the document, or how I can get the cover image to display when I open the ePub, would be greatly appreciated).

As things seem to be working now, I just need to know if I can use the ePub I've created on my PRS-505. The advice that Kovid gave me above seems to say that I can't. Is this right?
Old 03-17-2009, 05:57 PM   #15
wallcraft
reader
Posts: 6,979
Karma: 5183568
Join Date: Mar 2006
Location: Mississippi, USA
Device: Kindle 3 and Fire
Quote:
Originally Posted by Croker View Post
As things seem to be working now, I just need to know if I can use the ePub I've created on my PRS-505. The advice that Kovid gave me above seems to say that I can't. Is this right?
Calibre apparently isn't detecting the page breaks or the Chapters, otherwise it would be able to split up the file (or perhaps there is one really big chapter at some point).

Note that by TOC, I mean a formal ePub TOC (that ADE will display in a side panel) not an internal set of HTML links acting as a TOC. The latter is ok, but it may not necessarily help Calibre detect chapters.
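
To tell the two kinds of TOC apart, one can look inside the ePub itself: the formal TOC in an ePub of this era is an NCX file whose entries are navPoint elements. The following Python sketch counts them. The function name count_ncx_entries is an assumption for illustration, and it simply looks for any file ending in .ncx rather than following the OPF manifest.

```python
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by NCX table-of-contents files.
NCX_NS = "{http://www.daisy.org/z3986/2005/ncx/}"


def count_ncx_entries(epub_path):
    """Return the number of navPoint entries found in the ePub's .ncx files.

    epub_path may be a file name or an open binary file object.
    """
    total = 0
    with zipfile.ZipFile(epub_path) as zf:
        for name in zf.namelist():
            if name.lower().endswith(".ncx"):
                root = ET.fromstring(zf.read(name))
                total += sum(1 for _ in root.iter(NCX_NS + "navPoint"))
    return total
```

A count of zero or one would match the symptom described above: the book may still contain clickable HTML links, but a reader's TOC side panel has little or nothing to show.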