Converting BookDesigner to epub (bd2epub.pl)


maggotb0y
08-08-2008, 11:25 PM
I've created a perl script that will convert a BookDesigner html0 file to an ePub book. This is an early attempt and still needs lots of work, but I'm posting it for those that are interested (backups are your friend, etc, etc). Right now, this requires that info-zip's zip.exe be somewhere on the path, and this is Windows only at the moment. I hope to remove both of these requirements at some point.

I could use some help wrapping this up, and I'll happily take advice from anyone who has suggestions.

I've attached the Perl script (when it's a little more polished I'll compile and post the .exe) so you'll need Perl to run it. For those who are just interested in seeing the output, I've attached "Ivanhoe" by Sir Walter Scott in a bookDesigner generated Sony Reader file, and the .epub version as well. Eventually I'll work on a more capability intensive test, but I've already noticed some things that the ePub engine renders better. For example, a line ending in a dash with an endquote can lead to a line with only a dash and endquote in Sony's format. The ePub file seems not to suffer from this.
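
The zip.exe requirement could in principle be dropped by building the archive in code. The sketch below is not taken from bd2epub.pl (its internals aren't shown in this thread) and is written in Python rather than Perl; `pack_epub` and its arguments are hypothetical names. It assumes the book has already been laid out in a directory, and it relies on the ePub container rule that the `mimetype` entry comes first and is stored uncompressed.

```python
import zipfile
from pathlib import Path


def pack_epub(source_dir, epub_path):
    """Zip an unpacked ePub directory without calling an external zip.exe.

    Hypothetical helper: assumes source_dir already holds the book files
    and that epub_path lies outside source_dir.
    """
    source = Path(source_dir)
    with zipfile.ZipFile(epub_path, "w") as archive:
        # The "mimetype" entry must be first in the archive and uncompressed.
        archive.writestr(
            zipfile.ZipInfo("mimetype"),
            "application/epub+zip",
            compress_type=zipfile.ZIP_STORED,
        )
        for path in sorted(source.rglob("*")):
            if not path.is_file():
                continue
            name = path.relative_to(source).as_posix()
            if name == "mimetype":
                continue  # already written above
            archive.write(path, name, compress_type=zipfile.ZIP_DEFLATED)
```

Something like `pack_epub("Ivanhoe", "Ivanhoe.epub")` would then work the same way on Windows and elsewhere, since nothing outside the standard library is involved.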

Timoleon
08-08-2008, 11:46 PM
Well done!:thumbsup:

wallcraft
08-10-2008, 09:00 AM
For those who are just interested in seeing the output, I've attached "Ivanhoe" by Sir Walter Scott in a bookDesigner generated Sony Reader file, and the .epub version as well.

The ePub version is entirely in italics under FBReader. This is because of a missing </div> in Ivanhoe-title.html (after Prior.):

<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;</div>
<div class="epigraph"><em><div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Now fitted the halter, now traversed the cart, And often took leave, — but seemed loath to depart!*</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;* The motto alludes to the Author returning to the stage * repeatedly after having taken leave.</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Prior.</em>
</div>

Should be:
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;</div>
<div class="epigraph"><em>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Now fitted the halter, now traversed the cart, And often took leave, — but seemed loath to depart!*</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;* The motto alludes to the Author returning to the stage * repeatedly after having taken leave.</div>
<div class="justify">&nbsp;&nbsp;&nbsp;&nbsp;Prior.</div></em>
</div>

Adobe DE does not care, because it treats each HTML file separately, but FBReader merges all the HTML into one ebook on startup.
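
A mis-nested tag like this is easy to miss by eye, so a quick automated check can help. The following is only a sketch in Python using the standard `html.parser` module; `NestingChecker` and `find_nesting_problems` are hypothetical names, and the check looks at open/close ordering only, not full XHTML validity.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag.
VOID_TAGS = {"br", "hr", "img", "meta", "link", "input"}


class NestingChecker(HTMLParser):
    """Track open tags on a stack and record closing tags that are out of order."""

    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, line) pairs for tags that are still open
        self.problems = []  # human-readable descriptions of what went wrong

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        line = self.getpos()[0]
        open_tags = [name for name, _ in self.stack]
        if tag not in open_tags:
            self.problems.append(f"line {line}: </{tag}> has no matching opening tag")
            return
        if open_tags[-1] != tag:
            inner, inner_line = self.stack[-1]
            self.problems.append(
                f"line {line}: </{tag}> found while <{inner}> "
                f"from line {inner_line} is still open"
            )
        # Drop the matching tag and anything left open inside it.
        index = len(open_tags) - 1 - open_tags[::-1].index(tag)
        del self.stack[index:]


def find_nesting_problems(html):
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    leftovers = [f"line {line}: <{name}> is never closed" for name, line in checker.stack]
    return checker.problems + leftovers
```

Run over each generated HTML file in turn, a non-empty result would point at the line where a closing tag arrives out of order, as with the `</em>` above.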

I also suggest adding an option to remove the "justify" of each paragraph. Or, better still, promote the justify and the initial spaces of each paragraph (added by BD) into the CSS file (with an option to remove the justify).

HarryT
08-11-2008, 03:07 AM
If you export a BD file as HTML, you get "<DIV align=justify>" on every "normal" paragraph. First thing I do with BD files is a global search and replace, swapping the "align=justify" for an empty string.
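
For anyone who would rather script that search and replace than do it in an editor, here is a minimal sketch in Python. The function name `strip_justify` is made up for illustration, and the pattern assumes the attribute appears as `align=justify`, with or without quotes, in any letter case.

```python
import re

# Matches ' align=justify', ' ALIGN="justify"' and similar, but only when the
# value is followed by whitespace, '>' or '/', so other values are left alone.
JUSTIFY_ATTR = re.compile(r'\s+align\s*=\s*(["\']?)justify\1(?=[\s>/])', re.IGNORECASE)


def strip_justify(html):
    """Return the HTML text with every align=justify attribute removed."""
    return JUSTIFY_ATTR.sub("", html)
```

Other alignments such as `align=center` are untouched, so centred titles stay centred.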

Mr. Goodbar
10-27-2008, 01:47 PM
If you export a BD file as HTML, you get "<DIV align=justify>" on every "normal" paragraph. First thing I do with BD files is a global search and replace, swapping the "align=justify" for an empty string.

I'm missing something. How do you export a BD file? I can't seem to find that option anywhere and my apologies for a stupid question.

HarryT
10-27-2008, 01:49 PM
Just "File/Save As..." on the menu. You'll get a dialog with a list of formats to save in.

Mr. Goodbar
10-27-2008, 02:29 PM
Ok, now I really feel stupid. :smack: Thanks for the help.

Croker
03-17-2009, 04:38 PM
I've created a perl script that will convert a BookDesigner html0 file to an ePub book. This is an early attempt and still needs lots of work, but I'm posting it for those that are interested (backups are your friend, etc, etc). Right now, this requires that info-zip's zip.exe be somewhere on the path, and this is Windows only at the moment. I hope to remove both of these requirements at some point.

I could use some help wrapping this up, and I'll happily take advice from anyone who has suggestions.

I've attached the Perl script (when it's a little more polished I'll compile and post the .exe) so you'll need Perl to run it. For those who are just interested in seeing the output, I've attached "Ivanhoe" by Sir Walter Scott in a bookDesigner generated Sony Reader file, and the .epub version as well. Eventually I'll work on a more capability intensive test, but I've already noticed some things that the ePub engine renders better. For example, a line ending in a dash with an endquote can lead to a line with only a dash and endquote in Sony's format. The ePub file seems not to suffer from this.

Hi! I could really do with some help on this one, as I like to use Book Designer to tidy up my files, but can't convert the html0 format to ePub, which is what I want to use. I've installed Perl and downloaded the script here, but I have no idea what to do next to actually convert the file. Can someone talk me through it?

I suspect that the part that I've highlighted in bold in the quote above is where I'm going wrong, as I have no idea what it means! I've found Info-Zip's Zip.exe, but how do I put it "somewhere on the path"??

Basically, I need to know very basic stuff, too, such as actually where to run the script, etc. I'd appreciate it if anyone can give me a quick step-by-step guide so I can get on with converting my books!

Also, a slightly OT question: a while back, I created an LRF file using Book Designer, embedding a font during the creation. As a result (and I'm sure you can all guess what's coming next), the page turn speed for that particular book is very slow in comparison to "normal" LRF or ePub files.

Would I be right in thinking that, if I stick to using ePub files, and install the font I want to use directly to my PRS-505, the page turn speed will be back to normal?

Thanks for any help you can provide!
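
On the "somewhere on the path" question: a program is "on the path" when the folder containing it is listed in the PATH environment variable, which is where the system looks for a command typed without its full location. A small Python sketch of how to check this follows; `find_on_path` and `path_directories` are hypothetical helper names, and this is not part of bd2epub.pl.

```python
import os
import shutil


def find_on_path(program):
    """Return the full location of a program found via PATH, or None."""
    return shutil.which(program)


def path_directories():
    """List the folders that are searched when a command is run by name."""
    return [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
```

If `find_on_path("zip")` returns None, then either zip.exe needs copying into one of the folders from `path_directories()`, or its own folder needs adding to PATH.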

wallcraft
03-17-2009, 04:49 PM
One option would be to use Calibre. If you want to import the .html0 file you first need to rename it .html, but then the Calibre GUI should be enough to get the file converted to ePub. I don't know if the html0 file is a better starting point than a .html exported from Book Designer or not.

Croker
03-17-2009, 04:53 PM
One option would be to use Calibre. If you want to import the .html0 file you first need to rename it .html, but then the Calibre GUI should be enough to get the file converted to ePub. I don't know if the html0 file is a better starting point than a .html exported from Book Designer or not.

Thanks, I didn't know that - I'll give it a whirl. I'd still like to know how to use the script, though, for comparison purposes.

Do you have any advice on my question regarding fonts, though? :grin:

EDIT: Hmmm. I turned the .html0 file into an html file, added it to Calibre, and then converted it to ePub, but I got the following:

Job: **Convert book: The Beach**
**tuple**: ('SplitError', u'Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB')
**Traceback**:
Traceback (most recent call last):
File "parallel.py", line 942, in worker
File "parallel.py", line 900, in work
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_any.py", line 148, in any2epub
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_html.py", line 393, in convert
File "calibre\ebooks\epub\split.pyo", line 480, in split
File "calibre\ebooks\epub\split.pyo", line 73, in __init__
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 141, in split_to_size
SplitError: Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB

**Log**:
Found OPF file in archive
Creating EPUB file...
[DEBUG] Processing HTMLFile:0:a:c:\users\dom\appdata\local\temp\calibre_0.4.129_adi8af_any2epub1\content\The Beach.html...
Parsing calibre_0.4.129_adi8af_any2epub1\content\The Beach.html
[INFO] Rationalizing fonts...
[DEBUG] Done rationalizing
[DEBUG] Saving stylesheets...
[INFO] Splitting The Beach.html (803 KB)
[INFO] Splitting on page breaks...
[INFO] Looking for large trees...
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #1 (1 KB)
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #2 (0 KB)
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #3 (0 KB)
[DEBUG] Splitting...
[DEBUG] Splitting...
[DEBUG] Splitting...
[DEBUG] Committed sub-tree #4 (2 KB)
[DEBUG] Splitting...
[DEBUG] Splitting...
('SplitError', u'Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB')
Traceback (most recent call last):
File "parallel.py", line 942, in worker
File "parallel.py", line 900, in work
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_any.py", line 148, in any2epub
File "C:\Program Files\calibre\library.zip\calibre\ebooks\epub\from_html.py", line 393, in convert
File "calibre\ebooks\epub\split.pyo", line 480, in split
File "calibre\ebooks\epub\split.pyo", line 73, in __init__
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 154, in split_to_size
File "calibre\ebooks\epub\split.pyo", line 141, in split_to_size
SplitError: Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB

So that didn't work. I'll have to try opening the BD file, saving it as an html, and then trying to convert that instead.

DOUBLE EDIT: Saving the BD file as html and then trying to convert the resulting file produces the same error as above, too. Any ideas?

kovidgoyal
03-17-2009, 05:15 PM
in calibre, set the profile to None (the resulting epub file will work everywhere except on the sony readers).

Croker
03-17-2009, 05:22 PM
Right, I'll try that, but I actually own a Sony Reader, so I'm at a bit of a loss to see how that helps me specifically! :grin:

Cheers for the tip off, though. I'll see if I can actually produce an ePub first, and then worry about getting it on to my Reader later!

EDIT: Aha! That's worked and kept all my formatting, etc, which is great. I've just tried reading it in ADE, and it worked fine. Thanks, Kovid, I really appreciate your help.

However, is it the case that this ePub file I now have cannot be used on my Reader?

wallcraft
03-17-2009, 05:39 PM
SplitError: Could not find reasonable point at which to split: The Beach.html Sub-tree size: 800 KB

Calibre splits up big HTML files on page breaks for the PRS-505. Does it make sense that this 800 KB html file has no chapters or other opportunities for page breaks? If so, then pick a few places for breaks and manually add them in Book Designer. If not, then Calibre's automatic chapter detection is likely failing. This should be obvious in Adobe Digital Editions (no chapters in the TOC). I think Calibre uses h1 and h2 as the default chapter detection marks, but this is customizable.
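
One way to see whether an exported file has any h1 or h2 headings at all is to count its tags. This is only a rough sketch in Python: `TagCounter` and `count_tags` are hypothetical names, it does not reproduce how Calibre actually detects chapters, and the h1/h2 default is a guess rather than something confirmed here.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how often each opening tag appears (tag names are lowercased)."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def count_tags(html):
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return counter.counts
```

If `count_tags(text)["h1"]` and `count_tags(text)["h2"]` both come back as 0 for a large file, heading-based chapter detection would have nothing to work with.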

Croker
03-17-2009, 05:43 PM
Does it make sense that this 800 KB html file has no chapters or other opportunities for page breaks? If so, then pick a few places for breaks and manually add them in Book Designer. If not, then Calibre's automatic chapter detection is likely failing. This should be obvious in Adobe Digital Editions (no chapters in the TOC). I think Calibre uses h1 and h2 as the default chapter detection marks, but this is customizable.

There are page breaks in the file already, though, and I'd already created a TOC in the Book Designer file, before converting it to html.

Anyway, when I followed Kovid's advice above, and set profile to "none" in Calibre, it produced an ePub with the formatting (and working TOC) intact. The only thing missing was the cover image, which I'd dragged and dropped on to the first page when originally creating the BD file (any advice on how I can get the image back into the document, or how I can get the cover image to display when I open the ePub, would be greatly appreciated).

As things seem to be working now, I just need to know if I can use the ePub I've created on my PRS-505. The advice that Kovid gave me above seems to say that I can't. Is this right?

wallcraft
03-17-2009, 05:57 PM
As things seem to be working now, I just need to know if I can use the ePub I've created on my PRS-505. The advice that Kovid gave me above seems to say that I can't. Is this right?

Calibre apparently isn't detecting the page breaks or the Chapters, otherwise it would be able to split up the file (or perhaps there is one really big chapter at some point).

Note that by TOC, I mean a formal ePub TOC (that ADE will display in a side panel) not an internal set of HTML links acting as a TOC. The latter is ok, but it may not necessarily help Calibre detect chapters.

Croker
03-17-2009, 06:25 PM
Calibre apparently isn't detecting the page breaks or the Chapters, otherwise it would be able to split up the file (or perhaps there is one really big chapter at some point).

Note that by TOC, I mean a formal ePub TOC (that ADE will display in a side panel) not an internal set of HTML links acting as a TOC. The latter is ok, but it may not necessarily help Calibre detect chapters.

Yes, but strangely enough, ADE is showing all the chapter titles in the side panel, too, in the same way as a formal ePub TOC. There's only one thing for it, I'll have to just connect up my Reader and give it a bash. Keep your fingers crossed for me!

EDIT: Ah, no, it doesn't work, as Kovid said earlier. You can copy the file across, but when you try to open it, it just says "Page Error!" and won't do anything else.

This is excruciating, to be honest - I'm on the verge of creating files in the format and style that I really want, and can't get them over to my Reader! Gnnnnngh!

kovidgoyal
03-17-2009, 07:08 PM
open up the html file and post an extract from it, say the title and first few paragraphs of chapter 1

Croker
03-17-2009, 07:36 PM
open up the html file and post an extract from it, say the title and first few paragraphs of chapter 1

I'm assuming you mean the actual html source, so here it is:

<SPAN id=title><DIV align=center><FONT color=#001950><B><A name=uBmk_816121><DIV align=center><FONT color=#001950><B>BANGKOK</B></FONT></DIV>
</A></B></FONT>
</SPAN>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;</DIV>
</DIV>
<DIV align=center><I>Bitch</I></DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;The first I heard of the beach was in Bangkok, on the Khao San Road. Khao San Road was backpacker land. Almost all the buildings had been converted into guest-houses, there were long-distance-telephone booths with air-con, the cafés showed brand-new Hollywood films on video, and you couldn't walk ten feet without passing a bootleg-tape stall. The main function of the street was as a decompression chamber for those about to leave or enter Thailand, a halfway house between East and West.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;I'd landed at Bangkok in the late afternoon, and by the time I got to Khao San it was dark. My taxi driver winked and told me that at one end of the street was a police station, so I asked him to drop me off at the other end. I wasn't planning on crime but I wanted to oblige his conspiratorial charm. Not that it made much difference which end one stayed because the police obviously weren't active. I caught the smell of grass as soon as I got out of the cab, and half the travellers weaving past me were stoned.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;He left me outside a guest-house with an eating area open to the street. As I studied it, checking the clientele to gauge what kind of place it was, a thin man at the table nearest me leant over and touched my arm. I glanced down. He was, I guessed, one of the heroin hippies that float around India and Thailand. He'd probably come to Asia ten years ago and turned an occasional dabble into an addiction. His skin was old, though I'd have believed he was in his thirties. The way he was looking at me, I had the feeling I was being sized up as someone to rip off.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;"What?" I said warily.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;He pulled an expression of surprise and held up the palms of his hands. Then he curled his finger and thumb into the O-shaped perfection sign, and pointed into the guest-house.</DIV>

<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;"It's a good place?"</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;He nodded.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;I looked again at the people around the tables. They were mostly young and friendly looking, some watching the TV, and some chattering over their dinner.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;"OK." I smiled at him in case he wasn't a heroin addict, just a friendly mute. "I'm sold."</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;He returned the smile and turned back to the video screen.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;Quarter of an hour later I was settling into a room that was a little larger than a double bed. I can be accurate about it because there was a double bed in the room, and on each of its four sides was a foot of space. My backpack could just slide in the gap.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;One wall was concrete - the side of the building. The others were Formica and bare. They moved when I touched them. I had the feeling that if I leant against one it would fall over and maybe hit another, and all the walls of the neighbouring rooms would collapse like dominoes. Just short of the ceiling, the walls stopped, and covering the space was a strip of metal mosquito netting. The netting almost upheld the illusion of a confined, personal area - until I lay down on the bed. As soon as I relaxed, stopped moving, I began to hear cockroaches scuttling around in the other rooms.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;At my head end I had a French couple in their late teens - a beautiful, slim girl with a suitably handsome boy attached. They'd been leaving their room as I got to mine and we exchanged nods as we passed in the corridor. The other end was empty. Through the netting I could see the light was off, and anyway, if it had been occupied I would have heard the person breathing. It was the last room on the corridor, so I presumed it faced the street and had a window.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;On my ceiling was a fan, strong enough to stir the air on full setting. For a while I did nothing but lie on the bed and look up at it. It was calming, following the revolutions, and with the mixture of heat and soft breeze I felt I could drift asleep. That suited me. West to East is the worst for jet lag, and it would be good to fall into the right sleeping pattern on the first night.</DIV>

<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;I switched off the light. There was a glow from the corridor, and I could still see the fan. Soon I was asleep.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;Once or twice I was aware of people in the corridor, and I thought I heard the French couple coming back, then leaving again. But the noises never woke me fully and I was always able to slip back into the dream I'd been having before. Until I heard the man's footsteps. They were different, too creepy to doze through. They had no rhythm or weight and dragged on the floor.</DIV>
<DIV align=justify>&nbsp;&nbsp;&nbsp;&nbsp;A muttered stream of British swear-words floated into my room as he jiggled the padlock on his door. Then there was a loud sigh, the lock opened with a click, and his light came on. The mosquito netting cast a patterned shadow on my ceiling.</DIV>


Hope this helps!

EDIT: I'm guessing that my problems may be connected to something HarryT flagged up on the first page of this thread - namely, that when exporting from BD, it adds <DIV align=justify> to each paragraph. Am I close?

kovidgoyal
03-17-2009, 07:55 PM
hmm open a ticket and attach the html file

Croker
03-18-2009, 12:54 PM
hmm open a ticket and attach the html file

Done. Thanks for all your help. I look forward to hearing what you make of all this! It's ticket #2098, by the way.

JSWolf
03-18-2009, 04:21 PM
It would be rather nice to be able to make ePub from Book Designer output.

Croker
03-18-2009, 04:29 PM
It would be rather nice to be able to make ePub from Book Designer output.

Well, that's what the script in this thread claims to do, but as I don't understand how to run it properly, I can't test it out!

Any suggestions on that front, whilst Kovid looks into the issues with Calibre, would be greatly appreciated!

kovidgoyal
03-18-2009, 04:44 PM
creating epub from BD output using calibre works fine, apart from the splitting issue, which will be fixed in the next calibre release.

Croker
03-18-2009, 05:00 PM
creating epub from BD output using calibre works fine, apart from the splitting issue, which will be fixed in the next calibre release.

When you say "the splitting issue", is this the first problem I had? The one I posted the long error message for, I mean?

Whatever it is, it's great that the fix will be in the next release, anyway, so I'm chuffed with that. Thanks for your swift response!

Croker
03-20-2009, 06:17 PM
The latest edition of Calibre (0.5.2, I think) cures the problem I had.

I can now convert to ePub using the PRS-505 profile option without it crashing. It seems to work fine, which is nice!

Thanks, Kovid!