Old 03-25-2008, 09:11 AM   #34
Moonraker
Addict
Posts: 314
Karma: 1002965
Join Date: Mar 2006
Location: UK
Device: ILiad. Gen 3, PocketBook 360, Kobo Aura HD, Kindle Oasis 2
Quote:
Want to try a new (actually very old) .prc-->.html converter. It's called 'makedocN' and was hacked together by me almost four years ago.
I tried this on one of my own .prc creations.

I thought your 'makedocN' converter was excellent. It preserved the layout, retained all # numeric punctuation, and retained all the original tags. It is very fast and extremely easy to use.

I would still remove the following code inserted by Mobipocket Creator:

<div height="0em"></div> <div height="0em"></div>

but because the closing </p> is preserved, this would be an easy find-and-replace task.
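The find-and-replace described above could be sketched roughly as follows. This is only an illustration, assuming Python and its standard `re` module (neither is part of the original workflow); the function name `strip_empty_divs` is hypothetical.

```python
import re

# Matches the empty spacer divs that Mobipocket Creator inserts,
# tolerating minor whitespace differences and any trailing whitespace.
EMPTY_DIV = re.compile(r'<div\s+height="0em"\s*>\s*</div>\s*', re.IGNORECASE)

def strip_empty_divs(html: str) -> str:
    """Remove every <div height="0em"></div> spacer from the markup."""
    return EMPTY_DIV.sub("", html)

sample = '<p>Some text.</p><div height="0em"></div> <div height="0em"></div><p>More text.</p>'
print(strip_empty_divs(sample))
```

Any text editor with a plain search-and-replace would do the same job; the regular expression only adds tolerance for stray whitespace.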

I would change the headings to my preferred simple, clean form:

<h4>Chapter Number</h4>
instead of
<h4 align="center"><font size="+1"><b>Chapter Number</b></font></h4>
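A heading clean-up like the one above might be automated along these lines. This is a sketch only, assuming Python's standard `re` module and assuming every heading follows exactly the `<h4 align="center"><font size="+1"><b>` pattern shown; `simplify_headings` is a made-up name.

```python
import re

# Captures the heading text inside the nested <font> and <b> wrappers.
HEADING = re.compile(
    r'<h4\s+align="center">\s*<font\s+size="\+1">\s*<b>(.*?)</b>\s*</font>\s*</h4>',
    re.IGNORECASE | re.DOTALL,
)

def simplify_headings(html: str) -> str:
    """Rewrite the verbose heading markup as a bare <h4>...</h4>."""
    return HEADING.sub(r"<h4>\1</h4>", html)

verbose = '<h4 align="center"><font size="+1"><b>Chapter Number</b></font></h4>'
print(simplify_headings(verbose))
```

Headings that deviate from this exact nesting are left alone, so the result would still need a visual check.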

I don't care that it doesn't batch convert.


I then tried it on Harry's Lorna Doone Vol 1.prc.

I wanted the code to have some white space so that I could read it easily, so I ran the file through Tidy.exe:

Tidy reported:
1777 warnings, 52 errors were found! Not all warnings/errors were shown.
This document has errors that must be fixed before using HTML Tidy to generate a tidied up version.


So Tidy could not work on the file until the errors had been fixed.

I changed the html header to my own one and removed every <mbpagebreak/> and put it through Tidy again.

This time Tidy reported:
1454 warnings, 0 errors were found!

and the tidied code was easier for me to read.

Tidy.exe had corrected all out-of-date upper-case tags to lower case, changed <b> to <strong> and <i> to <em>, changed every &nbsp; to &#160;, corrected <br/> to <br />, and removed all <font> tags. All en and em dashes were shown as ? because the original file did not contain the correct HTML code for them.
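The named-to-numeric entity conversion can also be done without Tidy. The following is a minimal sketch, assuming Python and its standard `html.entities` table; it is not how Tidy itself works, and `to_numeric_entities` is a hypothetical name. It cannot recover dashes that have already been turned into ?, only entities that are still present by name.

```python
import re
from html.entities import name2codepoint

# Entities that are left by name because they are safe everywhere.
KEEP = {"amp", "lt", "gt", "quot"}
NAMED = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def to_numeric_entities(html: str) -> str:
    """Replace named entities such as &nbsp; with numeric ones such as &#160;."""
    def repl(match):
        name = match.group(1)
        if name in KEEP or name not in name2codepoint:
            return match.group(0)  # leave unknown or protected entities alone
        return f"&#{name2codepoint[name]};"
    return NAMED.sub(repl, html)

print(to_numeric_entities("a&nbsp;b &mdash; c &amp; d"))
```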

But most disconcerting of all, every paragraph started with <div> and ended with </div>. I hate this because all the other <div this> and <div that> tags make cleaning up the HTML code a horrendous task.
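For the simplest case only, where a paragraph is wrapped in a bare <div> with no attributes and no nested divs, a conversion to <p> might look like the sketch below. It assumes Python's standard `re` module and deliberately leaves every other kind of div untouched, so it does not solve the general problem described above; `divs_to_paragraphs` is a made-up name.

```python
import re

# A bare <div>, then content that contains no opening or closing div tag,
# then </div>. Divs with attributes are deliberately not matched.
PLAIN_DIV = re.compile(r"<div>((?:(?!</?div\b).)*)</div>", re.DOTALL)

def divs_to_paragraphs(html: str) -> str:
    """Turn attribute-less, innermost <div>...</div> wrappers into <p>...</p>."""
    return PLAIN_DIV.sub(r"<p>\1</p>", html)

mixed = '<div>Hello <em>world</em></div><div class="x"><div>Inner</div></div>'
print(divs_to_paragraphs(mixed))
```

Whether a bare div really is a paragraph is a judgement the pattern cannot make, so the output would need checking by eye.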

I use Textpipe frequently to clean up bad html code and I doubt that even this fine programme could easily sort out all these divs.

So, if the original .prc file contains good clean code, it is a quick and easy task to clean up the resultant html. But if it contains bad, outdated code, then it ain't so easy.

I understand that the code generated by, say, BookDesigner is adequate for creating good-looking ebooks, but a peek under the hood reveals out-of-date, bloated code. My aim is to future-proof my html files with good clean code that will render faster, take up less space and convert to any format for most reading devices.

Thank you, and well done, nrapallo, for your makedocN converter. I shall be using it frequently.