05-19-2010, 06:31 PM   #1
Pablo
Guru
Posts: 972
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-T2, Kindle Paperwhite 11th gen
BookDesigner HTML0 to clean HTML conversion utility

Many of us still use BookDesigner to edit our books, or have some old backups of books created in the past in "html0" format.
Uncompressed "html0" files are in fact html files, but they do not import well into other programs such as Sigil, and fixing them takes a lot of manual work.
I wrote this quick and dirty utility to help me convert my "html0" files to simpler and cleaner html files that can be easily imported with Sigil, and would like to share it with everybody here.

Installation

Just uncompress the attached file and put it anywhere on your disk.

Usage

You will need an uncompressed html0 file as input. If your file is compressed, just open it with BookDesigner. An uncompressed version of the file will be written to the Lastfile directory of BD's installation folder.

Run the file HTML02HTML.exe, press the "File" button and search for your html0 file. After that click on the "Convert" button.

The source file remains unchanged. A new file with the same name as the source file (but with the "html" extension) will appear in the same directory as the source file. Copy the "style.css" file from the HTML02HTML directory into this directory, then open the new file with a browser or HTML editor.
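
As a rough illustration of the file handling described above (not the utility's actual code), the output name and the manual "style.css" copy could look like this in Python. The function names and the `tool_dir` parameter are made up for the example.

```python
from pathlib import Path
import shutil


def output_path_for(source: Path) -> Path:
    """Return the converted file's path: same folder and name, 'html' extension."""
    return source.with_suffix(".html")


def copy_stylesheet(tool_dir: Path, source: Path) -> Path:
    """Copy style.css from the utility's folder (assumed to be tool_dir)
    into the folder that holds the source and converted files."""
    target = source.parent / "style.css"
    shutil.copyfile(tool_dir / "style.css", target)
    return target
```

For example, `output_path_for(Path("books/Ballantrae.html0"))` gives `books/Ballantrae.html`; the source file itself is never touched.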

Conversion operations

- Book Title and Book Author are converted to H1 tags.
- Titles are converted to H2 tags.
- Subtitles are converted to H3 tags.
- <DIV> tags are converted to <p> tags.
- Paragraph indentation with &nbsp; is removed.
- A link to "style.css" is included.
- <HR> tags are deleted.
- Blank lines are replaced with <br />.
- Notes and links are fixed to work both ways (links to the endnotes and "go back" links).
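
To make a few of these operations concrete, here is a minimal Python sketch of three of them (DIV to p, removing the &nbsp; indentation, deleting HR tags). It is only an illustration under simplifying assumptions: it is not the utility's code, the function name is hypothetical, and real html0 markup may carry attributes and styles that this sketch simply discards.

```python
import re


def clean_html0(text: str) -> str:
    """Apply three of the listed clean-ups to a fragment of html0-like markup."""
    # <DIV ...> ... </DIV> becomes <p> ... </p>; any attributes are dropped.
    text = re.sub(r"<div\b[^>]*>", "<p>", text, flags=re.IGNORECASE)
    text = re.sub(r"</div\s*>", "</p>", text, flags=re.IGNORECASE)
    # Remove indentation made of &nbsp; entities at the start of a paragraph.
    text = re.sub(r"<p>(?:\s*&nbsp;)+\s*", "<p>", text)
    # Delete <HR> tags together with the whitespace that follows them.
    text = re.sub(r"<hr\b[^>]*>\s*", "", text, flags=re.IGNORECASE)
    return text
```

A regular-expression pass like this is fine for simple, regular markup; for anything nested or irregular, a proper HTML parser would be the safer choice.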

Notes

The program works on Windows XP; I don't know whether it works on Vista or Windows 7.


This is a preliminary version, barely tested with 3 files. Probably buggy.
I would like to have some feedback to make it more useful.


EDIT:

I uploaded a modified version of the utility.
Changes:

- Encoding information copied to the output html.
- Author and Book Title metadata added to the output html. This metadata is recognised as such by Sigil when importing the file.
- Processing of <HR> tags is now user selectable: they can be deleted, kept or replaced by Sigil Chapter Breaks. The latter simplifies file splitting in Sigil 0.2.0.
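
The three HR choices can be sketched roughly as follows in Python. This is an illustration, not the utility's code: the mode names are invented, and the chapter-break markup (`<hr class="sigilChapterBreak" />`) is an assumption about what Sigil of that era expects, so check it against your Sigil version.

```python
import re

# Assumed Sigil chapter-break marker; verify against the Sigil version in use.
SIGIL_BREAK = '<hr class="sigilChapterBreak" />'


def process_hr(text: str, mode: str) -> str:
    """Handle <HR> tags according to mode: 'keep', 'delete' or 'chapter_break'."""
    if mode == "keep":
        return text
    # Any other mode name raises KeyError, which flags a typo early.
    replacement = {"delete": "", "chapter_break": SIGIL_BREAK}[mode]
    return re.sub(r"<hr\b[^>]*>", replacement, text, flags=re.IGNORECASE)
```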

EDIT:

Uploaded new version

EDIT

Uploaded new version

EDIT

Uploaded new version with optional paragraph splitting as requested by JSWolf

EDIT (29 May)

Uploaded new version with 2 new options

EDIT (1 June)

Uploaded new version with support for all BookDesigner styles and minor bug fixes.

EDIT (18 August)
Uploaded new version.
Changes:
- Minor bug fixes in the CSS.
- New option "Different style for first paragraph after h2/h3".
- H1 tags (Title and Author) automatically excluded from the TOC in Sigil.
- <br /> tags are no longer used for blank lines. When there is a blank line before a paragraph, the style of that paragraph is changed to include a margin at the top. This produces better results when the file is imported with Sigil (thanks to charlesky for the help with this).
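
The blank-line change can be pictured with a small Python sketch. It is a simplified illustration rather than the utility's code: the input is treated as plain text lines, and the class name "space-before" and its CSS rule are hypothetical.

```python
# Hypothetical CSS rule that would go with the class used below.
SPACED_RULE = "p.space-before { margin-top: 1em; }"


def paragraphs_with_spacing(lines):
    """Wrap each non-blank line in <p>; a blank line before a paragraph
    is expressed as a class on that paragraph instead of a <br /> tag."""
    out = []
    blank_before = False
    for line in lines:
        if not line.strip():
            # Remember the blank line, but emit nothing for it.
            blank_before = True
            continue
        cls = ' class="space-before"' if blank_before else ""
        out.append(f"<p{cls}>{line.strip()}</p>")
        blank_before = False
    return out
```

With the input `["One.", "", "Two.", "Three."]`, only the second paragraph gets the extra class, so the spacing travels with the paragraph.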

I am also including an html0 version of Stevenson's "The Master of Ballantrae" (unzip before using) that can be used to test the utility.
Attached Files
File Type: zip HTML02HTML.zip (207.8 KB, 416 views)
File Type: zip Stevenson Robert_The Master of Ballantrae.zip (174.2 KB, 382 views)

Last edited by Pablo; 08-18-2010 at 09:10 PM. Reason: New version uploaded