Old 05-19-2010, 06:31 PM   #1
Pablo
Guru
Posts: 970
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-505, PRS-T2
BookDesigner HTML0 to clean HTML conversion utility

Many of us still use BookDesigner to edit our books, or have some old backups of books created in the past in "html0" format.
Uncompressed "html0" files are in fact html files, but they do not import well to other programs such as Sigil, and require a lot of manual work to fix them.
I wrote this quick and dirty utility to help me convert my "html0" files to simpler and cleaner html files that can be easily imported with Sigil, and would like to share it with everybody here.

Installation

Just uncompress the attached file and put it anywhere on your disk.

Usage

You will need an uncompressed html0 file as input. If your file is compressed, just open it with BookDesigner. An uncompressed version of the file will be written to the Lastfile directory of BD's installation folder.

Run the file HTML02HTML.exe, press the "File" button and search for your html0 file. After that click on the "Convert" button.

The source file remains unchanged. A new file, with the same name as the source file (but with "html" extension) will show in the same directory as the source file. Copy and paste the "style.css" file from the HTML02HTML directory to this directory. Open the new file with a browser or HTML editor.

Conversion operations

- Book Title and Book Author are converted to H1 tags.
- Titles are converted to H2 tags.
- Subtitles are converted to H3 tags.
- <DIV> tags are converted to <p> tags.
- Paragraph indentation with &nbsp; is removed.
- A link to "style.css" is included.
- <HR> tags are deleted.
- Blank lines are replaced with <br />
- Notes and Links are fixed to work both ways (link to endnotes and "go back" links)
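
As a rough illustration of a few of the operations listed above (<DIV> to <p>, removal of &nbsp; indentation, <HR> deletion, blank lines to <br />), here is a minimal Python sketch. It is not the utility's actual code: the function name convert_line is hypothetical, and the input is assumed to be one <DIV>...</DIV> element per line, which may not match every html0 file.

```python
import re

def convert_line(line: str) -> str:
    """Convert one html0-style line to cleaner HTML (illustrative sketch only)."""
    # Delete <HR> tags, whatever their attributes.
    line = re.sub(r"<hr\b[^>]*>", "", line, flags=re.IGNORECASE)
    # Assumption: each paragraph is a single <DIV ...>text</DIV> on its own line.
    match = re.fullmatch(
        r"\s*<div\b[^>]*>(.*?)</div>\s*", line, flags=re.IGNORECASE | re.DOTALL
    )
    if not match:
        return line
    text = match.group(1)
    # Strip leading &nbsp; entities (and spaces) that were used as indentation.
    text = re.sub(r"^(?:&nbsp;|\s)+", "", text)
    if not text:
        # An empty paragraph stands for a blank line.
        return "<br />"
    return f"<p>{text}</p>"
```

Calling convert_line on each line of the file would produce the cleaned body; headings, notes and the style.css link are left out of this sketch.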

Notes

The program works on Windows XP; I don't know whether it works on Vista or Windows 7.


This is a preliminary version, barely tested with 3 files. Probably buggy.
I would like to have some feedback to make it more useful.


EDIT:

I uploaded a modified version of the utility.
Changes:

- Encoding information copied to the output html.
- Author and Book Title metadata added to the output html. This metadata is recognised as such by Sigil when importing the file.
- Processing of <HR> tags is now user selectable: they can be deleted, kept or replaced by Sigil Chapter Breaks. The latter simplifies file splitting in Sigil 0.2.0.
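
The three <HR> modes described above (delete, keep, replace) could be sketched in Python as follows. This is only an illustration, not the utility's code. The function name process_hr is hypothetical, and the default break_marker is an assumption about what Sigil treats as a chapter break; check it against your Sigil version before relying on it.

```python
import re

HR_PATTERN = re.compile(r"<hr\b[^>]*>", re.IGNORECASE)

def process_hr(
    html: str,
    mode: str = "delete",
    # Assumed marker; verify what your Sigil version expects.
    break_marker: str = '<hr class="sigilChapterBreak" />',
) -> str:
    """Delete, keep, or replace every <HR> tag in the given HTML text."""
    if mode == "keep":
        return html
    if mode == "delete":
        return HR_PATTERN.sub("", html)
    if mode == "replace":
        # A function is used as the replacement so the marker is inserted literally.
        return HR_PATTERN.sub(lambda _match: break_marker, html)
    raise ValueError(f"unknown mode: {mode!r}")
```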

EDIT:

Uploaded new version

EDIT

Uploaded new version

EDIT

Uploaded new version with optional paragraph splitting as requested by JSWolf

EDIT (29 May)

Uploaded new version with 2 new options

EDIT (1 June)

Uploaded new version with support for all BookDesigner styles and minor bug fixes.

EDIT (18 August)
Uploaded new version.
Changes:
- Minor bugfixes in css
- New option "Different style for first paragraph after h2/h3"
- H1 tags (Title and Author) automatically excluded from TOC in Sigil
- <br /> tags are no longer used for blank lines. When there is a blank line before a paragraph, the style of the paragraph is changed to include a margin at the top of it. This produces better results when imported with Sigil (thanks, charlesky, for your help with this).
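
The blank-line handling in the last item can be pictured with a small Python sketch. It is not the utility's code; the function name and the class name "space-before" are hypothetical, and the input is assumed to be a list of lines in which paragraphs already start with a plain <p> tag.

```python
def mark_spaced_paragraphs(lines, spaced_class="space-before"):
    """Drop blank lines and tag the paragraph that follows each one with a CSS class."""
    out = []
    blank_pending = False
    for line in lines:
        if line.strip() == "":
            # Remember the blank line instead of emitting <br />.
            blank_pending = True
            continue
        if blank_pending and line.startswith("<p>"):
            line = f'<p class="{spaced_class}">' + line[len("<p>"):]
        blank_pending = False
        out.append(line)
    return out
```

The stylesheet would then need a matching rule, for example p.space-before { margin-top: 1em; }, with the margin value chosen to taste.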

I am also including an html0 version of Stevenson's "The Master of Ballantrae" (unzip before using) that can be used to test the utility.
Attached Files
File Type: zip HTML02HTML.zip (207.8 KB, 355 views)
File Type: zip Stevenson Robert_The Master of Ballantrae.zip (174.2 KB, 316 views)

Last edited by Pablo; 08-18-2010 at 09:10 PM. Reason: New version uploaded
Old 05-20-2010, 05:10 PM   #2
Pablo
I would like to add a "Smart quotes" option, to change straight quotes to curly quotes.

What would be the best approach? Write the actual characters, or use decimal numeric character references such as &#8220; for the left double quotation mark?

I think this also depends on text encoding. Any ideas?
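
One way to look at the question above: do the conversion on Unicode text first, and only decide between literal characters and numeric character references when writing the file. The Python sketch below shows this under stated assumptions. The function names are hypothetical, the open/close rule is deliberately naive, and it expects plain text (running it over raw HTML would also touch quotes inside tag attributes).

```python
LEFT_DOUBLE = "\u201c"   # left double quotation mark, &#8220;
RIGHT_DOUBLE = "\u201d"  # right double quotation mark, &#8221;

def curl_double_quotes(text: str) -> str:
    """Naive rule: a quote at the start or after whitespace/an opening bracket opens."""
    out = []
    prev = ""
    for ch in text:
        if ch == '"':
            opening = prev == "" or prev.isspace() or prev in "([{"
            out.append(LEFT_DOUBLE if opening else RIGHT_DOUBLE)
        else:
            out.append(ch)
        prev = ch
    return "".join(out)

def encode_for_output(text: str, encoding: str) -> bytes:
    """Encode text; characters the encoding cannot hold become &#NNNN; references."""
    return text.encode(encoding, errors="xmlcharrefreplace")
```

With this split, a UTF-8 output file gets the actual curly characters, while an ASCII or Latin-1 output file gets &#8220; and &#8221; automatically, so the choice follows from the declared encoding rather than being hard-coded.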

Last edited by Pablo; 05-20-2010 at 05:14 PM.
Old 05-21-2010, 05:15 PM   #3
Pablo
I just uploaded a new version of the utility (see first post for the file).
Changes:
- css file generated automatically in the destination directory - no need to copy it manually.
- Added option to change straight quotes to curly quotes ("smart quotes")
- Added results window with save to file option



Notes:
- To use the utility, generate your ebook in BookDesigner format with Make ebooks --> BookDesigner (html0), then open it from BookDesigner with File --> Open book. If you make any changes to the book, you have to repeat the process, because when you save the file BD makes some format changes to the html0 file in Lastfile that this utility does not handle well. Generating the book and opening it again with File --> Open book restores the formatting.

- The utility supports BookTitle, BookAuthor, Paragraph, Title, Subtitle and Notes and Links.

- Verse, text author, annotation and epigraph are not supported by the utility.

- The smart quotes option will not always produce good results. Possessives of words ending in 's' and slang spellings are particularly problematic. When the program finds something problematic, it reports the line in the results window. In that case you will need an editor that displays line numbers, such as Notepad++, to locate the line and correct it manually.
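
To make the last note concrete, here is a hedged Python sketch of how ambiguous single quotes might be detected and reported by line number. It is not how the utility does it; the two patterns and the function name are assumptions chosen to cover the two cases mentioned (possessives after "s", and slang with a leading apostrophe).

```python
import re

# s' followed by a space, punctuation or end of line: possessive or closing quote?
AMBIGUOUS_AFTER_S = re.compile(r"s'(?=\s|[.,;:!?]|$)", re.IGNORECASE)
# ' at the start of a word: slang elision ('em, 'tis) or opening quote?
AMBIGUOUS_LEADING = re.compile(r"(?:^|(?<=\s))'(?=[A-Za-z])")

def find_ambiguous_quotes(text: str):
    """Return (line_number, line) pairs that need a manual look; numbers start at 1."""
    report = []
    for number, line in enumerate(text.splitlines(), start=1):
        if AMBIGUOUS_AFTER_S.search(line) or AMBIGUOUS_LEADING.search(line):
            report.append((number, line))
    return report
```

The reported numbers correspond to what an editor with line numbering shows, so each flagged line can be found and fixed by hand.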
Attached thumbnail: HTML02HTML.jpg (34.1 KB)

Last edited by Pablo; 05-21-2010 at 05:20 PM.
Old 05-22-2010, 08:50 PM   #4
JSWolf
Resident Curmudgeon
Posts: 73,877
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Thanks. This sounds a lot better than saving as HTML from Book Designer. I will give it a go and let you know how it works.
Old 05-23-2010, 07:16 AM   #5
Pablo
Quote:
Originally Posted by JSWolf
Thanks. This sounds a lot better than saving as HTML from Book Designer. I will give it a go and let you know how it works.
Thanks. I am willing to fix any bugs or add any useful features you can think of.
Old 05-23-2010, 01:55 PM   #6
Pablo
Just uploaded a new version:

Changes:
- Fixed bug that caused left or right-justified text to convert incorrectly.
- Paragraphs are now split into several lines of no more than 70 characters each.
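
For readers curious what such splitting looks like, Python's standard textwrap module does something similar. This is only a sketch of the idea, not the utility's implementation; the function name split_paragraph is hypothetical.

```python
import textwrap

def split_paragraph(paragraph: str, width: int = 70) -> str:
    """Re-wrap one paragraph of source text so lines are at most `width` characters."""
    return textwrap.fill(
        paragraph,
        width=width,
        break_long_words=False,   # never cut inside a word or tag; such a line may exceed width
        break_on_hyphens=False,   # keep hyphenated words together
    )
```

Because browsers and e-book readers collapse whitespace in HTML, the extra line breaks change only how the source reads in a text editor, not how the book is displayed.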

See first post for the file
Old 05-23-2010, 01:56 PM   #7
JSWolf
Quote:
Originally Posted by Pablo
Just uploaded a new version:

Changes:
- Fixed bug that caused left or right-justified text to convert incorrectly.
- Paragraphs are now split into several lines of no more than 70 characters each.

See first post for the file
Can you make the splitting of paragraphs an option?
Old 05-23-2010, 02:03 PM   #8
Pablo
Quote:
Originally Posted by JSWolf
Can you make the splitting of paragraphs an option?
That was fast!!!
Yes, I can do that. I will upload a new version in a few hours with that change.
Old 05-23-2010, 06:01 PM   #9
Pablo
Uploaded a new version. The only change is that long paragraph splitting is now optional.
Old 05-23-2010, 11:05 PM   #10
JSWolf
Thanks. I'll give it a go later this week. Maybe Tuesday.
Old 05-29-2010, 05:55 PM   #11
Pablo
Uploaded a new version of the utility.

Changes:
- Added option to suppress inner &nbsp; (non-breaking spaces)
- Added option to add <h> tags around IMG tags, with text Figure 1, 2, ... so that images can be indexed.
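
The image option could work roughly like the Python sketch below. It is an illustration under assumptions, not the utility's code: the function name is hypothetical, the heading is assumed to go on its own line just before each image, and the heading level (h4 here) is a guess because the post does not say which level is used.

```python
import re

IMG_PATTERN = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

def add_figure_headings(html: str, level: int = 4) -> str:
    """Insert a numbered 'Figure N' heading before every <img> tag."""
    counter = 0

    def wrap(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        # Assumed layout: heading line first, then the untouched image tag.
        return f"<h{level}>Figure {counter}</h{level}>\n{match.group(0)}"

    return IMG_PATTERN.sub(wrap, html)
```

Since the figures then carry heading tags, a TOC generated from headings can list them as entries.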



See first post for the utility.
Attached thumbnail: HTML02HTML.jpg (42.5 KB)
Old 06-01-2010, 05:53 PM   #12
Pablo
Uploaded a new version (see first post for the file)

Changes:
- Fixed a bug in the closing tag </h...> in image references.
- Added support for the remaining BD styles (epigraph, text author, verse, annotation, epigraph+text author)
Old 06-14-2010, 03:33 PM   #13
JSWolf
I will have to give this a go later on tonight or tomorrow. I keep forgetting. Sorry.
Old 06-15-2010, 07:03 PM   #14
Pablo
Quote:
Originally Posted by JSWolf
I will have to give this a go later on tonight or tomorrow. I keep forgetting. Sorry.
No problem... I'm very interested in your opinion and suggestions. Unfortunately, I will not be able to resume work on the program for two weeks.
Old 08-18-2010, 09:13 PM   #15
Pablo
I just uploaded a new version (see first post).