02-10-2014, 03:40 AM   #105
unboggling
Wizard
unboggling ought to be getting tired of karma fortunes by now.
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
Quote:
Originally Posted by LadyKate
OK, I tend to think of cleanup as starting with HTML.

HTML can be obtained by opening an ePub or a Mobi file from Calibre, or by saving an RTF, DOC, or DOCX file as HTML in an editor that handles it.

A PDF file can be converted to HTM or HTML using Acrobat Pro (I only have version 7, lol; I don't use it enough to buy a newer version), a word processor that can translate to HTML, or Mobipocket Creator, which generates an HTML file as part of the process of building the PRC.

In other words, using any method I can find, I translate my original document to HTML, perhaps even taking an old text file and going through it adding tags. (I can't find the PHP files I had that used a bunch of rules for creating paragraphs out of a flat TXT file. It took me quite a while to write them and figure out the regex for finding all the characters found in a paragraph) ...
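
For the flat-text case mentioned in the quote above, here is a minimal sketch of how paragraph tags might be added with a regex. It is not the lost PHP rule set, whose details aren't given. It assumes paragraphs are separated by blank lines and that lines within a paragraph are just hard-wrapped; the function name `text_to_html_paragraphs` is made up for illustration.

```python
import html
import re


def text_to_html_paragraphs(text: str) -> str:
    """Wrap blank-line-separated blocks of plain text in <p> tags.

    Assumption: a blank line (possibly containing only whitespace)
    marks a paragraph break; single newlines are hard wraps.
    """
    # Normalize Windows and old Mac line endings to "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")

    # Split wherever one or more blank lines occur.
    blocks = re.split(r"\n\s*\n", text.strip())

    paragraphs = []
    for block in blocks:
        # Rejoin hard-wrapped lines into a single line of text.
        joined = " ".join(
            line.strip() for line in block.split("\n") if line.strip()
        )
        if joined:
            # Escape &, < and > so the text is safe inside HTML.
            paragraphs.append(f"<p>{html.escape(joined, quote=False)}</p>")

    return "\n".join(paragraphs)


if __name__ == "__main__":
    sample = "First line\nwrapped.\n\nSecond & last."
    print(text_to_html_paragraphs(sample))
```

Text files that mark paragraphs by indentation rather than blank lines would need a different splitting rule, so treat this only as a starting point.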
You want to clean the HTML, or generate clean HTML. This thread isn't the best place to discuss that. Like I said, I generally ignore the code level (HTML/XHTML, CSS). My knowledge/skills there are at the low end of the learning curve. I fix formatting problems interfering with readability if they are quickly fixable, but for my purpose, reading books for enjoyment, I don't care if the underlying code is clean or not.

btw, take a look at Toxaris' Word macro for clean HTML code:

https://www.mobileread.com/forums/showthread.php?t=142530

(for Word on Windows or OS X)

Last edited by unboggling; 02-13-2014 at 08:56 PM. Reason: clarify.