Old 10-29-2011, 08:45 PM   #10
shamanNS
Wizard
Posts: 1,125
Karma: 12345678
Join Date: Feb 2010
Location: Serbia
Device: Kindle PW5, Kobo Libra 2, Kindle PW1
Those HTML files all come from a book publisher, distributed on CD and intended to be viewed in a web browser. In the browser it looks OK, and if I enlarge the text with Ctrl + it reflows nicely. There is no visible table, and I don't know why they used table tags. Maybe it's because the TOC is always visible in the left corner, so they used tables instead of frames. As far as I noticed there is only one pair of <td> and <tr> tags, opened before the chapter title and closed after all the book text, so the whole chapter text sits in one "table" (HTML code-wise; there is no visible table when viewing the HTML in a web browser). So no, there are no tables I need to keep.

Anyway, I am converting it to .mobi with Calibre (import the zipped HTML into Calibre, convert that to MOBI) and reading it on my Kindle. I've only converted one book so far. It looks normal on the Kindle: text reflows when I change the text size, and paragraphs have indents. The book has a TOC, and I can also navigate through chapters by pressing the left/right buttons on the Kindle.
I manually removed those bolded lines and then converted it for the Kindle.

Before that I tried importing the book into Calibre without touching those HTML files, via one HTML file that references all the chapter HTML files, and converted it to .mobi, but the book had 2 problems when viewed on the Kindle or in Mobipocket Reader for PC:
1) when moving the cursor I realized that the whole "page" is actually in one box of text (it displayed a border around the whole "page" text)
2) when I press the next page button on the Kindle it shows a blank page, and on the second button press it skips to the next chapter.

That is why I tried editing the HTML files and deleting the <td> and <tr> tags. After that the book looks normal, as if it were proper HTML. There is no need to clean any more HTML junk out of them.
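
For anyone who would rather not delete the tags by hand, here is a minimal sketch of the same cleanup in Python. It is only an illustration, not the method used above: the function name `strip_table_tags` and the sample markup are made up, and including `<table>` and `<tbody>` alongside `<td>` and `<tr>` is an assumption about what the publisher's files contain. It removes only the tags themselves and keeps everything between them.

```python
import re

# Matches opening and closing <table>, <tbody>, <tr> and <td> tags,
# including any attributes. Assumes no attribute value contains ">".
TABLE_TAG = re.compile(r"</?(?:table|tbody|tr|td)\b[^>]*>", re.IGNORECASE)

def strip_table_tags(html: str) -> str:
    """Remove table wrapper tags but keep the content inside them."""
    return TABLE_TAG.sub("", html)

# Hypothetical chapter markup: the whole text wrapped in a single table cell.
sample = '<table><tr><td class="text"><h2>Chapter 1</h2><p>Some text.</p></td></tr></table>'
print(strip_table_tags(sample))
```

A regex is enough here only because the tags are being dropped wholesale. If some tables had to be kept, a real HTML parser would be the safer choice.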

I'm off to sleep now (it's 2 AM here), so I will test your regex tomorrow.

Last edited by shamanNS; 10-29-2011 at 08:47 PM.