09-14-2014, 10:45 AM   #14
eschwartz
Ex-Helpdesk Junkie
Quote:
Originally Posted by AlanHK View Post
Won't that leave all the text using those styles with undefined styles?


The problem is that in one file <p class="p6 sgc-2">October 11</p> is bold, while in another it would be italic.
Knew there was something I was forgetting.

Quote:
That looks useful, but first I need to get all the text tagged consistently.

Anyway, I think I can do this by unzipping the epub and sorting the files into groups with common style definitions, using Far file manager, then doing S&R on groups of files to make them all consistent, then making a new epub.
Yeah, my second regex unreasonably assumes that Tidy named the classes "bold" and "italic" (where did I get that from?), which I blame on the lateness of the hour.

So I can't think of a purely regex way to fix these, matching class to style. You'd have to do each class on its own.
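
If you are willing to step outside a plain search-and-replace, here is a rough sketch of the per-file approach in Python. It is only an illustration, and it rests on assumptions: that each Tidy-generated rule is a single `sgc-N` class selector with one declaration, and that the names in `CANONICAL` (which are hypothetical, pick your own) are what you want the classes to be called. It reads the style rules in one file, works out what each `sgc-N` class means in that file, and renames the class in both the stylesheet and the `class` attributes.

```python
import re

# Hypothetical mapping from a whitespace-stripped CSS declaration to a
# stable class name. Extend it with whatever declarations your files use.
CANONICAL = {
    "font-weight:bold": "bold",
    "font-style:italic": "italic",
}

# Matches the ".sgc-N { ... }" part of a rule such as "p.sgc-2 {font-weight: bold}".
RULE_RE = re.compile(r'\.(sgc-\d+)\s*\{([^}]*)\}')
CLASS_ATTR_RE = re.compile(r'class="([^"]*)"')


def class_map(html):
    """Map each sgc-N class defined in this file to a canonical name, if known."""
    mapping = {}
    for name, body in RULE_RE.findall(html):
        decl = re.sub(r'\s+', '', body).rstrip(';').lower()
        if decl in CANONICAL:
            mapping[name] = CANONICAL[decl]
    return mapping


def normalize(html):
    """Rename sgc-N classes in one file according to that file's own styles."""
    mapping = class_map(html)

    def fix_rule(m):
        new = mapping.get(m.group(1), m.group(1))
        return '.%s {%s}' % (new, m.group(2))

    def fix_attr(m):
        classes = [mapping.get(c, c) for c in m.group(1).split()]
        return 'class="%s"' % ' '.join(classes)

    html = RULE_RE.sub(fix_rule, html)
    return CLASS_ATTR_RE.sub(fix_attr, html)


file_a = '<style>p.sgc-2 {font-weight: bold}</style><p class="p6 sgc-2">October 11</p>'
file_b = '<style>p.sgc-2 {font-style: italic;}</style><p class="p6 sgc-2">October 11</p>'
print(normalize(file_a))
print(normalize(file_b))
```

Classes whose declarations are not in the mapping are left alone, so anything unexpected stays visible for a manual pass.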

Best thing is to avoid this entirely. HTML Tidy is a really annoying crutch.