MobileRead Forums - View Single Post

capidamonte · 06-30-2016, 11:32 PM

I agree that it could and possibly should be an import mode in the editor. But I don't see it as "fixes" exactly, although one can argue that any attempt to simplify the markup and make it more robust is a fix.

I use regex when I do ebook fixes too, although not everyone can. But my idea here was to avoid having to examine the entire book for minor variations in styling, and instead have the wizard group as many of them as possible for me, without my having to design regex to collect them myself. Then I'd just push each collected group (after some sort of visual confirmation that each item is a proper member of the group) into a set of selectors I already prefer, which cover the elements I find common to most books (fiction or non-fiction books, at least). Those minor variations are harder to find when you design regex individually, case by case. And as I implied, you cannot easily catch the individual variations in books that were poorly designed in the first place.
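To make the grouping idea concrete, here is a rough sketch of how a wizard might collect classes whose styling is effectively the same. Everything in it is an assumption for illustration: the class names (`calibre5` and so on) are made up, the regex only handles flat single-class rules, and `ignore` is just one possible way to treat "minor variations" as equivalent. A real tool would want a proper CSS parser.

```python
import re
from collections import defaultdict

# Assumption: only simple rules of the form ".name { ... }" are handled.
RULE_RE = re.compile(r"\.([A-Za-z_][\w-]*)\s*\{([^}]*)\}")

def parse_declarations(body):
    """Turn 'a: b; c: d' into a frozenset of normalized (property, value) pairs."""
    pairs = set()
    for decl in body.split(";"):
        if ":" not in decl:
            continue
        prop, value = decl.split(":", 1)
        pairs.add((prop.strip().lower(), " ".join(value.split()).lower()))
    return frozenset(pairs)

def group_classes(css, ignore=()):
    """Group class names whose declarations match once 'ignore' properties are dropped."""
    groups = defaultdict(list)
    for name, body in RULE_RE.findall(css):
        key = frozenset(p for p in parse_declarations(body) if p[0] not in ignore)
        groups[key].append(name)
    return [sorted(names) for names in groups.values()]

# Hypothetical stylesheet of the kind a conversion might produce.
css = """
.calibre5 { text-indent: 1.5em; margin: 0 }
.calibre9 { margin: 0; text-indent: 1.5em; }
.calibre12 { text-indent: 1.5em; margin: 0; font-size: 1em }
.calibre7 { font-style: italic }
"""
groups = group_classes(css, ignore={"font-size"})
for members in groups:
    print(members)
```

Each printed group is what you would then confirm visually before assigning it to one of your own selectors.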

It could be tedious in a book with terrible styling (and thus hundreds of practically individual styles), but certainly no more tedious than determining regex for some of the books I've seen.

It would also allow you to get as fine-grained with your selectors as you like, and help to standardize the markup across your collection of books. You could perhaps even, optionally, transfer the original styling of the book into your preferred selectors, which would preserve much of the design if you found you liked that book's look -- then you're just editing a familiar set of CSS selectors into any style you later prefer, instead of trying to mentally model the result of whatever terrible naming convention the Word/InDesign/Calibre intersection has produced. (I'm not criticizing Calibre here; it's just the nature of the conversion beast.)
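As a sketch of the "push each group into my preferred selectors" step, something like the following could rewrite the class attributes once you have decided which old names map to which of your own. The mapping and the names in it (`body-text`, `emphasis`) are hypothetical, and the regex assumes double-quoted `class` attributes; it is not how any existing editor does it.

```python
import re

def rename_classes(html, mapping):
    """Rewrite class attributes, replacing old names with preferred ones."""
    def fix(match):
        names = []
        for name in match.group(2).split():
            new = mapping.get(name, name)  # unmapped names are left alone
            if new not in names:           # avoid duplicates after merging
                names.append(new)
        return match.group(1) + '"' + " ".join(names) + '"'
    # Assumption: class attributes use double quotes.
    return re.sub(r'(class\s*=\s*)"([^"]*)"', fix, html)

# Hypothetical mapping from converter-generated names to preferred selectors.
mapping = {"calibre5": "body-text", "calibre9": "body-text", "calibre7": "emphasis"}
html = '<p class="calibre5">A</p><p class="calibre9 calibre7">B</p>'
result = rename_classes(html, mapping)
print(result)
```

Transferring the book's original look would then be a matter of copying one representative set of declarations from each group into the matching preferred selector in the stylesheet.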

Of course, I'm talking about working on the classes that actually apply to something in the book, but I wouldn't want to strip out a class that is referenced in the markup just because it has no style applied -- that markup information might actually be useful. I'm thinking, for instance, of something like every paragraph's first letter -- it could be marked up but not styled, and I certainly wouldn't want to lose that information.
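A rough way to tell those cases apart is to compare the classes used in the markup against the classes defined in the stylesheet. This is only a sketch with made-up sample markup and class names, and the regexes assume double-quoted attributes and plain class selectors:

```python
import re

def classes_in_markup(html):
    """Collect every class name used in class="..." attributes."""
    used = set()
    for attr in re.findall(r'class\s*=\s*"([^"]*)"', html):
        used.update(attr.split())
    return used

def classes_in_css(css):
    """Collect class names that appear in selectors (declaration bodies are blanked first)."""
    selectors_only = re.sub(r"\{[^}]*\}", "{}", css)
    return set(re.findall(r"\.([A-Za-z_][\w-]*)", selectors_only))

# Hypothetical sample: 'first-letter' is marked up but never styled.
html = '<p class="para"><span class="first-letter">O</span>nce upon a time</p>'
css = ".para { text-indent: 1em } .oldstyle { color: red }"

used = classes_in_markup(html)
defined = classes_in_css(css)

unstyled_markup = used - defined  # structural markup worth keeping
unused_rules = defined - used     # rules that apply to nothing in the book
print(sorted(unstyled_markup), sorted(unused_rules))
```

Only the second set is a candidate for removal; the first is exactly the information I'd want the wizard to leave alone.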

I think what I'm focused on here is the book structure, which is the most difficult part of properly marking up a book and definitely the most difficult part of cleaning up bad design. It is generally the first and most important thing to get right, and the hardest part to deal with programmatically.
