10-19-2017, 10:47 AM   #30
AlanHK
Guru
Posts: 668
Karma: 929286
Join Date: Apr 2014
Device: PW-3, iPad, Android phone
Quote:
Originally Posted by slowsmile
It's certainly possible to have spaces in filenames in the Book Browser. But to cure that in plugin code would be quite a difficult task because there are strict IDPF relationships and dependencies
Yes, but Sigil handles all that when you use the GUI rename. I hoped a plugin could use that; I suppose it's not so easy.

Quote:
Also, converting several different encodings within your file to unicode is also not such an easy task. These problems are usually related to the user using too many applications to edit or format his or her doc or code and not ensuring that these editing environments are in unicode.
These aren't documents made to be part of an epub, but HTML web files I'm repurposing.

As it is, I do a bunch of S&R and convert to character entities so it all ends up as plain ASCII.
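
For anyone wanting to script that entity step rather than run it as a series of S&R passes, here is a minimal Python sketch. It is not what I actually run; the function name `to_ascii_entities` is made up for illustration, and it assumes the text has already been decoded correctly. It relies on the standard `xmlcharrefreplace` error handler, which turns every non-ASCII character into a numeric character reference.

```python
def to_ascii_entities(text: str) -> str:
    """Return text with every non-ASCII character replaced by a
    numeric character reference such as &#233; (hypothetical helper)."""
    # Encoding to ASCII with 'xmlcharrefreplace' substitutes &#NNNN;
    # for anything outside ASCII; decoding gives back a plain str.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_entities("café"))  # prints caf&#233;
```

This produces numeric references only, not named ones like `&eacute;`, and it leaves existing ASCII markup untouched.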

Quote:
Originally Posted by slowsmile
I could be wrong but I think Sigil also does a basic unicode check whenever you open an epub or import an html file into Sigil.
No. It just gives you a bunch of garbage without any notice.
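
A check like that is easy enough to do outside Sigil before importing. The following is only a sketch under my own assumptions: the function name `decode_or_report` is hypothetical, and falling back to Windows-1252 is a guess that suits many old web pages but certainly not all of them.

```python
from typing import Tuple


def decode_or_report(raw: bytes) -> Tuple[str, str]:
    """Decode raw bytes as UTF-8 if they are valid UTF-8; otherwise
    fall back to cp1252 and say so (hypothetical helper)."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is Windows-1252. Bytes that are
        # undefined in cp1252 become U+FFFD so they can be spotted.
        return raw.decode("cp1252", errors="replace"), "cp1252 (guessed)"
```

The second value in the result tells you whether the file needs a closer look before it goes anywhere near an epub.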

Quote:
Originally Posted by slowsmile
My meaning of the word "redundant" used above is wrt unnecessary style property declarations in the css. Here's an example:
It would be clearer if you listed all the properties removed.


Quote:
Originally Posted by slowsmile
Sorry again, but I can't accommodate your wish with this one either. For two reasons. First I've found the best way to format your ebook as a Word doc or ODT doc to be converted to epub is to ensure that you have created named paragraph styles for all your headings, text and spacing
If by "hard linebreaks" you mean carriage return/linebreak characters, they don't matter; my concern is with "html tags that are empty or that contain just spaces".

Again, these are not files I created, but ones I'm importing or trying to fix.

Some books have a blank para between every text para. These have no significance and I will just delete them. Others, though, have them before a section break, and I will use them to insert a proper code.
e.g.
Search: <p></p> <p>
Replace: <p class="topspace">

Or maybe <p></p><p></p> <p>
or whatever. They should not be in the final epub, but if they are simply filtered out I lose information.

Blank paragraphs sometimes indicate a vertical space is required; sometimes they are useless side effects. So I would never want to just delete them before being sure they are bathwater and not baby.
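
The search and replace above could be scripted along these lines. This is a rough sketch, not a finished tool: `mark_section_breaks` is a made-up name, and the pattern assumes bare `<p>` tags with no attributes, which real files often won't have.

```python
import re

# One or more paragraphs that are empty or hold only whitespace,
# followed by the opening tag of the next paragraph.
# Assumption: all the tags involved are bare <p> with no attributes.
EMPTY_RUN = re.compile(r"(?:<p>\s*</p>\s*)+<p>")


def mark_section_breaks(html):
    """Fold each run of blank paragraphs into a class on the paragraph
    that follows it. Returns (new_html, number_of_runs_replaced)."""
    return EMPTY_RUN.subn('<p class="topspace">', html)
```

The count is the useful part. If it comes out close to the number of paragraphs in the file, the blanks are between every para and mean nothing; if it is a handful, they are probably section breaks worth keeping as `topspace`. Either way I would look at the file before letting anything like this loose on it.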

I've been in publishing for over 20 years; I gave up long ago trying to get authors to follow any formatting rules. 90% of them have no clue what a style is and do not want to know. So the files I get are never to spec and require inspection to see if the code reflects the intention.

Last edited by AlanHK; 10-20-2017 at 12:50 AM.