MobileRead Forums

MobileRead Forums (https://www.mobileread.com/forums/index.php)
-   Sigil (https://www.mobileread.com/forums/forumdisplay.php?f=203)
-   -   Adding a limited Automate Feature To Sigil (https://www.mobileread.com/forums/showthread.php?t=341347)

DiapDealer 09-19-2021 12:21 PM

Quote:

Originally Posted by Doitsu (Post 4155507)
IMHO, 3 lists should be more than enough for most users. You could use one for epub2-specific tasks, one for epub3-specific tasks, and a third one for "special tasks."

Oh, that's my opinion, as well. I was just throwing out a potential workaround for the handful of inevitable future squeaky wheels. :D

un_pogaz 09-19-2021 12:24 PM

2 Attachment(s)
Quote:

IMHO, 3 lists should be more than enough for most users.
It is very probable that I will need more.

I think it's a pity that to use several automation lists you have to rename the linked files. That's too... nerdy. We have a GUI, for god's sake.

Well, I'm aware that it's too late for this update, but maybe an improvement for the next one.

So my idea is inspired by some Calibre plugins that have a dynamic menu system (see pictures).
In the settings window of the plugin, there is a table where each line represents a menu entry associated to a list of operations.
Some advanced versions even include the possibility to make sub-menus.

How all this is stored, no idea, but it would be better than directly manipulating the file names.
A quick suggestion:
automate.txt Contains the association list "Menu entry" <-> "automate file"
automateXX.txt Contains an automate

KevinH 09-19-2021 12:43 PM

Huh? There are no menus. As DiapDealer said, you can use a Plugin (with a gui if a gui is important to you) to control what gets copied into any of the 3 automate lists if you truly need more than 3.

In the future we *may* expand to 5 but we are starting with 3 and we will see what the consensus turns out to be.

Remember that plugins are capable of doing almost anything so if your need to automate is huge then creating custom plugins is probably your best bet anyway.

KevinH 09-19-2021 12:54 PM

FWIW, its arrival is not imminent. We had to change the gui so I wanted to give translators more time to do translations. We have not planned a 1.8 release or date yet and things are still subject to change.

odamizu 09-20-2021 03:22 AM

Hello KevinH,

Attached is the Automate List chapter for the Sigil User Guide. Hope I got everything right. Let me know if it needs adjustment.

~oda :)

KevinH 09-20-2021 08:47 AM

Thank you!
I will incorporate it into the Sigil User Guide.


Update:
Pushed your changes, plus a bunch of minor version number updates to the sigil-user-guide repo.
Thank You!


Quote:

Originally Posted by odamizu (Post 4155707)
Hello KevinH,

Attached is the Automate List chapter for the Sigil User Guide. Hope I got everything right. Let me know if it needs adjustment.

~oda :)


KevinH 09-20-2021 02:53 PM

FYI, I have updated Sigil's IconThemes repo with BeckyEbook's wonderful automate icons. For legacy, I just used the same icons we use for main because there was no "legacy" equivalent since this is a new feature.

un_pogaz 09-21-2021 04:19 AM

Apparently my proposal for a dynamic and extensive solution of the Automate was not understood.
Oh, well, have a nice day.

KevinH 09-21-2021 09:17 AM

Yes, I did not understand how what you posted could help at all.

The Automate feature is designed to create a limited tool/plugin sequential execution list much like a simple .bat file would allow for. Anything more complex can and should be done in custom plugins themselves.

And we already build multi level dynamic menus when plugins are added or removed. Plugins are where any heavy lifting should be done. So there is no need to build a dynamic menu system just to select one of the 3 available automate lists.

And, as DiapDealer said there are workarounds if you really do turn out to need more than 3 automate lists - which is 2 more than the original "big green publish button idea" this feature evolved from.

And of course, if you wanted more of a voice then volunteering to test things and providing early input when it was asked for would generally have been good ideas.

Quote:

Originally Posted by un_pogaz (Post 4156086)
Apparently my proposal for a dynamic and extensive solution of the Automate was not understood.
Oh, well, have a nice day.


RbnJrg 09-21-2021 11:45 AM

Just yesterday I had proof of how powerful this new feature is. I had to work on an epub with 206 style sheets! And on those 206 sheets, I had to run the "Reformat CSS" command. It was enough to drive me crazy.

Then I remembered that Kevin, when he added the "Reformat CSS" command to the list of automated tasks, told me that the command would apply to ALL stylesheets :) That command was never more welcome. I only had to edit Automate List 3 so that it contained just the command "ReformatCSSMultiplesLines", and the problem was solved!

By the way, does anyone happen to know of a tool to consolidate stylesheets? I noticed that many styles are identical but live in different sheets under different names.

phillipgessert 09-21-2021 12:17 PM

There’s surely a more elegant way, but I’d just pull em all out via unzip and concatenate them outside of Sigil, go back to Sigil to trash all 206 within it, pull in the new megastylesheet, and finally run a find/replace on all the xhtml files to repair the now-broken stylesheet link/s in the head.

Tex2002ans 09-21-2021 01:29 PM

Quote:

Originally Posted by RbnJrg (Post 4156224)
By the way, does anyone happen to know of a tool to consolidate stylesheets? I noticed that many styles are identical but live in different sheets under different names.

Calibre EPUB->EPUB conversion.

It's what I do when I export individual chapters as InDesign EPUBs.

So let's say there's 26 different chapters, 26 different EPUBs, all have almost-the-same-but-slightly-different CSS, all with conflicting class names.

I merge all EPUBs together (using Calibre's EPUBMerge plugin), then a Calibre EPUB->EPUB.

This will convert each unique CSS into 1 class.

So these matching classes:

Spoiler:
CSS #1:

Code:

span.CharOverride-3 {
        font-family:"Adobe Garamond Pro Regular", sans-serif;
        font-size:1.181em;
        font-style:normal;
        font-variant:small-caps;
        font-weight:normal;
        text-transform:none;
}

CSS #2:

Code:

span.CharOverride-4 {
        font-family:"Adobe Garamond Pro Regular", sans-serif;
        font-size:1.181em;
        font-style:normal;
        font-variant:small-caps;
        font-weight:normal;
        text-transform:none;
}



will convert into a single "charoverride3".

Side Note: If you've used human-readable names for CSS, you'll probably get a big mess in your HTML, since Calibre will convert everything to the very first matching name:

Code:

<p class="first">This is a first line.</p>
<p class="noindent">This is a typical no indent paragraph.</p>

Code:

p.first {
        text-indent: 0;
}

p.noindent {
        text-indent: 0;
}

after Calibre EPUB->EPUB conversion would turn into:

Code:

<p class="first">This is a first line.</p>
<p class="first">This is a typical no indent paragraph.</p>

But if you already have a giant spaghetti mess of auto-generated classes, it'll make it infinitely easier. :)
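The merging of identical classes described above can be sketched roughly as follows. This is only an illustration, not Calibre's actual implementation: `parse_rules` and `group_identical` are hypothetical helpers, and the regex-based parsing assumes flat rules with no `@media` blocks, comments, or braces inside values.

```python
import re
from collections import defaultdict

# Assumption: flat "selector { declarations }" rules only.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")

def parse_rules(css):
    """Return a list of (selector, frozenset of (property, value)) pairs."""
    rules = []
    for selector, body in RULE_RE.findall(css):
        decls = set()
        for decl in body.split(";"):
            if ":" in decl:
                prop, value = decl.split(":", 1)
                decls.add((prop.strip().lower(), value.strip()))
        rules.append((selector.strip(), frozenset(decls)))
    return rules

def group_identical(css):
    """Return lists of selectors that share exactly the same declarations."""
    groups = defaultdict(list)
    for selector, decls in parse_rules(css):
        groups[decls].append(selector)
    return [selectors for selectors in groups.values() if len(selectors) > 1]
```

Each returned group is a candidate for collapsing into one class. Declaration order is ignored, so two rules that list the same properties in a different order still count as identical.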

RbnJrg 09-21-2021 03:43 PM

Quote:

Originally Posted by Tex2002ans (Post 4156269)
Calibre EPUB->EPUB conversion.

It's what I do when I export individual chapters as InDesign EPUBs.

So let's say there's 26 different chapters, 26 different EPUBs, all have almost-the-same-but-slightly-different CSS, all with conflicting class names.

I merge all EPUBs together (using Calibre's EPUBMerge plugin), then a Calibre EPUB->EPUB.

This will convert each unique CSS into 1 class.

Thanks a lot! I did what you said and Calibre did a great job. It was not perfect, but I can fix the minor "issues" caused by the EPUB -> EPUB conversion. And, most importantly, now I have ONLY ONE stylesheet. I didn't know about that feature of Calibre; thanks for sharing the info. :thanks:

Tex2002ans 09-21-2021 03:51 PM

Quote:

Originally Posted by RbnJrg (Post 4156322)
Thanks a lot! I did what you said and Calibre did a great job. It was not perfect, but I can fix the minor "issues" caused by the EPUB -> EPUB conversion. And, most importantly, now I have ONLY ONE stylesheet. I didn't know about that feature of Calibre; thanks for sharing the info. :thanks:

:thumbsup:

There's also:

Convert books > Convert individually > Look & Feel > Styling

At the very bottom, there's a "Filter Style Information" section with checkboxes for:
  • Fonts
  • Margins
  • Padding
  • Floats
  • Colors

This would remove all that cruft from the CSS files on conversion as well.

Might be helpful for condensing down even more of those "unique" CSS classes.

I only noticed those options a few weeks ago though, so I haven't done in-depth testing.

Might also help convert those hundreds of styles down to a few dozen.

Tex2002ans 09-21-2021 06:08 PM

Quote:

Originally Posted by phillipgessert (Post 4156235)
There’s surely a more elegant way, but I’d just pull em all out via unzip and concatenate them outside of Sigil, go back to Sigil to trash all 206 within it, pull in the new megastylesheet, and finally run a find/replace on all the xhtml files to repair the now-broken stylesheet link/s in the head.

Be very careful, that wouldn't work with conflicting class names:

class="blockquote" in CSS#1

might not be the same as

class="blockquote" in CSS#2.

The Calibre EPUB->EPUB approach would thoroughly go through all HTML+CSS and merge/rename everything for you.

If the two classes are exactly the same, great.

If the two classes are same name, but different CSS, Calibre will merge/create a new class + make sure to properly update the HTML too:

Spoiler:
Before:

Book #1:

Code:

<blockquote>
<p class="blockquote">This is an example.</p>
</blockquote>

Code:

p.blockquote {
        margin-top: 1em;
        margin-bottom: 1em;
        margin-left: 5%;
        margin-right: 5%;
}

Book #2:

Code:

<blockquote>
<p class="blockquote">This is a second example.</p>
</blockquote>

Code:

p.blockquote {
        margin-left: 5%;
}

After:

Book #1:

Code:

<blockquote>
<p class="blockquote">This is an example.</p>
</blockquote>

Book #2:

Code:

<blockquote>
<p class="blockquote2">This is a second example.</p>
</blockquote>

Code:

p.blockquote {
        margin-top: 1em;
        margin-bottom: 1em;
        margin-left: 5%;
        margin-right: 5%;
}

p.blockquote2 {
        margin-left: 5%;
}



Side Note: Similar logic applies to "Removing Unused Styles". You have to pay very close attention to what's happening with edge cases.

In 2021: "Indesign-epub-kindle formatting problem: footnotes export with massive indent", I also explained a more "surgical" approach + discussed a few things to look out for (like accidentally stripping important/busted font information).
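Before concatenating stylesheets by hand, the conflicting-name problem described above can at least be detected. A minimal sketch, assuming flat rules and one rule per selector in each sheet; `parse_sheet` and `find_conflicts` are hypothetical names, not part of Sigil or Calibre:

```python
import re

# Assumption: flat "selector { declarations }" rules only.
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")

def parse_sheet(css):
    """Return {selector: frozenset of (property, value)} for flat rules."""
    rules = {}
    for selector, body in RULE_RE.findall(css):
        decls = set()
        for decl in body.split(";"):
            if ":" in decl:
                prop, value = decl.split(":", 1)
                decls.add((prop.strip().lower(), value.strip()))
        rules[selector.strip()] = frozenset(decls)
    return rules

def find_conflicts(sheets):
    """sheets maps a file name to its CSS text, in concatenation order.

    Returns (selector, first file, conflicting file) for every selector
    that is defined with different declarations in a later sheet.
    """
    first_seen = {}   # selector -> (file name, declarations)
    conflicts = []
    for name, css in sheets.items():
        for selector, decls in parse_sheet(css).items():
            if selector not in first_seen:
                first_seen[selector] = (name, decls)
            elif first_seen[selector][1] != decls:
                conflicts.append((selector, first_seen[selector][0], name))
    return conflicts
```

Every reported selector would need a new name (such as `blockquote2`) in the later sheet and in the HTML files that link to it before a plain concatenation is safe.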

phillipgessert 09-21-2021 06:52 PM

Quote:

Originally Posted by Tex2002ans (Post 4156377)
Be very careful, that wouldn't work with conflicting class names:

<snip>

That's a good point, my suggestion would give undue priority to any same-named stuff pulled in from the later sheets. That fix you proposed seems incredibly useful.

KevinH 09-21-2021 07:52 PM

So are you saying a possible plugin to "merge" stylesheets might be useful?

Does it work on all selectors or only class selectors? How does it treat element selectors that differ across the sheets?

Hmm ... it would also need to:

- split all selector lists out

- find all styles/selectors with identical property value lists and assign them a common class name

- make sure the selectors for unique styles are themselves unique.

- create a class name mapping to fix up the assigned classes in all html files

Anything else?
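The first, second, and fourth steps of that list could look something like this. It is a rough sketch under loud assumptions: flat CSS only (no `@media`, no commas inside selectors such as `:is(a, b)`), only simple `tag.class` or `.class` selectors are merged, and the helper names `split_rules` and `build_class_map` are invented for illustration.

```python
import re

RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")
# Assumption: only plain "tag.class" or ".class" selectors are candidates.
CLASS_RE = re.compile(r"^([A-Za-z][\w-]*)?\.([\w-]+)$")

def split_rules(css):
    """Split selector lists so each selector gets its own declaration dict."""
    styles = {}
    for selectors, body in RULE_RE.findall(css):
        decls = {}
        for decl in body.split(";"):
            if ":" in decl:
                prop, value = decl.split(":", 1)
                decls[prop.strip().lower()] = value.strip()
        for selector in selectors.split(","):
            # Later rules for the same selector override earlier values.
            styles.setdefault(selector.strip(), {}).update(decls)
    return styles

def build_class_map(css):
    """Return {duplicate_class: kept_class} for identical class styles."""
    canonical = {}   # (element, declarations) -> first class name seen
    mapping = {}
    for selector, decls in split_rules(css).items():
        match = CLASS_RE.match(selector)
        if not match:
            continue  # element, id and compound selectors are left alone
        element, name = match.groups()
        kept = canonical.setdefault((element, frozenset(decls.items())), name)
        if kept != name:
            mapping[name] = kept
    return mapping
```

The resulting mapping is what would drive the fix-up of `class` attributes in the HTML files. A class name used on several different elements with different fates would need extra care that this sketch does not attempt.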

RbnJrg 09-21-2021 08:49 PM

Quote:

Originally Posted by Tex2002ans (Post 4156327)
:thumbsup:

There's also:

Convert books > Convert individually > Look & Feel > Styling

At the very bottom, there's a "Filter Style Information" section with checkboxes for:
  • Fonts
  • Margins
  • Padding
  • Floats
  • Colors

This would remove all that cruft from the CSS files on conversion as well.

Might be helpful for condensing down even more of those "unique" CSS classes.

I only noticed those options a few weeks ago though, so I haven't done in-depth testing.

Might also help convert those hundreds of styles down to a few dozen.

Many thanks for that too!

RbnJrg 09-21-2021 09:26 PM

Quote:

Originally Posted by KevinH (Post 4156395)
So are you saying a possible plugin to "merge" stylesheets might be useful?

Imagine :) To have to work with 206 stylesheets or with only one. Calibre saved my day but it would be nice for Sigil to have that feature too (even by means of a plugin).

Quote:

Does it work on all selectors or only class selectors?
I think it should work on all selectors.

Quote:

How does it treat element selectors that differ across the sheets?
Good question. I suppose you are referring to selectors based on tag names (those based on #id or classes are not problematic: if they have the same properties, they must be treated as the same style; otherwise, as different styles). But if one sheet has styles for p, h*, blockquote, etc., and another sheet has different styles for those same selectors, that can be an issue. I can't see any way to solve the problem other than assigning them a class (p.sheet1 or p.s1, p.sheet2 or p.s2, and so on).

Quote:

Hmm ... it would also need to:

- split all selector lists out

- find all styles/selectors with identical property value lists and assign them a common class name

- make sure the selectors for unique styles are themselves unique.

- create a class name mapping to fix up the assigned classes in all html files

Anything else?
In principle, it seems that that plan covers all points.

Tex2002ans 09-21-2021 11:21 PM

2 Attachment(s)
Quote:

Originally Posted by KevinH (Post 4156395)
So are you saying a possible plugin to "merge" stylesheets might be useful?

Perhaps...

I still think Style Mapping would be much more powerful.

I discussed that + "Consolidate Stylesheet" a few months back:

2021: "What Features or Tools does Sigil Still Need Yet?" (Post #163+)

InDesign has such a mapping function when doing EPUB Export.

For import, I believe it also maps Word (DOCX) Styles -> InDesign Styles, so when you import those documents, you can quickly go through a table and say what gets assigned to what.

It speeds up the to-Print workflow dramatically.

Don't see why it couldn't speed up the to-clean-EPUB workflow as well.

Quote:

Originally Posted by KevinH (Post 4156395)
Does it work on all selectors or only class selectors? How does it treat element selectors that differ across the sheets?

Unsure.

I rarely use anything beyond very basic classes in my ebooks, so I haven't done extensive testing into Calibre's innards to see exactly what it does with more complicated selectors.

During a Calibre EPUB->EPUB, I think it converts everything down to individual "calibre##" classes. For example:

Spoiler:

Code:

  <p>Testing</p>

  <blockquote>
    <p>This is an example</p>
    <p>of a larger blockquote.</p>
  </blockquote>

  <p>Testing</p>

Code:

p {
        margin-top: 0;
        margin-bottom: 0;
        text-align: justify;
        text-indent: 2em;
}

blockquote > p:first-child {
        background-color: red;
        margin-top: 1em;
        margin-bottom: 1em;
        text-indent: 0;
}

blockquote > p {
        background-color: yellow;
        padding-top: 1em;
        margin-bottom: 1em;
}



Attachment 189347

Calibre EPUB->EPUB turned into:

Spoiler:

Code:

  <p class="calibre1">Testing</p>

  <blockquote class="calibre2">
    <p class="calibre3">This is an example</p>
    <p class="calibre4">of a larger blockquote.</p>
  </blockquote>

  <p class="calibre1">Testing</p>

Code:

.calibre {
    display: block;
    font-size: 1em;
    padding-left: 0;
    padding-right: 0;
    margin: 0 5pt
    }
.calibre1 {
    display: block;
    text-align: justify;
    text-indent: 2em;
    margin: 0
    }
.calibre2 {
    display: block;
    margin: 1em
    }
.calibre3 {
    background-color: yellow;
    display: block;
    padding-top: 1em;
    text-align: justify;
    text-indent: 0;
    margin: 1em 0
    }
.calibre4 {
    background-color: yellow;
    display: block;
    padding-top: 1em;
    text-align: justify;
    text-indent: 2em;
    margin: 0 0 1em
    }



(Side Note: The red background-color went poof. Suspecting it's a conversion bug.)

Quote:

Originally Posted by KevinH (Post 4156395)
Hmm ... it would also need to:

Hmmm... similar to those Calibre checkboxes, it would be nice to completely strip/ignore certain properties.

Nice to have broad/easy-mode checkbox categories like "Colors" + "Margins" + "Floats".

But also a surgical/advanced-mode where you could specify attributes to strip:
  • letter-spacing
  • orphans
  • widows
  • text-transform
  • [...]

(Maybe a live list of all currently used properties within the CSS?)

So some InDesign cruft like this:

Spoiler:
Code:

p.Block-indent {
        color:#000000;
        font-family:"Minion Pro Medium", sans-serif;
        font-size:0.917em;
        font-style:normal;
        font-variant:normal;
        font-weight:normal;
        line-height:1.182;
        margin-bottom:5px;
        margin-left:36px;
        margin-right:36px;
        margin-top:5px;

        orphans:2;
        page-break-after:auto;
        page-break-before:auto;
        text-align:justify;
        text-decoration:none;
        text-indent:0;
        text-transform:none;
        widows:2;
}

Code:

p.Body-text {
        color:#000000;
        font-family:"Minion Pro Medium", sans-serif;
        font-size:0.917em;
        font-style:normal;
        font-variant:normal;
        font-weight:normal;
        line-height:1.2;
        margin-bottom:0;
        margin-left:0;
        margin-right:0;
        margin-top:1px;

        orphans:2;
        page-break-after:auto;
        page-break-before:auto;
        text-align:justify;
        text-decoration:none;
        text-indent:18px;
        text-transform:none;
        widows:2;
}



* * *

If you check 3... Remove:
  • Margins
  • line-height
  • text-indent

Those classes would now be considered equivalent.

So consolidate .Body-text -> Block-indent in CSS:

Spoiler:
Code:

p.Block-indent {
        color:#000000;
        font-family:"Minion Pro Medium", sans-serif;
        font-size:0.917em;
        font-style:normal;
        font-variant:normal;
        font-weight:normal;
        orphans:2;
        page-break-after:auto;
        page-break-before:auto;
        text-align:justify;
        text-decoration:none;
        text-transform:none;
        widows:2;
}



+ go through and update any HTML:

<p class="Body-text"> -> <p class="Block-indent">
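The "strip, then compare" idea can be sketched in a few lines. This is only an illustration with hypothetical helpers (`parse_decls`, `strip_properties`), and the two declaration strings are abbreviated versions of the InDesign classes quoted above:

```python
def parse_decls(body):
    """Turn 'prop:value; prop:value' into a {property: value} dict."""
    decls = {}
    for decl in body.split(";"):
        if ":" in decl:
            prop, value = decl.split(":", 1)
            decls[prop.strip().lower()] = value.strip()
    return decls

def strip_properties(decls, unwanted):
    """Drop each unwanted property and its longhands (margin -> margin-left)."""
    return {
        prop: value
        for prop, value in decls.items()
        if not any(prop == name or prop.startswith(name + "-") for name in unwanted)
    }

# Abbreviated versions of p.Block-indent and p.Body-text.
block_indent = parse_decls(
    "color:#000000; line-height:1.182; margin-left:36px; text-indent:0; widows:2"
)
body_text = parse_decls(
    "color:#000000; line-height:1.2; margin-left:0; text-indent:18px; widows:2"
)

three = ("margin", "line-height", "text-indent")
two = ("margin", "line-height")

# With all three categories removed the classes become equivalent;
# with only two removed, text-indent still tells them apart.
same_after_three = strip_properties(block_indent, three) == strip_properties(body_text, three)
same_after_two = strip_properties(block_indent, two) == strip_properties(body_text, two)
```

Treating `margin` as a prefix is a design choice in this sketch so that one checkbox-style category covers `margin-top`, `margin-left`, and so on.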

* * *

If you say... Remove:
  • Margins
  • line-height

they'd be extremely close, but at least you'll strip/remove some trash:

Spoiler:
Code:

p.Block-indent {
        color:#000000;
        font-family:"Minion Pro Medium", sans-serif;
        font-size:0.917em;
        font-style:normal;
        font-variant:normal;
        font-weight:normal;
        orphans:2;
        page-break-after:auto;
        page-break-before:auto;
        text-align:justify;
        text-decoration:none;
        text-indent:0;
        text-transform:none;
        widows:2;
}

p.Body-text {
        color:#000000;
        font-family:"Minion Pro Medium", sans-serif;
        font-size:0.917em;
        font-style:normal;
        font-variant:normal;
        font-weight:normal;
        orphans:2;
        page-break-after:auto;
        page-break-before:auto;
        text-align:justify;
        text-decoration:none;
        text-indent:18px;
        text-transform:none;
        widows:2;
}



Update CSS, but do not update the HTML.

* * *

Would be Helpful: After this stage, if you had a "Style Mapper", you'd be able to select these 2 classes, then see their CSS compared side-by-side, highlighting the diffs.

Then you'd be able to:
  • Edit
    • Remove the "text-indent:18px;" line
    • Sigil updates the CSS.
      • (Optionally checks again to see if there's any matching classes that it can consolidate into now.)
  • Merge Left/Right
    • Be able to say:
      • Block-indent -> Body-text
      • OR Block-indent <- Body-text
    • Sigil updates CSS + HTML.
  • Rename
    • Block-indent now called "normal"
    • Sigil updates CSS + HTML:
      • p.Block-indent -> p.normal
      • <p class="Block-indent"> -> <p class="normal">
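The "Rename" step (update CSS + HTML together) might be approximated with two regular expressions. A minimal sketch, assuming double-quoted `class` attributes and class names that do not appear inside CSS strings or URLs; the function names are made up for this example and are not Sigil APIs:

```python
import re

def rename_class_in_css(css, old, new):
    """Rename `.old` in selectors; the lookahead keeps `.old-2` untouched."""
    pattern = r"\.%s(?![\w-])" % re.escape(old)
    return re.sub(pattern, lambda match: "." + new, css)

def rename_class_in_html(html, old, new):
    """Rename `old` inside double-quoted class attributes, dropping duplicates."""
    def fix(match):
        names = []
        for name in match.group(2).split():
            name = new if name == old else name
            if name not in names:
                names.append(name)
        return match.group(1) + " ".join(names) + match.group(3)
    # The lookbehind avoids touching attributes such as data-class="...".
    return re.sub(r'(?<![\w-])(class=")([^"]*)(")', fix, html)
```

A "Merge Left/Right" action would be the same HTML rewrite plus deleting the now-unused rule from the CSS.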

* * *

Side Note: Like I mentioned, I only ran across those Look & Feel screens in Calibre very recently, so I believe there's a way to do this property stripping already...

In Calibre, there is the Transform Styles tab (right next to the Styling tab):

Attachment 189346

... but documentation is sparse + I don't exactly know how useful it would be (yet), since I'm typically dealing with all types of nonsense on a per-book level. (I do see Import/Export button + a GUI to create rules though.)

Right now, I do CSS cleanup manually (using regex) + multiple rounds of Calibre EPUB->EPUB conversions... until I'm satisfied and have a relatively clean base to work from.

But to have a quicker way to:
  • strip/consolidate CSS
  • compare CSS
  • convert/map to human-readable/standard class names

would be absolutely fantastic.

Many times, I'm just looking through tons and tons of cruft only to finally spot the single difference being a:

- font-style: italic;

then I know: "Oh, this should just be a class="italic" (or <i> / <em>)."

Then I do a simple S&R or open up Diap's Toolbag and convert it.

KevinH 09-22-2021 10:16 AM

IMHO, adding all those unneeded classes to the code is really a shame. It creates quite the mess.

And it converts all selectors into class selectors with *non* mnemonic class names destroying the structure of the original css completely.

Perhaps something that simply identifies and removes identical classes would be better / safer / cleaner.

As for a properties filter, that should be easy to now do with a SavedSearch Group and run with the target set to all css stylesheets, all in one command.

That could be run first; then something that identifies and removes extra identical selectors might be enough.
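Outside of Sigil, the same kind of properties filter can be approximated with one regular expression applied to every stylesheet. This is a sketch in Python's `re` syntax, not the Saved Search format itself, and it assumes one declaration per line ending in a semicolon (the layout the multi-line CSS reformat produces). `make_property_filter` and `filter_stylesheets` are hypothetical names.

```python
import re

def make_property_filter(properties):
    """Build a regex that matches whole lines declaring any listed property."""
    names = "|".join(re.escape(prop) for prop in properties)
    pattern = r"^[ \t]*(?:%s)[ \t]*:[^;{}]*;[ \t]*\r?\n" % names
    return re.compile(pattern, re.IGNORECASE | re.MULTILINE)

def filter_stylesheets(stylesheets, properties):
    """stylesheets maps file name -> CSS text; returns filtered copies."""
    prop_filter = make_property_filter(properties)
    return {name: prop_filter.sub("", css) for name, css in stylesheets.items()}
```

Because the property name must start the line, filtering `orphans` does not touch a property that merely ends in the same word.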

But I have never seen anything with 206 stylesheets, I must admit.

If In-Design can handle style mapping from .docx styles, why isn't this handled by In-Design when inputting the .docx files?

KevinH 09-22-2021 11:39 AM

It would be nice to see what one of these real epubs with lots and lots of stylesheets from InDesign looks like to test cleanup ideas on.

If you have access to such an epub *before* the css was converted by Calibre, please run the Borkify Epub plugin on it and post it here or privately PM me with a link so that I can see some of the issues involved and test some approaches.

Thanks

Hitch 09-22-2021 11:47 AM

Quote:

Originally Posted by KevinH (Post 4156550)
It would be nice to see what one of these real epubs with lots and lots of stylesheets from InDesign looks like to test cleanup ideas on.

If you have access to such an epub *before* the css was converted by Calibre, please run the Borkify Epub plugin on it and post it here or privately PM me with a link so that I can see some of the issues involved and test some approaches.

Thanks

Kevin:

Sorry, late to this party. Or...was here, had a work crisis, back. What exactly do you need from INDD?

(I freely admit that Tex's work might be better suited to help you, but I have a fairly huge collection of INDD files...)

Hitch

KevinH 09-22-2021 12:09 PM

I am looking for an epub created from InDesign that uses individual stylesheets (one per chapter) with many chapters (many stylesheets) that I can use to test some ideas for techniques to merge the large number of stylesheets down into a small handful of stylesheets and in the process remap styles if possible.

All hopefully *without* having to convert all selectors to class selectors with non-mnemonic numbered names that end up littering the html.

I am thinking of using ngram scoring to try to identify the most similar set of selector properties (after a filtering step) and presenting those for the user to approve, then doing the merge.

I am thinking that by Pareto's rule we should be able to take a large number of stylesheets and merge them into a much smaller number but keep most of the individuality present.
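One possible reading of the ngram scoring idea is a Jaccard similarity over character trigrams of each rule's normalized declarations. The post does not say which n-gram scheme is meant, so everything here (trigrams, Jaccard, the helper names) is an assumption made for illustration:

```python
def ngrams(text, n=3):
    """Return the set of character n-grams in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def similarity(a, b, n=3):
    """Jaccard similarity of the n-gram sets of two strings (0.0 to 1.0)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a and not grams_b:
        return 1.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)

def normalize(decls):
    """Sort declarations so property order does not affect the score."""
    return ";".join(sorted(f"{prop}:{value}" for prop, value in decls.items()))

def rank_similar(target, styles):
    """Rank every other selector by similarity to target, best first.

    styles maps selector -> {property: value}.
    """
    base = normalize(styles[target])
    scores = [
        (similarity(base, normalize(decls)), selector)
        for selector, decls in styles.items()
        if selector != target
    ]
    return sorted(scores, reverse=True)
```

The top few entries of the ranking are what would be presented to the user for approval before any merge is done.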

Hitch 09-22-2021 01:14 PM

Quote:

Originally Posted by KevinH (Post 4156561)
I am looking for an epub created from InDesign that uses individual stylesheets (one per chapter) with many chapters (many stylesheets) that I can use to test some ideas for techniques to merge the large number of stylesheets down into a small handful of stylesheets and in the process remap styles if possible.

All hopefully *without* having to convert all selectors to class selectors with non-mnemonic numbered names that end up littering the html.

I am thinking of using ngram scoring to try to identify the most similar set of selector properties (after a filtering step) and presenting those for the user to approve, then doing the merge.

I am thinking that by Pareto's rule we should be able to take a large number of stylesheets and merge them into a much smaller number but keep most of the individuality present.

Okay. That wouldn't be one of ours (our production I mean) but it's entirely possible that I have files like that, from other designers that we used for export to ePUB or HTML and subsequent conversion. I will take a look. I mean, to be clear--I know we have had those, but I don't know if I still have one in-house that would be available to Borkify for you. I'll check.

Hitch

KevinH 09-22-2021 03:17 PM

@Hitch,
Tex2002ans has already posted a few test cases for me, so no worries. Thanks,

KevinH

Tex2002ans 09-22-2021 03:19 PM

Quote:

Originally Posted by Hitch (Post 4156592)
Okay. That wouldn't be one of ours (our production I mean) but it's entirely possible that I have files like that, from other designers that we used for export to ePUB or HTML and subsequent conversion. I will take a look. I mean, to be clear--I know we have had those, but I don't know if I still have one in-house that would be available to Borkify for you. I'll check.

:thumbsup: I just PMed KevinH 3 of my examples.

2 InDesign EPUBs -> Merged -> Calibre EPUB->EPUB conversion.

This is where you can see 1 CSS file per chapter + overlapping class names:

Spoiler:

CSS #1:

Code:

p.ParaOverride-1 {
        margin-bottom:0px;
}
p.ParaOverride-2 {
        margin-top:1px;
        text-indent:18px;
}
p.ParaOverride-3 {
        text-indent:14px;
}
p.ParaOverride-4 {
        text-indent:18px;
}
span.CharOverride-1 {
        font-size:1.454em;
}
span.CharOverride-2 {
        font-size:1.091em;
}
span.CharOverride-3 {
        font-size:58%;
        vertical-align:super;
}

CSS #2:

Code:

p.ParaOverride-1 {
        text-align:center;
}
p.ParaOverride-2 {
        margin-top:0px;
        text-align:center;
        text-indent:0px;
}
p.ParaOverride-3 {
        text-align:center;
        text-indent:0px;
}
p.ParaOverride-4 {
        margin-top:2px;
        text-align:center;
        text-indent:0px;
}
span.CharOverride-1 {
        font-family:"Myriad Pro Semibold", sans-serif;
        font-size:1.801em;
        font-style:normal;
        font-weight:normal;
}
span.CharOverride-2 {
        font-family:"Minion Pro Medium";
        font-size:0.909em;
        font-style:normal;
        font-weight:normal;
}
span.CharOverride-3 {
        font-family:"Myriad Pro Semibold", sans-serif;
        font-style:normal;
        font-weight:normal;
}



1 Word -> HTML -> Calibre EPUB->EPUB conversion.

This is where you can see a typical CSS mess:

Spoiler:

Code:

.calibre7 {
    font-family: "Times New Roman", serif
    }

[...]

.calibre12 {
    font-size: 1em
    }
.calibre13 {
    font-family: "Times New Roman", serif;
    font-size: 1em
    }
[...]

.calibre14 {
    font-size: 1.125em;
    line-height: 1.2
    }
.calibre15 {
    color: black;
    font-family: "Garamond", serif;
    font-size: 1em;
    line-height: 1.2
    }
[...]

.calibre17 {
    line-height: 1.2
    }
.calibre18 {
    color: black;
    display: none;
    text-decoration: none
    }
[...]
.calibre20 {
    color: black;
    display: block;
    font-family: "Garamond", serif;
    font-size: 1.48148em;
    font-weight: normal;
    line-height: 1.2;
    page-break-after: avoid;
    text-align: center;
    text-autospace: none;
    margin: 30pt 0
    }
.calibre21 {
    color: black;
    display: block;
    font-family: "Garamond", serif;
    font-size: 1.25926em;
    font-weight: normal;
    line-height: 1.2;
    page-break-after: avoid;
    text-align: justify;
    text-autospace: none;
    margin: 20pt 0
    }



Quote:

Originally Posted by KevinH (Post 4156561)
I am looking for an epub created from InDesign that uses individual stylesheets (one per chapter) with many chapters (many stylesheets) that I can use to test some ideas for techniques to merge the large number of stylesheets down into a small handful of stylesheets and in the process remap styles if possible.

InDesign's EPUB export actually only outputs a single CSS file.

When designing a print book, one type of workflow is:

- individual "chapter file"s
- then link them together into a single "book file".

(This allows you to easily swap/remove chapters, auto-renumber pages/endnotes, etc.)

In my case though, as a converter, I don't have that single "book file"... I only get the 20 separate "chapter file"s.

So, when I'm exporting, I export each individual chapter -> EPUB... hence the 20 different similar-but-not-quite CSS files.

Mix Direct Formatting and lots of other cruft in there, and you get a giant, conflicting mess on your hands.

IF I had the monolithic "book file", I'd be able to export a single EPUB... you'd still have a spaghetti mess, but at least no conflicting names. :P

(Same as cleaning up Word->HTML, etc. etc.)

Quote:

Originally Posted by KevinH (Post 4156561)
All hopefully *without* having to convert all selectors to class selectors with non-mnemonic numbered names that end up littering the html.

Yeah, I don't believe InDesign or Word/LibreOffice generates complicated selectors.

I think they all just break it down to individual classes.

So the bulk of consolidate/cleanup would probably be this simple conversion cruft:

Code:

.class1 {
        text-align: center;
}
.class2 {
        text-align: center;
        font-size: 1em;
}
.class3 {
        text-align: center;
        font-size: .9em;
}

not necessarily trying to tackle all the advanced CSS3 selectors, etc.

Quote:

Originally Posted by KevinH (Post 4156534)
If In-Design can handle style mapping from .docx styles, why isn't this handled by In-Design when inputting the .docx files?

Hmm... the Export (Styles Mapper) is definitely there.

I'm not familiar with Import. (I don't actually use InDesign, I only know enough to get text OUT OF IT as soon as possible.)

I believe it's built-in. See this video as one example:

Nukefactory: "How to import text into InDesign without losing basic formatting"

But as usual, the thing is:
  • 99+% of people don't use Styles
  • they don't use them consistently
    • lots of Direct Formatting
  • and InDesign Styles =/= Word Styles
    • InDesign is much more powerful.
  • Print-focused designers probably don't have one clue about HTML or ebooks
    • That's just technical gobbledygook. Everything looks fine with my eyes!
    • And hey, great, InDesign "exports" EPUBs. Looks "perfect" on my iPad!!! What's the problem?

So each stage in the conversion workflow has the potential to introduce nonsense or lose key information.

And again, as a converter... I don't have control over what these people are doing in intermediate steps. I just have to clean up the cruft and create the ultimate ebook. :D

Minor Rant:

Spoiler:
Grumble, grumble.

My latest is trying to get them to understand the text:

Code:

For more information, click here and here.

might be 'usable' in a web article... but this type of text CANNOT be used in a physical book (and is very very bad in an ebook).


Quote:

Originally Posted by KevinH (Post 4156561)
I am thinking of using ngram scoring to try to identify the most similar set of selector properties (after a filtering step) and presenting those for the user to approve of, then doing the merge.

Yeah, I was thinking of something similar. A similarity score.

You click on a class, it ranks everything that's close.

Then you can Shift+Click or Ctrl+Click and merge the classes together.
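A rough sketch of such a ranking, in Python. It uses Jaccard similarity over character trigrams of each class's declarations as a stand-in score; that is only one possible n-gram scheme, not necessarily what KevinH has in mind, and every name here (`similarity`, `rank_similar`, the sample classes) is hypothetical.

```python
def trigrams(text):
    """Set of character 3-grams of a string."""
    return {text[i:i + 3] for i in range(len(text) - 2)}

def canonical(props):
    """Sorted 'prop:value;' string, so declaration order does not matter."""
    return "".join(f"{p}:{v};" for p, v in sorted(props.items()))

def similarity(a, b):
    """Jaccard similarity of the trigram sets of two declaration dicts."""
    ta, tb = trigrams(canonical(a)), trigrams(canonical(b))
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def rank_similar(target, rules):
    """Other class names, ordered from most to least similar to target."""
    scores = [(similarity(rules[target], props), name)
              for name, props in rules.items() if name != target]
    return [name for _, name in sorted(scores, key=lambda s: (-s[0], s[1]))]

# Made-up sample data in the shape {class_name: {property: value}}.
rules = {
    "class1": {"text-align": "center"},
    "class2": {"text-align": "center", "font-size": "1em"},
    "class3": {"text-align": "center", "font-size": ".9em"},
    "note": {"color": "red", "margin-left": "2em"},
}
ranking = rank_similar("class2", rules)
print(ranking)
```

Clicking "class2" would list class3 first, then class1, then note. Stripping no-op declarations such as `font-size: 1em` before scoring would move class1 to the top, which is one reason to filter first.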

* * *

Usually, I try to do this stripping/consolidating in passes. Clean up:
  • Fonts
  • Colors
  • font-size
  • italics
  • superscripts
  • [...]

and at each stage, I try to merge what I can to my "normalized" (human-readable) classes:
  • All classes with "vertical-align: super"
    • I'll try to convert to class="super" or <sup>.
  • Many classes with "font-style: italic"
    • I'll try to convert to class="italics" or <i>/<em>.
  • Colors (black text + blue links), I'll instantly strip.
    • Then take a closer look at oddities (red, orange, green text, etc.).
      • Sometimes these things slip in (especially when authors are doing "Track Changes").
    • Commonly see very dark gray text instead of black.
      • CMYK -> RGB or copy/paste-from-other-source issue.
      • Once I spot the shade of gray and see it's irrelevant, I strip it.
  • All classes with "font-size: 1em;"
    • I remove that line.
  • Most fonts
    • I'll remove the CSS for main text font, then take a look at classes that DON'T use that font.
    • For example, the book is "Times New Roman", but there's a few classes with "Arial" or "Symbol" or something different. I'll take a closer look to see exactly where/how that was used.
      • Very common when there's Greek letters or Maths symbols.
  • [...]
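The passes above could be sketched in Python roughly like this. It assumes the CSS has already been parsed into a `{class_name: {property: value}}` dict; the main font, the list of "black" values, and all the names are hypothetical and would differ per book.

```python
def strip_pass(rules, should_strip):
    """Copy rules, dropping declarations where should_strip(prop, value) is true."""
    return {
        name: {p: v for p, v in props.items() if not should_strip(p, v)}
        for name, props in rules.items()
    }

# Hypothetical settings for one book; every book needs its own.
MAIN_FONT = "times new roman"
BLACKS = {"black", "#000", "#000000"}

# One predicate per pass, in the order the passes are run.
PASSES = [
    lambda p, v: p == "font-family" and v == MAIN_FONT,
    lambda p, v: p == "color" and v in BLACKS,
    lambda p, v: p == "font-size" and v == "1em",
]

rules = {
    "body1": {"font-family": "times new roman", "color": "#000000", "font-size": "1em"},
    "greek": {"font-family": "symbol", "font-size": "1em"},
    "sup1": {"vertical-align": "super", "font-size": ".7em", "color": "black"},
}
for should_strip in PASSES:
    rules = strip_pass(rules, should_strip)

# Classes that still set a font are the ones worth a closer look.
odd_fonts = {name: props["font-family"] for name, props in rules.items()
             if "font-family" in props}
# Classes with nothing left are candidates for removal from the HTML.
empty = [name for name, props in rules.items() if not props]
print(rules, odd_fonts, empty)
```

Keeping each pass as its own predicate makes it easy to run them one at a time and inspect what is left before moving on.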

This is where I got excited when I stumbled upon that Calibre "Transform Styles" tab.

It will allow me to at least come up with a set of some property-stripping rules that would save some time.

But the frustrating thing about Calibre EPUB->EPUB is it changes the class names.

And it's hard to know ahead-of-time what junk is going to be in this specific book! Each one will introduce their own unique niggles:

Like one book might use font-size: .88889em, another might have .888em and .8em.

One book might be typeset in "Times New Roman" with "Arial" crept in, another book "Arial" as the main with "Times New Roman" crept in.
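Since the junk differs per book, a quick survey of the values actually present can help decide what to strip. A small Python sketch, assuming the CSS is already parsed into a `{class_name: {property: value}}` dict (the sample data and the `survey` helper are invented for illustration). It counts classes, not how often each class is used in the text, so the "main font" guess is only a hint.

```python
from collections import Counter

def survey(rules, prop):
    """Count how many classes use each value of one property."""
    return Counter(props[prop] for props in rules.values() if prop in props)

# Made-up sample data for one book.
rules = {
    "p1": {"font-family": "times new roman", "font-size": ".88889em"},
    "p2": {"font-family": "times new roman", "font-size": ".888em"},
    "p3": {"font-family": "times new roman", "font-size": ".8em"},
    "p4": {"font-family": "arial", "font-size": ".888em"},
}

fonts = survey(rules, "font-family")
main_font, _ = fonts.most_common(1)[0]   # most frequent value is the likely main font
oddities = sorted(v for v in fonts if v != main_font)
sizes = survey(rules, "font-size")
print(main_font, oddities, dict(sizes))
```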

This is why I mostly do CSS consolidation as THE VERY FIRST STEP after merging, then do successive rounds of EPUB->EPUB to make sure I get down to more bare bones.

But, of course, at later stages, when looking at CSS details, that's when you spot more consolidation that could've been done.

(Hence, a nice GUI, CSS Comparison/Merger, Style Mapper, etc.)

Quote:

Originally Posted by KevinH (Post 4156561)
I am thinking that by Pareto's rule we should be able to take a large number of stylesheets and merge them into a much smaller number but keep most of the individuality present.

:thumbsup: :thumbsup:

Hitch 09-22-2021 04:26 PM

Quote:

Originally Posted by Tex2002ans (Post 4156621)
:thumbsup: I just PMed KevinH 3 of my examples.

2 InDesign EPUBs -> Merged -> Calibre EPUB->EPUB conversion.

What, the ONE TIME I could have actually been useful, since they stopped accepting donations and you send him the files! You are Dastardly!

Quote:

InDesign's EPUB export actually only outputs a single CSS file.
Yes ^.

Quote:

When designing a print book, one type of workflow is:

- individual "chapter file"s
- then link them together into a single "book file".

(This allows you to easily swap/remove chapters, auto-renumber pages/endnotes, etc.)

In my case though, as a converter, I don't have that single "book file"... I only get the 20 separate "chapter file"s.
You should be getting a master "file" called an indb. However, to use it, you'd have to have INDD.

Quote:

So, when I'm exporting, I export each individual chapter -> EPUB... hence the 20 different similar-but-not-quite CSS files.

Mix Direct Formatting and lots of other cruft in there, and you get a giant, conflicting mess on your hands.

IF I had the monolithic "book file", I'd be able to export a single EPUB... but you'd still have a spaghetti mess, but no conflicting names. :P

(Same as cleaning up Word->HTML, etc. etc.)
Oh, god, yes.

My latest rant, unrelated to Kevin and Diap, is that we're getting INDD package files from "designers" that

Do
Not
Use
Any
Styles
At
All.


Apparently, they licensed INDD on Monday, got a job on Wednesday and now, even though they couldn't spell DEZINER on Monday, now they IZ ONE!

GRRRRRMMMMMMMBLLLL



Quote:

:thumbsup: :thumbsup:
Yes, me too^^^

Hitch

phillipgessert 09-22-2021 05:47 PM

Quote:

Originally Posted by Hitch (Post 4156641)

My latest rant, unrelated to Kevin and Diap, is that we're getting INDD package files from "designers" that

Do
Not
Use
Any
Styles
At
All.



Isn’t that most of em? Just kidding, most do use styles. Gives em something to constantly override.

Tex2002ans 09-22-2021 06:36 PM

Quote:

Originally Posted by Hitch (Post 4156641)
What, the ONE TIME I could have actually been useful, since they stopped accepting donations and you send him the files! You are Dastardly!

You're still useful, Hitch, you're still suuuuper useful... (not :D).

Still look for some baaaad InDesign files.

I sent relatively tame ones.

(I know I have some really disgusting ones over the years, but I just grabbed one off the top of my head + my latest from two weeks ago.)

You've definitely seen a lot worse stuff than me too. Oh boy... some of your horror stories...

Quote:

Originally Posted by Hitch (Post 4156641)
You should be getting a master "file" called an indb. However, to use it, you'd have to have INDD.

Because of Adobe's cloud/subscription bullshit, and insisting on INDD only opening in the latest-and-"greatest" version...

... I fall back to IDML files.

This allows me to open it up in whatever older InDesign version I have.

There's no way I'm going to pay some preposterous fee, only to open the files for a split second and export EPUBs out.

Quote:

Originally Posted by Hitch (Post 4156641)
Apparently, they licensed INDD on Monday, got a job on Wednesday and now, even though they couldn't spell DEZINER on Monday, now they IZ ONE!

GRRRRRMMMMMMMBLLLL

I only met one, one typesetter, who knew what he was doing with Styles.

He was a pleasure to meet/talk with in-person, and he had a very nice Styles workflow. (Also in charge of typesetting monthly/quarterly journals + lots of other random documents.)

I haven't had the pleasure to work directly with him though.

But that's the kind of person I could trust coming up with a consistent Styles workflow from:

Word Styles -> InDesign Styles -> HTML/EPUB Classes -> Sigil cleanup.

But that requires someone who knows what they're doing at that Input/Output stage.

(Similar to those videos I link to showing how to use Word to clean up Styles + catch Direct Formatting. [LibreOffice 7.1 recently added a Styles Inspector too!])

Word Styles... a few of the editors I work with know/use them... but the problem there is lots of them are working in CMSes (Content Management Systems) or Google Docs... so they're mostly working directly on the web.

Yes, they may get a Word file initially from an author, but then it quickly goes into web collaboration... and then, with many fingers in the pie, more and more Direct Formatting and crap gets introduced.

(Not to mention all the absolute hidden TRASH that gets introduced when doing Comments, Track Changes, copying/pasting from the web, using the Rich Text Editors, using the mobile version of these apps, etc.)

Complete Side Note: LibreOffice is going to be having their 2021 conference tomorrow:

https://blog.documentfoundation.org/...-participants/

There's dozens of talks with lots of interesting information in there. :)

I'm very interested in the:

Built-in "Xray" like UNO object inspector

which is similar to the Inspector built into web browsers. You'd be able to open up a document and see the DOM + exactly what properties are set:

https://tomazvajngerl.blogspot.com/2...inspector.html

That might help in debugging some particularly tricky documents.

un_pogaz 09-23-2021 06:44 AM

Quote:

Originally Posted by KevinH (Post 4156157)
And of course, if you wanted more of a voice then volunteering to test things and providing early input when it was asked for would generally have been good ideas.

You have a point, I should have followed and made this suggestion earlier.

But in my defense, I didn't care about the development details because I thought it would be logical that you develop this feature as it is already the case for plugins: an extensive list of which the GUI buttons are only shortcuts to the favorites.
Hence my bad surprise to see a static system.

(Oh, and stop developing Sigil according to your own specification and preference. Some things I understand for technical reasons, but "oh no, I just need 3, deal with it" is stupid, you're not the only one using it.)

DiapDealer 09-23-2021 07:56 AM

Quote:

Originally Posted by un_pogaz (Post 4156830)
(Oh, and stop developing Sigil according to your own specification and preference. Some things I understand for technical reasons, but "oh no, I just need 3, deal with it" is stupid, you're not the only one using it.)

We will continue to develop Sigil as we see fit (while continuing to welcome the recommendations of users who don't present themselves as rude and entitled). If you'll note: there are several USERS in this thread who have acknowledged that 3 lists will likely be sufficient.

Please consider not posting here any more if you can't do so without lashing out with personal insults when development doesn't go the way you'd like. YOU'RE not the only one using Sigil either. If you can't refrain on your own, I will help you in that regard.

KevinH 09-23-2021 09:58 AM

Quote:

Originally Posted by un_pogaz (Post 4156830)
You have a point, I should have followed and made this suggestion earlier.

But in my defense, I didn't care about the development details because I thought it would be logical that you develop this feature as it is already the case for plugins: an extensive list of which the GUI buttons are only shortcuts to the favorites.
Hence my bad surprise to see a static system.

(Oh, and stop developing Sigil according to your own specification and preference. Some things I understand for technical reasons, but "oh no, I just need 3, deal with it" is stupid, you're not the only one using it.)

Wow, so much anger over a new feature! We've removed nothing from Sigil. Interested users took an idea and helped us refine it to something they thought would be useful. That seems like a very good development model to me.

You did not bother to become involved, provided no feedback at all during the process, and then whined and complained when it did not turn out the way you wanted. You are like a child on a playground when things do not go their way. And your ability to win friends and influence people astounds me.

And you obviously do not understand how Open Source software development works. Open source has been and will always be about the developers "scratching their own itch first" followed by merit-based input from interested and involved users who "earn" their voice. Look at all the users who have already earned their voice by volunteering their time to help others here on the forums, help test, help report bugs, help when asked for input, etc. That is what an open source community is all about. If you do not like that, then use something else instead of leeching off the goodwill and hard work of others.

Themycles 11-28-2021 02:39 PM

Log Window
 
Is there a way to stop the log window from opening unless there is an error?

KevinH 11-28-2021 02:56 PM

No, since errors may cascade or be ignored depending on earlier decisions. So the Log needs to be examined to determine full success as well as failure for each step.


All times are GMT -4. The time now is 08:11 PM.
