eBook Formatting in Sigil (https://www.mobileread.com/forums/showthread.php?t=333348)

Tex2002ans 09-28-2020 12:01 PM

Quote:

Originally Posted by FDPuthuff (Post 4040171)
Also, has there been anything published that is more recent and updated as far as what works now vs. five or six years ago. How to set up Word, etc...? :cool:

Last year we had some discussion in:

"MS Word vs Open Office Word"

(And watch those two Styles videos I linked to, they explain the WHY and the HOW.)

Once you create very clean source documents, then every future step becomes easier. :)

And once you understand the basic ideas of Styles, I think you'll be able to more easily wrap your head around basic ebook code+CSS.

exaltedwombat 09-28-2020 12:16 PM

To be honest, if your publication is straightforward narrative or information, you might as well compose in WordPad. Or even straight into Sigil. Apply styles to headings and sub-headings, set up the metadata and you're done.

Ok, that's over-simplifying a BIT :-) But most eBook problems stem from messy source.

Hitch 09-28-2020 12:22 PM

Quote:

Originally Posted by Tex2002ans (Post 4040400)
Last year we had some discussion in:

"MS Word vs Open Office Word"

(And watch those two Styles videos I linked to, they explain the WHY and the HOW.)

Once you create very clean source documents, then every future step becomes easier. :)

And once you understand the basic ideas of Styles, I think you'll be able to more easily wrap your head around basic ebook code+CSS.

There you are. You seem to have been MIA of late. ???

Hitch

FDPuthuff 09-28-2020 06:48 PM

This has answered my question of 'How to use Word as written.'
Thanks for the video links.

Tex2002ans 09-28-2020 11:11 PM

Quote:

Originally Posted by FDPuthuff (Post 4040550)
This has answered my question of 'How to use Word as written.'
Thanks for the video links.

:thumbsup:

I think Styles are the #1 most important step you can learn.

You start thinking in "purpose" instead of "looks":
  • Purpose (Good)
    • "Chapter 1" is a Heading.
    • These three asterisks mean scenebreak.
  • Looks (Bad)
    • "Let me click the Centered button, the Italics button, the size 48 font dropdown"
    • ... and let me repeat that for all my chapter titles throughout the entire book, then hope I don't make a mistake.
    • ... and then let me push this other list of buttons for all the asterisks.

Code:

              Chapter 1 <--- Heading 1

This is an example first sentence. <--- First.

              * * *  <--- Scenebreak.

And the beginning of a new scene.  <--- First.

    Today, it was a dark and stormy night.  <--- A normal paragraph.

Now that you have everything marked with Styles (by its purpose), you can say:
  • "Hey, make all my Headings Centered + Bold + This fancy font."
  • "Hey, I want my First paragraphs to have no indent."
  • "Hey, I want each scenebreak to be Centered and have a larger gap above/below."

Now when you export your clean DOCX to ebooks, all that information will be transferred over!

Code:

<h1>Chapter 1</h1>

<p class="first">This is an example first sentence.</p>

<p class="scenebreak">* * *</p>

<p class="first">And the beginning of a new scene.</p>

<p>Today, it was a dark and stormy night.</p>

Wow, now that's some nice stuff! :D
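As a rough sketch, the three "Hey, make all my..." requests above could translate into a stylesheet like the one below. The class names "first" and "scenebreak" come from the HTML above; every size and margin value is only an illustrative assumption, and the "fancy font" is left out on purpose.

```css
/* Headings: centered + bold (values are illustrative, not a recommendation) */
h1 {
  text-align: center;
  font-weight: bold;
  font-size: 1.5em;
  margin: 2em 0 1em 0;
}

/* Normal paragraphs: indented, no gap between them */
p {
  margin: 0;
  text-indent: 1.5em;
}

/* First paragraph after a heading or scenebreak: no indent */
p.first {
  text-indent: 0;
}

/* Scenebreak: centered, with a larger gap above/below */
p.scenebreak {
  text-align: center;
  text-indent: 0;
  margin: 1em 0;
}
```

Change one rule and every heading (or every scenebreak) in the book changes with it, which is the whole point of marking purpose instead of looks.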

Quote:

Originally Posted by Hitch (Post 4040413)
There you are. You seem to have been MIA of late. ???

Heh, yep yep. Past month I completely dropped off the radar... was digitizing a ~2 million word beast. (10+ more volumes of a journal.)

Spellchecking/Grammarchecking is done, and now it's just correcting the little things here and there.

FDPuthuff 09-29-2020 11:46 AM

Quote:

Originally Posted by Tex2002ans (Post 4040650)
:thumbsup:
I think Styles are the #1 most important step you can learn.

Thank you sir. Most definitely Styles are the way.

I need to completely rewrite my checklist for Formatting prep. But, this is for sure going to save time. :cool:

So, when you have a 'perfectly' styled Word doc and you send it through Calibre, or your app of choice, you get your EPUB file. In my tests, I see a lot of HTML that I am not as familiar with. As long as it looks like I expect in an eReader, or previewer, am I good as far as the code? Will the EPUB file work on most eReaders young and old? Or are there 'things' I need to change the tags on because it just makes things unreadable on older devices and/or phones?

Thanx for the input.

Tex2002ans 09-29-2020 02:37 PM

Quote:

Originally Posted by FDPuthuff (Post 4040859)
As long as it looks like I expect in an eReader, or previewer, am I good as far as the code?

For the most part, if you KISS (Keep It Simple Stupid), you'll be fine.

I wrote a bit about some "things to look out for" in Post #53+ in "Why is it so hard to preserve blank lines?".

Also look at those 2 Reddit threads I linked where I gave basic code samples (Scenebreaks, Fleurons, etc.).

There's a handful of other key "things to look out for", like:
  • color text
    • If you force black text, then turn on Night Mode, you'll get black-on-black text.

That one's very easy to miss, since if you're in Word/Print, it's always on a white page. But ebooks are different: people can read using all different font/background colors.

Side Note: Amazon will ding you if you force the text's font color for your book. It's against their Kindle Publishing Guidelines (PDF).
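A minimal sketch of the difference, assuming ordinary body text in plain <p> tags (the indent value is arbitrary):

```css
/* Risky (shown commented out): forcing black text.
   If the reader switches to a dark background, this can end up
   black-on-black. */
/* p { color: #000000; } */

/* Safer: declare no color on body text at all and let the
   reading system pick one that matches the chosen background. */
p {
  margin: 0;
  text-indent: 1.5em;
}
```

When cleaning a converted file, searching the CSS for "color:" is a quick way to spot declarations like the commented-out one.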

Quote:

Originally Posted by FDPuthuff (Post 4040859)
In my tests, I see a lot of HTML that I am not as familiar with.

Most tools export a bunch of detrimental/useless garbage (especially for ebooks).

Quote:

Originally Posted by FDPuthuff (Post 4040859)
Will the EPUB file work on most eReaders young and old? Or are there 'things' I need to change the tags on because it just makes things unreadable on older devices and/or phones?

For a basic Fiction book, you barely need any CSS classes. They can mostly be boiled down to:
  • Actual Book
    • Headings
    • First paragraphs
    • Scenebreaks
    • occasionally Center/Right aligned text
  • Frontmatter
    • Title/Subtitle
    • Copyright Page (maybe smaller font, and/or gaps between paragraphs)
    • Table of Contents (negative indent stuff)

Almost everything else should be a simple <p> (paragraph).
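To give a feel for how small that list is, here is a sketch of the alignment and frontmatter classes. All class names (center, right, title, subtitle, copyright, toc) are hypothetical, and the values are only placeholders to adjust to taste.

```css
/* Occasional aligned text */
p.center { text-align: center; text-indent: 0; }
p.right  { text-align: right;  text-indent: 0; }

/* Title page */
p.title {
  text-align: center;
  text-indent: 0;
  font-size: 2em;
  font-weight: bold;
  margin: 2em 0 0.5em 0;
}
p.subtitle {
  text-align: center;
  text-indent: 0;
  font-size: 1.3em;
  margin: 0 0 2em 0;
}

/* Copyright page: smaller font, gaps between paragraphs */
p.copyright {
  font-size: 0.85em;
  text-indent: 0;
  margin: 0 0 1em 0;
}

/* Table of Contents: hanging ("negative") indent, so the wrapped
   part of a long entry sits indented under its first line */
p.toc {
  margin: 0 0 0 2em;
  text-indent: -2em;
}
```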

For Non-Fiction, things get a little more complicated, because you're dealing with Footnotes/Diagrams/Captions/Tables/etc.

Quote:

Originally Posted by FDPuthuff (Post 4040859)
So, when you have a 'perfectly' styled Word doc and you send it through Calibre, or your app of choice, you get your EPUB file.

You'll want something that keeps the Style names in the output HTML.

IF (and this is a big if) you Style your DOCX properly, even a simple Word "Save As Filtered HTML" could work with some massaging.

Calibre tries its best to recreate the look of the original source document.

It doesn't keep Word's Style names -> CSS classes though, and converts a lot of them to "block_#" or "calibre#".

Its method is also more GIGO (Garbage In, Garbage Out):

If your source file stinks, Calibre tries its best, but you'll get messy output.

Clean file in? Relatively clean file out.

Toxaris's EPUB Tools is my personal favorite. It exports very clean HTML, and even has an option to keep the Word Styles -> CSS (Settings > HTML Export > Retain own stylenames).

Its method is more "Strip/clean up all the junk" + "Output very clean EPUB+CSS". This lets you do tweaking of the EPUB's code much more easily, since 99% of the crap is gone.

There are many more ways of converting DOCX to EPUB... but these are probably too advanced (and this post is getting a little long). :D

But, with any of these methods, consistent usage of Styles will make any conversion steps 1000% easier/cleaner. :)

hobnail 09-29-2020 03:11 PM

Quote:

Originally Posted by Tex2002ans (Post 4040943)
For a basic Fiction book, you barely need any CSS classes.

For the sake of an example, here's the CSS I use when fixing/cleaning the CSS in a book I've bought. I do this in Sigil: I first create an empty CSS file and move the book's CSS into it, so the book's original CSS file is now empty. I then put a Sigil clip into that now-empty original file. The clip is:

body {
  font-size: 100%;
  border: 0;
  margin: 0;
  padding: 0;
  width: auto;
}

body * {
  line-height: inherit;
}

p {
  font-size: 100%;
  margin: 0;
  padding: 0;
  border: 0;
  text-indent: 2em;
}

a {
  color: inherit;
  text-decoration: none;
}

h1, h2, h3, h4 {
  text-align: center;
}

hr {
  border-style: none;
  margin-left: auto;
  margin-right: auto;
  margin-top: 1em;
  page-break-after: avoid;
  break-after: avoid;
  background-color: hsl(0, 0%, 55%);
  height: 1px;
  width: 6em;
}

.italic {
  font-style: italic;
}

That doesn't properly format chapter headings that used p tags instead of h tags but I can live with that. What I want is a consistent text size, consistent line-height, consistent indentation, consistent minimal borders, and consistent space between paragraphs (none). All of those things I can adjust on my Kobo (maybe not the interparagraph spacing). If you leave out text-align left/justified then that can be set how you like it on the Kobo. (Kindles aren't so flexible unfortunately.) I also have to go through the original CSS and find what class they're using for italic when they're using spans instead of i or em tags and add it to that italic class. The book's html is left untouched so whatever crud is there remains (divs and spans usually).

It's rather a bit of a sledgehammer approach but it works well with the minimum amount of effort.
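To illustrate the italic step: once the book's own italic classes are found, they can simply be added as extra selectors on that last rule. The two extra class names below are made up for the example; the real ones depend on whatever the book's original CSS used.

```css
/* .calibre7 and .text-it are hypothetical stand-ins for the
   classes the original book used on its italic spans */
.italic,
.calibre7,
.text-it {
  font-style: italic;
}
```

Since the HTML is left untouched, the spans keep their original class names and just pick up the style from this grouped rule.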

Tex2002ans 09-29-2020 05:30 PM

Quote:

Originally Posted by hobnail (Post 4040964)
I also have to go through the original CSS and find what class they're using for italic when they're using spans instead of i or em tags and add it to that italic class.

Or use "Diap's Editing Toolbag" (Calibre) or "TagMechanic" (Sigil) to reliably convert <span class="italics"> to <i> or <em>.

hobnail 09-29-2020 06:10 PM

Quote:

Originally Posted by Tex2002ans (Post 4041037)
Or use "Diap's Editing Toolbag" (Calibre) or "TagMechanic" (Sigil) to reliably convert <span class="italics"> to <i> or <em>.

Yeah, I used to also do that for the crud that doesn't need to be there but now it's about minimal effort. I'm following the eReaderIQ rss feed and getting a lot of free books every day.

exaltedwombat 09-29-2020 06:17 PM

If you put messy source into some converters, you CAN get ridiculous EPUB code where every syllable is given a separate, verbose in-line style. And hand-crafted EPUB code can be very sparse and pretty! But a little cruft can be tolerated in return for the convenience of conversion from a WYSIWYG environment.

AlanHK 10-09-2020 11:02 PM

Quote:

Originally Posted by FDPuthuff (Post 4037841)
I am following this guy's method to start. I

(Guido Henkel's "Zen of eBook Formatting")

I read that and it's a good introduction to basic CSS. However, one of his precepts I think is quite ill-advised:

Quote:

Originally Posted by Guido Henkel
there is one group of HTML tags in particular that I think I should single out at this point. I usually stay away from using <h1> tags and its brethren <h2>, <h3>, <h4>, <h5> and <h6>. These are tags that are usually used to create headlines. They are often predefined in web browsers and eBook readers and as a result they are strange bedfellows. Despite the use of style sheets, I have found that their behavior can be quite unpredictable, depending on the device or browser you are using. Since we can recreate the desired behavior of these tags easily through the use of specially styled paragraphs, I usually prefer going that route instead.

You can get the same appearance, but "hn" tags give the book structure, and tools like the TOC generator can use them. Dumb reader apps that can't parse styles (or if you have an error in your CSS) can still show you something intelligible instead of a page of plain text.

One of the first things I do is convert chapter heads to h tags and then style the h tags. Simpler and more readable code and easier to manage.
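A sketch of what "style the h tags" can look like. The idea is to state explicitly the properties that built-in defaults tend to set (size, weight, margins, alignment), so less is left to each reader's own stylesheet. The values are arbitrary examples, and no stylesheet can guarantee identical rendering on every device.

```css
/* Shared heading behavior */
h1, h2 {
  font-weight: bold;
  text-align: center;
  text-indent: 0;
  /* try to keep a heading on the same page as the text after it */
  page-break-after: avoid;
  break-after: avoid;
}

/* Chapter titles */
h1 {
  font-size: 1.6em;
  margin: 2em 0 1em 0;
}

/* Sub-headings */
h2 {
  font-size: 1.3em;
  margin: 1.5em 0 0.75em 0;
}
```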

Invest some time in learning regular expressions. Very useful in Sigil to clean up cruft from Word or other sources. Get the Tag Mechanic Sigil plugin.

Also, look at Calibre. It has advanced much since Guido wrote his book. Its editor has some quite powerful and useful features.

Quote:

Originally Posted by FDPuthuff (Post 4040859)
So, when you have a 'perfectly' styled Word doc and you send it through Calibre, or your app of choice, you get your EPUB file. In my tests, I see a lot of HTML that I am not as familiar with. As long as it looks like I expect in an eReader, or previewer, am I good as far as the code? Will the EPUB file work on most eReaders young and old? Or are there 'things' I need to change the tags on because it just makes things unreadable on older devices and/or phones?

Get the ePubCheck Sigil plugin.


Quote:

Originally Posted by exaltedwombat (Post 4041059)
If you put messy source into some converters, you CAN get ridiculous EPUB code where every syllable is given a separate, verbose in-line style.

The RemoveInLineStyles Plugin converts all those to CSS styles.
You might get 100 similar styles, but it's then easier to merge and simplify them.
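A sketch of one way to do that merging by hand. The class names style1 to style3 are invented for the example and are not necessarily what the plugin generates.

```css
/* Before (hypothetical): three classes with identical declarations */
/*
.style1 { font-style: italic; }
.style2 { font-style: italic; }
.style3 { font-style: italic; }
*/

/* After: one grouped rule; the HTML can keep its class names */
.style1,
.style2,
.style3 {
  font-style: italic;
}
```

From there, a search and replace in the HTML can collapse the duplicates into a single meaningful class name, after which the extra selectors can be dropped.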

FDPuthuff 10-17-2020 12:59 PM

I'm still here. I have just been digesting and practicing.
I have more questions, but I am going to start new threads since they don't quite fit in this one.
Thanx a ton for all this information.

FDPuthuff 10-21-2020 01:21 AM

Quote:

Originally Posted by Tex2002ans (Post 4040943)
Toxaris's EPUB Tools is my personal favorite. It exports very clean HTML, and even has an option to keep the Word Styles -> CSS (Settings > HTML Export > Retain own stylenames).

Its method is more "Strip/clean up all the junk" + "Output very clean EPUB+CSS". This lets you do tweaking of the EPUB's code much more easily, since 99% of the crap is gone.


Rats! It looks like this is only working on Windows machines.


I'm all set up and running on my Mac. Any other suggestions as far as Calibre is concerned? :cool:

Tex2002ans 10-21-2020 06:06 PM

Quote:

Originally Posted by FDPuthuff (Post 4049571)
Rats! It looks like this is only working on Windows machines.

Yes, Toxaris's add-in only works in the Windows version of Microsoft Word.

(The Mac Word is a completely different beast. It's not like LibreOffice/Calibre/Sigil where it works exactly the same across OSes.)

Quote:

Originally Posted by FDPuthuff (Post 4049571)
I'm all set up and running on my Mac. Any other suggestions as far as Calibre is concerned? :cool:

Did you consistently apply Styles to your DOCX yet?

Then there are a few Sigil plugins that may help:

CustomCleanerPlus lets you "Save As HTML" out of Word/LibreOffice/etc., then it will try to clean up a lot of the HTML cruft left over.

DOCXImport is much more advanced, but lets you have full control over mapping the DOCX -> HTML conversion (using an advanced command-line tool called Mammoth).

Visit both of those threads for many more details. I don't use or have real in-depth knowledge on them though, so I can't write simple step-by-step instructions. :P


All times are GMT -4.