MobileRead Forums - View Single Post

retiredbiker · 12-29-2017, 06:34 PM

Quote:

Originally Posted by deback

This is how I automate it:

Do a regex search for the following (be sure to change the mode to Regex in the dropdown box):

>(\d+)

(You might have to add a space before the (\d+), depending on the original coding.)

This might find all the chapter numbers, when the word "chapter" is not included anywhere. Then you can do a find and replace to replace the class with the "chapter" class.

Example (after you've found the class that was used; there could be inconsistent classes used by the creator, which is common):

Find the following: (\d+)

Replace it with this: \1

-or, if you prefer, replace it with the following:

Chapter \1
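
For anyone who wants to try the same find-and-replace outside the editor, here is a rough Python sketch. The class name "calibre5" and the sample markup are made up for illustration; substitute whatever class your book actually uses.

```python
import re

# Hypothetical markup: bare chapter numbers inside a paragraph
# that carries some generated class name.
html = '<p class="calibre5">1</p>\n<p>Text...</p>\n<p class="calibre5">2</p>'

# Capture the digits, then rebuild the paragraph with the "chapter" class.
pattern = re.compile(r'<p class="calibre5">\s*(\d+)\s*</p>')
fixed = pattern.sub(r'<p class="chapter">Chapter \1</p>', html)

print(fixed)
```

The \1 in the replacement simply puts back whatever the (\d+) group captured, so each number survives the class change.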

Then go into the ToC editor, click on Generate ToC from XPath. Set up a macro to insert the following on the top Level 1 ToC line (mine is ctrl-shift-T -- or you can type it or you can fill out the lines on the next screen after you click on the wand at the right):

//*[re:test(@class, "chapter", "i")]

Then the ToC editor will create entries for each chapter.
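
To see roughly what that expression selects: as far as I know, re:test is the EXSLT regular-expression test, and the third argument "i" is its case-insensitive flag, so any element whose class contains "chapter" in any capitalization matches. Below is a rough emulation in Python using only the standard library. It is not calibre's own code, and the sample markup and the helper name chapter_elements are made up.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample; the class names are made up.
doc = ET.fromstring(
    '<body>'
    '<p class="Chapter">1</p>'
    '<p class="calibre1">Some text</p>'
    '<h2 class="chapter-title">2</h2>'
    '</body>'
)

def chapter_elements(root):
    """Return elements whose class attribute contains "chapter", ignoring case."""
    return [
        el for el in root.iter()
        if re.search("chapter", el.get("class", ""), re.IGNORECASE)
    ]

matches = chapter_elements(doc)
print([(el.tag, el.text) for el in matches])
```

Note that "Chapter" and "chapter-title" both match, which is why inconsistent class names from the original creator can still end up in the ToC.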

Create a CSS class for "chapter" to look the way you want it to look.

Here's mine:

.chapter {
display: block;
font-size: 1.4em; /* this could change depending on length of the chapter title */
font-weight: bold;
text-align: center;
margin-bottom: 2em;
margin-left: 0;
margin-right: 0;
margin-top: 3em;
}

Convert the file again to have all the chapters start on a new page. You don't have to do this manually. You don't even need the line page-break-before: always; in your "chapter" class, because Convert will do it automatically.

Excellent information, thank you!

For consistent chapter headings that are numeric, have an actual unique style, or even are Roman numerals...yes, I search much as you suggest. The ones I find most aggravating are these:

end of a chapter text...
Mary Goes to Market
On Monday Mary walked to the village...

Where the middle line is actually a chapter break - un-numbered, un-identified. Sometimes it's all caps and [A-Z]{3} or something will help, but these just take time.
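
A rough sketch of how that hunt could be scripted in Python, under some assumptions: the paragraphs have already been pulled out as plain strings, the all-caps test is the [A-Z]{3} idea above, and the "short line with no ending punctuation" test is just a guess at a heuristic for title-case headings. The names all_caps and looks_like_heading are made up, and every hit still needs checking by eye.

```python
import re

# Hypothetical sample paragraphs, one string per paragraph.
paragraphs = [
    "...end of a chapter text.",
    "MARY GOES TO MARKET",
    "Mary Goes to Market",
    "On Monday Mary walked to the village...",
]

# Only capitals, digits, spaces and simple punctuation,
# with at least three capitals in a row somewhere.
all_caps = re.compile(r"^(?=.*[A-Z]{3})[A-Z0-9 ,.'!?-]+$")

def looks_like_heading(text, max_words=8):
    """Heuristic: a short paragraph that does not end like a sentence."""
    text = text.strip()
    ends_like_sentence = text.endswith((".", "!", "?", ",", ";", ":"))
    return 0 < len(text.split()) <= max_words and not ends_like_sentence

caps_hits = [p for p in paragraphs if all_caps.match(p)]
heading_hits = [p for p in paragraphs if looks_like_heading(p)]

print(caps_hits)
print(heading_hits)
```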

On the other hand, spending time looking at the text will often find the odd goof, like a dozen random paragraphs in the middle of the book that are strangely styled, so it's not all bad!

Interesting about the "page-break..." lines. I always take them out, since they cause a blank screen, sometimes two, on the Kindle when reading. Then again, I always have each chapter start a new file, which gives a clean break with no blank page. It sounds like a re-convert will do that file splitting automatically, but will it get rid of "orphan" files that are just pieces of chapters? (I'm a neatnik, I guess.)

Thanks for the XPath example. I've not yet explored this, and just looking at the pop-up help frankly scared me off. I'll try this on my next ToC fix-it job. It looks like the "i" picks up the \1 and increments? Is that right? Then I could take a book with 102 chapters in those **&@^%# Roman numerals and easily convert them to Arabic numerals?
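
On the Roman numerals: a plain regex replace only copies back what it captured, so the arithmetic needs a small function. Here is a minimal Python sketch, assuming the headings already carry the "chapter" class and contain nothing but the numeral. The function name roman_to_int and the sample markup are made up, and there is no validation of malformed numerals.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Convert a Roman numeral such as "XLII" to an integer (no validation)."""
    total = 0
    prev = 0
    # Walk right to left: a smaller value before a larger one is subtracted.
    for ch in reversed(numeral.upper()):
        value = ROMAN_VALUES[ch]
        if value < prev:
            total -= value
        else:
            total += value
        prev = value
    return total

# Hypothetical markup with Roman-numeral chapter headings.
html = '<p class="chapter">XIV</p>\n<p class="chapter">CII</p>'

pattern = re.compile(r'(<p class="chapter">)([IVXLCDM]+)(</p>)')
fixed = pattern.sub(
    lambda m: f"{m.group(1)}{roman_to_int(m.group(2))}{m.group(3)}", html
)

print(fixed)
```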