08-03-2022, 12:29 AM   #2
Tex2002ans
Quote:
Originally Posted by DaveLessnau
I'm editing my copies of Pratchett's Discworld books.


I'm in the middle of reading Sourcery right now.

Quote:
Originally Posted by DaveLessnau
Essentially, the book is one big blob of text [...] That blob of text has been arbitrarily broken into two or three files (either by the publisher or by Calibre).
In Sigil or Calibre, go into the Reports:
  • Tools > Reports (Ctrl+Shift+R)

Go into the "Files" Reports and sort by filesize.

It's good practice to keep each HTML file under roughly 300 KB. (This was a hard limit on really early, really old ereaders with very limited CPU/RAM.)
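
If you'd rather check sizes outside the editor, here's a minimal Python sketch that does roughly what sorting the Files report does: it opens the EPUB as a zip and lists the HTML files over a threshold, largest first. The function name `oversized_html` and the 300 KB default are my own assumptions for illustration, not something Sigil or Calibre provides.

```python
import zipfile

# File extensions treated as HTML content inside the EPUB (an assumption).
HTML_EXTENSIONS = (".html", ".xhtml", ".htm")


def oversized_html(epub_path, limit_kb=300):
    """Return (name, size_in_kb) for HTML files above limit_kb, largest first."""
    with zipfile.ZipFile(epub_path) as epub:
        # file_size is the uncompressed size, which is what the reader loads.
        sizes = [
            (info.filename, info.file_size / 1024)
            for info in epub.infolist()
            if info.filename.lower().endswith(HTML_EXTENSIONS)
        ]
    big = [(name, kb) for name, kb in sizes if kb > limit_kb]
    return sorted(big, key=lambda item: item[1], reverse=True)
```

Calling `oversized_html("book.epub")` on a well-split book should just give you an empty list.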

What Should You Do If The Files Are Too Large?

Split them up.

The easiest way I do this is:
  • Merge everything into one monolithic file.
  • Skim through the merged file, placing "split markers" every so often.
    • In your case, before scenebreaks.
  • Split the book.

I explained the Sigil way in step-by-step instructions here:

Calibre's way is very similar.

- - - - -

Side Note: All you have to do is make sure none of your HTML files are over 300 KB. The devices should work fine from there.

If the book is one huge, multi-MB HTML file, though, things can get pretty sluggish.

Quote:
Originally Posted by DaveLessnau
But, Pratchett's books tend to have someplace around 70 thematic breaks throughout them. So, that means a silly amount of "chapters" (i.e., files).
So?

Some books have chapters that are a single page. Or a single word. So what?

Quote:
Originally Posted by DaveLessnau
Is there some limit to the number of files in a book?
Heh.

There are extreme cases where the OPF or NCX file itself has grown past 300 KB.

In all these years, I don't think I've heard much about that actually causing problems though.

You won't be reaching that any time soon, unless you were trying to merge decades and decades of articles together into one monolithic omnibus. :P

Side Note: In all the ebooks I've done, there's maybe a handful where I ever ran across a single chapter over 300 KB.

Quote:
Originally Posted by DaveLessnau
I guess since the <h3>s won't have any text associated with them, I don't have to put them into the ToC. But, is there a better alternative?
I find Sigil's splitting to be much easier with the simple:

Code:
<hr class="sigil_split_marker" />
and Edit > Split at Markers.
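
To make the idea concrete, here's a rough Python sketch of what splitting at markers boils down to: cut the body markup wherever the marker appears and throw the marker away. This is only an illustration, not Sigil's actual code; the name `split_at_markers` is made up, and it ignores everything the real feature has to do to turn each piece into a complete XHTML file.

```python
import re

# Matches the split marker, with or without the space before "/>".
SPLIT_MARKER = re.compile(r'<hr\s+class="sigil_split_marker"\s*/>')


def split_at_markers(body_html):
    """Split body markup at each marker, dropping the markers and empty pieces."""
    pieces = SPLIT_MARKER.split(body_html)
    return [piece.strip() for piece in pieces if piece.strip()]
```

So 70 markers dropped before your scenebreaks would give you 71 pieces, each one becoming its own file.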

... but in Calibre, you can also split on anything (like your custom <hr>s) by using XPaths.

Just:
  • Right-Click an XHTML file > Split at Multiple Locations
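
Before splitting, it can help to sanity-check how many places an expression would match. Below is a small Python sketch using the standard library's ElementTree, which only supports a subset of XPath. The class name `scenebreak`, the helper `count_matches`, and the exact expression are all assumptions for illustration; double-check the syntax against what Calibre's dialog actually expects.

```python
import xml.etree.ElementTree as ET

# XHTML elements live in this namespace, so the XPath needs a prefix for it.
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}


def count_matches(xhtml_text, xpath):
    """Count the elements an XPath expression selects in an XHTML document."""
    root = ET.fromstring(xhtml_text)
    return len(root.findall(xpath, XHTML_NS))
```

For example, `count_matches(text, ".//h:hr[@class='scenebreak']")` counts every `<hr class="scenebreak"/>` in the file, which should line up with the number of split points you expect.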

You can read a little bit about it here:

Last edited by Tex2002ans; 08-03-2022 at 12:45 AM.