Quote:
Originally Posted by DaveLessnau
I'm editing my copies of Pratchett's Discworld books.
|
I'm in the middle of reading Sourcery right now.
Quote:
Originally Posted by DaveLessnau
Essentially, the book is one big blob of text [...] That blob of text has been arbitrarily broken into two or three files (either by the publisher or by Calibre).
|
In Sigil or Calibre, go into the Reports:
- Tools > Reports (Ctrl+Shift+R)
Go into the "Files" Reports and sort by filesize.
It's good practice to keep each HTML file under roughly 300 KB. (This was a hard limit on really early ereaders with very limited CPU/RAM.)
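If you'd rather check this from a script than from the Reports window, here is a minimal Python sketch. It is only an illustration, not something Sigil or Calibre ships. It assumes the EPUB is an ordinary zip archive and that content files end in .html, .xhtml, or .htm. The function name `oversized_html` and the 300 KB default are placeholders of mine.

```python
import zipfile

# Assumption: content documents use one of these extensions.
HTML_EXTENSIONS = (".html", ".xhtml", ".htm")


def oversized_html(epub, limit_kb=300):
    """Return (filename, uncompressed_size) pairs for HTML files over the limit.

    `epub` may be a path or a file-like object; the result is sorted
    largest-first, like sorting the Files report by filesize.
    """
    limit = limit_kb * 1024
    with zipfile.ZipFile(epub) as zf:
        hits = [
            (info.filename, info.file_size)  # file_size is the uncompressed size
            for info in zf.infolist()
            if info.filename.lower().endswith(HTML_EXTENSIONS)
            and info.file_size > limit
        ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Calling `oversized_html("book.epub")` (with a hypothetical path) returns an empty list when every HTML file is already under the limit.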
What Should You Do If The Files Are Too Large?
Split them up.
The easiest way I've found is:
- Merge everything into one monolithic file.
- Skim through the merged file, placing "split markers" every so often.
- In your case, before scenebreaks.
- Split the book.
I explained the Sigil way in step-by-step instructions here:
Calibre's way is very similar.
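To make the split-marker idea concrete, here is a rough Python sketch of the "split" step on a merged chunk of body markup. This is not how Sigil or Calibre implement it internally. It just cuts a string wherever Sigil's marker `<hr class="sigil_split_marker" />` appears, and the helper name `split_at_markers` is made up for the example.

```python
import re

# Matches Sigil's split marker, tolerating minor whitespace differences.
MARKER = re.compile(r'<hr\s+class="sigil_split_marker"\s*/>')


def split_at_markers(body_html):
    """Cut merged body markup into pieces at each split marker.

    Empty pieces (for example, from a marker at the very start or end)
    are dropped.
    """
    parts = MARKER.split(body_html)
    return [part.strip() for part in parts if part.strip()]
```

Each returned piece would become the body of its own file. A real tool also has to copy the `<head>` into every new file and fix up links, which this sketch ignores.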
- - - - -
Side Note: All you have to do is make sure no single HTML file is over 300 KB. The devices should work fine from there.
Sometimes, if the book is in one huge, multi-MB HTML file, things can be pretty sluggish.
Quote:
Originally Posted by DaveLessnau
But, Pratchett's books tend to have someplace around 70 thematic breaks throughout them. So, that means a silly amount of "chapters" (i.e., files).
|
So?
Some books have chapters that are a single page. Or a single word. So what?
Quote:
Originally Posted by DaveLessnau
Is there some limit to the number of files in a book?
|
Heh.
There are extreme cases where the OPF or NCX file itself has grown past 300 KB.
In all these years, I don't think I've really heard much about it though.
You won't be reaching that any time soon, unless you were trying to merge decades and decades of articles together into one monolithic omnibus. :P
Side Note: In all the ebooks I've done, there's maybe a handful I ever ran across where a single chapter was over 300 KB.
Quote:
Originally Posted by DaveLessnau
I guess since the <h3>s won't have any text associated with them, I don't have to put them into the ToC. But, is there a better alternative?
|
I find Sigil's splitting to be much easier with the simple:
Code:
<hr class="sigil_split_marker" />
and
Edit > Split at Markers.
... but in Calibre, you can also split on anything, like your custom <hr>s, by using XPath expressions.
Just:
- Right-Click an XHTML file > Split at Multiple Locations
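Before running a split, it can help to see which elements an XPath would actually match. The sketch below uses Python's standard library to count them in a small XHTML sample. The class name `scenebreak` is hypothetical, so substitute whatever class your own <hr>s use. As far as I know, Calibre's XPath fields use the `h:` prefix for the XHTML namespace, so the equivalent expression there would look like `//h:hr[@class="scenebreak"]`, but treat that as an assumption and check it against the dialog.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace, bound to the "h" prefix for use in the path below.
NS = {"h": "http://www.w3.org/1999/xhtml"}


def find_split_points(xhtml_text, css_class="scenebreak"):
    """Return every <hr> whose class attribute is exactly `css_class`."""
    root = ET.fromstring(xhtml_text)
    return root.findall(f".//h:hr[@class='{css_class}']", NS)
```

Note that `@class='scenebreak'` is an exact match on the whole attribute, so an element with `class="scenebreak fancy"` would not be picked up.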
You can read a little bit about it here: