#1 |
Guru
Posts: 769
Karma: 1537886
Join Date: Sep 2013
Device: Kobo Forma
Pratchett and Chapters?
This isn't really an ePub question, but I don't really know where else to ask it. I'm editing my copies of Pratchett's Discworld books. Apparently, Pratchett doesn't like chapters. Essentially, the book is one big blob of text with some classes that work out to thematic breaks (which I have replaced with <hr/>s). That blob of text has been arbitrarily broken into two or three files (either by the publisher or by Calibre). But, when I read the book on my ereader (Kobo Forma), there's a noticeable delay (3-5 seconds) when I hit those loading points.
I was considering changing the <hr/>s to <h3>s (everything else is <h2>) and having Calibre's editor automatically split the book at those points. But Pratchett's books tend to have somewhere around 70 thematic breaks each. So, that means a silly number of "chapters" (i.e., files). Is that something reasonable to do? Is there some limit to the number of files in a book? I guess since the <h3>s won't have any text associated with them, I don't have to put them into the ToC. But, is there a better alternative?
#2 |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
I'm in the middle of reading Sourcery right now.
Quote:
Go into the "Files" Reports and sort by filesize. It's good practice to keep each HTML file <300ish KBs. (This was a limit on really early, really old ereaders with very limited CPU/RAM.)
What Should You Do If The Files Are Too Large?
Split them up.
The easiest way I do this is:
I explained the Sigil way in step-by-step instructions here:
Calibre's way is very similar.
- - - - -
Side Note: All you have to do is make sure you don't have HTML files >300 KBs. The devices should be able to work fine from there. Sometimes, if the book is in one huge, multi-MB HTML file, things can be pretty sluggish.
Quote:
Some books have chapters that are a single page. Or a single word. So what? Heh.
There are extreme cases where the OPF or NCX file has reached >300 KBs. In all these years, I don't think I've really heard much about it though. You won't be reaching that any time soon, unless you were trying to merge decades and decades of articles together into one monolithic omnibus. :P
Side Note: In all the ebooks I've done, there's maybe a handful I ever ran across where a single chapter was >300KBs.
Quote:
Code:
<hr class="sigil_split_marker" />
... but in Calibre, you can also split based on anything (like your custom <hr>s) by using XPaths. Just:
You can read a little bit about it here:
Last edited by Tex2002ans; 08-03-2022 at 12:45 AM.
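For anyone who wants to run the filesize check outside the editor, here is a minimal Python sketch of the same idea as the "Files" report. It is not part of Calibre or Sigil; the function name and the 300 KB constant are assumptions taken from the rule of thumb above, and it only looks at the uncompressed size of each HTML file inside the EPUB (which is a plain zip archive).

```python
import zipfile

# Assumption: 300 KB is the rule-of-thumb ceiling from the post, not a hard limit.
SIZE_LIMIT_BYTES = 300 * 1024
TEXT_EXTENSIONS = (".html", ".xhtml", ".htm")


def oversized_text_files(epub_path, limit=SIZE_LIMIT_BYTES):
    """Return (name, uncompressed_size) pairs for HTML files above `limit`, largest first."""
    with zipfile.ZipFile(epub_path) as epub:
        hits = [
            (info.filename, info.file_size)
            for info in epub.infolist()
            if info.filename.lower().endswith(TEXT_EXTENSIONS)
            and info.file_size > limit
        ]
    # Sort by size so the worst offenders come first, like sorting the report by filesize.
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Calling oversized_text_files("book.epub") (a hypothetical path) returns an empty list when every text file is already under the limit, which is the case for the ~200 KB files mentioned later in the thread.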
#3 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
There can be thousands of files in an ePub (e.g. 1001 Nights). It may cause some apps to slow down or use more resources (or maybe the opposite), but no Pratchett book will be that extreme.
However, an undesired effect of splitting the book into chapters is that you'll (probably) get a page break at every chapter. I'm not sure Pratchett would have liked that (perhaps it was only the publisher who decided not to have page breaks in order to save paper). So in the end I decided to ignore filesizes and put the whole book's text in a single "Text.xhtml" file. At least KOReader handles these large files alright.
One thing you may have noticed (depending on which "edition" you're reading) is that some scene breaks are just plain spacing, while others have a star, a row of asterisks or something else. This is just a direct translation of the print version without any understanding of the meaning. There is no (semantic) difference between these two types of scene breaks. In print they're all initially just spacing, but when they fall exactly at a page boundary, an asterisk is printed to make it explicit, as otherwise it would be almost invisible. In an ebook, there's no way (without javascript) of having this effect, so I opted for having an asterisk at every scene break.
#4 |
Resident Curmudgeon
Posts: 79,136
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
What I do is fix the Discworld books so that each file break falls at a section break. I don't find it as slow to load the next HTML file. But then, I make sure I'm not making the files too large when I do this.
#5 |
Guru
@Tex2002ans: I keep forgetting the Reports tool in Calibre's editor. I checked on the current book I'm reading and the two "text" files are about 200KB each. So, it shouldn't be a problem. But, the loading is noticeable. I'd also forgotten that I could tell Calibre to split on other things than headings (h1, h2...). Thanks.
@Jellby: I was wondering about the different types of breaks I found in the original book. Their trying to keep the book looking the same as the print version explains it. Thanks.
@JSWolf: I don't recall seeing actual section breaks (<section>) in the books. Are you including things like <hr/> in there? I'm assuming you added them manually. How do you determine what's a good size for a break?
#6 |
Resident Curmudgeon
Quote:
Code:
.paraNoIndent {
  display: block;
  margin-bottom: 0em;
  margin-top: 1em;
  text-indent: 0em;
}
<p class="paraNoIndent">
Code:
hr {
  margin-top: 0.9em;
  margin-right: 40%;
  margin-bottom: 0.9em;
  margin-left: 40%;
  border-top: 2px solid;
}
#7 |
Guru
@JSWolf: In the previous Pratchett books I'd edited, there was some variation (or two (or three)) of a section break class as you described. And, like you, I'd replaced it with my <hr/>. In the current book I'm editing, they didn't use such a class. It looks like they're using a pairing of empty paragraph marks and their first paragraph in the chapter class:
Code:
<p></p>
<p class="chapteropenertext">
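A replacement like this can also be scripted. The following is a rough Python sketch, assuming the markup looks exactly like the pairing quoted above (an empty paragraph followed by a paragraph whose only attribute is class="chapteropenertext"); the pattern and the function name are illustrative, and the same regular expression could be adapted for a regex search-and-replace in the editor.

```python
import re

# Matches an empty paragraph immediately followed by the opening tag of the
# "chapteropenertext" paragraph. Assumes the attribute is written exactly
# as in the snippet quoted above.
BREAK_PATTERN = re.compile(r'<p>\s*</p>\s*(<p class="chapteropenertext">)')


def mark_scene_breaks(xhtml):
    """Swap each empty-paragraph marker for <hr/>, keeping the paragraph that follows.

    Returns a (new_text, replacement_count) tuple so the count can be
    compared against the number of breaks expected in the book.
    """
    return BREAK_PATTERN.subn(r'<hr/>\n\1', xhtml)
```

Checking the returned count against the expected number of breaks is a cheap way to notice stragglers, such as the anchor-based breaks mentioned later in the thread, which this pattern would not catch.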
#8 |
Resident Curmudgeon
#9 |
Guru
It's "Wyrd Sisters" (Discworld #6). There's no need to look at the code. I've looked it over and just replaced those groups with <hr/>. There were also 8 situations that had an anchor (<a...>) instead of the paragraph mark pairs. And I replaced those as well.
#10 |
Resident Curmudgeon
Quote:
Code:
<p class="para">*</p>
<p class="paraNoIndent">
#11 |
Guru
Odd. I thought I'd looked for the section breaks before doing any editing. It's possible I deleted the para tags already. But, I'm almost positive there wasn't a ParaNoIndent in my copy. Oh, well. I managed to find them and replace them regardless.
Also, I had the Calibre editor break the files at those <hr/>s. In this book, there are about 90 book text files now. It all seems to work fine and the delay upon loading is gone. Thanks.
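To see roughly what splitting at the <hr/>s does to the text, here is a small Python sketch. It is not how Calibre's editor implements its split; it assumes the <hr> elements sit directly inside <body> (not nested in other elements), works on the body's inner markup only, and drops the <hr> itself. The names are made up for illustration.

```python
import re

# Matches <hr/>, <hr /> and <hr class="..."/> style tags.
HR_PATTERN = re.compile(r'<hr\b[^>]*/?>')


def split_at_hr(body_markup):
    """Split the inner markup of <body> at each <hr>, dropping empty pieces."""
    pieces = HR_PATTERN.split(body_markup)
    return [piece.strip() for piece in pieces if piece.strip()]


def piece_sizes(pieces):
    """UTF-8 size in bytes of each piece, handy for spotting oversized sections."""
    return [len(piece.encode("utf-8")) for piece in pieces]
```

A real split also has to wrap every piece in a complete XHTML document and update the manifest and spine, which is what the editor takes care of; the sketch is only useful for estimating how many files a book would end up with and how big each one would be.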
#12 |
Resident Curmudgeon
Quote:
#13 |
Still reading
Posts: 13,703
Karma: 103837201
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Some of his books just have anonymous chapter breaks.
Dune has epigraphs (extracts from a fictional history book) and no page breaks or chapter headings. I don't think ebooks should slavishly follow paper books, so if the scene break is really an anonymous chapter start I'd be happy with a page break. Similarly with Dune. I have a lot of Dune on paper and recently got the first three cheap as ebooks, so I will add page breaks and TOC entries at the start of each "history" quote.
I have all Discworld and much other Pratchett on paper, so unless it's legitimately offered very cheap for the series… Or I discover I'm living for 100s of years.
#14 |
Wizard
Quote:
This is why you should always put asterisks or SOMETHING as scenebreaks. Not just rely on a gap. Print book ≠ Ebook. See my:
Quote:
In this specific case of Discworld books, there is no difference in scenebreaks (I'd trust Jellby's judgement on that). Depending on the book, you have to go through and make sure. (But in 99.9% of all books you encounter, they're normal and don't have multiple types of scenebreaks.)
Quote:
- - -
Side Note: If you want more details, this scenebreaks discussion already occurred in:
We covered lots of examples + all the technical details you'd ever need. (Even bleeding-edge Javascript to try to emulate "only show scenebreaks at the bottom of pages/screens"... but I definitely wouldn't choose that "solution".)
- - -
Side Note #2: If you wanted even more extreme technical details, I described some of the "surgical" approach to editing in:
but this is really getting into the weeds. This explains how/why you may not want to wipe everything away, and clean/edit the ebooks with more precise cleanup methods. Again, 99% of the time, you're working from complete garbage, so it's easier to wipe everything and start from scratch. But in some cases, markup might be left in the original ebook (like "invisible" scenebreaks), which may make your life easier when trying to fix EPUB->EPUB.
Quote:
The Reports are AMAZING. Definitely take lots of advantage of those. You can also use them to list all links in a book, making it very easy to:
Quote:
Quote:
Like I said, 99% of the time, it's probably just your typical one-type-of-scenebreak... but there are very odd things out there.
Note: Like the Wild Cards books I read many years ago, they used a ♥, ♦, ♣, or ♠ for certain scenebreaks.
Quote:
Did you have complicated CSS? The Kobo definitely shouldn't be chugging on a file like that.
Last edited by Tex2002ans; 08-03-2022 at 03:20 PM.
#15 |
Resident Curmudgeon
Part of the problem with Discworld is that it uses a 1em blank space for the spacebreak. When I was using blank space for a section break, I used 2em. 1em just doesn't work.