MobileRead Forums - View Single Post

JimmXinu · 05-28-2022, 09:25 AM

Sarrenthal: This is a bit more complicated than it may seem. More complicated than I want to deal with, honestly.

If you commonly split epubs from the same source, it probably seems obvious what the 'right' choices are, but I have to consider all possibilities plus the edge and corner cases they can each have.

Here are just a few of the complexities:

- Some epubs have many parts (different files internally), but few TOC entries.
- Some epubs have many TOC entries, but few parts/files.
Should the automatic split points be by TOC entry or by part/file counts?
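
To make that question concrete, here is a minimal Python sketch with made-up data. It is not the plugin's actual code; the file names, TOC hrefs, and the helper split_points_every_n are all hypothetical. The same "every N" rule gives very different results depending on which list it is applied to.

```python
def split_points_every_n(items, n):
    """Return the indices at which a new split would start, one every n items."""
    return list(range(0, len(items), n))

# Hypothetical book: only 3 internal files, but 12 TOC entries
# (each TOC entry points at an anchor inside one of the files).
files = ["part1.xhtml", "part2.xhtml", "part3.xhtml"]
toc = [f"part{1 + i // 4}.xhtml#ch{i + 1}" for i in range(12)]

by_file = split_points_every_n(files, 2)  # counts internal files
by_toc = split_points_every_n(toc, 2)     # counts TOC entries

print(len(by_file), "splits when counting files")
print(len(by_toc), "splits when counting TOC entries")
```

Neither answer is wrong; they are just different, which is why one automatic default is hard to pick.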

- In some epubs the TOC entries point to the beginning of each part/file.
- In others the TOC entries point to HTML tags somewhere inside a part/file.
Some epubs include content in the same part/file before the TOC tag. If the split is made at the TOC tag, that prior content ends up in the previous split.
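
A rough Python sketch of that last case, assuming the TOC entry is an href with a #fragment and the target tag carries a matching id attribute. The sample markup and both helper names are invented for illustration, and a real implementation would use a proper HTML parser rather than a regex.

```python
import re
from urllib.parse import urldefrag

def toc_target(href):
    """Split a TOC href into (file path, anchor id or None)."""
    path, fragment = urldefrag(href)
    return path, (fragment or None)

def content_before_anchor(html, fragment):
    """Return the markup that precedes the tag carrying id="<fragment>"."""
    match = re.search(r'<[^>]*\bid="%s"' % re.escape(fragment), html)
    return html[:match.start()] if match else ""

# Hypothetical file where an epigraph sits before the chapter heading.
sample = '<body><p>Epigraph</p><h2 id="start">Chapter 2</h2><p>Text</p></body>'

path, anchor = toc_target("chapter2.xhtml#start")
leftover = content_before_anchor(sample, anchor)
print(path, anchor, leftover)
```

Here leftover is the epigraph paragraph: if the split starts at the anchor, that markup has to go somewhere, and the previous split is where it lands.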

- TOC entries and part/files are never guaranteed to be evenly sized.
Splitting every 10 or 100 entries may produce wildly different sized splits.
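
As a quick illustration with invented numbers (the sizes below are made up, and chunk_sizes is a hypothetical helper, not anything from the plugin), grouping a fixed count of entries says nothing about how big each group ends up:

```python
def chunk_sizes(sizes, n):
    """Total size of each split when every split takes n consecutive entries."""
    return [sum(sizes[i:i + n]) for i in range(0, len(sizes), n)]

# Made-up per-entry sizes in KB: a few short notes mixed with long chapters.
sizes_kb = [2, 3, 250, 4, 1, 90, 5, 5]

totals = chunk_sizes(sizes_kb, 2)
print(totals)
```

With these numbers the splits come out at 5, 254, 91 and 10 KB, even though each one holds exactly two entries.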
