#1 |
Member
Posts: 23
Karma: 13884
Join Date: Jan 2014
Device: apple ipad (3rd generation)
|
Can I break up an HTML file using a TOC?
I have some public domain ebooks in EPUB format that contain a Table of Contents, but each book consists of only a handful of HTML files, each holding multiple chapters. My e-reader only tracks chapter progress within “sections”, meaning within each HTML file, rather than according to the TOC, which just links to paragraphs within each HTML file.
I’d like to break up the chapters into their own separate HTML files using the TOC as a guide. I’d like to be able to do it automatically rather than manually. |
#2 | |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
If you see all chapter titles, you can simply insert a Sigil split marker tag before each chapter heading tag. For example, if all chapter headings are <h1> tags, you'd use:
Find: <h1
Replace: <hr class="sigil_split_marker" /><h1
and then select Edit > Split at markers followed by Tools > Table of Contents > Generate Table of Contents.
If the TOC is empty when you select Tools > Table of Contents > Generate Table of Contents, you can use KevinH's TOCSaver plugin to change paragraph tags to heading tags or insert hidden heading tags. |
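For anyone who would rather see what that Find/Replace does outside Sigil, here is a minimal Python sketch. It is not part of Sigil; the function name `insert_split_markers` and the sample markup are made up for illustration, and it assumes every chapter heading really is an <h1>.

```python
import re

# The marker that Sigil's "Split at markers" command looks for.
SPLIT_MARKER = '<hr class="sigil_split_marker" />'


def insert_split_markers(xhtml: str, tag: str = "h1") -> str:
    """Insert a split marker before every opening <tag ...> in the text.

    Hypothetical helper: it mimics the Find/Replace above and nothing more.
    """
    # \b keeps "<h1" from matching a longer tag name; "</h1" never matches
    # because the pattern requires the tag name right after "<".
    pattern = re.compile(r"<" + re.escape(tag) + r"\b", re.IGNORECASE)
    # A function replacement avoids any backslash handling in the marker.
    return pattern.sub(lambda m: SPLIT_MARKER + m.group(0), xhtml)


# Made-up sample input with two chapter headings.
sample = '<p>intro</p><h1>Chapter 1</h1><p>text</p><h1 class="ch">Chapter 2</h1>'
result = insert_split_markers(sample)
print(result)
```

The splitting itself would still be done in Sigil with Edit > Split at markers; the sketch only shows where the markers end up.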
|
#3 |
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
|
If the document consistently gives chapter titles a particular tag (and doesn't use that tag elsewhere) a simple Search and Replace to insert "sigil_split_marker" will do the job.
But if the code were that organised, I suspect the TOC would have been sorted out already. You may have no practical alternative to finding the chapter titles you want to list in a TOC and applying the h1 tag manually. How many chapters? Some jobs are really too small to be worth automating. |
#4 |
Running with scissors
Posts: 1,582
Karma: 14328510
Join Date: Nov 2019
Device: none
|
In addition to what's said above, what I also do in order to have chapter breaks only before chapter headings is to join all of the chapter files into one large html/xhtml file, then do the splitting as per above. (But I've forgotten how I joined the separate files so hunt around and experiment.)
|
#5 |
Bibliophagist
Posts: 44,755
Karma: 168431891
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
To continue from what @hobnail said, I also tend to join all the chapter files into a single file, since Gutenberg books often come as massive files with chapters split across them. To do this, I select the files I want to merge, then right-click and choose Merge (or press Ctrl-M). After this, I insert the split markers and split.
Quite often the split markers are simple to insert but at other times, the regex to insert the split markers can be a learning experience. |
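As an example of the less simple case, here is a rough Python sketch for a book where chapter titles are ordinary paragraphs rather than headings. Everything about the markup is an assumption: it supposes each title is a <p> whose text begins with "Chapter" followed by an Arabic or Roman numeral, and `mark_chapter_paragraphs` is a made-up name. A real book will probably need a different pattern.

```python
import re

SPLIT_MARKER = '<hr class="sigil_split_marker" />'

# Assumed pattern: a <p> (with any attributes) whose text starts with
# "Chapter" and then a number such as "2" or a Roman numeral such as "IV".
# A word made only of the letters i, v, x, l, c, d, m (e.g. "mild") would
# also match, so check the hits before splitting.
CHAPTER_PARA = re.compile(
    r"<p\b[^>]*>\s*chapter\s+(?:\d+|[ivxlcdm]+)\b",
    re.IGNORECASE,
)


def mark_chapter_paragraphs(xhtml: str) -> str:
    """Insert a split marker before each paragraph that looks like a chapter title."""
    return CHAPTER_PARA.sub(lambda m: SPLIT_MARKER + m.group(0), xhtml)


# Made-up sample: two title paragraphs and two body paragraphs.
sample = (
    '<p class="c1">CHAPTER I</p><p>It was a dark night.</p>'
    '<p class="c1">Chapter 2</p><p>The next chapter of his life began.</p>'
)
marked = mark_chapter_paragraphs(sample)
print(marked)
```

Only paragraphs that start with the chapter wording are marked; a body paragraph that merely mentions the word "chapter" is left alone.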
#6 | |
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
|
Quote:
Indeed. I'm all in favour of learning experiences. But sometimes you have to balance an hour's research into Regex with the time taken to manually insert 16 chapter breaks! |
|
#7 |
Resident Curmudgeon
Posts: 79,088
Karma: 144284184
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
While it might take longer to learn regex, once you've learned it, it will eventually take less time.
|
#8 |
Guru
Posts: 878
Karma: 2457540
Join Date: Nov 2011
Device: none
|
But when that hour results in the conclusion that chapters AREN'T marked in any consistent and unique way... :-( If these were well-constructed EPUB files, we wouldn't have to do this job in the first place.
|
#9 |
Member
Posts: 23
Karma: 13884
Join Date: Jan 2014
Device: apple ipad (3rd generation)
|
Thank you all for the replies. There were quite a lot of chapters and a number of books, and it took a combination of all your suggestions. One book had a TOC that linked to subsections of chapters, and I didn't want to generate a new TOC because it would only link to the main ones, so I searched for <title> tags (fortunately the chapters had them), inserted split markers, and then broke the files up from there.
Now why isn't there a shortcut for file renaming? I do loathe having to right-click and scroll through a pop-up menu to rename an HTML file. I'm surprised double-click doesn't do the trick, as that's the way it works in my system folders.
Edit: Found it! CTRL+OPTION+ENTER
Tomorrow I'll be asking for advice on how best to eat with a spoon.
Last edited by wellesradio; 04-11-2021 at 03:13 PM. |
#10 |
Sigil Developer
Posts: 8,475
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Try Sigil's regex renamer to bulk rename files.
Last edited by KevinH; 04-12-2021 at 08:52 AM. |
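To give a feel for what a regex-based bulk rename does, here is a small Python sketch of the general idea. It is not Sigil's renamer and does not use any Sigil API; `plan_renames`, the file names, and the pattern are all hypothetical. It only works out an old-name to new-name mapping and never touches files or updates links.

```python
import re


def plan_renames(names, pattern, replacement):
    """Return {old_name: new_name} for every name the pattern matches in full.

    Hypothetical helper for illustration only.
    """
    rx = re.compile(pattern)
    plan = {}
    for name in names:
        # Only rename files whose whole name matches the pattern.
        if rx.fullmatch(name):
            plan[name] = rx.sub(replacement, name)
    # Refuse a plan that would give two files the same new name.
    if len(set(plan.values())) != len(plan):
        raise ValueError("rename plan produces duplicate file names")
    return plan


# Made-up file names; the pattern drops leading zeros from the number.
files = ["Section0001.xhtml", "Section0002.xhtml", "cover.xhtml"]
plan = plan_renames(files, r"Section0*(\d+)\.xhtml", r"chapter_\1.xhtml")
print(plan)
```

Renaming inside Sigil is still the safer route for a real book, since a script like this knows nothing about the links and manifest entries that point at the old names.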
#11 |
Running with scissors
Posts: 1,582
Karma: 14328510
Join Date: Nov 2019
Device: none
|
|
#12 |
Grand Sorcerer
Posts: 13,296
Karma: 78876004
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
|
#13 |
Running with scissors
Posts: 1,582
Karma: 14328510
Join Date: Nov 2019
Device: none
|
Reminds me of that childhood limerick:
I eat my peas with honey
I've done it all my life
They do taste kind of funny
But it keeps them on my knife
|