#1
Connoisseur
Posts: 83
Karma: 192
Join Date: Jul 2013
Location: Planet Ocean
Device: Kobo Glo HD, Onyx Boox Note Pro 2, Samsung Galaxy Tab S5e, Pixel 4a
automatic index generation
Is there a way to automatically generate indexes the same way we can automatically generate a table of contents? I am wondering because I have a large ePUB where H1 tags are in the TOC, but it would be too unwieldy to also include H2 and H3 tags, which I would like as (two separate) indexes... Is that possible?
#2
A Hairy Wizard
Posts: 3,313
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
I don't think there is a fully automatic way, but try creating separate TOCs including just h1, just h2, and just h3. Then copy/paste the results to a new sheet.
|
![]() |
![]() |
Advert | |
|
![]() |
#3 | |
Connoisseur
Posts: 83
Karma: 192
Join Date: Jul 2013
Location: Planet Ocean
Device: Kobo Glo HD, Onyx Boox Note Pro 2, Samsung Galaxy Tab S5e, Pixel 4a
I also don't see how to include only headers below H1: I can get "only h1", "only h1 and h2" or "everything". I'm using 0.9.7, if that matters at all...
#4
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
What is the use-case? If we can understand the details of the project, maybe there is a better way of handling it.
The only cases of multiple TOCs I can think of off the top of my head are for "List of Tables" or "List of Illustrations". Also, the title of the topic is a little confusing. The title says "Indexes", but it seems you are talking about TOCs.
#5
Connoisseur
Posts: 83
Karma: 192
Join Date: Jul 2013
Location: Planet Ocean
Device: Kobo Glo HD, Onyx Boox Note Pro 2, Samsung Galaxy Tab S5e, Pixel 4a
Well, you suggested using multiple TOCs; I want an index.
The use case is this: I have a library of song lyrics. I want to compile those into an ebook of some sort. There are a lot of songs from different artists and albums. I created a simple structure in HTML where every h1 is the artist, h2 is the album and h3 is the song title. The song lyrics then follow in a PRE tag.
This is ordered by Artist/Album, so it works out okay in the main TOC: I take only H1 tags and get a table of contents for artists. So far so good. But making a (sorted!) TOC for song titles doesn't make sense anymore, because the content is sorted by Artist. Hence the idea of using indexes instead. The idea would be to have an index of all song titles at the end of the ePUB, ordered by song name.
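To make the idea concrete, here is a rough Python sketch of what such an index would contain. It is only an illustration and assumes the HTML is as regular as described above (plain h1/h2/h3 tags with no nested markup inside them); the function name `build_song_index` and the sample data are made up for the example.

```python
import re

# Matches <h1>, <h2> or <h3> and captures the level and the heading text.
HEADING = re.compile(r"<h([1-3])[^>]*>(.*?)</h\1>", re.S)

def build_song_index(html):
    """Return (song, artist, album) tuples sorted by song title."""
    artist = album = None
    rows = []
    for level, text in HEADING.findall(html):
        text = text.strip()
        if level == "1":      # a new artist starts; forget the old album
            artist, album = text, None
        elif level == "2":    # a new album under the current artist
            album = text
        else:                 # h3: a song under the current artist/album
            rows.append((text, artist, album))
    # casefold() makes the ordering case-insensitive
    return sorted(rows, key=lambda row: row[0].casefold())

# Hypothetical sample in the h1/h2/h3 + pre layout described above
sample = """
<h1>Beta Band</h1>
<h2>First Album</h2>
<h3>Zebra Song</h3>
<pre>...</pre>
<h1>Alpha Artist</h1>
<h2>Debut</h2>
<h3>apple song</h3>
<pre>...</pre>
"""

index = build_song_index(sample)
for song, artist, album in index:
    print(f"{song} ({artist}, {album})")
```

The document stays ordered by artist; only the index list is sorted by song title, which is the difference from a TOC.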
#6
Sigil Developer
Posts: 8,488
Karma: 5703586
Join Date: Nov 2009
Device: many
Then use search or grep to build a text file of just artists and of just albums. Then feed those text files into Sigil's Index Generating tool. Save the generated index after each run in an external HTML file and merge them.
See the old but still valid Sigil User's Guide to see how to generate an index using the Index Generation tool from a tab list of words / phrases. Should work.
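If grep is not at hand, the "build a text file" step could be sketched in Python along these lines. This is an assumption-laden example, not something from the Sigil documentation: `HeadingCollector` and `collect_headings` are made-up names, and the one-entry-per-line output is only a guess at a convenient starting point, so check the User's Guide for the exact list format the index tool expects.

```python
from html.parser import HTMLParser

class HeadingCollector(HTMLParser):
    """Collect the text of every heading of one level (e.g. 'h2')."""

    def __init__(self, tag):
        super().__init__()
        self.tag = tag
        self.inside = False
        self.buffer = []
        self.headings = []

    def handle_starttag(self, tag, attrs):
        if tag == self.tag:
            self.inside = True
            self.buffer = []

    def handle_endtag(self, tag):
        if tag == self.tag and self.inside:
            self.inside = False
            self.headings.append("".join(self.buffer).strip())

    def handle_data(self, data):
        if self.inside:
            self.buffer.append(data)

def collect_headings(html, tag):
    parser = HeadingCollector(tag)
    parser.feed(html)
    parser.close()
    return parser.headings

# Hypothetical sample file content
sample = """<h1>Artist One</h1>
<h2>First Album</h2>
<h3>Zebra Song</h3>
<pre>lyrics...</pre>
<h3>Apple Song</h3>
<pre>lyrics...</pre>"""

albums = collect_headings(sample, "h2")
songs = collect_headings(sample, "h3")

# One entry per line, ready to be saved as a plain text file
song_list = "\n".join(songs)
print(song_list)
```

Running it once per heading level gives one list per index, matching the "after each run" workflow described above.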
#7
A Hairy Wizard
Posts: 3,313
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
What I meant was that you could run an automated TOC while selecting just H2 to show up on the list. When that is done, you would copy and paste that to a different sheet. Delete the TOC, then repeat that with just H3 selected, and copy/paste the results, etc. That would create your links to the different tags.
Yes, it is more manual work than you probably want, but a lot less than it could be. Hopefully the index options the others have posted about will work better. If they do, please post back!
Cheers,
#8
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
STEP 0
Make sure you aren't doing this on your actual EPUB. Save As and make a copy!
MAKE SURE YOU HAVE "Mode: Regex" + "Current File" selected.
MAKE SURE YOU ARE IN "Code View".
I attached a sample EPUB to the end of this post. I will be using that for the code examples.

STEP 1
So... we generated the Sigil TOC. Now we have to throw everything else out and be left with just the <h3>s (Songs). Regex is your friend.

This Regex takes Sigil's TOC code and gets rid of the <h1>s (Artists):
Search: <div class="sgc-toc-level-1">\s+(<a[^>]+>[^<]+</a>)
Replace: (leave empty)

This gets rid of the <h2>s (Albums):
Search: <div class="sgc-toc-level-2">\s+(<a[^>]+>[^<]+</a>)
Replace: (leave empty)

And since we need the Songs... what I like to do is just change Sigil's TOC <div> into a <p> with a class:
Search: <div class="sgc-toc-level-3">\s+(<a[^>]+>[^<]+</a>)
Replace: <p class="tocthree">\1</p>

STEP 2
Right click > Reformat HTML > Mend and Prettify OR press Tools > Reformat HTML > Mend and Prettify All HTML Files.
That should leave you with a list of ONLY the Song Names.

Warning
If you mess up any of the Regex or Search/Replacing, Sigil may remove important code while it is trying to clean up the leftover <div>s. This is why you need to back up.

STEP 3
Now, to alphabetize these songs. Run another Regex:
Search: (<a .+?>)(.+?)(</a>)
Replace: \2\1\3

What this does is capture the Song name and put it before the <a> link:
Before: <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>
After: <p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>

STEP 4
Now just toss that HTML into any tool that sorts alphabetically for you (I use Notepad++, or you may want to use a website like Text Mechanic). It should alphabetize all the songs.

STEP 5
Stick the HTML back into Sigil and move the song names back into the links:
Search: <p class="tocthree">(.+?)(<a .+?>)(</a>)
Replace: <p class="tocthree">\2\1\3

That should reverse Step 3.
Before: <p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>
After: <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>

Now you have your fully alphabetized list of songs with links. Toss that in the Song Index at the end of your book.
Last edited by Tex2002ans; 07-15-2017 at 01:35 AM.
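For anyone who would rather script it, the same steps can be approximated outside Sigil. The sketch below is an assumption, not part of Sigil: it reuses the level-3 Search pattern from Step 1 on a copy of the generated TOC code, sorts by the link text, and writes the `<p class="tocthree">` lines directly. The function name `song_index_from_toc` and the sample TOC fragment (including its `sigil_toc_id_N` values) are made up for illustration.

```python
import re

# Same idea as the Step 1 Search pattern, with the <a> split into three groups:
# opening tag, song name, closing tag.
LEVEL3 = re.compile(
    r'<div class="sgc-toc-level-3">\s+(<a[^>]+>)([^<]+)(</a>)'
)

def song_index_from_toc(toc_html):
    """Return sorted <p class="tocthree"> lines for all level-3 TOC links."""
    entries = LEVEL3.findall(toc_html)
    # Sort on the song name (middle group), ignoring case
    entries.sort(key=lambda entry: entry[1].casefold())
    return [
        f'<p class="tocthree">{open_a}{name}{close_a}</p>'
        for open_a, name, close_a in entries
    ]

# Hypothetical fragment shaped like Sigil's generated TOC code
toc = """<div class="sgc-toc-level-1">
  <a href="../Text/Artist01.xhtml#sigil_toc_id_1">Artist 1</a>
  <div class="sgc-toc-level-2">
    <a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus</a>
    <div class="sgc-toc-level-3">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_4">Albumus Songimus 2</a>
    </div>
    <div class="sgc-toc-level-3">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 1</a>
    </div>
  </div>
</div>"""

lines = song_index_from_toc(toc)
print("\n".join(lines))
```

Because this builds new lines instead of editing the TOC in place, there are no leftover <div>s to mend, and the Step 3 / Step 5 swap is not needed since the sort key is taken from the captured song name.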
#9
Connoisseur
Posts: 83
Karma: 192
Join Date: Jul 2013
Location: Planet Ocean
Device: Kobo Glo HD, Onyx Boox Note Pro 2, Samsung Galaxy Tab S5e, Pixel 4a
Thanks so much for the detailed responses!
In the end, I found another way of creating that ePUB: I'm generating an RST document which Sphinx turns into an ePUB, PDF or HTML: https://github.com/beetbox/beets/pull/2628
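For the curious, generating RST for this kind of artist/album/song structure could look roughly like the sketch below. This is not the code from the linked pull request, just a minimal illustration under my own assumptions: the nested-dict input, the function names `rst_title` and `songs_to_rst`, and the choice of underline characters are all made up for the example.

```python
def rst_title(text, underline):
    """An RST section title: the text plus an underline of the same length."""
    return f"{text}\n{underline * len(text)}\n"

def songs_to_rst(library):
    """library is assumed to be {artist: {album: {song: lyrics}}}."""
    parts = []
    for artist in sorted(library):
        parts.append(rst_title(artist, "="))
        for album in sorted(library[artist]):
            parts.append(rst_title(album, "-"))
            for song, lyrics in sorted(library[artist][album].items()):
                parts.append(rst_title(song, "~"))
                # "::" followed by an indented block is an RST literal block,
                # which keeps the line breaks of the lyrics (like <pre>)
                parts.append("::\n")
                parts.extend("   " + line for line in lyrics.splitlines())
                parts.append("")
    return "\n".join(parts)

# Hypothetical one-song library
library = {"Artist": {"Album": {"Song": "la la\nla"}}}
rst = songs_to_rst(library)
print(rst)
```

The three underline characters give three heading levels in order of first use, which mirrors the h1/h2/h3 layout from the original HTML.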