Quote:
Originally Posted by anarcat
The use case is this: I have a library of song lyrics. I want to compound those in ebook of some sort. There are a lot of songs from different artists and albums. I created a simple structure in HTML where every h1 is the artist, h2 is the album and h3 is the song title. Then the song lyrics are in a PRE tag after. This is ordered by Artist/Album so it works out okay in the main TOC: I take only H1 tags and get a table of contents for authors. So far so good.
But making a (sorted!) TOC for song titles doesn't make sense anymore, because the content is sorted by Artist. Hence the idea of using indexes instead. The idea would be to have an index of all song titles at the end of the ePUB, ordered by song name.
I would do what Turtle91 mentioned. Generate the entire Sigil TOC (include everything from <h1> (Artists) down to <h3> (Songs)).
STEP 0
Make sure you aren't doing this on your actual EPUB. Save As and make a copy!
MAKE SURE YOU HAVE "Mode: Regex" + "Current File" selected.
MAKE SURE YOU ARE IN "Code View".
I attached a sample EPUB to the end of this post. I will be using that as the code examples:
Original Code:
STEP 1
So... we generated the Sigil TOC. Now we have to throw everything else out and be left with just the <h3>s (Songs).
Regex is your friend.
This Regex takes Sigil's TOC code, and gets rid of the <h1>s (Artists):
Search: <div class="sgc-toc-level-1">\s+(<a[^>]+>[^<]+</a>)
Replace: (leave empty)
This gets rid of the <h2>s (Albums):
Search: <div class="sgc-toc-level-2">\s+(<a[^>]+>[^<]+</a>)
Replace: (leave empty)
And since we need the Songs... what I like to do is just change Sigil's TOC <div> into a <p> with a class:
Search: <div class="sgc-toc-level-3">\s+(<a[^>]+>[^<]+</a>)
Replace: <p class="tocthree">\1</p>
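If you want to try the three Search/Replace passes outside of Sigil first, here is a rough Python sketch of the same thing. The sample markup and the function name keep_songs_only are made up for illustration (they are not taken from the attached EPUB), and I am assuming Sigil's generated TOC nests the levels roughly like this:

```python
import re

# Hypothetical sample of Sigil-style TOC markup (an assumption, not the real file).
toc_html = """<div class="sgc-toc-level-1">
  <a href="../Text/Artist01.xhtml#sigil_toc_id_1">Artist One</a>
  <div class="sgc-toc-level-2">
    <a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus</a>
    <div class="sgc-toc-level-3">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a>
    </div>
  </div>
</div>"""

# The shared tail of all three Search patterns: whitespace, then one complete link.
LINK = r'\s+(<a[^>]+>[^<]+</a>)'


def keep_songs_only(html):
    # Artists and Albums: replace with nothing.
    html = re.sub(r'<div class="sgc-toc-level-1">' + LINK, '', html)
    html = re.sub(r'<div class="sgc-toc-level-2">' + LINK, '', html)
    # Songs: swap the opening <div> for a <p> that wraps the captured link.
    return re.sub(r'<div class="sgc-toc-level-3">' + LINK,
                  r'<p class="tocthree">\1</p>', html)


result = keep_songs_only(toc_html)
print(result)
```

Note that the closing </div> tags are still sitting in the result. The regexes only remove the opening tags, which is why the cleanup in the next step is needed.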
STEP 2
Right-click > Reformat HTML > Mend and Prettify, OR go to Tools > Reformat HTML > Mend and Prettify All HTML Files.
That should leave you with a list of ONLY the Song Names:
Warning: If you mess up any of the Regex or Search/Replace steps, Sigil may remove important code while it is trying to clean up the leftover <div>s. This is why you need to back up.
STEP 3
Now, to alphabetize these songs.
Run another Regex:
Search: (<a .+?>)(.+?)(</a>)
Replace: \2\1\3
What this does is capture the Song name and put it before the <a> link:
Before:
<p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>
After:
<p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>
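The same swap can be checked quickly in Python. This is only a sketch: title_first is a name I made up, and it assumes each entry sits on a single line (the dot in the pattern does not cross line breaks by default):

```python
import re


def title_first(line):
    # \1 = opening <a ...> tag, \2 = song title, \3 = closing </a>.
    # Writing them back as \2\1\3 moves the title in front of the (now empty) link.
    return re.sub(r'(<a .+?>)(.+?)(</a>)', r'\2\1\3', line)


entry = ('<p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">'
         'Albumus Songimus 2</a></p>')
swapped = title_first(entry)
print(swapped)
```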
STEP 4
Now just toss that HTML into any tool that sorts alphabetically for you (I use Notepad++, or you may want to use a website like Text Mechanic). It should alphabetize all the songs:
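If you would rather not paste the list into an external tool, a few lines of Python can do the sorting too. This is a sketch under some assumptions: sort_entries is a hypothetical helper, the song titles and hrefs below are invented, and every entry is expected to be on its own line. Because every line starts with the same <p class="tocthree">, a plain line sort ends up ordering by song title:

```python
def sort_entries(html):
    # Keep only non-blank lines; each is assumed to hold one complete <p> entry.
    lines = [line.strip() for line in html.splitlines() if line.strip()]
    # casefold() makes the comparison ignore upper/lower case differences.
    return "\n".join(sorted(lines, key=str.casefold))


# Hypothetical entries, already in the "title first" form from Step 3.
swapped_list = """<p class="tocthree">Zeta Song<a href="../Text/Artist02.xhtml#sigil_toc_id_9"></a></p>
<p class="tocthree">albumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_2"></a></p>
<p class="tocthree">Beta Song<a href="../Text/Artist01.xhtml#sigil_toc_id_5"></a></p>"""

sorted_list = sort_entries(swapped_list)
print(sorted_list)
```

The case-insensitive key is my own choice here. Some sorting tools put all capitalized titles before lowercase ones unless you tell them otherwise.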
STEP 5
Stick the HTML back into Sigil and move the song names back into the links:
Search: <p class="tocthree">(.+?)(<a .+?>)(</a>)
Replace: <p class="tocthree">\2\1\3
That should reverse Step 3.
Before:
<p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>
After:
<p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>
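To double-check the reverse pattern before running it on the whole list, here is a small Python sketch (link_around_title is a made-up name, and again one entry per line is assumed):

```python
import re


def link_around_title(line):
    # \1 = song title, \2 = opening <a ...> tag, \3 = closing </a>.
    # Writing them back as \2\1\3 puts the title inside the link again.
    return re.sub(r'<p class="tocthree">(.+?)(<a .+?>)(</a>)',
                  r'<p class="tocthree">\2\1\3', line)


swapped = ('<p class="tocthree">Albumus Songimus 2'
           '<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>')
restored = link_around_title(swapped)
print(restored)
```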
Now you have your fully alphabetized list of songs with links. Toss that in the Song Index at the end of your book.