Old 07-15-2017, 01:24 AM   #8
Tex2002ans
Wizard
Quote:
Originally Posted by anarcat
The use case is this: I have a library of song lyrics. I want to compound those in ebook of some sort. There are a lot of songs from different artists and albums. I created a simple structure in HTML where every h1 is the artist, h2 is the album and h3 is the song title. Then the song lyrics are in a PRE tag after. This is ordered by Artist/Album so it works out okay in the main TOC: I take only H1 tags and get a table of contents for authors. So far so good.

But making a (sorted!) TOC for song titles doesn't make sense anymore, because the content is sorted by Artist. Hence the idea of using indexes instead. The idea would be to have an index of all song titles at the end of the ePUB, ordered by song name.

I would do what Turtle91 mentioned: generate the entire Sigil TOC, including every heading level from <h1> (Artists) down to <h3> (Songs).

STEP 0

Make sure you aren't doing this on your actual EPUB. Save As and make a copy!

MAKE SURE YOU HAVE "Mode: Regex" + "Current File" selected.

MAKE SURE YOU ARE IN "Code View".

I attached a sample EPUB to the end of this post. I will be using that as the code examples:

[Attached image: RegexCurrentFile.png]
[Attached image: ExampleArtistAlbumSongTOC.png]

Original Code:

Spoiler:
Code:
  <div class="sgc-toc-level-1">
    <a href="../Text/Artist01.xhtml">Artist 1</a> 
    <div class="sgc-toc-level-2">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_1">Albumus Example</a> 
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus Songimus 1</a>
      </div>
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a>
      </div>
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_4">Albumus Songimus 3</a>
      </div>
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_5">Albumus Songimus 4</a>
      </div>
    </div>
    <div class="sgc-toc-level-2">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_6">Bulbumus Example</a> 
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_7">Bulbumus Songimus 1</a>
      </div>
    </div>
    <div class="sgc-toc-level-2">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_8">Callbumus Example</a> 
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_9">Callbumus Songimus 1</a>
      </div>
    </div>
    <div class="sgc-toc-level-2">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_10">Dollbumus Example</a> 
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_11">Dollbumus Songimus 1</a>
      </div>
    </div>
  </div>
  <div class="sgc-toc-level-1">
    <a href="../Text/Artist02.xhtml">Bartist 2</a> 
    <div class="sgc-toc-level-2">
      <a href="../Text/Artist02.xhtml#sigil_toc_id_12">Bartimus Example</a> 
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist02.xhtml#sigil_toc_id_13">Bartimus Songimus 1</a>
      </div>
    </div>
  </div>

  <div class="sgc-toc-level-1">
    <a href="../Text/Artist03.xhtml">Cartist 3</a> 
    <div class="sgc-toc-level-2">
      <a href="../Text/Artist03.xhtml#sigil_toc_id_14">Cartimus Example</a> 
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist03.xhtml#sigil_toc_id_15">Cartimus Songimus 1</a>
      </div>
    </div>
  </div>


STEP 1

So... we generated the Sigil TOC. Now we have to throw everything else out and be left with just the <h3>s (Songs).

Regex is your friend.

This Regex takes Sigil's TOC code, and gets rid of the <h1>s (Artists):

Search: <div class="sgc-toc-level-1">\s+(<a[^>]+>[^<]+</a>)
Replace:

This gets rid of the <h2>s (Albums):

Search: <div class="sgc-toc-level-2">\s+(<a[^>]+>[^<]+</a>)
Replace:

And since we need the Songs... what I like to do is just change Sigil's TOC <div> into a <p> with a class:

Search: <div class="sgc-toc-level-3">\s+(<a[^>]+>[^<]+</a>)
Replace: <p class="tocthree">\1</p>
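
For anyone who wants to try these three patterns outside Sigil first, here is a minimal Python sketch. It assumes Python's re module treats these simple patterns the same way Sigil's regex engine does; the trimmed sample string and the function name strip_to_songs are made up for illustration.

```python
import re

# A trimmed sample in the same shape as the Original Code above.
toc = """\
  <div class="sgc-toc-level-1">
    <a href="../Text/Artist01.xhtml">Artist 1</a>
    <div class="sgc-toc-level-2">
      <a href="../Text/Artist01.xhtml#sigil_toc_id_1">Albumus Example</a>
      <div class="sgc-toc-level-3">
        <a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus Songimus 1</a>
      </div>
    </div>
  </div>
"""

# The part shared by all three Search patterns: whitespace, then one link.
LINK = r'\s+(<a[^>]+>[^<]+</a>)'


def strip_to_songs(html: str) -> str:
    """Apply the three Step 1 replacements in order (hypothetical helper)."""
    # Levels 1 and 2: delete the opening <div> together with its link.
    html = re.sub(r'<div class="sgc-toc-level-1">' + LINK, '', html)
    html = re.sub(r'<div class="sgc-toc-level-2">' + LINK, '', html)
    # Level 3: keep the link (group 1) and wrap it in a <p>.
    html = re.sub(r'<div class="sgc-toc-level-3">' + LINK,
                  r'<p class="tocthree">\1</p>', html)
    return html


result = strip_to_songs(toc)
print(result)
```

Note that the closing </div> tags are still in the output. None of the three patterns touches them, which is why the cleanup in Step 2 is needed.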

STEP 2

Right click > Reformat HTML > Mend and Prettify

OR use the menu:

Tools > Reformat HTML > Mend and Prettify All HTML Files.

That should leave you with a list of ONLY the Song Names:

Spoiler:
Code:
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_4">Albumus Songimus 3</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_5">Albumus Songimus 4</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_7">Bulbumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_9">Callbumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_11">Dollbumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist02.xhtml#sigil_toc_id_13">Bartimus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist03.xhtml#sigil_toc_id_15">Cartimus Songimus 1</a></p>


Warning: If you mess up any of the Regex or Search/Replace steps, Sigil may remove important code while it is cleaning up the leftover <div>s. This is why you need to back up.
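
One way to catch a bad replace early is to count the songs before and after. The sketch below is only an illustration in Python, not a Sigil feature: the function names and the two sample strings are hypothetical, and it assumes the cleaned-up lines look exactly like the list above.

```python
import re


def count_song_divs(toc_html: str) -> int:
    """Count level-3 (song) entries in the original Sigil TOC markup."""
    return len(re.findall(r'<div class="sgc-toc-level-3">', toc_html))


def count_song_paragraphs(cleaned_html: str) -> int:
    """Count the <p class="tocthree"> lines left after Steps 1 and 2."""
    return len(re.findall(r'<p class="tocthree"><a [^>]+>[^<]+</a></p>',
                          cleaned_html))


# Paste the TOC code from before Step 1 here (two songs in this sample).
before = """\
<div class="sgc-toc-level-3">
  <a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus Songimus 1</a>
</div>
<div class="sgc-toc-level-3">
  <a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a>
</div>
"""

# Paste the result of Step 2 here.
after = """\
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>
"""

songs_before = count_song_divs(before)
songs_after = count_song_paragraphs(after)
print(songs_before, songs_after)
if songs_before != songs_after:
    print("Mismatch: go back to your backup copy and redo Step 1.")
```

If the two numbers differ, a song was lost or mangled somewhere in Steps 1 and 2.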

STEP 3

Now, to alphabetize these songs.

Run another Regex:

Search: (<a .+?>)(.+?)(</a>)
Replace: \2\1\3

What this does is capture the Song name and put it before the <a> link:

Before:

<p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>

After:

<p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>

Spoiler:
Code:
  <p class="tocthree">Albumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_2"></a></p>
  <p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>
  <p class="tocthree">Albumus Songimus 3<a href="../Text/Artist01.xhtml#sigil_toc_id_4"></a></p>
  <p class="tocthree">Albumus Songimus 4<a href="../Text/Artist01.xhtml#sigil_toc_id_5"></a></p>
  <p class="tocthree">Bulbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_7"></a></p>
  <p class="tocthree">Callbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_9"></a></p>
  <p class="tocthree">Dollbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_11"></a></p>
  <p class="tocthree">Bartimus Songimus 1<a href="../Text/Artist02.xhtml#sigil_toc_id_13"></a></p>
  <p class="tocthree">Cartimus Songimus 1<a href="../Text/Artist03.xhtml#sigil_toc_id_15"></a></p>
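
The same swap can be checked in Python. This is a sketch under the assumption that re.sub handles the pattern the same way Sigil does; the variable names are illustrative.

```python
import re

line = ('<p class="tocthree">'
        '<a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a>'
        '</p>')

# Group 1 = opening <a ...> tag, group 2 = song name, group 3 = </a>.
# The replacement puts the song name (group 2) in front of the empty link.
swapped = re.sub(r'(<a .+?>)(.+?)(</a>)', r'\2\1\3', line)
print(swapped)
```

With the song name now sitting right after the identical <p class="tocthree"> prefix, a plain line sort orders the lines by song name.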


STEP 4

Now just toss that HTML into any tool that sorts alphabetically for you (I use Notepad++, or you may want to use a website like Text Mechanic). It should alphabetize all the songs:

Spoiler:
Code:
  <p class="tocthree">Albumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_2"></a></p>
  <p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>
  <p class="tocthree">Albumus Songimus 3<a href="../Text/Artist01.xhtml#sigil_toc_id_4"></a></p>
  <p class="tocthree">Albumus Songimus 4<a href="../Text/Artist01.xhtml#sigil_toc_id_5"></a></p>
  <p class="tocthree">Bartimus Songimus 1<a href="../Text/Artist02.xhtml#sigil_toc_id_13"></a></p>
  <p class="tocthree">Bulbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_7"></a></p>
  <p class="tocthree">Callbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_9"></a></p>
  <p class="tocthree">Cartimus Songimus 1<a href="../Text/Artist03.xhtml#sigil_toc_id_15"></a></p>
  <p class="tocthree">Dollbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_11"></a></p>
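
If you would rather not use a separate editor, the sort itself is one line in Python. This is a minimal sketch: key=str.casefold (a case-insensitive sort) is an assumption, since the steps above do not say how the sorting tool treats upper and lower case.

```python
# The swapped lines from Step 3, in their original (Artist/Album) order.
lines = [
    '  <p class="tocthree">Albumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_2"></a></p>',
    '  <p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>',
    '  <p class="tocthree">Albumus Songimus 3<a href="../Text/Artist01.xhtml#sigil_toc_id_4"></a></p>',
    '  <p class="tocthree">Albumus Songimus 4<a href="../Text/Artist01.xhtml#sigil_toc_id_5"></a></p>',
    '  <p class="tocthree">Bulbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_7"></a></p>',
    '  <p class="tocthree">Callbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_9"></a></p>',
    '  <p class="tocthree">Dollbumus Songimus 1<a href="../Text/Artist01.xhtml#sigil_toc_id_11"></a></p>',
    '  <p class="tocthree">Bartimus Songimus 1<a href="../Text/Artist02.xhtml#sigil_toc_id_13"></a></p>',
    '  <p class="tocthree">Cartimus Songimus 1<a href="../Text/Artist03.xhtml#sigil_toc_id_15"></a></p>',
]

# Every line starts with the same prefix, so sorting whole lines
# effectively sorts by song name.
sorted_lines = sorted(lines, key=str.casefold)
print("\n".join(sorted_lines))
```

Keep in mind this is a plain text sort, so a title ending in "10" would land before one ending in "2".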


STEP 5

Stick the HTML back into Sigil and move the song names back into the links:

Search: <p class="tocthree">(.+?)(<a .+?>)(</a>)
Replace: <p class="tocthree">\2\1\3

That should reverse Step 3.

Before:

<p class="tocthree">Albumus Songimus 2<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>

After:

<p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>

Spoiler:
Code:
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_3">Albumus Songimus 2</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_4">Albumus Songimus 3</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_5">Albumus Songimus 4</a></p>
  <p class="tocthree"><a href="../Text/Artist02.xhtml#sigil_toc_id_13">Bartimus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_7">Bulbumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_9">Callbumus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist03.xhtml#sigil_toc_id_15">Cartimus Songimus 1</a></p>
  <p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_11">Dollbumus Songimus 1</a></p>
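
Here is the Step 5 regex checked in Python, again as a sketch that assumes re.sub matches the way Sigil does; the variable names are illustrative.

```python
import re

line = ('<p class="tocthree">Albumus Songimus 2'
        '<a href="../Text/Artist01.xhtml#sigil_toc_id_3"></a></p>')

# Group 1 = song name, group 2 = opening <a ...> tag, group 3 = </a>.
# The replacement moves the song name back between the tags.
restored = re.sub(r'<p class="tocthree">(.+?)(<a .+?>)(</a>)',
                  r'<p class="tocthree">\2\1\3', line)
print(restored)
```

The trailing </p> is outside the match, so it stays where it is.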


Now you have your fully alphabetized list of songs with links. Toss that in the Song Index at the end of your book.
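
If you are scripting this anyway, Steps 3 to 5 can be collapsed into a single sort keyed on the link text. This is a different technique from the swap/sort/swap-back steps above, shown only as a hedged alternative; the helper names song_title and sort_song_index are hypothetical.

```python
import re

# Captures the text between <a ...> and </a>.
TITLE = re.compile(r'<a [^>]+>([^<]+)</a>')


def song_title(line: str) -> str:
    """Return the case-folded link text, or the whole line if no link is found."""
    match = TITLE.search(line)
    return match.group(1).casefold() if match else line.casefold()


def sort_song_index(lines):
    """Sort Step 2 output by song name without rewriting the lines."""
    return sorted(lines, key=song_title)


step2_lines = [
    '<p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_11">Dollbumus Songimus 1</a></p>',
    '<p class="tocthree"><a href="../Text/Artist02.xhtml#sigil_toc_id_13">Bartimus Songimus 1</a></p>',
    '<p class="tocthree"><a href="../Text/Artist01.xhtml#sigil_toc_id_2">Albumus Songimus 1</a></p>',
]

index = sort_song_index(step2_lines)
print("\n".join(index))
```

The lines come out unchanged apart from their order, so there is nothing to swap back afterwards.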
Attached Files
File Type: epub ExampleArtistLyrics.epub (3.9 KB, 138 views)

Last edited by Tex2002ans; 07-15-2017 at 01:35 AM.