Old 06-25-2020, 12:36 AM   #17
Doitsu
Quote:
Originally Posted by Mister L
Just curious, should I give up on this or does anyone with the skills to make a plugin think it's a good idea?
As a stop-gap, I created a quick & dirty BeautifulSoup-based plugin that adds a title attribute to every h1..h6 heading, using the heading's own text as the value. (It doesn't check the TOC and it doesn't merge successive h1..h6 entries.)

It changes:

Code:
<h1 id="toc_marker-26">21</h1>

    <h2><span class="Cap">E</span><span class="SmallCap">N CHEMIN POUR</span> <span class="Cap">S</span><span class="SmallCap">HADAR</span> <span class="Cap">L</span><span class="SmallCap">OGOTH</span></h2>
to:

Code:
 <h1 id="toc_marker-26" title="21">21</h1>
  <h2 title="En chemin pour shadar logoth"><span class="Cap">E</span><span class="SmallCap">N CHEMIN POUR</span> <span class="Cap">S</span><span class="SmallCap">HADAR</span> <span class="Cap">L</span><span class="SmallCap">OGOTH</span></h2>
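
To see where the title value comes from, here is a rough standard-library sketch (not the plugin itself): it collects the text inside each heading, ignoring the span tags, and applies the same lower().capitalize() step. The HeadingText class and the sample string are made up for this illustration, and html.parser.HTMLParser stands in for BeautifulSoup's get_text().

```python
from html.parser import HTMLParser


class HeadingText(HTMLParser):
    """Hypothetical helper: collect the text of h1..h6 elements."""

    HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while we are inside a heading element
        self.parts = []   # text fragments of the current heading
        self.titles = []  # one derived title per non-empty heading

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self.depth:
            self.depth -= 1
            if self.depth == 0:
                text = "".join(self.parts).strip()
                self.parts = []
                if text:
                    # same case conversion as the plugin: ABC DEF > Abc def
                    self.titles.append(text.lower().capitalize())

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


sample = (
    '<h1 id="toc_marker-26">21</h1>\n'
    '<h2><span class="Cap">E</span><span class="SmallCap">N CHEMIN POUR</span> '
    '<span class="Cap">S</span><span class="SmallCap">HADAR</span> '
    '<span class="Cap">L</span><span class="SmallCap">OGOTH</span></h2>'
)

parser = HeadingText()
parser.feed(sample)
parser.close()
print(parser.titles)  # ['21', 'En chemin pour shadar logoth']
```

The span boundaries vanish because only the text nodes are joined, which is why "E" + "N CHEMIN POUR" ends up as one word.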
You can download it from my Dropbox.

The plugin code is:

Spoiler:
Code:
import os

# Not shown in the original post: assumed to be Sigil's bundled
# BeautifulSoup fork (sigil_bs4), which provides prettyprint_xhtml()
from sigil_bs4 import BeautifulSoup


def run(bk):
    # process all files
    for html_id, href in bk.text_iter():
        html = bk.readfile(html_id)
        file_name = os.path.basename(href)
        soup = BeautifulSoup(html, 'html.parser')
        orig_soup = str(soup)  # snapshot for the change check below

        # process all headings
        headings = soup.find_all(['h1', 'h2', 'h3', 'h4', 'h5', 'h6'])
        for heading in headings:
            # get_text() joins the text of all child elements (spans etc.)
            title = heading.get_text().strip()
            title = title.lower().capitalize() # ABC DEF > Abc def
            #title = title.title() # ABC DEF > Abc Def
            if title != '':
                heading['title'] = title

        # update html if the code was changed
        if str(soup) != orig_soup:
            new_html = soup.prettyprint_xhtml(
                indent_level=0, eventual_encoding="utf-8",
                formatter="minimal", indent_chars="  ")
            bk.writefile(html_id, str(new_html))
            print('{} updated.'.format(file_name))

    return 0


For English books, change the following section:

Code:
            title = title.lower().capitalize() # ABC DEF > Abc def
            #title = title.title() # ABC DEF > Abc Def
to:

Code:
            #title = title.lower().capitalize() # ABC DEF > Abc def
            title = title.title() # ABC DEF > Abc Def
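
One thing to watch with title(): Python treats an apostrophe as a word boundary, so headings with possessives or contractions come out oddly. A small sketch comparing the two options above with string.capwords() from the standard library, which I'm only suggesting as a possible third option (it isn't in the plugin); the heading text is a made-up example:

```python
import string

# hypothetical heading text, as get_text() might return it
heading = "THE DRAGON'S PEACE"

sentence_case = heading.lower().capitalize()  # plugin default
title_case = heading.title()                  # "English books" variant
capwords_case = string.capwords(heading)      # splits on whitespace only

print(sentence_case)  # The dragon's peace
print(title_case)     # The Dragon'S Peace
print(capwords_case)  # The Dragon's Peace
```

If you go that route, you'd need an import string at the top of the plugin and title = string.capwords(title) in place of the title() line.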

Last edited by Doitsu; 06-25-2020 at 01:20 AM.