12-01-2020, 07:10 PM   #32
EbookMakers
Automatically fill in the <title> tag of text pages

In the <head> section, a missing <title> tag causes an epubcheck error. You may also come across an empty tag or a placeholder such as (the placeholder depends on the language):
<title>Unknown</title> or <title></title>

In these cases, the regex-function looks up the title in the metadata and uses it to fill in the <title> tag in the <head> section of the xhtml pages. If the title in the metadata is empty or still has the default value “Unknown”, the function leaves the page as is. You can then fill in the <dc:title> tag in the opf, save the epub, reopen it in the editor and run the regex-function again.

The function is commented. If the epub is in a language other than English or French, you must adapt both the regex and the function by adding that language's equivalent of “Unknown”.
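
One way to keep the regex and the tuple in sync is to generate the regex from the list of words. This is only a sketch, not something the editor does for you: the helper names are made up, it assumes every word starts with a letter, and 'unbekannt' is a hypothetical extra entry that I have not checked against any localized default.

```python
import re

# Lowercase placeholder titles to treat as "no title".
# 'unknown' and 'inconnu(e)' come from this post; 'unbekannt' is a
# hypothetical entry used only to illustrate adding a language.
NO_TITLE = ('unknown', 'inconnu(e)', 'unbekannt')


def first_letter_any_case(word):
    """Return a regex fragment matching `word` with its first letter in either case."""
    head, tail = word[0], word[1:]
    # re.escape protects characters such as the parentheses in 'inconnu(e)'
    return '[' + head.upper() + head.lower() + ']' + re.escape(tail)


def build_title_regex(words):
    """Build the search expression from a tuple of lowercase placeholder words."""
    placeholders = '|'.join(first_letter_any_case(w) for w in words)
    return (r'<title>(?:' + placeholders + r')?</title>'
            r'|<head>(?:(?!<title).)+\K(</head>)')


# Paste the printed expression into the editor's search field
print(build_title_regex(NO_TITLE))
```

The same tuple can then be pasted into the function as no_title, so both sides always list the same words.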

The regex:

Code:
<title>(?:[Ii]nconnu\(e\)|[Uu]nknown)?</title>|<head>(?:(?!<title).)+\K(</head>)
Enable “Dot matches all” so that the dot also matches newlines.
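
To see which alternative fires on a given page, here is a small sketch that can be run outside the editor. It assumes plain Python with the standard `re` module, which does not support \K, so \K is dropped here: the second alternative then matches the whole span from <head> to </head>, whereas with \K only </head> is kept as the match. The sample pages and the classify helper are made up for the illustration.

```python
import re

# Same expression as in this post, minus \K (unsupported by the standard `re` module).
# re.DOTALL plays the role of the editor's "Dot matches all" option.
PATTERN = re.compile(
    r'<title>(?:[Ii]nconnu\(e\)|[Uu]nknown)?</title>'
    r'|<head>(?:(?!<title).)+(</head>)',
    re.DOTALL,
)


def classify(page):
    """Report which alternative of the regex matches `page`, if any."""
    m = PATTERN.search(page)
    if m is None:
        return 'no match'
    if m.group(1):
        # group 1 is only set when </head> is reached without meeting <title
        return 'title tag missing'
    return 'title empty or placeholder'


missing = '<html><head>\n  <meta charset="utf-8"/>\n</head><body/></html>'
placeholder = '<html><head>\n  <title>Unknown</title>\n</head><body/></html>'
filled = '<html><head>\n  <title>Walden</title>\n</head><body/></html>'

for page in (missing, placeholder, filled):
    print(classify(page))
```

The part (?:(?!<title).)+ advances one character at a time and refuses to step onto "<title", which is why a page that already has a real title is left alone.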

The function:

Code:
# execute the function with this regex:
# <title>(?:[Ii]nconnu\(e\)|[Uu]nknown)?</title>|<head>(?:(?!<title).)+\K(</head>)
# Dot matches all

from xml.sax.saxutils import escape

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Regex-function that fills in the <title> tag of xhtml files with the
    # <dc:title> of the opf if the tag is missing, empty or 'Unknown'
    # (or its localized equivalent)

    # This tuple and the regex should be adapted to the language of the epub
    # Add to this tuple the string to target, in your language
    #  +++ Must be in lower case +++, since the title is lowercased before the test
    no_title = ('unknown', 'inconnu(e)')

    # 'is_dc_title' is true if metadata.title is defined
    # Warning: if there is no <dc:title> in the opf, metadata.title takes
    # the value 'Unknown' or its localized value (e.g. Inconnu(e) for French)
    is_dc_title = bool(metadata.title) and metadata.title.lower() not in no_title

    if is_dc_title:
        # escape &, < and > so that the inserted title stays well-formed xhtml
        new_tag = "  <title>" + escape(metadata.title) + "</title>"

    # no capturing group: <title> is empty or 'Unknown'
    # (a group is captured only if we reach </head> without finding <title>)
    if not match.group(1):
        if is_dc_title:
            title = new_tag
        else:
            title = match.group()

    # found (</head>), thus the <title> tag is missing
    else:
        if is_dc_title:
            title = new_tag + '\n' + match.group(1)
        else:
            title = match.group(1)
            ######## Shall we fill in a tag if none? ###########
            # uncomment the line below if you want to write <title></title>
            # when the <title> tag is missing and <dc:title> is not defined;
            # if it stays commented, the tag will still be missing
            # title = "  <title></title>\n" + title

    return title
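
If you adapt the function to another language, it can help to check the branches before running it on a whole book. The sketch below is one way to do that in plain Python, outside the editor. Everything in it is an assumption made for the test: FakeMatch and run are hypothetical helpers, the metadata object is reduced to a single title attribute, and the replace shown is a condensed stand-in for the branching logic of the function, kept only so the example runs on its own.

```python
from types import SimpleNamespace

NO_TITLE = ('unknown', 'inconnu(e)')


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    """Condensed stand-in for the function in this post (same branches, fewer comments)."""
    has_title = bool(metadata.title) and metadata.title.lower() not in NO_TITLE
    if not has_title:
        # nothing usable in the metadata: give back what was matched
        return match.group(1) or match.group()
    tag = "  <title>" + metadata.title + "</title>"
    if match.group(1):
        # </head> was captured, so the tag was missing: insert it before </head>
        return tag + '\n' + match.group(1)
    return tag


class FakeMatch:
    """Hypothetical stand-in for the match object passed to the function."""

    def __init__(self, whole, closing_head=None):
        self._whole = whole
        self._closing_head = closing_head

    def group(self, index=0):
        return self._closing_head if index == 1 else self._whole


def run(match, title):
    """Call replace() with a fake metadata object that only has .title."""
    metadata = SimpleNamespace(title=title)
    return replace(match, 1, 'page.xhtml', metadata, {}, {}, {})


placeholder = FakeMatch('<title>Unknown</title>')  # first alternative matched
missing = FakeMatch('</head>', '</head>')          # second alternative matched

print(repr(run(placeholder, 'Walden')))
print(repr(run(placeholder, 'Unknown')))
print(repr(run(missing, 'Walden')))
print(repr(run(missing, None)))
```

To test your own version, paste it in place of the condensed replace and keep the rest unchanged.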

Last edited by EbookMakers; 12-08-2020 at 04:31 AM.