Old 11-18-2013, 04:12 AM   #535
ShellShock
Suggestion for more advanced Smarten punctuation options

I posted about this last year, so apologies for the repetition. I really like this plug-in, but I find "Smarten punctuation" a bit of a blunt tool, so I have made my own version of that code in the plug-in:

Code:
        def smarten_punctuation_for_page(html):
            preprocessor = HeuristicProcessor(None, self.log)
            start = 'calibre-smartypants-'+str(uuid4())
            stop = 'calibre-smartypants-'+str(uuid4())
            html = html.replace('<!--', start)
            html = html.replace('-->', stop)
            html = preprocessor.fix_nbsp_indents(html)
            # Do not use Calibre smartyPants.
            # html = smartyPants(html)
            html = html.replace(start, '<!--')
            html = html.replace(stop, '-->')
            # convert ellipsis to entities to prevent wrapping
            html = re.sub(r'(?u)(?<=\w)\s?(\.\s?){2}\.', '&hellip;', html)
            # convert double dashes to em-dash
            html = re.sub(r'\s--\s', u'\u2014', html)
            # Convert en-dash to em-dash.
            html = html.replace(u'\u2013', u'\u2014')
            # Convert space, hyphen, space to em-dash.
            html = re.sub(r'\s-\s', u'\u2014', html)
            # The patterns below are non-raw unicode literals so that \u2014
            # etc. are real characters; a raw string would pass the escape
            # through to re unexpanded.
            # Convert opening quote, hyphen to em-dash.
            html = re.sub(u'([\'"\u2018\u201C])-(\\w)', u'\\1\u2014\\2', html)
            # Convert hyphen, closing punctuation to em-dash.
            html = re.sub(u'(\\w)-([\'"\u2019\u201D?!,.])', u'\\1\u2014\\2', html)
            # Strip whitespace around em-dashes.
            html = re.sub(u'\\s*\u2014\\s*', u'\u2014', html)
            return substitute_entites(html)
As you can see, I do not use Calibre's smartyPants, and I have added some fixes for em-dashes. Perhaps the plug-in UI could have an "Advanced" options section to give the user finer control over exactly what Smarten punctuation does?
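
To illustrate what I mean by finer control, here is a rough stand-alone sketch of the dash rules with each one behind its own switch. The function name `smarten_dashes` and all the option names are made up for illustration; they are not part of the plug-in or of Calibre. It works on plain text, leaves out the Calibre-specific steps (`HeuristicProcessor`, `substitute_entites`) and the ellipsis rule, and assumes Python 3.

```python
import re

EM_DASH = '\u2014'
EN_DASH = '\u2013'
OPENERS = '\'"\u2018\u201C'        # characters a dash may directly follow
CLOSERS = '\'"\u2019\u201D?!,.'    # characters a dash may directly precede

# Hypothetical option names, one per rule; all enabled by default.
DEFAULT_OPTIONS = {
    'double_hyphen': True,          # " -- "  -> em-dash
    'en_dash': True,                # en-dash -> em-dash
    'spaced_hyphen': True,          # " - "   -> em-dash
    'hyphen_by_punctuation': True,  # "-word and word-" -> em-dash
    'trim_spacing': True,           # remove whitespace around em-dashes
}


def smarten_dashes(text, options=None):
    """Apply the enabled dash rules to text and return the result."""
    opts = dict(DEFAULT_OPTIONS)
    opts.update(options or {})

    if opts['double_hyphen']:
        text = re.sub(r'\s--\s', EM_DASH, text)
    if opts['en_dash']:
        text = text.replace(EN_DASH, EM_DASH)
    if opts['spaced_hyphen']:
        text = re.sub(r'\s-\s', EM_DASH, text)
    if opts['hyphen_by_punctuation']:
        # The replacement is built by concatenation so the em-dash is a
        # real character and only \1 and \2 are template escapes.
        repl = r'\1' + EM_DASH + r'\2'
        text = re.sub('([' + OPENERS + r'])-(\w)', repl, text)
        text = re.sub(r'(\w)-([' + CLOSERS + '])', repl, text)
    if opts['trim_spacing']:
        text = re.sub(r'\s*' + EM_DASH + r'\s*', EM_DASH, text)
    return text


if __name__ == '__main__':
    print(smarten_dashes('He stopped -- then went on.'))
    # Same input, but leave " - " alone.
    print(smarten_dashes('one - two', {'spaced_hyphen': False}))
```

An "Advanced" dialog would then only need to map its checkboxes onto a dictionary like `DEFAULT_OPTIONS`; ordinary hyphenated words such as "well-known" are untouched by any of the rules.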