Old 03-31-2016, 10:57 PM   #26
Tex2002ans
Quote:
Originally Posted by eschwartz View Post
EDIT: Yes, DiapDealer's PunctuationSmarten plugin is your friend.
I second this. There is also "Diap's Editing Toolbag" (the Calibre version of the plugin), which I use all the damn time:

https://www.mobileread.com/forums/sho....php?p=2980740

I find it to be much more helpful than Calibre's built-in Smarten Punctuation because this one has options (such as "don't touch ellipses" or "don't touch dashes").

Side Note: My personal method is three rounds:

Round #1: Diapdealer's Toolbag, Smarten Punctuation.

Round #2: I run the book through a lot of regex fixes (for common errors I have come across). I do my usual code cleanup + OCR fixing + everything else.

Round #3: As the final step, I run the final text through Toxaris's Dialogue Check.

Quote:
Originally Posted by HarryT View Post
Tools tend to have problems with words with initial apostrophes, particularly in books where the single apostrophe is also used for speech marks.
Yep, the automated tools do make quite a few mistakes (typically around em dashes, italics/other HTML tags, [...]).

The books I work on don't really have too many of the "'tis a jolly good day" + "'twas the night before Christmas" + "go get 'em", but I have this in my Saved Regexes:

Search: ‘(Em|em|Til|til|Tis|tis|Twas|twas)
Replace: ’\1

You can easily append whatever words you need, separated by a pipe, which makes it easier to find the Left Single Quotes (wrong) and change them into Right Single Quotes (correct).

(I believe Diap's Toolbag also has an "exception" list if you wanted to take that route.)
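
For anyone who prefers a script over Saved Regexes, here is a rough Python sketch of the same word-list idea. The function name and the `WORDS` list are made up for illustration, and the trailing `\b` is an addition that is not in the Search expression above (it stops "‘emerald" from being treated as "‘em"):

```python
import re

# Hypothetical word list for illustration; append whatever words you need.
WORDS = ["em", "til", "tis", "twas"]

def fix_leading_apostrophes(text, words=WORDS):
    # Build "Em|em|Til|til|..." from the list: each word in capitalized
    # and lowercase form, joined with a pipe.
    alternatives = "|".join(
        re.escape(form) for w in words for form in (w.capitalize(), w)
    )
    # Left Single Quote, then one of the words, then a word boundary.
    # The \b is an assumption added here, not part of the original regex.
    pattern = re.compile("‘(" + alternatives + r")\b")
    # Swap the Left Single Quote for a Right Single Quote (apostrophe).
    return pattern.sub(r"’\1", text)

print(fix_leading_apostrophes("‘Tis true, go get ‘em."))
```

Running it over a whole book is still a blind Replace All, so it is worth spot-checking the changes afterwards.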

I also have this Regex to handle years, such as "’90s":

Search: ‘([0-9])
Replace: ’\1
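
The same Search/Replace pair can be sketched in Python like this (the function name is just a placeholder). One caveat I am fairly sure of: a quotation that opens with a digit, such as ‘1984 is a great book’, would also get changed, so stepping through the matches is safer than Replace All.

```python
import re

def fix_year_apostrophes(text):
    # A Left Single Quote directly before a digit becomes a Right Single
    # Quote, mirroring the Search/Replace pair above.
    return re.sub(r"‘([0-9])", r"’\1", text)

print(fix_year_apostrophes("Back in the ‘90s"))
```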

Quote:
Originally Posted by JSWolf View Post
What I would like is a way to convert UK style quotes to US style quotes because UK style quotes just look unnatural.
I mean come on JSWolf, you can't be serious. That is just because you primarily read US material.

I suspect you have already come across this link the multitude of times Toxaris and I have posted it:

https://en.wikipedia.org/wiki/Quotat...ious_languages

All different languages use all different types of quotation marks (High/Low, Left/Right, Quotes/Guillemets, [...]). If you read Finnish books you might be used to ”…” instead.

There is no easily automated way to go from UK to US... you would have to manually replace all Left/Right Single Quotes with their Double Quote equivalents (and change all Double -> Single).

Then you try to catch a lot of the accidentally converted apostrophes like:

Search: ([a-zA-Z])”([a-z])
Replace: \1’\2

And step through and try to catch apostrophes at the end of words:

Search: ([s])”(\s)
Replace: \1’\2
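
Pulling the steps above together, a minimal Python sketch might look like the following. The function names are hypothetical, and it assumes the book already uses curly quotes throughout. The mid-word fix is applied automatically, but the end-of-word check only lists candidates, because a ” after an "s" could just as easily be a real closing quote.

```python
import re

# Swap single curly quotes with double curly quotes and vice versa.
SWAP = str.maketrans({"‘": "“", "’": "”", "“": "‘", "”": "’"})

def uk_to_us_quotes(text):
    # Step 1: swap every single quote with its double equivalent
    # (and every double with its single) in one pass.
    swapped = text.translate(SWAP)
    # Step 2: a ” stuck between two letters was really an apostrophe
    # (don”t -> don’t), same as the first Search/Replace above.
    return re.sub(r"([a-zA-Z])”([a-z])", r"\1’\2", swapped)

def possessive_candidates(text, width=15):
    # Step 3: list each s” followed by whitespace, with some context,
    # so every hit can be checked by hand instead of replaced blindly.
    hits = []
    for m in re.finditer(r"([s])”(\s)", text):
        start = max(0, m.start() - width)
        hits.append(text[start:m.end() + width])
    return hits

converted = uk_to_us_quotes("‘I don’t think he said “no”,’ she said.")
print(converted)
print(possessive_candidates(uk_to_us_quotes("‘The boys’ room,’ he said.")))
```

Even with something like this, the manual pass is where most of the time goes.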

And a ton more elbow grease.

I have done UK -> US quotes a handful of times, and it is brutal/boring work.

Quote:
Originally Posted by Jellby View Post
It all boils down to distinguishing between right single quote and apostrophe. Unfortunately, in Unicode they are the same character (a design mistake, I'd say).
Hmmmm... yeah this does seem to be the crux of a lot of the Smarten Punctuation issues. It would simplify a lot. :P

Side Note: Does everyone here remember the glorious Smarten Punctuation (plus other typography) discussion we had back in 2014? (My gods, how time flies):

https://www.mobileread.com/forums/sho...58#post2912458

Last edited by Tex2002ans; 03-31-2016 at 11:08 PM.