09-02-2014, 07:13 PM   #24
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by DiapDealer
[...] It just guesses what should be an opening quote or a closing quote or an apostrophe. It's guessing extremely well in my experience. Granted; I'm sure there's certain complex quotation situations (or extra spacing between words and quotation marks) where the algorithm may fail ... but I gave up worrying about it. I don't deal with anything that complex (quotationally speaking). It handles nested quotes and continuation quotes (no closing quote for the previous para) just fine in my experience.
There was also this topic a few years back which discussed Smarten Punctuation breaking due to spaces before/after quotation marks:

https://www.mobileread.com/forums/sho...d.php?t=171920

That ALSO reminded me of another case where I have seen it break: when a closing quote sits right before or after an em or en dash. Again, I don't have any specific examples on hand, but I can recall it happening.
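
A rough sketch of how those spots could be flagged for manual review before smartening. This is only my assumption about what is worth checking (straight quotes with whitespace on both sides, or touching an en/em dash); `find_suspect_quotes` is a made-up helper, not part of Calibre, Sigil, or any plugin:

```python
import re

# En dash and em dash; extend this if the text also uses "--" for dashes.
DASHES = "\u2013\u2014"

# Straight quotes that a guessing algorithm has little context for:
#   1. a quote with whitespace on both sides
#   2. a quote immediately after a dash
#   3. a quote immediately before a dash
SUSPECT = re.compile(
    rf"""(?<=\s)["'](?=\s)|(?<=[{DASHES}])["']|["'](?=[{DASHES}])"""
)


def find_suspect_quotes(text, context=15):
    """Return (offset, snippet) pairs for quotes worth checking by hand."""
    hits = []
    for match in SUSPECT.finditer(text):
        start = max(0, match.start() - context)
        hits.append((match.start(), text[start:match.end() + context]))
    return hits


if __name__ == "__main__":
    sample = 'He said "wait\u2014" and left. She said " hello " too.'
    for offset, snippet in find_suspect_quotes(sample):
        print(offset, repr(snippet))
```

The same pattern should also work as a plain regex search in an editor that supports lookbehind, so the hits can be fixed by hand before running any smartening.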

And I thought of another example while I was OCRing last night, where "quotations" just get MANGLED. I deal with a lot of equations in text as well, and many of them use "prime", "double prime", "triple prime", and so on: x', y', m'', t'''. Again, I would avoid using the actual "prime" characters and stick with the dumb straight-quote equivalents (because of font issues on certain devices).

In some cases there are HUNDREDS of "primes" throughout the text, and running the smartening algorithms will completely mangle them (and mangle the quotation marks that follow).
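
To make the prime problem concrete, here is a minimal sketch. `naive_smarten` is a deliberately simple stand-in that I made up for illustration; it is NOT the algorithm Calibre or any plugin actually uses. `smarten_keep_primes` shows one possible workaround, assuming primes only ever follow single-letter variables: hide them behind placeholders, smarten, then put them back.

```python
import re


def naive_smarten(text):
    """Hypothetical smartener: a quote after start/whitespace/bracket opens,
    every other quote closes (or is treated as an apostrophe)."""
    text = re.sub(r'(^|[\s(\[])"', "\\1\u201c", text)
    text = text.replace('"', "\u201d")
    text = re.sub(r"(^|[\s(\[])'", "\\1\u2018", text)
    text = text.replace("'", "\u2019")
    return text


# Heuristic (an assumption): a prime is 1-3 straight apostrophes right after
# a single-letter variable, and not followed by another letter or apostrophe.
PRIME = re.compile(r"(?<![A-Za-z])([A-Za-z])('{1,3})(?![A-Za-z'])")


def smarten_keep_primes(text):
    """Protect primes with NUL-delimited placeholders, smarten, then restore."""
    held = []

    def hold(match):
        held.append(match.group(2))
        return f"{match.group(1)}\x00{len(held) - 1}\x00"

    protected = PRIME.sub(hold, text)
    smart = naive_smarten(protected)
    return re.sub(r"\x00(\d+)\x00", lambda m: held[int(m.group(1))], smart)


if __name__ == "__main__":
    sample = "So x', y', m'', t'''."
    print(naive_smarten(sample))        # primes come back as curly apostrophes
    print(smarten_keep_primes(sample))  # primes survive untouched
```

The single-letter rule will misfire on things like a quoted 'I', so on a real book the protected matches would still need a quick skim.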

Quote:
Originally Posted by DiapDealer
You may appreciate the plugin's ability to consult a user-defined, custom list of words that start with apostrophes.
Sounds fantastic. Next time I have to run it, I will let you know. Currently, I have another large journal I am OCRing. This time, instead of a ~2 million word journal, it is just a lowly ~1.1 million words.

Quote:
Originally Posted by BetterRed
@Tex2002ans - Is this it ==>> Find straight quotes in the text

I searched for threads in the Editor with posts by Tex....
Yes yes, I believe that might have been the topic. I knew it was hiding there somewhere. Usually I am good at hunting down these older posts (or stumbling upon other ones, like the Tex2002ans + LaTeX post you mentioned!).

I really have to get around to organizing/categorizing older posts. So much good information just gets lost in the abyss!