#1 |
Zealot
Posts: 103
Karma: 57138
Join Date: May 2010
Device: Sony 505, iPad 1 & 3, Galaxy Note 8.1
|
Converting UK punctuation to US
I'm trying to find a tool or utility to convert files from UK punctuation to US (for example, single quotes for dialogue to double quotes). Any help would be appreciated.
|
#2 | |
Grand Sorcerer
Posts: 5,685
Karma: 24031401
Join Date: Dec 2010
Device: Kindle PW2
|
Quote:
For example, you could use the following very simple expressions in Sigil and Calibre to replace single quotes with double quotes.
Find: ‘(.*?)’
Replace: “\1”
(Make sure to select Regex from the Mode dropdown box.)
Note that you'll probably also have to convert British spellings to American spellings using VarCon or a similar tool.
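For anyone scripting this outside Sigil or Calibre, the same Find/Replace can be sketched in Python. This is only a minimal illustration: the function name `single_to_double` is made up here, and the pattern assumes curly quotes that open and close on the same line.

```python
import re

# Non-greedy match from an opening single quote to the next closing one.
# "." does not cross line breaks, so a quote must open and close on one line.
UK_QUOTED = re.compile(r"‘(.*?)’")

def single_to_double(text: str) -> str:
    """Rewrite each ‘...’ span as “...” (hypothetical helper name)."""
    return UK_QUOTED.sub(r"“\1”", text)

print(single_to_double("‘Good morning,’ she said."))
# → “Good morning,” she said.
```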
|
#3 |
Zealot
Posts: 103
Karma: 57138
Join Date: May 2010
Device: Sony 505, iPad 1 & 3, Galaxy Note 8.1
|
My problem with a simple regex is things like possessives, or any other case where you have a ' in the middle of a line. That's why I was looking for a tool that does a little more analysis of the text before making changes. I know it isn't easy to just convert them; I just wondered if anyone knew of a tool that does so.
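A short Python sketch shows the failure being described. The sample sentence and the name `naive_convert` are invented for illustration; the pattern is the plain "opening quote up to the next ’" regex.

```python
import re

def naive_convert(text: str) -> str:
    # Match from ‘ to the NEXT ’, whatever that ’ really is.
    return re.sub(r"‘(.*?)’", r"“\1”", text)

sample = "‘It’s the dog’s bowl,’ he said."
print(naive_convert(sample))
# → “It”s the dog’s bowl,’ he said.
```

The first apostrophe is taken for the closing quote, and the real closing quote is left unconverted.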
|
#4 | |
Whatever...
Posts: 197
Karma: 1114225
Join Date: Feb 2015
Location: Austria
Device: PocketBook InkPad 840, Touch HD 2
|
Quote:
I think that any ’ followed by a letter is an apostrophe, and any ’ preceded by a letter is an apostrophe, too? Because if it were a closing quote, it would be preceded by a punctuation mark? If I'm right, then you can use regular expressions to replace the ’ in ’[a-zA-Z] and [a-zA-Z]’ with anything you like that doesn't appear in the text, then replace single quotes with double quotes (if the opening and closing ones occur in equal numbers, that's a good sign), and finally restore the apostrophes. And then check if this really worked.
Hm, you may still have to deal with nested quotes... replace opening and closing double quotes with something else before you do the above, and replace them with single ones as the final step. Never entirely trust any automated process, though...
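The placeholder steps above could be sketched in Python roughly as follows. The function name and the three private-use placeholder characters are assumptions made for this example (any characters that never occur in the book would do), and the code inherits the rule's blind spot: a closing quote directly after a letter is treated as an apostrophe.

```python
import re

# Hypothetical placeholders: private-use code points assumed absent from the text.
APOS = "\uE000"     # stands in for apostrophes
OPEN_D = "\uE001"   # stands in for original “
CLOSE_D = "\uE002"  # stands in for original ”

def uk_to_us_quotes(text: str) -> str:
    # 1. Park the existing double quotes (the nested level in UK style).
    text = text.replace("“", OPEN_D).replace("”", CLOSE_D)
    # 2. A ’ followed or preceded by a letter is assumed to be an apostrophe.
    text = re.sub(r"’(?=[a-zA-Z])", APOS, text)
    text = re.sub(r"(?<=[a-zA-Z])’", APOS, text)
    # 3. The remaining single quotes are treated as dialogue quotes.
    text = text.replace("‘", "“").replace("’", "”")
    # 4. The former double quotes become the nested single quotes.
    text = text.replace(OPEN_D, "‘").replace(CLOSE_D, "’")
    # 5. Put the apostrophes back.
    return text.replace(APOS, "’")

print(uk_to_us_quotes("‘It’s the dog’s bowl,’ he said."))
# → “It’s the dog’s bowl,” he said.
print(uk_to_us_quotes("‘He said “no” to me,’ she said."))
# → “He said ‘no’ to me,” she said.
```

Comparing `result.count("“")` with `result.count("”")` afterwards gives the "same numbers" sanity check mentioned in the post.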
|
#5 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
I'm afraid it's not so: you can have ‘single’ words or phrases between quotes, not to mention mistakes where the punctuation may be missing. It may be possible to detect these cases by keeping track of whether a quote has been opened or not, though.
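Keeping track of whether a quote has been opened could look something like the small state machine below. It is only a sketch under simple assumptions: quotes are not nested, every ‘ opens a quote, and a ’ closes the open quote unless a letter follows it. The function name is hypothetical, and a plural possessive inside an open quote (the dogs’ bowl) would still be misread as a closing quote.

```python
def convert_tracking_state(text: str) -> str:
    out = []
    is_open = False  # True while we are inside a ‘...’ span
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "‘":
            out.append("“")
            is_open = True
        elif ch == "’" and is_open and not nxt.isalpha():
            # A quote is open and no letter follows: treat as the closing quote.
            out.append("”")
            is_open = False
        else:
            # Everything else, including apostrophes, is copied unchanged.
            out.append(ch)
    return "".join(out)

print(convert_tracking_state("He called it a ‘mistake’ near the dogs’ bowl."))
# → He called it a “mistake” near the dogs’ bowl.
```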
|
#6 | |
Whatever...
Posts: 197
Karma: 1114225
Join Date: Feb 2015
Location: Austria
Device: PocketBook InkPad 840, Touch HD 2
|
Quote:
A faint type of both characters may be found in the Surinam Yarico of Captain John Gabriel Stedman, whose ‘Narrative of a Five Years’ Expedition’ appeared in 1796.
So, I guess, it cannot be done without manual checking. Am I right that an apostrophe at the end of a word will only appear after an s? If that's true, then it should be possible to search for [sS]’[ .,;:!?—] (there shouldn't be too many hits) and replace the ones that are closing quotes with an unambiguous placeholder, and for the rest the rule "if followed or preceded by a letter it's an apostrophe" applies? Does this work now?
|
#7 | |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Quote:
(Not to mention that occasionally you can find the apostrophe after some s-sound which is not written with s, like -x or -ce.)
Last edited by Jellby; 03-21-2015 at 11:12 AM.
|
#8 | |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Probably more often seen in older non-US/Canada publications.
BR
|
#9 |
Whatever...
Posts: 197
Karma: 1114225
Join Date: Feb 2015
Location: Austria
Device: PocketBook InkPad 840, Touch HD 2
|
Yes, sorry, you're both right of course.
It will have to be [a-zA-Z]’[ .,;:!?—] then, which, depending on the text, may mean having to look at quite a number of ’ marks.
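Since these cases need a human decision anyway, a script can at least list them with some context. The sketch below uses the pattern from this post; the function name `ambiguous_marks` and the `context` parameter are invented for the example.

```python
import re
from typing import Iterator, Tuple

# A ’ after a letter and before a space or punctuation mark:
# either a closing quote or a word-final apostrophe.
AMBIGUOUS = re.compile(r"[a-zA-Z]’[ .,;:!?—]")

def ambiguous_marks(text: str, context: int = 20) -> Iterator[Tuple[int, str]]:
    """Yield (offset of the ’, surrounding snippet) for manual review."""
    for m in AMBIGUOUS.finditer(text):
        pos = m.start() + 1  # index of the ’ itself
        yield pos, text[max(0, pos - context): pos + context + 1]

text = "whose ‘Narrative of a Five Years’ Expedition’ appeared in 1796."
for pos, snippet in ambiguous_marks(text):
    print(pos, repr(snippet))
```

On this sample both the possessive in Years’ and the closing quote after Expedition’ are reported, which is exactly the ambiguity under discussion.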
#10 | |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
In more than a few non-US/Can publications I've found that real apostrophes are used where they ought to be used (contractions and possessives) and single quotes are used around dialogue and quotes. If that's true then maybe just convert the quotes and leave apostrophes alone to fulfill their designated purpose.
<sa-joke>We'll have a shortage of closing single quotes if they keep getting used as apostrophes, just as there's a shortage of semicolons since Algol and its successors started using them as full stops.<\sa-joke>
BR
|
#11 | |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Quote:
When I code ebooks, I sometimes use ’ for the quote and ’ for the apostrophe. These are just synonyms for the same U+2019 character, but at least they are easy to tell apart in the code.
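To illustrate the idea of two spellings for one character, the sketch below assumes the named entity `&rsquo;` is reserved for closing quotes and the numeric reference `&#8217;` for apostrophes. That assignment and the function name are assumptions made for this example, not something stated in the post. Both decode to U+2019, but in the source they can be told apart, so only the quotes get converted.

```python
import html

# Both notations decode to the same character, U+2019.
assert html.unescape("&rsquo;") == html.unescape("&#8217;") == "\u2019"

def convert_marked_quotes(xhtml: str) -> str:
    """Turn single quotes into double quotes, touching only the named entities.

    Assumes apostrophes were written as &#8217; and are therefore left alone.
    """
    return xhtml.replace("&lsquo;", "&ldquo;").replace("&rsquo;", "&rdquo;")

print(convert_marked_quotes("&lsquo;It&#8217;s late,&rsquo; she said."))
# → &ldquo;It&#8217;s late,&rdquo; she said.
```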
|
#12 |
Whatever...
Posts: 197
Karma: 1114225
Join Date: Feb 2015
Location: Austria
Device: PocketBook InkPad 840, Touch HD 2
|
Just read this today:
This Is How ‘Interstellar’’s Co-Writer Wanted the Movie to End
Unless I'm mistaken, the first ’ is the closing quote, and the second one the apostrophe... (and in OCR'd text, ’’ can also stand for mis-recognized closing double quotes when you have nested quotes...)
And, in headings or in verses, [a-zA-Z]’$ can be either an apostrophe or a closing quote...
I think that using the same symbol for apostrophes and closing quotes really hasn't been the best idea the British ever had. I'm not an expert on Unicode and stay with Windows-1252 whenever I can, but even Unicode doesn't really help to clear up the mess, it seems: http://www.unicode.org/L2/L2007/07241-mirroring.txt
So... a bit of playing with regular expressions, and then we're back to good old proofreading/editing...
#13 | |
null operator (he/him)
Posts: 21,633
Karma: 29710510
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
It could be that the 'apostrophes' I sometimes see are primes, as in feet, inches, minutes and seconds marks. Pretty sure they're not straight quotes. I think I see them most in public domain official & legal texts from courts and tribunals etc, and media transcripts; maybe they come from the transcription technology/services they use. I don't write the stuff, or republish it, I just download it, skim read it and file it. But I think I have also seen them in commercial books from the UK, which would be factual books, not fiction.
BR
|
#14 | |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
And typically when I run across these source files, just one set of quotes is completely wrong. For example, all right single quotes are acute accents (´), while all left single quotes are the dumb (straight) version.
Code:
This is an 'example´ of the 'mess´ I am 'talking´ about.
Last edited by Tex2002ans; 03-23-2015 at 12:49 AM.
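A file in that state can be repaired with two substitutions, sketched below in Python. The function name is made up, and the rules are assumptions that fit this particular mess: a straight ' at the start of a word is an opening quote, and every acute accent is a closing quote or apostrophe. Leading-apostrophe contractions such as 'tis would be turned into opening quotes, so the result still needs checking.

```python
import re

def repair_mixed_quotes(text: str) -> str:
    # Straight ' not preceded by a letter but followed by one: opening quote.
    text = re.sub(r"(?<![A-Za-z])'(?=[A-Za-z])", "‘", text)
    # Stand-alone acute accent (U+00B4): closing quote or apostrophe.
    return text.replace("\u00b4", "’")

print(repair_mixed_quotes("This is an 'example\u00b4 of the 'mess\u00b4 I am 'talking\u00b4 about."))
# → This is an ‘example’ of the ‘mess’ I am ‘talking’ about.
```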
|
#15 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
If it's the opposite, it could be a LaTeX source, or a text file influenced by that:
Code:
This is an `example' of how LaTeX reads `single' and ``double'' quotes
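Converting that LaTeX-style input to curly quotes is mostly a matter of replacing the doubled marks before the single ones. The Python sketch below uses a hypothetical function name and assumes the text contains no backticks or straight quotes with any other meaning, such as code samples.

```python
def latex_quotes_to_unicode(text: str) -> str:
    # Doubles first; otherwise `` would be consumed as two single quotes.
    text = text.replace("``", "“").replace("''", "”")
    # Whatever is left is a single opening quote, closing quote or apostrophe.
    return text.replace("`", "‘").replace("'", "’")

print(latex_quotes_to_unicode("This is an `example' of how LaTeX reads `single' and ``double'' quotes"))
# → This is an ‘example’ of how LaTeX reads ‘single’ and “double” quotes
```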