MobileRead Forums - View Single Post

Danger · 06-29-2012, 10:56 AM

Quote:

Originally Posted by ondrejandrej

...to replace ASCII quotation marks with the proper Unicode quotes in Czech (example: „quoted text“). Many other languages use the same quotation marks. I use regular expressions to replace the opening ASCII quotation mark with a different character than the closing one. If I do it in Code view, the quotes in HTML tags are replaced as well (bad)....

I do this when replacing straight quote marks with proper curly quotes:

FIND: >"
REPLACE: >“

This replaces every quote mark that comes right after a >, which usually denotes the end of an HTML tag.

Next, replace all " in front of a capital letter:

FIND: "([A-Z])
REPLACE: “\1

That should take care of most if not all quotes that start a sentence.

I then go on to replace the ending quote of a sentence with a curly quote mark:

FIND: "</
REPLACE: ”</

Then replace all quote marks after a period, comma, question mark, or exclamation mark:

FIND: ([.,?!])"
REPLACE: \1”
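For anyone who wants to try the four "Replace All" passes outside the editor, here is a rough Python sketch of the same sequence. The function name `curl_quotes` and the sample paragraph are made up for illustration, and Python's `re` module stands in for the editor's regex engine, so treat it as an approximation rather than exactly what the editor does. Step 4 puts the matched punctuation back with a backreference so a comma or question mark is not turned into a period.

```python
import re

def curl_quotes(html: str) -> str:
    """Apply the four Replace All passes in the order described above."""
    # 1. A straight quote right after '>' opens a quotation.
    html = html.replace('>"', '>“')
    # 2. A straight quote in front of a capital letter opens a quotation.
    html = re.sub(r'"([A-Z])', r'“\1', html)
    # 3. A straight quote right before a closing tag closes a quotation.
    html = html.replace('"</', '”</')
    # 4. A straight quote after . , ? ! closes a quotation;
    #    \1 keeps whichever punctuation mark was matched.
    html = re.sub(r'([.,?!])"', r'\1”', html)
    return html

# Hypothetical sample: one attribute (must stay untouched) and two quotations.
sample = '<p class="calibre1">"Hello," she said. "Are you there?"</p>'
result = curl_quotes(sample)
print(result)
```

In this sample the class attribute keeps its straight quotes. An attribute value that starts with a capital letter (say class="Title") would be caught by step 2, so it is worth checking for those before running it on a real file.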

I usually step through the matches to make sure I got all the marks:

FIND: “(.*?)”
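The same check can be sketched in Python to list every opening/closing pair at once instead of stepping through them. The sample text is invented, and note that in Python the `.` does not cross line breaks by default, which may or may not match your editor's settings.

```python
import re

# Non-greedy match: each opening curly quote pairs with the nearest closing one.
PAIR = re.compile('“(.*?)”')

# Hypothetical text that has already been converted.
text = '“Hello,” she said. “Are you there?”'
quotes = PAIR.findall(text)
print(quotes)
```

Reading down the list makes a mismatched pair easy to spot, because an opening quote with no closing partner swallows text up to the next closing quote on that line.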

I then do a final step through to see if there are any stragglers. This WILL find " marks in HTML tags, but not all of them: it skips any that come right after an equals sign (the start of an attribute value such as a class) or right before a > (the end of the tag):

FIND: ([^=])"([^>])
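Here is a small Python sketch of that straggler search, mainly to show which straight quotes it still reports. The helper name `find_stragglers` and the sample line are hypothetical, and Python's `re` is assumed to behave like the editor's regex engine for this simple pattern.

```python
import re

# A straight quote not preceded by '=' and not followed by '>'.
STRAGGLER = re.compile(r'([^=])"([^>])')

def find_stragglers(html: str) -> list[tuple[int, str]]:
    """Return (offset, matched text) for each remaining straight quote."""
    return [(m.start(), m.group(0)) for m in STRAGGLER.finditer(html)]

# Hypothetical line: two attributes plus one unconverted quotation.
sample = '<p class="x" id="y">He said "hi" to me.</p>'
hits = find_stragglers(sample)
for offset, text in hits:
    print(offset, repr(text))
```

On this sample it reports the two quotes around hi, plus the closing quote of class="x" (because a space follows it rather than a >). That is why this pass is a step-through and not a "Replace All".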

It's a bit of work, but MUCH less than stepping through each " mark and judging whether it should be replaced, since all but the last two searches can use "Replace All".