Old 04-15-2011, 01:54 AM   #1
habanr
Junior Member
habanr began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Feb 2011
Location: Europe
Device: Amazon Kindle
Search & Replace doesn't work for quotes

I've discovered that the "Search & Replace" function doesn't work for closing quotes >>”<< or for opening quotes >>“<<. A test search using the wizard's Regex Builder finds them correctly, but the quotes are not replaced during the conversion. I have the "Smarten punctuation" option switched off...
Old 04-15-2011, 06:34 AM   #2
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
 
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Try using a '\' in front of the quote (in case you're using straight quotes).

If you're using curly quotes, I'm not certain off the top of my head why they're not working, but try replacing them with Unicode notation:
” = \u201D
“ = \u201C
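For anyone following along, here is a minimal sketch of what the Unicode-notation suggestion looks like in plain Python 3. It is only an illustration: the sample text and variable names are made up, and it does not show how calibre applies the expression internally.

```python
import re

# Hypothetical sample text; the curly quotes are written as escapes so the
# example does not depend on how this page or your editor encodes them.
sample = "\u201cHello,\u201d she said."

# The pattern is a raw string, so the regex engine itself interprets the
# \uXXXX escapes (supported by Python's re module since Python 3.3).
curly_quotes = re.compile(r"[\u201C\u201D]")

# Replace both curly double quotes with a straight double quote.
straightened = curly_quotes.sub('"', sample)
print(straightened)
```

Because the pattern contains only ASCII characters, it is unaffected by the encoding of the text typed into the search box.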
Old 04-15-2011, 06:54 AM   #3
habanr
It doesn't work. As I said before, the Test function successfully finds the quotes, but the result of the conversion is wrong: the quotes are unchanged...
Old 04-15-2011, 07:00 AM   #4
ldolse
What are you actually typing into the search and replace box, and what, in general, are you trying to achieve?

I just tried this:
(“|”)

And it replaced them fine.

Make sure your input encoding is configured correctly under Look & Feel.
Old 04-15-2011, 07:26 AM   #5
habanr
I've tried this:
Search >>“<< Replace >>|<< Doesn't work (the syntax is correct, as the Regex Builder test finds the quotes)
Search >>a<< Replace >>|<< Works correctly

In general, I'm trying to replace English quotes with their Czech equivalents.
Old 04-15-2011, 07:51 AM   #6
user_none
Sigil & calibre developer
user_none ought to be getting tired of karma fortunes by now.
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Use (?u) at the beginning of the pattern.

Code:
(?u)“
I'll need to look at why it's matching in the Regex Builder. It should not be. You need to manually specify that you are using, and want to match on, Unicode characters.
Old 04-15-2011, 08:11 AM   #7
ldolse
Quote:
Originally Posted by user_none View Post
Use (?u) at the beginning of the pattern.

Code:
(?u)“
I'll need to look at why it's matching in the Regex Builder. It should not be. You need to manually specify that you are using, and want to match on, Unicode characters.
(?u) only changes things like \w to use Unicode character maps; I don't think it tells Python that the string itself is Unicode. The string itself needs to be specified as a Unicode string, but I'm pretty sure this happens automatically for these config variables.

Anyway my testing confirmed that unicode characters worked just fine:
Code:
(“|”)
Matched all the curly quotes and replaced them with straight quotes in my test (both in the regex builder and the actual conversion). I think something else must be the root cause.
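The point about the flag can be checked with a small sketch. This assumes Python 3, where string patterns are Unicode-aware by default and (?u) is redundant, so the contrast shown is with re.ASCII instead. The Czech sample word and variable names are just for illustration.

```python
import re

# "kůň" (Czech for "horse"), written with escapes for the non-ASCII letters.
word = "k\u016f\u0148"

# Default (Unicode) matching: \w covers letters such as ů and ň.
unicode_match = re.fullmatch(r"\w+", word)

# ASCII-only matching: \w is limited to [a-zA-Z0-9_], so the word fails.
ascii_match = re.fullmatch(r"\w+", word, re.ASCII)

# A literal curly quote in the pattern matches either way; the flag only
# changes character classes like \w, \d, \s and \b.
literal_match = re.search("\u201c", "\u201cquoted\u201d", re.ASCII)

print(unicode_match is not None, ascii_match is None, literal_match is not None)
```

So a pattern consisting of a literal “ should not need any flag, which is consistent with the tests above working without one.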
Old 04-15-2011, 08:15 AM   #8
user_none
Quote:
Originally Posted by ldolse View Post
(?u) only changes things like \w to use Unicode character maps; I don't think it tells Python that the string itself is Unicode.
You are correct. For some reason I was thinking of u'' when I said (?u). But that's not an issue because the strings are already turned into unicode strings internally.
Old 04-20-2011, 05:49 AM   #9
habanr
Hi,
have you found a solution to this "quote search & replace" problem yet?
Old 04-20-2011, 06:06 AM   #10
ldolse
You're going to have to supply more details about what you're doing, i.e. what exactly the source is, the specific steps you're using for the whole import/conversion process, etc. I suspect you have some sort of file encoding problem. UTF-8 works fine for us doing the same thing, so the most likely explanation is that your file is not UTF-8 and you haven't specified the correct encoding. Why the Regex wizard is still showing you a match is a bit of a mystery, but that's a slightly different code path from conversion.
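To illustrate how an encoding mismatch could produce exactly this symptom, here is a rough sketch in Python 3. It assumes, purely for the sake of the example, that the source bytes are Windows-1252 (cp1252) and that they get decoded as Latin-1 instead. Nothing here is taken from calibre's code, and the byte string is invented.

```python
import re

# Hypothetical source bytes: in cp1252 the curly double quotes are the
# single bytes 0x93 and 0x94.
raw = b"\x93Ahoj\x94"

# Decoded with the right encoding, the bytes become U+201C and U+201D.
as_cp1252 = raw.decode("cp1252")

# Decoded as Latin-1, the same bytes become the control characters
# U+0093 and U+0094, which only look like "nothing" or boxes on screen.
as_latin1 = raw.decode("latin-1")

pattern = re.compile("[\u201c\u201d]")

found_right = pattern.findall(as_cp1252)  # both curly quotes are found
found_wrong = pattern.findall(as_latin1)  # nothing to replace

print(found_right, found_wrong)
```

If the conversion decodes the file with the wrong encoding, a search for “ has nothing to match even though the expression itself is fine, so setting the input encoding explicitly is worth trying first.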
Old 04-22-2011, 09:44 AM   #11
Phate17
Junior Member
Phate17 began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Apr 2011
Device: Hannlin V5
Hello,
I have the same problem. My ebooks are in PDB format, with the English LEFT DOUBLE QUOTATION MARK (U+201C, ALT 0147) at the beginning of direct speech and the RIGHT DOUBLE QUOTATION MARK (U+201D, ALT 0148) at the end. I need to change them to the Czech style, with quotation mark U+201E (ALT 0132) at the beginning and quotation mark U+201C (ALT 0147) at the end.

When I search for the left English quotation mark in the Regex Builder (I enter ALT 0147 and press TEST), it finds all occurrences and highlights them in yellow. But when I do the same in Search & Replace (search for ALT 0147 and replace with any character, e.g. Q), no changes are made. Searching for ALT 0148 gives the same result.

For now I have to open my ebooks in another text editor, replace the quotation marks as I wish, and then open them in calibre and convert them to my favorite MOBI format.

Thanks for any help
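As a stopgap for the manual editing step, the mapping described above can be sketched in Python 3. This is only an illustration that assumes the text has already been decoded correctly. The function and dictionary names are made up and are not part of calibre.

```python
import re

# English -> Czech mapping from the post above:
#   U+201C (opening) becomes U+201E, U+201D (closing) becomes U+201C.
CZECH_QUOTES = {"\u201c": "\u201e", "\u201d": "\u201c"}


def to_czech_quotes(text):
    """Swap English curly double quotes for Czech ones in a single pass."""
    # One pass matters here: U+201C is both an input (English opening) and
    # an output (Czech closing), so two chained replacements in the wrong
    # order would convert the freshly written closing quotes again.
    return re.sub("[\u201c\u201d]", lambda m: CZECH_QUOTES[m.group(0)], text)


english = "\u201cDobr\u00fd den,\u201d \u0159ekl."
czech = to_czech_quotes(english)
print(czech)
```

In a search & replace box that only takes one expression at a time, the same effect needs two steps in a fixed order: first “ to „, then ” to “.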
Old 04-22-2011, 11:50 AM   #12
Manichean
Wizard
Manichean is the 'tall, dark, handsome stranger' all the fortune-tellers are referring to.
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
I'll just quote this instead of retyping it:
Quote:
Originally Posted by ldolse View Post
You're going to have to supply more details about what you're doing, i.e. what exactly the source is, the specific steps you're using for the whole import/conversion process, etc. I suspect you have some sort of file encoding problem. UTF-8 works fine for us doing the same thing, so the most likely explanation is that your file is not UTF-8 and you haven't specified the correct encoding. Why the Regex wizard is still showing you a match is a bit of a mystery, but that's a slightly different code path from conversion.

Tags
bug, conversion, search & replace

All times are GMT -4.