08-21-2011, 04:58 PM | #91 |
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2011
Device: Android smartphone
|
Hello
After updating the plugin I get this error: Code:
calibre, version 0.8.15
ERROR: Nieznany wyjątek [Unknown exception]: <b>UnicodeEncodeError</b>:'latin-1' codec can't encode character u'\u0142' in position 65: ordinal not in range(256)
Traceback (most recent call last):
  File "calibre_plugins.search_the_internet.action", line 110, in search_web_link
  File "calibre_plugins.search_the_internet.action", line 116, in search_web_for_book
  File "calibre_plugins.search_the_internet.action", line 143, in open_tokenised_url
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0142' in position 65: ordinal not in range(256)
Sorry for my English... |
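The error in the traceback can be reproduced outside calibre: the Polish letter ł (U+0142) has no Latin-1 code point, so encoding it as latin-1 fails while UTF-8 succeeds. A minimal sketch in Python 3 (the thread itself concerns Python 2 era code); the sample title and the helper name `can_encode` are made up for illustration:

```python
# Hypothetical sample text containing U+0142 (ł); not taken from the user's library.
title = "Mały książę"


def can_encode(text, encoding):
    """Return True if every character of text is representable in the codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


latin1_ok = can_encode(title, "latin-1")  # ł is outside Latin-1
utf8_ok = can_encode(title, "utf-8")      # UTF-8 covers all of Unicode
print(latin1_ok, utf8_ok)
```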
08-21-2011, 05:11 PM | #92 |
Calibre Plugins Developer
Posts: 4,636
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@Puciek - welcome to MobileRead.
As for your error - I guess it must be a side effect of the changes made to the safe_format() functions: I had to switch to a different one because the Calibre code I used to use got ripped out. There is quite possibly a simple fix I can make in my code to work around this, but hopefully chaley or Kovid can tell me what it is. If they happen to stumble across this thread, the code is effectively doing this: Code:
url = template_formatter.safe_format(tokenised_url, fixed_vals, 'STI template error', mi)
open_url(QUrl.fromEncoded(url))
Code:
http://www.foo.com/myquery?search=ęń |
08-21-2011, 10:51 PM | #93 |
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The changes to safe_formatter were for thread safety; I don't see how they could possibly have any bearing on this. Why are you using QUrl.fromEncoded?
As per the Qt documentation: "Parses input and returns the corresponding QUrl. input is assumed to be in encoded form, containing only ASCII characters." |
08-22-2011, 03:44 AM | #94 |
Calibre Plugins Developer
Posts: 4,636
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
I only mentioned safe_formatter because the user said the problem occurred only when they upgraded to the latest version. The latest version only had a one line change of the import statement from the old safe_formatter that got deleted to the new one. Perhaps the user was wrong and this was always broken for this particular scenario, or perhaps there has been some other change to Qt. Or perhaps they were on a REALLY old version of this plugin.
I should have posted extra lines to give more info. The reason I use QUrl.fromEncoded is that the URL is not exactly as I posted. Instead there are two further complications: (1) The URL may contain quotes as part of its template specification, encoded as %22 Code:
http://www.google.com/#sclient=psy&q=%22{author}%22+%22{title}%22
Code:
text = quote_plus(text.encode(encoding, 'ignore'))
So instead I was using QUrl.fromEncoded(url) - which solves the %22 stuff but apparently will barf on the other "encoded" foreign characters...
Last edited by kiwidude; 08-22-2011 at 04:09 AM. |
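For readers on Python 3, the `quote_plus(text.encode(encoding, 'ignore'))` call above can be sketched as follows. The wrapper name `quote_field` is hypothetical. Note that with `errors="ignore"` any character the chosen codec cannot represent is silently dropped, which may not be what a search query wants:

```python
from urllib.parse import quote_plus


def quote_field(text, encoding="utf-8"):
    """Percent-encode one field value; unencodable characters are dropped."""
    # Rough Python 3 counterpart of quote_plus(text.encode(encoding, 'ignore')).
    return quote_plus(text, encoding=encoding, errors="ignore")


utf8_value = quote_field("ęń")                        # both letters survive as UTF-8 bytes
latin1_value = quote_field("ęń", encoding="latin-1")  # neither letter exists in Latin-1
print(utf8_value)
print(repr(latin1_value))
```

The UTF-8 result is pure ASCII, which is the kind of input the Qt documentation quoted earlier says QUrl.fromEncoded expects.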
08-22-2011, 11:13 AM | #95 |
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Use the unquote function from urllib and then QUrl(unquoted_url)
|
08-22-2011, 03:41 PM | #96 |
Calibre Plugins Developer
Posts: 4,636
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Thx Kovid. Unfortunately the problem with doing that is it "undoes" the other stuff that the quoting is trying to achieve. Take for example this book author/title:
E. M. Cioran De l'inconvénient d'être né
To search Amazon.fr for that book, I need the URL to be in this format: Code:
http://www.amazon.fr/s/ref=nb_sb_noss?url=search-alias%3Dstripbooks&field-keywords=E.%20M.%20Cioran+De%20l%27inconv%E9nient%20d%27%EAtre%20n%E9
However, if instead I use open_url(QUrl(unquote(url))) then what gets generated is this: Code:
http://www.amazon.fr/s/ref=nb_sb_noss?url=search-alias=stripbooks&field-keywords=E.%20M.%20Cioran+De%20l%27inconv%C3%A9nient%20d%27%C3%AAtre%20n%C3%A9
"E. M. Cioran De l'inconvénient d'être né"
My head hurts |
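The two URLs above differ in two ways: `%3D` has been turned back into `=`, and the accented letters are percent-encoded from UTF-8 bytes (`%C3%A9`) rather than Latin-1 bytes (`%E9`). A small Python 3 sketch of both effects, using the title from the post; passing `safe=""` is an assumption made here so that every reserved character gets escaped:

```python
from urllib.parse import quote, unquote

title = "De l'inconvénient d'être né"

# Same text, percent-encoded from two different byte encodings.
latin1_quoted = quote(title, safe="", encoding="latin-1")
utf8_quoted = quote(title, safe="", encoding="utf-8")
print(latin1_quoted)
print(utf8_quoted)

# Unquoting the whole URL also decodes escapes that were meant to stay,
# such as the %3D inside the "url" parameter value.
unquoted_param = unquote("url=search-alias%3Dstripbooks")
print(unquoted_param)
```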
08-22-2011, 03:54 PM | #97 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Aug 2011
Device: Android smartphone
|
Quote:
Last working version of the plugin was 1.6.5 |
|
08-22-2011, 04:52 PM | #98 |
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Try
QUrl.fromEncoded(quote(unquote(url)))
The proper fix is to not have partially quoted URLs in the first place. Start by keeping your URLs as pure strings with no quoting, and only quote them just before calling open_url. |
08-22-2011, 05:27 PM | #99 |
Calibre Plugins Developer
Posts: 4,636
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Thanks but that doesn't work either I'm afraid. All very frustrating for something that should be so freaking simple!
I have a template URL which could be something like: Code:
http://www.google.de/#sclient=psy&q=%22{author}%22+%22{title}%22
I think it is the open_url function requiring a QUrl that is making this all go pear-shaped, unlike the webbrowser.open() call, which took the string directly. To keep it as a string and delay quoting would imply that my starting point is something like this: Code:
http://www.google.de/#sclient=psy&q="{author}"+"{title}" |
08-22-2011, 05:35 PM | #100 |
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Quoting is only supposed to apply to the path component of a URL. So you'd do it like this:
1) Start unencoded
2) Use urlparse.urlsplit to split the url into components.
3) Quote the correct components (path and query) encoded as UTF-8: quote(component.encode('utf-8'))
4) Use urlparse.urlunsplit to reconstitute the url
5) Use QUrl.fromEncoded
Last edited by kovidgoyal; 08-22-2011 at 05:41 PM. |
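Steps 1 to 4 above could look roughly like this in Python 3, where `urlparse` and `urllib.quote` both live in `urllib.parse`. The function name `encode_url` and the `safe` characters chosen for the query are assumptions of this sketch, not something the post specifies; step 5 (handing the result to QUrl.fromEncoded) is left out so the example runs without Qt:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url, encoding="utf-8"):
    """Percent-encode the path and query of an unencoded URL string."""
    parts = urlsplit(url)                                  # step 2: split
    path = quote(parts.path, safe="/", encoding=encoding)  # step 3: quote path
    # Assumption: keep '=', '&' and '+' so the query structure survives.
    query = quote(parts.query, safe="=&+", encoding=encoding)
    # Step 4: reassemble; scheme, host and fragment are passed through as-is.
    return urlunsplit((parts.scheme, parts.netloc, path, query, parts.fragment))


encoded = encode_url("http://www.foo.com/myquery?search=ęń")
print(encoded)
```

Because the input is expected to be unencoded (step 1), any `%` already present would itself be escaped to `%25`. The fragment is not quoted here, so a URL whose search terms sit after a `#` would need extra handling.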
08-22-2011, 05:40 PM | #101 |
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
And IIRC for Amazon rather than using utf-8 you want latin1
|
08-22-2011, 06:06 PM | #102 |
Calibre Plugins Developer
Posts: 4,636
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Thx for your patience Kovid. I'm trying what you suggest, but still no joy. The problem is that I have the "query" part from urlsplit, which will look like this:
'q="Abraham, Daniel"+"An Autumn War"'
Now if I do as you suggest and use quote(parts[3].encode(encoding, 'ignore')) that will turn into something like this:
'q%3D%22Abraham%2C%20Daniel%22%2B%22An%20Autumn%20War%22'
So it is encoding things like the = and + signs. |
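The behaviour described above follows from `quote` treating only `/` as safe by default. A Python 3 sketch of the difference when `=` and `+` are added to the `safe` argument; whether that is appropriate depends on the data, since a literal `+` or `=` inside a title would then also be left unescaped:

```python
from urllib.parse import quote

query = 'q="Abraham, Daniel"+"An Autumn War"'

# Default: only '/' is safe, so '=' and '+' are escaped along with the quotes.
quoted_default = quote(query)

# Marking '=' and '+' as safe keeps the query delimiters intact.
quoted_keep_delims = quote(query, safe="=+")

print(quoted_default)
print(quoted_keep_delims)
```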
08-22-2011, 06:22 PM | #103 |
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Then do the quoting in the format() function.
final_url = template.format(title=quote(titlevar), author=quote(authorvar)) |
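A self-contained Python 3 sketch of this suggestion, using the Google template from earlier in the thread. The helper name `build_url`, the `encoding` parameter and `safe=""` are assumptions added for illustration; the template keeps its own `%22` and `+`, and only the substituted values are quoted:

```python
from urllib.parse import quote

template = "http://www.google.de/#sclient=psy&q=%22{author}%22+%22{title}%22"


def build_url(template, author, title, encoding="utf-8"):
    """Fill the template with individually percent-encoded field values."""
    return template.format(
        author=quote(author, safe="", encoding=encoding),
        title=quote(title, safe="", encoding=encoding),
    )


final_url = build_url(template, "Abraham, Daniel", "An Autumn War")
# Per the later remark about Amazon, a site may expect Latin-1 instead.
latin1_url = build_url(template, "Cioran", "né", encoding="latin-1")
print(final_url)
print(latin1_url)
```

With the default strict error handling, a value containing a character outside the chosen encoding (such as ł with latin-1) raises UnicodeEncodeError rather than being silently altered.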
08-23-2011, 02:25 PM | #104 |
Junior Member
Posts: 3
Karma: 10
Join Date: Jun 2011
Device: Kindle
|
Excellent, thanks
|
08-23-2011, 03:48 PM | #105 |
Calibre Plugins Developer
Posts: 4,636
Karma: 2162064
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
@Kovid - but there's a hole in my bucket...
How is that any different from what I began with, other than that what you show does not include the .encode()? I'm close to the point of saying "sod the Linux users" and putting it back to webbrowser.open(). |