#1
Developer
Posts: 155
Karma: 280
Join Date: Nov 2010
Device: Kindle 3 (Keyboard) 3G / iPad 9 WiFi / Google Pixel 6a (Android)
Recipe with raw string literal fails in calibre 3.44
Hi,

after the update from 3.39.1 to 3.44 I noticed that one of my recipes failed to load. The reason is that the recipe contains a Windows path stored as a raw string literal, e.g. r"C:\Users\foo\Dropbox".

I can work around the bug by using "C:\\Users\\foo\\Dropbox" instead, but as this is a regression I think it should be fixed in calibre instead.

Here is a small bogus recipe which shows this issue when loaded in calibre 3.44 (32-bit Windows version):

Code:

    from calibre.web.feeds.news import AutomaticNewsRecipe


    OUPUT_PATH = r"C:\Users\foo\Dropbox"


    class BasicUserRecipe(AutomaticNewsRecipe):
        title = u'Test based on Planet Python'
        language = 'en'
        __author__ = 'Jelle van der Waa'
        oldest_article = 10
        max_articles_per_feed = 100

        feeds = [(u'Planet Python', u'http://planetpython.org/rss20.xml')]

Code:

    calibre, version 3.44.0
    ERROR: Invalid recipe: Failed to compile the recipe, with syntax error: (unicode error) 'rawunicodeescape' codec can't decode bytes in position 2-3: truncated \uXXXX (<string>, line 4)
#2
creator of calibre
Posts: 45,347
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
It's not a bug. Recipe loading has been changed to use unicode literals to prepare for Python 3. And in Python 2 you cannot have a \U in a unicode string, raw or not, unless it is of the form \UXXXXXXXX (eight hex digits).
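For anyone who wants to check this outside calibre, here is a minimal sketch of the behaviour described above. It does not use calibre's actual recipe loader; it only compiles a one-line source with the `unicode_literals` compiler flag switched on, which is an assumption about how the loader behaves. The names `CANDIDATES` and `load_path` are made up for the illustration.

```python
import __future__

# Compiler flag that turns plain string literals into unicode on Python 2.
# Python 3 accepts the flag, but it changes nothing there.
UNICODE_LITERALS = __future__.unicode_literals.compiler_flag

# Hypothetical one-line recipe sources that differ only in how the path is
# spelled. The backslashes are doubled here because these are ordinary
# (non-raw) strings that merely *contain* the source text.
CANDIDATES = {
    "raw": 'PATH = r"C:\\Users\\foo\\Dropbox"',
    "escaped": 'PATH = "C:\\\\Users\\\\foo\\\\Dropbox"',
    "forward": 'PATH = "C:/Users/foo/Dropbox"',
}


def load_path(source):
    """Compile source with unicode literals on; return PATH, or None on SyntaxError."""
    try:
        code = compile(source, "<recipe>", "exec", UNICODE_LITERALS, True)
    except SyntaxError:
        return None
    namespace = {}
    exec(code, namespace)
    return namespace["PATH"]


for name in sorted(CANDIDATES):
    print("%s -> %r" % (name, load_path(CANDIDATES[name])))
```

The escaped and forward-slash spellings should load on both Python 2 and Python 3. The raw spelling is the one reported as failing in this thread under Python 2, because \U is still treated as an escape in a raw unicode literal there; on Python 3 it loads fine.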
#3
Developer
I don't quite understand why it is necessary to change this in python 2, as for python 3 r"C:\Users\foo\Dropbox" works just fine.
#4
creator of calibre
Because the Python 3 port is being done by making the codebase as Python 3-like as possible, incrementally, to catch as many bugs as possible before pulling the trigger. That means making as many strings unicode as possible. Exposing end users to as few regressions as possible when making the move is one of my core principles of porting. If I cannot do the port in that fashion, it simply will not happen.
#5
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
"regression" does not mean "it behaves differently", it means "supported functionality was lost or became broken". This has not happened here -- your ambiguous bytestring has never had a formally documented contract of correctness, and it was arguably always wrong, so "writing python2 compatible code" has always meant doing the most correct thing and being compatible with unicode strings.

Given that unicode_literals is more helpful to people who contribute to calibre development than it is obstructionist to people who don't contribute to calibre development, it seems perfectly reasonable to carry on as-is. But if you have suggestions backed by code, that suit your needs while at the same time fitting in with the development ideologies represented by the last 9 months of calibre's git history and the various discussions occurring on GitHub, then by all means, suggest away -- we shall be all ears.
#6
Developer
Quote:
From my point of view, calibre "magically" changed my valid Python 2.7 raw string literal to an illegal raw unicode string literal, which breaks my recipe.
#7
Ex-Helpdesk Junkie
Quote:
Quote:
Aside: it might be more advisable, at least for Windows, to explicitly work with bytes and decode to utf-8. Windows paths are weird and fragile things. But I don't know why your recipe hardcodes a path to C:\ so...
Quote:
Quote:
I'm not sure what else you want. Clearly, no other recipes had this issue. And no one can possibly know what issues you will be having with your recipes that hardcode paths to C:\, since no one else is doing any such thing and the one person who *is* (you) is not discussing his needs or use cases, nor contributing to the ongoing development of the recipe subsystem -- so who exactly is supposed to *know* that you're having an issue? This is the inevitable problem faced by lone wolf coders.

Last edited by eschwartz; 06-30-2019 at 04:43 PM.
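As a rough illustration of the "work with bytes and decode" aside in this post, the sketch below keeps the raw notation from the original recipe but marks the literal as bytes and decodes it explicitly. It reuses the `OUPUT_PATH` name from the recipe in post #1; `ESCAPED_PATH` is a made-up name for comparison, and decoding as utf-8 is an assumption that only clearly holds for an ASCII path like this one.

```python
# The br prefix keeps the literal a raw *byte* string even when unicode
# literals are in effect, so the \U in \Users is not treated as an escape.
# Decoding then turns it into text in one explicit step.
OUPUT_PATH = br"C:\Users\foo\Dropbox".decode("utf-8")

# The same path spelled with escaped backslashes, for comparison.
ESCAPED_PATH = "C:\\Users\\foo\\Dropbox"

print(OUPUT_PATH == ESCAPED_PATH)
```

Either spelling ends up as the same text path, so the choice is mostly about which one reads better in the recipe.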
#8
Developer
Quote:
Quote:
Quote:
#9
creator of calibre
Once again, strings are being changed to unicode literals throughout calibre's codebase. This is the best incremental way to minimize regressions when moving to Python 3.
And I will note that calibre has not used bytestrings for paths for almost a decade. The original Python 2 decision to use bytestrings for paths was probably inspired by Linux's bone-headed decision to have paths be bags of bytes in unknown encodings. I reversed that Python 2 mistake a long time ago. So using them in recipes is unsupported. You should have been using unicode literals in recipes, anyway.