08-02-2012, 09:39 AM | #16 |
Guru
Posts: 776
Karma: 2751519
Join Date: Jul 2010
Location: UK
Device: PW2, Nexus7
|
My post had nothing to do with how much you have done for calibre (I may well be using features that you worked on), but was simply a comment on your attitude regarding the lxml issue.
|
08-02-2012, 10:08 AM | #17 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
That is not what the traceback says, and I confirmed that with a try/except traceback at line 177 in templates.py
|
08-02-2012, 10:16 AM | #18 |
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
A try/except does not confirm anything other than that an exception occurs there, something that is already evident from the traceback. The question is why the exception occurs. Since non-printable ASCII chars and null chars have already been removed, you need to figure out why the exception is occurring and, if you do figure that out, come up with a patch that prevents it happening in general.
|
08-02-2012, 10:44 AM | #19 |
onlinenewsreader.net
Posts: 324
Karma: 10143
Join Date: Dec 2009
Location: Phoenix, AZ & Victoria, BC
Device: Kindle 3, Kindle Fire, IPad3, iPhone4, Playbook, HTC Inspire
|
And the message in the original traceback means nothing? My purpose in placing the try was to confirm where the error arose. I then checked the periodical being processed and sure enough the garbage characters showed up in the web browser. So I'm not sure what further diagnosis can be done except to look at the code you say is stripping out these characters. If you point me at the code I will have a look.
|
08-02-2012, 11:17 AM | #20 |
creator of calibre
Posts: 43,857
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Error messages often don't mean what they say. People rarely update error messages to cover every condition that can trigger the error.
Article.__init__ in feeds/__init__.py
"Garbage characters" doesn't mean much. The fact that the characters are not interpretable by a human does not necessarily mean anything. lxml chokes on unicode strings that contain ASCII control codes or null characters. Both of those are taken care of already. If there is some other well-defined set of characters that lxml chokes on, and that is not valid in a unicode text string, then that can be added, though I suspect that is not the case.
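One way to check whether such a "well-defined set" exists is to test the offending text against the XML 1.0 `Char` production, which excludes the ASCII control codes other than tab, LF and CR, and also U+FFFE, U+FFFF and the surrogate range. The following is only a sketch under that assumption: it is not the code in `Article.__init__`, the function names are hypothetical, and whether a given lxml version rejects every character outside this range has not been verified here.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML_CHARS = re.compile(
    '[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]'
)


def find_invalid_xml_chars(text):
    """Return (offset, code point) pairs for characters XML 1.0 forbids.

    Hypothetical diagnostic helper: useful for seeing exactly which
    characters in a feed title or summary fall outside the allowed range.
    """
    return [(m.start(), 'U+%04X' % ord(m.group()))
            for m in _INVALID_XML_CHARS.finditer(text)]


def strip_invalid_xml_chars(text):
    """Return text with every character forbidden by XML 1.0 removed."""
    return _INVALID_XML_CHARS.sub('', text)
```

Running `find_invalid_xml_chars` on the title and summary of the failing article would show whether the characters in question are control codes that slipped through or something else entirely, which is the distinction that matters for a general patch.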
|