#1 |
Member
Posts: 17
Karma: 10
Join Date: Apr 2019
Device: Android phone
|
Metadata download plugin help with text encoding disorder
Hello, I find this difficult to trace. I'm writing a metadata download plugin and got stuck at extracting metadata from the book details page.
I'm testing with this book: Serhii Plokhy - Chernobyl: The History of a Nuclear Catastrophe.
Firstly, the author field seems to be extracted correctly into the authors string, and it prints to the log as 'Serhii Plokhy', but when I construct a Metadata structure with
Code:
mi = Metadata(title, authors)
the result is
Code:
Author(s) : S & e & r & h & i & i & & P & l & o & k & h & y
Secondly, which may be related: when parsing other book details like publisher, tags etc. from the details table, the data are stored in table rows like
Code:
<tr> <td>name</td> <td>value</td> </tr>
Yet another difficulty is with debugging: I'm not able to figure out where the log.info(...), log.debug(...) and log.error(...) commands print. Calling calibre-debug -opens a textual log after closing Calibre, but the log doesn't contain any debug info printed by any of these commands. The only thing that works for me is using print(...) instead, whose output appears in the %temp%/calibre_XXXXXX/*.log files. I need a clue how to debug with the log properly.
Last edited by Ubiquity; 06-26-2019 at 04:47 AM.
#2 |
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
That indicates you are setting the value of authors to a string instead of a list of strings. And the output of the log statements will go into the metadata download log, which you get by clicking the view log button on the download dialog.
#3 |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
As Kovid said. But also keep in mind that when you use a string in a place where a list is expected, the string is turned into a list by splitting it into individual characters. Each character is then treated as a separate author name, joined by "&", and potentially sorted alphabetically.
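To see the effect outside calibre, here is a minimal sketch. The `join_authors` helper is hypothetical and only imitates joining a sequence of author names with " & "; it is an assumption for illustration, not calibre's actual code.

```python
def join_authors(authors):
    # Hypothetical stand-in for code that expects a list of names.
    # Iterating over a str yields single characters, so a bare string
    # is treated as many one-letter "authors".
    return " & ".join(authors)


name = "Serhii Plokhy"

from_string = join_authors(name)    # wrong: each character becomes a name
from_list = join_authors([name])    # right: one name inside a list

print(from_string)
print(from_list)
```

Wrapping the single name in a list is enough to keep it intact.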
#4 |
Member
Posts: 17
Karma: 10
Join Date: Apr 2019
Device: Android phone
|
I would not say that my authors string is in list form. I fetch it and join it into a result string, which initializes the Metadata structure:
Code:
authors = root.xpath('//h2[@class="authornames"]/span/a/text()')
if authors:
    authors = ' & '.join(authors).strip()
. . .
mi = Metadata(title, authors)
If I parse the detail page in a standalone test script, all accented string literals print out in proper form. If I print the same string constants from the plugin source, they print out in broken form, with the non-ASCII characters shown as unreadable two-byte chunks. But both the plugin sources and the test script are saved in UTF-8 encoding.
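One common cause of an accented character showing up as two unreadable characters is UTF-8 bytes being interpreted with a single-byte encoding. The thread does not confirm that this is what happens in the plugin, so treat the following only as a minimal sketch of the symptom under that assumption.

```python
# Assumption: the garbling comes from UTF-8 bytes being read as Latin-1.
text = "\u00e9"                      # "é", a single character
utf8_bytes = text.encode("utf-8")    # two bytes: 0xC3 0xA9

# Decoding with the wrong codec yields two characters instead of one.
garbled = utf8_bytes.decode("latin-1")

# Decoding with the matching codec restores the original text.
restored = utf8_bytes.decode("utf-8")

print(repr(garbled), repr(restored))
```

If the output in the plugin looks like `garbled`, checking where bytes are decoded (or never decoded) is a reasonable first step.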
#5 | |
Wizard
Posts: 1,211
Karma: 1419583
Join Date: Dec 2016
Location: Goiânia - Brazil
Device: iPad, Kindle Paperwhite, Kindle Oasis
|
Quote:
So, you should initialize your variable as a list, and then append the values, like this example:
Code:
authors = []
for author_node in author_nodes:
    authors.append(author_node.text_content().strip())
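The same idea as a self-contained sketch, assuming the XPath `text()` query from the earlier post returns a plain list of strings. The `clean_authors` name and the sample data are made up for illustration.

```python
def clean_authors(raw_names):
    """Return a list of stripped, non-empty author names."""
    cleaned = []
    for raw in raw_names:
        name = raw.strip()
        if name:                 # skip whitespace-only text nodes
            cleaned.append(name)
    return cleaned


# Sample data standing in for the XPath result (hypothetical values).
raw_names = ["  Serhii Plokhy ", "\n"]
authors = clean_authors(raw_names)

# Pass the list itself, not a joined string:
# mi = Metadata(title, authors)
print(authors)
```

Keeping `authors` as a list all the way to the `Metadata(title, authors)` call avoids the per-character split described above.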
#6 | ||
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
Quote: