#1 |
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
libprs500 - title/author matching regex
I've just started playing with libprs500 (0.4.46) in preparation for a Sony PRS505 I have on the way, and I'm having a spot of bother getting the standard regex to correctly identify the author and title from the filename.
The standard syntax, I believe, is:
(?P<author>.+) - (?P<title>[^_]+)
If I paste the string "H.P Lovecraft - At the Mountains of Madness.txt" into the test box, it correctly reports the following:
Title: "At the Mountains of Madness"
Author: "H.P. Lovecraft"
Series: "No Match"
Series Index: "No Match"
However, actually importing that same file into the library displays the following:
Title: "H.P. Lovecraft - At the Mountains of Madness"
Author: "H.P. Lovecraft"
(all other columns are blank, as expected)
Is this standard behaviour or a bug?
|
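For anyone wanting to check the pattern outside the test box, here is a minimal Python sketch. It assumes the extension is stripped before matching (the test box output above suggests this, but it is not confirmed), and `parse_filename` is a hypothetical helper, not a libprs500 function.

```python
import os
import re

# The pattern quoted in the post above.
FILENAME_PATTERN = re.compile(r"(?P<author>.+) - (?P<title>[^_]+)")


def parse_filename(filename):
    """Return (author, title) parsed from a file name, or (None, None)."""
    # Assumption: the extension is dropped before the pattern is applied.
    stem, _ext = os.path.splitext(os.path.basename(filename))
    match = FILENAME_PATTERN.search(stem)
    if match is None:
        return None, None
    return match.group("author"), match.group("title")


author, title = parse_filename("H.P. Lovecraft - At the Mountains of Madness.txt")
print(author)  # H.P. Lovecraft
print(title)   # At the Mountains of Madness
```

Because `.+` is greedy, a name containing more than one " - " puts everything before the last separator into the author group.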
#2 |
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
Upon further investigation it only seems to do this with PDF documents; the author and title fields seem to map correctly against html, zip and text based files.
So if I rename a PDF, an HTML file, a text file and a zip all to the same name:
wibble - wobble.[pdf|zip|txt|html]
...then the HTML, text and zip versions of the file all correctly display as title="wobble", author="wibble". However, the PDF file shows as title="wibble - wobble" and author="wibble".
|
#3 |
creator of calibre
Posts: 45,267
Karma: 27111060
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
libprs500 tries to read metadata from the file itself first. Only if that fails does it use the filename.
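The file-first, filename-second order described here can be sketched roughly as follows. This is an illustration only, not libprs500's actual code: `resolve_metadata`, `metadata_from_filename` and the `read_embedded` callable are all hypothetical names, and treating an empty title as "failure" is an assumption.

```python
import os
import re

FILENAME_PATTERN = re.compile(r"(?P<author>.+) - (?P<title>[^_]+)")


def metadata_from_filename(path):
    """Parse author and title from the file name (hypothetical helper)."""
    stem = os.path.splitext(os.path.basename(path))[0]
    match = FILENAME_PATTERN.search(stem)
    if match is None:
        return {"title": stem, "author": None}
    return {"title": match.group("title"), "author": match.group("author")}


def resolve_metadata(path, read_embedded):
    """Prefer metadata read from the file; fall back to the file name.

    read_embedded is a hypothetical callable: path -> dict or None.
    """
    embedded = read_embedded(path)
    # Assumption: reading "fails" when nothing, or no title, comes back.
    if embedded and embedded.get("title"):
        return embedded
    return metadata_from_filename(path)


# A reader that finds nothing lets the file name pattern run.
print(resolve_metadata("wibble - wobble.txt", lambda p: None))
```

Under this reading, a format reader that reports some title for every file, even one derived from the file name, would count as a success and the pattern would never be applied.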
|
#4 |
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
Is this right though? I've attached an example of the difference in behaviour with the same filename for three different file types. There is no metadata set in the PDF file.
|
#5 |
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
OK, digging a bit: would I be correct in thinking that pdf-meta.exe is used to determine the author and title of PDF documents?
Running pdf-meta on my renamed document I get the following:
pdf-meta.exe author\ -\ title.pdf
Title : author - title
Author : Unknown
Publisher: None
Category : None
Comments : None
ISBN : None
It looks like libprs500 is taking the Title as shown by pdf-meta and not running the regex to split it based on the filename.
I have a whole load of PDF docs whose metadata is in varying states of correctness, and I'd rather load them into libprs500 using the filenames to determine author and title. Other than using pdftk and writing a script to recurse through all of my files to insert metadata based on the filename, can we force libprs500 to use the filename instead, even for PDFs?
|
#6 |
creator of calibre
Posts: 45,267
Karma: 27111060
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Open a ticket for a config option to customize this behavior.
|
#7 | |
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
I've recursed through all of my PDF documents and run the following script:
Quote:
AUTHOR - SERIES - TITLE.pdf or AUTHOR - TITLE.pdf
However... libprs500 is still displaying the PDF files that I have correctly set the metadata on in the form of "author - title". It is almost as if it is ignoring both the metadata *and* the filename regex pattern matching altogether and simply using the filename, minus the .pdf extension.
|
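The quoted script did not survive in this copy of the thread, so the following is not a reconstruction of it. It is only a small Python sketch of how the two naming schemes mentioned above could be split into fields, assuming " - " is the separator and never appears inside a field; `split_pdf_name` is a hypothetical helper.

```python
import os


def split_pdf_name(filename):
    """Split 'AUTHOR - SERIES - TITLE.pdf' or 'AUTHOR - TITLE.pdf' into fields.

    Returns None when the name fits neither scheme.
    """
    stem = os.path.splitext(os.path.basename(filename))[0]
    parts = [part.strip() for part in stem.split(" - ")]
    if len(parts) == 2:
        author, title = parts
        return {"author": author, "series": None, "title": title}
    if len(parts) == 3:
        author, series, title = parts
        return {"author": author, "series": series, "title": title}
    return None


print(split_pdf_name("AUTHOR - SERIES - TITLE.pdf"))
print(split_pdf_name("AUTHOR - TITLE.pdf"))
```

The resulting fields would then be handed to whatever tool writes the PDF metadata; that step is left out here because the original command is not shown.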
|
#8 |
creator of calibre
Posts: 45,267
Karma: 27111060
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
What does pdf-meta give you on the corrected PDF files?
|
#9 | |
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
pdf-meta now shows the correct author, but the title is still the filename minus the extension. e.g.
Quote:
|
|
#10 |
creator of calibre
Posts: 45,267
Karma: 27111060
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Attach one of these PDF files here
|
#11 |
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
Ok, will do that when I get back in from work.
|
#12 | ||
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
OK, this is a version of Douglas Adams' HHGTTG. Not a great version, but that's not relevant.
Original version Quote:
Quote:
|
||
#13 |
Resident Curmudgeon
Posts: 79,505
Karma: 145863177
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
I've had to remove the attachments as they are of a copyrighted book. Please use the libprs500 website's ticket system to attach them there.
|
#14 | |||
Connoisseur
Posts: 76
Karma: 22
Join Date: Mar 2008
Location: uk
Device: Sony PRS505
|
My apologies. Here's one that's now in the public domain: E.E. Smith's 'Triplanetary'.
No metadata to start with. Metadata added with the following command: Quote:
Quote:
Quote:
|
|||
#15 |
creator of calibre
Posts: 45,267
Karma: 27111060
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Fixed in svn
|