Old 03-01-2015, 09:30 PM   #1
JohnnyBook
Groupie
Posts: 196
Karma: 126824
Join Date: Dec 2008
Location: Out There
Device: K3 W/3G (Fixed screen!) & Paperwhite Wifi
Regex help on reading Metadata from file name.

90% of my files are in the format: (The other 10% do not include the pub date)

Format: Series-series number title (author) Pub date.txt
(Series 2-4 letters)
(Series number 2-4 digits)
(pub date-year only)

ex.

BA-123 How it works (John Smith) 1989.txt
ROT-4089 Make it this way (Jane Smith) 2009.txt

Playing around with the regex, I was able to separate the series and number, but I could not work out the title and author. Typically I ended up with:

Title: How it works (John Smith
Author: )

And pubdate does not work at all.

Unfortunately I kept changing it around, and now it does not work at all and I can't remember what I had that almost worked.

One thing I have had trouble with is the "(" and ")" and trying to search for them in the title. I CAN search and replace the titles to remove them, or substitute another character for them, to make it easier to run a regex if necessary (just not "-", as some titles have a "-" in them).

Anyone have any clue how to do this?

edit: This is as close as I can come to what I had
Code:
(?P<series>[^_0-9-]*)-(?P<series_index>[0-9]*)(?P<title>[^_-]+) \(?(?P<author>[^_].+) -?(?P<date>[^_].+) ?
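On the trouble with "(" and ")": in Python-style regular expressions (the `(?P<name>...)` syntax above is Python's), a bare parenthesis opens or closes a group, so a literal one has to be escaped with a backslash. A minimal sketch using only the standard `re` module; the helper name `author_in_parens` is made up for illustration and is not part of calibre:

```python
import re

# Hypothetical helper for illustration; not part of calibre.
# \( and \) match literal parentheses; the unescaped pair between them
# is the named capture group.
AUTHOR_RE = re.compile(r"\((?P<author>[^()]+)\)")

def author_in_parens(name):
    """Return the text inside the first (...) in name, or None."""
    m = AUTHOR_RE.search(name)
    return m.group("author") if m else None

print(author_in_parens("BA-123 How it works (John Smith) 1989"))
```

With the parentheses escaped, there should be no need to replace them with another character before importing.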
Old 03-01-2015, 09:39 PM   #2
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Code:
(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) \((?P<author>[A-Za-z. ]+)\) (?P<published>[0-9]+)
Pubdate seems to be making up its own month/day
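The pattern can also be checked outside calibre with Python's standard `re` module. This is only a sketch: the helper name `parse_name` is made up for illustration, and it assumes the pattern is applied to the file name with the extension removed, which this thread does not confirm.

```python
import os
import re

PATTERN = re.compile(
    r"(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) "
    r"\((?P<author>[A-Za-z. ]+)\) (?P<published>[0-9]+)"
)

def parse_name(filename):
    """Return the named groups as a dict, or None if there is no match."""
    # Assumption: the extension is not part of what gets matched.
    stem, _ext = os.path.splitext(filename)
    m = PATTERN.search(stem)
    return m.groupdict() if m else None

for name in ("BA-123 How it works (John Smith) 1989.txt",
             "ROT-4089 Make it this way (Jane Smith) 2009.txt"):
    print(parse_name(name))
```

Note that the author class `[A-Za-z. ]+` only allows ASCII letters, dots, and spaces, so an author name containing an apostrophe, a hyphen, or an accented letter would not match.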

Last edited by eschwartz; 03-01-2015 at 09:47 PM.
Old 03-02-2015, 01:07 AM   #3
JohnnyBook
Great, that seems to have done it (at least in the test box). I will run some books through and see if importing works, but I am sure it will.

Thanks,
Old 03-02-2015, 10:08 AM   #4
eschwartz
You're welcome.
Old 03-04-2015, 05:07 PM   #5
JohnnyBook
OK, in the test box it works perfectly (and, as you noted, there is a strange bug: all the "published" years have Mar-15 appended for month and day).

Once the files are processed, the data in Calibre is correct, except...

The titles are the original file name, not the book title from the add books regex.

ex.
ROT-4089 Make it this way (Jane Smith) 2009.txt
in the regex box is:

Series: ROT [4089]
Author: Jane Smith
Title: Make it this way
Published: 2009-03-15

But once imported in Calibre:
Title: ROT-4089 Make it this way (Jane Smith) 2009

Why is it ignoring the regex for the title?

EDIT: Never mind. I forgot to uncheck the box for read from metadata instead of file name.
Old 03-04-2015, 05:54 PM   #6
eschwartz
Glad you figured it out!

Also, regarding your PM re: covers -- no, there is no good solution for getting covers for a TXT file.
You can import using the option shown in the attached screenshot (calibre-add-books.png), but you would need a metadata.opf also.

Alternatively, do a metadata download (shortcut key is CTRL+D) which redownloads covers from various sources... which also takes time, although you can set it running automatically.
Old 03-04-2015, 06:39 PM   #7
JohnnyBook
Yes, that is how I import them. I may have to use the method posted in the thread I linked in my PM to generate covers and OPFs, then do a cover substitution and re-import.

Thanks for your assistance.
Old 03-04-2015, 08:32 PM   #8
JohnnyBook
Quote:
Originally Posted by eschwartz
Code:
(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) \((?P<author>[A-Za-z. ]+)\) (?P<published>[0-9]+)
Pubdate seems to be making up its own month/day
OK, it works in Calibre, except when the book does not have a published date. Then it fails: no parsing is done at all, the title is the file name, and the rest of the fields are blank.

Is there a way to make the published date part optional?

Last edited by JohnnyBook; 03-04-2015 at 08:35 PM.
Old 03-04-2015, 11:10 PM   #9
eschwartz
Yes, append that group with a question mark.
Code:
(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) \((?P<author>[A-Za-z. ]+)\) (?P<published>[0-9]+)?
If I had known the published date was supposed to be optional, I would've done that in the first place. You get what you asked for.
If I didn't have short-term memory loss and forget part of the OP...

Last edited by eschwartz; 03-05-2015 at 04:44 AM.
Old 03-04-2015, 11:49 PM   #10
JohnnyBook
Quote:
Originally Posted by eschwartz
Yes, append that group with a question mark.
Code:
(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) \((?P<author>[A-Za-z. ]+)\) (?P<published>[0-9]+)?
If I had known the published date was supposed to be optional, I would've done that in the first place. You get what you asked for.

First line in my OP.

"90% of my files are in the format: (The other 10% do not include the pub date)"

In any case, thanks bunches for all your help.

EDIT: Nope, that still did not do it... Maybe it still fails because it does not find the space after the author?

ROT-4089 Make it this way (Jane Smith) 2009.txt
ROT-4089 Make it this way (Jane Smith).txt

The first works, the second is not parsed.
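That guess can be checked with Python's standard `re` module. In the pattern from post #9, only the `published` group is optional; the literal space in front of it is still required, so a name that ends right after `)` cannot match. A sketch, with the file names written without the `.txt` extension (an assumption about what the pattern is matched against):

```python
import re

# Pattern from post #9: the trailing "?" applies to the group only,
# not to the space that precedes it.
PATTERN = re.compile(
    r"(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) "
    r"\((?P<author>[A-Za-z. ]+)\) (?P<published>[0-9]+)?"
)

with_year = PATTERN.search("ROT-4089 Make it this way (Jane Smith) 2009")
without_year = PATTERN.search("ROT-4089 Make it this way (Jane Smith)")

print(with_year.group("published"))  # the year is captured
print(without_year)                  # None: the mandatory space is missing
```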

Last edited by JohnnyBook; 03-05-2015 at 12:01 AM.
Old 03-05-2015, 04:43 AM   #11
eschwartz
Heh, overlooked that, sorry. That is indeed my fault then.


Dur -- because it was still trying to find a space at the end. This time I did the right thing and tested it.
Code:
(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) \((?P<author>[A-Za-z. ]+)\) ?(?P<published>[0-9]+)?
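A quick check of this version with Python's standard `re` module, as a sketch. The helper name `parse_name` is hypothetical, and the names are given without the `.txt` extension on the assumption that the extension is not part of what is matched. Making the space optional with ` ?` lets the pattern match whether or not a year follows the author.

```python
import re

PATTERN = re.compile(
    r"(?P<series>[A-Za-z ]+)-(?P<series_index>[0-9]+) (?P<title>.+) "
    r"\((?P<author>[A-Za-z. ]+)\) ?(?P<published>[0-9]+)?"
)

def parse_name(stem):
    """Return the named groups as a dict, or None if there is no match."""
    m = PATTERN.search(stem)
    return m.groupdict() if m else None

dated = parse_name("ROT-4089 Make it this way (Jane Smith) 2009")
undated = parse_name("ROT-4089 Make it this way (Jane Smith)")

print(dated["published"])    # year captured when present
print(undated["published"])  # None when the name has no year
```

In the undated case the other groups are still filled in, so series, title, and author are parsed even when the year is absent.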
Old 03-05-2015, 07:35 PM   #12
JohnnyBook
That did it. It works great now.

And it looks like the method from that other thread (generate covers and OPFs, then do a cover substitution and re-import) is actually pretty fast and easy to do.

https://www.mobileread.com/forums/sho...d+cover&page=2