Old 09-29-2011, 10:28 PM   #9
unboggling
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
Quote:
Originally Posted by Starson17
Odd, the posted regex doesn't seem to work for the Save to Disk template above.
In Preferences/SaveToDisk, "Save metadata in OPF file" was checked. When I turned it off, you're right: the previously posted regex didn't work.

Quote:
Originally Posted by Starson17
Code:
{author_sort}/{title}/{title} - {authors} - (({id}))
If it's working for you, that's fine, but if not, add this to the end :
Code:
 - \(\(\d+\)\)
to strip off " - ((id#))" at the end of your filename.
Thanks. I played with that and various permutations of Save templates and Add regexes for a while. Recently I switched to using FN LN instead of LN, FN for the Authors column, and I wanted to make all of these relatively congruent: my Save template, my Add regex, the FN LN convention in calibre, and my usual browsing and sorting order in calibre (Authors, Series, Title).

For the Save template, I went with single parentheses instead of double parentheses around the "{id}". I like that the {id} makes every saved book record's filename unique, which avoids OS filename conflicts for books with identical Authors/Titles. This is what I decided to use as the Save template for now:
Code:
{author_sort}/{title}/{authors} - {series}{series_index:0>2s| | - }{title} - ({id})
For the Add regex, I had been using this for a while, but it wasn't doing enough of what I wanted, and I didn't understand it well enough to mess with (though I tried…):
Code:
(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?
So I switched to a slightly simpler Add regex that does mostly what I want and that I understand a little better:
Code:
(?P<author>[^_]+?) - ((?P<series>.*) (?P<series_index>[0-9]*) - )?(?P<title>.+)
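To see what this simpler regex actually captures, here is a minimal Python sketch run outside calibre. It assumes the pattern behaves the same under Python's re module when searched against a filename with the extension already removed. The filenames and the parse helper are made up for illustration.

```python
import re

# The simpler Add regex from this post, split across lines but otherwise unchanged.
ADD_REGEX = re.compile(
    r"(?P<author>[^_]+?) - "
    r"((?P<series>.*) (?P<series_index>[0-9]*) - )?"
    r"(?P<title>.+)"
)


def parse(filename):
    """Return the named captures for a filename, or None if nothing matches.

    Hypothetical helper; the filename is assumed to have no extension.
    """
    match = ADD_REGEX.search(filename)
    return match.groupdict() if match else None


# Made-up filenames: with a series, without a series, and with a trailing ID.
for name in (
    "Isaac Asimov - Foundation 01 - Foundation",
    "Jane Austen - Emma",
    "Jane Austen - Emma - (42)",
):
    print(name, "->", parse(name))
```

In this sketch the series and series index are picked up when present and left empty when absent, but a trailing " - (42)" ends up inside the title capture.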
I usually Save To Disk with these checkboxes checked: Save cover separately, Update metadata in saved copies, and Save metadata in OPF file. I usually Add with the Read from file contents checkbox checked, and only uncheck it (to read from the filename) as a last resort when reading from file contents didn't work well. So for my purposes, when Adding I don't usually need to read from the filename or include the phrase Starson17 suggested for the ID digits that the Save template appends to the filename.

For Add reading from the filename, I tried various permutations of "( - \(\d+\))" on various Add regexes, but couldn't get any of them to work both when the filename has an ID and when it doesn't. Each one handled one case but not the other. I was probably doing something wrong in my regex noobness. Here are just a few of the attempts, using only the simpler Add regex, for illustration:

Code:
(?P<author>[^_]+?) - ((?P<series>.*) (?P<series_index>[0-9]*) - )?(?P<title>.+)(| - \(\d+\)|)??

(?P<author>[^_]+?) - ((?P<series>.*) (?P<series_index>[0-9]*) - )?(?P<title>.+)(| - \(\d+\)|)?

(?P<author>[^_]+?) - ((?P<series>.*) (?P<series_index>[0-9]*) - )?(?P<title>.+)( - \(\d+\))?

(?P<author>[^_]+?) - ((?P<series>.*) (?P<series_index>[0-9]*) - )?(?P<title>.+)( - \(\d+\))
I wonder if the greedy ".+" in the <title> group was interfering with my attempts to make the "\d+" phrase optional. If I'm interpreting it correctly, the ".+" says: match any single character except a newline (that's the dot), repeat that one or more times (that's the plus), and include all of it in the <title> capture. Because it is greedy, it seems to grab everything to the end of the filename, so the " - (id)" part lands inside the title capture and the optional group after it is left matching nothing. But removing the ".+" from <title> made things worse rather than better. Sigh. I don't want the ID included in the title capture, whether it exists in the filename or not.

I'd appreciate it if anyone can explain what I was doing wrong with the "\d+" phrasing or anything else.
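For what it's worth, a small Python sketch run outside calibre seems to support the greedy-title suspicion. It compares the third and fourth attempts above with one possible fix: a lazy title (".+?") followed by an optional, non-capturing ID group and a "$" end anchor. The sketch assumes calibre's filename matching behaves like Python's re.search on the filename without its extension, which I haven't verified. The sample filenames and the title_of helper are made up.

```python
import re

# Shared front part of the simpler Add regex from this post.
PREFIX = (
    r"(?P<author>[^_]+?) - "
    r"((?P<series>.*) (?P<series_index>[0-9]*) - )?"
)

# Third attempt: greedy title, optional ID group.
GREEDY = re.compile(PREFIX + r"(?P<title>.+)( - \(\d+\))?")
# Fourth attempt: greedy title, mandatory ID group.
MANDATORY = re.compile(PREFIX + r"(?P<title>.+)( - \(\d+\))")
# Possible fix (an assumption, not tested in calibre): lazy title, optional
# non-capturing ID group, and "$" so the lazy title still has to reach the end.
LAZY = re.compile(PREFIX + r"(?P<title>.+?)(?: - \(\d+\))?$")


def title_of(pattern, filename):
    """Return the title capture, or None when the pattern does not match."""
    match = pattern.search(filename)
    return match.group("title") if match else None


# Made-up filenames in the shape the Save template above would produce.
WITH_ID = "Isaac Asimov - Foundation 01 - Foundation - (123)"
WITHOUT_ID = "Isaac Asimov - Foundation 01 - Foundation"

for label, pattern in (("greedy", GREEDY), ("mandatory", MANDATORY), ("lazy", LAZY)):
    print(label, "|", title_of(pattern, WITH_ID), "|", title_of(pattern, WITHOUT_ID))
```

In this sketch the greedy version keeps " - (123)" in the title, the mandatory version fails to match at all when there is no ID, and the lazy, anchored version gives "Foundation" in both cases. Without the "$", the lazy title would stop after a single character, which may be part of why simply making things optional didn't help.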

Last edited by unboggling; 09-29-2011 at 10:35 PM. Reason: clean-up