Old 12-30-2013, 02:13 PM   #406
eschwartz
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by ACGAuthor
So, after spending a few days fooling around with things, here are places that I've noticed the three-column "General Metadata" plugboard method fails:

1) As already mentioned, any title with an ampersand (&) gets a space after the ampersand when it's reduced down to its initials. (I know eschwartz gave me some code that is supposed to fix this, but as I mentioned in my other posts, it resulted in an error.)

So "Title & Title" would be abbreviated "T& T". This gets especially absurd when you have a series name like (and this is an honest-to-god series) "Vampires & Mages & Weres, Oh My!", which is then abbreviated to "V& M& WOM".

However, if the title were "Title and Title", it would be abbreviated "TaT", so clearly the extra space is only inserted after the ampersand.

It would be an easy enough fix to just change all series titles to use "and" instead of "&" but sometimes that doesn't happen when you're retrieving the metadata off the web and you have to go through and check everything manually.

Personally I would prefer to see "Title and Title" abbreviated as "T&T" instead of "TaT", however, this would mess with titles using "and" in any other context (off the top of my head I can't come up with another context for it but I'm sure one exists somewhere.)

Actually, this seems to be a recurring problem with any single-letter word, because....

2) If a series title begins with the word "I" (as a personal pronoun referring to oneself), it gets a space after it. So a series titled "I Spy" would be abbreviated "I S". Again, not sure how to fix this without messing with other words containing the letter "i".

3) Titles with an "a" as a single word in the middle get a space after the "a" when being reduced to their initials. Eg: "Measure of a Man" is abbreviated to "Moa M" and "What's a Boy to Do?" would be abbreviated "Wa BtD." As above, with "i", I'm not sure how to fix this without messing up anything containing an "a" as part of a word.

4) If you have a series title like "A to Z" it would be nice to abbreviate it to "A-Z", but again, this would mess with the word "to" in any other context (e.g. "Brothers to the End" would then be abbreviated "B-tE", which just doesn't work). Not sure what to do about that.

5) Another interesting way to shorten series titles would be to get rid of extraneous words at the end as well as at the beginning. A (partial, I'm sure) list I came up with would be: Series, Stories, Novels, Books, Trilogy, etc.

For example (for you, eschwartz) if we added "Files" onto that list, the series "The Dresden Files" would be shortened to "Dresden" instead of "DF". So if the series name is "The Home Series" (as in, that is the metadata that downloads, complete with "Series" at the end) it would be shortened to "Home" rather than "HS." "The Impulse Trilogy" would be "Impulse" instead of "IT".

6) This one isn't limited to the 3-step General Metadata method, but it would be great to have a way to format the series numbers so that they stay in sequential order when a series has 10 or more titles and some of the numbers have a decimal place (i.e. 2.50 sorts after 10, because it isn't formatted as 02.50).

Also, if there is a series with over 100 titles (I happen to have one; it's not so much of a series as it is a collection of shorts people submitted for an event over on Goodreads, and there are ~190 of them) then you have the following situation:

01, 02 ... 09, 10, 100, 101 ... 109, 11, 110, 111 ... 119, 12, 120, 121 ... and so forth. In this case, there would not need to be any accommodation for decimal places, but for proper sequencing, we would need 001, 002, 003 ... 011, 012, etc.

This is all really nitpicky stuff that probably only matters to the exceptionally anal-retentive, but it's been interesting hunting down these sorts of glitches, and if eschwartz or someone wants to stretch the boundaries of what to do with these plugboards, this could give them something to chew on, so I thought I would share.
You make an excellent point about single-letter words; this actually gives me a new direction to think in. I'll get back to you. Meanwhile...

For 4 & 5, I really think the best way to deal with it is to change the actual series name. You can do this really fast in the tag browser on the left: right-click on any series name and choose "Rename", and it will change for all books with that series.

6 can be done easily, no problem. The calibre manual's section on advanced formatting has all kinds of cool stuff, which you may want to look through for new tricks, but this particular bit is what you are looking for right now:

Spoiler:
Quote:
Second: formatting. Suppose you wanted to ensure that the series_index is always formatted as three digits with leading zeros. This would do the trick:

Quote:
{series_index:0>3s} - Three digits with leading zeros
If instead of leading zeros you want leading spaces, use:

Quote:
{series_index:>3s} - Three digits with leading spaces
For trailing zeros, use:

Quote:
{series_index:0<3s} - Three digits with trailing zeros
If you use series indices with sub values (e.g., 1.1), you might want to ensure that the decimal points line up. For example, you might want the indices 1 and 2.5 to appear as 01.00 and 02.50 so that they will sort correctly. To do this, use:

Quote:
{series_index:0>5.2f} - Five characters, consisting of two digits with leading zeros, a decimal point, then 2 digits after the decimal point
If you want only the first two letters of the data, use:

Quote:
{author_sort:.2} - Only the first two letters of the author sort name
The calibre template language comes from python and for more details on the syntax of these advanced formatting operations, look at the Python documentation.

Using ":" after the field name says "let's do advanced formatting". "0" says "pad with zeros". ">" says "push the value to the right, so the padding goes at the beginning" ("<" would put the padding at the end), and "3s" says "treat the value as a string that is always at least 3 characters long".

So "{series_index:0>3s}" will add enough leading zeros to make sure there are always at least 3 characters.
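
Since the manual says the template language borrows this syntax from Python, here is a minimal sketch in plain Python (not calibre itself) of what those format specs do. The pad() helper is made up for illustration, and I am passing the index as text because Python's "s" type only accepts strings; how calibre converts series_index internally is an assumption here.

```python
# Plain Python string formatting, used as a stand-in for calibre templates.
# pad() is a hypothetical helper, not part of calibre.
def pad(value, spec):
    """Apply a format spec such as '0>3s' to a text value."""
    return format(value, spec)

print(pad("7", "0>3s"))           # pad with leading zeros up to 3 characters
print(pad("7", ">3s"))            # pad with leading spaces instead
print(pad("7", "0<3s"))           # pad with trailing zeros
print(pad("1234", "0>3s"))        # already longer than 3, so left untouched
print(pad("Schwartz, E", ".2"))   # keep only the first two characters
```

Note that the width is a minimum, not a limit: a value that is already longer than 3 characters is passed through unchanged.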

If you wish to take decimal-numbered short stories into account, use "5.2f" instead. The "f" treats the value as a decimal number, ".2" always shows exactly two digits AFTER the decimal point (adding zeros at the end as needed), and "5" is the minimum total width in characters, counting the decimal point. The "0>" then adds leading zeros until that width is reached. Longer numbers are not cut off: 100 simply comes out as "100.00".



So: use {series_index:0>3s} for series that run into the hundreds, and {series_index:0>5.2f} for series that run into the tens with short stories in between. If you wish to do both, use {series_index:0>6.2f}, which will make every series index show as 001.00 and onward.
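
To see why the padding fixes the ordering, here is a rough Python sketch (again plain Python rather than calibre, with made-up index values) that sorts the same series numbers as text, first unpadded and then with the 0>6.2f spec. Text sorting is an assumption about how the device or file browser orders the names.

```python
# Made-up series indices: a short story at 2.5 and a long series up to 100.
indices = [1, 2.5, 10, 100]

# Unpadded text, e.g. "1", "2.5", "10", "100" ("g" drops the trailing ".0").
plain = sorted("{:g}".format(i) for i in indices)

# Padded text: at least 6 characters wide, 2 digits after the decimal point.
padded = sorted("{:0>6.2f}".format(i) for i in indices)

print(plain)   # text order puts 10 and 100 before 2.5
print(padded)  # text order now matches numeric order
```

With the padded form, comparing character by character gives the same order as comparing the numbers themselves, which is the whole point of the leading zeros.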