#1 |
Quack! Quack!
Posts: 92
Karma: 9176
Join Date: Apr 2011
Location: Florida
Device: kindle 3 & sony daily prs950sc
|
Importing Version Info...Possible?
So I created a user-made column called version, into which I would like to import the version info that is usually in the filenames of the books.

Example: George RR Martin - Ice & Fire 1 - Game of Thrones (v5.0).epub

Is it possible to import the version info into the user-created column, and if so, how? As far as I know, the test screen for import doesn't let you create a test field for it, so you can't check whether your work is correct.

This is what I tried, along with a few variations:

^(?P<author>[^-]+)(\s*-\s*(\[?(?P<series>[^-0-9]+)\s*(?P<series_index>[0-9.]+)?]?)?)?.*?-\s*(?P<title>[^-]+)(\s*-\s*(\[?(?P<version>[^-]+)

with and without a # sign in front of version. So I don't know whether the regex is incorrect, or whether it is even possible to import into user-created columns.

I think it would be very useful, when importing large collections with duplicate titles in different versions, to not strip the version info and instead have it go into its own column, so you can output it with or without. It's also great in combination with Find Duplicates with fuzzy logic... you can then get rid of old versions.

I have currently been importing them with:

^(?P<author>[^-]+)(\s*-\s*(\[?(?P<series>[^-0-9]+)\s*(?P<series_index>[0-9.]+)?]?)?)?.*?-\s*(?P<title>[^-]+)

so that it will import the version info as part of the title, like so:

Author: George RR Martin
Title: Game of Thrones (v5.0)
Series: Ice & Fire
Series Index: 1

I stopped using this:

^(?P<author>[^-]+)(\s*-\s*(\[?(?P<series>[^-0-9]+)\s*(?P<series_index>[0-9.]+)?]?)?)?.*?-\s*(?P<title>[^\]{[()]+\w)

which strips all brackets, including version info... I pre-strip all brackets other than the version bracket and clean up the files with Flash Renamer before importing to calibre.

The issue with having version info in the title is that when you go to get metadata it can throw the lookup for a loop, so the title isn't recognized.

Does anyone else find this request useful? Or would anyone else like the ability to do that?

I do a lot of filename cleanup in bulk with regex, wildcards and commands that can be stored and run in batch with Opus and Flash Renamer.
For example, a file like this before running my batch command might look like this:

MCMARTIN, GeoRge R. R. - {Songs OF Fire & ICE 01] - a_game_of_THRONes [unabridged] (V5.1) {epub}.epub

and will get fixed to this:

George RR McMartin - Songs of Fire & Ice 01 - A Game of Thrones (v5.1).epub

And I can run it against thousands of files at once.

Another useful tool is ExtractNow, which will bulk extract archives, including their folders and subfolders, to a folder of your choice, and can delete the archives afterwards if you want, without you having to go into each folder/subfolder manually. Pretty useful for some downloaded collections. If anyone is interested in any of the other software batch commands I have set up, just PM me and I'll be happy to help.

One other problem: I need a modification of what I'm using in a find and replace. This is what I'm using:

INFO-----: Swap Lastname, Firstname if in front of a title or series with a dash (-)
EXAMPLE-: Carlin, George P - Stupid Jokes 1 - Your Mama!.epub
RESULT--: George P Carlin - Stupid Jokes 1 - Your Mama!.epub
FIND-----: ^(\w+), *([\w \.]+)[ ]+-[ ]*(.*)
REPLACE-: \2 \1 - \3
OR-------: (depending on what program you're using)
REPLACE-: $2 $1 - $3

But it won't work on filenames like this:

Carlin, George P & Swift, Taylor L. L. - Stupid Jokes 1 - Your Mama!.epub

The initials could have periods. The file should end up looking like this:

George P Carlin & Taylor L. L. Swift - Stupid Jokes 1 - Your Mama!.epub

I need either one pattern that can handle both multiple and single names, or a separate one that handles multiple names, whichever is easier. Any ideas? Thanks for your time, whoever solves the problem.

Last edited by penguinaka; 06-11-2011 at 04:05 PM. |
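One way to approach the multi-author swap asked about above is to split the author part on "&" and swap each name separately, rather than trying to do it all in a single find-and-replace pattern. The following is only a minimal Python sketch of that idea, not a Flash Renamer or Opus command; the names `AUTHOR`, `swap_author` and `fix_filename` are made up for illustration, and it assumes the author part ends at the first " - " in the filename.

```python
import re

# One "Lastname, Firstname(s)" author; first names or initials may contain periods.
AUTHOR = re.compile(r"^\s*([\w'-]+),\s*([\w .]+?)\s*$")


def swap_author(name):
    """Turn 'Carlin, George P' into 'George P Carlin'; leave other forms alone."""
    m = AUTHOR.match(name)
    return f"{m.group(2)} {m.group(1)}" if m else name.strip()


def fix_filename(filename):
    """Swap every '&'-separated author in front of the first ' - ' separator."""
    authors, sep, rest = filename.partition(" - ")
    if not sep:
        return filename  # no ' - ' found, so there is no author part to work on
    swapped = " & ".join(swap_author(a) for a in authors.split("&"))
    return f"{swapped} - {rest}"


print(fix_filename("Carlin, George P & Swift, Taylor L. L. - Stupid Jokes 1 - Your Mama!.epub"))
print(fix_filename("Carlin, George P - Stupid Jokes 1 - Your Mama!.epub"))
```

Because each author is handled on its own, the same function covers single names, multiple names, and names that are already in "Firstname Lastname" order.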
#2 | |||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Quote:
Quote:
|
|||
#3 |
Quack! Quack!
Posts: 92
Karma: 9176
Join Date: Apr 2011
Location: Florida
Device: kindle 3 & sony daily prs950sc
|
Another question: what about importing the version into the publisher field? Could the data then be transferred into the user-created column in bulk by some command?

And if the info is imported into calibre as publisher info, will calibre try to merge a duplicate book, if it is set to merge, when the publisher/version info is different? For example:

george rr martin - game of thrones (v1.5).epub
george rr martin - game of thrones (v5.0).epub
george rr martin - game of thrones (v4.0).mobi

These are obviously different versions, with 5.0 being an improvement in quality. If I have it set so that the books import with the version info going into the publisher field, will it attempt to merge them?

Last edited by penguinaka; 06-12-2011 at 09:05 PM. |
#4 | |
Well trained by Cats
Posts: 30,908
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Only including Version in the Title will make separate entries. |
|
#5 |
Quack! Quack!
Posts: 92
Karma: 9176
Join Date: Apr 2011
Location: Florida
Device: kindle 3 & sony daily prs950sc
|
|
#6 |
Calibre Plugins Developer
Posts: 4,721
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Surely if you turn off auto merge you will just get prompted about the duplicate, in which case you can tell the prompt not to merge?
|
#7 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Sure, you can do that in bulk metadata search & replace. Use regex mode. As for the other comments, see theducks' earlier post.
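A regex for that kind of search and replace can be tried out in Python before it is run against the library. This sketch only exercises the pattern; it does not reproduce calibre's bulk edit dialog, and the two-pass approach (first copy the version out, then strip it from the title) as well as the names `PATTERN`, `version_of` and `clean_title` are assumptions made for illustration.

```python
import re

# Title followed by a trailing "(vX.Y)" marker, e.g. "Game of Thrones (v5.0)"
PATTERN = r"^(.*?)\s*\(v([0-9.]+)\)\s*$"


def version_of(title):
    """Pass 1: the text that would be written to the version column."""
    m = re.match(PATTERN, title, re.IGNORECASE)
    return m.group(2) if m else ""


def clean_title(title):
    """Pass 2: the title with the version marker removed (replacement \\1)."""
    return re.sub(PATTERN, r"\1", title, flags=re.IGNORECASE)


title = "Game of Thrones (v5.0)"
print(version_of(title), "|", clean_title(title))
```

Titles without a version marker are left unchanged by `clean_title` and give an empty string from `version_of`.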
|
#8 | |
Quack! Quack!
Posts: 92
Karma: 9176
Join Date: Apr 2011
Location: Florida
Device: kindle 3 & sony daily prs950sc
|
Quote:
I guess I'll have to do it in a few steps, like the suggestion earlier in the thread. I appreciate all the feedback from everyone, thank you. If the series is different, then a duplicate is not merged, correct? Or, for example, if one has series info and the other doesn't? Thanks, Manichean! |
|
#9 | |
Calibre Plugins Developer
Posts: 4,721
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Quote:
To illustrate that point, series is certainly not considered, only the title and author. Remember that whether it is automerge, calibre's duplicate detection or the Find Duplicates plugin, a duplicate is determined by a portion of its title. The granularity of that comparison can only be controlled by the Find Duplicates plugin. What are you trying to achieve? Just keep the highest version? Keep all versions with the version in a column? I think you are in for a world of pain whichever route you take. I don't think you should consider turning automerge on unless you are very specific about what you are adding, though. |
|
#10 | |
Quack! Quack!
Posts: 92
Karma: 9176
Join Date: Apr 2011
Location: Florida
Device: kindle 3 & sony daily prs950sc
|
Quote:
I was under the impression that auto merge grouped together books with the same name but different file types, and that if the name and file type were both the same it kept the first one it came to and discarded the later ones. I took "the same" to mean that the author, series, title and extension all matched. Yes? And you're saying series isn't considered... I thought it was. As for what I was trying to achieve with version: I was going to use Find Duplicates with fuzzy logic on both settings, then pick and choose which versions to keep. That would be the highest version, with consideration for whether it was a v5 PDF (they convert poorly), in which case it is better to keep it in its original format. Everything is getting converted to MOBI, but I'm keeping copies of the .epub's and v5 .pdf's; everything else gets deleted (of course I keep a copy of the original backup pre-deletions). |
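The "keep the highest version" step described in this post could be roughed out as below. This is only an illustrative sketch under stated assumptions: the `(author, title, version, fmt)` tuple layout and the function names are hypothetical, versions are assumed to be dotted integers such as "5.0", and the manual judgment about v5 PDFs is not modelled.

```python
def version_key(version):
    """'5.0' -> (5, 0), so that 5.10 sorts above 5.9."""
    return tuple(int(part) for part in version.split(".") if part)


def best_per_format(books):
    """Keep the highest version for each (author, title, format) combination.

    books: iterable of (author, title, version, fmt) tuples (hypothetical layout).
    """
    best = {}
    for author, title, version, fmt in books:
        key = (author.lower(), title.lower(), fmt.lower())
        if key not in best or version_key(version) > version_key(best[key][2]):
            best[key] = (author, title, version, fmt)
    return list(best.values())


books = [
    ("george rr martin", "game of thrones", "1.5", "epub"),
    ("george rr martin", "game of thrones", "5.0", "epub"),
    ("george rr martin", "game of thrones", "4.0", "mobi"),
]
for book in best_per_format(books):
    print(book)
```

Comparing versions as tuples of integers rather than as strings avoids "5.10" sorting below "5.9".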
|
#11 |
Calibre Plugins Developer
Posts: 4,721
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Don't confuse the regex you use with the filename with the logic Calibre has internally to decide whether to treat two books as duplicates for automerge purposes. Automerge has some "fuzziness" in its comparison of book titles, stripping a whole bunch of characters like brackets, punctuation, title sort characters like "the, a, an" etc. It is certainly not an "exact match" experience.
It will not throw away numeric values (but it will throw away the periods separating them). You might "get lucky": provided every book you import has enough different characters, you may get it to do what you want. However, you are absolutely playing with fire with this, and as the saying goes, you may get burned. It has its purposes; the best usage of it, imho, is when you have another format of an existing book in your library that you want to add. Say you have a book record in MOBI format, and now you get an EPUB version from somewhere. For that purpose AutoMerge is brilliant. However, trying to use Automerge in combination with bringing in multiple versions of epubs sounds "wrong" to me. You might get lucky, or you might end up with a mish-mash mess. |
#12 | |
Quack! Quack!
Posts: 92
Karma: 9176
Join Date: Apr 2011
Location: Florida
Device: kindle 3 & sony daily prs950sc
|
Quote:
|
|
#13 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
kiwidude's comments are all right on point.
Automerge off

With Automerge off (and accepting the duplicates), you get each format as a separate record. The author/title will be whatever you specified (obtained from metadata or calculated by regex from the filename). You then use Merge to put matching records together. This is the best option for those OCD users who want to deal manually with every record merger. If the title of the HTML version is "The Oasis: A Novel" and the title of the EPUB is "The Oasis - A Novel", you get two records with those different titles. kiwidude's Find Duplicates plugin will let you find them, and my Merge will let you select the better title and merge them.

Automerge: Fuzzy Title Matching

Automerge was written before Find Duplicates, for those who didn't want to have to do it all manually. It will see the above two titles as the same (punctuation is ignored and multiple spaces are collapsed to single spaces). The title of the first format entered will be the title used for the book, and later titles will be discarded if they are a close enough match. Authors must match exactly. Automerge fuzzy matching won't ignore any differences in the author, and won't ignore any character order differences, except when the title starts with an article ("The", "A", etc.), for whatever articles you've set to be ignored in your language in the applicable Tweak. There will still be lots of unmerged duplicate books as a result of these non-matches. Find them with Find Duplicates and use Merge to pick the better Author and Title.

Automerge: Duplicate Formats

The other question is what to do with incoming formats when Author/Title match according to the AutoMerge fuzzy matching rules and the incoming format already exists in the matching record. You have three choices:

For OCD users, tell Automerge to create a new record, use Find Duplicates to locate the dupes, and manually Merge them.
For less compulsive users, tell it to ignore the incoming format as a duplicate.
For those who like to work on a copy of a book and then replace the old with the new, or those who assume that newer copies are better, set AutoMerge to overwrite.

I have added thousands of books in testing, and have yet to find an inadvertent AutoMerge match. That said, if your books come from different sources there will likely be many non-matches. If so, Find Duplicates still has to be used. |
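The fuzzy title matching described in this post can be approximated in a few lines of Python. This is an illustration of the rules as stated in the thread, not calibre's actual implementation: the function names and the `ARTICLES` tuple are hypothetical, and lower-casing the title is an extra assumption that the posts do not mention.

```python
import re

ARTICLES = ("the", "a", "an")  # assumed list; calibre takes this from a Tweak


def fuzzy_title(title):
    """Normalize a title: drop punctuation, collapse spaces, skip a leading article."""
    t = title.lower()                    # assumption: comparison ignores case
    t = re.sub(r"[^\w\s]", " ", t)       # punctuation (including periods) becomes a space
    t = re.sub(r"\s+", " ", t).strip()   # collapse runs of whitespace
    first, _, rest = t.partition(" ")
    if rest and first in ARTICLES:
        t = rest
    return t


def would_merge(author_a, title_a, author_b, title_b):
    """Authors must match exactly; titles only need to match after normalization."""
    return author_a == author_b and fuzzy_title(title_a) == fuzzy_title(title_b)


print(would_merge("X", "The Oasis: A Novel", "X", "The Oasis - A Novel"))
print(would_merge("George RR Martin", "Game of Thrones (v5.0)",
                  "George RR Martin", "Game of Thrones (v1.5)"))
```

Under these rules the two "Oasis" titles compare as equal, while titles that still carry different version numbers do not, because the digits survive normalization. A difference in the author string, such as "George R. R. Martin" versus "George RR Martin", also prevents a match.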
#14 | |
Quack! Quack!
Posts: 92
Karma: 9176
Join Date: Apr 2011
Location: Florida
Device: kindle 3 & sony daily prs950sc
|
Quote:
|
|