10-14-2010, 02:47 PM | #1 |
Member
Posts: 12
Karma: 10
Join Date: May 2009
Location: Minneapolis, MN.
Device: Bebook mini
|
Tips on Bulk Edit Metadata
Hi all;
I've loaded the contents of the Project Gutenberg 042010-DVD, over 30,000 files, into a Calibre library. Now I have PG numbers in the title column. I want to keep the PG IDs in a custom column and get the title and author from the text or HTML files. Any tips, tricks, or suggestions would be greatly appreciated. |
10-14-2010, 05:02 PM | #2 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Build your custom column, use search & replace to get everything from the title column into your custom column. Done.
Edit: I ought to elaborate, I think. Use regular expression search mode, title field as source, your custom column as target. Search pattern is Code:
(.*)
and the replace pattern is Code:
\1
Last edited by Manichean; 10-14-2010 at 05:08 PM. |
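For anyone unsure what that pattern pair does, here is a rough Python sketch of the idea. It only models the behaviour with `re.sub`; it is not calibre's own code. The sample title values and the `#pgid` column name are made up for illustration, and the digits-only pattern is an extra suggestion that was not part of the advice above.

```python
import re

def copy_field(source_value, search=r"(.*)", replace=r"\1"):
    """Model of a regex search & replace whose result goes to another field.

    The search pattern captures text from the source value and the replace
    pattern says what to write to the destination. With (.*) and \\1 the
    whole value is copied unchanged.
    """
    return re.sub(search, replace, source_value, count=1)

# Hypothetical record: the title currently holds the Project Gutenberg number.
book = {"title": "pg12345", "#pgid": ""}

# Copy the whole title into the (hypothetical) custom column.
book["#pgid"] = copy_field(book["title"])

# Variation, assuming the ID is the first run of digits in the title:
# keep only the digits.
digits_only = copy_field("pg12345.txt", search=r"\D*(\d+).*", replace=r"\1")

print(book["#pgid"], digits_only)
```

The source field is left untouched, so the title column still holds the PG number until it is overwritten later.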
10-15-2010, 09:06 AM | #3 |
Member
Posts: 12
Karma: 10
Join Date: May 2009
Location: Minneapolis, MN.
Device: Bebook mini
|
Thanks, Manichean!
That worked like a charm! I was trying to do it without regular expressions, so I didn't see the extra field that lets the output go somewhere other than the source field. If anyone can advise me on the best scripting language to learn so I can get the title and author from the txt and html files, I would appreciate it greatly. Are sed and grep the best way, or should I invest the time to learn Python? |
10-15-2010, 09:18 AM | #4 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
I don't know much about scripting languages, but I'm currently learning Python. It's a great language for non-time-critical tasks, I think. If you know a way to do it in a shell script, I'd suggest using that, unless you really want to learn a scripting language.
|
10-15-2010, 09:32 AM | #5 |
Grand Sorcerer
Posts: 11,741
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
If you feel up to it, you might consider Perl. Text matching and regular expressions are integral to the language, making it good for text hacking.
Python has the advantage of being like 'normal' languages such as C or Java, but with some nice string manipulation thrown in. It wouldn't be my first choice for hacking text, but it isn't a bad choice by any means. Python also has the advantage of being able to use calibre's libraries, so you could directly set the fields in the database. That by itself could make it the best choice around.
sed/grep/awk/etc. can work too, if you can work out the patterns and chaining. This option would be best if the title & author information is easy to locate (standard bracketing text). You probably would need to generate a mess of calibre command-line scripts as output, but that is also true for Perl.
Have fun. |
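To make the "locate the fields, then generate command lines" route a bit more concrete, here is a minimal Python sketch. It assumes the Project Gutenberg text files carry `Title:` and `Author:` lines near the top, which is worth checking on a sample of the DVD files first. It also assumes `calibredb set_metadata` accepts `--field name:value` in your calibre version (check `calibredb set_metadata --help`). The sample text, the book id 42, and the function names are invented for the example.

```python
import re
import shlex

# Assumed header layout: lines such as "Title: ..." and "Author: ...".
HEADER_FIELD = re.compile(r"^(Title|Author):\s*(.+?)\s*$", re.MULTILINE)

def extract_title_author(text, max_chars=5000):
    """Return (title, author) from the start of a text, or None for a missing field."""
    found = {}
    for name, value in HEADER_FIELD.findall(text[:max_chars]):
        # Keep only the first occurrence of each field.
        found.setdefault(name.lower(), value)
    return found.get("title"), found.get("author")

def calibredb_command(book_id, title, author):
    """Build one shell-quoted command line (assumed calibredb syntax)."""
    args = ["calibredb", "set_metadata", str(book_id)]
    if title:
        args += ["--field", f"title:{title}"]
    if author:
        args += ["--field", f"authors:{author}"]
    return shlex.join(args)

# Made-up sample in the assumed header format.
sample = """The Project Gutenberg EBook of Moby Dick, by Herman Melville

Title: Moby Dick

Author: Herman Melville

Language: English
"""

title, author = extract_title_author(sample)
command = calibredb_command(42, title, author)
print(command)
```

Writing the commands to a file and reviewing a few by hand before running them is probably the safer route with 30,000 books. `shlex.join` needs Python 3.8 or later.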
10-15-2010, 10:54 AM | #6 |
Member
Posts: 12
Karma: 10
Join Date: May 2009
Location: Minneapolis, MN.
Device: Bebook mini
|
Thanks for the fast replies.
Looks like it will be worth my time to learn Python. I saw Perl extension libs for Python, if worse comes to worst. Thanks again. Dave. |
Tags |
bulk edit metadata, library management, metadata import, tips & tricks |
|