08-01-2011, 03:44 PM | #1 |
Member
Posts: 23
Karma: 10
Join Date: Aug 2011
Device: none
|
Yet more PDF tags metadata questions...
Hello, I just started organizing my ebook collection and started using calibre. Though I find it useful in many ways, it seems unable to write the tags it downloads into the actual pdf metadata.
I couldn't quite work out from the various threads whether calibre is supposed to be able to do this or not. Can someone clarify this for me? My solution so far has been to use the calibredb database export function and exiftool to generate a list of tags and the corresponding files, along with a bash script to assign the appropriate tags to all the pdfs at once, which, amazingly (considering my scripting skills), actually works. Well, 99% of the time, anyway. Just wondered if anyone has a better solution... Thanks! |
08-01-2011, 04:37 PM | #2 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
Does that answer your question? |
|
08-01-2011, 04:45 PM | #3 |
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
|
Now I'm confused. I thought metadata couldn't be incorporated into pdf's from calibre except in an associated opf file. Since I convert nearly everything to epub, I don't have to mess with pdf's much anymore, except for initial conversion and cleanup of incoming pdf's.
Last edited by unboggling; 08-02-2011 at 07:25 PM. |
08-01-2011, 05:01 PM | #4 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
I avoid pdf's like the plague, so I don't have much firsthand knowledge. However, Calibre can write metadata to pdfs, but there are some bugs in the pdf code Calibre uses. If he's expecting Calibre to put metadata into the library copy, then that's why he doesn't see it.
|
08-01-2011, 05:03 PM | #5 |
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
|
After conversion to epub I delete the pdf format. If I want to incorporate metadata including tags into the actual format of that book, I assign the tags in appropriate calibre fields then do a conversion - with structure detection tab "insert metadata as page at start of book" selected - of epub or other format that supports internal tags from calibre. I didn't think pdf's allow that.
|
08-01-2011, 05:05 PM | #6 |
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
|
Starson. Oh, ok. Thanks for the clarification. Our posts crossed.
|
08-01-2011, 05:08 PM | #7 |
Member
Posts: 23
Karma: 10
Join Date: Aug 2011
Device: none
|
"Specifically, metadata is not updated until the ebook is exported"
Yes, I mean it does not update metadata tags in my pdfs, even if and especially when exported to disk.
"I thought metadata couldn't be incorporated into pdf's from calibre except in associated opf file"
I think I agree! I don't like having the metadata in a separate .opf file. But I've been having a lot of luck brute-force embedding the tags directly into the pdfs, like I said, with exiftool and calibre command line tools. It's very messy though...
Thanks for the replies, I guess everyone hates pdf huh? |
08-01-2011, 05:26 PM | #8 |
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
|
epubs are much easier to work with. I bought the full version of Acrobat a couple of months ago so I could edit headers, footers and page numbers out of pdf formats. Little did I know that it's so difficult and confusing to use and doesn't handle most of my header, footer and page number (h, f, pn) problems anyway, though it does have intriguing batch functions.
It's much easier for me to convert pdf to epub, see what's wrong with it, tag appropriately, convert to rtf, mess with search/replace in Word, save as docx to get rid of a lot of extraneous MS RTF format garbage (which usually reduces size a lot), then run the docx through open office into odt format to further clean up MS garbage and reduce size again, and add it back in to calibre. That sequence sounds like a lot, but it works really well for me. Eventually I'll know enough regex to handle stripping h, f, pn's directly from calibre search/replace, but until I do, the process I described works remarkably well.
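For what it's worth, the kind of regex being described can be sketched in a few lines of Python. This is only an illustration on plain text, assuming page numbers sit alone on their own line (optionally prefixed with "Page") and that the running header is a known, fixed string; the function names are made up here, and patterns used in calibre's own search/replace would likely need adapting to the markup calibre actually sees.

```python
import re

# Assumption: a page number is 1-4 digits alone on a line,
# optionally preceded by the word "Page".
PAGE_NUMBER_LINE = re.compile(
    r"^[ \t]*(?:Page[ \t]+)?\d{1,4}[ \t]*\n",
    re.MULTILINE | re.IGNORECASE,
)


def strip_page_numbers(text):
    """Remove lines that contain nothing but a page number."""
    return PAGE_NUMBER_LINE.sub("", text)


def strip_running_line(text, running_line):
    """Remove every line that consists only of a known header/footer string."""
    pattern = re.compile(
        r"^[ \t]*" + re.escape(running_line) + r"[ \t]*\n",
        re.MULTILINE,
    )
    return pattern.sub("", text)
```

Note that a real line of body text consisting only of a number (a year on its own line, say) would be removed too, so the result is worth checking by eye.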
|
08-01-2011, 05:30 PM | #9 |
Wizard
Posts: 1,065
Karma: 858115
Join Date: Jan 2011
Device: Kobo Clara, Kindle Paperwhite 10
|
I believe metadata is updated into opfs soon after changing it in calibre's library view or in Edit Metadata window. I read that somewhere here just the other day.
Last edited by unboggling; 08-03-2011 at 12:00 AM. |
08-02-2011, 12:24 AM | #10 | |||
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Quote:
Save to disk, Send to device, Connect to folder and, since stason17 included it, getting a book via the Content server.
Quote:
Quote:
PDF is a fine format for the desktop and for printing, but it is not a reflowable ereader format. Because it is primarily a print format, most of the time it doesn't convert well at all. Read here for PDF conversion issues. |
|||
08-02-2011, 06:53 AM | #11 |
Member
Posts: 23
Karma: 10
Join Date: Aug 2011
Device: none
|
PDF metadata writer plugin is definitely enabled and working for title and author...but not tags...for me, anyway.
I'm not just grabbing and dumping from calibre library. I'm using save to disk (single directory, update metadata checked, separate image unchecked, save separate .opf unchecked; also tried this checked, still doesn't work...).
Calibre is great at mass information downloads and some other stuff like converting books, but I just don't like using it to actually manage my books (not yet, anyway...maybe it'll grow on me). So the metadata in .opf, rather than the actual .pdf, is not what I want, I don't think (I only just started out trying to manage my thousands of .pdfs, so I don't really know what I'm talking about...).
My command line tricks involve cleaning the filenames containing stuff like "&_-'"<>" with some scripts (after exporting from calibre with filename {title}), then doing a:
"calibredb catalog whatever.csv --fields=tags --sort-by=title_sort"
So then the exported files on my disk "line up" (I hope!) with the list of tags exported using calibredb, once the first line is removed from the file containing the list of tags. Then, using awk, I merge the list of files exported to disk, line for line, with the list of tags, and add "exiftool -keywords+=" to every line. Then execute the file. It works, as long as calibre is set not to download too many tags...
That sounds really simple when you write it down! (it isn't)
I don't have an ereader, so these other formats are a little unusual to me. |
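For anyone wanting to try the same approach without awk, here is a minimal Python sketch of the pairing step described above. It assumes the catalog CSV has a single tags column with a header row and comma-separated tags per book, and that the file list is already sorted to match the --sort-by=title_sort order. The names read_tag_rows and build_commands are made up for illustration and are not part of calibre or exiftool. The sketch only builds the exiftool argument lists; it does not run anything.

```python
import csv
import io


def read_tag_rows(csv_text):
    """Parse the catalog CSV text and return one list of tags per book.

    Assumption: one column, first row is a header, tags are comma-separated.
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # drop the header line, as the post does by hand
    rows = []
    for row in reader:
        cell = row[0] if row else ""
        rows.append([tag.strip() for tag in cell.split(",") if tag.strip()])
    return rows


def build_commands(pdf_paths, tag_rows):
    """Pair files with tag rows line for line and build exiftool arguments."""
    if len(pdf_paths) != len(tag_rows):
        raise ValueError("file list and tag list have different lengths")
    commands = []
    for path, tags in zip(pdf_paths, tag_rows):
        if not tags:
            continue  # nothing to write for a book without tags
        # One -keywords+= argument per tag, then the target file.
        commands.append(["exiftool"] + ["-keywords+=" + t for t in tags] + [path])
    return commands
```

Each list could then be handed to subprocess.run(args, check=True). Passing the arguments as a list bypasses the shell, so characters like & or ' in file names should not need cleaning just to survive quoting. The length check only catches a mismatch in count, not in order, so spot-checking a few files is still a good idea.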
08-02-2011, 07:23 AM | #12 | |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
Quote:
I just saved two of my 4 PDFs to disk and the Title and Author were updated, but there were no keywords added, the area I thought tags might go. Maybe Title and Author are the only metadata that gets added to the pdf. Hopefully someone with more experience can shed some light on the subject. |
|
08-02-2011, 01:06 PM | #13 |
creator of calibre
Posts: 43,843
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
IIRC, only title, author and book producer are updated in PDF files.
|
08-02-2011, 02:08 PM | #14 |
creator of calibre
Posts: 43,843
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
I just committed support for reading/writing tags in PDF.
|
08-02-2011, 02:53 PM | #15 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
|
|