Searching for a way to batch-update PDFs metadata


Pulp
01-11-2009, 07:47 AM
I have a bunch of PDF-files sorted into folders like this:
Christie, Agatha/Agatha Christie - The Secret Adversary.pdf
Christie, Agatha/Agatha Christie - The Mysterious Affair at Styles.pdf
etc.

Trying out calibre, I found that in most of the files the meta-information is really bad (the title field containing both author and title, the PDF creator putting his/her own name in the author field instead of the real author's name, etc.). Since calibre uses that information, the list of books is useless for finding anything at the moment.

What I would need is a piece of software that can go through all my PDFs and update the meta-data using information from the filenames and directories.

Is there anything out there that can do that?

=X=
01-13-2009, 02:13 AM
Well, yes and no. There is a tool here called soPDF. It optimizes a PDF for the Sony Reader by cropping the margins and/or rotating the PDF, while keeping the output as a PDF.

I use the same convention as you and actually parse out the name/title, then pass those values into soPDF (so they show up correctly in calibre and on the Sony PRS505).

Let me know if you'd like the Perl script.

Also, the soPDF tool only works on Windows and does not modify secured PDFs.
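
For anyone who wants to try the parsing step without the Perl script, here is a rough Python sketch of the idea. It assumes the "Last, First/First Last - Title.pdf" layout from the first post; the function name parse_book_path is made up for this example and is not part of soPDF or calibre.

```python
from pathlib import PurePath


def parse_book_path(path):
    """Guess (author, title) from 'Last, First/First Last - Title.pdf'."""
    p = PurePath(path)
    stem = p.stem  # file name without the .pdf extension
    author, sep, title = stem.partition(" - ")
    if not sep:
        # No "Author - Title" separator in the file name:
        # treat the whole name as the title and use the folder for the author.
        title = stem
        last, comma, first = p.parent.name.partition(",")
        if comma:
            author = f"{first.strip()} {last.strip()}"
        else:
            author = p.parent.name
    return author.strip(), title.strip()


print(parse_book_path("Christie, Agatha/Agatha Christie - The Secret Adversary.pdf"))
```

File names that contain " - " inside the title itself would still split at the first occurrence, so a real script would probably want a sanity check against the folder name.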

=X=

Pulp
01-13-2009, 07:40 AM
I already made a Python script that generates meta-files for pdftk and runs it on every file, but the problem is that pdftk does not seem to support special characters (such as umlauts), so it's useless for German books.

If there is a simple command-line tool that can update the meta-information, I can do the "batch work" myself now.

PS: I like soPDF, though for other purposes. I'll give it a try for preparing PDFs to read directly on my Cybook, thanks!
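
A minimal sketch of what such a script could look like (not the actual script from this post). It assumes pdftk is on the PATH and uses its update_info operation; build_info and update_pdf are hypothetical helper names. The "InfoBegin" line matches what newer pdftk releases print with dump_data, and older releases appear to omit it, hence the switch.

```python
import subprocess
import tempfile
from pathlib import Path


def build_info(author, title, with_begin=True):
    """Build the text of a pdftk info file for the Title and Author fields."""
    lines = []
    for key, value in (("Title", title), ("Author", author)):
        if with_begin:
            lines.append("InfoBegin")  # record delimiter (assumed: newer pdftk only)
        lines.append(f"InfoKey: {key}")
        lines.append(f"InfoValue: {value}")
    return "\n".join(lines) + "\n"


def update_pdf(src, dst, author, title):
    """Write a temporary info file and let pdftk apply it to src, saving to dst."""
    with tempfile.TemporaryDirectory() as tmp:
        info_file = Path(tmp) / "info.txt"
        # Plain ASCII only in this sketch; non-ASCII text raises an error here.
        info_file.write_text(build_info(author, title), encoding="ascii")
        subprocess.run(
            ["pdftk", str(src), "update_info", str(info_file), "output", str(dst)],
            check=True,  # raise if pdftk reports a failure
        )


print(build_info("Agatha Christie", "The Secret Adversary"))
```

pdftk writes a new file rather than editing in place, so dst has to differ from src; renaming afterwards is left to the caller.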

Update: found a way to include the special characters, now I only have to finish my script :)
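
The post does not say which way was found. One approach that fits the pdftk documentation, which states that non-ASCII characters in update_info data must be written as XML numerical entities, is sketched below. The function name to_numeric_entities is made up for this example.

```python
def to_numeric_entities(text):
    """Replace every non-ASCII character with an XML numeric entity."""
    # "xmlcharrefreplace" turns e.g. U+00FC into "&#252;" and leaves ASCII alone.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_numeric_entities("für"))  # -> f&#252;r
```

The converted string can then go on the InfoValue line of the meta-file in place of the raw title or author.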