Non-English / Unicode PDF metadata?


CoolDragon
03-03-2010, 04:28 PM
How do I add/edit and save non-English/Unicode PDF metadata? I tried some PDF metadata editor but none worked.

joblack
03-03-2010, 04:36 PM
I think PDF metadata is stored as UTF-16 rather than UTF-8, so you might try that.
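
For reference, a PDF text string (which is what the Title and Author entries are) may be stored either in PDFDocEncoding or as UTF-16BE with a leading FE FF byte-order mark. A minimal shell sketch of what those bytes look like, assuming iconv and od are installed and the argument is UTF-8; pdf_utf16_hex is just an illustrative name, not part of any PDF tool:

```shell
#!/bin/sh
# pdf_utf16_hex is a hypothetical helper, not part of any PDF tool.
# It prints, as hex, the bytes a PDF would store for a UTF-16BE text string:
# the byte-order mark FE FF followed by the UTF-16BE code units.
pdf_utf16_hex() {
    {
        printf '\376\377'                               # BOM: FE FF
        printf '%s' "$1" | iconv -f UTF-8 -t UTF-16BE   # assumes UTF-8 input
    } | od -An -v -tx1 | tr -d ' \n'
    echo
}

pdf_utf16_hex "ἀἐἠ"   # prints feff1f001f101f20
```

An editor that writes the raw UTF-8 bytes instead will usually show up as garbage in PDF viewers, which may be what the failing editors were doing.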

CoolDragon
03-03-2010, 04:52 PM
How do I specify the encoding of the metadata to be UTF16?

joblack
03-03-2010, 05:46 PM
Depends on the editor - some of them may not have the option.

CoolDragon
03-03-2010, 07:13 PM
Can anyone recommend a metadata editor that lets you specify the encoding?

frabjous
03-03-2010, 08:12 PM
I just tested with calibre (http://calibre-ebook.com/)'s ebook-meta command line program, and it handled Unicode characters fine.

E.g., I put in:

ebook-meta Temp.pdf --title="ἀἐἠ"

And sure enough, the title became ἀἐἠ.

Of course, you may have a problem if your shell/terminal/command line prompt doesn't support Unicode input...

I imagine you could do it through calibre's GUI too, though for some reason, calibre's GUI confuses me more than using the command line does (and doesn't give me control over the file name and where it goes, which is frustrating)... so I prefer to use the command line tools.

Your mileage may vary.

joblack
03-03-2010, 08:47 PM
Just remembered - perhaps your fonts just don't support these characters.

CoolDragon
03-16-2010, 07:52 PM
OK, out of several PDF metadata editors, only Calibre worked for this purpose. The reason is that none of the other programs write Unicode metadata into the file. But I don't want to install Calibre just for this, so I wrote my own Perl script which uses pdftk.

I will just post it here in case someone needs it:

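#!/usr/bin/perl
# pdfmeta: set a PDF's Title (and optionally Author) via pdftk update_info.

use strict;
use warnings;
use Encode qw(decode);

# Encoding of the command-line arguments (gb2312 here); change it to match
# your terminal, e.g. 'UTF-8'.
my $ARG_ENCODING = 'gb2312';

# pdftk's update_info reads non-ASCII characters as XML numeric entities,
# so write every character of the decoded string as &#NNN;.
sub to_entities {
    my ($text) = @_;
    return join '', map { '&#' . ord($_) . ';' } split //, decode($ARG_ENCODING, $text);
}

if ( @ARGV < 2 ) {
    print "Usage: pdfmeta <in.pdf> <title> [author]\n";
    exit 1;
}

my ($pdf, $title, $author) = @ARGV;

open(my $info, '>', 'info.tmp') or die "Cannot open ./info.tmp: $!\n";
print $info "InfoKey: Title\n";
print $info "InfoValue: ", to_entities($title), "\n";
# The author is optional, so only write it when it was given.
if ( defined $author ) {
    print $info "InfoKey: Author\n";
    print $info "InfoValue: ", to_entities($author), "\n";
}
close($info);

# List-form system() bypasses the shell, so file names with spaces work.
my $status = system('pdftk', $pdf, 'update_info', 'info.tmp', 'output', "$pdf.update.pdf");
unlink 'info.tmp';
die "pdftk failed\n" if $status != 0;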
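
To see what ends up in info.tmp, here is a rough shell sketch of the same character-to-entity step. It assumes iconv, od and fold are available and that the argument is UTF-8 (not gb2312 as in the script); to_entities is an illustrative name here, not a pdftk feature:

```shell
#!/bin/sh
# to_entities is a hypothetical helper: it prints each character of a UTF-8
# string as an XML numeric entity (&#NNN;), the form the script writes
# after "InfoValue: " in the info file.
to_entities() {
    printf '%s' "$1" |
        iconv -f UTF-8 -t UTF-32BE |   # 4 bytes per code point
        od -An -v -tx1 |               # hex dump, one byte per field
        tr -d ' \n' |                  # join into one long hex string
        fold -w 8 |                    # 8 hex digits = one code point per line
        while read -r hex || [ -n "$hex" ]; do
            printf '&#%d;' "0x$hex"    # hex code point -> decimal entity
        done
    echo
}

to_entities "ἀἐἠ"   # prints &#7936;&#7952;&#7968;
```

The printed string is what should follow "InfoValue: " in the info file, so it can also be pasted in by hand if you only need to fix one title.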