11-20-2009, 08:06 PM | #1 |
Junior Member
Posts: 7
Karma: 10
Join Date: Nov 2009
Device: none
|
Extract ISBN from PDF?
Hi,
Calibre looks great and I'd like to use it to organize my PDF ebooks. I started on this, but it seems that for each file I must mess around with the title, author, and ISBN in order to fetch the correct metadata. It's a slow process. If Calibre could fetch the ISBN from inside the PDF, as Alfa Ebooks Manager does, this could really be automatic. Is this possible? Thanks, M |
11-20-2009, 09:59 PM | #2 |
.
Posts: 3,408
Karma: 5647231
Join Date: Oct 2008
Device: never enough
|
that's funny...not your idea, which sounds intriguing, just the description...because I was just marveling the other day how, by "just" entering the title and author of a book, Calibre could go online, and find the ISBN, metadata, and book cover, all automatically!
edit: wait, do you have an account with isbndb.com? it really makes things pretty automatic, for me at least. |
11-20-2009, 10:36 PM | #3 |
Junior Member
Posts: 7
Karma: 10
Join Date: Nov 2009
Device: none
|
Hi,
Yes, I have an account with isbndb, and I don't see anything terribly "intriguing" about this feature, nor is it my idea. As I said, Alfa Ebooks Manager can fetch an ISBN from a PDF. FWIW, Zotero can do this, too, though there are other limitations with Zotero. To be honest, it doesn't sound like you really read my post before replying. Think about it this way: the ISBN is already in the ebook, it's a unique identifier for that book, it could be read from the file and used to fetch all the metadata automatically, so why should you have to enter any information at all? I have a number of ebooks, and for me the lack of this feature is kind of a deal breaker. It's hard to justify spending hours twiddling metadata. Calibre is pretty slow, too, but I would certainly live with that if it could fetch metadata automatically. Finally, I would prefer an open-source application like Calibre over the commercial Alfa Ebooks Manager (which is only for Windows, too), but I guess I'll just have to keep looking at other possibilities. M |
11-20-2009, 10:48 PM | #4 | |
.
Posts: 3,408
Karma: 5647231
Join Date: Oct 2008
Device: never enough
|
Quote:
I did read your post, I was just making the general statement that one person's "deal breaker" is another person's favorite feature...you were complaining about something in Calibre that I found to be a great timesaver, that's all. And I've never heard of that commercial program, and use a Mac, so I didn't realize it was such a common thing for programs to find ISBNs in PDFs. No reason to take such an insulting tone about it. |
|
11-20-2009, 11:33 PM | #5 |
creator of calibre
Posts: 44,391
Karma: 23798586
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
A quick search in the calibre ticket system is all you needed to do:
http://calibre.kovidgoyal.net/ticket/3013 |
11-30-2009, 06:48 PM | #6 |
Junior Member
Posts: 7
Karma: 10
Join Date: Nov 2009
Device: none
|
Thank you for adding this feature in 0.6.25. I just downloaded this version but unfortunately, I can't figure out how to make it work.
I've looked through the "Edit Metadata" interface and tried removing and adding a PDF to the library again, but I don't see how to get Calibre to fetch the ISBN from the PDF content. What would really be ideal is a new menu item on the "Edit meta information" pop-up, called something like "Extract ISBN in bulk". When selected, this function would open each PDF, scan its content for the first ISBN, and use that to replace the one that appears in the "Edit Meta Information" window. Then, it would be possible to select "Download metadata and covers" which uses that ISBN, and the whole operation would only take two menu clicks. Thanks. |
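For illustration, the scan requested above (go through a PDF's text and take the first ISBN found) could be sketched roughly as follows. This is not calibre's implementation. It assumes the PDF text has already been extracted into a string, and the names `find_first_isbn`, `is_valid_isbn10` and `is_valid_isbn13` are hypothetical. The checksum test is there to reject lookalikes such as phone numbers.

```python
import re

# Candidate runs: an ISBN-13 (prefix 978/979) or an ISBN-10, with optional
# single spaces or hyphens between digits. Assumption: this loose pattern is
# good enough for typical copyright pages; it is not an official grammar.
_CANDIDATE = re.compile(
    r"(?<!\d)(?:97[89](?:[ -]?\d){10}|\d(?:[ -]?\d){8}[ -]?[\dXx])(?!\d)"
)


def is_valid_isbn10(code):
    """ISBN-10 check: weights 10..1, total divisible by 11 ('X' counts as 10)."""
    if not re.fullmatch(r"\d{9}[\dX]", code):
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(code))
    return total % 11 == 0


def is_valid_isbn13(code):
    """ISBN-13 check: alternating weights 1 and 3, total divisible by 10."""
    if not re.fullmatch(r"97[89]\d{10}", code):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3)
                for i, ch in enumerate(code))
    return total % 10 == 0


def find_first_isbn(text):
    """Return the first checksum-valid ISBN in text (separators removed), or None."""
    for match in _CANDIDATE.finditer(text):
        code = re.sub(r"[ -]", "", match.group()).upper()
        if is_valid_isbn13(code) or is_valid_isbn10(code):
            return code
    return None
```

Taking the first valid match is the simplest policy, but a PDF that lists several editions on its copyright page may put a different edition's ISBN first.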
11-30-2009, 10:51 PM | #7 |
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
|
I'm not knocking the idea per se, but I'm not sure this would be a viable implementation.
Different editions of a book are issued with different ISBNs and covers. Many times older ISBNs and covers are simply deleted in favor of the newer information. While this might not be critical, it could become another problem area, particularly if someone wants a particular cover, perhaps to match the physical book they own. |
11-30-2009, 10:55 PM | #8 |
creator of calibre
Posts: 44,391
Karma: 23798586
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
When you add a PDF to calibre, if the PDF contains the ISBN it will be read. The feature however is untested as I didn't have any PDFs lying around to test it with. So if it isn't working for your PDFs, re-open the bug and attach a couple of PDFs.
|
12-01-2009, 02:10 AM | #9 |
Junior Member
Posts: 7
Karma: 10
Join Date: Nov 2009
Device: none
|
I've created Ticket #4113 with a problem PDF attached.
Thanks. |
12-01-2009, 02:22 AM | #10 | |
Junior Member
Posts: 7
Karma: 10
Join Date: Nov 2009
Device: none
|
Quote:
Personally, I wouldn't feel too picky about covers if Calibre fetched the metadata automatically. It seems straightforward enough to change the cover using the existing "Edit Meta Information" page. |
|
12-15-2009, 01:35 AM | #11 |
Junior Member
Posts: 7
Karma: 10
Join Date: Nov 2009
Device: none
|
Update: In addition to the PDF attached to ticket #4113, I've tried more and I can't get this feature to work at all. All the relevant details are already in the ticket.
Thanks. |
12-13-2016, 05:24 AM | #12 |
Junior Member
Posts: 2
Karma: 10
Join Date: Dec 2016
Device: none
|
Hello to everyone
Hello guys,
About finding ISBNs in PDF files: I wrote a Bash script to do this job for me as a batch process. The only argument needed is the path to the library folder, or to a single book folder, and that's it! You may improve it as you like. I recommend trying it first on a single book folder before running it recursively from the top of the library. You will need to install pdfgrep and the basic Linux console tools. Best regards... PS: the comments and some messages are in Spanish; you may change them as you want. |
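The script itself is not reproduced in the thread, so the following is only a rough sketch of the same batch idea, written in Python rather than Bash. It swaps pdfgrep for poppler's `pdftotext` (an assumption: it must be installed and on the PATH), looks only for numbers that follow an "ISBN" label, and does no checksum validation. The names `pdf_text` and `scan_library` are hypothetical.

```python
import os
import re
import subprocess

# A labelled ISBN such as "ISBN: 978-0-306-40615-7" or "ISBN-10 0306406152".
# Assumption: only bare or hyphenated numbers are matched, not spaced ones.
ISBN_LABEL = re.compile(
    r"ISBN(?:-1[03])?:?\s*([0-9][0-9Xx-]{8,15}[0-9Xx])", re.IGNORECASE
)


def pdf_text(path, last_page=10):
    """Return the text of the first pages of a PDF using pdftotext.

    '-l N' stops after page N and '-' writes the text to stdout.
    Raises FileNotFoundError if pdftotext is not installed.
    """
    result = subprocess.run(
        ["pdftotext", "-l", str(last_page), path, "-"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def scan_library(root, extract_text=pdf_text):
    """Walk root recursively and map each PDF path to its ISBN, or None."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.lower().endswith(".pdf"):
                continue
            path = os.path.join(dirpath, name)
            match = ISBN_LABEL.search(extract_text(path))
            found[path] = match.group(1).replace("-", "") if match else None
    return found
```

As suggested above, it is safer to point `scan_library` at a single book folder first and check the result before running it on the whole library. The `extract_text` parameter exists so that a different extractor (an OCR step, for example) can be substituted.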
12-13-2016, 09:51 AM | #13 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
PDF ISSNs for Magazines and Periodicals
Magazines and Periodicals in PDF format use ISSNs (although occasionally they also have a very generic ISBN that is not issue specific).
To extract ISSN instead of ISBN, use the Library Codes plug-in. See the attached image. Obviously, if the PDF consists purely of scanned images rather than normal text, then nothing can be done. It is just a series of pictures that only OCR software can deal with. DaltonST |
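For readers curious what ISSN matching involves, here is a minimal sketch. It is not how the Library Codes plug-in works (its internals are not described in this thread); the function names are hypothetical, and requiring the hyphen in `NNNN-NNNC` is an assumption made to cut down on false positives.

```python
import re

# An ISSN is eight characters, conventionally written NNNN-NNNC, where C is
# a check character (a digit or 'X').
_ISSN = re.compile(r"(?<![0-9A-Za-z])(\d{4})-(\d{3}[\dXx])(?![0-9A-Za-z])")


def is_valid_issn(issn):
    """Check the ISSN check character: weights 8..2 on the first 7 digits, mod 11."""
    digits = issn.replace("-", "").upper()
    if not re.fullmatch(r"\d{7}[\dX]", digits):
        return False
    total = sum(int(ch) * weight
                for ch, weight in zip(digits[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    return digits[7] == ("X" if check == 10 else str(check))


def find_issns(text):
    """Return the distinct checksum-valid ISSNs in text, in order of appearance."""
    found = []
    for match in _ISSN.finditer(text):
        candidate = f"{match.group(1)}-{match.group(2).upper()}"
        if is_valid_issn(candidate) and candidate not in found:
            found.append(candidate)
    return found
```

Like any text search, this only works on PDFs that contain real text; scanned pages need OCR first.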
12-13-2016, 02:03 PM | #14 | |
null operator (he/him)
Posts: 20,947
Karma: 27620688
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
Quote:
Moderator Notice
@Jose_Manuel - the thread you posted to is a day short of SEVEN YEARS OLD!!! A calibre plugin, Extract ISBN, has been in existence for MORE THAN FIVE YEARS!!! It works on the Windows, OSX and Linux versions of calibre. It updates the Identifiers column of the library database. And it doesn't need any additional software, because it uses the PDF libraries shipped within calibre. BR Last edited by BetterRed; 12-13-2016 at 02:05 PM. |
|
12-16-2016, 07:32 AM | #15 | |
Junior Member
Posts: 2
Karma: 10
Join Date: Dec 2016
Device: none
|
Quote:
I know that a plugin to extract ISBNs exists, but I was frustrated with it many times because it doesn't work as I expected: it finds the ISBN in the easy cases, but in many others it doesn't. For this reason I wrote my own script two years ago, and now I have decided to share it with the community. If someone wants to use it, go ahead; if not, they are free to use the plugin. As for scanned books without OCR, it is possible to find the ISBN; I did it some days ago using an OCR post-processing command. I am thinking of adding that to my script, but if the community doesn't care about it, then I will just post the new version on my webpage and that's it. Bye. |
|