Old 02-26-2017, 03:38 PM   #1
ZahraB
Junior Member
Posts: 1
Karma: 10
Join Date: Feb 2017
Device: none
Search the contents of PDF files on my PC like Google Books

Hello dear friends!

I'm new here and very pleased to be among you.

I hope that you'll help me with my problem.

I have numerous PDF files stored in my PC and want to search through them just as you do in Google Books.
I have tried a few programs, but none of them worked well for me.

Please, help me with your suggestions!
Can Calibre solve my problem? If yes, how?
If no, then please advise me!

Thank you all in advance.
I hope the best of all for you.
Old 02-27-2017, 05:58 AM   #2
BetterRed
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by ZahraB
I have numerous PDF files stored in my PC and want to search through them just as you do in Google Books.
Can Calibre solve my problem? If yes, how?
If no, then please advise me!
@ZahraB - a lot of things referred to as search tools operate on the file system data (file name, dates etc) or the file metadata (title, author, publisher, artist, etc). I'm assuming you want something that searches the actual text inside the PDF.

Whilst calibre has some content search facilities here and there, IMO they are not very useful: some are too slow, others too restrictive.

If you're running Windows you can use Windows Search from Windows/File Explorer. You may need to configure and start the Indexing service from Control Panel (once only).

I use Windows Search on a library of 100,000+ books (50% PDFs). Search time is typically 2-5 seconds.

Assuming the PDFs are in a calibre library, there's a calibre plugin - Drop Search Results (DSR) - which can be used to integrate most Windows search tools with calibre. It's disarmingly simple, both in concept and in use.

Windows search tools produce a list of files that meet the search criteria. If a list from a search on a calibre library is dragged and dropped into the DSR dropzone, the corresponding books are marked in the calibre library. From there it's very easy to put the marked books into a reading list, tag them, send them to your device, etc.
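To make the idea concrete, here is a minimal Python sketch of how a list of file paths could be mapped back to calibre books. It is not DSR's actual code (the thread does not show how the plugin works internally). It only relies on calibre's habit of storing each book in a folder named "Title (id)"; the function name `book_ids_from_paths` and the sample paths are made up for illustration.

```python
import re
from pathlib import PureWindowsPath

# calibre keeps each book in a folder named "Title (id)"; the trailing
# number in parentheses is the book's id in the library.
_BOOK_DIR = re.compile(r"\((\d+)\)$")


def book_ids_from_paths(paths):
    """Return the sorted, de-duplicated calibre book ids found in file paths.

    Hypothetical helper for illustration only, not part of calibre or DSR.
    """
    ids = set()
    for raw in paths:
        # Folder that directly contains the file, e.g. "Some Title (42)".
        folder = PureWindowsPath(raw).parent.name
        match = _BOOK_DIR.search(folder)
        if match:
            ids.add(int(match.group(1)))
    return sorted(ids)


# Made-up search results, as a Windows search tool might list them.
hits = [
    r"C:\Library\Jane Doe\Some Title (42)\Some Title - Jane Doe.pdf",
    r"C:\Library\John Roe\Other Book (7)\Other Book - John Roe.pdf",
    r"C:\Users\me\Downloads\stray.pdf",  # not inside a calibre book folder
]
print(book_ids_from_paths(hits))
```

Paths that do not sit in a "Title (id)" folder are simply skipped, so a result list that mixes library files with other files still yields only library books.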

If you're running OS X you can use its Spotlight content search facility for PDFs. I'm not aware of any integration of Spotlight with calibre along the lines of DSR.
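For anyone who prefers a script to the Spotlight window, the same index can be queried from the command line with `mdfind`. The Python sketch below only builds and runs such a command. The helper names and the folder path are placeholders, and the query assumes that Spotlight has indexed the text of the PDFs in that folder.

```python
import subprocess


def build_mdfind_command(phrase, folder):
    """Build an mdfind command that looks for `phrase` inside PDFs under `folder`.

    Hypothetical helper; the query uses Spotlight's metadata query syntax.
    """
    # Escape characters that would end the quoted string inside the query.
    escaped = phrase.replace("\\", "\\\\").replace('"', '\\"')
    query = (
        'kMDItemContentType == "com.adobe.pdf" && '
        f'kMDItemTextContent == "*{escaped}*"cd'  # c = ignore case, d = ignore diacritics
    )
    return ["mdfind", "-onlyin", folder, query]


def search_pdfs(phrase, folder):
    """Run the search (macOS only) and return the matching file paths."""
    result = subprocess.run(
        build_mdfind_command(phrase, folder),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


# Placeholder folder; printing the command works on any platform.
command = build_mdfind_command("blackletter", "/Users/me/Calibre Library")
print(command)
```

Calling `search_pdfs` only makes sense on a Mac. The result is a plain list of file paths, the same kind of list a Windows search tool produces.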

If you're running a Linux variant, the distro probably includes some content search utilities that can handle PDF. There's a calibre plugin for the Recoll content search tool, but I'm not sure of its status, i.e. whether it still works. I have a suspicion it stopped working a while ago.

BR

Last edited by BetterRed; 02-27-2017 at 06:28 PM.
Old 02-27-2017, 10:04 AM   #3
dwig
Wizard
Posts: 1,613
Karma: 6718541
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
It should be noted that as good as they are, none of these search tools incorporate their own OCR (Optical Character Recognition) functions. When searching in formats like PDF, they can't "read" words that exist only as pictures, which is the most common content of PDFs made by scanning printed materials. Nor can they "read" words that are actually vector art (e.g. "text" in files saved from Illustrator et al. after the text has been converted to outlines/paths).
Old 02-27-2017, 05:34 PM   #4
BetterRed
Quote:
Originally Posted by dwig
It should be noted that as good as they are, none of these search tools incorporate their own OCR (Optical Character Recognition) functions.
Not necessarily true - Windows Search is extensible with 3rd party IFilters.

For example: ABBYY OCR IFilter, and that's not the only OCR-enabled PDF IFilter. I'm unsure whether it's packaged with FineReader; I can't see why not, apart from marketing.

Recent MS PDF IFilters may have it too, given that OneNote does a rather good job of doing OCR on images.

OCR-enabled IFilters will use a lot of processor time when they index an image PDF, but that only happens once per PDF. And Windows Search indexing is normally set so that it only runs when nothing else wants the CPU - you have to go out of your way to make it do otherwise.
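The "pay once at index time, search cheaply afterwards" trade-off can be illustrated with a toy inverted index in Python. This is not how Windows Search is implemented internally. It assumes the text of each PDF has already been extracted (by an IFilter, OCR, or anything else), and the file names and contents are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of documents containing it.

    `documents` is {file name: already-extracted text}. This is the slow
    step, and it runs once per document.
    """
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents that contain every word in `query`."""
    words = tokenize(query)
    if not words:
        return []
    # Each lookup is a dictionary access; no document is re-read.
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Invented sample data standing in for extracted PDF text.
docs = {
    "a.pdf": "Blackletter type on vellum",
    "b.pdf": "Modern type on paper",
    "c.pdf": "",  # e.g. a scanned PDF with no text layer and no OCR
}
index = build_index(docs)
print(search(index, "type on"))
print(search(index, "vellum"))
```

A document whose extraction produced no text, like `c.pdf` here, never appears in any result, which is exactly the situation an OCR-enabled IFilter is meant to fix.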

OCR-enabled IFilters will inherit the well-known problems of OCR in general, but not all image PDFs originate from scans of 16th century Blackletter on vellum.

And I wouldn't be at all surprised if Spotlight had OCR-enabled PDF search.

BR

Last edited by BetterRed; 02-27-2017 at 06:27 PM. Reason: add last 2 paras