Old 08-21-2013, 04:58 AM   #1
cptnemo
Enthusiast
cptnemo began at the beginning.
 
Posts: 35
Karma: 10
Join Date: Oct 2011
Device: Kindle 3
PDF output is searchable with Adobe Reader but not with Mac Preview

Hello,

I converted an .epub into a .pdf. When I search the text in Mac Preview (or in Skim), I get no matches. Also, when I try to copy text from the .pdf, I get a blank string, although I can highlight the text normally.

All the above problems disappear when I use Adobe Reader. I can search and copy.

Should I select different options for the conversion?
Old 08-21-2013, 07:31 AM   #2
kovidgoyal
creator of calibre
Posts: 44,539
Karma: 24495948
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
That's because Mac Preview has no support for Unicode CMaps, a feature that was introduced to PDF over six years ago.
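To illustrate what a Unicode CMap does, here is a minimal Python sketch. It assumes a hand-written fragment in the style of a PDF ToUnicode CMap (`SAMPLE_CMAP` is made up for illustration and not taken from a calibre file), and the helper names `parse_bfchar` and `glyphs_to_text` are hypothetical. It only handles simple `bfchar` pairs, not `bfrange` entries. The idea is that the page text is stored as glyph codes, and a viewer needs this table to turn them back into characters.

```python
import re

# Illustrative fragment in the style of a ToUnicode CMap (not from a real PDF).
# Each line maps a glyph code to a UTF-16BE encoded character.
SAMPLE_CMAP = """
2 beginbfchar
<0001> <0048>
<0002> <0069>
endbfchar
"""


def parse_bfchar(cmap_text):
    """Return {glyph_code: unicode_string} for simple bfchar pairs."""
    mapping = {}
    pairs = re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", cmap_text)
    for src, dst in pairs:
        mapping[int(src, 16)] = bytes.fromhex(dst).decode("utf-16-be")
    return mapping


def glyphs_to_text(glyph_codes, mapping):
    """Translate glyph codes to text; unmapped codes produce nothing."""
    return "".join(mapping.get(code, "") for code in glyph_codes)


if __name__ == "__main__":
    table = parse_bfchar(SAMPLE_CMAP)
    print(repr(glyphs_to_text([1, 2], table)))  # viewer that uses the table
    print(repr(glyphs_to_text([1, 2], {})))     # viewer that ignores it
```

A viewer that ignores the table has nothing to map the glyph codes to, which roughly mirrors the blank strings seen when copying text in Preview.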
Old 08-21-2013, 07:22 PM   #3
cptnemo
Quote:
Originally Posted by kovidgoyal View Post
That's because Mac Preview has no support for Unicode CMaps, a feature that was introduced to PDF over six years ago.
Is there an easy way to change the character encoding of the PDF output in Calibre? (Or does that have nothing to do with the problem? I saw there are options for the input encoding, but I can't find any for the output...)
Old 08-21-2013, 08:47 PM   #4
kovidgoyal
No.
Old 08-21-2013, 10:40 PM   #5
cptnemo
Thanks.

I found a workaround for my character encoding problem, in case anyone is interested.

If you want to copy and paste from PDFs using Preview.app or Skim.app, you can't use calibre to generate the PDF. I don't grasp all the technicalities but, in short, the problem has to do with the encoding: calibre produces PDFs whose text uses the "Identity-H" encoding, while Preview.app and Skim.app need "Ansi" encoding. Instead, you can:

1) Convert with calibre to HTMLZ.
2) Replace the .htmlz extension of the file with .zip
3) Unzip the file
4) Create the PDF with Adobe Acrobat Pro using "Create from webpage"

(I also tried Word and LibreOffice, but they don't keep the internal links.)
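Steps 2 and 3 above can also be scripted. The following is a minimal Python sketch, assuming the HTMLZ file is an ordinary ZIP archive (which is what the rename-and-unzip steps rely on). The function name `extract_htmlz` and the file names in the usage example are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_htmlz(htmlz_path, out_dir):
    """Unpack an HTMLZ file (treated as a ZIP archive) into out_dir.

    Returns the sorted list of member names that were extracted.
    No renaming to .zip is needed; zipfile does not look at the extension.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(htmlz_path) as archive:
        archive.extractall(out)
        return sorted(archive.namelist())


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    for name in extract_htmlz("book.htmlz", "book_html"):
        print(name)
```

The extracted folder can then be opened in Acrobat Pro as in step 4.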
Old 08-26-2014, 04:48 AM   #6
t04sty
Junior Member
t04sty began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Aug 2014
Device: Android
Quote:
Originally Posted by kovidgoyal View Post
No.
So close, but this is a dealbreaker. A PDF that isn't searchable in MacOS Preview and lacks the ability to copy content is useless to me. The Calibre PDF output looked very nice once I tweaked the epub in Sigil, though.

Thanks for your efforts in developing Calibre, but I guess I will have to look for a different software package that can do what I need.

(The HTMLZ workaround breaks the ToC if I open it in Word, so it's back to square 1)
Old 08-26-2014, 06:52 AM   #7
t04sty
Quote:
Originally Posted by cptnemo View Post
Thanks.
I found I could use a workaround for my character encoding problem. Maybe someone is interested. So see below.
If you are looking for a fast solution and don't mind sacrificing the internal links, simply open the Calibre-produced PDF in Mac OS Preview, go to the Print dialog, and from there "Save as PDF". This will produce a PDF that is searchable, with copyable content, but which lacks the internal links.
Old 01-01-2015, 01:36 AM   #8
Todd Fincannon
Junior Member
Todd Fincannon began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Jan 2015
Device: Kindle
On OS X Yosemite, I was able to work around the problem and still retain the TOC and internal links. Do File > Duplicate, then File > Save. This may work on earlier versions of OS X, but I haven't tried it.
Old 02-18-2015, 05:14 AM   #9
JHavermans
Junior Member
JHavermans began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Feb 2015
Device: iPad
What fonts did you use?

Quote:
Originally Posted by Todd Fincannon View Post
On OS X Yosemite, I was able to work around the problem and still retain the TOC and internal links. Do File > Duplicate, then File > Save. This may work on earlier versions of OS X, but I haven't tried it.
Hi Todd,
I have the same issue with the CMaps. I tried File > Duplicate, but it doesn't work for me. Can you tell me which fonts you selected in Calibre under Convert books, in the PDF output section of the left column?

Thank you
Johan
Old 02-18-2015, 04:33 PM   #10
dwig
Wizard
Posts: 1,613
Karma: 6718541
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
Quote:
Originally Posted by t04sty View Post
So close, but this is a dealbreaker. A PDF that isn't searchable in MacOS Preview and lacks the ability to copy content is useless to me. ...
Then a lot of PDFs produced by a lot of apps will also be problems. MacOS's Preview supports only a modest subset of PDF attributes and functionality. It is particularly limited in its font support.

IMHO, Preview is too poor a viewer to use it for PDF on a regular basis. The only time I view a PDF in Preview is to see if clients using a Mac with only Preview as their viewer will have difficulty. When I view a PDF for myself on a Mac I use a good viewer, usually Adobe Reader.
Old 05-13-2015, 12:54 PM   #11
JustinCarone
Junior Member
JustinCarone began at the beginning.
 
Posts: 3
Karma: 10
Join Date: May 2015
Device: Kindle Paperwhite/iPad 3
Todd's fix worked for me. Note that it increases the file size considerably, and it requires that you embed a font that Preview can use during the Calibre conversion. I don't have a list of compatible fonts, but Georgia works for me. If anyone finds such a list, please let everyone know.

There is probably a way to write an AppleScript that automatically replaces the Calibre output file with a Preview-searchable one, but I might have to leave that to someone else for now. I'll try it with Automator later, but that's about the limit of my ability. Is there a way for Calibre to do this automatically, without the external step through Preview? I know I'd appreciate it.
Old 05-13-2015, 01:09 PM   #12
JustinCarone
After the duplicate/save process is complete, you can open the PDF in Acrobat and save it as a reduced-size PDF. Searching in Preview still works, and the file size drops to near its original size after the Calibre conversion. Copy and paste also still works. An AppleScript for all of these steps would be great, or somehow making it part of Calibre.
Old 05-13-2015, 02:04 PM   #13
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
The fact that Mac OS X comes with a lousy built-in default PDF viewer should not be an excuse to use it!!!
Old 05-13-2015, 02:16 PM   #14
JustinCarone
Okay, figured it out. If you run a Quartz Filter on the PDF after a compatible font is embedded during the Calibre conversion process it will make the content of the PDFs searchable and copyable. I made an Automator script that takes any files and adds an image filter to them that is ostensibly set to reduce the size of images, but I just set it to "uncompressed" so that there is no impact that I can discern on the images. You can process a ton of files quickly this way.

As for eschwartz’s comment, this isn't just for Preview. This also makes PDFs searchable on my iPad using PDF Expert, in Papers 3 on Mac and iPad, and entirely indexable by the above applications as well as Alfred and Spotlight. For my purposes being able to call up all documents with specific keywords or names through a central interface is absurdly useful, and not possible if I don’t make it compatible with Preview and other applications like it which don’t have the full capabilities of some readers. Not to mention making it compatible with the broadest range of possible applications, not just the newest, leaves options open for other ways of using the file that I might not have thought of yet. It also has no negative impact on how I use the files, so I see only positives and no downsides for my personal use.
Old 11-28-2017, 09:37 AM   #15
Pkoetsie
Junior Member
Pkoetsie began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Nov 2017
Device: Kindle Paperwhite 2015
I had the same issue with a couple of PDFs that were created with Calibre, and managed to fix it by transforming them using Ghostscript.
I ran the following command to have Ghostscript recreate the PDFs:
gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="<New PDF file>" "<Original PDF file>"
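For more than a handful of files, the same Ghostscript call can be wrapped in a small script. This is a minimal Python sketch, assuming `gs` is installed and on the PATH. The flags are exactly the ones in the command above; the function names `build_gs_command` and `rewrite_pdf` and the file names in the usage example are hypothetical.

```python
import subprocess
from pathlib import Path


def build_gs_command(src, dst, gs_binary="gs"):
    """Build the Ghostscript argument list that rewrites src into dst."""
    return [
        gs_binary,
        "-dNOPAUSE",
        "-dBATCH",
        "-sDEVICE=pdfwrite",
        f"-sOutputFile={dst}",
        str(src),
    ]


def rewrite_pdf(src, dst, gs_binary="gs"):
    """Run Ghostscript on one file; raises CalledProcessError on failure."""
    subprocess.run(build_gs_command(src, dst, gs_binary), check=True)
    return Path(dst)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    rewrite_pdf("calibre-output.pdf", "calibre-output-fixed.pdf")
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.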