12-20-2010, 03:37 PM   #1
chilady1
I devour books!
Posts: 771
Karma: 1285226
Join Date: Mar 2009
Device: iPad Air, Kindle 3/Kobo Aura HD, iPhone 6
pdf to mobi problem

I posted this problem some time ago but never really got a good explanation, so I will try again. I have a 382-page PDF that opens in Adobe with no problem. There is only one graphic in the document, and that is the cover. When I import this document into Calibre and try to convert it to MOBI, the result has only 2 pages. It's as if the conversion process doesn't see the text in the file.

I am so frustrated and thought perhaps someone or someone(s) could help me with this. I have even tried some online conversion websites to see if perhaps I was doing something wrong in Calibre. When the websites produce the documents (I tried converting the PDF to RTF, HTML, LIT and EPUB on the off chance that one of them would produce the entire doc), the result still shows only 2 pages.

Has anyone else experienced this and can someone tell me what I might do to get the ENTIRE PDF file to convert?
12-20-2010, 03:39 PM   #2
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
It's most likely that every page is actually a graphic image, with no underlying text layer. You would need to either use the OCR in Acrobat Pro to create the underlying text, or extract all the pages to images and use a full-fledged OCR program like ABBYY.
12-20-2010, 04:38 PM   #3
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by chilady1
I posted this problem some time ago but never really got a good explanation, so I will try again.
You got the same answer then that you got here.

http://www.mobileread.com/forums/showthread.php?t=99400

You were told it was probably an image of a page of text. You came back and said it was text, and you were asked if you were sure: would you 1) check the file size (image-based PDFs are bigger) and 2) try to select individual text with the selection tool? You didn't respond.

Quote:
I have a 382-page PDF that opens in Adobe with no problem. There is only one graphic in the document, and that is the cover. When I import this document into Calibre and try to convert it to MOBI, the result has only 2 pages. It's as if the conversion process doesn't see the text in the file.

I am so frustrated and thought perhaps someone or someone(s) could help me with this. I have even tried some online conversion websites to see if perhaps I was doing something wrong in Calibre. When the websites produce the documents (I tried converting the PDF to RTF, HTML, LIT and EPUB on the off chance that one of them would produce the entire doc), the result still shows only 2 pages.

Has anyone else experienced this and can someone tell me what I might do to get the ENTIRE PDF file to convert?
So you are asked again, is it an image of text, or is it really text? How big is the file? Can you select the text with the selection tool, or does the whole page select? People are willing to help, and this is a common problem, but you have to help them solve it, not just post that you're frustrated.

Last edited by Starson17; 12-20-2010 at 04:40 PM.
12-20-2010, 07:24 PM   #4
chilady1
Well, thanks, Starson17, for making me feel like an idiot. I didn't realize this wasn't a forum to express frustration. And clearly I asked the same question because I didn't understand the answer the first time. However, I will say that ldolse's answer shed some light on the fact that perhaps, as was clearly pointed out before, this is a PDF OCR file.

I come to this board often because I find that most of the people here are well versed in many different types of applications and methods for doing things related to ebooks. I have learned a great deal from this particular board, which I find more technical than other boards. Most people have a way of explaining complex issues simply to laypersons like myself.

My apologies for not being technically savvy. I will refrain from annoying people with the same old questions.

Appreciate your answer.

Last edited by chilady1; 12-20-2010 at 07:29 PM.
12-21-2010, 12:41 AM   #5
DoctorOhh
US Navy, Retired
Posts: 8,885
Karma: 12755553
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Quote:
Originally Posted by chilady1
Quote:
Originally Posted by Starson17
Quote:
Originally Posted by chilady1
I posted this problem some time ago but never really got a good explanation, so I will try again.
You got the same answer then that you got here.

http://www.mobileread.com/forums/showthread.php?t=99400

You were told it was probably an image of a page of text. You came back and said it was text, and you were asked if you were sure: would you 1) check the file size (image-based PDFs are bigger) and 2) try to select individual text with the selection tool? You didn't respond.
Well, thanks, Starson17, for making me feel like an idiot. I didn't realize this wasn't a forum to express frustration. And clearly I asked the same question because I didn't understand the answer the first time.
Starson17's point was that you did not get a definitive answer the "first time" because, when asked for more information, you apparently decided that providing it wasn't worth your time.

Folks here tried to help you and you snubbed them.

Quote:
Originally Posted by chilady1
I come to this board often because I find that most of the people here are well versed in many different types of applications and methods for doing things related to ebooks. I have learned a great deal from this particular board, which I find more technical than other boards. Most people have a way of explaining complex issues simply to laypersons like myself.
That is why I come to this board too. But when I am frustrated and ask for help, I value the time folks put into working with me and solving a problem. I try not to waste other folks' time, and if they ask for information to help resolve an issue I brought up, I am grateful and provide that information. I don't snub people who are assisting me by not responding to their questions.

Quote:
Originally Posted by chilady1
My apologies for not being technically savvy. I will refrain from annoying people with the same old questions.
No one here has offered any offense to you, so I'm unsure why you are responding as if they did. The opening line of your post in this thread, though, was offensive to the folks in the previous thread who were trying to help you.

Quote:
Originally Posted by chilady1
I posted this problem some time ago but never really got a good explanation, so I will try again.
You implied that somehow the folks that tried to help you the first time you asked this question didn't give you the information you needed, when in fact you stopped participating in the attempt to provide you with a complete answer. The reason you never got a "good explanation" was because you left the attempt to provide you with a "good explanation."

I agree that ldolse's insight is most likely correct, but if you want to know for sure and understand how to tell in the future, you might want to consider the questions that Starson17 asked you.

Last edited by DoctorOhh; 12-21-2010 at 12:44 AM.
12-21-2010, 12:08 PM   #6
chilady1
Thank you all for the help!
12-21-2010, 12:17 PM   #7
Starson17
Quote:
Originally Posted by dwanthny
I agree that ldolse's insight is most likely correct
So do I, but it would be nice to know. I meant no offense, but I felt it was unfair to say you didn't get "a good explanation" when you didn't answer the questions that would have confirmed the explanation you were given (at this point by 5 different people).

People who provide help here don't ask for much - a simple thank you, and/or confirmation that the problem was solved is fine. Your case looked like the problem ldolse diagnosed, but none of us can be sure.

It looked like the "scanned images of text in a pdf" problem when I answered you (20 minutes after you asked for help in the previous thread). You came back and said I was wrong: you had only text, not images of text. I suspected that you simply didn't understand what I wrote, but I didn't have to post my suspicion, since itimpi and Perkin had already done so. It looked like the same problem to them and they wanted you to double-check your answer to me.

Since you never answered their questions, we didn't know if Calibre had some kind of problem, or if our initial diagnosis was correct.

Be at peace.
12-21-2010, 05:22 PM   #8
chilady1
Thank you for everyone's help
12-21-2010, 05:45 PM   #9
Starson17
Quote:
Originally Posted by chilady1
Thank you for everyone's help
So what is the answer? Do you have a normal PDF problem caused by having scanned images of pages with text on them, or do you have something unusual going on where your documents have true text, but won't convert because of a bug or bad character in the text?
12-21-2010, 06:19 PM   #10
Perkin
Guru
Posts: 645
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD
@chilady1, how large in MB is the PDF, how many pages are in it, and how many pictures (covers etc.)?
12-21-2010, 07:14 PM   #11
chilady1
Quote:
Originally Posted by Starson17
So what is the answer? Do you have a normal PDF problem caused by having scanned images of pages with text on them, or do you have something unusual going on where your documents have true text, but won't convert because of a bug or bad character in the text?
It is scanned images of pages with text which I believe is OCR and for whatever reason - won't convert using Calibre.
12-22-2010, 05:26 AM   #12
Sunlite
Zealot
Posts: 125
Karma: 165452
Join Date: Mar 2008
Location: Berlin, Germany
Device: Kobo Aura, PRS-T1, PB602, CyBook Gen3
OCR (Optical Character Recognition) is a method for turning text on scanned images into actual text.

The OCR software tries to match the shape of each letter seen in the image to an actual letter. Depending on the quality of the scan and the font used in the original book, this can work well or quite horribly. For example, the letters "h" and "b" are often mixed up, as are some other letter combinations.

The process of character recognition is rather complicated. That is why good OCR software is often very pricey and why Calibre does not provide it.

As far as I understand the PDF conversion in Calibre, it first tries to decide whether the PDF is text-based or image-based. If it encounters an image-based PDF, it creates an output of the images. If it encounters a text-based PDF, it tries its best to convert the text into a good text-based output. In the process, any images in the text-based PDF get lost.

In your case, I think you have a mainly image-based PDF that contains some text, probably at the beginning. Calibre encounters the text in the PDF, decides that the PDF is text-based, and produces an output of the available text. It can neither know that the images are the actual important content, nor could it convert them into text if it did.

I hope this explanation is understandable, but if you or someone else has further questions, I or someone else on this board will try to answer them. We just need to know what those questions are.
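
For anyone who wants to check a PDF like this themselves, the diagnosis above can be sketched in a few lines of Python. This is only an illustration, not Calibre's actual conversion logic: it assumes you have already pulled the text of each page out with some PDF text extractor, and the function name `classify_pages` and the `min_chars` cutoff are made up for the example.

```python
from typing import List, Tuple


def classify_pages(page_texts: List[str], min_chars: int = 20) -> Tuple[List[int], List[int]]:
    """Split 1-based page numbers into (pages with text, image-only pages).

    page_texts holds whatever text a PDF text extractor returned for each
    page, in page order. min_chars is an arbitrary, assumed cutoff: a page
    whose extracted text is shorter than that (ignoring whitespace) is
    treated as a scanned image with no usable text.
    """
    text_pages, image_pages = [], []
    for number, text in enumerate(page_texts, start=1):
        visible = "".join(text.split())  # drop all whitespace before counting
        if len(visible) >= min_chars:
            text_pages.append(number)
        else:
            image_pages.append(number)
    return text_pages, image_pages


# Hypothetical stand-in for the PDF in this thread: two leading pages of
# real text followed by 380 scanned pages that yield no text at all.
pages = ["Title page with a real text layer", "Copyright notice, also real text"] + [""] * 380
text_pages, image_pages = classify_pages(pages)
print(len(text_pages), "pages with text,", len(image_pages), "image-only pages")
```

If nearly every page lands in the image-only list, as in this made-up example, the file is almost certainly a scan, and it needs OCR before any converter can produce more than the few pages that carry real text.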
12-22-2010, 10:21 AM   #13
Starson17
Quote:
Originally Posted by chilady1
It is scanned images of pages with text which I believe is OCR and for whatever reason - won't convert using Calibre.
It's not OCR. Your problem is that the scanned images of pages of text have not been processed with an OCR program to produce text. Calibre isn't an OCR program and can't convert pictures of text into text (by now I'm pretty sure you understand this). Your only options are to use an OCR program for the conversion or to keep the images and read those. The former can be done in Adobe Acrobat or ABBYY, and the latter can be done by keeping and reading the original PDF, or by removing any leading text so the document is pure images and converting it the way a comic is converted.
12-22-2010, 10:56 AM   #14
chilady1
Understood, this makes sense and I appreciate everyone's great info on the differences. Won't need to ask this question anymore. Thanks all!
All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.