Old 10-16-2015, 02:35 PM   #1
ger0g3n
Junior Member
 
Posts: 2
Karma: 10
Join Date: Oct 2015
Device: Kindle 5 All New Touch 7
Scanned PDF + steps I've made so far. Need help.

So first of all, hello everyone, and I apologise in advance for any mistakes I make in this post; I'm from Poland and English is not my native language.

Secondly, I've got a problem, hence the post.


So I'm a newbie to the ebook reader world and I decided to order a Kindle 5 All New Touch 7th Generation ebook reader. Decent price, and I thought it was a good choice.

I've seen people having the kind of problems I ran into with my PDFs, but these were old posts and none of them actually solved my issue.

So I study at a Technical University and most ebooks I've got on my PC are crappy scanned pages put into a PDF. I've tried to work with them, downloaded a billion programs like Calibre (ofc), Wondershare PDFelement and ABBYY PDF Transformer+, and tried to make a readable copy of one of the scanned Physics books for my Kindle. Steps I've taken so far:

1. I ran OCR on my PDF in Wondershare and it looks like this (not bad, I think, even though I know it still contains images):
http://s1130.photobucket.com/user/ge...a/ex1.jpg.html

2. Then I read that I should convert my OCR'd PDF to one of the formats that Calibre can read and convert to AZW3/MOBI, which my Kindle will display nicely. So I exported my OCR'd PDF from Wondershare first to EPUB, then to DOCX, then to HTML, and all the results were the same. In Calibre it looked like this every single time, no matter what the format was:
http://s1130.photobucket.com/user/ge...a/ex2.jpg.html

3. I tried the same thing with ABBYY and the result wasn't much better: letters in random places and huge blank spaces (sometimes a whole page was white and the next one had a single word, etc.).


So my question is:

Is there any way I can convert my PDF to azw3/mobi for my Kindle 5 so that it behaves like all other 'normal' ebooks?



Thanks in advance!



ger0g3n
Old 10-16-2015, 02:50 PM   #2
theducks
Well trained by Cats
Posts: 31,046
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
PDF is a terrible source (there is a sticky at the top of this forum).

Complex pages are NOT created in a linear fashion as HTML is. Thus the random appearance of images and special features.
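
A rough illustration of that point, as a Python sketch with made-up data: a PDF page holds text as positioned fragments, and the order they are stored in need not match reading order. The fragment tuples and the `reading_order` helper are hypothetical and not taken from any PDF library.

```python
# Hypothetical fragments as (x, y, text); y grows downward from the top of the page.
# They are listed in the order a PDF might store them, not in reading order.
fragments = [
    (300, 40, "law"),
    (72, 40, "Newton's"),
    (72, 60, "states"),
    (180, 40, "second"),
    (140, 60, "F = ma"),
]

def reading_order(frags, line_tolerance=5):
    """Group fragments into lines by y position, then sort each line by x."""
    lines = []
    for x, y, text in sorted(frags, key=lambda f: f[1]):
        if lines and abs(lines[-1][0] - y) <= line_tolerance:
            lines[-1][1].append((x, text))
        else:
            lines.append((y, [(x, text)]))
    return [" ".join(t for _, t in sorted(items)) for _, items in lines]

# Text in stored order versus text after sorting by position.
stored = " ".join(text for _, _, text in fragments)
recovered = reading_order(fragments)
print(stored)
print(recovered)
```

Sorting by position is enough for this toy page, but it would likely break on multi-column layouts, formulas, and figures, which is roughly why converted scans come out scrambled.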

My understanding is that the Kindle will display PDF (AZW4 is rumored to be a wrapped PDF).
Old 10-16-2015, 04:41 PM   #3
Rizla
Member Retired
 
Posts: 3,183
Karma: 11721895
Join Date: Nov 2010
Device: Nook STR (rooted) & Sony T2
Scanned pdfs? It might be best to just read them as scans, especially when they are complex.

Jailbreak the Kindle (see the sticky in the developer forum; the jailbreak is coming soon), then install something like KOReader and read them as scans.

Put your Kindle in airplane mode while waiting for the jailbreak, before Amazon patches it. Do not turn on wifi.
Old 10-16-2015, 04:52 PM   #4
ger0g3n
Junior Member
 
Posts: 2
Karma: 10
Join Date: Oct 2015
Device: Kindle 5 All New Touch 7
So from what I see, there is no way for me to make my complex PDF work like a regular azw3/mobi ebook?
Old 10-22-2015, 07:10 AM   #5
Wolfrott
Member
Posts: 23
Karma: 10
Join Date: Dec 2013
Device: iPad Mini / Voyage
I've found ABBYY to be terrible, TBPH. Glaring errors. Adobe's OCR feature is better, but not easy or flawless. PDFs turn out ugly and riddled with errors.

The best trick I've found is to grab the text from the scan using Microsoft OneNote's OCR add-on, then manually proofread it as always in a Word document, and go from there, converting to whatever format you want using Calibre. Depending on the scanned book's native font, I rarely find errors. The most common ones I've found are mm, nn, rr mistaken for the latter, i's becoming 1's, and sometimes "" not being recognised.
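
If the proofreading pass feels too manual, a small script can list candidate words to check first. This is only a sketch in Python: the two patterns are my own guesses at the slips described above (a digit 1 inside a word, and words containing mm, nn or rr), and `flag_suspects` is a hypothetical helper name. It flags plenty of correct words too, so it narrows the search rather than fixing anything.

```python
import re

# Illustrative patterns only, not a complete list of OCR mistakes.
# A word that contains at least one letter and the digit 1, e.g. "f1eld".
DIGIT_IN_WORD = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*1)\w+\b")
# A word containing mm, nn or rr, which the OCR may have confused.
DOUBLED = re.compile(r"\b\w*(?:mm|nn|rr)\w*\b")

def flag_suspects(text):
    """Return (line_number, word, reason) tuples for a human to check."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in DIGIT_IN_WORD.finditer(line):
            hits.append((lineno, m.group(), "digit 1 inside a word"))
        for m in DOUBLED.finditer(line):
            hits.append((lineno, m.group(), "mm/nn/rr sequence"))
    return hits

sample = "The f1eld is uniform.\nThe current runs through the inner coil."
for hit in flag_suspects(sample):
    print(hit)
```

In a physics text, legitimate symbols such as x1 will also be flagged, so every hit still needs a look against the scan.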

So the steps:
I. Scan book.
II. Paste pages into OneNote + OCR.
III. Proofread resulting text.

Takes me about a month.
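
For the final conversion after proofreading, Calibre also ships a command-line tool, `ebook-convert`, that takes an input file and an output file. A minimal Python sketch of calling it, assuming Calibre is installed and `ebook-convert` is on the PATH; the file names and the two helper functions are made up for illustration.

```python
import shutil
import subprocess

def build_convert_command(source, target):
    """Build the argument list for Calibre's ebook-convert command-line tool."""
    return ["ebook-convert", source, target]

def convert(source, target):
    """Run the conversion if ebook-convert is on PATH; return True when it ran."""
    if shutil.which("ebook-convert") is None:
        print("ebook-convert not found; is Calibre installed and on PATH?")
        return False
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(build_convert_command(source, target), check=True)
    return True

# Hypothetical file names for illustration only; this just prints the command.
print(build_convert_command("physics_proofread.docx", "physics_proofread.azw3"))
```

Calling `convert("physics_proofread.docx", "physics_proofread.azw3")` would then do the same job as the Convert button in the Calibre GUI, with default settings.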

Last edited by Wolfrott; 10-22-2015 at 07:12 AM.

Tags
azw3, conversion, mobi, pdf, scanned

