Old 06-17-2009, 06:08 PM   #1
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
OpticBook 3600 and Kindle DX

Well, I've spent the better part of the afternoon trying to get the OpticBook and the Kindle DX to play nice. It has been a rousing failure.

I've tried various resolutions and color modes. If I make a single page into a PDF it looks okay (not great), but the moment I try to create a multi-page document the quality goes downhill fast.

I'm using the OpticBook software just for scanning, then combining the TIFFs (600 dpi or 300 dpi grayscale) with Adobe Acrobat Pro. Acrobat, even on the best setting for multiple files, drastically shrinks the file. That would be okay, except that there is an enormous quality loss.

If anyone has some ideas how to do this better, I'd be very interested. I'd very much like to scan a few of my books that are a bit older.
Old 06-17-2009, 06:35 PM   #2
jfrancis
Junior Member
Posts: 5
Karma: 10
Join Date: Nov 2007
Device: Shopping
Hi Gideon, any chance you could shoot me a few of those tiff files and I'll see what I can do? You can just zip them and email them to johnwfrancis AT gmail DOT com.

Regards,
John
Old 06-17-2009, 08:03 PM   #3
jharker
Developer
Posts: 345
Karma: 3473
Join Date: Apr 2007
Location: Brooklyn, NY, USA
Device: iRex iLiad v1, Blackberry Tour, Kindle DX, iPad.
Scanning documents into pdfs is a tough problem and I've never found a really good solution. That said, here are a few random brainstormy thoughts in no particular order...
  • First and foremost, if you don't like how Adobe shrinks your files, you might try shrinking them yourself before handing them to Acrobat. In particular, if you convert the image to black-and-white and then save it as a PNG file, it will be about as small as it can be. JPEG doesn't actually compress B&W very well.
  • You might limit your resolution. On the Kindle DX anything above 150 dpi is lost information anyway. You can always archive the original scans if you don't want to lose the detail, but for everyday pdfs I suggest lower-resolution versions.
  • Instead of Adobe, you might try an alternative like (for example) ImageMagick. I'm pretty sure that ImageMagick will happily compile a bunch of jpgs/tiffs/pngs into a multi-page pdf. It will probably also let you specify the amount of compression to use! You just have to figure out the right options...
  • You can also reduce file size without losing much quality by converting to B&W, then doing a slight gaussian blur. This leaves most of the page as compressible white space while still giving the letters nice soft edges.
I hope one of those is helpful!
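
For anyone who wants to try the ImageMagick route, here is a minimal sketch in Python that only builds the command line; it does not run anything. The function name `build_pdf_command`, the file names, and the 50% threshold are illustrative assumptions, not something from this thread. `-resize`, `-threshold`, and `-compress Group4` are standard ImageMagick options, but the combination that suits a given set of scans should be checked against the ImageMagick documentation (ImageMagick 7 uses `magick` rather than `convert`).

```python
from typing import List


def build_pdf_command(tiff_paths: List[str], output_pdf: str,
                      scan_dpi: int = 300, target_dpi: int = 150,
                      binary: str = "convert") -> List[str]:
    """Build an ImageMagick command that merges scans into one B&W PDF.

    Assumption: the scans were made at `scan_dpi` and should be reduced
    to roughly `target_dpi` (150 dpi is the figure quoted for the DX).
    """
    if not tiff_paths:
        raise ValueError("need at least one input image")
    if scan_dpi <= 0 or target_dpi <= 0:
        raise ValueError("dpi values must be positive")

    # Shrink by the ratio of target to scan resolution, never upscale.
    scale_percent = min(100.0, 100.0 * target_dpi / scan_dpi)

    return [
        binary, *tiff_paths,
        "-resize", f"{scale_percent:g}%",  # e.g. 300 dpi -> 150 dpi is 50%
        "-threshold", "50%",               # force pure black and white
        "-compress", "Group4",             # lossless fax-style compression
        output_pdf,
    ]


if __name__ == "__main__":
    cmd = build_pdf_command(["page001.tif", "page002.tif"], "book.pdf")
    print(" ".join(cmd))
```

With ImageMagick installed, the returned list can be passed to `subprocess.run(cmd, check=True)`. Group4 only applies to pure black-and-white images, which is why the sketch thresholds before compressing.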
Old 06-17-2009, 10:35 PM   #4
Studio717
Addict
Posts: 208
Karma: 575
Join Date: Oct 2006
Location: California
Device: Various Kindles, iPhone, iPad, Galaxy 10.1
Sorry you're having issues, Gideon.

I just tried a PDF I'd created a couple of years ago and it seems fine. Only issue I noticed was that since it was a very old book, the margins weren't pristine so the DX didn't crop them. Still very readable.

I can go through the whole process a bit later (Opticbook --> software --> Acrobat 8 --> PDF --> DX) and see if there are any issues I can see. I use the same process that you do, scanning to .tiff and then using Acrobat to pull them in to a single PDF. Note, though, that I have Acrobat 8, not 9.

Edited to add: I usually scan in the 150 - 300 dpi range because that's what most OCR engines prefer. I don't remember offhand what dpi I scanned the above PDF in, but I would doubt if it was above 300.

Another edit to add: I usually scan B&W (not grayscale) unless it's an image like a photograph.

Last edited by Studio717; 06-17-2009 at 10:41 PM.
Old 06-18-2009, 03:17 AM   #5
Studio717
Addict
Posts: 208
Karma: 575
Join Date: Oct 2006
Location: California
Device: Various Kindles, iPhone, iPad, Galaxy 10.1
I tried it with another book (an excerpt, actually, of 42 pages) and it came out fine. (Other than I really do need to crop the pages before sending it to DX.)

Specs: I used the Opticbook 3600 and Opticbook software, scanned at 300 dpi in B&W (not grayscale) to .tiff, then used Acrobat 8 Pro to create a PDF. The final file shows up as 2MB.

I wish I could help, but I had no hitches along the way, so don't have any suggestions for you. Sorry.

Good luck with fixing the problems.
Old 06-18-2009, 11:30 AM   #6
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
Here are a few pages of the original scan:
http://rapidshare.com/files/24593425...chive.zip.html
(can only be downloaded 10 times, so only grab it if you really want to play with it.)

My thinking is that perhaps this was just a very bad book to scan. I don't remember having these kinds of problems before; I remember getting much nicer-looking scans.

I'm going to try something else later today and see what happens. I have to move my virtual machine (so I can run the software, all I have are Macs) to a laptop so I can get settled in someplace a bit more comfortable for long term scanning.
Old 06-18-2009, 03:11 PM   #7
DDHarriman
Guru
Posts: 854
Karma: 1200
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Hi Gideon

I have taken the liberty of downloading the file you put on rapidshare; I hope that is not a problem.

First of all, let me check with you whether the file contains corrupted tif files.
There are 6 of them, scanned in grey at 300 dpi, but each one shows a black bar at the bottom... is this what you intended?
I'm posting here a pdf of page 3 for you to check.

Anyway, my advice.

1 - check the Plustek 3600 scanning software and limit your scan area with it. You will save lots of time and get almost exactly the right page size, so you will not need to crop the image later, or if you do, only a little;

2 - do not scan these types of books in greyscale; it's too much information for nothing. 300 dpi in black and white is perfect for them;

3 - if you are just using Acrobat to produce the final pdf (what version do you have?), apply a final compression to the file using (in Acrobat 8 here) Advanced / PDF Optimizer, and check the parameters there before compressing. For example, in this situation the Monochrome Images option should have Downsampling set to Bicubic Downsampling to 300 pixels/inch for images above 450 pixels/inch, Compression set to JBIG2, and Quality set to Lossless.
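
To put rough numbers on point 2, here is a back-of-the-envelope sketch of the uncompressed size of one scanned page at different bit depths. The 6 x 9 inch page size and the function name `raw_page_bytes` are assumptions for illustration only; real TIFF files are usually compressed, so actual sizes on disk will differ.

```python
def raw_page_bytes(width_in: float, height_in: float,
                   dpi: int, bits_per_pixel: int) -> int:
    """Uncompressed size in bytes of one scanned page.

    Each row is padded up to a whole number of bytes, which is how
    1-bit images are commonly stored.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


if __name__ == "__main__":
    # Hypothetical 6 x 9 inch page scanned at 300 dpi.
    gray = raw_page_bytes(6, 9, 300, bits_per_pixel=8)
    bw = raw_page_bytes(6, 9, 300, bits_per_pixel=1)
    print(gray)  # 4860000
    print(bw)    # 607500
```

Before any compression, the greyscale page carries eight times as much data as the black-and-white one, and the bilevel page is also the kind of image that JBIG2 is designed to compress.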

I'm also describing here what I did with your tif images (6 pages, 20.5 Mbytes in total); the result is a 145 Kbyte pdf file(!):

1 - opened the tif files in FineReader Pro 9.0, which automatically converted them to black and white;

2 - cropped the pages in FineReader, mainly cutting off the black bar I mention above;

3 - applied OCR to the files and saved the result in pdf format with 2 layers, the image on top and the text underneath;

4 - opened the pdf in Acrobat 8 and optimized its size as I describe above in (3).

Please check if it works for you in the DX.

Best regards,
Attached Files
File Type: pdf Devices_ 0003.pdf (143.8 KB, 234 views)
File Type: pdf Book.pdf (144.2 KB, 203 views)
Old 06-18-2009, 05:29 PM   #8
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
Those were actually the wrong files (Preview chewed em up and spit em out) but I'll give these a go.

The real files are here (or rather, those that didn't get mauled)
http://rapidshare.com/files/246004538/Archive.zip.html

Wow... first one was a bit of a mess, but the second (book.pdf) was very nice.

I'll give it a go. As Preview ate the first few pages I'm going to have to scan again anyway. But I'll do it in black and white this time (I had always heard to use grayscale). Thanks for taking the time to help!
Old 06-18-2009, 07:21 PM   #9
DDHarriman
Guru
Posts: 854
Karma: 1200
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Hi Gideon

No problem, it's a pleasure to help.

Here are the new pages you posted above, processed with the technique I explained above: 8 pages, 289 Kbytes in size. See how it looks on your DX.

Concerning grey scanning.
It should be used if your result file is to be a text file, a Word file, or a text-and-images pdf file and you want the best possible results for the recognised text, and the original is difficult: old (pages yellowed by the sun, for example), set in a font smaller than 10 points, or printed at less than 300 dpi (like fax text).

For normal books, printed in black and white, new (or relatively new) and with well-printed fonts (10 points or above), b&w at 300 dpi is perfect, even for the output types I mention above.
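
The rule of thumb above can be written down as a small decision function. This is only one reading of the advice, sketched in Python; the function name `choose_scan_mode` and its parameters are made up for illustration, and the thresholds (10 points, 300 dpi) are the ones quoted in the post.

```python
def choose_scan_mode(font_points: float, print_dpi: int,
                     discolored: bool) -> str:
    """Return "grayscale" for difficult originals, otherwise "bw".

    Assumption: any one of the three difficulties (aged or yellowed
    paper, small type, low-resolution print such as a fax) is enough
    to justify a greyscale scan.
    """
    small_type = font_points < 10
    low_res_print = print_dpi < 300
    if discolored or small_type or low_res_print:
        return "grayscale"
    return "bw"


if __name__ == "__main__":
    # A clean modern paperback set in 11 pt type.
    print(choose_scan_mode(font_points=11, print_dpi=600, discolored=False))
    # An old book with 8 pt footnotes.
    print(choose_scan_mode(font_points=8, print_dpi=600, discolored=True))
```

Either way the scan resolution stays at 300 dpi; only the colour mode changes.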

Let me know if you think I can be of some more help.

Best regards,
Attached Files
File Type: pdf Book1.pdf (288.2 KB, 304 views)
Old 06-19-2009, 12:22 AM   #10
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
Well, I scanned two books tonight with your advice in mind. Worked really well.

The key is really making sure the borders are all gone. This can be minimized to some extent with the Plustek software, but as you move through a book (especially a hardback) the sides shift some, so you're still going to need to do some trimming in Acrobat (I'm using 8). This is a bit easier with "Even only" and "Odd only", though many pages still need personal attention.

Still, HUGE help. Thanks again. It seems to be taking longer than it used to, but perhaps I'm just rusty.
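
The "Even only" / "Odd only" idea boils down to keeping two sets of margins, because the gutter sits on opposite sides of left-hand and right-hand pages. Here is a minimal sketch of that bookkeeping in Python; the names `Margins` and `crop_box` and all pixel values are hypothetical, and this is not how Acrobat itself implements the feature.

```python
from typing import NamedTuple, Tuple


class Margins(NamedTuple):
    """Pixels to trim from each edge of a scanned page."""
    left: int
    top: int
    right: int
    bottom: int


def crop_box(page_number: int, page_width: int, page_height: int,
             odd: Margins, even: Margins) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the area to keep.

    Page numbers start at 1; odd pages use `odd`, even pages use `even`.
    """
    m = odd if page_number % 2 == 1 else even
    left, top = m.left, m.top
    right = page_width - m.right
    bottom = page_height - m.bottom
    if left >= right or top >= bottom:
        raise ValueError("margins remove the whole page")
    return (left, top, right, bottom)


if __name__ == "__main__":
    # Hypothetical values: wide trim on the gutter side, narrow outside.
    odd = Margins(left=100, top=50, right=20, bottom=50)
    even = Margins(left=20, top=50, right=100, bottom=50)
    print(crop_box(1, 1800, 2700, odd, even))  # (100, 50, 1780, 2650)
    print(crop_box(2, 1800, 2700, odd, even))  # (20, 50, 1700, 2650)
```

Pages where the book shifted on the glass would still need their own box, which matches the "personal attention" caveat above.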
Old 06-19-2009, 02:00 AM   #11
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
Okay, I was wrong...

One of my scans needed quite a bit of cropping. Unfortunately, Acrobat doesn't really crop so much as mask and the Kindle isn't fooled.

The book is about 381 pages though (and I'm not positive Preview really crops either) so I really need to start with "erase white space" and then edit individual pages as needed.

Any ideas?
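
One way to think about "erase white space" on the image side, before any PDF exists, is to find the bounding box of the ink and throw away everything outside it. The sketch below does this in plain Python on a tiny bitmap stored as lists of 0 (white) and 1 (black); the function names are made up for illustration, and a real workflow would use an image library rather than nested lists. Unlike a crop that only hides the margins, this discards the pixels outright.

```python
from typing import List, Optional, Tuple

Bitmap = List[List[int]]  # rows of 0 (white) / 1 (black)


def content_bbox(bitmap: Bitmap) -> Optional[Tuple[int, int, int, int]]:
    """Return (left, top, right, bottom) around all black pixels.

    Right and bottom are exclusive. Returns None for a blank page.
    """
    rows = [y for y, row in enumerate(bitmap) if any(row)]
    if not rows:
        return None
    width = len(bitmap[0])
    cols = [x for x in range(width) if any(row[x] for row in bitmap)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def crop(bitmap: Bitmap, box: Tuple[int, int, int, int]) -> Bitmap:
    """Return a new bitmap containing only the pixels inside `box`."""
    left, top, right, bottom = box
    return [row[left:right] for row in bitmap[top:bottom]]


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    box = content_bbox(page)
    print(box)              # (1, 1, 4, 3)
    print(crop(page, box))  # [[0, 1, 1], [1, 1, 0]]
```

Scanner shadows and dark edges count as ink here, so on real scans those would have to be masked or trimmed first.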
Old 06-19-2009, 04:33 AM   #12
emellaich
Wizard
Posts: 1,018
Karma: 3630887
Join Date: Oct 2007
Device: Palm=> Ebookman=> IPaq=> Axim=> Cybook=> Kindle 2=>IPAD 1 & Kindle 3SO
Well,

This is complete brainstorming -- I don't understand your workflow, and I don't know quite what you are trying to do. However, I've been more impressed with Foxit than with Adobe in many cases. Would something like Foxit Page organizer allow you to stick together separate files? I've never used it, I just went through the foxit web site with your problem in mind.

www.foxitsoftware.com
Old 06-19-2009, 04:47 AM   #13
Gideon
Wearer of Pants
Posts: 1,050
Karma: 7634
Join Date: Jan 2008
Location: Norman, OK
Device: Amazon Kindle DX / iPhone
Well, I worked out a way around it.

I exported my 'cropped' pages to TIFF and then created a PDF from them again. After that, I ran OCR again. I ended up with about a 20 meg file, but so it goes. Optimizing just makes it bigger at this point. But then I ran 'Reduce File Size', and while the result looked pretty rough on the computer, it looked good on the Kindle (and page turns weren't too slow).

So... in conclusion:
  • Scan text at 300dpi B&W
  • Create PDF with images
  • Crop whitespace
  • Then go through and crop individual pages that extend beyond the text area (usually because of edges)
  • Export to TIFF files again
  • Create a NEW PDF with these files
  • Run OCR & optimize
  • Save a nice 'archive' copy
  • Then 'Save As...' and set up a new file.
  • Run 'Reduce File Size' at highest version interoperability (for me this was Adobe 8)
  • Pop into Calibre, adjust metadata and put on Kindle
  • Read, and bemoan your lack of a Table of Contents, because evidently Amazon doesn't actually use the features they create, or they would realize what a stupid thing that was to leave out.

---
I think Foxit is PC only, so it wouldn't really help. I have to run Windows in emulation which is good enough for scanning, but any real work needs to be done in Mac mode - it's too aggravating otherwise.
Old 06-19-2009, 07:40 AM   #14
DDHarriman
Guru
Posts: 854
Karma: 1200
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Hi

Quite a workflow you have here!

With the software and hardware you have now, maybe you can shorten it a bit. Here are two ideas:

1 - crop correctly when scanning. For each page, preview first, then adjust the scanning window to the final size you want, and then scan. You will spend at least twice the time scanning, but once you have the tif files there is just one more step: build the pdf (OCRed or not);

2 - your scanner came with several pieces of software. One is FineReader 5 Sprint. Install it and see if it has an option to crop the images after loading them into the program (FineReader 9 has it). It's a great feature (together with the straightening function on loading, which aligns skewed scans) and lets you very quickly crop everything down to exactly what you want (this is what I used to create books 1 and 2 from your files).

Finally, you can check the programs for cropping images or pdfs that several people have posted about in the forum, especially in the pdf area of the forum, I think.

I wish you success with your endeavour, and remember, even if it takes time, once done it's done for good. You will have not just the pdf for the DX that you can use now, but also a master document you can come back to in the future and convert to any other file format when the need comes: a new reader, etc.

Best regards,
Old 06-19-2009, 08:32 AM   #15
301verbs
Enthusiast
Posts: 27
Karma: 12
Join Date: May 2009
Device: none yet
Wait, why are you creating the PDF twice? Besides what DDHarriman suggested, another option is to just scan to TIFFs, then crop the images (lots of image viewers also have batch crop options), and then export to PDF. Or am I missing something?
All times are GMT -4. The time now is 06:18 AM.


MobileRead.com is a privately owned, operated and funded community.