Old 05-20-2011, 05:23 PM   #1
UpSpin
Enthusiast
Posts: 28
Karma: 60000
Join Date: Apr 2011
Device: Sony PRS-650
PDF Reader Review and Guide: View, Optimize and Create PDF files


Preamble
  • I took a lot of screenshots of the eReader display to show what I'm talking about and what it looks like. Remember, the screenshots use JPEG compression and you are viewing them on an ordinary LCD, so the difference between some settings looks less impressive there than it does on the eReader's eInk display.
  • My native language is German, so some sentences may sound funny; ignore it or help me improve it.
  • I just wrote down what I know, but if you have further tips or know some handy tools, feel free to tell us and I'll add them here.


Index
  1. PDF overview, pros and cons
  2. Review of the eReader PDF Viewer
  3. Optimize view on the Reader
  4. Optimize the PDF file
  5. Worklog, from a physical book to a finished eReader optimized PDF

Last edited by UpSpin; 05-20-2011 at 06:12 PM.
Old 05-20-2011, 05:26 PM   #2
UpSpin
1. PDF overview, pros and cons

Why PDF?
Most eBooks are saved in common eBook file formats such as ePub. Those eBooks are mostly ordinary story books with text and maybe some images. You can change the font size and the reader reflows everything automatically to fit the eReader screen. It would be pointless to convert such a book to PDF.
PDF is used for other things. If you want to read a science book containing equations, mathematical expressions or chemical reactions on your eReader, then PDF is the way to go. Such books have a strict layout, and it's next to impossible to convert them to ePub. Equations can't be represented with normal fonts, so it just won't work.
Some online sites (like Nature) also publish their articles as PDF. To view them you need a PDF viewer on the eReader, and you have to deal with both the advantages and the disadvantages of the PDF file format.



PDF advantages
As already said, the big advantage is that a PDF file, and thus a PDF viewer, can display anything! It's not limited to text or images. It can contain pixel-based images (bitmaps), vector-based images (you can zoom in without quality loss), text and whatever else you can imagine. That's also a big problem for eReaders, because you can't just remove the layout and fit the content on the relatively small screen.

PDF disadvantages
Displaying anything means such PDF files have custom layouts, like newspapers or some science books with two or more columns. So changing the font size doesn't work as easily as with ePubs, because it destroys the whole layout and most often creates a huge mess.

My Usage Scenarios
  • Reading news articles on the bus or train which I've downloaded to the eReader in the morning
  • Reading science books published as PDF or scanned and converted to PDF
  • Reading lecture scripts published by my prof (I don't want to carry 600 pages around ^^) or created by other students with LaTeX during the lectures (always up to date, and if some errors get fixed I don't have to print it again)
  • Reading exercise sheets, so I don't have to print them.
Old 05-20-2011, 05:29 PM   #3
UpSpin
2. Review of the eReader PDF Viewer

In this section we’ll take a closer look at the features, but also limitations of the included Adobe Reader.

Features
First of all: It’s a very powerful reader and it supports any PDF file in any size. It’s very stable, fast and easy to use. The display quality is excellent, too. So all in all it’s a really great program.
You can magnify, change view mode, change font size, enable text reflow mode and change the rendering settings.
While reading you can add ink notes (write, draw something, or underline a word), add text notes as a bookmark, highlight text or select text and open the dictionary.
You can navigate by bookmarks, page number or history and finally search in the PDF file.



View
If you open a PDF file in portrait mode the PDF Viewer will zoom out until it fits the whole page on the display. Normally this results in an unreadable tiny font size because the display is much smaller than the page size of books or magazines.




Magnification
So we press the magnification button and get three options: “Zoom In”, “Page Mode”, “XS, S, M, L, XL, XXL”

“XS, S, M, L, XL, XXL”
With these selections we can activate/deactivate the reflow mode and change the font size. If we select S, the PDF gets displayed just as it is, with its custom layout, font size, etc. That’s the only mode you should use to view a PDF file properly!
If we select any of the other sizes, like M or L, the PDF Viewer removes the whole layout and displays the text as flowing text; it reflows the text. This means the font gets larger and readable, but the layout is lost. The issue with this mode is that it doesn’t work with every PDF. If you have a normal book in PDF format, use this mode; it should work fine apart from some hiccups. If you have a normal newspaper article, you can use this mode, too, to remove the columns. However, if you read science material or unusual PDF documents, reflow mode won’t work properly: it resizes the font but scrambles parts of the page. If equations are included, forget it, it just doesn’t work.




Zoom In
Just as the name implies, the Sony Reader has a zoom function. Press it and you can zoom in. You can pan the page with the touchscreen just as on a smartphone, but you can’t flip to the next page. However, there’s a lock button. If you lock the view, the zoom buttons and the rest of the user interface vanish, you can turn pages, and the zoom settings stay, even after a page turn. So if you only want to read a small part of each page, located at the same position on every page, you can use this feature. Otherwise, just use the zoom function to view a graph or other details magnified. (You can’t add notes while in zoom mode.)




Page mode
This function is important for viewing newspaper articles. Such articles consist of two or three columns. Reading them on the eReader in the original view is impossible because the font is too small. Using the zoom tool is awkward because panning around is too slow (an eInk display has a slow response time), so you can use the 2-Column Split or 3-Column Split instead. That way the page gets magnified and you view just a part of it. If you flip forward, the next part gets displayed, and after the last part the reader goes to the next page and displays its first part again. It’s very useful and works pretty well. (Note: These two modes don’t work in landscape mode.)
The remaining modes, “Margin Cut” and “Full Page”, should do what their names say. Margin Cut, though, hasn't had any noticeable effect on the PDF files I've viewed so far.




Landscape mode
The best method to increase the font size is to switch to landscape mode. That way the page gets split into an upper and a lower part, so the usable width gets much larger and the font bigger.
The only issue with landscape mode is that it always splits the page into exactly two parts. If you have a very tall and narrow page, it does not fit the page to the width of the screen, but zooms out and reduces the page size until it can split it into two parts again. It’s a pity that it doesn’t split such a page into more parts instead. This fact is important for the PDF file optimizations discussed later.




Adjust view (options)
In the options dialog you can adjust the view. This means you can change the contrast and brightness, making text lighter or darker, edges harder or softer, and the background white or gray. You can use one of the 5 presets or a custom one with your own contrast and brightness settings.




Notes
You can highlight a single word or whole sentences with the touchscreen. Ink notes are also supported and only limited by the touchscreen resolution. It’s useful to make a small comment, draw a symbol or underline words. With an included eraser tool you can remove both ink notes and highlighting again. On each page you can add a bookmark, too. This bookmark can be either a text message, written with the on screen keyboard, or an ink message, drawn with the pen.
To manage the notes you can open a list, containing all your notes.



(The handwriting looks ugly because the pen accuracy is low and I have to keep my wrist lifted. I’m used to writing on a tablet PC with a Wacom pen, so it’s not my fault.)


Search
Of course you can search in the PDF file.

Last edited by UpSpin; 05-20-2011 at 06:08 PM.
Old 05-20-2011, 05:30 PM   #4
UpSpin
3. Optimize View on the Reader

First of all you have to understand that you're working with a PDF file. A PDF file is not bad; it's great, versatile, powerful and widely used, but it's difficult to port to a different layout. Not because it's a 'stupid' PDF, but because the content was optimized for this specific layout. And compared to content in ePub or similar formats (without any fixed layout), the content of a PDF file is most often highly complex. So to get the best results, try to preserve the given layout.


General tips

View
Play with the settings and try the presets to see what's possible. I personally recommend the following custom setting (text and b/w images look much better, but some greyscale images may look worse):
Brightness -40
Contrast +40
The text gets bolder, clearer, sharper and more readable. (On the eReader screen the difference is more visible.)


Reflow mode
Only use it if you work with text-only PDF files without complex content. But in that case you can also try to get the book in ePub format in the first place, or convert it to ePub with software on a computer.

Zoom
Only use it casually. If you intend to 'crop' a page with it, better do it on a PC.


Articles
To read articles you have to use the 2-Column or 3-Column Split mode. Sadly it only works in portrait mode, but it works very well. It's not as eReader-optimized as ePub files, but it makes PDF articles viewable on the eReader.
You may want to remove any white margin on a PC beforehand, so the font gets larger on the eReader screen.

Books
Use landscape mode to increase the font size without losing the given layout. But you should also use a PDF editor on a computer to remove unnecessary margins.

Last edited by UpSpin; 05-20-2011 at 05:45 PM.
Old 05-20-2011, 05:30 PM   #5
UpSpin
4. Optimize the PDF file

To optimize a PDF file you need a PDF editor. I do everything with Adobe Acrobat or Bluebeam PDF Revu. Both are very powerful PDF editors; sadly their price is pretty high, too. Acrobat is the standard! Besides PDF editing and creation features, it has a very powerful OCR engine and lots of features for print production.
PDF Revu, on the other hand, has the same editing and creation features, but its main use is working with PDF files on a tablet PC, on a mobile PC or in the field. If you own a tablet PC, take a look at it; it’s the best PDF tool for annotating with pen and ink and for working with PDF documents.
There are also some free tools available for cropping PDF pages, like:
http://sourceforge.net/projects/briss/

General
There are some free tools, written by users, that are meant to improve PDF pages by splitting and cropping them automatically and changing the font.
Personally, I really don’t like hard-splitting a PDF page into two or three parts. If you do this, you can no longer view the full page on the eReader, and you have to keep a second version of the PDF file for the eReader only. So if you work with the PDF file on your computer, too, you’ll create a small organizational mess.
So my preferred method is to fit the page to the eReader screen as well as possible.

Crop pages and change page size:
That’s the most important and most effective way to optimize the PDF file. Remove the white border and maybe change the page size.
The Adobe Reader on the Sony Reader has one ideal width/height ratio of 0.775. With any other ratio it adds a grey border around the page. So by changing the page size and resizing the content, you can convert the grey border into a white border and thus create more space for ink annotations.

If you crop a book page, remember that you have an eReader, so a header with the chapter title or a footer with the page number isn’t necessary. Remove them, too, if possible, but only if it helps. If the page content is very narrow, so that the eReader adds a huge grey margin on the left and right, then removing the header and footer reduces this margin. If the content is very wide and the eReader adds a margin at the top and bottom, then removing them won’t improve anything.
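
The rule above can be sketched in a few lines of Python. This is only an illustration under the assumptions of this guide: the ratio 0.775 is the measured value from the text, while the function name `grey_margin_side` and the `tolerance` parameter are made up for the example and are not something the reader exposes.

```python
READER_RATIO = 0.775  # width / height, the value measured in this guide


def grey_margin_side(width, height, tolerance=0.005):
    """Guess where the reader pads a page of the given size with grey.

    The tolerance is an arbitrary assumption for this sketch.
    """
    ratio = width / height
    if abs(ratio - READER_RATIO) <= tolerance:
        return "none"
    # Narrower than the ideal ratio: grey appears left and right.
    # Wider than the ideal ratio: grey appears at the top and bottom.
    return "left/right" if ratio < READER_RATIO else "top/bottom"


# An uncropped A4 page (210 x 297 mm) is narrower than the ideal ratio.
print(grey_margin_side(210, 297))
```

Only in the "left/right" case does cropping the header and footer shrink the grey margin, which matches the advice above.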



In the first two images I’ve opened a PDF file with its default dimensions (A4). We get both a huge white and a grey margin, in portrait as well as in landscape mode (wrong aspect ratio).



In the second two images I’ve cropped it so that every white margin is gone. Because the page size has the wrong aspect ratio (too narrow), the reader still adds a grey margin on the left and right.



And finally I’ve cropped the page with the correct aspect ratio. Compared to the second picture the text doesn’t get any larger, but the grey margin is gone. I could now leave a large margin on one side only and none on the other, so I have some space to add annotations without sacrificing anything.

So if you crop the pages, try to achieve the correct aspect ratio (and if you want to annotate, try to make the margin large on one side and small on the other).

If your page is very wide you'll get the following result. It gets displayed properly in landscape mode, but you still have to scroll, so why not extend the bottom to get additional space for notes, again without sacrificing anything?



The correct aspect ratio is Width/Height = 0.775.
If the cropped page is 24 cm tall, you should try to set a width of 0.775 * 24 cm = 18.6 cm.
If the cropped page is 20 cm wide, you should try to set a height of 20 cm / 0.775 = 25.81 cm.
(The same applies in inches or any other unit.)
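
The same arithmetic as a small Python sketch. The helper name `pad_to_reader_ratio` is hypothetical; it only ever grows one dimension, so the cropped content is never cut, and it assumes the measured ratio of 0.775 from this guide.

```python
READER_RATIO = 0.775  # width / height, the value measured in this guide


def pad_to_reader_ratio(width, height):
    """Return (new_width, new_height) with new_width / new_height == 0.775.

    The result always encloses the original page: a page that is too
    narrow gets wider, a page that is too wide gets taller.
    """
    if width / height < READER_RATIO:
        return (height * READER_RATIO, height)
    return (width, width / READER_RATIO)


# A cropped page of 15 x 24 cm is too narrow, so the width is extended.
new_w, new_h = pad_to_reader_ratio(15.0, 24.0)
print(f"{new_w:.1f} x {new_h:.1f} cm")

# A cropped page of 20 x 24 cm is too wide, so the height is extended.
new_w, new_h = pad_to_reader_ratio(20.0, 24.0)
print(f"{new_w:.2f} x {new_h:.2f} cm")
```

The extra width or height is the white space that can be placed on one side of the page for annotations.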

Sometimes the PDF file has the wrong aspect ratio, but there’s no margin available to replace the grey one with a white one on which you can write. Then you have to change the page size or the size of the content on the page, a rather difficult task with Acrobat: you have to print the PDF file to a new page size. So print it on a larger page, then crop this page afterwards.

I use Bluebeam PDF Revu for this, which can resize the content on the page directly (I do most of my PDF work with it, but it’s a rather exotic piece of software, so I won’t go into detail here).

1:1 copy
I haven’t used this yet, but maybe others use a reader for more than just reading and want to view the objects in a PDF file at their true size, so that a house drawn 10 cm large in the PDF file is 10 cm large on the eReader display, too.

Then you have to select the following page size:

If you want a 1:1 copy in portrait mode set:
Width: 89.4 mm
Height: 115.4 mm

If you want a 1:1 copy in landscape mode set:
Width: 120.4 mm
Height: 155.4 mm

(I got these values, and also the aspect ratio, by measuring. You can’t just take the display dimensions, because the eReader adds a small margin around every PDF file. Maybe there’s a way to read out the internally stored values, but I don’t know it.)
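
Many PDF tools expect page sizes in points rather than millimetres. The following Python sketch converts the measured values above, assuming the default PDF unit of 1 point = 1/72 inch. The constant and function names are made up for illustration, and the landscape gain at the end is simply derived from the two measured widths, not a figure stated by the reader itself.

```python
MM_PER_INCH = 25.4
POINTS_PER_INCH = 72  # default PDF unit, assumed here

# Measured 1:1 page sizes from this guide, in millimetres (width, height).
PORTRAIT_MM = (89.4, 115.4)
LANDSCAPE_MM = (120.4, 155.4)


def mm_to_points(mm):
    """Convert millimetres to PDF points."""
    return mm * POINTS_PER_INCH / MM_PER_INCH


for name, (w, h) in (("portrait", PORTRAIT_MM), ("landscape", LANDSCAPE_MM)):
    print(f"{name}: {mm_to_points(w):.1f} x {mm_to_points(h):.1f} pt, "
          f"ratio {w / h:.3f}")

# How much wider the same page is shown in landscape than in portrait.
landscape_gain = LANDSCAPE_MM[0] / PORTRAIT_MM[0]
print(f"landscape gain: {landscape_gain:.2f}x")
```

Both page sizes come out at a ratio of about 0.775, which is consistent with the aspect ratio given earlier.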

OCR
If you digitize a book, don’t leave the pages as images. It’s not only a waste of space and increases loading time, it also looks fuzzy and strange on the eReader. Convert it to text with an OCR tool.
Either do it with specialized OCR software and convert it to ePub or, if it isn’t text only and PDF is therefore necessary, use the so-called ClearScan method in Acrobat.



It doesn’t convert the text to a given font, but creates a custom font based on the scanned image. This allows Acrobat to replace the pixel-based image with a vector-based, freely scalable font, and thus reproduces the look of the scanned text very closely.



On the left you see the PDF file as an image; on the right it has been converted to a vector font with ClearScan.

The font looks different in detail from the usual fonts you know, though it’s the best method to view a scanned book on the eReader, especially if the pages contain more than just flowing text.



The upper two images show the PDF file on the eReader where the scanned page was converted to PDF without OCR, so it's still a bitmap.
The lower two images show the same PDF file converted with ClearScan.

Last edited by UpSpin; 05-20-2011 at 07:10 PM.
Old 05-20-2011, 05:30 PM   #6
UpSpin
5. Worklog, from a physical book to a finished eReader optimized PDF

So here we go: a short explanation of how to digitize a physical book so that it is perfectly readable on the Sony eReader.

Scan the book
I use a special scanner to scan books, the Plustek OpticBook 3600 Plus. It's an ordinary flatbed scanner with the advantage that it has a very narrow border on one edge, … Because that makes it special, you pay a premium for it. But you buy a scanner once, and if you plan to work with digitized books, it's worth considering.
You can also use a normal flatbed scanner; just make sure that you press on the back of the book so there's no distortion.
I don't recommend the super-flat LED scanners. They work fine as long as the object lies flat on the surface, which isn't the case with a book. As soon as the object is a small distance away from the glass they no longer work.
If you use a camera, it gets a bit more difficult to get good results because the post-processing needs additional tools to remove the distortion.
Just select 300 DPI and start scanning. I prefer to scan each page to a separate image file first. That way I can post-process the pages with whatever program I want and in whatever way I want.

Postprocessing the pages
I use Adobe Photoshop, but free tools should be sufficient, too. We just need to adjust the white and black points and maybe increase brightness and contrast, all done in a batch process. That's a very important step, because only that way do you get perfect OCR results and true blacks and whites on the eReader.
In an additional step you can already crop the pages (I do this most often); however, this can also be done in the PDF later.
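
To illustrate what adjusting the white and black points does, here is a minimal Python sketch of a levels adjustment on 8-bit grey values. It is not the Photoshop batch action used for this guide; the function name `apply_levels` and the example thresholds (40 and 220) are assumptions chosen only for demonstration.

```python
def apply_levels(pixels, black_point, white_point):
    """Stretch 8-bit grey values so black_point maps to 0 and white_point to 255.

    Everything darker than black_point becomes pure black, everything
    lighter than white_point becomes pure white.
    """
    if white_point <= black_point:
        raise ValueError("white_point must be greater than black_point")
    scale = 255 / (white_point - black_point)
    return [
        max(0, min(255, round((value - black_point) * scale)))
        for value in pixels
    ]


# One row of a scan: dark-grey ink, mid-grey, and off-white paper.
row = [30, 45, 128, 210, 235]
print(apply_levels(row, black_point=40, white_point=220))
```

Greyish paper ends up as true white and washed-out ink as true black, which is the effect described above.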

Creating a PDF file
Just create a PDF file out of the images. I recommend the best quality setting to keep the 300 DPI and keep compression artifacts to a minimum. We will OCR it later, so file size doesn't matter yet.

Crop the PDF / optimize margin
Remove unneeded white or black space, if possible down to a page size that fits the eReader screen best. Add margin by changing the content size on the PDF page if you want to get rid of the useless grey margin.

OCR
In Adobe Acrobat select ClearScan and set the image resolution to 300 DPI. After that your PDF file is tiny, the pages are searchable and the text is replaced with a vector-based font.

Structure
Now we are almost done. You can add an index so navigation gets easier on the eReader.

eReader
Copy the file to the eReader, open it, change the view settings to custom with Brightness -40 and Contrast +40, select landscape mode and enjoy your PDF.

Summary
You may think that this is a lot of work, but keep in mind, it's not intended for ordinary books. It's intended for science books used by students or other people who work with such a book for several weeks, months or years. For a normal book, which you use for a few days and then 'throw away', it's not worth the trouble, but for those you can most often buy an official eBook anyway.

Last edited by UpSpin; 05-20-2011 at 06:00 PM.
Old 05-20-2011, 06:16 PM   #7
UpSpin
Feel free to comment.

I hope this guide answers a lot of PDF-related questions, and maybe more people will use or buy the eReader not only to read eBooks but also to view PDF files.

If you have any additional information, just post it and I'll try to integrate it.
Old 05-25-2011, 01:13 PM   #8
Prestidigitweeze
Fledgling Demagogue
Posts: 2,237
Karma: 24802569
Join Date: Feb 2011
Location: White Plains
Device: Aura HD; Nexus 7; PRS-350, 950; Kindle K; OnePlus One; Galaxy S4; MBP.
I have one comment (so far): The mods should sticky this thread.
Old 05-25-2011, 03:37 PM   #9
quisvir
Addict
Posts: 238
Karma: 6875
Join Date: Feb 2009
Location: Netherlands
Device: Kindle PW2
Great work UpSpin! Definitely worth a sticky imho.
Old 06-16-2011, 12:41 PM   #10
Soul_Est
Gadgetic Young Man
Posts: 57
Karma: 300
Join Date: Nov 2010
Location: Toronto, Ontario, Canada
Device: Sony Ericsson XPERIA Ray, Sony PRS-650, Samsung Galaxy Note 10.1
I agree with both Prestidigitweeze and quisvir; Mods please sticky this thread. Best information I've found for working with PDFs for and on my 650 to date.
Old 06-16-2011, 01:16 PM   #11
ScalyFreak
Sith Wannabe
Posts: 1,644
Karma: 6202604
Join Date: Jun 2011
Location: I'm not sure... it's kind of dark.
Device: PRS-950, Aluratek Libre Pro, HP Touchpad, Kindle Touch, Galaxy S3
Another vote to sticky this. And a bow and tip of the hat to you for taking the time and effort to write this up.
Old 11-25-2011, 10:04 PM   #12
Rizla
Wizard
Posts: 1,835
Karma: 4155921
Join Date: Nov 2010
Device: PRS-650 / Cybook Opus / Nook STR (rooted)
After much experimentation, I am of the opinion that the pdf file format should be used only as a last resort.

If you possibly can, convert the pdf to html and then to epub. Strip out unnecessary images. These make sensible conversion more difficult.

If it is a scanned file that for some reason you cannot OCR, reduce the pdf to a series of jpgs, zip them up, change the suffix to cbz and convert to epub in calibre. You can then view in landscape and the file will use the whole page.
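
The zip-and-rename step can be scripted. A minimal sketch in Python, assuming the pages have already been exported as zero-padded .jpg files (for example page_001.jpg) in one folder. The function name `pack_cbz` and the folder layout are illustrative, and the PDF-to-JPG export itself is not covered here.

```python
import zipfile
from pathlib import Path


def pack_cbz(image_dir, cbz_path):
    """Zip every .jpg in image_dir into cbz_path, sorted by file name.

    A .cbz file is an ordinary zip archive with a different suffix, so
    writing the archive directly under that name replaces the manual
    rename. Returns the list of file names that were packed.
    """
    images = sorted(Path(image_dir).glob("*.jpg"))
    if not images:
        raise FileNotFoundError(f"no .jpg files found in {image_dir}")
    # JPEGs are already compressed, so they are stored without recompression.
    with zipfile.ZipFile(cbz_path, "w", compression=zipfile.ZIP_STORED) as archive:
        for image in images:
            archive.write(image, arcname=image.name)
    return [image.name for image in images]
```

Calling `pack_cbz("pages", "book.cbz")` would then produce a file that calibre can take as input for the conversion to epub. Zero-padded file names matter because the pages are packed in sorted name order.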
Old 11-26-2011, 06:59 AM   #13
Prestidigitweeze
Rizla:

It's very possible that EPUB 3 and Kindle Format 8 will eliminate the necessity of UpSpin's task. But for those who are using current- and older-gen eInk readers and want to continue to get some life out of their devices, UpSpin's tutorial is a yahwehsend. I'm amazed it still hasn't been stickied.

I might try your method of zipping and renaming as well if you tell us exactly how you mean to "reduce the pdf to a series of jpgs" and someone with UpSpin's needs could "strip out unnecessary images." Do you have any screenshots of the results of your doing these things with that sort of book?

Last edited by Prestidigitweeze; 11-26-2011 at 07:06 AM.
Old 11-26-2011, 08:29 AM   #14
Analogus
Fanatic
Posts: 522
Karma: 2155774
Join Date: Apr 2011
Device: 2x Sony PRS-350 (silver, blue); PRS-300 (Ü), Kindle Paperwhite
Quote:
Originally Posted by Prestidigitweeze View Post
Rizla:

It's very possible that EPUB 3 and Kindle Format 8 will eliminate the necessity of UpSpin's task.
I cannot see why the necessity would vanish. There will still be plenty of PDFs out there, and they will still need to be read.

Regarding PDF and handling:

For my part, I use PDFs very often on my 5" reader. Therefore I developed ;-) the following 'decision tree':

1) Try to use the unaltered file on the reader in reflow mode.
2) If the experience is OK --> read on. If the experience is bad, try
a) cropping headers and footers as described above
or
b) converting the PDF into EPUB --> go to (3)
3) Converting the PDF:

Fact: I do not want to experiment with every PDF to see what happens.
Fact: I sometimes want to have pictures and other times just plain text.
Fact: PDFs come in varying quality.

My solution for ALL PDFs:

I use the (sadly not free of charge) software ABBYY PDF Transformer.
It accepts EVERY technical form of PDF and runs a complete OCR process. Sometimes about 30 minutes of manual correction of the picture frames is necessary.

Details:
  1. Crop headers and footers (page numbers, ...) with whatever software you want (e.g. Adobe Acrobat)
  2. Load the file in ABBYY PDF Transformer and run only the recognition of the different areas (pictures, text, tables)
  3. Manually correct the areas if necessary, especially picture areas. This step may be necessary for about 50% of PDFs
  4. Run the OCR and produce an HTML file without the original layout
  5. Open it in MS Word and save it as RTF. Close Word and reopen the RTF. Save it again, this time as HTML.
  6. Load the HTML file in calibre
  7. Convert it into EPUB

That procedure sounds ridiculous, but there is only one (possibly) lengthy part: correcting the areas in the ABBYY software.

I have converted a huge number of PDFs in various ways and ended up with the procedure described.
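
For anyone who would rather script steps 6 and 7 than click through the calibre window, here is a rough sketch using calibre's ebook-convert command-line tool, which takes an input file and an output file. It assumes calibre is installed and ebook-convert is on the PATH; the function names are invented for this example and no extra conversion options are set.

```python
import shutil
import subprocess
from pathlib import Path


def build_convert_command(html_path, epub_path=None):
    """Return the ebook-convert command line for an HTML -> EPUB conversion.

    Hypothetical helper: if no output path is given, the EPUB is written
    next to the HTML file with the same base name.
    """
    html_path = Path(html_path)
    if epub_path is None:
        epub_path = html_path.with_suffix(".epub")
    return ["ebook-convert", str(html_path), str(epub_path)]


def convert_html_to_epub(html_path, epub_path=None):
    """Run the conversion; raises if calibre's CLI tool cannot be found."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found; is calibre installed and on PATH?")
    command = build_convert_command(html_path, epub_path)
    # check=True raises CalledProcessError if the conversion fails.
    subprocess.run(command, check=True)
    return Path(command[2])
```

Calling convert_html_to_epub("book.html") would then produce book.epub in the same folder. The manual steps before it (cropping, area correction, the Word round trip) are not covered by this sketch.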

A.

Last edited by Analogus; 11-26-2011 at 08:32 AM.
Old 11-26-2011, 09:03 AM   #15
Prestidigitweeze
Fledgling Demagogue
Posts: 2,237
Karma: 24802569
Join Date: Feb 2011
Location: White Plains
Device: Aura HD; Nexus 7; PRS-350, 950; Kindle K; OnePlus One; Galaxy S4; MBP.
Quote:
Originally Posted by Analogus View Post
I cannot see why the necessity would vanish. There will still be plenty of PDFs out there, and they will still need to be read.
You seem to have misunderstood me. My apologies if I failed to make myself clear.

In the context of this discussion, my suggestion was that Kindle Format 8 and EPUB 3 could make PDFs unnecessary for those who want to format graphics- and typography-intensive books from scratch.

Also note that I said this was possible, not that I was certain it would happen.

That said, thanks for your instructions and results-tested experience, which I'm sure will prove useful to those who lack the time and patience to start with a physical book (and I don't mean that as a put-down in any way).

Again, if you have the time, please post screen shots of your results so that we may admire them and decide whether or not we wish to try your approach with books that have been converted into pdfs already.
Tags
ereader, ocr, pdf
