Go Back   MobileRead Forums > E-Book General > News

Old 10-02-2009, 11:17 PM   #46
JeremyZ
Addict
Posts: 303
Karma: 1000702
Join Date: Sep 2009
Location: Chicago
Device: Nook ST, Kindle 2, Samsung Galaxy Stellar phone
Quote:
Originally Posted by gazza View Post
My wife, because of illness, can only use the iPod Touch although I am sure other equally small and suitable devices will come along.
Really!? If one could use an iPod Touch, wouldn't one be able to use a Kindle?

I should take your word, but that is surprising.
Old 10-04-2009, 12:33 AM   #47
unicorn23
Member
Posts: 11
Karma: 10
Join Date: Feb 2009
Location: Australia
Device: HanLin V3
Quote:
Originally Posted by JeremyZ View Post
Really!? If one could use an iPod Touch, wouldn't one be able to use a Kindle?

I should take your word, but that is surprising.
Three words... Land Down Under. Amazon doesn't do Kindle out here, unless you want to jump through all sorts of (potentially dodgy) hoops. Believe us, we'd like Amazon ebooks as much as anyone else, but I doubt they're coming here any time soon.
Old 10-04-2009, 02:02 AM   #48
Elfwreck
Grand Sorcerer
Posts: 5,187
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
Quote:
Originally Posted by wayrad View Post
Do you use the page crop feature, or is there a better way?
There's a better way.

If your pages are the same size and layout, or close to it, you can save the text blocks you use, and load them on all the pages at once.

I have FineReader 7 Pro. How I'd do this:
-Go to a standard-looking page of your document
-Ctrl-E to place zones on the page. Delete unwanted text/image blocks.
-Shape wanted text block(s) to just a bit bigger than the main text of the page; give a bit of margin in case of pages that are shifted a bit to one side or the other.
-Image-->Save Blocks: save the blocks out (usually with the name of the book, so you remember which one it is).
-Select all pages in your book (or all besides the cover page & TOC, which may need different zoning)
-Image-->Load Blocks; apply to selected pages.

This will only work if your pages are substantially identical--but it'll save hours if they are. And it can be done to all pages, and then you can quickly flip through and look for any that need to be zoned differently.
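The save-and-reload idea above can be sketched in code. This is a minimal Python illustration only, not FineReader's API or its block file format: the `Block` class, `pad_block`, `apply_template`, the pixel values, and the 2% size tolerance are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """A rectangular text zone in pixels (hypothetical structure)."""
    left: int
    top: int
    right: int
    bottom: int


def pad_block(block, margin, page_w, page_h):
    """Grow a block by `margin` pixels on every side, clamped to the page."""
    return Block(
        max(0, block.left - margin),
        max(0, block.top - margin),
        min(page_w, block.right + margin),
        min(page_h, block.bottom + margin),
    )


def apply_template(template, page_sizes, ref_size, tolerance=0.02):
    """Give the template to every page whose size is close to the reference.

    Returns (zoned, needs_review): pages that received the template, and
    pages that differ too much in size and should be zoned by hand.
    """
    zoned, needs_review = {}, []
    for name, (w, h) in page_sizes.items():
        close = (abs(w - ref_size[0]) <= tolerance * ref_size[0]
                 and abs(h - ref_size[1]) <= tolerance * ref_size[1])
        if close:
            zoned[name] = template
        else:
            needs_review.append(name)
    return zoned, needs_review


# Assumed example values: one reference page and three scanned pages.
ref = (2480, 3508)
template = pad_block(Block(300, 400, 2180, 3100), margin=40,
                     page_w=ref[0], page_h=ref[1])
pages = {"p001": (2480, 3508), "p002": (2475, 3500), "cover": (2600, 3700)}
zoned, review = apply_template(template, pages, ref)
print(sorted(zoned), review)
```

The margin plays the same role as the "bit bigger than the main text" advice, and the review list corresponds to flipping through afterwards to find pages that need different zoning.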
Old 10-04-2009, 04:20 AM   #49
orion2001
Groupie
Posts: 162
Karma: 24658
Join Date: Sep 2009
Device: PRS-505
Quote:
Originally Posted by Elfwreck View Post
There's a better way.

If your pages are the same size and layout, or close to it, you can save the text blocks you use, and load them on all the pages at once.

I have FineReader 7 Pro. How I'd do this:
-Go to a standard-looking page of your document
-Ctrl-E to place zones on the page. Delete unwanted text/image blocks.
-Shape wanted text block(s) to just a bit bigger than the main text of the page; give a bit of margin in case of pages that are shifted a bit to one side or the other.
-Image-->Save Blocks: save the blocks out (usually with the name of the book, so you remember which one it is).
-Select all pages in your book (or all besides the cover page & TOC, which may need different zoning)
-Image-->Load Blocks; apply to selected pages.

This will only work if your pages are substantially identical--but it'll save hours if they are. And it can be done to all pages, and then you can quickly flip through and look for any that need to be zoned differently.
That is great advice! I have to go try this.
Old 10-04-2009, 06:33 AM   #50
lybrary
Member
Posts: 13
Karma: 10
Join Date: Sep 2009
Device: none
Folks, wonderful discussion. I have been scanning and converting books for 10 years and have personally converted several hundred books, approaching 1,000 works. We have used everything from simple flatbed scanners to two-camera rigs to the most sophisticated automatic page-turning robots. We do this for a living because we publish and sell these ebooks. Of course, we first get the OK from the author or publisher to do so; no copyright violations here at Lybrary.com.

Having done this for a long time I can share a couple of insights and tips:

1) ABBYY is the best OCR software as of today. This has been said here before, but I wanted to stress it. Of course, it is important that you spend time with it to learn all the little features, twists and tricks. I have been using the software since version 4.0 and it has been worth every penny. How you should use ABBYY really depends on the book you scan, so I can't make any general recommendations except to study all its features and experiment. When I started, I converted the same book 5 times to test various approaches.

2) These days PDF can compress scanned documents that have not been proofread pretty well. When exporting PDF from ABBYY, make sure to select 'Enable Mixed Raster Content'. This will bring down the file size by up to 10x. We recently converted a 900-page work (each page is letter sized) and the total size is only 90 MB, even though each page is a scanned image and we ran OCR without subsequent proofreading. That is merely 100 kB per page: still more than a fully converted text page, but not that much more. Another important tip here is to clean up your scanned pages. You want a white background as much as you can, no black borders, etc. All of this image content increases the file size.

3) If you don't want to bother with the scanning, you might want to look into scan services. There are companies that will scan a page for anywhere from 10 cents to 50 cents. In some cases this might be a better option than doing everything yourself.
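The figures in points 2 and 3 can be checked with a little arithmetic. A small Python sketch follows; the function names are made up for illustration, and applying the per-page price range to the 900-page example is an assumption for the sake of the calculation, not something reported above.

```python
def per_page_kb(total_mb, pages):
    """Average size per page in kB, taking 1 MB = 1000 kB."""
    return total_mb * 1000 / pages


def scan_cost_range(pages, low_cents=10, high_cents=50):
    """Return (lowest, highest) service cost in dollars for a page count."""
    return pages * low_cents / 100, pages * high_cents / 100


pages = 900
avg_kb = per_page_kb(90, pages)     # the 90 MB, 900-page example
low, high = scan_cost_range(pages)  # assumed: the same book sent to a service
print(f"{avg_kb:.0f} kB per page; scanning service: ${low:.0f} to ${high:.0f}")
# prints "100 kB per page; scanning service: $90 to $450"
```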

Here is one thought I am contemplating but I haven't found a good and workable solution for it: It has been stated here that if you buy a book you are free to prepare your own digital version of it. I agree with that interpretation of the copyright law. The next question is what if my friend bought the same book. Can I give him my digital copy of it (remember he has bought the same printed book)? Again my interpretation is yes because my friend has bought the same book and I can certainly share my own work of digitization with him. I couldn't do so with somebody who has not bought the book because that would be in clear violation of copyright law. Does anybody have an opinion on this legal question?

If the answer is yes, and one can do that without violating copyright, then it might be possible to pool the resources of people who like to convert their books and share the results with others who own the same book. This would eliminate duplicate work. The real problem here is how to ensure that only those who have bought the printed work have access to the digitized work. I have not yet found a good answer. But once I do, I would love to create such an exchange platform for converted, copyrighted books.

The only idea I have for checking that somebody has the book is to ask them to mail in the first page, or a part of it. This way it is clear that the person has the physical book. And since the page is now removed, nobody else can use that copy to claim that he owns the book. Comments?
Old 10-04-2009, 09:25 AM   #51
wayrad
Fanatic
Posts: 551
Karma: 1121392
Join Date: May 2008
Location: USA
Device: HTC One M8
Quote:
Originally Posted by Elfwreck View Post
There's a better way.

If your pages are the same size and layout, or close to it, you can save the text blocks you use, and load them on all the pages at once.

I have FineReader 7 Pro. How I'd do this:
-Go to a standard-looking page of your document
-Ctrl-E to place zones on the page. Delete unwanted text/image blocks.
-Shape wanted text block(s) to just a bit bigger than the main text of the page; give a bit of margin in case of pages that are shifted a bit to one side or the other.
-Image-->Save Blocks: save the blocks out (usually with the name of the book, so you remember which one it is).
-Select all pages in your book (or all besides the cover page & TOC, which may need different zoning)
-Image-->Load Blocks; apply to selected pages.

This will only work if your pages are substantially identical--but it'll save hours if they are. And it can be done to all pages, and then you can quickly flip through and look for any that need to be zoned differently.
Thanks! I will give it a try. The "save without headers or footers" thing is super fast, but AFAIK only works when you save to Word format. Knowing how to accomplish the task when saving to any format will be extremely useful if I ever change reading devices. Which will probably happen eventually, even though I have two backup Zodiacs...
Old 10-05-2009, 01:49 PM   #52
Elfwreck
Grand Sorcerer
Posts: 5,187
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
Quote:
Originally Posted by lybrary View Post
Here is one thought I am contemplating but I haven't found a good and workable solution for it: It has been stated here that if you buy a book you are free to prepare your own digital version of it. I agree with that interpretation of the copyright law. The next question is what if my friend bought the same book. Can I give him my digital copy of it (remember he has bought the same printed book)? Again my interpretation is yes because my friend has bought the same book and I can certainly share my own work of digitization with him. I couldn't do so with somebody who has not bought the book because that would be in clear violation of copyright law. Does anybody have an opinion on this legal question?
Not a lawyer. Copyfight fanatic.

Opinion: Copyright law is psychotic and takes no notice of common sense and reasonable prevention of unnecessary effort.

"Fair use" is not actually defined. It is described as having allowances for educational purposes, and parody, and de minimis use (except in music), and is acknowledged to cover other uses which have been checked against the four factors. There is no equation to use to decide if a particular use is, or is not, acceptable.

The practical side of things: If you scan & convert a book for your friend, nobody knows, and nobody cares. If you start a book club with converted versions of the Harry Potter books for "everyone who can prove ownership of the physical copy," which sounds like a very reasonable (which does not mean "legal") format-shifting option, you can bet you'll be facing a lawsuit as fast as Warner Brothers can draft the C&D order.

Will it be successful?

Pointless question. The real question is:

How much time, money & lawyer resources do you have to spend on this?
Old 10-06-2009, 05:32 AM   #53
athlonkmf
Guru
Posts: 714
Karma: 1014039
Join Date: May 2007
Device: Sony PRS-500, Sony PRS-505, Kindle 3, Sony PRS350, iPad 64GB
It's too bad that you can't use the power of the masses for these kinds of jobs. Proofreading... if only you could open a wiki for people to edit the pages...

Or even better, attach your scanned files to reCAPTCHA.
Old 10-07-2009, 12:20 PM   #54
=X=
Wizard
Posts: 3,671
Karma: 12205348
Join Date: Mar 2008
Device: Galaxy S, Nook w/CM7
@lybrary I really like this idea, the sharing of OCR'd books. I think a good way to validate is just to email a photo with an ID and the book. (Not crazy about having to deface a book just to share an OCR--bookstores might get a bit irritated when the first pages of all their books are missing.)

I do agree with Elfwreck, the real risk is not if you are in the right, the real risk is being sued.

I think a way to reduce risk would be to discuss over a forum but only share/verify on a personal level.
=X=
Old 10-07-2009, 12:41 PM   #55
Hellmark
Wizard
Posts: 2,592
Karma: 4290425
Join Date: Jun 2009
Location: Foristell, Missouri, USA
Device: Nokia N800, PRS-505, Nook STR Glowlight, Kindle 3, Kobo Libra 2
Quote:
Originally Posted by =X= View Post
@lybrary I really like this idea, the sharing of OCR'd books. I think a good way to validate is just to email a photo with an ID and the book. (Not crazy about having to deface a book just to share an OCR--bookstores might get a bit irritated when the first pages of all their books are missing.)

I do agree with Elfwreck, the real risk is not if you are in the right, the real risk is being sued.

I think a way to reduce risk would be to discuss over a forum but only share/verify on a personal level.
=X=
Photos can be faked. Even if the photo is genuine, who is to say that you didn't let your friends take pictures of your book (or vice versa)?
Old 10-07-2009, 12:44 PM   #56
Steven Lyle Jordan
Grand Sorcerer
Posts: 8,478
Karma: 5171130
Join Date: Jan 2006
Device: none
I'd like to share a tip that has improved my scan output quality, and minimized errors in final text, and it's not as bad as it sounds: Add a scan step to your process.

Specifically, use a good photocopier to create letter/A4-sized pages of your books. If your book page is smaller than letter/A4, set the copier to enlarge the copy to fit the page. That way, you get larger letters, clearer spaces and punctuation, making the OCR process easier. You can also take advantage of any copier image controls to improve text/background contrast on the pages, further improving character legibility.

The advantage of this is that you can then feed those letter/A4 sheets through a high-quality professional scanner... they are optimized for letter/A4 page processing, and most will give you 300-600DPI TIF image files.

I've done this in the past, typically taking 10-30 minutes to copy the pages of an average book, depending on the copier type. The rest of the process takes about as long, but if your scanner has an automatic feeder, it can scan 50-100 pages a minute, and save you even more time in the scan process. Not to mention generating fewer errors in OCR.
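The enlargement step can be worked out ahead of time. Below is a rough Python sketch under stated assumptions: the 5 x 8 inch book page is a made-up example, the function names are hypothetical, and the "samples per original inch" figure is only nominal, since a photocopier cannot add detail that the print does not contain.

```python
# Sheet sizes in inches; A4 is converted from 210 x 297 mm.
SHEETS_IN = {
    "letter": (8.5, 11.0),
    "a4": (210 / 25.4, 297 / 25.4),
}


def enlargement_factor(page_w, page_h, sheet="letter", margin=0.0):
    """Largest uniform scale at which the book page still fits the sheet."""
    sheet_w, sheet_h = SHEETS_IN[sheet]
    usable_w = sheet_w - 2 * margin
    usable_h = sheet_h - 2 * margin
    return min(usable_w / page_w, usable_h / page_h)


def samples_per_original_inch(scan_dpi, factor):
    """Nominal scanner samples per inch of the original, smaller page."""
    return scan_dpi * factor


# Assumed example: a 5 x 8 inch page copied onto letter, then scanned at 300 DPI.
factor = enlargement_factor(5.0, 8.0, sheet="letter")
nominal = samples_per_original_inch(300, factor)
print(f"copier setting: {factor * 100:.1f}%, nominal density: {nominal:.1f} per inch")
```

Taking the smaller of the two ratios keeps the whole page on the sheet; the height is the limiting dimension in this example.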

FYI: Sorry, I don't have the access to copiers and scanners that I used to, so I can't recommend brands...

Last edited by Steven Lyle Jordan; 10-07-2009 at 01:14 PM. Reason: Said "JPG" when I meant to say "TIF". Sorry!
Old 10-07-2009, 01:04 PM   #57
igorsk
Wizard
Posts: 3,442
Karma: 300001
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
Do not use JPEG files for scans; the JPEG format is ill-suited for images with sharp edges such as text or line art. It is best to use a lossless format such as TIFF or PNG (or PDF, as long as it's not using JPEG compression for images). Check the FineReader manual for more advice on how to get the best scans for OCR.
Another option is to use a digital camera; ABBYY has some tips about that too.
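One way to catch JPEG scans before they reach the OCR step is to look at the first bytes of each file. The following is a small Python sketch using only the standard library; the function names are invented for the example. It checks the container only, so it cannot tell whether a TIFF or PDF uses JPEG compression internally.

```python
from typing import Optional

# Well-known file signatures (magic numbers) at the start of each format.
SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"II*\x00", "tiff"),   # little-endian TIFF
    (b"MM\x00*", "tiff"),   # big-endian TIFF
    (b"%PDF-", "pdf"),
]


def sniff_format(header: bytes) -> Optional[str]:
    """Return a format name based on the leading bytes, or None if unknown."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return None


def is_jpeg_scan(header: bytes) -> bool:
    """True when the file is a JPEG container and worth rescanning losslessly."""
    return sniff_format(header) == "jpeg"


def sniff_file(path: str) -> Optional[str]:
    """Read just enough of a file to identify it."""
    with open(path, "rb") as fh:
        return sniff_format(fh.read(8))
```

For example, `is_jpeg_scan(b"\xff\xd8\xff\xe0")` is true, while a file starting with the PNG signature is not flagged.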
Old 10-07-2009, 01:13 PM   #58
Steven Lyle Jordan
Grand Sorcerer
Posts: 8,478
Karma: 5171130
Join Date: Jan 2006
Device: none
Quote:
Originally Posted by igorsk View Post
Do not use JPEG files for scans...
My bad... I did say JPG. TIF is the better format, though I have used JPG successfully.
Old 10-07-2009, 02:01 PM   #59
Shaggy
Wizard
Posts: 4,293
Karma: 529619
Join Date: May 2007
Device: iRex iLiad, DR800SG
Quote:
Originally Posted by lybrary View Post
Here is one thought I am contemplating but I haven't found a good and workable solution for it: It has been stated here that if you buy a book you are free to prepare your own digital version of it. I agree with that interpretation of the copyright law. The next question is what if my friend bought the same book. Can I give him my digital copy of it (remember he has bought the same printed book)? Again my interpretation is yes because my friend has bought the same book and I can certainly share my own work of digitization with him. I couldn't do so with somebody who has not bought the book because that would be in clear violation of copyright law. Does anybody have an opinion on this legal question?
No. Your friend may have a right to the digital version, but you are not authorized to distribute that material. If you gave your digital copy to your friend you would be committing copyright infringement, even though your friend is allowed to digitize their own version.

Further, while you are not allowed to give your digital copy to your friend, it's probably true that your friend can hire you to digitize his copy of the printed book. Even though the end result is exactly the same, this time it is probably legal.

Nobody claimed that current copyright law makes any sense.

Of course, the other question is would the copyright holder care or even know that you gave your friend a copy? Probably not. But, technically, it would be infringement.
Old 10-07-2009, 02:17 PM   #60
wayrad
Fanatic
Posts: 551
Karma: 1121392
Join Date: May 2008
Location: USA
Device: HTC One M8
Quote:
Originally Posted by igorsk View Post
Do not use JPEG files for scans; the JPEG format is ill-suited for images with sharp edges such as text or line art. It is best to use a lossless format such as TIFF or PNG (or PDF, as long as it's not using JPEG compression for images).
I've tried doing the same pages (mass market paperback text) as JPEG and TIFF, and the error count after OCR with FineReader wasn't significantly different. It may make more of a difference for illustrations.
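A comparison like this can be made repeatable by counting differences against a proofread copy of the same page. Here is a minimal Python sketch with the standard `difflib` module; the function name is hypothetical, and counting by diff opcodes is just one simple way to approximate an error count, not the method used above.

```python
import difflib


def char_errors(reference: str, ocr_text: str) -> int:
    """Approximate count of characters that differ from a proofread reference."""
    matcher = difflib.SequenceMatcher(None, reference, ocr_text, autojunk=False)
    errors = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Charge the longer side of each replaced, inserted, or deleted run.
            errors += max(i2 - i1, j2 - j1)
    return errors


def compare_runs(reference: str, jpeg_ocr: str, tiff_ocr: str) -> dict:
    """Error counts for OCR output from a JPEG scan and a TIFF scan of one page."""
    return {
        "jpeg": char_errors(reference, jpeg_ocr),
        "tiff": char_errors(reference, tiff_ocr),
    }
```

Running `compare_runs` over a handful of pages with the proofread text as the reference gives a per-format total that can be compared directly.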