Old 04-04-2014, 01:56 PM   #31
alanHd
Addict
Posts: 374
Karma: 1408579
Join Date: Jul 2012
Location: UK
Device: Kindle Touch, Ipod Touch, Ipad Air
My book was small enough to fit both pages on the bed.
I didn't lower the lid and I never pressed on the book. The scans came out great, but I don't know how much of that is down to my scanner being better than the norm, as I got it to scan my photos for restoration. The OCR gave me great results.

I did try with a book that is too big to scan both pages at once, and I gave up after a couple of test pages. Now I am debating whether to cut out the pages and feed them through the auto feed at work. I'm in two minds, as the book is fairly rare.
Old 04-04-2014, 03:43 PM   #32
rkomar
Wizard
Posts: 3,054
Karma: 18821071
Join Date: Oct 2010
Location: Sudbury, ON, Canada
Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633
Quote:
Originally Posted by alanHd View Post
My book was small enough to fit both pages on the bed.
I didn't lower the lid and I never pressed on the book. The scans came out great, but I don't know how much of that is down to my scanner being better than the norm, as I got it to scan my photos for restoration. The OCR gave me great results.

I did try with a book that is too big to scan both pages at once, and I gave up after a couple of test pages. Now I am debating whether to cut out the pages and feed them through the auto feed at work. I'm in two minds, as the book is fairly rare.
The problem with scanning both pages at once occurs with books where the print gets too close to the gutter. The characters get distorted on the part of the page that pulls up from the platen, and can get hidden in the shadows in the gutter. Some books have this problem, and some don't. The closer the characters get to the gutter, the more you have to squash the book down. Cutting the book is often the only way to get good scans for extreme cases.

P.S. For some trade paperbacks, you can use an iron at a low setting to soften the glue on the spine and pull the cover off. The remaining glue keeps the pages together, but the book is now flexible enough that you can open it wider and reduce the gutter. Afterwards, you can iron the cover back on. I've done this with dozens of books, and it works great. I haven't had any success doing this with the MMPB books I've tried, though. The glue is usually too hard, and the book falls apart if you manage to get the cover off.
Old 04-04-2014, 05:20 PM   #33
Hamlet53
Nameless Being
 
Quote:
Originally Posted by rkomar View Post
The problem with scanning both pages at once occurs with books where the print gets too close to the gutter. The characters get distorted on the part of the page that pulls up from the platen, and can get hidden in the shadows in the gutter. Some books have this problem, and some don't. The closer the characters get to the gutter, the more you have to squash the book down. Cutting the book is often the only way to get good scans for extreme cases.

Yes, that's one of the problems I encountered when I tried to scan an intact book with a flatbed scanner. That, and if the book orientation was even slightly off, the OCR software I was using (not the ABBYY FineReader that I am using now) would do a lousy job converting to text.

That led me to purchase the scanner I have now, with automatic sheet feed and double-sided scanning. I have encountered one problem, though, that maybe people here can offer help on. When the page includes page numbers and either the chapter or book title, these are incorporated into the text. Is there any way to prevent this? As it is, I have to edit them out when making other corrections.
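If the text has already been OCR'd with the page numbers and titles in it, one possible way to strip them afterwards is a small script. The following is only a rough sketch, not a feature of any OCR package: it assumes the exported text separates pages with form-feed characters, and the function name `strip_running_heads` and the list of running heads passed to it are made up for illustration.

```python
import re

PAGE_BREAK = "\f"  # assumption: the OCR export marks page breaks with form feeds


def strip_running_heads(text, running_heads):
    """Drop page numbers and known running heads from the top and bottom of each page."""
    heads = {h.strip().lower() for h in running_heads}

    def is_furniture(line):
        raw = line.strip().lower()
        # Also compare with any leading or trailing page number removed,
        # so "THE LOST WORLD 12" matches the head "The Lost World".
        core = re.sub(r"^\d+\s+|\s+\d+$", "", raw).strip()
        return raw.isdigit() or raw in heads or core in heads

    cleaned_pages = []
    for page in text.split(PAGE_BREAK):
        lines = page.strip().splitlines()
        if lines and is_furniture(lines[0]):
            lines = lines[1:]
        if lines and is_furniture(lines[-1]):
            lines = lines[:-1]
        cleaned_pages.append("\n".join(lines))
    return "\n".join(cleaned_pages)
```

Only the first and last line of each page are examined, so a body sentence that merely ends in a number is left alone. A book whose running heads vary from chapter to chapter would need every variant listed.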
Old 04-04-2014, 06:35 PM   #34
Kumabjorn
Basculocolpic
Posts: 4,356
Karma: 20181319
Join Date: Jul 2010
Location: Sweden
Device: Kindle 3 WiFi, Kindle 4SO, Kindle for Android, Sony PRS-350 and PRS-T1
With ABBYY FineReader you can do a test scan and then set the actual scan area so that headers and footers fall outside it.
Those who have double-sided sheet-fed scanners: what brand, and how much? Is there a huge spread up to the more advanced scanners?
Old 04-05-2014, 05:12 AM   #35
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by Kumabjorn View Post
With ABBYY Fine Reader you can do a test scan and then set an actual scan area that would leave headers and footers outside the scanning area.
Those who have double side paper feeding scanners, what brand and how much? Is there a huge spread to the more advanced scanners?
The Fujitsu ScanSnap has a very good reputation, and is relatively inexpensive (as these things go).
Old 04-05-2014, 06:49 AM   #36
Hamlet53
Nameless Being
 
I purchased this scanner: Fujitsu ScanSnap S1300i Instant PDF Sheet-Fed Mobile Document Scanner. It's worked very well for me.
Old 04-06-2014, 01:10 PM   #37
alecE
Evangelist
Posts: 412
Karma: 546196
Join Date: Mar 2009
Location: UK canal boat
Device: sony prs505, prs650, kobo Glo HD liseuses
I buy old paperbacks specifically for destructive ebook creation. Covers are removed and the book is split into the publisher's signatures. Then the glue/gutter is removed. I use a Canon P-150 scanner, which feeds automatically and does both sides. ABBYY FineReader works extremely well for the OCR process. However, OCR is never a complete success: words hyphenated over two pages, phrases in italics, poor quality original typescript etc. all require an extended bout of editing.
I use Notepad++ for the basic editing: converting the text to HTML, amending quotation marks, correcting capitalisation and paragraphing (my regex skills are s l o w l y improving).
Finally, Sigil for the ebook creation, application of CSS, spell check etc.

Over the last 70+ books I've treated like this, my average time to completion has been just short of 10 hours. However, a thorough read through in 'recreational' mode will then reveal all the little things I've missed - so another hour or so in Sigil after the first read-through.
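
The kind of regex cleanup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Notepad++ expressions actually used in this workflow: the three function names are invented here, the input is assumed to be plain text with blank lines between paragraphs, and the hyphen rule will wrongly join genuine compounds such as "well-known" that happen to break at a line end, so the output still needs proofreading.

```python
import html
import re


def rejoin_hyphenated(text):
    # "exam-\nple" -> "example"; only joins when both sides are lowercase letters.
    # Caveat: a real compound split at the hyphen ("well-\nknown") is joined too.
    return re.sub(r"([a-z])-\n\s*([a-z])", r"\1\2", text)


def curl_quotes(text):
    # A quote at the start of the text, or after whitespace or an opening
    # bracket, is treated as an opening quote; every other one as closing.
    text = re.sub(r'(^|[\s(\[])"', "\\1\u201c", text)
    return text.replace('"', "\u201d")


def to_html_paragraphs(text):
    # Blank lines separate paragraphs; line breaks inside a paragraph are unwrapped.
    blocks = re.split(r"\n\s*\n", text.strip())
    return "\n".join(
        "<p>" + html.escape(" ".join(block.split()), quote=False) + "</p>"
        for block in blocks
    )
```

The order matters: hyphenated words have to be rejoined before the paragraphs are unwrapped, because unwrapping removes the line breaks that the hyphen rule looks for.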
Old 04-06-2014, 05:15 PM   #38
jhowell
Grand Sorcerer
Posts: 7,073
Karma: 91577715
Join Date: Nov 2011
Location: Charlottesville, VA
Device: Kindles
I am having trouble understanding the mindset of wanting to spend the time to scan, OCR, format, and proofread a book in order to convert it from paper to e-book format. I can see the utility of doing this for public domain books where the results may be of use to a large number of people. But I get the impression that many people are making this effort for in-copyright works for their own personal use, where the number of readers of the result may be only one, the person doing the work, who has probably already read the book in question.

Maybe it is because I have more years behind me than ahead, but I would rather spend my time reading instead of hoarding books.

Last edited by jhowell; 04-06-2014 at 05:21 PM.
Old 04-06-2014, 07:37 PM   #39
shalym
Wizard
Posts: 3,058
Karma: 54671821
Join Date: Feb 2012
Location: New England
Device: PW 1, 2, 3, Voyage, Oasis 2 & 3, Fires, Aura HD, iPad
Quote:
Originally Posted by jhowell View Post
I am having trouble understanding the mindset of wanting to spend the time to scan, OCR, format, and proofread a book in order to convert it from paper to e-book format. I can see the utility of doing this for public domain books where the results may be of use to a large number of people. But I get the impression that many people are making this effort for in-copyright works for their own personal use, where the number of readers of the result may be only one, the person doing the work, who has probably already read the book in question.

Maybe it is because I have more years behind me than ahead, but I would rather spend my time reading instead of hoarding books.
For me, it makes sense to do because I have horrible eyesight and arthritis, so reading paper books is difficult and painful. I haven't done this yet, but I am getting more and more tempted every day. There are some favorite books of mine that haven't been made into ebooks yet, and eventually I will make the leap and do it. I may just use 1DollarScan, though--I'm not sure if the cost of buying the double-sided scanner would be worth it for me.

Shari
Old 04-06-2014, 08:44 PM   #40
Marcy
Guru
Posts: 897
Karma: 950683
Join Date: Oct 2009
Device: Kobo Libra2
Quote:
Originally Posted by jhowell View Post
I am having trouble understanding the mindset of wanting to spend the time to scan, OCR, format, and proofread a book in order to convert it from paper to e-book format. I can see the utility of doing this for public domain books where the results may be of use to a large number of people. But I get the impression that many people are making this effort for in-copyright works for their own personal use, where the number of readers of the result may be only one, the person doing the work, who has probably already read the book in question.

Maybe it is because I have more years behind me than ahead, but I would rather spend my time reading instead of hoarding books.
I have a paper book I bought that I am unable to read because the type is too small. I want to be able to read it.

Since it is for my own personal use, I'm not concerned with proofing it to be exactly like the original. I can live with a misplaced comma or italics.
Old 04-06-2014, 08:48 PM   #41
Hamlet53
Nameless Being
 
Quote:
Originally Posted by shalym View Post
For me, it makes sense to do because I have horrible eyesight and arthritis, so reading paper books is difficult and painful. I haven't done this yet, but I am getting more and more tempted every day. There are some favorite books of mine that haven't been made into ebooks yet, and eventually I will make the leap and do it. I may just use 1 dollar scan, though--I'm not sure if the cost of buying the double sided scanner would be worth it for me.

Shari
^This (the eyesight part).

As well, it's just converting to a different type of hoarding. I've so many old books, and those that are in the worst shape, with pages coming loose from the binding, paper yellowing and becoming flimsy, things like that, tend to be some of my previously read favorites. Converting to an ebook lets me free up space in my house and gives me a permanent version of the book. Reading again as part of the proofing process is just a bonus.

I do pick books that are not available as an ebook, are old enough that it is unlikely that this will change anytime soon, but not so old that entry into the public domain will happen anytime soon.
Old 04-07-2014, 04:09 AM   #42
alecE
Evangelist
Quote:
Originally Posted by jhowell View Post
I am having trouble understanding the mindset of wanting to spend the time to scan, OCR, format, and proofread a book in order to convert it from paper to e-book format. I can see the utility of doing this for public domain books where the results may be of use to a large number of people. But I get the impression that many people are making this effort for in-copyright works for their own personal use, where the number of readers of the result may be only one, the person doing the work, who has probably already read the book in question.

Maybe it is because I have more years behind me than ahead, but I would rather spend my time reading instead of hoarding books.
I have a severe space problem - an unalterable 27 feet of bookshelves (at 90%+ capacity) and no option to extend, use the floor etc. Many books that I convert are my own old, yellowing paperbacks that are falling apart, or copies, bought specifically for conversion, of titles that I had to discard years ago.
Yes, I'd love to spend more time reading, but until the day when everything I want to read is digital, I'll just keep on with the format shifting. (And yes, I do get a passing creative buzz from the work as well).
Old 04-07-2014, 08:13 AM   #43
jhowell
Grand Sorcerer
Thank you all for the feedback (and for not being offended by my heavy-handed post). I hadn't considered well enough the reasons why someone might want to scan their own paper books. Improving readability and limited physical storage space are two very good ones.

By the way, Open Library has a large collection of scanned books, many of which have not yet been published in e-book form. Participating libraries can be found here. PDF (scanned images) and EPUB (OCR) versions, both protected using Adobe DRM, and DAISY accessible versions are available.

Last edited by jhowell; 04-07-2014 at 10:55 AM.
Old 04-08-2014, 05:41 AM   #44
cadele
Addict
Posts: 372
Karma: 3710372
Join Date: Feb 2010
Device: Kindles, Sony 650
Quote:
Originally Posted by Hamlet53 View Post
I purchased this scanner: Fujitsu ScanSnap S1300i Instant PDF Sheet-Fed Mobile Document Scanner It's worked very well for me.
Oh, I'm very tempted by this. I am currently using the flatbed scanner at work during my lunch break but it's a bit of a pain.

Are you cutting off the spine of the books, and if so how "neat" do you have to be? I just wonder if the scanner can handle slightly ragged edges.

Thanks!
Old 04-08-2014, 05:48 AM   #45
HarryT
eBook Enthusiast
Quote:
Originally Posted by cadele View Post
Are you cutting off the spine of the books, and if so how "neat" do you have to be? I just wonder if the scanner can handle slightly ragged edges.
Best thing is to take the book to your local printer, and ask them to cut off the spine with their guillotine (mine is happy to do it free of charge). Ragged edges to the page will, more likely than not, cause misfeeds in the sheet feeder - you'll get multiple pages going through together. Shouldn't be a problem if you're feeding the pages one at a time, but that rather defeats the object of the exercise.

If you have no access to a guillotine, I'd suggest cutting the pages with a very sharp knife - a scalpel or similar - so you get a clean edge.

Last edited by HarryT; 04-08-2014 at 05:56 AM.