Old 09-11-2010, 02:38 AM   #1
Kumabjorn
Basculocolpic
Posts: 4,356
Karma: 20181319
Join Date: Jul 2010
Location: Sweden
Device: Kindle 3 WiFi, Kindle 4SO, Kindle for Android, Sony PRS-350 and PRS-T1
Scanning project

Kindle 3 owner.
I have just started to dabble in Calibre; in other words, I'm a complete newbie.

I have a bunch of P-books and thought that I could have a multi-year project in turning them into E-books using a scanner and Calibre. Basically I would do two or three books a week.

Here is my problem: P-books have headers and page numbers. I realize that I can select scanning zones in the scanning software; that, however, is fairly time-consuming. I would prefer to set up a single zone that works for every page and then have the headers, footers and page numbers omitted in the conversion process.

Is it possible to do that inside Calibre?
Old 09-11-2010, 02:58 AM   #2
Lady Fitzgerald
Wizard
Posts: 2,013
Karma: 251649
Join Date: Apr 2010
Location: Tempe, AZ, USA, Earth
Device: JetBook Lite (away from home) + 1 spare, 32" TV (at home)
I don't know the answer to your question but have you checked the Workshop forum under E-Book Formats in MobileRead? There has been a lot of discussion on scanning p-books to make e-books. You might find something there or someone might be able to help you.
Old 09-11-2010, 05:27 AM   #3
itimpi
Wizard
Posts: 4,552
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
Calibre will not be any help with the OCR side of this process, i.e. turning the images into text.

Calibre does have a facility for defining regular expressions that match text (typically headers and/or footers) to be omitted when converting a book. How well that works in your case is likely to depend on the quality of the scanning/OCR process: the text has to be predictable enough that a regular expression can be written to match the header/footer text.
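
To give a rough idea of what such an expression looks like, here is a minimal standalone Python sketch. It is not Calibre's own code, and the patterns are purely hypothetical: it assumes the OCR output puts each running header and each page number on its own line, and that the header is the book title (here the made-up "The Example Title"), optionally with a page number next to it. Real OCR output will need patterns tuned to the book in question.

```python
import re

# Hypothetical patterns; real ones depend on the book and on the OCR output.
# A line that contains nothing but a 1-4 digit page number.
PAGE_NUMBER = re.compile(r"^\s*\d{1,4}\s*$")
# A running header: the (made-up) book title, optionally with a page
# number before or after it, alone on a line.
RUNNING_HEADER = re.compile(
    r"^\s*(?:\d{1,4}\s+)?THE EXAMPLE TITLE(?:\s+\d{1,4})?\s*$",
    re.IGNORECASE,
)


def strip_headers_and_page_numbers(text: str) -> str:
    """Drop every line that looks like a page number or running header."""
    kept = []
    for line in text.splitlines():
        if PAGE_NUMBER.match(line) or RUNNING_HEADER.match(line):
            continue
        kept.append(line)
    return "\n".join(kept)


sample = (
    "12 The Example Title\n"
    "It was a dark night.\n"
    "The rain fell.\n"
    "13\n"
    "More text here."
)
cleaned = strip_headers_and_page_numbers(sample)
print(cleaned)
```

The same kind of pattern is what you would enter in a conversion tool that accepts a header/footer regular expression. If the OCR misreads the header now and then (say "Tbe Example Title"), the pattern stops matching, which is why predictable OCR output matters.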
Old 09-11-2010, 06:00 AM   #4
Kumabjorn
Basculocolpic
Posts: 4,356
Karma: 20181319
Join Date: Jul 2010
Location: Sweden
Device: Kindle 3 WiFi, Kindle 4SO, Kindle for Android, Sony PRS-350 and PRS-T1
Quote:
Originally Posted by Lady Fitzgerald
I don't know the answer to your question but have you checked the Workshop forum under E-Book Formats in MobileRead? There has been a lot of discussion on scanning p-books to make e-books. You might find something there or someone might be able to help you.
Thanks for that tip. I noticed that you and Iain have a fairly advanced project going. My aim is to output them in MOBI so I can have them as regular books on the Kindle. I need to look into one of those book guillotines you seem to be using. If I can get the book into digital format I don't really care if I have to destroy the paper version.
Old 09-11-2010, 08:34 AM   #5
wayrad
Fanatic
Posts: 551
Karma: 1121392
Join Date: May 2008
Location: USA
Device: HTC One M8
What works for me is to OCR with FineReader 9.0 and then save to Word. There's an option buried in FR's Word save options to omit headers/footers. Then I convert the Word file to whatever format I ultimately want. It works for me, although I don't know whether it would fit with anyone else's workflow.

Last edited by wayrad; 09-11-2010 at 08:37 AM.
Old 09-11-2010, 09:02 AM   #6
Lady Fitzgerald
Wizard
Posts: 2,013
Karma: 251649
Join Date: Apr 2010
Location: Tempe, AZ, USA, Earth
Device: JetBook Lite (away from home) + 1 spare, 32" TV (at home)
Quote:
Originally Posted by Kumabjorn
Thanks for that tip. I noticed that you and Iain have a fairly advanced project going. My aim is to output them in MOBI so I can have them as regular books on the Kindle. I need to look into one of those book guillotines you seem to be using. If I can get the book into digital format I don't really care if I have to destroy the paper version.
Iain has the advanced project going because he is developing new software to control the scanning, OCR, and much of the post-OCR process with little or no intervention from the user, and he hopes to market the software (he is a software developer by trade so he probably can pull it off). My project is merely massive (at least 1200 books). I don't have time to deal with OCR and the post-OCR editing so I'm just settling for scanning my books to PDF and using an e-book reader that is capable of making most of them easy to read (the JetBook Lite has worked best for me and still is small enough to fit in my purse). Larger books that have text too small to easily read on my JBL I can read at home on my TV screen (patched into my computer). It's a compromise but it works for me.

Since you don't mind destroying the paper books, a guillotine is the way to go. The kind Iain has is the same as the first one I had (the one he has is not the same as the one pictured in his blog unless he changed it recently). Mine broke after about 300-400 books averaging 1" in thickness and, when I found the guillotine I bought was probably a counterfeit of the original design, I jumped through a few hoops and was able to get a full refund. I recently replaced it with a better designed one that is easier and faster to use and seems to be much better made. It's a Perfect G12 Pro. It cost about 50% more than the genuine version of the first guillotine I got would have cost but it is well worth it. I've had mine only three days and have cut only 32 books with it so time will tell how well it will hold up.

There was a bit of a learning curve in finding out the best way to adjust the fence to easily and safely position the book properly for cutting off the spine. I shared with Iain via PMs how to do it and he is now using a variation of my method. Since there seems to be an interest, I probably should post a copy of that on the Workshop forum.

How were you planning on scanning your books after cutting off the spines?

Last edited by Lady Fitzgerald; 09-11-2010 at 09:06 AM.
Old 09-11-2010, 09:31 AM   #7
Kumabjorn
Basculocolpic
Posts: 4,356
Karma: 20181319
Join Date: Jul 2010
Location: Sweden
Device: Kindle 3 WiFi, Kindle 4SO, Kindle for Android, Sony PRS-350 and PRS-T1

That is one serious guillotine! It could take off a hand unless you are careful. I fear I might end up as a cautionary tale on one of those ER doctor forums.
I have two flatbed scanners; one is an old SCSI HP scanner with ADF. I thought that could be set up for specialty use for a book scanning project. I also have OmniPage Pro 12, which should be good enough for OCR needs. Since this is limited to personal use, a few misreads shouldn't be a major problem.
Old 09-11-2010, 10:19 AM   #8
Lady Fitzgerald
Wizard
Posts: 2,013
Karma: 251649
Join Date: Apr 2010
Location: Tempe, AZ, USA, Earth
Device: JetBook Lite (away from home) + 1 spare, 32" TV (at home)
Quote:
Originally Posted by Kumabjorn

That is one serious guillotine! It could take off a hand unless you are careful. I fear I might end up as a cautionary tale on one of those ER doctor forums.
I have two flatbed scanners; one is an old SCSI HP scanner with ADF. I thought that could be set up for specialty use for a book scanning project. I also have OmniPage Pro 12, which should be good enough for OCR needs. Since this is limited to personal use, a few misreads shouldn't be a major problem.
Oh yes, it wouldn't be difficult to lose body parts to the guillotine. In fact, I had to go to the emergency room before I cut my first book on my first guillotine because I dragged my thumb against the raised and locked blade of the cutter and I couldn't stop the bleeding (I'm on blood thinners so that didn't help). Fortunately, the thumbnail was what took the worst of the damage and most of it has grown back. It should finish growing back in a couple of months. The tip of my thumb is still a bit numb although it's not as tender as it was at first. We old folks don't heal very fast.

After I tried parting company with my thumb, I worked out a procedure using shims that allowed me to quickly and accurately position the books for cutting and still keep my hands far away from the blade. It's worked out well so I won't need to change my name to Frodo.
Old 09-11-2010, 10:49 AM   #9
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Not sure if you're aware of some of the alternatives to slicing up the book:
http://diybookscanner.org/
http://bkrpr.org/doku.php

Seems faster, less chance of mixed up pages, and also non-destructive... A lot of software/workflow discussions on the first site as well.
Old 09-11-2010, 11:31 AM   #10
Lady Fitzgerald
Wizard
Posts: 2,013
Karma: 251649
Join Date: Apr 2010
Location: Tempe, AZ, USA, Earth
Device: JetBook Lite (away from home) + 1 spare, 32" TV (at home)
Quote:
Originally Posted by ldolse
Not sure if you're aware of some of the alternatives to slicing up the book:
http://diybookscanner.org/
http://bkrpr.org/doku.php

Seems faster, less chance of mixed up pages, and also non-destructive... A lot of software/workflow discussions on the first site as well.
While those do have the advantage of not destroying a book, I fail to see how they can be faster than a good ADF scanner. I haven't had any problems with mixing up pages. In my case, I have to get rid of the books due to a lack of space in my future home, but I don't want to give them up completely. Destroying them is not a problem for me since, once I have a copy of them, I can't give them away or sell them without violating the copyright.