06-07-2013, 10:25 PM | #16 |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2012
Device: iPad 4
|
OK, I ran it through ABBYY and then saved the files as a Word doc. Now everything is just centered. Changing the alignment doesn't work either; it's all centered.
BTW, I do like ABBYY much better than PaperPort. |
06-08-2013, 01:38 AM | #17 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
You can save a Word document from ABBYY in several ways. I usually pick 'workable copy', as I think it is called.
Intelligent S&R means smart search-and-replace actions you can create, usually with wildcards. They closely resemble RegEx. |
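For anyone who has not met wildcard search and replace before, here is a minimal sketch of the idea using Python's regular expressions. It is not the add-in or ABBYY feature mentioned above; the rule list and the function name `clean_ocr_text` are made up for illustration, and the three rules are only examples of typical OCR clean-ups.

```python
import re

# Hypothetical clean-up rules: (pattern, replacement) pairs applied in order.
RULES = [
    (re.compile(r"(\w)-\n(\w)"), r"\1\2"),     # rejoin words hyphenated across a line break
    (re.compile(r"[ \t]{2,}"), " "),           # collapse runs of spaces or tabs
    (re.compile(r"[ \t]+([,.;:!?])"), r"\1"),  # drop a stray space before punctuation
]


def clean_ocr_text(text: str) -> str:
    """Apply every rule in RULES to the text, one after another."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


if __name__ == "__main__":
    print(clean_ocr_text("The quick  brown fox , exam-\nple ."))
```

One caveat: the first rule also joins genuine compounds such as "well-known" if they happen to break at the end of a line, so results like that still need a human check.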
06-12-2013, 03:24 AM | #18 | |
Junior Member
Posts: 7
Karma: 10
Join Date: Sep 2012
Device: iPad 4
|
Quote:
I didn't understand a word you said about S&R BUT... OMG... what a difference using ABBYY has made. I love it! Minimal mistakes so far. I opened the book in Microsoft Word from ABBYY and created a table of contents from there. I transferred the book to my iPad and used PerfectReader/Stanza/iBooks to open the PDF. When I click the icons to view the contents, it says there are none. Yet the second page is the table of contents and is very much 'clickable'. But if I want to go from chapter to chapter, I have to go back to the beginning of the book and click from there. Any way around this? |
|
06-12-2013, 03:43 AM | #19 |
Nameless Being
|
The way I do it is:
1. Scan.
2. OCR to convert the images to a Word file.
3. Cut and paste the entire Word document into Notepad. This gets rid of all the formatting. I use ABBYY FineReader too, and the OCR package knows the difference between text and page headings etc. It does page headings/page numbers as headers/footers in Word, so copy/pasting to Notepad gets rid of them all quickly.
4. Cut and paste back from Notepad into a new Word document.
5. Run spellchecker (as has been commented above, this gets rid of the repeated and obvious errors).
6. Insert page breaks where chapter breaks are supposed to be.
7. Import the Word document into Calibre and convert to epub.
8. Import the epub into Jutoh (which is the epub editing software I use), and go through the whole thing word-by-word fine tuning, adding in italics, links, and any other stuff.
9. Create a new epub.
10. Delete the original epub from Calibre, and import the new, corrected version into Calibre to replace it. Sort out the metadata (author, series etc).
11. Done.
There are lots of more knowledgeable people on here than I am, who understand about code and things like that, so their way might be better than mine.
My ABBYY FineReader is version 9; it came packaged with the scanner I use. It's been updated since, but what I've got works pretty well.
And as far as I'm concerned, S&R means 'search and rescue'. Intelligent S&R actions presumably wear a lifejacket and keep your control centre updated with where you are and how it's going? |
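Finding the chapter breaks for step 6 can be partly automated once the text is in plain form. The following is a rough sketch only, assuming the book's headings sit on their own line and look like "Chapter 12" or "CHAPTER IV"; the pattern and the function name `find_chapter_lines` are illustrative and would need adjusting to the book at hand.

```python
import re

# Assumed heading style: a line starting with "Chapter" followed by a number
# or a Roman numeral, in any letter case.
CHAPTER_RE = re.compile(r"^\s*chapter\s+(\d+|[ivxlcdm]+)\b.*$", re.IGNORECASE)


def find_chapter_lines(text: str) -> list[int]:
    """Return the 1-based numbers of lines that look like chapter headings."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if CHAPTER_RE.match(line)
    ]


if __name__ == "__main__":
    sample = "Title\nChapter 1\nSome text about a chapter.\nCHAPTER II\nMore text."
    print(find_chapter_lines(sample))
```

A body line that happens to start with "Chapter 3 of the report" would be flagged as well, so the list is a set of candidates to check by eye, not a final answer.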
06-12-2013, 04:50 AM | #20 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
I use the following method:
1. Scan the book.
2. Run through ABBYY, save as DOCX, HTML (for images and to see if text is seen as image by mistake) and PDF/A (to simplify searching the original scan).
3. Start Word, load the DOCX and run my add-in (first procedures) to solve/fix a large number of OCR issues.
4. Before the final two steps, I also run a spellcheck.
5. Perform the conversion to HTML and generate the basic ePUB.
6. Make final touches to the ePUB via Sigil. Usually that is somewhat more complex formatting and the TOC. |
06-16-2013, 01:24 AM | #21 |
Junior Member
Posts: 2
Karma: 10
Join Date: May 2013
Device: Ipad
|
Great tips everyone... My workflow is similar to Jen_Smith's.
I use ABBYY FineReader Express as it is the only option available for Mac. Unfortunately, it is limited in function compared to the PC version. I scan 2 pages of the book in one go. I recently started converting to HTML and found that there are fewer conversion issues than when converting to text. When converted, the HTML looks the way I scanned it: 2 pages side by side. Once converted, I copy everything and paste it into my text editor on the Mac. The downside is that all the formatting (italics etc.) is lost. I have MS Word but I can't figure it out. How can I open the HTML file in MS Word? I have tried to copy and paste into MS Word, but it pastes the way I scanned it. How do I get it into one column instead of 2? I am trying to cut one step from the process by not having to find and add in italics. Any help is appreciated. Thanks. |
09-01-2013, 11:18 PM | #22 |
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2013
Device: Kindle Paperwhite, Nexus 7
|
Hello all:
First post here. I recently sent one of my favorite books to 1DollarScan and got the PDF with OCR back from them. I have run that through ABBYY and it looks fantastic! I cleaned up a bunch of issues in ABBYY and saved the result to HTML. From there I ran the HTML file through Calibre, and while the results are good, I am getting short lines of text, as one of the other users mentions. I would love to get the book as close as possible to the original. That being said, once I have run the PDF through ABBYY, should I save it to OpenOffice, clean up the formatting issues there, save that as an HTML file and then bring it into Calibre? Thanks a lot for any assistance on this. I am a total noob at this but spent the entire weekend experimenting with it. |
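Short lines like these usually mean every printed line ended up as a hard line break. If the text is available as plain text, the lines can be rejoined into paragraphs with a few lines of Python. This is a minimal sketch, assuming paragraphs are separated by at least one blank line; the function name `unwrap_paragraphs` is made up for illustration, and it does not touch HTML markup.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines into one line per paragraph."""
    # Paragraphs are assumed to be separated by one or more blank lines.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    joined = []
    for paragraph in paragraphs:
        # Strip each line and glue the non-empty ones together with single spaces.
        lines = [line.strip() for line in paragraph.splitlines() if line.strip()]
        joined.append(" ".join(lines))
    return "\n\n".join(joined)


if __name__ == "__main__":
    wrapped = "It was a dark\nand stormy night.\n\nThe rain fell\nin torrents."
    print(unwrap_paragraphs(wrapped))
```

If the OCR output has no blank lines between paragraphs, this would merge the whole book into one block, so check a page or two of the source first.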
09-01-2013, 11:29 PM | #23 | |
Bookaholic
Posts: 14,391
Karma: 54969924
Join Date: Oct 2007
Location: Minnesota
Device: iPad Mini 4, AuraHD, iPhone XR +
|
Quote:
|
|
09-02-2013, 12:26 AM | #24 | |
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2013
Device: Kindle Paperwhite, Nexus 7
|
Quote:
Can't I just go from OpenOffice Writer (or MS Word) directly to HTML and then Calibre, or will the formatting still go weird on me? Thanks. |
|
09-02-2013, 04:13 AM | #25 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
It will probably be weird. Note that most makers create a MOBI via an ePUB...
|
09-02-2013, 11:18 AM | #26 |
Junior Member
Posts: 3
Karma: 10
Join Date: Sep 2013
Device: Kindle Paperwhite, Nexus 7
|
I just did a quick test by saving to OpenOffice Writer from ABBYY and then running that into Calibre. I edited "some" pages in Writer first, I think the first 9 pages. A really big difference. I haven't compared them side by side, and I'm off to bed soon (I work nights), but I just wanted to report that the formatting is really good (not 100%, more like 80%). It might even be good enough for my tastes. Will keep you informed when I'm back at work and have more time to read.
|
|