#1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2013
Device: none
|
Converting from PDF to ePub using Abbyy Fine Reader
Hi Everyone,
I am new to the forums and am not sure if this is where I should post a question about converting a PDF to an ePub (so please bear with me).

At the moment I am using ABBYY FineReader (v11) to convert "The Tibetan Book of Living & Dying" from PDF to ePub. The process is relatively straightforward, but I am stuck on a few points:

1) Page 2 shows a little yellow exclamation point in the bottom right-hand corner of the page thumbnail, and it then goes on to state "Page Not Recognized". [Image violates guidelines for size - MODERATOR] I am wondering how to overcome this issue.
2) Whilst the PDF includes the cover of the book, the cover is no longer present when I export to ePub. Once again the "cover page" states "Page Not Recognized", but this time there is no yellow exclamation point next to the page thumbnail.
3) Even when I go into properties and set the Author, as pictured below, [Image violates guidelines for size - MODERATOR] the Author information is not retained when I export to ePub format.

If anybody can offer any help with any of these issues it will be greatly appreciated.

Kind Regards,
Davo

Last edited by Dr. Drib; 02-08-2015 at 12:08 PM. Reason: Put oversize graphics in spoilers |
#2 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
The resulting ePUB is not a good start. The document will contain a lot of errors and mistakes due to the OCR process.
You will be better off choosing another export format and cleaning the source before creating the ePUB, or cleaning the ePUB itself. |
#3 |
Addict
Posts: 263
Karma: 1492476
Join Date: Jun 2012
Location: Scotland
Device: Kindle
|
You could try using Calibre, which has a number of different import formats (including PDF) and also outputs to most common e-publishing formats including EPUB and Mobi.
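As a rough sketch of what this could look like with Calibre's command-line converter `ebook-convert`, the call can be assembled and run from Python. This assumes Calibre is installed and on the PATH; the file names, title and author passed in are placeholders, and the helper names are made up for this example. Calibre does not do OCR, so this only helps when the PDF already has a text layer.

```python
import shutil
import subprocess


def build_convert_command(pdf_path, epub_path, title=None, author=None, cover=None):
    """Assemble an ebook-convert command line (Calibre's CLI converter)."""
    cmd = ["ebook-convert", pdf_path, epub_path]
    if title:
        cmd += ["--title", title]
    if author:
        cmd += ["--authors", author]
    if cover:
        cmd += ["--cover", cover]  # path to an image file used as the cover
    return cmd


def convert(pdf_path, epub_path, **metadata):
    """Run the conversion; returns False if ebook-convert is not available."""
    if shutil.which("ebook-convert") is None:
        return False  # Calibre is not installed or not on PATH
    subprocess.run(build_convert_command(pdf_path, epub_path, **metadata), check=True)
    return True
```

Passing the author and cover on the command line sidesteps the missing-metadata problem from the first post, since they are set at conversion time rather than carried over from the PDF.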
|
#4 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
This book is available in Kindle format from Amazon. That might be a better starting place. It is also most definitely in copyright, being published in 1994, so distributing the results of your work would not be legal in any country.
The source book for this book, the Tibetan Book of the Dead, though published in 1927, is not out of copyright even in Canada, since the author died in 1965. |
#6 |
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
|
ABBYY FineReader isn't very good at exporting ePub directly... But I guess it works fine, considering that the feature was just added in version 11. Try updating it to the latest build from the official website. Otherwise, you're gonna need Sigil to add a cover and metadata info.
The part about retaining the "Author information" in the ePub sounds like a bug in FineReader. It's most likely added to the file if you export it as a DOC, DOCX, or PDF, but not ePub. This was probably fixed in later builds, so make sure that you have the latest one. |
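One way to check whether the author actually made it into the exported file is to read the `dc:creator` entries from the EPUB's package (OPF) document. This is a minimal sketch using only the Python standard library; the function name `read_epub_authors` is made up for this illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_epub_authors(epub_file):
    """Return the dc:creator values found in an EPUB's OPF package file."""
    with zipfile.ZipFile(epub_file) as zf:
        # META-INF/container.xml points at the OPF package document.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).get("full-path")
        opf = ET.fromstring(zf.read(opf_path))
    return [el.text for el in opf.findall(".//dc:creator", NS)]
```

An empty list means the export dropped the author, in which case it has to be added afterwards, for example in Sigil as suggested above.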
#7 |
Guru
Posts: 681
Karma: 4568205
Join Date: Jan 2010
Location: Sweden
Device: Kobo Forma
|
I have just started to play around creating epubs, so I am sure there are much better ways to do things than what I currently do.
What I do is this:
- Export cover from pdf to a single file
- Open and OCR pdf in FineReader 11
- "Verification" through the whole book to fix OCR errors
- Adjust area for figures/graphs/illustrations
- Save as html
- Open html in Sigil
- (Here you can spend as much time as you like formatting, fixing typos, etc.)
- Create chapters (separate files), creating toc
- Import cover from the cover-file
- Add metadata (author, title, published date, etc.)
- Save as epub
- Import in Calibre
- Send to device
- Read the book, either fix errors directly in Sigil or highlight in the book and fix later
- Be happy :-)
I would be interested to hear from all of you using Word/OO between FineReader and Sigil: what do you do that is not easily done in Sigil? |
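The post does not say which tool is used for the first step (exporting the cover from the PDF). One possible way, assuming the Poppler utility `pdftoppm` is installed, is to render page 1 to a PNG. The helper names and the 150 DPI default are arbitrary choices for this sketch.

```python
import shutil
import subprocess


def build_cover_command(pdf_path, out_stem, page=1, dpi=150):
    """Assemble a pdftoppm call that renders one page to <out_stem>.png."""
    return [
        "pdftoppm",
        "-f", str(page), "-l", str(page),  # first and last page to render
        "-r", str(dpi),                    # resolution in DPI
        "-png", "-singlefile",             # PNG output, no page-number suffix
        pdf_path, out_stem,
    ]


def export_cover(pdf_path, out_stem="cover"):
    """Render the first page as a PNG; returns None if pdftoppm is missing."""
    if shutil.which("pdftoppm") is None:
        return None
    subprocess.run(build_cover_command(pdf_path, out_stem), check=True)
    return out_stem + ".png"
```

The resulting image can then be imported as the cover in Sigil, as in the later step of the list.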
#8 |
mostly an observer
Posts: 1,518
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
How do you "Save as html"? Using the Save As / Web Page (whatever) in Word?
|
#9 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
I use Word for the following steps:
- "Verification" through the whole book to fix OCR errors
- (Here you can spend as much time as you like formatting, fixing typos, etc.)
- Import cover from the cover-file
- Add metadata (author, title, published date, etc.)
- Save as epub
After that I open it in Sigil and make the final touches, like some formatting and the TOC. |
#11 | |
Evangelist
Posts: 450
Karma: 343115
Join Date: Nov 2009
Location: Romania
Device: PW2 2014
|
Quote:
I do not recommend converting. PDF isn't the most friendly format out there. If the file wasn't saved as a tagged PDF, and over 90% of PDFs out there aren't, then it's really not worth trying to convert it using Mobipocket or whatever. (A quick test: select some random text. If the selection looks like several separate letters and groups of letters, it's not a tagged PDF.) OCR it instead. Otherwise the software has to approximate the location of paragraphs, since each of those groups has its own coordinates, like on a blank piece of paper, and that may result in paragraphs within paragraphs, or a paragraph placed before the wrong paragraph, and so on. No, thanks!

The title of this thread is wrong. You do not convert with ABBYY FineReader. You OCR with it, and then manually tweak the stuffing out of it with some other software. Think of FineReader as an extraction tool. You extract text from images, and that's it. There are no layout options in FineReader. |
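To illustrate why guessing paragraphs from coordinates is fragile, here is a deliberately naive sketch of what a converter has to do with positioned text fragments. It is not how any particular converter works; the `Fragment` type and the `line_tol` and `para_gap` thresholds are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    x: float   # left edge of the fragment on the page
    y: float   # baseline position, measured from the top of the page
    text: str


def group_into_paragraphs(fragments, line_tol=2.0, para_gap=18.0):
    """Guess lines and paragraphs purely from fragment coordinates."""
    lines = []  # each entry: [baseline_y, [fragments on that line]]
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(frag.y - lines[-1][0]) <= line_tol:
            lines[-1][1].append(frag)       # close enough: same line
        else:
            lines.append([frag.y, [frag]])  # otherwise start a new line

    paragraphs, current, prev_y = [], [], None
    for y, frags in lines:
        text = " ".join(f.text for f in sorted(frags, key=lambda f: f.x))
        # A large vertical jump is taken to mean "new paragraph".
        if prev_y is not None and y - prev_y > para_gap:
            paragraphs.append(" ".join(current))
            current = []
        current.append(text)
        prev_y = y
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs
```

Everything hinges on the two thresholds. A two-column page fed through this function would have its columns interleaved line by line, and a slightly larger line spacing would split one paragraph into several, which is the kind of damage described above.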
|
#13 | |
Guru
Posts: 681
Karma: 4568205
Join Date: Jan 2010
Location: Sweden
Device: Kobo Forma
|
Quote:
But for doing OCR, IMHO FineReader is really, really good. I tried to export directly to epub from FineReader, but that was a bit of a mess. Much better, IMHO again, to export as html and let Sigil + some editing do the conversion to epub.
But I am fairly new at this epub-creating thing, so I may do something completely different in a couple of months. I am slowly reading through the forums here and learning all the time. |
|
#14 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
The HTML export of ABBYY is usually full of internal styling, making it cluttered.
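A crude way to de-clutter such an export before opening it in Sigil is to strip the `style` attributes and the `span`/`font` wrappers. This is a regex-based sketch under the assumption that the styling is carried that way; it has not been checked against real FineReader output, and the function name is made up.

```python
import re

TAG = re.compile(r"<[^>]+>")
SPAN_OR_FONT = re.compile(r"</?(?:span|font)\b[^>]*>", re.IGNORECASE)
STYLE_ATTR = re.compile(r"\s+style\s*=\s*(?:\"[^\"]*\"|'[^']*')", re.IGNORECASE)


def strip_inline_styling(html):
    """Remove span/font wrappers and style attributes from exported HTML."""
    html = SPAN_OR_FONT.sub("", html)
    # Only touch style="..." when it appears inside a tag, not in body text.
    return TAG.sub(lambda m: STYLE_ATTR.sub("", m.group(0)), html)
```

Note that this also throws away italics or bold that the export expressed through span styles, so those would need to be re-applied by hand.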
|