01-06-2017, 09:22 AM | #1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jan 2017
Device: Kindle Paperwhite (2015)
|
Fix spelling errors in epub file via rtf of same book.
Hello
I am reading a book I got as an ePub, but it has a lot of OCR errors; for example, "is" becomes "!5", and so on. I tried to find another version of the book without the errors, but the only one I could find is an RTF file, which doesn't have the formatting I like from the ePub. I am therefore wondering whether it's possible to fix the errors in the ePub by comparing its text with the RTF and then replacing the errors in the ePub with the correct words from the RTF. Is this possible? Kindly, Emil |
01-06-2017, 10:24 AM | #2 |
The Grand Mouse 高貴的老鼠
Posts: 71,507
Karma: 306214458
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Voyage
|
You'll probably be better off getting a better proof-read ePub of the book. Have you checked the MobileRead Library for one?
Of course, if that's a pirated copy of an in-copyright ebook, you'll get no help here. |
01-07-2017, 09:12 AM | #3 |
Guru
Posts: 919
Karma: 417282
Join Date: Jun 2015
Device: kobo aura h2o, kobo forma
|
It might be possible to extract the text from both the ePub and the RTF, strip all the formatting out, and then do a conventional comparison (diff?) of the two raw texts.
Also, calibre's book editor has a spell checker that might help some. |
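A minimal sketch of the extract-and-compare idea, in Python, using only the standard library. It assumes you have already pulled the XHTML out of the ePub (an ePub is a ZIP archive of XHTML files) and saved the RTF as plain text. The helper names `strip_markup` and `word_differences` are made up for illustration; they are not part of calibre or Sigil.

```python
import difflib
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect character data and ignore every tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags (this also includes <style>/<script>
        # content, so feed it body markup only).
        self.chunks.append(data)


def strip_markup(xhtml: str) -> str:
    """Return the text of an XHTML fragment with whitespace normalised."""
    parser = _TextOnly()
    parser.feed(xhtml)
    parser.close()
    return " ".join("".join(parser.chunks).split())


def word_differences(ocr_text: str, clean_text: str):
    """Return (ocr_words, clean_words) pairs where the two texts disagree."""
    a, b = ocr_text.split(), clean_text.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (" ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    epub_text = strip_markup("<p>This <i>!5</i> a   test.</p>")
    rtf_text = "This is a test."
    for wrong, right in word_differences(epub_text, rtf_text):
        print(f"{wrong!r} -> {right!r}")
```

The result is a list of suspect words with their likely corrections, which you would still need to review by hand before changing anything in the ePub, since the two editions may differ for reasons other than OCR errors.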
01-07-2017, 12:47 PM | #4 |
Wizard
Posts: 1,613
Karma: 6718479
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
|
It's quite possible to do manually, but that is extremely labor-intensive. It would involve opening the ePub in an appropriate editor (e.g. calibre's ebook editor or Sigil), opening the RTF in a word processor, and then manually correcting the ePub using the RTF for reference.
It would be vastly easier to convert the RTF to ePub and then use an ePub editor to reformat the conversion to fit your taste. |
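For the conversion step, one option is calibre's `ebook-convert` command-line tool, which takes an input file and an output file. The Python wrapper below is only a sketch: it assumes calibre is installed and `ebook-convert` is on your PATH, and the file names and the function names `convert_command` and `convert_rtf_to_epub` are placeholders.

```python
import shutil
import subprocess


def convert_command(rtf_path, epub_path):
    """Build the argument list for calibre's ebook-convert (input, then output)."""
    return ["ebook-convert", str(rtf_path), str(epub_path)]


def convert_rtf_to_epub(rtf_path, epub_path):
    """Run the conversion; raises if calibre's CLI is missing or the run fails."""
    if shutil.which("ebook-convert") is None:
        raise FileNotFoundError("ebook-convert not found on PATH; install calibre first")
    subprocess.run(convert_command(rtf_path, epub_path), check=True)


if __name__ == "__main__":
    # Placeholder file names; substitute your own.
    print(" ".join(convert_command("book.rtf", "book.epub")))
```

The converted ePub can then be opened in an ePub editor to restyle it.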
01-08-2017, 09:33 AM | #5 |
Guru
Posts: 919
Karma: 417282
Join Date: Jun 2015
Device: kobo aura h2o, kobo forma
|
Along those lines, you could probably convert the RTF to ePub. If the result really is missing formatting that is present in the ePub you have, you could then diff the two, which gives you every difference including the formatting, and edit the diff to remove everything that isn't a formatting change.
Then, if newer RTFs come out, you can reconvert them and reapply the formatting changes. Quite possibly, you could even leverage the tools in git to do some of this automatically. |
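A rough sketch of the "diff to capture the formatting" step, in Python with the standard-library `difflib`. It assumes you have one XHTML file from the converted RTF and the matching file from the formatted ePub, both read in as strings; the function name `formatting_patch` and the file labels are invented for the example.

```python
import difflib


def formatting_patch(converted: str, formatted: str) -> str:
    """Return a unified diff that turns the converted markup into the formatted one."""
    lines = difflib.unified_diff(
        converted.splitlines(),
        formatted.splitlines(),
        fromfile="converted.xhtml",
        tofile="formatted.xhtml",
        lineterm="",
    )
    return "\n".join(lines)


if __name__ == "__main__":
    converted = "<p>Chapter One</p>\n<p>It is a test.</p>\n"
    formatted = "<h1>Chapter One</h1>\n<p>It is a test.</p>\n"
    print(formatting_patch(converted, formatted))
```

The output is in the same unified format that `diff -u` and git produce, so it can be trimmed by hand down to the formatting-only changes. Whether it reapplies cleanly to a later conversion depends on how much the surrounding text has changed.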
Tags |
epub, ocr, rtf |