#61 | |
Wizard
Posts: 4,395
Karma: 1358132
Join Date: Nov 2007
Location: UK
Device: Palm TX, CyBook Gen3
|
Quote:
It'll return text from Snippet view and No Preview books - which you can't ordinarily access (afaik). I'm having to do this quite a lot for the book I'm proofing at the moment. E.g. the PDF I'm using has "and the dresses of the ladies, ....ped about the piano". Searching Google Books for the text and book name ("and the dresses of the ladies" Diana Trelawny), I can see the missing text is "as they stood grouped". |
|
#62 | |
Wizard
Posts: 3,490
Karma: 5239563
Join Date: Jan 2008
Location: Denmark
Device: Kindle 3|iPad air|iPhone 4S
|
Quote:
|
|
#63 | |
Wizard
Posts: 3,413
Karma: 13369310
Join Date: May 2008
Location: Launceston, Tasmania
Device: Sony PRS T3, Kobo Glo, Kindle Touch, iPad, Samsung SB 2 tablet
|
Quote:
Regards, Alex |
|
#64 | |
eBook Enthusiast
Posts: 85,556
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
A good example is the Erskine Childers classic spy story "The Riddle of the Sands". Pretty much all the "free" versions of it, PG included, in a paragraph describing the appearance of the cabin of a boat, include a mysterious reference to "banks of yam". I defy anyone to guess what that really should be, without recourse to an original page scan or a printed copy of the book. The correct text, in case you're wondering, is "hanks of yarn". |
|
#65 |
Zealot
Posts: 109
Karma: 84
Join Date: Jun 2009
Location: Manchester
Device: Kobo Aura H2O
|
Couple of points:
Authors' proofing. I follow the blogs of big-name, make-enough-money-to-live-off-it professional genre fiction authors. They spend days manually proofing the 'correct' galleys returned to them by the publishers for final checks. It is time they budget into the writing of any book: so many days for the writing, plus so many extra for the proofing afterwards. What's worse is that by the time they get the galleys they are already mid-flow in a different story, and have to break mindsets to focus on the 'old' story properly. Hard work indeed, but part of the job. One is now re-editing corrupted files of their back catalog to sell from their own web portal as ebooks.
What should I do when I find an error? One recently purchased, DRM'd ebook was mostly OK, until about page 500, where for the next 50 pages wordswerefrequently runaltogether. It was really quite extremely annoying. Should I contact the author, the seller or the publisher? Or all of them? |
#66 | |
eBook Enthusiast
Posts: 85,556
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
Quote:
|
|
#67 | |
Wizard
Posts: 4,395
Karma: 1358132
Join Date: Nov 2007
Location: UK
Device: Palm TX, CyBook Gen3
|
Quote:
Given the context, 'yam' is obviously 'yarn'. 'banks' is more of a puzzler - but 'b' for 'h' is a very common OCR error, and about the only correction candidate that comes to mind for 'banks'. |
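That reasoning (try the known OCR confusions and see which results are real words) can be sketched roughly as below. The confusion table and the tiny word list are assumptions made up for illustration; a real run would need a proper dictionary and a much longer list of confusions.

```python
from itertools import product

# Assumed, illustrative OCR confusions: what the scan shows -> what it may have been.
CONFUSIONS = {"b": ["h"], "m": ["rn"]}

def candidates(word, dictionary):
    """Return dictionary words reachable from `word` by undoing known OCR confusions."""
    # For each character, allow either the character itself or its possible originals.
    options = [[ch] + CONFUSIONS.get(ch, []) for ch in word]
    found = {"".join(parts) for parts in product(*options)}
    return sorted(c for c in found if c in dictionary and c != word)

# Hypothetical mini dictionary, just enough for the example from the thread.
words = {"banks", "hanks", "yam", "yarn"}
print(candidates("banks", words))
print(candidates("yam", words))
```

Note that "banks" and "yam" are both real words, so a spell checker alone would never flag them; the candidates only tell you what to check against the page scan.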
|
#68 | |
Wizard
Posts: 3,490
Karma: 5239563
Join Date: Jan 2008
Location: Denmark
Device: Kindle 3|iPad air|iPhone 4S
|
[QUOTE=HarryT;551376]What's even more insidious is when OCR is combined with a spell checker, and a "mangled" word has been replaced by another word, which "fits in" to the sentence, and yet is totally wrong.
...[/QUOTE]
Yikes! That is bad. It's hard to spot words that the spell checker doesn't catch.
Quote:
(I know there's some difference between line break and paragraph break, but I haven't looked into it, so I might be missing some information) |
|
#69 |
eReader
Posts: 2,750
Karma: 4968470
Join Date: Aug 2007
Device: Note 5; PW3; Nook HD+; ChuWi Hi12; iPad
|
I do a lot of freelance editing, proofing, and light rewriting. Almost everything I've seen people complain about (and I've complained about the same things myself) is stuff that's my job, not the author's.
As has been said before, a lot of this is things the author literally cannot see; their brain fills in what's supposed to be there. So when someone tells the author to correct an ebook they're often asking them to do something that's not their job, that they're uniquely ill-suited for, and that they may not even know about because they often have very little if anything to do with ebook releases. Yes, many authors will do what they can, but by this point it's out of their hands. |
#70 | |
Retired & reading more!
Posts: 2,764
Karma: 1884247
Join Date: Sep 2006
Location: North Alabama, USA
Device: Kindle 1, iPad Air 2, iPhone 6S+, Kobo Aura One
|
[QUOTE=Ea;551462][QUOTE=HarryT;551376]What's even more insidious is when OCR is combined with a spell checker, and a "mangled" word has been replaced by another word, which "fits in" to the sentence, and yet is totally wrong.
... Quote:
There is also a "Show/Hide" button to allow you to see the various formatting symbols. You can then see the difference between these two marks. Maybe this will help or maybe I totally missed the problem. |
|
#71 | |
Wizard
Posts: 3,490
Karma: 5239563
Join Date: Jan 2008
Location: Denmark
Device: Kindle 3|iPad air|iPhone 4S
|
Quote:
It's not a current problem on my Mac, but it's been bothering me how to handle it, and I may well get a Windows machine next time. |
|
#72 |
Addict
Posts: 317
Karma: 1232685
Join Date: Nov 2008
Location: Ireland
Device: Kindle Voyage, Kobo Aura, Nexus 9
|
A trick I've found with OCR errors is to identify the consistent errors and look for other words that might not be picked up with a spell check. Obviously this only works well if the error occurs all the time, as you would expect of an automated process.
For example, I had an OCR text that had replaced every "cl" at the start of a word with "d". It was easy to find words like "dothes" and "doset" with a spell checker and do a global replace, but I had to use a dictionary to search for every word that makes sense with either a "cl" or a "d" in front of it. And you can't use a global replace with dean/clean or dosed/closed, as the context has to be checked. Apologies if this is obvious. |
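For anyone who would rather script that dictionary search, here is a minimal sketch of the idea in Python. The function name and the small word list are made up for illustration; it assumes you have the OCR text split into words and some dictionary available as a set.

```python
def find_ambiguous_words(words, dictionary, wrong="d", right="cl"):
    """Sort OCR'd words starting with `wrong` into safe fixes and ones needing a human."""
    safe, review = {}, {}
    for word in sorted({w.lower() for w in words}):
        if not word.startswith(wrong):
            continue
        candidate = right + word[len(wrong):]
        if candidate not in dictionary:
            continue  # restoring "cl" gives a non-word, so leave this one alone
        if word in dictionary:
            review[word] = candidate  # both spellings are real words: check the context
        else:
            safe[word] = candidate    # only the restored form is a word: global replace is OK
    return safe, review

# Hypothetical mini dictionary and sample text.
dictionary = {"clothes", "closet", "clean", "dean", "closed", "dosed", "door"}
text = "the dothes in the doset were dean and the door was dosed"
safe, review = find_ambiguous_words(text.split(), dictionary)
print("global replace:", safe)
print("check by hand:", review)
```

The `review` list is the part a spell checker can't give you: words like dean and dosed pass the spell check, so each occurrence still has to be read in context.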
#73 | |
Wizard
Posts: 3,490
Karma: 5239563
Join Date: Jan 2008
Location: Denmark
Device: Kindle 3|iPad air|iPhone 4S
|
Quote:
|
|
#74 | |
Jeffrey A. Carver
Posts: 1,355
Karma: 1107383
Join Date: Aug 2008
Location: Massachusetts, USA
Device: Lenovo Yoga Tab Plus, Droid phone, Nook HD+
|
Quote:
The error might come from the original file. More likely it came from the conversion. |
|
#75 |
Grand Sorcerer
Posts: 6,950
Karma: 27060153
Join Date: Apr 2009
Location: USA
Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3
|
It seems there should be a way to exploit the fact that ebooks are in electronic form, and that we are all connected by the internet. Every reader is a potential proof-reader.
Windows and OS X have a 'crash reporting' mechanism that allows users who experience crashes to send a report to the software publisher in question. Something analogous could be developed for ebooks and built into the reader software (at least for those devices which support annotation). One might even institute a microcredit scheme so that people who report the errors are rewarded in some tangible way. (Hmm, publishers could intentionally introduce errors and give credits to the first 100 readers who find them, to encourage this proofing activity...)
So the idea is that users who encounter an error would invoke their reader's 'report an error' function, which would flag the location and allow the user to type a short note as to the nature of the error, the ebook version, the reader's contact info (if they opt in), etc. These error reports would be collected and forwarded or sent directly to the publisher when the device is 'connected' to the internet or tethered to a host computer.
The publisher would then resolve the errors, publish a new edition and make it available for download to anyone who owns that title or is purchasing anew. The reader's librarian software could periodically check for and download updates (Amazon already is set up to update titles automatically - maybe a little too automatically in some cases). eBook marketplaces that institute such a self-correcting system would become preferred to those that do not. |
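To make the 'report an error' idea a bit more concrete, here is a rough sketch of what a report and the collecting side might look like. Everything in it (the field names, the local queue, the tally) is hypothetical; no existing reader or store is known to work this way.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass
from typing import Optional

@dataclass(frozen=True)
class ErrorReport:
    book_id: str                   # hypothetical identifier for the title
    edition: str                   # the ebook version the reader has
    location: str                  # where the error was flagged
    note: str                      # short description typed by the reader
    contact: Optional[str] = None  # only filled in if the reader opts in

def queue_report(queue, report):
    """Store a report locally until the device is next connected."""
    queue.append(json.dumps(asdict(report)))

def tally(queue):
    """Publisher side: count how many readers flagged each location."""
    counts = Counter()
    for raw in queue:
        r = json.loads(raw)
        counts[(r["book_id"], r["edition"], r["location"])] += 1
    return counts

queue = []
queue_report(queue, ErrorReport("riddle-of-the-sands", "1.0", "ch2/para14", '"banks of yam"'))
queue_report(queue, ErrorReport("riddle-of-the-sands", "1.0", "ch2/para14", "should be hanks of yarn"))
print(tally(queue).most_common(1))
```

Counting reports per location is one simple way a publisher could decide which errors to look at first, and the first reporters for a location would be the obvious people to credit.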