Old 09-16-2009, 11:42 PM   #1
Wetdogeared
Storm Surge'n
Posts: 5,781
Karma: 8213195
Join Date: Nov 2008
Location: Polar Vortex
Device: S0ny PRS-300/350/505/700/T1
Google uses anti-fraud tool to help digitize books

Google acquires Carnegie Mellon's anti-fraud tool (Associated Press)

The next time you key in one of those skewed words to enter a website or complete an online transaction, you may be helping Google digitize a word in an eBook.

Quote:
ReCAPTCHA offers simple word puzzles that users must solve when registering at a Web site or completing an online purchase. Computers can't decipher the twisted letters and numbers, ensuring that real people and not automated programs are at the keyboard.

Unlike other word puzzles, however, ReCAPTCHA's text comes from actual books, letting the system create a digitized version in the process.
Quote:
Google Inc. is already behind a major project to digitize books and put them online, mostly by scanning pages and using optical character recognition, or OCR, to make the texts searchable. OCR doesn't always work on text that is older, faded or distorted. In such cases, often the only way to digitize the works is to manually type them in.

ReCAPTCHA provides an alternative. Snippets that the computer doesn't recognize are split up into single words that can be used as human tests at sites all over the Internet. The ReCAPTCHA system reassembles the text of the book from those responses.
Old 09-17-2009, 04:28 AM   #2
Krystian Galaj
Guru
Posts: 820
Karma: 11012
Join Date: Nov 2007
Location: Warsaw, Poland
Device: Bookeen Cybook
If it doesn't know what the answer is, how does it know if one passed the captcha test or not?
Old 09-17-2009, 04:45 AM   #3
Deneb
Groupie
Posts: 181
Karma: 2460
Join Date: Jul 2009
Device: Cybook
Quote:
Originally Posted by Krystian Galaj
If it doesn't know what the answer is, how does it know if one passed the captcha test or not?
That's a good question... I think there is probability somewhere in the process, in comparing all the answers and taking the average or the most frequent one, but that doesn't answer your question totally...
Old 09-17-2009, 07:17 AM   #4
WillAdams
Wizard
Posts: 1,259
Karma: 3439432
Join Date: Feb 2008
Device: Amazon Kindle Paperwhite (300ppi), Samsung Galaxy Book 12
I would do this by merging a couple of images --- several known letters, plus a spliced in unknown letter or word. You're using the known letters for the ``captcha'' aspect, and collecting all responses on the unknown letter or word, determining what it is based on a weighted average of the responses.

William
Old 09-17-2009, 08:31 AM   #5
Who are you?
Groupie
Posts: 184
Karma: 300001
Join Date: May 2009
Device: 505
It uses one known and one unknown word. It assumes that if you typed the known word correctly you probably also typed the unknown word correctly. Also each word will be tested multiple times.
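
A minimal sketch of that idea in Python, assuming a case-insensitive comparison; the function name and sample words are made up for illustration and this is not reCAPTCHA's actual code. Pass/fail depends only on the known word, and the answer for the unknown word is simply recorded as one vote:

```python
def check_response(known_answer, typed_known, typed_unknown, votes):
    """Decide pass/fail from the known word only.

    If the known word matches, the typed unknown word is stored as one
    vote; it never influences whether the user passes.
    """
    passed = typed_known.strip().lower() == known_answer.strip().lower()
    if passed:
        votes.append(typed_unknown.strip().lower())
    return passed


votes = []  # votes collected so far for one unknown word
print(check_response("morning", "Morning", "overlooks", votes))  # True
print(check_response("morning", "mourning", "xyz", votes))       # False
print(votes)  # ['overlooks']
```

Because each unknown word is shown to several people, a single bad vote can be outweighed by later ones.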
Old 09-17-2009, 08:50 AM   #6
Abecedary
Exwyzeeologist
Posts: 535
Karma: 3261
Join Date: Jun 2009
Device: :PRS-505::iPod touch:
If I'm not mistaken, the ReCAPTCHA system was being used to digitize the back issues of the NYTimes, among other things. I wonder how Google's acquisition affects projects that have been using this for a while.
Old 09-17-2009, 09:45 AM   #7
Mike L
Wizard
Posts: 1,479
Karma: 3846231
Join Date: Apr 2009
Location: Edinburgh, Scotland
Device: Kindle 3, Samsung Galaxy
Quote:
Originally Posted by Who are you?
It uses one known and one unknown word. It assumes that if you typed the known word correctly you probably also typed the unknown word correctly.
Which means that a savvy user would only need to type in the "known" word -- presumably the more obvious of the two -- and type any old keystrokes for the other, and they would pass the test.
Old 09-17-2009, 09:49 AM   #8
ahi
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
Quote:
Originally Posted by Mike L
Which means that a savvy user would only need to type in the "known" word -- presumably the more obvious of the two -- and type any old keystrokes for the other, and they would pass the test.
Why do you assume there would be any consistent and discernible difference between the known and the unknown word?

Google can have their internal employees do 1,000 of these to start with... that ought to be trivial, given the number of employees they have. Then, after that, have unknown words (once identified the same way by X number of people) added to the pool of known words.

- Ahi
Old 09-17-2009, 11:46 AM   #9
emellaich
Wizard
Posts: 1,101
Karma: 4388403
Join Date: Oct 2007
Device: Palm>Ebookman>IPaq>Axim>Cybook>Kndl2>IPAD>Kndl3SO>Voyager>Oasis
Quote:
Originally Posted by Mike L
Which means that a savvy user would only need to type in the "known" word -- presumably the more obvious of the two -- and type any old keystrokes for the other, and they would pass the test.
That is correct. The first purpose of captcha is to defeat automated spambots that are trying to spam a blog/forum/etc. site. If you can correctly identify an obscured word then you are probably not a spambot. So for this purpose only one word is needed.

The 'secondary' use is to decipher words that the OCR software has low certainty on. In this case it assumes the correct answer is the mode of people's answers. You could put anything in if you are an anarchist and feel like spreading chaos, I guess. But why?
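
To show why a few junk answers don't matter much, here is a rough sketch of taking the mode of the answers. The helper name and the sample answers are invented for illustration; the real system's rules may well differ:

```python
from collections import Counter


def most_common_answer(answers):
    """Return the most frequent answer and the share of people who gave it."""
    counts = Counter(a.strip().lower() for a in answers)
    word, count = counts.most_common(1)[0]
    return word, count / len(answers)


# Three careful typists, one near miss, one "anarchist".
answers = ["upon", "Upon", "npon", "upon", "asdf"]
word, share = most_common_answer(answers)
print(word, share)  # upon 0.6
```

The random entry only wins if enough people type the same random thing, which is unlikely.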
Old 09-17-2009, 11:50 AM   #10
acidzebra
Liseuse Lover
Posts: 869
Karma: 1035404
Join Date: Jul 2008
Location: Netherlands
Device: PRS-505
Quote:
Scanned text is subjected to analysis by two different optical character recognition programs; in cases where the programs disagree, the questionable word is converted into a CAPTCHA. The word is displayed along with a control word already known. The system assumes that if the human types the control word correctly, the questionable word is also correct. The identification performed by each OCR program is given a value of 0.5 points, and each interpretation by a human is given a full point. Once a given identification hits 2.5 votes, the word is considered called. Those words that are consistently given a single identity by human judges are recycled as control words.
http://en.wikipedia.org/wiki/Recaptcha
http://arstechnica.com/old/content/2...anuscripts.ars
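
The scoring in that quote can be sketched roughly like this. The weights (0.5 per OCR program, 1 per human, called at 2.5) come from the quoted passage; the function name, the sample words, and the choice to check the total after each human answer are my own assumptions, not the real implementation:

```python
from collections import defaultdict

OCR_WEIGHT = 0.5    # each OCR program's reading, per the quote
HUMAN_WEIGHT = 1.0  # each human reading, per the quote
THRESHOLD = 2.5     # score at which a word is considered called


def tally(ocr_guesses, human_answers):
    """Return the called word, or None if no reading has reached 2.5 yet."""
    scores = defaultdict(float)
    for guess in ocr_guesses:
        scores[guess.strip().lower()] += OCR_WEIGHT
    for answer in human_answers:
        word = answer.strip().lower()
        scores[word] += HUMAN_WEIGHT
        # Only the word that just received a vote can newly cross the line.
        if scores[word] >= THRESHOLD:
            return word
    return None


# The two OCR programs disagree; two humans side with one of them.
print(tally(["npon", "upon"], ["upon", "upon"]))  # upon
# One human answer is not enough (0.5 + 1.0 = 1.5).
print(tally(["npon", "upon"], ["upon"]))          # None
```

With these numbers, a reading that neither OCR program produced needs three matching human answers, while one that an OCR program also produced needs only two.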
Old 09-17-2009, 01:38 PM   #11
zelda_pinwheel
zeldinha zippy zeldissima
Posts: 27,827
Karma: 921169
Join Date: Dec 2007
Location: Paris, France
Device: eb1150 & is that a nook in her pocket, or she just happy to see you?
i think reCAPTCHA is a brilliant initiative and i'm glad it's getting more attention. their slogan is "stop spam, read books" which seems like a worthy goal. plus hopefully this will improve google's ocr quality which has been frequently reported as variable at best. i do hope google won't divert it from previous projects though. i'm sure there are enough sites needing bot-protection to go around.
Old 09-18-2009, 07:44 AM   #12
Mike L
Wizard
Posts: 1,479
Karma: 3846231
Join Date: Apr 2009
Location: Edinburgh, Scotland
Device: Kindle 3, Samsung Galaxy
My reason for suggesting that 'a savvy user would only need to type in the "known" word' had nothing to do with enabling spam. It was more my way of getting a tiny measure of revenge against the growing nuisance of these devices.

But I didn't mean it to be taken seriously.
Old 09-18-2009, 10:26 AM   #13
emellaich
Wizard
Posts: 1,101
Karma: 4388403
Join Date: Oct 2007
Device: Palm>Ebookman>IPaq>Axim>Cybook>Kndl2>IPAD>Kndl3SO>Voyager>Oasis
Quote:
Originally Posted by Mike L
My reason for suggesting that 'a savvy user would only need to type in the "known" word' had nothing to do with enabling spam. It was more my way of getting a tiny measure of revenge against the growing nuisance of these devices.

But I didn't mean it to be taken seriously.
So, I'm not striking back at you, because I agree that these are quite tiresome. However, I would suggest that the best target of our frustration is not the recaptcha users, but the sleazy spambot organizations that make recaptcha necessary.

In fact, the Internet in general would be a much more pleasant experience if only we didn't have to guard against spam, viruses, adware and the like. It's amazing when you consider the aggravation tax that we all must bear because of the actions of a few graffiti-spraying kids and a handful of sleazy 'business' people.
Old 09-18-2009, 02:30 PM   #14
Mike L
Wizard
Posts: 1,479
Karma: 3846231
Join Date: Apr 2009
Location: Edinburgh, Scotland
Device: Kindle 3, Samsung Galaxy
I take your point, Emellaich. In fact, I don't object to the traditional Captcha mechanism. My irritation is directed against ReCaptcha in particular (and, by the way, this has been around for quite a while now, long before Google took an interest).

My point about ReCaptcha is that one of the two words is, by definition, "difficult". That's why it's there. So you have to use quite a lot more effort to negotiate the system than you would with the normal sort of Captcha (which is typically based on just four or five letters, displayed fairly clearly). This makes the whole experience that much more trying.

Now, if that extra difficulty was directed to defeating Spam, I wouldn't object. But its main purpose is to help the company that promotes it to reduce their proof-reading costs. By all means, make your prospective customers work to give you their business, if you can show that it's in their interest, but not if it's just for the convenience of a third party.
Old 09-18-2009, 08:51 PM   #15
Andybaby
Wizard
Posts: 1,279
Karma: 1002683
Join Date: Nov 2008
Location: New York
Device: PRS-700
Out of all the captchas I come up against, I prefer ReCaptchas. A few of the others I have to try 4 or 5 times or more to get, because they are stupidly difficult. ReCaptchas are quick and easy because they are actual words. The captcha system used at annual credit report .com, for instance, took me about 5 tries when I tried it a couple of weeks ago.

I prefer ReCaptchas first, actually, followed by number-only captchas. I usually get ReCaptchas the first time, and if I mess up, always on the second.