09-20-2015, 07:40 AM   #7
SBT
@elibrarian: Thanks for the tip on the LibreOffice extensions.

Meanwhile, I've started looking into a particular problem of mine.
"My" ebooks are mostly 19th-century Norwegian books, written in a spelling and grammar roughly halfway between Danish and modern Norwegian. That means I can't use spell-checkers, because no pre-1907 Norwegian ispell dictionary exists. However, a lot of properly proofread digital 19th-century Norwegian text does exist (>10,000 pages).

I came across this 21-line(!) spell-checker at norvig.com, the site more famous as the origin of the PowerPoint version of the Gettysburg Address. It learns word frequencies from a huge reference text and uses them to check the spelling of a word. Though it's oriented towards human errors like transposition, it should be possible to tweak it to look for typical OCR mistakes, like 'f00lish' (zeros read for o's) and 'junip' (presumably 'jump', with the 'm' read as 'ni').
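To make the idea concrete, here is a rough sketch of what such a tweak might look like. It is not Norvig's code, only something in the same spirit: word counts come from a reference text, and instead of generic deletes, transposes and inserts, the candidate edits are limited to undoing OCR confusions. The OCR_CONFUSIONS table and the names train, ocr_edits and correct are my own placeholders; a real table would have to be built from the mistakes a particular scanner and typeface actually produce.

```python
import re
from collections import Counter

# Hypothetical (misread, intended) pairs. Illustrative only: a real list
# should be collected from the errors seen in one's own scans.
OCR_CONFUSIONS = [("0", "o"), ("1", "l"), ("ni", "m"), ("rn", "m"), ("vv", "w")]


def words(text):
    """Lower-cased runs of letters; Unicode-aware, so æ, ø and å are kept."""
    return re.findall(r"[^\W\d_]+", text.lower())


def train(text):
    """Count how often each word occurs in the proofread reference text."""
    return Counter(words(text))


def ocr_edits(word):
    """Every string obtained by undoing one confusion at one position."""
    results = set()
    for wrong, right in OCR_CONFUSIONS:
        start = word.find(wrong)
        while start != -1:
            results.add(word[:start] + right + word[start + len(wrong):])
            start = word.find(wrong, start + 1)
    return results


def correct(word, counts):
    """Return the most frequent known word within two OCR edits of `word`.

    Falls back to the (lower-cased) input when nothing known is found.
    """
    word = word.lower()
    if word in counts:
        return word
    one_edit = ocr_edits(word)
    candidates = {w for w in one_edit if w in counts}
    if not candidates:
        two_edits = {w2 for w1 in one_edit for w2 in ocr_edits(w1)}
        candidates = {w for w in two_edits if w in counts}
    if not candidates:
        return word
    # sorted() only makes the choice deterministic when counts are tied
    return max(sorted(candidates), key=counts.get)
```

With counts = train(reference_text), where reference_text holds the proofread 19th-century corpus, correct("junip", counts) would try 'jump' and accept it only if that word actually occurs in the corpus. Restricting the edits to known confusions keeps the candidate set small, at the price of missing any error that is not in the table.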