03-10-2020, 08:29 PM   #21
Tex2002ans
Quote:
Originally Posted by hobnail
Old style numerals have descenders, so the 9 looks like a p.

[... On] my body tag [I use] "font-variant: common-ligatures oldstyle-nums proportional-nums;" and for h1, h2, h3, h4 it's "font-variant: common-ligatures lining-nums proportional-nums;" [...]
I was also reading through the CSS3 Fonts spec on "font-variant-numeric" and saw it has a nice comparison table showing the differences between Proportional/Tabular and Lining/Old-Style numbers.

Quote:
Originally Posted by Tex2002ans
And OCR definitely trips up on books with old-style numbers, so it's a very common error to look for:

I always do a Regex+Spellcheck pass for O/o or l/I next to a number (I942 -> 1942). And 1,ooo -> 1,000 is pretty common too.
Aaaand it comes up today!

The book scan in:

"How to handle images in books while doing OCR of books?"

was published in 1918.* Lots of tables and numbers, all in Old-Style:

[Attached image: Humphrey,Milford.-.Some.South.Indian-p70[Old-Style.Numbers].png]

"Ist July"
"IO July"
"IOO"
"I,OOO"

It has 'em all!

* I mean, I9I8.
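
For anyone who wants to script that O/o and l/I pass, here is a rough Python sketch. It is not the exact regex I run; the names (CONFUSABLE, TOKEN, ORDINAL, suggest) and the letter-to-digit mapping are assumptions made up for this example, based only on the cases in this post. It only lists candidates with a suggested fix, so you still have to eyeball each one before replacing.

```python
import re

# Letters OCR tends to produce in place of old-style digits
# (assumed mapping, taken from the examples in this thread).
CONFUSABLE = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1"})

# A token made only of digits, the confusable letters, and thousands commas.
TOKEN = re.compile(r"\b[0-9OoIl]+(?:,[0-9OoIl]{3})*\b")

# "1st" where the digit came out as a letter, e.g. "Ist" or "lst".
ORDINAL = re.compile(r"\b[Il]st\b")


def looks_misread(tok):
    """Decide whether a TOKEN match is worth flagging for review."""
    has_letter = any(c in "OoIl" for c in tok)
    has_digit = any(c.isdigit() for c in tok)
    if not has_letter:
        return False          # an ordinary number such as 1942
    if has_digit:
        return True           # mixed token such as I942 or 1,ooo
    # No digits at all: only flag capital I/O runs containing an O
    # (IO, IOO, I,OOO). This skips words like "Ill" and Roman "II".
    core = tok.replace(",", "")
    return len(core) >= 2 and set(core) <= {"I", "O"} and "O" in core


def suggest(text):
    """Return (found, suggestion) pairs; number-like tokens first, then ordinals."""
    hits = []
    for m in TOKEN.finditer(text):
        tok = m.group()
        if looks_misread(tok):
            hits.append((tok, tok.translate(CONFUSABLE)))
    for m in ORDINAL.finditer(text):
        hits.append((m.group(), "1st"))
    return hits


if __name__ == "__main__":
    sample = 'Published in I9I8: "Ist July", "IO July", "IOO", "I,OOO"'
    for found, fixed in suggest(sample):
        print(found, "->", fixed)
```

The all-letter rule is deliberately strict, so it will miss things like a lone "I" standing in for 1, and it will still flag a genuine "IO" if the book has one. That trade-off is why this stays a review list rather than an automatic replace.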