Old 03-09-2020, 03:59 PM   #16
Quoth
Still reading
Posts: 14,905
Karma: 110507267
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Ah, it's worse than that. I've documents, manuals and books so old that the text is obviously from the ancient kind of typewriter that saved on numeric keys: it had no One or Zero, only 2 to 9.
Then some publishers didn't spot this in proofing, so stuff can be typeset with I or l for 1 and O for 0.

I don't much fancy correcting OCR anyway unless I have the original, and even then I think I'd rather someone else did it.
Old 03-09-2020, 04:13 PM   #17
Tex2002ans
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Quoth View Post
Ah, it's worse than that. I've documents, manuals and books so old that the text is obviously from the ancient kind of typewriter that saved on numeric keys: it had no One or Zero, only 2 to 9.
Then some publishers didn't spot this in proofing, so stuff can be typeset with I or l for 1 and O for 0.
I discussed "O vs. 0" last year in more detail on Reddit, and brought up this example:

"So 1,000 would look like I,OOO or l,ooo."

Also mentioned ! missing from keyboards (so you had to type apostrophe + backspace + period).

Last edited by Tex2002ans; 03-09-2020 at 04:23 PM.
Old 03-09-2020, 05:02 PM   #18
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
To: Tex2002ans

See Typography#Figure_Typography in our wiki. The only difference is the height of the number. In your example it would be the same height as the o.

Dale
Old 03-09-2020, 05:15 PM   #19
Quoth
Quote:
Originally Posted by Tex2002ans View Post
I discussed "O vs. 0" last year in more detail on Reddit, and brought up this example:

"So 1,000 would look like I,OOO or l,ooo."

Also mentioned ! missing from keyboards (so you had to type apostrophe + backspace + period).
I'd forgotten that gem. OTOH, you can do that with C and = too. I remember using non-destructive print backspace in WordStar in the early 1980s for creative characters on stupid printers. Also, it was only 7-bit ASCII, so that was the only way to do accented letters.
Old 03-09-2020, 09:57 PM   #20
Tex2002ans
Quote:
Originally Posted by DaleDe View Post
Thanks. I'll toss that on my resource list.

Quote:
Originally Posted by DaleDe View Post
The only difference is the height of the number. In your example it would be the same height as the o.
Since we don't have access to old-style figures here on MobileRead... the I/O/l/o were approximations of what they look like on paper.

And OCR definitely trips up on books with old-style numbers, so it's a very common error to look for:

I always do a Regex+Spellcheck pass for O/o or l/I next to a number (I942 -> 1942). And 1,ooo -> 1,000 is pretty common too.
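
For anyone who wants to try this kind of pass, here is a minimal Python sketch of the regex half. It is not the exact regex used above; the names `LOOKALIKES`, `CANDIDATE` and `fix_digit_lookalikes` are made up for illustration. It only touches runs that already contain at least one real digit, so something like "I,OOO" (no digit at all) is left alone and still needs a manual or spellcheck pass.

```python
import re

# Letters that OCR commonly produces in place of old-style figures.
LOOKALIKES = str.maketrans({"O": "0", "o": "0", "I": "1", "l": "1"})

# A run of digits, look-alike letters and commas that contains at least
# one real digit and is not glued to other letters on either side.
CANDIDATE = re.compile(
    r"(?<![A-Za-z])(?=[OoIl,]*\d)[\dOoIl][\dOoIl,]*(?![A-Za-z])"
)

def fix_digit_lookalikes(text):
    """Replace O/o with 0 and I/l with 1 inside number-like runs."""
    return CANDIDATE.sub(lambda m: m.group(0).translate(LOOKALIKES), text)

print(fix_digit_lookalikes("I942"))   # 1942
print(fix_digit_lookalikes("1,ooo"))  # 1,000
```

It will still produce the odd false positive (a token such as "lol2" would be rewritten), so review the changes rather than trusting a blind replace-all.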

Quote:
Originally Posted by Quoth View Post
I'd forgotten that gem. OTOH, you can do that with C and = too. I remember using non-destructive print backspace in WordStar in the early 1980s for creative characters on stupid printers. [...]
Way before my time... but C and =, I assume it was an approximation for € Euro?

Quote:
Originally Posted by Quoth View Post
Also, it was only 7-bit ASCII, so that was the only way to do accented letters.
Yuck. Glad those times are over. :P

Last edited by Tex2002ans; 03-09-2020 at 09:59 PM.
Old 03-10-2020, 08:29 PM   #21
Tex2002ans
Quote:
Originally Posted by hobnail View Post
Old-style numerals have descenders, so the 9 is like a p.

[... On] my body tag [I use] "font-variant: common-ligatures oldstyle-nums proportional-nums;" and for h1, h2, h3, h4 it's "font-variant: common-ligatures lining-nums proportional-nums;" [...]
I was also reading through the CSS3 Fonts Specs on "font-variant-numeric" and saw they have a nice comparison table showing differences between Proportional/Tabular + Lining/Old-Style numbers.

Quote:
Originally Posted by Tex2002ans View Post
And OCR definitely trips up on books with old-style numbers, so it's a very common error to look for:

I always do a Regex+Spellcheck pass for O/o or l/I next to a number (I942 -> 1942). And 1,ooo -> 1,000 is pretty common too.
Aaaand it comes up today!

The book scan in:

"How to handle images in books while doing OCR of books?"

was published in 1918.* Lots of tables and numbers, all in Old-Style:

[Attached image: Humphrey,Milford.-.Some.South.Indian-p70[Old-Style.Numbers].png]

"Ist July"
"IO July"
"IOO"
"I,OOO"

It has 'em all!

* I mean, I9I8.
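
Cases like these contain no real digit at all, so a "letter next to a number" search cannot see them. One option is to flag them for manual review instead of auto-fixing. A rough Python sketch, assuming a hypothetical `SUSPECT` pattern and `flag_suspects` helper (not anything from an existing tool):

```python
import re

# Tokens built only from look-alike letters (optionally with commas),
# such as "IO" or "I,OOO", plus ordinals like "Ist". Requiring at least
# two letters keeps the pronoun "I" out of the results.
SUSPECT = re.compile(r"\b(?:[IlOo](?:,?[IlOo])+|[Il]st)\b")

def flag_suspects(text):
    """Return tokens that may be numbers typeset or OCR'd as letters."""
    return [m.group(0) for m in SUSPECT.finditer(text)]

print(flag_suspects("Ist July, IO July, IOO, I,OOO"))
```

Real words such as "lo" or "Io" get flagged too, which is why this only lists candidates and leaves the decision to a human.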
Old 03-11-2020, 04:07 AM   #22
Quoth
Typewriters were invented in the Victorian era, before reliable fountain pens. Word processing only came in during the 1970s, originally as dedicated machines.
So I'd expect this to be common between 1890 and 1980.
I had to type my weekly report in my first full-time job in the mid-1970s, at a big company that had no word processors.
All times are GMT -4.