Quote:
Originally Posted by msundman
Bytes and characters are used differently even if one would use some encoding with 1 byte/character. Each glyph is a byte (including spaces, punctuation, etc.), but only the word characters and digits (0-9 and a-z in English) constitute characters in the sense I'm talking about.
|
Mmhmm, as an artifact of my background I tend to think of digits, letters, punctuation marks, every glyph pretty much, all as characters. You obviously don't. Which is fine. For what we're talking about you're certainly more correct.
Quote:
Originally Posted by msundman
I don't see how "150 kwords" is more manageable than "1.5 Mchars", and "150,000 words" is even less manageable than "1.5 Mchars".
|
Smaller numbers are easier to deal with. And frankly I'd rather not have to use SI prefixes when talking about literature. I can't see any practical benefit to character counts, but I'll get into that further below.
Quote:
Originally Posted by msundman
Why would you want to convert from character count to word count?!? (E.g., if it's just about what you are currently used to then I see no real reason, because you could just as well get used to character counts instead.)
|
I know I read 200 WPM. I know word recognition is largely glyphic, as in words themselves are glyphs. When one is reading, one is reading in whole-word chunks for the most part. Individual letters only become significant if one doesn't immediately recognize the word. This would seem to make the word the most significant unit in this transaction.
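To put a number on that, here's a minimal sketch in Python of how I'd go from a word count to a reading time. The function name `reading_minutes` is just something I made up for illustration, and the 200 WPM default is my own reading speed, not a general figure.

```python
def reading_minutes(word_count, wpm=200):
    """Estimate reading time in minutes from a word count.

    wpm defaults to 200, which is only my own reading speed (an assumption).
    """
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    return word_count / wpm


# A 150,000-word novel at 200 WPM:
minutes = reading_minutes(150_000)
hours = minutes / 60
print(f"{minutes:.0f} minutes, or about {hours:.1f} hours")
```

With a word count in hand, that's one division. No conversion step needed.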
Quote:
Originally Posted by msundman
However, if you really want to get some ballpark figure then X kchars is something between X/5 and X/4 kwords in English, and it's not very hard to divide by 5 and 4. I see no need to do such a conversion, though.
|
That's the thing, I don't want a rough guess. I want a word count because I read words, not letters. The letters make up the words, yes, but when I read the word "yes" I see yes, not Y-E-S. It's an instant recognition as a single glyph. The letters are there, but they have no meaning outside of the pattern that constitutes the word.
Let's try an example from chemistry. Character count seems to me like insisting that, rather than identifying each element by its atomic number (the number of protons in each atom), we instead give the number of quarks in each atom. Sure, atoms are ultimately made up of quarks, but quarks have little relevance to chemistry. It may also confuse matters, since perhaps the same number of quarks might represent vastly different atoms and arrangements.
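For what it's worth, the ballpark conversion quoted above (X kchars is somewhere between X/5 and X/4 kwords in English) is easy enough to write down. Here's a rough sketch in Python; `word_count_range` is a hypothetical name of mine, and the divisors 5 and 4 are taken straight from the quote, not from anything I've measured.

```python
def word_count_range(char_count):
    """Return (low, high) word-count estimates for an English text.

    Uses the rule of thumb from the quoted post: a text of X characters
    has somewhere between X/5 and X/4 words. These divisors are that
    poster's ballpark, not measured values.
    """
    low = char_count / 5
    high = char_count / 4
    return low, high


low, high = word_count_range(600_000)
print(f"between {low:,.0f} and {high:,.0f} words")
```

Which rather makes my point: the best it can give me is a range, and the top of that range is 25% bigger than the bottom.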
Quote:
Originally Posted by msundman
As I've already said, character count reflects quite accurately the length of the text. Much, much better than word count (or page count). It also reflects well the time it takes to read the text, and works in many different languages.
|
Okay, have you tested this? How many characters per minute do you read? You're certain the figure doesn't change depending on how those letters are arranged into words? Do you really break everything down into individual letters as you read?
I'm having trouble imagining a situation where I wouldn't find character count obtuse and cumbersome when deciding how fast a document will be read. Ideally I'd like to see both a character count and a word count; that'd give me an interesting way to measure the density of a text. A high character count with a disproportionately low word count would suggest a technical document filled with lengthy words: technical and scientific jargon. However, on average I'd find word count more useful, so I argue for that.
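The density idea is simple to try out if you have the text itself. A minimal sketch in Python, with some assumptions of mine baked in: `text_density` is a made-up name, words are whatever is separated by whitespace, and punctuation attached to a word counts toward its length.

```python
def text_density(text):
    """Average number of non-whitespace characters per word.

    Assumptions: words are whitespace-separated chunks, and any
    punctuation attached to a word is counted as part of it.
    """
    words = text.split()
    if not words:
        return 0.0
    chars = sum(len(word) for word in words)
    return chars / len(words)


plain = "the cat sat on the mat"
jargon = "electroencephalography demonstrates characteristic oscillations"

# The jargon-heavy sample should score noticeably higher than the plain one.
print(text_density(plain), text_density(jargon))
```

A higher figure means longer words on average, which is the kind of hint I'd want alongside the word count, not instead of it.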