Old 09-05-2009, 02:53 PM   #121
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by Teyrnon View Post
What? I've already answered this question. I have no desire to endlessly reiterate myself so I'll try to break it down now. I've given two main reasons 1) Word count provides a useful metric for determining how fast a text can be read. People read words not individual letters.
Yes, a smaller numerical value is good, but you said it provided more information. A specific language has an average word length, so in the case where the text is typical the different measurements are equivalent. If the average word length in the text is atypical, then it seems to me that the reading speed is more proportional to the character count than to the word count. For very long words you increase the number of fixation points.
Old 09-05-2009, 02:55 PM   #122
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by HarryT View Post
The fact of the matter is, though, that ADE's 1k "page counts" are pretty much always reasonably near to the page count of the equivalent paperback. That's a simple matter of observation.
But that probably means that the ePub markup characters are also counted. Looking at some books, I would say that fewer than 1500 characters per page is not so common.
Old 09-05-2009, 03:34 PM   #123
Teyrnon
Groupie
Posts: 190
Karma: 384
Join Date: Jun 2009
Location: South Eastern United States
Device: jetBook, Kindle DX, Kindle 3, Kindle Fire, Nook Simple Touch
Quote:
Originally Posted by HarryT View Post
The fact of the matter is, though, that ADE's 1k "page counts" are pretty much always reasonably near to the page count of the equivalent paperback. That's a simple matter of observation.
Interesting. I've never used ADE so wouldn't know. I just grabbed a random paperback though and came up with about 4k on a particularly dense looking page.
Old 09-05-2009, 03:46 PM   #124
Teyrnon
Groupie
Posts: 190
Karma: 384
Join Date: Jun 2009
Location: South Eastern United States
Device: jetBook, Kindle DX, Kindle 3, Kindle Fire, Nook Simple Touch
Quote:
Originally Posted by tompe View Post
Yes, a smaller numerical value is good, but you said it provided more information. A specific language has an average word length, so in the case where the text is typical the different measurements are equivalent. If the average word length in the text is atypical, then it seems to me that the reading speed is more proportional to the character count than to the word count. For very long words you increase the number of fixation points.
Hmm, I may have overstated. What I get from word count is a more solid grasp of how fast the text might be read. As far as I can see, all character count gives me are larger numbers and a layer of abstraction that takes me beyond useful units of language. Neither of which seems desirable to me.

As far as text with long words, I can't think of too many situations where a book might be sufficiently heavy in long unfamiliar words to affect reading speed appreciably on the book as a whole. These things tend to average out.
Old 09-06-2009, 02:51 AM   #125
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by tompe View Post
But that probably means that the ePub markup characters are also counted. Looking at some books, I would say that fewer than 1500 characters per page is not so common.
If you were to include all the tags in the count the 1k "page" would be a lot shorter than if you only included the text, wouldn't it?
Old 09-06-2009, 04:56 PM   #126
tompe
Grand Sorcerer
Posts: 7,452
Karma: 7185064
Join Date: Oct 2007
Location: Linköpng, Sweden
Device: Kindle Voyage, Nexus 5, Kindle PW
Quote:
Originally Posted by HarryT View Post
If you were to include all the tags in the count the 1k "page" would be a lot shorter than if you only included the text, wouldn't it?
Yes, I must have been confused when I wrote my text...

Since you have an Opus you can easily take one page in a book and compare it with a paper book (and count the number of characters on the paper book page). Since I could not find any paper book with just 1000 characters per page, I would really like to have an example where it holds.
Old 09-12-2009, 10:40 PM   #127
msundman
Zealot
Posts: 103
Karma: 269
Join Date: Aug 2006
Device: FBReader on Android
Quote:
Originally Posted by Teyrnon View Post
Well, the basic unit of language is arguably the word.
That's indeed very arguable. E.g. it's definitely not sensible to count the Finnish word "epäjärjestelmällistyttämättömyydelläänsäkäänköhän" as 1 unit, as it's composed of over 10 suffix units altering the meaning of its base.

Quote:
Originally Posted by Teyrnon View Post
Sure words are different lengths but it does average out. Character count isn't reflective of the language nor does it really give a good idea of speed. Speed readers in particular tend to read words as discrete entities whether that word is 1 character long or 9 the word is read at the same speed.
I doubt that's correct. Although it's true that words are usually read whole, it's also true that longer words usually take longer to read than shorter words. AFAIK people almost completely "jump over" (i.e., the eye movement doesn't slow down significantly at) very short words, such as "a", when they read.

If speed readers read every word equally fast, then Finnish speed readers would finish books in considerably less time, but AFAIK this is not the case. AFAIK the number of characters more accurately reflects both the length of the text and the speed with which it's read.

Quote:
Originally Posted by Teyrnon View Post
Character count tells me nothing useful about the text; word count at least gives me an idea of how long it is in meaningful units of language.
Certainly characters are meaningful units of character-based languages, so I don't know what you're getting at.

Quote:
Originally Posted by Teyrnon View Post
Also, word count produces more manageable values. I'd rather talk about a 150,000 word novel rather than a 1.1 megabyte novel.
With SI-prefixes pretty much any size is as manageable as any other size. E.g., "123 kX" is as easy as "123 TX" to handle, even though the latter is 1,000,000,000 times as large.

And nobody suggested "byte" as anything meaningful in this sense so that "megabyte" thing is a straw man.

Quote:
Originally Posted by Teyrnon View Post
By the way, your 1K=1 Page figure seems a little odd. 1K in English works out to about 150 words. That's a really tiny page. If I'm not mistaken the average pocket size paperback is about 500 words to a page.
You're right, 1 kchar pages are very small. E.g., my Ender's Game pocket is 380 pages (and the pages are quite small) and has 453 kchars (and 107 kwords). I still think it's close enough, though. A 500 kchar book would be 500 very small pages or 250 large(ish) pages. No matter which of the page sizes you want to go with the conversion between it and kchars is very easy to do in your head.
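For anyone who wants to check the arithmetic, here is a minimal Python sketch using only the Ender's Game figures quoted above. The helper name `pages_from_kchars` is made up for illustration, and the two page sizes (1 and 2 kchars) are simply the "very small" and "large(ish)" cases from this post.

```python
# Figures quoted in this post for one pocket edition of Ender's Game.
pages = 380
chars = 453_000   # "453 kchars"
words = 107_000   # "107 kwords"

chars_per_page = chars / pages   # a bit under 1.2 kchars per page
words_per_page = words / pages   # a bit over 280 words per page
chars_per_word = chars / words   # a bit over 4.2 characters per word


def pages_from_kchars(kchars, kchars_per_page):
    """Hypothetical helper: estimate a page count from a size in kchars."""
    return kchars / kchars_per_page


print(round(chars_per_page), round(words_per_page), round(chars_per_word, 2))
# A 500 kchar book at 1 kchar/page and at 2 kchars/page:
print(pages_from_kchars(500, 1.0), pages_from_kchars(500, 2.0))
```

So this particular paperback holds roughly 1.2 kchars per page, which is why the 1 kchar "page" comes out a little small but in the right ballpark.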
Old 09-12-2009, 11:20 PM   #128
Teyrnon
Groupie
Posts: 190
Karma: 384
Join Date: Jun 2009
Location: South Eastern United States
Device: jetBook, Kindle DX, Kindle 3, Kindle Fire, Nook Simple Touch
Quote:
Originally Posted by msundman View Post
And nobody suggested "byte" as anything meaningful in this sense so that "megabyte" thing is a straw man.
Hold it right there. That's not a straw man, that's just me using a parlance I'm familiar with. I learned computers and programming in the late seventies and early eighties. Byte and character are synonymous in my head. Just reread where I said byte as character and my point still applies.
Old 09-12-2009, 11:32 PM   #129
Teyrnon
Groupie
Posts: 190
Karma: 384
Join Date: Jun 2009
Location: South Eastern United States
Device: jetBook, Kindle DX, Kindle 3, Kindle Fire, Nook Simple Touch
Quote:
Originally Posted by msundman View Post
You're right, 1 kchar pages are very small. E.g., my Ender's Game pocket is 380 pages (and the pages are quite small) and has 453 kchars (and 107 kwords). I still think it's close enough, though. A 500 kchar book would be 500 very small pages or 250 large(ish) pages. No matter which of the page sizes you want to go with the conversion between it and kchars is very easy to do in your head.
Why? Pages have no real usefulness here. It's not helpful to convert to pages which are a rather subjective and imprecise means of referencing the relative size of a text. Converting to word count would be more useful and that's not something I care to do in my head every time I'm trying to pick a book to be read in an allotted time frame.

Now, let me ask: what does character count tell you that's so useful? If it's because it's so easy to convert to pages I don't see that as a selling point for reasons already given earlier in this thread.
Old 09-13-2009, 01:03 AM   #130
msundman
Zealot
Posts: 103
Karma: 269
Join Date: Aug 2006
Device: FBReader on Android
Quote:
Originally Posted by Teyrnon View Post
Quote:
Originally Posted by msundman View Post
And nobody suggested "byte" as anything meaningful in this sense so that "megabyte" thing is a straw man.
Hold it right there. That's not a straw man, that's just me using a parlance I'm familiar with. I learned computers and programming in the late seventies and early eighties. Byte and character are synonymous in my head.
Bytes and characters are used differently even if one would use some encoding with 1 byte/character. Each glyph is a byte (including spaces, punctuation, etc.), but only the word characters and digits (0-9 and a-z in English) constitute characters in the sense I'm talking about.

Quote:
Originally Posted by Teyrnon View Post
Just reread where I said byte as character and my point still applies.
I don't see how "150 kwords" is more manageable than "1.5 Mchars", and "150,000 words" is even less manageable than "1.5 Mchars".

Quote:
Originally Posted by Teyrnon View Post
Quote:
Originally Posted by msundman View Post
A 500 kchar book would be 500 very small pages or 250 large(ish) pages. No matter which of the page sizes you want to go with the conversion between it and kchars is very easy to do in your head.
Why?
Because it's very simple to divide by 1 (or 1,000) and by 2 (or 2,000).

Quote:
Originally Posted by Teyrnon View Post
Pages have no real usefulness here. It's not helpful to convert to pages which are a rather subjective and imprecise means of referencing the relative size of a text.
I wholeheartedly agree, but others seem to be hung up on pages for some reason.

Quote:
Originally Posted by Teyrnon View Post
Converting to word count would be more useful and that's not something I care to do in my head every time I'm trying to pick a book to be read in an allotted time frame.
Why would you want to convert from character count to word count?!? (E.g., if it's just about what you are currently used to then I see no real reason, because you could just as well get used to character counts instead.)

However, if you really want to get some ballpark figure then X kchars is something between X/5 and X/4 kwords in English, and it's not very hard to divide by 5 and 4. I see no need to do such a conversion, though.
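As a quick sanity check of that ballpark, here is a minimal Python sketch. The function name `kwords_range` is hypothetical, and the only data used is the Ender's Game count given earlier in the thread (453 kchars, 107 kwords).

```python
def kwords_range(kchars):
    """Hypothetical helper: (low, high) English word estimate in kwords, using X/5 .. X/4."""
    return kchars / 5, kchars / 4


low, high = kwords_range(453)   # Ender's Game: 453 kchars
print(low, high)                # 90.6 113.25
print(low <= 107 <= high)       # the counted 107 kwords falls inside the range
```

That is only one book, of course, so it shows the rule of thumb is plausible rather than proving it.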

Quote:
Originally Posted by Teyrnon View Post
Now, let me ask: what does character count tell you that's so useful?
As I've already said, character count reflects quite accurately the length of the text. Much, much better than word count (or page count). It also reflects well the time it takes to read the text, and works in many different languages.

Quote:
Originally Posted by Teyrnon View Post
If it's because it's so easy to convert to pages I don't see that as a selling point for reasons already given earlier in this thread.
Page counts are not sensible to you and me, but they are to others, so even if this is not a selling point for you or me it certainly is for many others.
Old 09-13-2009, 07:52 AM   #131
Teyrnon
Groupie
Posts: 190
Karma: 384
Join Date: Jun 2009
Location: South Eastern United States
Device: jetBook, Kindle DX, Kindle 3, Kindle Fire, Nook Simple Touch
Quote:
Originally Posted by msundman View Post
Bytes and characters are used differently even if one would use some encoding with 1 byte/character. Each glyph is a byte (including spaces, punctuation, etc.), but only the word characters and digits (0-9 and a-z in English) constitute characters in the sense I'm talking about.
Mmhmm, as an artifact of my background I tend to think of digits, letters, punctuation marks, every glyph pretty much, all as characters. You obviously don't. Which is fine. For what we're talking about you're certainly more correct.

Quote:
Originally Posted by msundman View Post
I don't see how "150 kwords" is more manageable than "1.5 Mchars", and "150,000 words" is even less manageable than "1.5 Mchars".
Smaller numbers are easier to deal with. And frankly I'd rather not have to use SI units when talking about literature. I can't see any practical benefit to character counts but I'll get into that further below.

Quote:
Originally Posted by msundman View Post
Why would you want to convert from character count to word count?!? (E.g., if it's just about what you are currently used to then I see no real reason, because you could just as well get used to character counts instead.)
I know I read 200WPM. I know word recognition is largely glyphic. As in words themselves are glyphs. When one is reading one is reading in whole word chunks for the most part. Individual letters only become significant if one doesn't immediately recognize the word. This would seem to make the word the most significant unit in this transaction.

Quote:
Originally Posted by msundman View Post
However, if you really want to get some ballpark figure then X kchars is something between X/5 and X/4 kwords in English, and it's not very hard to divide by 5 and 4. I see no need to do such a conversion, though.
That's the thing, I don't want a rough guess. I want a word count because I read words not letters. The letters make up the words yes, but when I read the word "yes" I see yes not Y-E-S. It's an instant recognition as a single glyph. The letters are there but they have no meaning outside of the pattern that constitutes the word.

Let's try an example from chemistry. Character count seems to me like insisting that, rather than giving atomic numbers as the number of protons in each atom to identify the individual elements, we instead give the number of quarks in the atoms. Sure, atoms are ultimately constituted of quarks, but quarks have little relevance to chemistry. It may also confuse matters, since the same number of quarks might represent vastly different atoms and arrangements.


Quote:
Originally Posted by msundman View Post
As I've already said, character count reflects quite accurately the length of the text. Much, much better than word count (or page count). It also reflects well the time it takes to read the text, and works in many different languages.
Okay, have you tested this? How many characters per minute do you read? Your certain the figure doesn't change depending on how those letters are arranged into words? Do you really break everything down into individual letters as you read?

I'm having trouble imagining a situation where I wouldn't find character count obtuse and cumbersome when deciding how fast a document will be read. Ideally I'd like to see both a character count and a word count, that'd give me an interesting way to measure the density of a text. A high character count and a disproportionately low word count would suggest a technical document filled with lengthy words; technical and scientific jargon. However on average I'd find word count more useful so I argue for that.

Last edited by Teyrnon; 09-13-2009 at 08:17 AM.
Old 09-13-2009, 08:11 AM   #132
Teyrnon
Groupie
Posts: 190
Karma: 384
Join Date: Jun 2009
Location: South Eastern United States
Device: jetBook, Kindle DX, Kindle 3, Kindle Fire, Nook Simple Touch
Quote:
Originally Posted by msundman View Post
That's indeed very arguable. E.g. it's definitely not sensible to count the Finnish word "epäjärjestelmällistyttämättömyydelläänsäkäänköhän" as 1 unit, as it's composed of over 10 suffix units altering the meaning of its base.
Okay, I don't speak or read Finnish. I've never studied it nor seen a text written in it. I haven't a clue what that word is nor what it means. I don't know if you're citing an example of something that's common in the language or some sort of technical term. Is this word an example of what one would see in everyday literature? Are Finnish books filled with words of that length? How common are they?

Quote:
Originally Posted by msundman View Post
I doubt that's correct. Although it's true that words are usually read whole, it's also true that longer words usually take longer to read than shorter words. AFAIK people almost completely "jump over" (i.e., the eye movement doesn't slow down significantly at) very short words, such as "a", when they read.

If speed readers read every word equally fast, then Finnish speed readers would finish books in considerably less time, but AFAIK this is not the case. AFAIK the number of characters more accurately reflects both the length of the text and the speed with which it's read.
That's the thing, I know what I said holds true for English. I can't speak for Finnish but you seem to be suggesting that 50 character words are common. Yeah, I can see where such words might be difficult to process as a single glyph.

Quote:
Originally Posted by msundman View Post
Certainly characters are meaningful units of character-based languages, so I don't know what you're getting at.
What I'm getting at is that by and large it's words not letters that are sensible in reading a text but as I stated above I can only speak in terms of the English language with any certainty. This may not hold true for all or even most languages, I really don't know.
Old 09-13-2009, 08:28 PM   #133
msundman
Zealot
Posts: 103
Karma: 269
Join Date: Aug 2006
Device: FBReader on Android
Quote:
Originally Posted by Teyrnon View Post
Mmhmm, as an artifact of my background I tend to think of digits, letters, punctuation marks, every glyph pretty much, all as characters. You obviously don't. Which is fine. For what we're talking about you're certainly more correct.
I also think about all glyphs as characters (I'm biased towards unicode, so I'm even into "supplementary characters"), but when counting book lengths I would count only "word characters", because other characters aren't "read" in the same sense, but are there for different kinds of structuring and markup. Heck, one could even think of text formatting as characters (e.g., the "horizontal tab" and the "form feed" characters).

I was only trying to say that I think "word character" counts are particularly suitable for measuring text size.
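To make the distinction concrete, here is a rough Python sketch of the different ways one could measure the same text. It assumes UTF-8 for the byte count and treats "word character" as anything `str.isalnum()` accepts, which is only one possible reading of the definition above. The function name `text_sizes` and the sample string are made up for illustration.

```python
import unicodedata


def text_sizes(text):
    """Hypothetical helper: measure one text in bytes, code points, word characters and words."""
    # Normalise first so that "ä" is one code point however it was typed.
    text = unicodedata.normalize("NFC", text)
    return {
        "bytes_utf8": len(text.encode("utf-8")),          # depends on the encoding
        "code_points": len(text),                         # every character, incl. spaces
        "word_chars": sum(ch.isalnum() for ch in text),   # letters and digits only
        "words": len(text.split()),                       # whitespace-separated tokens
    }


print(text_sizes("Autoni on täällä!"))
# {'bytes_utf8': 20, 'code_points': 17, 'word_chars': 14, 'words': 3}
```

The four numbers differ even for a three-word string, so it matters which one a reader application reports as the "length".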

Quote:
Originally Posted by Teyrnon View Post
Smaller numbers are easier to deal with. And frankly I'd rather not have to use SI units when talking about literature.
What on earth do you have against SI units? Not only are they extremely handy, the actual composed units easily become units of their own. E.g., people hardly ever think of "cm" as "m/100" or "km" as "m*1,000", but instead people see "km", "m" and "cm" as units themselves. Yet switching between these units remains incredibly easy.

That said, "1.5" in "1.5 Mchars" is a smaller number than "150" in "150 kwords".

Quote:
Originally Posted by Teyrnon View Post
Quote:
Originally Posted by msundman View Post
Why would you want to convert from character count to word count?!? (E.g., if it's just about what you are currently used to then I see no real reason, because you could just as well get used to character counts instead.)
I know I read 200WPM.
What you're currently used to should be irrelevant, as I said in what you quoted.

Quote:
Originally Posted by Teyrnon View Post
I know word recognition is largely glyphic. As in words themselves are glyphs. When one is reading one is reading in whole word chunks for the most part. Individual letters only become significant if one doesn't immediately recognize the word. This would seem to make the word the most significant unit in this transaction.
If that was the only relevant variable then I would agree with you. However, since words of different length are read with different speed (even though they are read a word at a time) and since words of different length take up more space, it seems that there might be some other way to more accurately describe the speed and size of some text. AFAIK character count is such a way.

Quote:
Originally Posted by Teyrnon View Post
That's the thing, I don't want a rough guess.
Why? Your "200 words/minute" is very rough in itself. You probably read a lot more words in a minute if there's lots of small words, and a lot fewer words in a minute if there's lots of large words.

Quote:
Originally Posted by Teyrnon View Post
I want a word count because I read words not letters. The letters make up the words yes, but when I read the word "yes" I see yes not Y-E-S. It's an instant recognition as a single glyph. The letters are there but they have no meaning outside of the pattern that constitutes the word.
This is a largely philosophical point of view, and even as such it's highly debatable. E.g., one could count sentences instead of words, and claim that words have no meaning outside of the pattern that constitutes the sentence. (Words do have a meaning outside sentences, just as characters have a meaning outside words, although outside their context they carry less information.)

Quote:
Originally Posted by Teyrnon View Post
Let's try an example from chemistry. Character count seems to me like insisting that, rather than giving atomic numbers as the number of protons in each atom to identify the individual elements, we instead give the number of quarks in the atoms. Sure, atoms are ultimately constituted of quarks, but quarks have little relevance to chemistry. It may also confuse matters, since the same number of quarks might represent vastly different atoms and arrangements.
The comparison is invalid. It's more like word count is like atom count whereas character count is akin to counting protons (and possibly neutrons). The former might be more important in some regards, but when we're trying to figure out the total mass then the latter wins hands down.

Also, you seem to think I'm proposing to use character counts for no reason at all, when in reality I've even outlined the reasons.

Quote:
Originally Posted by Teyrnon View Post
Quote:
Originally Posted by msundman View Post
As I've already said, character count reflects quite accurately the length of the text. Much, much better than word count (or page count). It also reflects well the time it takes to read the text, and works in many different languages.
Okay, have you tested this? How many characters per minute do you read?
I have tested it, but only on a very small scale. Also, I'm not equally fluent in all languages I know, so inter-language tests are a bit unreliable. The subject also plays a large role here. E.g., I read fiction novels much faster than scientific papers. All these variables are the same for both character counts and word counts, so neither is better or worse because of these variations in the text.

Since I didn't have any results of old tests I just now made a few new tests. Here are the results:
Code:
#   time   words chars   wpm   cpm   avg.w.len
1 00:02:09   607  2581   282  1200   4.25
2 00:00:36   150   776   250  1293   5.17
3 00:07:02  2166  9548   308  1358   4.41
4 00:00:30   134   651   268  1302   4.86
5 00:02:51   503  3893   176  1366   7.74
6 00:01:16   405  1876   320  1481   4.63
All were different texts by different authors. Texts #1-4 were in English, #5 in Finnish and #6 in Swedish.

Now, the smaller the following deviations from the average reading rate are, the better the measure is.

English only:
Word counts: 2%, 10%, 11%, 3%
Char counts: 7%, 0%, 5%, 1%

All languages:
Word counts: 6%, 7%, 15%, 0%, 34%, 20%
Char counts: 10%, 3%, 2%, 2%, 2%, 11%

So, the averages of those deviations are:

Word counts:
  • English only: 7%
  • All languages: 14%
Character counts:
  • English only: 3%
  • All languages: 5%

So, character counts are, at least in this case, 2-3 times as accurate as word counts.
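If anyone wants to redo the calculation, here is a minimal Python sketch that starts from the raw times and counts in the table above. The names `TESTS`, `rates` and `mean_rel_dev` are my own, and the spread measure is assumed to be the mean absolute deviation from the average rate, expressed as a fraction of that average.

```python
from statistics import mean

# (test number, language, seconds, words, chars), copied from the table above.
TESTS = [
    (1, "en", 2 * 60 + 9, 607, 2581),
    (2, "en", 36, 150, 776),
    (3, "en", 7 * 60 + 2, 2166, 9548),
    (4, "en", 30, 134, 651),
    (5, "fi", 2 * 60 + 51, 503, 3893),
    (6, "sv", 60 + 16, 405, 1876),
]


def rates(rows):
    """Return (words per minute, chars per minute) lists for the given rows."""
    wpm = [words / (secs / 60) for _, _, secs, words, _ in rows]
    cpm = [chars / (secs / 60) for _, _, secs, _, chars in rows]
    return wpm, cpm


def mean_rel_dev(values):
    """Mean absolute deviation from the average, as a fraction of the average."""
    avg = mean(values)
    return mean(abs(v - avg) / avg for v in values)


english = [row for row in TESTS if row[1] == "en"]
for name, rows in (("English only", english), ("All languages", TESTS)):
    wpm, cpm = rates(rows)
    print(f"{name}: words {mean_rel_dev(wpm):.1%}, chars {mean_rel_dev(cpm):.1%}")
```

Run on unrounded rates this gives about 6.5% vs 3.4% for English only and 13.5% vs 5.1% for all languages, in line with the 7%/3% and 14%/5% above, which were averaged from already rounded percentages. With only six short texts this is an illustration, not a proper study.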

Quote:
Originally Posted by Teyrnon View Post
Your certain the figure doesn't change depending on how those letters are arranged into words? Do you really break everything down into individual letters as you read?
My what?
Of course the meaning behind the letters and the words changes the speed with which you read.
I've never even hinted that I would break down words into individual letters as I read, and in fact I've said that words are usually read whole. (It's actually more complicated than that. Pupil tracking shows quite complex patterns. This is all beside the point, though.)

Quote:
Originally Posted by Teyrnon View Post
I'm having trouble imagining a situation where I wouldn't find character count obtuse and cumbersome when deciding how fast a document will be read.
Why?!? Once more, if it's only because you're currently not used to character counts as much as you're used to word counts then I don't see how that would be a real reason since you would adjust very quickly.

Quote:
Originally Posted by Teyrnon View Post
However on average I'd find word count more useful so I argue for that.
You have yet to provide any valid reason for that. The only reason I've deciphered from your replies is the philosophical "words are read whole". I, OTOH, have described how character counts more accurately reflect both the size of the text and the speed at which it is read. IMO my arguments are better because of their pragmatic nature, and thus better suited to this pragmatic problem.

Quote:
Originally Posted by Teyrnon View Post
Quote:
Originally Posted by msundman View Post
E.g. it's definitely not sensible to count the Finnish word "epäjärjestelmällistyttämättömyydelläänsäkäänköhän" as 1 unit, as it's composed of over 10 suffix units altering the meaning of its base.
Is this word an example of what one would see in everyday literature? Are Finnish books filled with words of that length? How common are they?
No, that word is extreme. However, many words in English are turned into suffixes in Finnish. E.g., "car" is "auto", but "my car" is "autoni" and "your car" is "autosi".
As for average word lengths, as you can see from my tests outlined above English and Swedish have an average word length of 4-5 characters whereas that number in Finnish is closer to 8.

Quote:
Originally Posted by Teyrnon View Post
Quote:
Originally Posted by msundman View Post
I doubt that's correct. Although it's true that words are usually read whole, it's also true that longer words usually take longer to read than shorter words. AFAIK people almost completely "jump over" (i.e., the eye movement doesn't slow down significantly at) very short words, such as "a", when they read.

If speed readers would read every word equally fast then Finnish speed readers would finish books in considerably less time, but AFAIK this is not the case. AFAIK the number of characters more accurately reflects both the length of the text and the speed with which it's read.
That's the thing, I know what I said holds true for English. I can't speak for Finnish but you seem to be suggesting that 50 character words are common. Yeah, I can see where such words might be difficult to process as a single glyph.
I don't believe for a second that your claim, that words of different lengths are read at the same speed, is correct. Not in English and not in any other language. I've seen pupil tracking of different people reading, and although they certainly don't read individual characters, they do tend to spend more time on longer words than on shorter ones, and almost completely jump over very short words such as "a".

I haven't suggested that 50 character words are common in Finnish. They are in fact not. However, words tend to be significantly longer, on average, in Finnish than in English.

I haven't been arguing against processing words as single entities, so would you stop arguing against that straw-man, please?

Quote:
Originally Posted by Teyrnon View Post
Quote:
Originally Posted by msundman View Post
Certainly characters are meaningful units of character-based languages, so I don't know what you're getting at.
What I'm getting at is that by and large it's words not letters that are sensible in reading a text
Both are very much sensible, as are sentences as well as more abstract word structures. My point is that the character count reflects both the size of a text and the speed with which it's read more accurately than the word count. Your arguments seem to be of a more philosophical nature, and while I'm a big fan of letting The Right Way(tm) triumph over hyperpragmatism I don't see a very big difference between characters and words in the sense you're trying to convey, while I see a big difference in accuracy in favor of character counts.

Last edited by msundman; 09-13-2009 at 08:48 PM. Reason: added some missing numbers
Old 09-13-2009, 09:11 PM   #134
msundman
Zealot
Posts: 103
Karma: 269
Join Date: Aug 2006
Device: FBReader on Android
Quote:
Originally Posted by Teyrnon View Post
Speed readers in particular tend to read words as discrete entities; whether a word is 1 character long or 9, it is read at the same speed.
I'm not a speed reader, but my tests above give a fairly clear indication that I read longer words more slowly than shorter ones. The first column is the average word length and the second column is 1300 divided by words per minute (I've inverted and normalized it so the relation is easier to see, but the same relation is obviously there without the 1300/wpm division):
Code:
4.25	4.60
5.17	5.20
4.41	4.22
4.86	4.85
7.74	7.37
4.63	4.07
So, when the average word length increases the average WPM decreases and vice versa.
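
For anyone who wants to check the relation numerically, here is a small Python sketch that computes the Pearson correlation for the six rows in the table above. The variable names and the choice of Pearson correlation are assumptions for illustration, not part of the original tests. With only six samples, one of which has a much longer average word length than the rest, the result should be read as a rough indication rather than a proof.

```python
from math import sqrt

# (average word length, 1300 / words per minute), copied from the table above.
rows = [
    (4.25, 4.60),
    (5.17, 5.20),
    (4.41, 4.22),
    (4.86, 4.85),
    (7.74, 7.37),
    (4.63, 4.07),
]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


word_lengths = [row[0] for row in rows]
inverse_speed = [row[1] for row in rows]

# Undo the normalization to recover plain words per minute.
wpm = [1300 / value for value in inverse_speed]

r_inverse = pearson(word_lengths, inverse_speed)
r_wpm = pearson(word_lengths, wpm)

print("word length vs 1300/wpm:", round(r_inverse, 2))
print("word length vs wpm:     ", round(r_wpm, 2))
```

A value near +1 for the first pair and a negative value for the second would match the statement above: longer average words go with fewer words per minute.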
Old 09-13-2009, 09:39 PM   #135
Teyrnon
Groupie
Posts: 190
Karma: 384
Join Date: Jun 2009
Location: South Eastern United States
Device: jetBook, Kindle DX, Kindle 3, Kindle Fire, Nook Simple Touch
Quote:
Originally Posted by msundman View Post
So, character counts are, at least in this case, 2-3 times as accurate as word counts.
It would seem so. I concede that character counts would seem to have value here.


Quote:
Originally Posted by msundman View Post
Why?!? Once more, if it's only because you're currently not used to character counts as much as you're used to word counts then I don't see how that would be a real reason since you would adjust very quickly.
I still disagree, but you've given me enough to think about that I need to stew on it a while. As for what I'm really used to, that would be column inches, which are about as useless as page counts for what we're discussing. That's neither here nor there.

Quote:
Originally Posted by msundman View Post
I haven't suggested that 50 character words are common in Finnish. They are in fact not. However, words tend to be significantly longer, on average, in Finnish than in English.
Apologies if I implied you had. I was still trying to understand how significant the really long word you cited was in terms of word length for Finnish. I understand now that it was an extreme example. Thanks.

Quote:
Originally Posted by msundman View Post
Both are very much sensible, as are sentences as well as more abstract word structures. My point is that the character count reflects both the size of a text and the speed with which it's read more accurately than the word count. Your arguments seem to be of a more philosophical nature, and while I'm a big fan of letting The Right Way(tm) triumph over hyperpragmatism I don't see a very big difference between characters and words in the sense you're trying to convey, while I see a big difference in accuracy in favor of character counts.
You've made some excellent points and I'm forced to amend my position somewhat. Character counts would be useful. In a weird way now I want both. I need to research this further and give it more thought.

I find myself wondering if phoneme count might have comparable accuracy. That's just getting silly though.

All times are GMT -4.