Old 03-05-2008, 03:27 AM   #20
Richard Herley
Author
Posts: 203
Karma: 1164907
Join Date: Feb 2008
Location: Norfolk, England
Device: Kindle Oasis
Thanks for your list, BenG. Non-fiction is no problem at all -- the Dewey system deals with that. But as Nekokami says, fiction is not really classifiable. Some fiction overlaps more than one genre. How would Secker and Warburg, or a bookstore, have presented 1984 when it was first published? As sci-fi?

Please bear with me here while I digress. I am a bird-watcher, and some years ago I wrote some software to analyse my records (thousands and thousands of them, going back 45 years). Among other things, I was making a study of two lakes, quite unlike each other, and I wanted to know how the difference between them was reflected in the birdlife. So I wrote a sub-program to assemble an "association table", as I called it.

An association table is compiled for each individual species. Each record in my database has 5 fields: site, date, species, number counted, remarks. For this study I was only interested in site, date and species. The association table lists, for an individual species, all the other species present in the block of records being investigated (either for a single site or a group of sites).

The result is a list showing the percentage of dates on which other species occur with the one being investigated -- which obviously is always present 100% of the time. Say the target species is Mallard. The table might look like this:

Mallard 100%
Coot 97%
Moorhen 93%
Teal 85%
...
Green Woodpecker 4%
Tree Creeper 3%
Wood Warbler 1%

Coot, Moorhen and Teal all get high scores because they're water-birds and likely to occur in the same habitat. The others score lower because they're less likely to occur in that habitat.
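
For anyone who wants to experiment, the calculation described above can be sketched in a few lines of Python. This is not the original BASIC program, only a minimal illustration under some assumptions: records are reduced to (site, date, species) tuples, records from the selected sites are pooled by date, and the function name and sample sightings are invented for the example.

```python
from collections import defaultdict

def association_table(records, target, sites=None):
    """Return [(species, percent)] for species seen on the same dates as target.

    records: iterable of (site, date, species) tuples (assumed layout).
    sites:   optional collection of site names; None means use every site.
    """
    # Gather the set of species recorded on each date within the chosen sites.
    species_by_date = defaultdict(set)
    for site, date, species in records:
        if sites is None or site in sites:
            species_by_date[date].add(species)

    # Only dates on which the target species was present count towards the table.
    target_dates = [d for d, seen in species_by_date.items() if target in seen]
    if not target_dates:
        return []

    # Count, for every species, the number of those dates on which it occurred.
    counts = defaultdict(int)
    for d in target_dates:
        for species in species_by_date[d]:
            counts[species] += 1

    total = len(target_dates)
    table = [(species, 100.0 * n / total) for species, n in counts.items()]
    # Target first (always 100%), then by falling percentage, then by name.
    table.sort(key=lambda row: (row[0] != target, -row[1], row[0]))
    return table

# Invented sample sightings, purely for illustration.
sample = [
    ("Lake A", "2008-01-01", "Mallard"),
    ("Lake A", "2008-01-01", "Coot"),
    ("Lake A", "2008-01-02", "Mallard"),
    ("Lake A", "2008-01-02", "Coot"),
    ("Lake A", "2008-01-02", "Teal"),
    ("Lake A", "2008-01-03", "Coot"),
    ("Lake B", "2008-01-04", "Mallard"),
]

for species, percent in association_table(sample, "Mallard", sites={"Lake A"}):
    print(f"{species} {percent:.0f}%")
```

Running it once per lake and comparing the two tables would give the kind of site comparison described earlier.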

For habitat, read "fiction genre". For species, read "title". For date, read "one person's opinion".

What's needed is a large number of lists of people's favourite fiction titles. Each list can have anything from 2 entries upwards. Given a large enough sample, you could get a pretty clear idea of where a book fitted into the scheme of things. Provided there were enough lists to analyse, it wouldn't matter if the entries in each list were wildly disparate -- if, say, you put James Joyce next to Edgar Rice Burroughs.

It's a bit like that Amazon feature: "people who bought this title also bought these", except that it's much more detailed.

I already have the software, though it's written in BASIC (I wrote my bird program in 1996). If anyone can think of a way to get people to submit lists, we might have the beginnings of a way to classify fiction.

If, furthermore, people gave points (say out of 5) to each title on their list, that would make the data richer still.
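
To make the idea concrete, here is a rough Python sketch of the same association table applied to favourites lists. It rests on assumptions of mine: each submitted list is a dictionary mapping a title to its points out of 5 (or None where no points were given), the function name is made up, and the points are used in only one of several possible ways, namely as the average score a co-listed title gets from the people who also listed the target. The sample lists are invented.

```python
from collections import defaultdict

def title_associations(lists, target):
    """Return [(title, percent_of_lists, mean_points)] for titles listed with target.

    lists: iterable of dicts {title: points or None} (assumed input format).
    """
    # One person's list plays the part that one date played in the bird records.
    containing = [favourites for favourites in lists if target in favourites]
    if not containing:
        return []

    counts = defaultdict(int)    # how many of those lists include each other title
    points = defaultdict(list)   # the points given to that title on those lists
    for favourites in containing:
        for title, score in favourites.items():
            if title == target:
                continue
            counts[title] += 1
            if score is not None:
                points[title].append(score)

    total = len(containing)
    rows = []
    for title, n in counts.items():
        scores = points.get(title, [])
        mean = sum(scores) / len(scores) if scores else None
        rows.append((title, 100.0 * n / total, mean))
    # Strongest association first; ties broken alphabetically.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows

# Invented lists, purely for illustration.
sample_lists = [
    {"1984": 5, "Brave New World": 4, "Animal Farm": 5},
    {"1984": 4, "Brave New World": 5},
    {"1984": 3, "Ulysses": 2},
    {"Ulysses": 5, "Tarzan of the Apes": 3},
]

for title, percent, mean in title_associations(sample_lists, "1984"):
    print(f"{title} {percent:.0f}% (mean points: {mean})")
```

With only four lists the percentages mean nothing, of course; the point is just that the input format is simple enough for people to submit by hand.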