pointless statistics


The American Amazon site (www.amazon.com) now provides a series of pointless statistics about books. Just look at a book's page: if they have analysed it, there will be a list of SIPs (statistically improbable phrases) at the top, and "Concordance" and "Text Stats" links under the "Inside This Book" section.

So for instance I now know that Kurt Vonnegut's Cat's Cradle has more references to "bad chemicals" than any other book. That 98% of books are easier to read than Faulkner's Absalom, Absalom!, and that it contains over 132,000 words but only 3000 sentences. That some of the most popular words in Moby Dick are (unsurprisingly) "Ahab" and "whale". And that War and Peace apparently contains more words than any other book (bearing in mind they haven't analysed them all yet) and that you get a massive 51,707 words per dollar.
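For what it's worth, the Absalom figures work out to roughly 44 words a sentence. A quick sketch of the arithmetic, using only the rounded numbers quoted above (the variable names are mine, not Amazon's):

```python
# Rounded figures as quoted in the post, not exact counts.
words = 132_000
sentences = 3_000

# Average sentence length implied by those two numbers.
words_per_sentence = words / sentences
print(f"about {words_per_sentence:.0f} words per sentence")
```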

I have no idea what possible use any of this is, but I like that it is there.

I love crazy stats - I like looking at the weird things this site throws up (sic). But best of all I love cricket statistics. As batsmen are in I work out, run by run, what is happening to their average. I love strike rates, proportion of boundaries etc etc. It's a hopeless case.
Enzo
Well, this is the best thing I've seen in a long time. Bit confused though: Vladimir and Nabokov are in the top 100 words used in Lolita. Odd, that... Having just looked at a couple more books, I see the author's name is in the top 100 words for all of them. Surely not? Are they making this up?
There are lies, damnable lies, then statistics. Who said that? (Even H L Mencken doesn't know.)

 

Nabokov is a common word in Lolita because his name's at the top of every other page (you can look up the citations). I was reminded, playing with this, that in David Lodge's Small World there is a character who, after a computer analyses his books and discovers the word he uses most is "grease", cannot write another word of fiction. Every time he sits down and thinks of an adjective, the only one that springs to mind is greasy. It's on pages 184-185 if you want to look it up on Amazon. Disappointingly, David Lodge does not seem to suffer from the undue preponderance of any single word. Although he does use "lapel badge" a lot.
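If anyone wants to see how a running header could push an author's name up a word count, here's a rough Python sketch. The page texts and the top_words function are invented for illustration, and I'm only guessing that Amazon counts something like this; it isn't their actual method.

```python
from collections import Counter
import re

def top_words(pages, n=3):
    """Count words across all pages and return the n most common."""
    counts = Counter()
    for page in pages:
        # Lowercase everything and pull out runs of letters/apostrophes.
        counts.update(re.findall(r"[a-z']+", page.lower()))
    return counts.most_common(n)

# Made-up "scanned" pages: every other page starts with the author's
# name as a running header, which gets counted like any other text.
pages = [
    "Vladimir Nabokov\nshe was lo plain lo in the morning",
    "Lolita\nshe was lola in slacks",
    "Vladimir Nabokov\nshe was dolly at school",
    "Lolita\nshe was dolores on the dotted line",
]

for word, count in top_words(pages, n=10):
    print(word, count)
```

In this toy sample "nabokov" turns up twice even though the body text never mentions him, purely because of the headers.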

 
