# Book Statistics

Discussion in 'Community Discussion' started by scem0, Jan 9, 2011.

1. ### scem0 macrumors 604

Joined:
Jul 16, 2002
Location:
back in NYC!
#1
First off, hello to everyone on MR who remembers me (and those who don't, as well). I used to be a prolific poster here (read: major MR addict).

I'm writing a program that takes the text of a book and compiles interesting information about it. For example, it might take Harry Potter and the Sorcerer's Stone and tell you that the top six-word phrase is "he-who-must-not-be-named" and that it is used 54 times. It might tell you that the average word length is 4.3254 characters, or that the most used letter in the book is 'e', which appears 66385 times. You get the idea - calculated statistics about books. I'm figuring out what those statistics will be right now.
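A minimal Python sketch of the three statistics mentioned above, assuming the whole book is available as one plain-text string. The function name `book_stats`, the `phrase_len` parameter, and the tokenizing rule (runs of letters, with apostrophes kept and hyphens treated as word breaks) are illustrative assumptions, not the program described in the post.

```python
import re
from collections import Counter


def book_stats(text, phrase_len=6):
    """Compute a few simple statistics for a book given as one string."""
    lowered = text.lower()
    # Assumption: a word is a run of letters, optionally with inner apostrophes.
    # Hyphens split words, so "he-who-must-not-be-named" counts as six words.
    words = re.findall(r"[a-z]+(?:'[a-z]+)*", lowered)
    # Count every alphabetic character, ignoring case.
    letters = Counter(ch for ch in lowered if ch.isalpha())
    # Slide a window of phrase_len words over the text and count each phrase.
    phrases = Counter(
        " ".join(words[i:i + phrase_len])
        for i in range(len(words) - phrase_len + 1)
    )
    return {
        "avg_chars_per_word": sum(len(w) for w in words) / len(words) if words else 0.0,
        "top_letter": letters.most_common(1)[0] if letters else None,
        "top_phrase": phrases.most_common(1)[0] if phrases else None,
    }
```

Calling `book_stats(open("book.txt").read())` (the filename is hypothetical) returns a dictionary holding the average word length plus `(item, count)` pairs for the top letter and top phrase.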

So, the question is: What statistics are you interested in knowing about your favorite books?

Keep in mind, these must be calculable figures. That is to say, I can't calculate whether or not a book uses more happy or sad words. I can't calculate the number of protagonists in a book. Most qualitative stuff is out, unfortunately.

Thanks for the input!

Emerson

2. ### leekohler macrumors G5

Joined:
Dec 22, 2004
Location:
Chicago, Illinois
#2
Hey you! Good to see you posting!

To tell the truth, I'm interested in knowing how many books people here read.

3. ### thejadedmonkey macrumors 604

Joined:
May 28, 2005
Location:
Pennsylvania
#3
Most used word.
Most used adjective (the most used word that also appears in a list of adjectives, maybe? Is that too qualitative?)
Most used first word of a paragraph
Most used sentence (e.g., does the author like a certain expression/phrase?)
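These four counts could be sketched in Python roughly as follows. The assumptions are that paragraphs are separated by blank lines, that sentences end in `.`, `!` or `?`, and that the adjective list is something you supply yourself; all function names here are made up for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: words are runs of letters, optionally with inner apostrophes.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())


def most_used_word(text, allowed=None):
    """Most common word; pass a set (e.g. of adjectives) to restrict the count."""
    words = tokenize(text)
    if allowed is not None:
        words = [w for w in words if w in allowed]
    return Counter(words).most_common(1)


def most_used_paragraph_opener(text):
    """Most common first word of a paragraph (paragraphs split on blank lines)."""
    openers = []
    for paragraph in re.split(r"\n\s*\n", text):
        words = tokenize(paragraph)
        if words:
            openers.append(words[0])
    return Counter(openers).most_common(1)


def most_used_sentence(text):
    """Most common sentence, compared after lowercasing and dropping punctuation."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    normalized = [" ".join(tokenize(s)) for s in sentences]
    return Counter(s for s in normalized if s).most_common(1)
```

Each function returns a one-element list of `(item, count)`, or an empty list if nothing was found. The sentence splitter is deliberately naive and will trip over abbreviations such as "Mr." or "e.g.".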

Leekohler, I love to read, but unfortunately I find myself only reading about a book a month on average, as they are all 500-page textbooks!

If you're into fantasy, a great series starts with "The Name of the Wind" by Patrick Rothfuss; the second book should be coming out in a few months. In the same genre, I also recommend Sabriel by Garth Nix... I recently found out that it's a few of my friends' favorite book, and they all discovered it independently!

4. ### leekohler macrumors G5

Joined:
Dec 22, 2004
Location:
Chicago, Illinois
#4
I'm not really into fiction at all. I like biographies and such.

5. ### snberk103 macrumors 603

Joined:
Oct 22, 2007
Location:
An Island in the Salish Sea
#5
If the books you are analyzing are fiction, then names are important. If you assume that any word that is both capitalized and not the first word in a sentence is a name (place name, character name, etc.), then it shouldn't be too hard to calculate a bunch of stuff about names.

• How often a name appears.
• How often it appears near other names (within 10 words, or 15, 20, etc.)
• How often two or more names appear in the same paragraph (which is not the same as the item above)
• How many paragraphs a name appears in...
• etc
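A rough Python sketch of the capitalization heuristic and three of the counts above (mentions, paragraphs per name, and pairs of names sharing a paragraph). It assumes blank-line paragraph breaks and a naive sentence splitter, and the helper names are invented for illustration. The "within N words" count is not covered here.

```python
import re
from collections import Counter
from itertools import combinations


def find_names(paragraph):
    """Return capitalized words that are not the first word of a sentence."""
    names = []
    for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
        words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", sentence)
        for word in words[1:]:  # skip the sentence-initial word
            # Assumption: the pronoun "I" and its contractions are not names.
            if word[0].isupper() and word != "I" and not word.startswith("I'"):
                names.append(word)
    return names


def name_stats(text):
    mentions = Counter()         # how often each name appears
    paragraphs_with = Counter()  # how many paragraphs each name appears in
    pairs = Counter()            # how many paragraphs each pair of names shares
    for paragraph in re.split(r"\n\s*\n", text):
        names = find_names(paragraph)
        mentions.update(names)
        unique = sorted(set(names))
        paragraphs_with.update(unique)
        pairs.update(combinations(unique, 2))
    return mentions, paragraphs_with, pairs
```

The heuristic misses names that open a sentence and treats possessives such as "Harry's" as separate names, so the counts are approximate.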

Could be a cool project. Good Luck.

6. ### leekohler macrumors G5

Joined:
Dec 22, 2004
Location:
Chicago, Illinois
#6
You know what else could be interesting? Find out how many books change the spellings of common names, for example, spelling "Jennifer" as "Genifer" or using "Mikal" for "Michael".
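One way this might be approximated in Python is fuzzy string matching with the standard library's `difflib`. The reference list `COMMON_NAMES`, the function name, and the 0.6 similarity cutoff are all assumptions for the sake of the sketch; a real list of common names would have to come from somewhere else.

```python
import difflib

# Hypothetical reference list; a real one would be much longer.
COMMON_NAMES = ["Jennifer", "Michael", "Katherine"]


def likely_respellings(names_in_book, common_names=COMMON_NAMES, cutoff=0.6):
    """Map each unusual name to the common name it most resembles, if any."""
    known = {name.lower(): name for name in common_names}
    result = {}
    for name in names_in_book:
        if name.lower() in known:
            continue  # already a standard spelling
        # get_close_matches returns the best candidates at or above the cutoff.
        match = difflib.get_close_matches(name.lower(), list(known), n=1, cutoff=cutoff)
        if match:
            result[name] = known[match[0]]
    return result
```

String similarity only catches respellings that look alike on the page, and it can produce false matches between genuinely different names, so the cutoff would need tuning.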

7. ### scem0 thread starter macrumors 604

Joined:
Jul 16, 2002
Location:
back in NYC!
#7
Those are really, really awesome suggestions, guys! I especially love most used first word of a paragraph - hadn't thought of that one yet and it's so simple!

Thanks,
Emerson

8. ### iBlue macrumors Core

Joined:
Mar 17, 2005
Location:
London, England
#8
Unfortunately, everything I thought of has already been posted, but I wanted to say hello and it's good to see you.

9. ### iJohnHenry macrumors P6

Joined:
Mar 22, 2008
Location:
On tenterhooks
#9
Look for hidden messages in the text.

First word of each paragraph, first letter of each sentence, stuff like that.
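Both of these extractions are easy to sketch in Python, again assuming blank-line paragraph breaks and sentences that end in `.`, `!` or `?`. The function names are made up, and deciding whether the result is actually a "message" is left to the reader.

```python
import re


def first_words_of_paragraphs(text):
    """Collect the first word of every paragraph, stripped of punctuation."""
    result = []
    for paragraph in re.split(r"\n\s*\n", text):
        words = paragraph.split()
        if words:
            result.append(words[0].strip(".,;:!?\"'"))
    return result


def first_letters_of_sentences(text):
    """Join the first letter of every sentence into one string."""
    letters = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = re.search(r"[A-Za-z]", sentence)  # skip leading quotes, digits
        if match:
            letters.append(match.group())
    return "".join(letters)
```

For a short text such as `"Help is near. Everyone waits.\n\nLook up. Please hurry."`, the first function gives `["Help", "Look"]` and the second gives `"HELP"`.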