Uninformed Comment

Word frequency in literature

Posted in Art, Grammar & usage, Internet, Literature, Technology by uninformedcomment on October 27, 2009

Playing with the rather addictive and very neat online toy Wordle, and feeding it texts from the stalwart Project Gutenberg, I’ve cobbled together “Wordles” of some of Gutenberg’s Top 100 most popular works of literature, showing word frequencies graphically.

I like to think they encapsulate the very essence of the works, but having read only the first couple of pages of Joyce’s Ulysses and nothing whatsoever of any Brontë, I’m not as sure as I ought to be.

Judge for yourself (click on each picture for larger size):

Dracula, by Bram Stoker

Ulysses, by James Joyce

Great Expectations, by Charles Dickens

Wuthering Heights, by Emily Brontë

The Adventures of Sherlock Holmes, by Sir Arthur Conan Doyle

The Iliad, by Homer

The entire text of each work was pasted into Wordle. Some of the images have been rotated for easier reading, and some colours have been altered for the same reason. I insist, in the face of reason, that the inclusion of both The Iliad and Joyce’s Ulysses is entirely coincidental.
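
For anyone curious what sits underneath a picture like these, the counting step can be roughly sketched in a few lines of Python. This is not Wordle's own code, only a guess at the general idea: split the text into words, drop the very common ones, and count what remains. The function name `word_frequencies`, the tiny `STOP_WORDS` list and the sample sentence are all made up for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real one would be far longer.
STOP_WORDS = {"the", "and", "a", "of", "to", "in", "i", "it", "was", "that"}

def word_frequencies(text, top=10):
    """Return the `top` most frequent words in `text`, ignoring case and stop words."""
    # Lower-case everything, then pull out runs of ASCII letters and apostrophes.
    # Assumption: plain ASCII is good enough here; accented letters and curly
    # apostrophes would split words and need a broader pattern.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top)

# Made-up sample sentence, not a quotation from any of the books.
sample = "The Count smiled, and the Count bowed; Lucy saw the Count."
print(word_frequencies(sample, top=3))
```

To try it on a whole book, read a plain-text file downloaded from Project Gutenberg into a string and pass that in place of `sample`. The bigger the count, the bigger the word would be drawn.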

With many thanks to both Wordle and Project Gutenberg.

4 Responses

  1. Jill said, on November 2, 2009 at 12:13 am

    Those would make very nice bookstore posters.

    • uninformedcomment said, on November 2, 2009 at 12:42 pm

      A thought-provoking idea; maybe I should get in touch with one of the two remaining bookshops in the UK.

      The trouble is, they’d only be interested in diagrams of Dan Brown, J K Rowling or Ainsley Harriott, and I don’t think they’re on the Gutenberg site …

  2. Jill said, on November 4, 2009 at 1:05 am

    No, you’d definitely want stuff that’s in the public domain, with character names that are easy to pick up even if you’re not extremely well read. So Pride & Prejudice, Oliver Twist, those sorts of things – your examples are perfect.

    • uninformedcomment said, on November 4, 2009 at 10:24 pm

      Surely public domain doesn’t matter if you’re not actually showing the text itself? But you’re right, familiarity helps a great deal. (And I was being cynical about Dan Brown, etc!)

