I really love the Book Review section of the New York Times. You find gems…and you learn a lot – without reading the book! OK, so I know that’s probably not the idea. But it’s like a short course on the topic – the overview, some critique, knowledge. In other words, good times.

This article takes an interesting approach to informatics and IA, from a literary perspective. Hamlet is obviously the main character in Hamlet…but why? According to the author, Hamlet “minimized the sum of the distances to all other vertices” of the network…in other words, he’s at the center of the network. Unfortunately, the network shown above was created by hand – meaning it’s more of an infographic than a data visualization – qualitative more than quantitative, for sure. And the whole point of “distant reading” of books is that there is simply too much literature to read closely. You can read the 200 books associated with the Victorian canon…but there would still be tens of thousands of other 19th-century English novels that could illuminate your study…and there’s no way you can read that much, Evelyn Wood be damned.
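To make the “sum of the distances” idea concrete, here is a minimal sketch in Python. The character network below is a made-up toy, not the one from the review, and the function name distance_sum is just my own label. It counts the hops from each character to every other character and picks whoever has the smallest total, assuming everyone in the network is connected somehow.

```python
from collections import deque

def distance_sum(graph, start):
    """Sum of shortest-path lengths (in hops) from start to every other vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return sum(dist.values())

# Hypothetical toy network: an edge means two characters speak to each other.
# These links are illustrative only, not taken from the article's diagram.
edges = [
    ("Hamlet", "Claudius"), ("Hamlet", "Gertrude"),
    ("Hamlet", "Horatio"), ("Hamlet", "Ophelia"),
    ("Claudius", "Gertrude"), ("Claudius", "Polonius"),
    ("Polonius", "Ophelia"), ("Horatio", "Marcellus"),
]

# Build an undirected adjacency map from the edge list.
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

# The most central character is the one with the smallest total distance.
totals = {name: distance_sum(graph, name) for name in graph}
most_central = min(totals, key=totals.get)
print(most_central, totals[most_central])
```

With real data you would build the edge list from the play itself rather than by hand, which is exactly the gap between an infographic and a data visualization.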

So…what to do? They did an interesting experiment – researchers fed 30 novels, identified by genre, into a machine and had them analyzed by a set of programs – then asked the computer to ID another 6. The programs were able to match all six up…but using different means than a human would. They used word frequency as well as grammatical and semantic signals. This means that the books are sortable through signals that humans can’t detect.
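Here is a rough sketch of how the word-frequency part of that might work. To be clear, this is not the researchers’ software: the snippets of “novels” are invented, the names word_freq, cosine and guess_genre are hypothetical, and it ignores the grammatical and semantic signals entirely. It just builds a word-frequency profile per genre and assigns an unknown text to the genre whose profile it most resembles.

```python
from collections import Counter
import math

def word_freq(text):
    """Relative frequency of each word in a text (naive whitespace split)."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words)
    return {w: c / total for w, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def guess_genre(text, labeled):
    """Pick the genre whose combined word profile is closest to the text."""
    profile = word_freq(text)
    return max(
        labeled,
        key=lambda genre: cosine(profile, word_freq(" ".join(labeled[genre]))),
    )

# Invented stand-ins for the labeled novels (illustrative only).
labeled = {
    "gothic": [
        "the castle was dark and the ghost walked the castle halls",
        "a dark storm broke over the ruined abbey and the ghost wept",
    ],
    "industrial": [
        "the factory smoke hung over the mill and the workers",
        "the mill owner counted wages while the factory workers marched",
    ],
}

print(guess_genre("the ghost haunted the dark abbey", labeled))
print(guess_genre("the workers left the mill", labeled))
```

Even in this toy version, notice that a lot of the similarity comes from boring words like “the”, which hints at why the machine’s signals might not be the ones a human reader would notice.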

Data informatics for literary analysis is, then, kind of like a telescope for the eye. And maybe we can have computers chugging away, unaided, to help us do deeper analysis on our literary heritage. Pretty cool stuff.

The coolest part of the review comes from comparing Linnaeus and Vesalius. In IA, sometimes I’m interested in the taxonomy of things that are new to me – what types of things are there? I don’t think you can really separate that taxonomy from a skeletal structure (the Vesalian impulse)…as soon as I start to identify things, I start to relate them in a hierarchical structure – are they really separate then?