50 topics, 1000 iterations, 20 printed words:
A business life: made time gave found leave place manner day days return company told long received paris acquainted happened returned knew till
Typical Saturday night: good table money fellow wine company people half hundred give eat box glass guineas pretty made poor drink turned peace
War victory story: war army general english french enemy time country men battle enemies england forces command officer number field troops part success
Conceited Autobiography: great genius taste learning learned wit character opinion poet piece stage works play author judgment characters read friend age merit
All things divine: god man good religion world church heaven true soul divine spirit things body fear human christian truth faith life death
25 topics, 1000 iterations, 10 printed words:
Sounds kinky: passion time made found person husband affection mistress lover fortune
Maritime adventure: captain ship made men great board sea found time place
Basically Robinson Crusoe: man make thing thought time give good great find world
50 topics, 1000 iterations, 5 printed words:
Typical Swarthmore student response to “How are you?”: tears heaven death life grief
Me after reading one page of a novel: great page world learned learning
Conclusion: At first, I thought that printing fewer words per topic would give a more concrete and accurate sense of what the novels are about, but it seems to do the opposite. With 20 or even 10 words, I could grasp a bit more of what a novel was about, whereas the 5-word limit produced only a somewhat superficial topic.
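One way to see why fewer printed words feel thinner: if the modeling tool lists each topic's words from heaviest to lightest (an assumption on my part, though it is the usual convention), then a 5-word printout is just the front of the 20-word printout with the rest cut off. Here is a minimal sketch using the "Typical Saturday night" topic from above; the `top_words` helper is made up for illustration and is not part of any modeling tool.

```python
# Assumption: the words of a topic are already ranked by weight, heaviest first.
saturday_night = (
    "good table money fellow wine company people half hundred give "
    "eat box glass guineas pretty made poor drink turned peace"
).split()


def top_words(ranked_words, n):
    """Return the first n words of a ranked topic (hypothetical helper)."""
    return ranked_words[:n]


# Print the same topic at the three word limits used in this post.
for limit in (20, 10, 5):
    print(limit, "words:", " ".join(top_words(saturday_night, limit)))
```

At 5 words the topic shrinks to "good table money fellow wine", and the eating, drinking and guineas that made it recognizable are gone.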
Here are some common themes I found across the different modeling runs: war, navy, adventure, aristocracy, money, and of course virtue. The importance of the notion of virtue can be summed up in the topic: honour conduct character virtue reason.
Now I come to how this exercise reminded me of Tristram Shandy. While reading Tristram Shandy, I kept thinking that the novel was about nothing, and everything at once. After about 50 pages, I couldn’t rule out any subject in the world as a potential digression topic for the narrator of Tristram Shandy. The same can be said for topic modeling. And even though Tristram Shandy forms a more coherent narrative than these topics, I’m not sure if it can be reduced to one or even a number of topics. Indeed, I do think that Tristram Shandy is the most interesting novel about nothing. Even if we were to try to reduce it to a topic, it would probably be muddled with mundane words such as “uncle”, “father”, “make”, “give”, and others of the sort.
These mundane words are also sprinkled across all of the topics, no matter how many words are printed. This, in turn, creates a sort of reality effect. These seemingly useless words are necessary to remind us that novels cannot be reduced to substantial and important topics, and this makes the topics more believable. For example, “good table money fellow wine company people half hundred give eat box glass guineas pretty made poor drink turned peace” makes for a much more interesting and intelligible topic than if we were to take some words out and produce “money wine company glass drink poor peace”. I do think that it is much easier to construct a plausible story from the former than from the latter. Novels and topics both need these mundane words to produce a more intelligible and “real” story. I think this also relates to my point about topics with fewer words seeming more superficial than topics with more.
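The claim that mundane words turn up across topics can be checked roughly against the topics quoted in this post. The sketch below counts, for each word, how many of the eight 20-word and 10-word topics contain it. Two caveats: those topics come from two different runs (50 topics and 25 topics), and this is only a small sample of each run, so treat it as an illustration rather than a measurement. The `topic_frequency` helper is my own name, not something from a modeling tool.

```python
from collections import Counter

# The eight topics printed above with 20 or 10 words, keyed by my labels.
topics = {
    "A business life": "made time gave found leave place manner day days return "
                       "company told long received paris acquainted happened returned knew till",
    "Typical Saturday night": "good table money fellow wine company people half hundred give "
                              "eat box glass guineas pretty made poor drink turned peace",
    "War victory story": "war army general english french enemy time country men battle "
                         "enemies england forces command officer number field troops part success",
    "Conceited Autobiography": "great genius taste learning learned wit character opinion poet piece "
                               "stage works play author judgment characters read friend age merit",
    "All things divine": "god man good religion world church heaven true soul divine "
                         "spirit things body fear human christian truth faith life death",
    "Sounds kinky": "passion time made found person husband affection mistress lover fortune",
    "Maritime adventure": "captain ship made men great board sea found time place",
    "Basically Robinson Crusoe": "man make thing thought time give good great find world",
}


def topic_frequency(topics):
    """Count how many topics each word appears in (hypothetical helper)."""
    counts = Counter()
    for words in topics.values():
        counts.update(set(words.split()))  # set(): count each topic at most once
    return counts


counts = topic_frequency(topics)

# Words that show up in at least three of the eight topics, most frequent first.
shared = sorted((w for w, n in counts.items() if n >= 3), key=lambda w: (-counts[w], w))
for word in shared:
    print(word, counts[word])
```

The words that surface are exactly the mundane ones: "time", "made", "found", "good" and "great", none of which says much about a novel on its own.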