One thing I'd like to use topic modeling and metadata for is to look for a shift from religious communities, dynastic realms, and messianic time to nationalism and homogeneous time (as described in Anderson's Imagined Communities). To do this, I think it would be best to split a corpus of eighteenth-century novels into groups based on publication date. One possible grouping is 1700-1750 and 1751-1799, although one could also create more than two groups by making the year ranges smaller (for example, 1700-1733, 1734-1766, 1767-1799). However, the later decades may well contain more novels than the earlier ones, in which case it might be better to use uneven year ranges so that the number of novels in each group is not too disproportionate. The trade-off is that a wide year range could obscure when the shift took place. It might be preferable to run the analysis several times with different groupings to see how the choice of grouping affects the results.
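The two grouping strategies above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the metadata is taken to be a list of (title, year) pairs, and the function names `fixed_periods` and `equal_count_periods` are hypothetical, not part of any existing tool.

```python
from bisect import bisect_right


def fixed_periods(novels, boundaries):
    """Group (title, year) pairs by fixed cut points.

    `boundaries` holds the first year of each later period, so
    [1751] yields two groups: up to 1750, and 1751 onward.
    """
    groups = {}
    for title, year in novels:
        index = bisect_right(boundaries, year)  # 0 = earliest period
        groups.setdefault(index, []).append(title)
    return groups


def equal_count_periods(novels, n_groups):
    """Split novels into n_groups of nearly equal size, ordered by year.

    The resulting year ranges are uneven. Note that novels from the
    same year can land in different groups at a boundary.
    """
    ordered = sorted(novels, key=lambda novel: novel[1])
    size, extra = divmod(len(ordered), n_groups)
    groups, start = [], 0
    for i in range(n_groups):
        end = start + size + (1 if i < extra else 0)
        groups.append(ordered[start:end])
        start = end
    return groups
```

Running both on the same metadata, and then varying `boundaries` or `n_groups`, is one way to check how much the grouping itself drives the results.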
To identify the shift, I think one thing to look for would be topics related to imagined communities: that is, any topic where an imagined community (a community whose members don't interact with all of its other members) is associated with descriptive words or traits. One particularly useful signal would be the association of country names with other words. Topic modeling could also be used to look for simultaneity in novels, which Anderson argues enables the concept of nationalism and imagined communities. Here it would be helpful to see whether topic modeling can identify instances or trends of simultaneity (where characters are acting separately but at the same time). Topic modeling may not be the best tool for this, and I'm not sure what would be; perhaps it would be better to look at the frequency of words like "meanwhile". It would still be interesting to see whether words like "meanwhile" show up in the topics, and what words they are associated with if they do.
Another thing to look for, which doesn't require topic modeling, is how time is described across eighteenth-century novels. Anderson discusses a shift from sacred time to modern time, and this shift could show up as a rise in references to standardized or clock time. Anderson also talks about the newspaper's role in enabling the new sense of time and of imagined community, so it might be worth counting how often newspapers are mentioned, or checking whether the word "newspaper" appears in any topic at all. If it does, it would be interesting to see what words it is associated with.
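The frequency checks mentioned above (for "meanwhile", "newspaper", or clock-time words) could be sketched like this. Everything here is an assumption on my part: the marker list is just a starting point, the function name `marker_rates` is hypothetical, and the tokenizer is deliberately crude, so eighteenth-century spelling variants would need extra handling.

```python
import re
from collections import Counter

# Hypothetical marker words; a real list would need period spellings too.
MARKERS = {"meanwhile", "newspaper", "o'clock"}


def marker_rates(texts_by_period, markers=MARKERS, per=10_000):
    """Return occurrences of each marker per `per` tokens, for each period.

    `texts_by_period` maps a period label to a list of novel texts.
    Rates are normalized by token count so that periods containing
    more (or longer) novels remain comparable.
    """
    rates = {}
    for period, texts in texts_by_period.items():
        counts = Counter()
        total = 0
        for text in texts:
            # Lowercase, then keep runs of letters and apostrophes.
            tokens = re.findall(r"[a-z']+", text.lower())
            total += len(tokens)
            counts.update(token for token in tokens if token in markers)
        rates[period] = {
            marker: (counts[marker] * per / total if total else 0.0)
            for marker in markers
        }
    return rates
```

Normalizing by token count matters here because the later groups may contain many more novels, so raw counts alone would likely overstate any rise.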