Exercise 8

I’d like to explore the relationship between the most prominent topics in a corpus of novels and a few pieces of metadata that may correlate with a novel’s availability to readers of different socioeconomic strata. The goal is to see whether changes in these factors track the claims Ian Watt makes in “The Rise of the Novel,” namely that the novel was born alongside (and helped to create and propagate) the burgeoning middle class produced by modern capitalist systems. The metadata I would look at are:

  • price of the volume (can we find any correlations between price and the apparent content of the novels as revealed by topic modeling? do these prices change in subsequent republications? when is the price included in the paratext?)
  • place of publication (what kinds of novels were published in Dublin as opposed to London as opposed to Philadelphia? how do these reflect the state of the middle class in each respective area?)
  • year (does the novel come to reflect more middle-class values over time?)

I would group the novels by these divisions of metadata (e.g. all the novels that cost 2-3 shillings, or all those published in Dublin from 1775 to 1780) and use topic modeling to see whether each group’s most prominent topics reflect these factors.
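
A rough sketch of the grouping step might look like the following. Everything in it is an assumption made for illustration: the three records, the field names (`price_shillings`, `place`, `year`, `text`), the one-shilling and five-year bands, and the tiny stopword list are all invented. The topic model itself is not implemented here; plain word counts stand in for it, and a real run would hand each group's texts to a topic-modeling tool such as MALLET or gensim instead.

```python
from collections import Counter, defaultdict

# Hypothetical records: titles, prices, places, years and texts are invented.
novels = [
    {"title": "Novel A", "price_shillings": 2.5, "place": "Dublin", "year": 1776,
     "text": "trade and credit and the merchant and trade"},
    {"title": "Novel B", "price_shillings": 2.0, "place": "Dublin", "year": 1779,
     "text": "the merchant kept accounts of trade"},
    {"title": "Novel C", "price_shillings": 10.0, "place": "London", "year": 1778,
     "text": "the estate and the title of the lord and the estate"},
]

# Toy stopword list (an assumption); a real one would be much longer.
STOPWORDS = {"the", "and", "of", "a", "to", "in"}


def price_band(price):
    """Bucket a price into a one-shilling band, e.g. 2.5 -> '2-3s'."""
    low = int(price)
    return f"{low}-{low + 1}s"


def year_band(year, width=5):
    """Bucket a year into a fixed-width band, e.g. 1776 -> '1775-1779'."""
    start = year - year % width
    return f"{start}-{start + width - 1}"


def group_by(records, key_fn):
    """Collect records into lists keyed by key_fn(record)."""
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append(record)
    return dict(groups)


def top_terms(records, n=3):
    """Placeholder for a topic model: the n most frequent non-stopwords."""
    counts = Counter()
    for record in records:
        tokens = record["text"].lower().split()
        counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


by_place = group_by(novels, lambda n: n["place"])
by_price = group_by(novels, lambda n: price_band(n["price_shillings"]))
by_place_and_years = group_by(novels, lambda n: (n["place"], year_band(n["year"])))

for place, group in by_place.items():
    print(place, top_terms(group, n=2))
```

Grouping through a key function keeps the three divisions (price, place, year) interchangeable, and lets them be combined, as in `by_place_and_years`, without changing the rest of the code.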