Although the metadata provided us with a lot of quantitative information about the group of novels we studied, several aspects of this collection went undescribed by the dataset. While completing the exercise, I found myself wondering about the popularity of these novels. Additionally, during the topic modeling exercise, we saw that although some of the generated topics seemed to be random amalgamations of unrelated words, relatively cohesive and identifiable topics did appear.

I would like to address the popularity of different topics during the 18th century and combine both metadata and topic modeling to track what subjects people were most interested in reading about. I would first choose several well-defined topics (possibilities could include exploration/travel, family, literature/fine arts, etc.) and collect data on the individual novels that comprise each topic. Obtaining information on how many copies of each of these books were printed or sold would most likely require some digging, but if this data could be found, I could then monitor the popularity of each topic over a specified time period.
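As a rough sketch of how this tracking could work, the snippet below sums copies printed per decade and topic. Every record in it is hypothetical: the years, topic assignments, and print counts are made-up placeholders that only illustrate the shape of the data I would need, and the function name `copies_by_decade` is my own.

```python
from collections import defaultdict

# Hypothetical records: (publication year, assigned topic, copies printed).
# None of these figures come from the dataset; they are placeholders.
novels = [
    (1719, "exploration/travel", 1000),
    (1726, "exploration/travel", 1500),
    (1740, "family", 2000),
    (1748, "family", 3000),
    (1749, "literature/fine arts", 500),
]

def copies_by_decade(records):
    """Sum copies printed for each (decade, topic) pair."""
    totals = defaultdict(int)
    for year, topic, copies in records:
        decade = year // 10 * 10  # e.g. 1748 -> 1740
        totals[(decade, topic)] += copies
    return dict(totals)

totals = copies_by_decade(novels)
for (decade, topic), copies in sorted(totals.items()):
    print(f"{decade}s  {topic:22} {copies}")
```

Grouping by decade rather than by year is just one possible choice; with sparse sales data, a coarser grouping would probably give a steadier picture of each topic's popularity.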


Topic Modeling

10 Topics, 100 iterations

1) European power structures [king people country england power english war time men great prince general lord army france french enemy earl kingdom laws]
2) The businessman [made time manner make thought gave account fortune received give found care gentleman replied till affair money proper opportunity long]

50 Topics, 1000 iterations

3) Family [father young family mother fortune lady daughter son wife made years man time great good husband brother woman child marriage]
4) Literature [author great book genius read taste learned learning life wit works good piece history years public character work poet stage]
5) The good life (for men) [man nature life virtue human men good world natural opinion general means happy true degree light advantage equal make makes]
6) Adventures at Sea [fo ship men great made water sea richard fome adventures capt indians falconer told feveral god good time ifland make]
7) JustGirlyThings? [love heart passion affection soul mind happiness tender sentiments tenderness friendship heaven object present felt beauty eyes longer fortune mistress]
8) England v France [king english england duke army time war parliament france queen french henry crown general men earl began thousand made kingdom]
9) Pamela [mrs good master sir poor dear pamela ll mr jewkes lady hope thing god don jervis pray make mother father]
10) Power and Government [people country power laws government state present nation court order subjects ambition public great liberty kingdom constitution part arts authority]

A few common themes I noticed across the arrays of topics that I generated were: topics containing words related to people (mr, mrs, madam, his, her, etc.), topics relating to adventure or exploration, and topics centered around a particular culture/nationality or combination of cultures (British and Chinese, England and France). There were a number of topics similar to the Pamela topic that seemed to relate to just one volume in particular. These types of topics were indicated primarily by the presence of specific character names within the topic. I found it particularly interesting that a few gender-related topics appeared, which demonstrated the introduction of female subjectivity as described by Armstrong. I was also amused by the one or two topics that were just a bunch of words containing the letter "f" instead of the letter "s".

Using a larger number of topics and iterations definitely produced a greater variety of topics, but some of the topics overlapped considerably. Using only 10 topics and 100 iterations produced much more generalized topics.
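One way to put a number on how much two topics overlap is the Jaccard similarity of their word lists (shared words divided by all distinct words). This is a measure I am bringing in for illustration, not something the topic modeling tool reported. The sketch compares topic 1 from the 10-topic run with topics 8 and 10 from the 50-topic run, using the word lists exactly as given above.

```python
def jaccard(a, b):
    """Jaccard similarity: shared words divided by all distinct words."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

# Word lists copied from topics 1, 8 and 10 above.
topics = {
    1: ("king people country england power english war time men great "
        "prince general lord army france french enemy earl kingdom laws").split(),
    8: ("king english england duke army time war parliament france queen "
        "french henry crown general men earl began thousand made kingdom").split(),
    10: ("people country power laws government state present nation court "
         "order subjects ambition public great liberty kingdom constitution "
         "part arts authority").split(),
}

shared_1_8 = sorted(set(topics[1]) & set(topics[8]))
scores = {
    (i, j): jaccard(topics[i], topics[j])
    for i, j in [(1, 8), (1, 10), (8, 10)]
}

print("Shared by topics 1 and 8:", shared_1_8)
for pair, score in scores.items():
    print(pair, round(score, 2))
```

On these lists, topic 1 shares 12 of its 20 words with topic 8 and 6 with topic 10, while topics 8 and 10 share only "kingdom". That fits the impression that the broad topic from the smaller run splits into narrower topics in the larger one.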


Croft, Herbert, Sir. Love and madness. A story too true in a series of letters between parties, whose names would perhaps be mentioned, were they less known, or less lamented. A new edition. London, 1780.

LOVE AND MADNESS.|A|Story too True|in a|SERIES of LETTERS|Between Parties, whose Names would|perhaps be mentioned were they less|known, or less lamented:|[horizontal line]|Governor. "Who did the bloody deed?|Oroonoro. "The deed was mine,|"Bloody I know it is, and I expect|"Your laws should tell me so. Thus, self condemmed,|"I do resign myself into your hands,|"The hands of Justice."|OROONORO.|[horizontal line]|A New Edition.|[horizontal line]|LONDON.|Printed for G.KEARSlY, at No.46,near Serjeants Inn,|Fleet Street; and R.FAULDER,in New Bond Street.|[line]1780.[line]|


2mo. Vol.1. B6-Z6. Aa6-Cc6r. i-viii. 1-298.


advertisement. half title. title. (no A signature?) i-viii contents. preface. dedication/poem. B1-Cc6v text.


Sourced from the British Library. Digital facsimile retrieved through Eighteenth Century Collections Online. Gale Document Number: CW3314147238. Only signatures 1-3 for each letter appear (B1-B3 signatures present, next 6 pages are blank, and then C starts, and so on). Advertisements and picture of the author precede the title page. Illegible cursive handwriting on the title page.


Some general observations from the analysis I performed with Google Fusion Tables: the number of novels published increased between 1700 and 1799, with a spike around 1750. Nearly 30% were epistolary novels, 35% were first-person novels and 25% were third-person novels. Duodecimo and octavo were the major formats in which these novels were published (65% and 26%, respectively). While we can learn a lot of interesting quantitative information from this exercise, many aspects of the novels still go undescribed by this particular dataset. If we had data on the number of copies of each novel that were sold, it would be interesting to compare popularity with type of novel (epistolary, dramatic dialogue, etc.).

Next, I made word clouds for titular nouns and adjectives. Some of the most notable nouns that appeared in my cloud were “history”, “adventure”, “letter” and “life”. This cloud wasn’t really that informative and mostly fit my expectations given the other analyses we have done on a large set of novels. My adjective cloud prominently contained the words “entertaining”, “original” and “best”, along with other words that were clearly selected to make the novel more enticing to readers. Again, a pretty unsurprising result.

Overall, I think that while metadata analysis is a useful tool for analyzing a large number of novels, when we break these novels down into pieces of data that fit into uniform categories and make generalizations about them, we lose some of the uniqueness and individuality that makes an individual novel a true work of art.

Exercise 6


I found this exercise to be somewhat frustrating. After finding a version of the text on ECCO, I discovered that there is a wide range in the quality of OCR software available for free downloading on the internet. Google Drive refused all of my best efforts to convert the PDF to text form. I then found an application offered for "community download" in the App Store called PDF OCR X. This program wasn't able to handle the handwritten cursive annotations that appear on various pages of the version of the text that I downloaded and mostly converted them to something along the lines of ?u4—7z:z7_—7?41-rz%'.r77r2 L»!-»*' 0‘?. Prizmo worked the best out of all the programs I tried. It was at least smart enough to know not to even try to decipher the cursive. The most common errors that I saw with this program were the mis-conversion of the letter "s" to "f", "l", or some other character altogether and misinterpretation of italicized words. For example, the name Horace in italics appeared as "Horace", "l-Iora#", and "Harae" all on the same page. It seems like a lot of these mistakes could be avoided if the OCR software had some type of internal dictionary or spell-checker that it could compare its translations to before spitting them back out to the user. To me, using OCR kind of takes away from the authentic feel of the text. The different stylistic and typographic elements that really make you realize that you're reading an 18th century book when you pick up a copy of Tristram Shandy are lost in the slew of mistakenly regurgitated numbers and symbols.
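To make the dictionary idea concrete, here is a minimal sketch of what such a post-correction step might look like. It is not how Prizmo or any other OCR program actually works. The tiny word list `WORDS` and the function names are my own assumptions, and a real check would need a full period-appropriate dictionary. The sketch first tries swapping "f" back to "s", then falls back on the closest dictionary word using Python's standard `difflib` module.

```python
import difflib
from itertools import product

# Hypothetical mini-dictionary; a real one would hold many thousands of words.
WORDS = {"so", "some", "several", "island", "ship", "horace", "adventures"}

def long_s_variants(word):
    """Yield every spelling made by turning some 'f' characters into 's'."""
    options = [("f", "s") if ch == "f" else (ch,) for ch in word]
    for combo in product(*options):
        yield "".join(combo)

def correct(token, words=WORDS):
    """Return a lowercase dictionary word for token, or token unchanged."""
    lower = token.lower()
    if lower in words:
        return lower
    # First guess: the OCR read one or more "s" characters as "f".
    for variant in long_s_variants(lower):
        if variant in words:
            return variant
    # Fallback: the closest dictionary word, if any is similar enough.
    close = difflib.get_close_matches(lower, sorted(words), n=1, cutoff=0.7)
    return close[0] if close else token

for token in ["fome", "feveral", "ifland", "Harae", "l-Iora#"]:
    print(token, "->", correct(token))
```

With this word list, "Harae" is close enough to "horace" to be repaired, but "l-Iora#" falls below the similarity cutoff and is left as it is, so a dictionary alone would not catch every error. The sketch also discards capitalization, which a real tool would want to keep.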

For this exercise, I chose to focus on the use of one word in particular from the word cloud that Voyant generated: “hope”. At the beginning of the novel, Pamela uses it several times in the context of expressing her initial opinions about Mr. B. For example, she says, “I hope I shall never find him to act unworthy of his character; for what could he get by ruining such a poor young creature as me?” Her father shares a similar concern, saying, “I cannot but renew my cautions on your master’s kindness, and his free expression to you about the stockings. Yet there may not be, and I hope there is not, any thing in it.” Both of these quotes ironically foreshadow the events that are to come later, and there are other instances as well in which Pamela “hopes” for something and then the opposite thing happens. Read through an anti-Pamelist lens, Pamela’s hopes become almost sarcastic in nature.
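Pulling out each use of a word together with its surrounding text is usually called a keyword-in-context (KWIC) search. The sketch below is my own minimal version, not Voyant's implementation. The function name `kwic` and the 30-character window are arbitrary choices, and the sample text is just the two quotations above joined together.

```python
import re

def kwic(text, keyword, width=30):
    """Return (left, match, right) context tuples for each whole-word hit."""
    pattern = re.compile(rf"\b{re.escape(keyword)}\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(), right))
    return hits

# Sample text: the two quotations discussed above.
sample = (
    "I hope I shall never find him to act unworthy of his character; "
    "for what could he get by ruining such a poor young creature as me? "
    "Yet there may not be, and I hope there is not, any thing in it."
)

hits = kwic(sample, "hope")
for left, word, right in hits:
    print(f"{left:>30}[{word}]{right}")
```

Because the pattern matches whole words only, "hopes" and "hoped" are not counted. Whether those forms should be included is a judgment call that would change the frequency of the word.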

I also discovered that she uses the word “hope” to describe her own personality and behavior. A couple of my favorite instances are “I hope I shall always know my place,” and “I hope, desperate as my condition seems, that as these trials are not of my own seeking, nor the effects of my presumption and vanity, I shall be enabled to overcome them, and, in God’s own good time, be delivered from them.” Again, I find myself reading these very isolated fragments of text in a very sarcastic way. This interpretation suggests that Pamela knows exactly what she is doing, and that the language she uses is specifically employed to highlight the innocence and purity that she wants to convey to her audience (both in the real world and in the context of the novel). Finally, I think that the frequency of the word “hope” throughout the novel somewhat pertains to Armstrong’s argument that Pamela acts like a book of conduct. Pamela simultaneously “hopes” all of these things for and about herself, and also in a way is sending the message to young female readers of the novel that she “hopes” they will follow in her footsteps and mimic her example.