
Assignment 5: Textual Data Mining

5 min read

The bibliography of British prose fiction from 1776 to 1779 allows us to better understand the literary trends of this period. From this survey, we can compare and contrast the themes of each novel, identifying the “popular” tendencies of late 18th-century writers. It is also fascinating to see how the novels are classified. The novels whose authors remained anonymous come first in each year’s listing. Interestingly, most, if not all, of the anonymous authors are “ladies”, which points to the stereotyped “domesticity” of women. In other words, many of the female authors chose not to be named in order to avoid attention from or harassment by men. The female authors chose to maintain their “virtue”, focusing on the ties between love and morality. In contrast, most male authors tended to write about adventures, playing into the stereotype of masculinity for the audience. However, what stood out were not the obvious trends that female and male writers were focusing on, but the cost of their novels. Most of the novels cost 5-6 shillings, with bound novels costing more than their sewed counterparts. Since the average wages of skilled laborers were between 15 and 20 shillings per week, most novels cost about a third of a week’s wages. From our perspective, either the novels cost too much or the laborers were underpaid. Taking into account that mass production of novels was present in the 18th century, I must support the idea that wages in the 18th century were vastly more unfair than today’s minimum wage. However, we cannot discount the fact that the evolution of technology since then has made novels much cheaper and, therefore, easier for the public to access.

Analyzing the bibliography of Evelina, or, a Young Lady’s Entrance into the World, we can see that the author is Frances Burney and the publisher is T. Lowndes, located at No. 77 Fleet Street in London. Notably, T. Lowndes seems to have been a prestigious publisher, as the Garside entries show that he published many novels. The price of this novel is 7 shillings and 6 pence for a sewed copy and 9 shillings for a bound copy (a much higher price compared to that of other novels). We can see that Lowndes paid 20 guineas to Burney for the two volumes of the novel and published 500 copies. The bibliography also mentions that Burney was initially disappointed at the offered price, but was later “satisfied”. There is also a discrepancy in how many copies were initially published, as Burney claimed that 800 copies, not 500, were made. This bibliography seems to focus more on how the novel came to be published than on its content.

Using ECCO, I looked into Munster Village, Memoirs of the Countess D’Anois, Learning at a Loss, and The Unfortunate Union. Munster Village by Lady Mary Hamilton is about the divorced female characters of Munster Village. The novel seems to advocate for kindness, as the community survives upon it. However, this novel utilizes footnotes, which I hadn’t seen in older novels. I always thought footnotes originated with more modern novels. Memoirs of the Countess D’Anois is written by Henriette Julie de Castelnau Murat. Although dull, the novel does grant insight into the life of a countess in the 18th century. Notably, this novel does not highlight the virtue of women, but rather their whim. The countess is seemingly trying to explain her perspective on flirting with other men. Therefore, in a sense, this novel is the most realistic, as it chooses not to portray all women as virtuous and naive, but rather as having actual desires. Learning at a Loss by Gregory Lewis Way does utilize an epistolary form, much like Evelina. However, all the characters seem to have the same personality, and are therefore hard to distinguish from one another. The Unfortunate Union by Anonymous is also in epistolary form and has a similar setting to Evelina. However, this novel seems to advocate for virtue in young minds. As we can see from these novels, there is a general trend urging virtue and kindness, not only in females, but in males as well.

Using ARTEMIS, I can see that this program would be especially useful to track down and analyze old novels. The ARTEMIS program seems to focus on visualizations of the results for the users, allowing them to conceptualize trends for selected terms. Term clusters seem to be like Voyant Tools, except that they look at other categories, such as authors, instead of just the text. This tool especially allows us to see which literary technique (epistolary in this case) is popular in that era. Term frequency is wonderful for highlighting what terms are used as time periods change. In more depth, this tool helps us analyze trends in the content the authors are focusing on (e.g. love for this era). We can see which words became popular and which became less used. Overall, the ARTEMIS visual graphs are user friendly and easy to interpret.


Exercise 5

5 min read

Step 1: An interesting distinction I noticed in the novels listed in this bibliography was the difference between novels that seemed to me to fit more of the contemporary conventions of novels and those that much more resembled the style of the early novels that we have read so far in class. A few characteristics gave me the impression of this notable difference, one of the most salient being the question of descriptiveness. The novels which seemed most like Robinson Crusoe, Pamela, and Evelina were those overflowing with descriptive information in the title of the book. Many of the novels listed here had more than one title; a primary one such as a person’s name, followed by a clarifying phrase that conveyed the form of the novel (letters, memoirs, etc.) and the novel’s purpose and effect on the reader (to cultivate character, virtue, morality, etc.). The titles that seemed more modern to me left out all of this descriptive information and instead were simple, mysterious phrases which gave only the slightest indication of the focus of the book, thereby piquing the reader’s curiosity. These included titles such as Misplaced Confidence, The Pupil of Pleasure, and The Relapse. Although some of these more modern-seeming titles did appear earlier in the 1770s, the general trend away from the overly descriptive titles can be seen as a progression over time, with more of these mysterious and limited book names appearing in the 1780s than the 1770s.

Interestingly, I noticed the reverse of this distinction when it comes to authorship; that is, the authors that seemed to follow more recent conventions were those that included more information about themselves, whereas anonymity, or the complete lack of information, was a much more prominent feature of the earlier writers. I observed that some of the later authors in the 1780s even employed the technique that is used heavily today by movie directors, in which they boast of their previous famous works in order to sell their most recent production, such as one 1780 title which proclaims to be written by “the author of Liberal Opinions, Pupil of Pleasure, Shenstone Green, etc.”

Step 2: I observed many of the same characteristics that I discussed above in the digital versions of the novels and, specifically, in their title pages. A somewhat surprising observation that I made when comparing a variety of novels from the 1770s and 1780s was that in my categorization of the novels as more or less descriptive and thus more or less modern, Evelina seemed the most like Robinson Crusoe and Pamela and the least like contemporary novels. Evelina’s title page adheres to many of the overly descriptive conventions I have associated with early novels, including the form of the novel (letters), and a secondary, explanatory title (a young woman’s entrance into the world). Interestingly, Evelina was one of the later novels that I studied for this exercise (I looked at two from 1776, one from 1779 and one from 1781, in addition to Evelina) but its title page seemed the most descriptive and thus least modern to me. Even the 1776 novels, such as Emma; or the Child of Sorrow, had sparser title pages than Evelina. Notably absent from Evelina’s title page was any mention of the author, which again reinforced my perception of Evelina as a more conventionally early novel. This observation became especially apparent when I compared Evelina to The Tutor of Truth, also published in 1779. Although both novels emerged in the same year, The Tutor of Truth gave the appearance of being much more modern because the significance of the author was the second most striking information on the title page, and the author was announced in reference to his other popular novels.

Step 3: I felt about the Artemis tool about the same way that I felt when we used the Voyant tool a few weeks ago: it was more fun to play with and interesting to look at the results than actually revealing about the novels themselves. For instance, while I at first tried to draw conclusions about the form of the novels from the relative sizes in the term cluster wheel, I was confused by the repetition of some words in different sizes. At first I thought the term “history” was bigger than the term “novel,” which seemed to suggest something telling about the rise of the novel as an overtly fictitious title for a story, but there were also smaller “history” and “novel” parts of the wheel that confounded this conclusion.

The term frequency and popularity chart was also somewhat limited by the short time period that I chose (1770-1782). Additionally, the results that I saw there seemed to contradict my observations from the term clusters. In the term cluster wheel, for example, “epistolary novel” showed up as one of the biggest terms. I was surprised to see it appeared even bigger than “letters.” However, when I graphed these two words against one another, “letters” appeared consistently way more popular and frequent than “epistolary novel.” Thus, I did not feel that I could draw any significant conclusions about novels from this tool overall.

Words Words Words

6 min read

There were some very clear trends that could be seen even in this small sampling of a year of novels. Mainly anonymous authorship, though the notes on many specified that the novels were written by “a lady.” The majority of the novels published were about ladies, in fact. I thought a pretty good example was an anonymously written novel whose subtitle was “A test of Virtue.” That seems to have been the basic premise of all stories about women. A critic wrote of this one, “the same story might have been told more agreeably by the same writer in a smaller compass. It is something, however, in a modern novel, to find half of it worth reading.” That was pretty funny. It’s nice to find that I’m not the only one who finds the repetitive scenes of dramatic virtue to be excessive. Certainly Austen, later on, agrees, much to my satisfaction.

The novels that were not written about women (these can be characterized as stories about marriage, I think) were about men, and the overwhelming majority of them were stories about adventures. It makes me think, if women are looking for marriage, what are the men doing who are marrying them? They can’t be on adventures and they won’t go on adventures after marrying, so this means they must have gone on adventures already. Which means that their stories have already concluded by the time the women’s begin.

Then I got a little carried away by the pricing of the books. Generally they went for between 5 and 7 shillings, more when bound rather than sewed. Evelina, bound, went for a premium of 9s. This means a book cost between one quarter and nearly half of a skilled worker’s weekly wages. That’s like saying that if someone now who worked 50 hours a week at McDonald’s for $9 per hour (their current wage) wanted to buy a book, they’d be paying roughly between one and two hundred dollars for it. This calculation may even be inaccurately low. A construction worker (probably more analogous to a worker from the 18th century) is paid around $15/hr. This ups the price of the book to between roughly $150 and $300. In a way, it makes sense that there was so much fuss about why the book was a worthy read: it wasn’t just a use of your time, but of a huge amount of your money. As much as a frivolous tale may be fun to read, a morally enriching story that will improve your personal character is a much more worthy purchase.
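The McDonald’s comparison above can be written out as a small calculation. This is only a sketch: the function name is made up for illustration, and “nearly half” is rounded up to 0.5 as an upper bound, which is an assumption rather than a figure from the bibliography.

```python
def book_cost_today(wage_share, hourly_wage, hours_per_week):
    """Scale a book's share of weekly wages to a modern weekly paycheck.

    wage_share is the fraction of a week's pay the book cost (0.25 = a quarter).
    """
    return wage_share * hourly_wage * hours_per_week


# Figures from the post: $9 per hour, 50 hours per week.
# 0.5 stands in for "nearly half" (assumed upper bound).
low = book_cost_today(0.25, 9, 50)
high = book_cost_today(0.5, 9, 50)
print(f"${low:.2f} to ${high:.2f}")  # $112.50 to $225.00
```

The same function can be reused for the $15/hr comparison once an hours-per-week figure is chosen; the result depends heavily on that choice.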

I think the best questions you can ask that will be answered by this bibliography would be about what the public wanted to read. Was it epistolary novels? Stories of suspense and adventure? Stories of suspense and marriage? Stories told by women? How much were these stories worth to them? How much did they cost? From these answers we can then ask, why? Why did they want this, and why did they buy this?

Next, I looked at three of the books listed. The Travels of Hildebrand Bowman into Carnovirria, Taupiniera, Olfactaria, and Auditante, in New-Zealand was very Robinson Crusoe-esque. It is written by the protagonist, or so we are told. From what I can gather from some initial skimming, Bowman is left on a savage land by himself, and he makes friends with an uncivilized but helpful native. The entire piece seems to be part of a very slightly earlier style. It has an extensive table of contents with chapter summaries, for example. The Memoirs of the Countess D’Anois, written by Henriette Julie de Castelnau Murat, was in some ways comparable. It unexpectedly aligned more with an archetypal male adventure novel than a female domestic novel. All of it is written at the end of the Countess’ life, rather than at the beginning. It uses the style of claiming to be written by the main character. It also has an interesting letter to the reader claiming that the book is a justification of her life and a defense against what seem to be accusations of coquetry. It seems to me to be almost the opposite of the editorial prefacing letters in domestic novels which attest to the moral enrichment the story demonstrates. The final novel I was able to find a copy of was The Unfortunate Union: or, The Test of Virtue. A story founded on facts, and calculated to promote the cause of virtue in younger minds. It has a very long and explanatory (as well as moral-enrichment-claiming) title. Beyond the title though, there is no long preface, no note from editors or protagonist. After the title page, the novel simply begins. It also uses an epistolary format, unlike Crusoe but just like Evelina. It is astonishingly similar, in fact, to Evelina. In the first few pages I found the family name Villars referred to, as well as multiple locations often visited in Evelina, including Ranelagh. I have no idea if these are coincidences or what this means exactly.

It is hard to really draw any conclusions from these searches, because I know so little about the entire novels or the novels contemporary with them. I would be uncomfortable to outline many trends, though I do think we’ve already established that a simplification of title and preamble is growing. As for Evelina’s place among these books, it seems more refined and contemporary than the first two, hard to say about the nearly identical-seeming Unfortunate Union.

The ARTEMIS tool is actually very cool. I’m sure that it would prove frustrating if I was doing more focused research, but to get a feel for what types of novels were published in the span of thirty years, it’s great. The visual aids really help. There’s a nice little histogram of the number of books published per decade, and then term frequency makes cool graphs just like Voyant, but for a lot of novels rather than one. I like that there’s a popularity option, rather than only looking at the search term by frequency. The term clusters were good for browsing a large body of search results, but they seem to (understandably) miss out on a lot of books and content. As we know from before, OCR is full of errors and inabilities.

Exercise 4

3 min read

I chose a relatively straightforward couple of pages – Volume I Chapter IX (pages 15 and 16) of the 1761 edition – to OCR with ABBYY Pro 12. The OCR program consistently made some mistakes in deciphering old-fashioned typography and seemed a bit oversensitive to various dots or smudges on the page, so that ABBYY Pro has Tristram signing off his dedication as his Lordship’s “moji bumble fervant.” Some of the program’s misreadings were pretty consistent; for instance, the program consistently misread the old-fashioned S as an F, and the lower-case C with a curlicue as an upper-case C. I would guess that some OCR programs, if not this one, have tools that would allow you to create custom settings like “read all those F-like thingies as S-es,” sort of like Voyant allows you to add custom stopwords, although I confess I couldn’t figure out how to create those settings on ABBYY Pro.
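The idea of “read all those F-like thingies as S-es” can at least be approximated after the fact, on the OCR output rather than inside the OCR program. The sketch below is not an ABBYY feature; it is a dictionary-checked substitution in Python, and the word list and function names are hypothetical stand-ins (a real run would load a full dictionary).

```python
import re
from itertools import combinations

# Hypothetical mini word list; replace with a real dictionary for actual use.
KNOWN_WORDS = {"most", "humble", "servant", "first", "self", "of", "for"}


def fix_long_s(word, known=KNOWN_WORDS):
    """Swap 'f' for 's' only when that turns an unknown word into a known one."""
    lower = word.lower()
    if lower in known:
        return word  # already a real word, e.g. "for" or "of"
    positions = [i for i, ch in enumerate(lower) if ch == "f"]
    # Try replacing one 'f', then two, and so on, until a known word appears.
    for count in range(1, len(positions) + 1):
        for combo in combinations(positions, count):
            chars = list(word)
            for i in combo:
                chars[i] = "S" if word[i].isupper() else "s"
            candidate = "".join(chars)
            if candidate.lower() in known:
                return candidate
    return word  # no plausible correction found; leave the word alone


def fix_text(text):
    """Apply fix_long_s to every alphabetic word, keeping punctuation as is."""
    return re.sub(r"[A-Za-z]+", lambda m: fix_long_s(m.group(0)), text)


print(fix_text("moji bumble fervant"))  # moji bumble servant
```

Only the long-s confusion is handled: “fervant” becomes “servant”, but smudge-driven misreadings like “moji” and “bumble” contain no f to swap and would still need correcting by hand.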

The OCR program also gets confused by Tristram’s extensive use of long dashes, and converts them into short dashes, blank spaces, and bizarre indentations. In cleaning up the text, I tried to replicate the length of the original dashes, but I was only able to achieve an approximation. This conversion and the subsequent arbitrary refurbishing of the punctuation mean that the effect of the original haphazard dashes gets compromised. A punctilious editor might decide to standardize the dashes, losing the slapdash effect of those in the 1761 version, where they seem to be almost a mimesis of Tristram Shandy’s broader halting, digressive structure. On the other hand, faithfully trying to replicate the length of the 1761 dashes for my modern clean version encodes the particular way they appeared in one version, implying that exact transcription of the novel at a particular historical moment is more authentic or correct. Considering the proliferation of later reprints and bowdlerized editions, which people read throughout time and which all constitute the cultural phenomenon that is Tristram Shandy, it seems as arbitrary to exactly replicate the 1761 text as to not. This exercise shines a light on the tension between the novel as a physical, written form and the novel as a living cultural event. Since the figure of the author was often hidden, and novels like Pamela and Robinson Crusoe were published under the pretense of being authored by their title characters, my sense is that the authors of proto-novels were more shadowy figures than authors are nowadays. The success of response novels such as Shamela or The Clockmaker’s Outcry against the Author of the Life and Opinions of Tristram Shandy suggests that readers saw the novel’s truth as being mobile with the cast of characters rather than lying with the original authors. In that tradition, a haphazard transcription of the 1761 text does not so much “lose something in translation” as augment the rich history of rewritten, retold, and resold versions of the original novel.


Assignment 3

3 min read

Voyant is really neat! Playing around with the word cloud, I found that this tool showed me some things about the novel that were “hiding in plain sight.” For instance, the word cloud’s most prominent word, once I’d applied stopwords, was “said.” Perhaps it’s obvious, but seeing that massive word at the center of the cloud brought to light the way that dialogue is central to Pamela, which seems odd given that Pamela is a novel of letters. The two categories in the cloud which stood out to me were title words such as sir, Mr, Mrs, master, lady, gentleman, and madam; and “innate quality words,” such as good (most prominently), poor, happy, and honour, which occur with approximately equal frequency. Interestingly, master occurs with far greater frequency than mistress. Master is primarily used by Pamela to refer to the position of Mr. B as her social superior and employer. Mistress is not really just the female term for master; instead, it has a sexual connotation, and in Shamela, that connotation is made explicit when Mr. B offers for Sham to be his mistress: not an equal, but a kept woman.

This difference seems to uphold Armstrong’s argument that Pamela is able to tell a radical narrative of class difference by couching it in terms of gender difference. I thought that it might be instructive to look at words that describe people in both a classed and a gendered way – lady, gentleman, and gentlewoman – and see whether and how their frequency changed over the course of the book. I compared the frequency of a couple of words: gentleman, lady, virtue, and gentlewoman, over the course of the book.

Here is a comparison between the words gentleman and gentlewoman.

Gentleman consistently occurs with more frequency than gentlewoman, confirming what we already know, which is that the two main characters are a gentleman and someone who is not a gentlewoman. The results are similar for lady and gentleman (with lady occurring more frequently at the beginning of the book, which corresponds to the prominence of Pamela’s late lady at the start of the plot).
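A comparison like this, of how often words occur over the course of the book, can be approximated outside Voyant. The sketch below is not Voyant’s own method; it is a plain Python illustration that splits a text into equal segments and reports each term’s relative frequency per segment. The function names and the sample string are invented for the example.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_by_segment(text, terms, segments=10):
    """Return one dict per segment mapping each term to its relative frequency."""
    tokens = tokenize(text)
    size = max(1, math.ceil(len(tokens) / segments))
    rows = []
    for start in range(0, len(tokens), size):
        chunk = tokens[start:start + size]
        rows.append({term: chunk.count(term) / len(chunk) for term in terms})
    return rows


# Invented sample; a real run would read the full text of the novel from a file.
sample = "lady lady gentleman sir gentleman gentleman gentlewoman sir"
for row in frequency_by_segment(sample, ["lady", "gentleman", "gentlewoman"], segments=2):
    print(row)
```

Because the tokenizer lowercases everything, “Lady” as part of a title and “lady” as a social position are counted together.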

Voyant does not distinguish between Lady as part of a person’s name and title, as in Lady Davers, and lady as a social position, as in “if I had been born a lady,” which is perhaps a weakness of the tool or at least of this chart. However, scanning through the occurrences of the word lady, I found that the references outside of a particular person’s title and name include “if I was a lady of birth” and “if I had been born a lady,” all wishful, conditional references.


High Hopes

2 min read

For this exercise, I chose to focus in on the use of one word in particular from the word cloud that Voyant generated: “hope”. At the beginning of the novel, Pamela uses it several times in the context of expressing her initial opinions about Mr. B. For example, she says, “I hope I shall never find him to act unworthy of his character; for what could he get by ruining such a poor young creature as me?” Her father also shares a similar concern, saying, “I cannot but renew my cautions on your master’s kindness, and his free expression to you about the stockings. Yet there may not be, and I hope there is not, any thing in it.” Both of these quotes ironically foreshadow the events that are to come later, and there are other instances as well in which Pamela “hopes” for something and then the opposite thing happens. Read through an anti-Pamelist lens, Pamela’s hopes become almost sarcastic in nature.

I also discovered that she uses the word “hope” to describe her own personality and behavior. A couple of my favorite instances are “I hope I shall always know my place,” and “I hope, desperate as my condition seems, that as these trials are not of my own seeking, nor the effects of my presumption and vanity, I shall be enabled to overcome them, and, in God’s own good time, be delivered from them.” Again, I find myself reading these very isolated fragments of text in a very sarcastic way. This interpretation suggests that Pamela knows exactly what she is doing, and that the language she uses is specifically employed to highlight the innocence and purity that she wants to convey to her audience (both in the real world and in the context of the novel). Finally, I think that the frequency of the word hope throughout the novel somewhat pertains to Armstrong’s argument that Pamela acts like a book of conduct. Pamela simultaneously “hopes” all of these things for and about herself, and also in a way is sending the message to young female readers of the novel that she “hopes” they will follow in her footsteps and mimic her example.


Assignment 3

First, I found reading through the Table of Contents itself rather interesting. It is told by a third-person, omniscient narrator that portrays the events of each letter/segment and also adequately conveys the emotions and ideas that readers gather by reading Pamela's own writing. I found it to be actually quite informative, and it is interesting to consider the audience that it was targeting at the time. Was it used as a reminder of the course of events and the typical means of locating a passage of interest in the book, or more so as an abridged version of the novel itself, meant to be read independently?

On to the Voyant exercise, as many people have noted, most of the words that are most frequently seen in the novel (actions and titles aside) are virtues and qualities that one would expect to see in a conduct book. My immediate reaction is that the frequency of words like "good," "happy," "honor," "kind" etc. highlights this aspect of the book being a virtuous novel meant to "cultivate the principles of virtue and religion in the minds of the youth of both sexes." A vast majority of the most frequent words are positive, as listed above, thus suggesting that these are the main virtues that the novel is focusing on and attempting to cultivate.

The ease with which Voyant can thoroughly analyze an entire text is incredibly fascinating and makes this assignment rather interesting. I found that tracking the use of "Pamela" throughout the novel showed some interesting trends. Aside from addressing or signing the letters, a vast majority of the usages of "Pamela" were self-pitying remarks (e.g. "poor Pamela" or "hopeless Pamela"). This was something that I noticed while reading the novel as well, but looking through the specific usages of the word itself highlighted the self-pitying nature of her character at times. It also reminded me of scenes from Robinson Crusoe in which he pities his condition. We had talked about Robinson Crusoe being either very happy with his situation or woefully disappointed with it. Looking through the Voyant-produced list of Pamela's name, I found a similar trend: when Pamela is not pitying herself, she refers to herself (or others refer to her) as "dutiful Pamela," "grateful Pamela," or "pretty Pamela." I'm not entirely sure what to make of this, but it does seem to create this air of pity around the main character as well as harkening back to the structure we saw in Defoe.
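Listing the words that come directly before "Pamela" is simple enough to sketch without Voyant. The Python below is an illustration only, not what Voyant does internally; the function name and the sample sentence are invented.

```python
import re
from collections import Counter


def words_before(text, target):
    """Count the words that appear immediately before the target word."""
    tokens = re.findall(r"[a-z']+", text.lower())
    target = target.lower()
    return Counter(prev for prev, word in zip(tokens, tokens[1:]) if word == target)


# Invented sample; a real run would use the full text of the novel.
sample = "Poor Pamela wept. Your dutiful Pamela writes again; poor Pamela!"
print(words_before(sample, "Pamela").most_common())
```

Sorting the resulting counts gives a quick view of whether "poor" really does outnumber "dutiful," "grateful," and "pretty." Note that the pairing ignores punctuation, so a word at the end of one sentence can be counted as preceding a "Pamela" that opens the next.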

"And why"

4 min read

Richardson’s indexing: I would love to spend more time digging into this and figuring out what exactly Richardson thought he was doing — or I guess what kind of work the index is doing, since I don’t want to get into some kind of intentional fallacy trap. It was compelling to me to see what information or parts of the book Richardson believes are important and will contribute to an “easy and clear view” of the book. It’s almost as though he is puffing or blurbing himself by pointing to the locations in the book which might be most interesting — though it’s hard to know if he is acknowledging that some of us might not want to suffer through all of the pages or just might not have access to the entire text, or that he would be attempting to transmit some kind of vision/message to those who couldn’t access the full version? Indexing has to be considered some kind of technology here — he is doing work on the text and reducing it/transforming it into something distinct from the proper text. The difference in style stands out to me in particular — it’s impossible to imagine that Richardson could be so brief…also, is he the voice narrating this index or giving it to the reader? That could definitely account for the shift since this is no longer epistolary but describing the form and the content of his epistolary novel…complicated…it’s difficult to talk about this without more knowledge about the material conditions at play in the whole Pamela production. I also think it’s interesting that he often says “Pamela feels so and so way, or does so and so thing, ‘and why,’” — the why is kind of what’s important here, which lends credence to the idea that Pamela’s subjectivity is fully realized here, that she has a deep interiority.

Exercise: I love the word cloud feature/the general funny mid-2000s aesthetic of this website…I also think the stop word feature is incredibly useful. However, I’m having trouble drawing a lot of conclusions from the Voyant visualization: once I eliminate all of the words which don’t seem very interesting to me, how significant are the remaining words (statistically, I mean; is this even calculable)? Does it mean anything that I can pick out “master” and “poor” and “I” and “Pamela” once I cut all the words I think are boring? Or other, even more infrequently occurring words? In other words, how applicable are the insights from this tool? What does critical work look like that uses these technologies to make arguments? In general though, I do see some really big applications for this: I spent some time unsuccessfully looking for something we looked at in one of my Shakespeare classes, which was a sort of visualization of the similarity in word choices between Shakespeare’s plays. We discussed this in the context of authorship and the class came to a sort of conclusion that it didn’t exactly matter whether or not Shakespeare wrote all of his plays, since we study them as though they are all by the same person…How does DH work fit into established canonicity notions like these? I’d love to think through its applications for GSST work specifically/read some of this work. Checked out the Journal of Digital Humanities and some other related sources and they all look really cool.

Challenge: There are plenty of applications/challenges to the Armstrong selections we read to be made using the Voyant text. Her contention that the novel evolves a new female subject — that it creates a form and disseminates it into the world — and that this subject’s value is based on her “essential qualities of mind” could certainly be mapped onto the usage of “know,” “think,” “feel,” etc. used in Pamela — maybe especially if charted against the usage in other books. This seems time-consuming but particularly interesting. With more time it would also be great to pick out all of the gender-related words and compare them to the instances of class-related words — this could lend support to or push back against Armstrong’s argument that differences of class/political issues are subsumed/subordinated to issues of gender. Maybe a collection of words could also lend credence to the ideas we’ve been discussing that these issues of gender are resolved at the end of novels…this would be fun to do with basically any novel involving a marriage plot! (Is fun the word? I don’t know.)


The Voyant Voyage

2 min read

This program is really cool. I am impressed by how easy it is to use, and by how quickly it can convert a 500-something-page novel into a format that is super simple to analyze. First I had to confirm my suspicion that the word "sex" never once refers to the physical act, but instead refers to some particular gender (more often than not Pamela's, but occasionally the male sex is referred to as well). This continues to surprise me, seeing as Pamela's discussions with her parents about her chastity, virtue, and other such matters are extensive and occur throughout the course of the novel. In a similar vein, I also confirmed my suspicion that the word "rape" never once occurs in the novel, although as we have pointed out in class there are many moments in which we would say (as modern readers) that Mr. B attempts to rape Pamela. When one reads a novel like Pamela and finds a lack of profanity (e.g., the word "damn" is apparently bad enough to be dashed out) and a general prudishness regarding sexual matters (i.e., no one ever explicitly says what they're talking about), one might think for a moment that this must mean that the world these characters inhabit is less profane and less sex crazed than our own profane and sex crazed society. But that's nonsense. This novel is absolutely sex crazed: Mr. B haunts Pamela's presence throughout the entire story, and his ultimate desire (to have sex with her) is achieved by the end of the novel via their marriage. There are certainly differences between their world and ours, but it is important for us to realize that the language they use (or don't use) can be misleading if we don't think deeply about what is really going on.

Excel is Useful, Sometimes.

I’ve attached a photo of a little graph I made in Excel to this post; if anyone wants to learn how to make it, just find me in class/around campus and I can show you! It’s pretty easy.

This is a visual representation of the clusters of words I pulled out of my word cloud that I referenced in my last blog post. It compares some of the most frequently used words to the overall number of words used in the novel. As you can see, these 11 words account for almost 25% of all words in the novel!
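For anyone who would rather check this kind of percentage in code than in Excel, here is a minimal Python sketch. It is not what Voyant or my spreadsheet does internally; the function name `share_of_tokens`, the simple tokenizer, and the sample sentence are all made up for illustration, and the sample is a toy string rather than text from the novel.

```python
import re

def share_of_tokens(text, words):
    """Return the fraction of all word tokens in `text` that are in `words`."""
    # Assumption: a token is any run of letters or apostrophes, lowercased.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in words)
    return hits / len(tokens)

# Toy sample (invented for this sketch, not a quotation from the novel).
sample = "poor Pamela said she thought the good lady was good to poor Pamela"
cluster = {"good", "poor", "pamela"}
print(f"{share_of_tokens(sample, cluster):.0%} of the tokens are in the cluster")
```

To try it on a real book, read the plain-text file into a string and pass in your own set of frequent words.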

Lies and Feelings, as Told by Pamela.

First and foremost, I really enjoyed this assignment, and I’d be interested in doing this sort of textual analysis on other novels. Is there a way to get clean versions of other novels? I was thinking of doing this with Americanah.

One of my most interesting findings occurred when I was looking at my word cloud, after using the standard list of English stopwords and then adding my own to the list. I also removed mrs, said, went, quite, shall, sir, dear, and mr from my word cloud. And when I looked at what was left behind, I noticed that the top words fell into a few distinct clusters that I could recognize. The first of these clusters was the group good, poor, lady, little and Pamela. Taken together as some of the most frequent words in the novel, they raise questions about how the novel is attempting to get us to look at Pamela as a character, as a person, and as a woman. In a sense, the high frequency of these words, often used together, is priming us as readers to think of Pamela as a tiny, poor, virtuous woman throughout the novel. It is not enough to just mention “poor Pamela” once or twice; it happens all the time. The repetition of these words throughout the novel may be, on an almost subconscious level, informing our perceptions of Pamela and of female characters and of female subjectivities in the novel in general without us even realizing it. Another cluster of words I noticed was “think, thought, know, and say”. The high frequency of these words helps to show how Pamela is truly an early novel form, and not something else. In Pamela, the primary method of characterization is through what the characters say, think, feel, and do, not necessarily (but sometimes) the societal forces being pressed upon them. It may be possible to conclude that the heavy use of these words is helping to form the idea that characters should, and indeed often do, have individual thoughts, feelings, and subjectivities that make them who they are.
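The stopword step described above can be sketched in a few lines of Python. This is only an approximation of the idea, not Voyant's implementation: `BASE_STOPWORDS` is a tiny stand-in for a standard English stopword list (the real list is much longer), `top_words` is a made-up helper name, and the sample sentence is invented.

```python
import re
from collections import Counter

# Tiny stand-in for a standard English stopword list (assumption: real lists are far longer).
BASE_STOPWORDS = {"the", "and", "a", "to", "of", "i", "my", "me", "you",
                  "he", "so", "was", "is", "that", "it", "in"}
# The extra words removed by hand in the post above.
CUSTOM_STOPWORDS = {"mrs", "said", "went", "quite", "shall", "sir", "dear", "mr"}

def top_words(text, n=10, extra_stopwords=frozenset()):
    """Return the n most frequent words left after removing stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    stop = BASE_STOPWORDS | set(extra_stopwords)
    counts = Counter(token for token in tokens if token not in stop)
    return counts.most_common(n)

# Toy sample (invented for this sketch).
sample = "Sir, said the poor girl, I thought my good lady was good to me, and so said Mrs. Jervis."
print(top_words(sample, n=3, extra_stopwords=CUSTOM_STOPWORDS))
```

Running it with and without `CUSTOM_STOPWORDS` shows how much a handful of extra stopwords can change which words rise to the top.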

Another interesting thing that I found was during the part of the exercise where we explored the frequency of given words throughout the novel as a whole, in a sort of “chronological” sense. I noticed that the word “Honesty” is used semi-frequently throughout the first half of the novel, but then its use drops off almost completely in the second half. Is this because Pamela no longer feels the need to assert her honesty (often in reference to her Virtue), or is it because Pamela is becoming less of an honest character as the novel continues? And what does that say about her reliability as a narrator? All questions I still have after this exercise.
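A rough way to reproduce this kind of “chronological” graph outside Voyant is to cut the text into equal segments and measure the word's relative frequency in each one. The sketch below assumes that approach; the function name `trend`, the default of ten segments, and the toy sentence are my own choices for illustration, not details confirmed about how Voyant draws its graph.

```python
import re

def trend(text, word, segments=10):
    """Relative frequency of `word` in each of `segments` equal slices of the text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return [0.0] * segments
    size = -(-len(tokens) // segments)  # ceiling division, so no tokens are dropped
    target = word.lower()
    result = []
    for i in range(segments):
        chunk = tokens[i * size:(i + 1) * size]
        # An empty trailing chunk can occur when the text is very short.
        result.append(chunk.count(target) / len(chunk) if chunk else 0.0)
    return result

# Toy sample (invented): the word appears only in the first half.
sample = "honesty is all honesty she was happy now"
print(trend(sample, "honesty", segments=2))
```

A list that starts high and falls toward zero would correspond to the drop-off described above.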

Small bug note: if I wanted to compare relative frequencies of two words that were not on the same page (like page 3/14 for the list of frequencies), I had trouble getting both of them to show up on the graph.

Assignment 3: Dialogue, Silence, and Writing

When I first glanced at the word cloud, I saw that the most frequent words were pretty generic and weren't really surprising. This included words like "and", "you", "the", "my", "me", "to", "he", "of", "said", "a", "so". The frequency of "me" and "my" does show the importance of the first person (and Pamela's voice) in this novel, but that's already known because Pamela is, after all, a series of letters. The word "said" might be the most interesting out of this generic list, showing the frequency of dialogue (or references to dialogue). A great part of the letters consists of Pamela's account of the events that occurred and her interactions with other people. I decided to compare the occurrence of this word with the occurrence of the word "silent" (somewhat its opposite).

While "said" is used quite frequently in the novel, neither "silent" nor its variations seem to appear at all. I find this very interesting. It seems the narrator doesn't think silence important enough to mention. In a way, there is always something being said, even if there aren't any characters speaking at the moment. As Pamela writes, she is speaking to the reader--there is no silence anywhere.

I then decided to compare "said" to "write", as writing plays a significant role in the novel as well. The graph is included in the post.

Surprisingly, the word "write" and its variations did not appear very frequently in the novel, at least when compared to "said". More attention seems to be focused on the dialogue, but I thought writing would be important enough to appear more frequently.

Exercise 3: Voyant

The most surprising aspect of playing around with Voyant for me was the frequency of positive words. My word cloud was dominated mostly by words with good connotations—words like “good” (855 occurrences), “gentleman” (137), “goodness” (174), “hope” (372), “kind” (204), and “honour” (209), just to name a few. For a narrative dominated by the avoidance of rape, the word cloud was overwhelmingly positive (the main exception to this rule was the word “poor,” which is used 534 times). The “poor” exception is significant, because it is usually used to describe Pamela—“…found a place that your poor Pamela was fit for,” “poor daughter’s name,” “for what could he get by ruining such a poor young creature as me?” But even still, the positivity of the word cloud surprised me. I examined the uses of the word “kind,” and though it varies based on the context, it is often used to describe Mr. B. Instances of this include “my master has been very kind to me,” “Well, he is kinder and kinder, and, thank God, purely recovered,” and “good, kind, kind gentleman!” He’s trying to rape her the whole time! To me, that portrayal of Mr. B suggests that the novel isn’t quite as feminist as we might like to believe.

I repeated the exercise for Shamela and found some interesting things—“Parson” and “Williams” are much more prominent, for example, suggesting that his character plays a more significant role in Shamela—but I’m not sure it worked correctly. Maybe the Project Gutenberg text is a different edition than the one we read, but I couldn’t find “vartue” anywhere and “virtue” was only used four times. Maybe I just exaggerated its use in our book in my head?

The fact that “honour” and similar words are used so often in Pamela supports Armstrong’s thesis that the qualifications for the desirability of a woman were in flux and shifting towards holding a woman’s moral character in high standing. It is also important to keep in mind, however, that “sir” is used 820 times and “master” 627 lest we get carried away with anointing Pamela a feminist novel or character.


Pamela's Way More Independent than I thought

On my first read through of Pamela, I’m pretty sure that I had the same impression of the book as everyone else did. That being, of course, that it was a pain to go through. Not because of its length (although that didn’t help either), but because of the material. We’re subjected to the character of Mr. B, a misogynistic control freak who is eventually somewhat rewarded in the fact that he marries Pamela, and Pamela, who states on multiple occasions that her virginity is more important than her life. While stating how important her virginity is, she also talks about how she’d rather be a poor virgin than a rich nonvirgin, and she goes on for a while about how honorable it is to be poor. That part stood out to me in particular, because it almost felt like she was romanticizing the notion of being poor, which is definitely not a great thing to do. If a normal, 21st century person picked this book up and read it, these problematic elements would jump out to them. So when you give a book like this to Swatties, we’re obviously going to notice just how terribly this portrays women. Even after talking in class about how this was radically liberal for its time, I still expected to see results in the cloud that matched our views when we first went through it. And, as always, technology proved me wrong. Of course, there are some things that popped up that reinforce the view that this is a problematic novel. Words such as “Master,” “Poor,” “Virtue,” and “Honour” pop up frequently (although I honestly expected to see virtue more than I did), but we also see words such as “Thought,” “Reason,” and “Believe.” In a book so flooded with Pamela doing what Mr. B tells her to do, it’s surprising to see that Pamela spends so much time doing things that a normal, intelligent, free, independent person would do, rather than the virtue slave that I thought the book was depicting.
Even though Pamela is bossed around and does what she’s told, she still holds onto her own person and her own thoughts, and that’s something that I completely missed during my first read through. In general, I think this says a lot about books, and older books in particular. When we pick up a book, we have a preconceived notion about what should happen, how the characters should interact or feel about things, and how the book flows. And if one of those things doesn’t fit what we want, we get aggravated with it. And with that aggravation comes bias, and when you’re biased you tend to see things very, very subjectively. It’s tough to read works without bias, but I think that’s definitely something that I in particular need to work on. If we really want to get the most out of a book, we need to put aside our ideas of what constitutes a great book and see everything through a more objective lens. …And THEN after we do that, we can joke about her “vartue.”


Agency and Goodness

I was interested to observe the frequency of active words, such as said, thought, reason, believe, saw, hope, gave, stay, I’ll, came, shall, and wish, which I associate with agency, freedom, self-determination, and the new rise of these options for women. Then again, I did not examine every case to see who was the subject of each verb.

On the other hand, the most frequent words also included good, dear, master, and honour, which I see as terms used to narrowly define and limit women. Perhaps this obscures the more specific uses of these terms in favor of generalizations.

I decided, then, to look up the frequency and specific usage of the terms good and bad, with surprising results. Good remains solidly frequent throughout the novel, while bad starts out frequent but declines markedly toward the end. I'm not quite sure what this could signify: the triumph of good over evil? Bad appears a lot next to conscience and in terms of the contents of one's heart. Also: bad name, bad conduct, bad actions, bad words, and bad designs. I wonder what the decline of "bad" implies - certainly not the end of moralizing judgment or categorical criticism.
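Checking which words sit next to a term like "bad" can also be done with a short script. The following is a minimal sketch that counts immediate neighbors within a small window; the name `neighbors`, the window size, and the sample string (assembled from the phrases listed above, not quoted from the novel) are assumptions for illustration.

```python
import re
from collections import Counter

def neighbors(text, keyword, window=1):
    """Count the words found within `window` tokens on either side of `keyword`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    target = keyword.lower()
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == target:
            counts.update(tokens[max(0, i - window):i])   # words to the left
            counts.update(tokens[i + 1:i + 1 + window])   # words to the right
    return counts

# Toy sample built from the phrases mentioned above (not a real passage).
sample = "a bad name and bad conduct, bad actions, bad words, and a bad name"
print(neighbors(sample, "bad").most_common(3))
```

Filtering out stopwords before counting would keep words like "a" and "and" from crowding the list.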


Assignment 3: Voyant

What a cool program Voyant is. My first experiences with the word cloud alone brought up some interesting stuff. The two high-frequency words that really caught my eye were "should" (676) and "master" (627). Both imply some form of submission and obligation. This is pretty surface level, but I'd still like to point out the dissonance between the image of Pamela and the words in the novel. Pamela resists her master, refuses to bend to Mrs. Jewkes, and in general sticks to her guns against immense pressure. Yet the most prevalent words used to tell her story feature a focus on her servility and subjection. Pamela is so praised because she serves her master as well as she can, though this includes defying him. Pamela is a completely obedient character: following her parents' wishes, obeying the writ of the land in social terms (for example, refusing to enter into anything that would make her seem of a higher class, such as wearing nice clothes, as well as refusing Mr. B's advances), going so far as asking, but never taking (her constant request to be sent home, but never any actual attempt to go home when she still could). Pamela is not defiant. She is pious, so much so that her acts of "rebellion" (i.e., refusing Mr. B's advances) are in fact acts of compliance with a more righteous guideline.

There was a lot to find with Voyant, beyond that first revealing glance. I edited stopwords (mostly taking out names) for the frequency cloud and became fascinated by "good" (855) and "poor" (534). Perhaps because the words were similar in size (frequency) and shape ("oo"). But of course, there is a definite correlation between the two in the novel. Poor Pamela and her poor parents are the ultimate good characters.

My last interesting observation came from looking at a word from a scene and its graph through the novel. I looked at the second attempted rape scene, when Pamela is held down between Mrs. Jewkes and Mr. B. I graphed the frequency throughout the novel of "wicked" (128). It spikes at the two rape scenes, and drops sharply after book two. Then I looked at the names I remembered Mr. B calling Pamela: sauce-box, hussy, and slut. I had expected to find a much higher frequency of these words, because he verbally abused her so much more than she him, but all clock in under 20. What does this mean? Is it because Pamela is narrating and refers to Mr. B more often than to herself (via him)? Is 128 high frequency? Does this entire investigation yield anything? I liked finding through Voyant that the big picture I got from the book often didn't line up with the micro picture afforded by Voyant.


Assignment 3: The Targeted Demographic

Pamela is a novel that contributes to the Age of Enlightenment, emphasizing individualism over traditional roles and authority. The word cloud shows more counts of terms of individualism (e.g. “lady”, “mind”, and “thought”) than terms of traditional roles and politics (e.g. “girl” and “God”), indicating that Pamela is an individualist and that there is a separation of gender and politics. However, the popularity of Pamela was due to its novelty: the success of a girl of the lower class against a man of the aristocracy (i.e. people identified with Pamela, the underdog). The word cloud reveals that the word “poor” is mentioned about 500 times, appealing to the lower class. The working and middle classes were enchanted with the notion of rising up the socioeconomic ladder, against the forces of the aristocracy. However, this intended audience was most likely ignoring the ideas of breaking traditional roles mentioned in the novel. Therefore, we cannot state that Pamela exemplifies the Age of Enlightenment, only that it partially advances it. We cannot state that Pamela spearheaded the way for other proto-feminist novels (which arrive about 100 years later), only that it suggests that women have strong desires. We cannot state that Pamela instigated ideas of going against traditional authority, only that it manipulates that authority.

I do concede that Pamela does encourage individualism over the established authority. In Desire and Domestic Fiction, Nancy Armstrong states that “In place of the intricate status system, that had long dominated British thinking, these authors began to represent an individual's value in terms of his, but more often in terms of her, essential qualities of mind” (467). Because Pamela is written in first person, the audience experiences the novel through the actions and view of Pamela, gaining a strong individualistic sense. Expectedly, the word “I” is used about 10,000 times. This point of view alone allows for a better representation of Pamela and her values; Richardson wanted to break the idea of “domesticated women”, characterizing Pamela as intelligent and a freethinker. Looking back, his efforts were deemed heresy and seemingly fruitless, as seen by Shamela in 1741 and the rise of the cult of domesticity in the 19th century.


Poor Pamela

I spent a while sifting through different words and comparing their uses throughout the book, and Voyant was a fantastic tool for this kind of study. I especially valued the "Keywords in Context" tool at the bottom right of the screen which showed a few words before and after the keyword. This made comparing the linguistic usage of the words much quicker and more efficient. Out of all the words I looked at, the one that stuck out to me the most was "poor," used 534 times.
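The idea behind a keyword-in-context view is simple enough to sketch in Python. This is a bare-bones imitation, not Voyant's own code: the function name `kwic`, the bracket formatting, and the window size are assumptions, and the sample string is a shortened paraphrase used only to show the output shape.

```python
import re

def kwic(text, keyword, window=4):
    """Return each occurrence of `keyword` with `window` words of context on each side."""
    # Assumption: tokens are runs of letters, apostrophes, and hyphens (so "to-morrow" stays whole).
    tokens = re.findall(r"[A-Za-z'-]+", text)
    target = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == target:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines

# Shortened sample for illustration only.
sample = ("several poor people begged my charity, and I said, "
          "divide that money to the poor, and let them come to-morrow")
for line in kwic(sample, "poor", window=2):
    print(line)
```

Reading down the bracketed column makes it quick to see whether "poor" is describing Pamela herself or someone else.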

The first thing that I noticed about the usage of "poor" is that Richardson uses it gradually less frequently throughout the course of the novel. The word trend graph I've included illustrates this. However, as you'll notice, there's a slight spike towards the end of the novel. As I explored the uses of the word in different sections of the book, I discovered that towards the beginning, Pamela is constantly describing herself and her parents (even when she is speaking to them directly) as poor. Over and over again, she reiterates how poor they are and how low her station of life is. As she begins to be ingratiated into a higher socioeconomic ring through her entanglement with Mr. B, she begins to describe herself as poor less often and begins using "poor" to describe other things, such as an emotional state or simply referring to others as poor. For example, in the last letter, she describes a situation in which "several poor people begged my charity, and I beckoned John with my fan, and said, Divide in the further church-porch, that money to the poor, and let them come to-morrow morning to me, and I will give them something more, if they don't importune me now." Later, in the epilogue type thing that sums up why Pamela is an excellent example to follow, Richardson declares that "her diffusive charity to the poor" (inherently saying that this is a group which she is not a part of...anymore) has "made her blessed." I found it strange that Pamela was so ready to stop associating herself with poverty when offered a way out of it, especially considering how ingrained it was in her identity towards the beginning.


Exercise 3

Whilst fooling around with Voyant, I thought it might be interesting to see how often class structure, relative to other themes, is represented in Pamela. One of Armstrong’s overarching points pertains to domestic fiction’s efforts to separate female individuality from class and politics, so I set out to test this theory. In order to do this, I searched through the corpus for words that we typically associate with wealth and prominence. I rummaged for indicators such as “rich”, “poor”, “poverty”, “estate”, “money”, “merit”, and “power”. Of these, “poor” was by far the most frequent occurrence, appearing a total of 534 times. This is likely due to the fact that “poor” has the largest variety of connotations; for example, the most common use of the word in Pamela is to describe distressed individuals. Still, I found plenty of instances where “poor” is used in the context of poverty, and those instances appear all throughout the novel. Although none of my other entries appear as often, they still seem relevant. “Estate”, “rich”, “money”, “merit”, and “power” each appear roughly 40 to 100 times, giving me the sense that Richardson saw hierarchical class structure as something pertinent within his narrative.

To further this investigation, I searched briefly for words that we might associate with Pamela’s individuality. The most obvious indicator is “virtue”, along with any relevant modifiers. In total, the words “virtue”, “virtuous”, “virtues”, “virtuously”, and “unvirtuous” appear 120 times—a number that, to me, is shockingly low. I also took the time to search for instances where “liberty” and “liberties” are used, producing a sum of 50 results. Even scarcer is the use of the word “moral”, of which Voyant only furnished three occurrences. Similarly, “agency” only appears twice throughout Pamela. Considering that Richardson’s novel seems rooted in its defense of Pamela’s virtue and individuality, I find it peculiar that these words don’t emerge very often at all. What also strikes me is the fact that words affiliated with class structure appear just as frequently, if not more.
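Adding up a word and its variants by hand is easy to get wrong, so here is a small Python sketch that counts a whole word family with one pattern. The helper name `family_counts`, the patterns, and the sample sentence are invented for illustration; note that a loose pattern like the one for "virtue" would also catch unrelated words such as "virtual" if they appeared.

```python
import re
from collections import Counter

def family_counts(text, pattern):
    """Count every token that fully matches the regular expression `pattern`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    regex = re.compile(pattern)
    return Counter(token for token in tokens if regex.fullmatch(token))

# Toy sample (invented for this sketch).
sample = ("Her virtue and virtuous conduct; such virtues are never unvirtuous. "
          "She asked for liberty.")

virtue_family = family_counts(sample, r"(un)?virtu[a-z]*")
liberty_family = family_counts(sample, r"libert(y|ies)")
print(dict(virtue_family), sum(virtue_family.values()))
print(dict(liberty_family), sum(liberty_family.values()))
```

Printing the individual variants alongside the total makes it easy to confirm that the pattern is only picking up the words you meant to include.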


According to Armstrong, the novel established the divisions of the world as gender-based, rather than politics-based. Gender is a proxy for establishing personal identity based on thoughts, feelings, and virtue, rather than by religious sect, class, etc. To translate this claim into a very focused study of one word in one novel, I chose to look at words related to virtue, which Pamela is very preoccupied with. I was surprised to see that “virtue” did not show up on the word cloud, even after correcting the cloud to eliminate the most common words. However, “good” and “goodness” did show up on the word cloud. “Good” is rather large and thus was used very often in the book:

I then looked through a bit to see how these words are used. Here are some examples:
-good lady (her former master who dies)
-if I was a good girl…
-Good sirs!
-you are a good girl, Pamela
-good old widow
-good families
-if we are good…(talking about God)
-rather than forfeit my good name
-good advice
-good character
-that’s my good girl! He exclaimed

Most of these are describing Pamela’s character or are in some way related to remaining a good person or virtuous person.

I also looked at its frequency throughout the book:

The usage of “good” fluctuates throughout the book, but it is relatively the same at the end as it is at the beginning. Could this illustrate that Pamela at the end keeps her virtue, as she is just as good at the end as she was in the beginning?

I would probably need to look at the usage of “good” in other texts, pre-Pamela and post-Pamela, and compare them to Pamela, to really make a claim about its usage in Pamela and whether it can attest to Armstrong’s claims that the inner self (thoughts, feelings, virtue) becomes identity. But for now, I would say that its frequent usage and its similar usage at the beginning and the end can tell us that the maintenance of virtue is important to Pamela, and consequently, important to the readers of the time.
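Comparing a word across texts of different lengths only works if the counts are normalized, since a longer book will have more of almost every word. A minimal Python sketch of one common normalization, occurrences per 10,000 words, is below; the function name `per_10k`, the placeholder labels `text_a` and `text_b`, and the toy strings are all assumptions standing in for real novels.

```python
import re

def per_10k(text, word):
    """Occurrences of `word` per 10,000 word tokens in `text`."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    return 10_000 * tokens.count(word.lower()) / len(tokens)

# Placeholder texts (invented); in practice each value would be a whole novel read from a file.
texts = {
    "text_a": "a good girl with a good name and good advice",
    "text_b": "the good old widow went home in the evening after a long day",
}
for label, text in texts.items():
    print(label, round(per_10k(text, "good"), 1))
```

With real files in place of the toy strings, the same loop would give comparable rates for texts written before and after Pamela.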