
Exercise 8

2 min read

How much do titles foreshadow what the novel is truly about? To what extent do the elaborate descriptive titles of the 18th century novels we’ve looked at reflect the themes with which the novel is occupied? Do the words that appear on the title page reappear throughout, or are they simply there to attract readers?

I’m not entirely sure how to execute this using only the exact topic modeling and metadata tools of the past two assignments, but very similar technology could answer these questions. Either the topic modeling would need to be run on one novel at a time (very inefficient, with tons of iterations), or each novel’s topic results would need to be linked to that same novel’s metadata record. In other words, the two technologies would need to be combined so that we could compare the words in a title to the themes within that individual novel. This would allow us to determine (albeit pretty abstractly and inconclusively) how much of a correlation there is between what the title promises the reader and what is delivered.

Alternatively, there could be a cool tool that uses the basis of topic modeling (co-occurrence of words) but examines the titles as well as the body of the text. In novels with “virtue” in the title, what percentage of the words are “virtue” or related terms? And what topic does “virtue” belong to? What does that tell us about novels with “virtue” in the title?
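Here is a rough Python sketch of the percentage question, assuming each novel's body is available as a plain-text string. The function name, the list of "related terms," and the toy sentence are all hypothetical placeholders rather than anything from the class tools, and deciding which terms count as related is itself a judgment call.

```python
import re
from collections import Counter


def term_share(text, terms):
    """Return the fraction of word tokens in `text` that appear in `terms`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    return sum(counts[t] for t in terms) / len(tokens)


# Hypothetical family of related terms; choosing this list is an interpretive act.
VIRTUE_TERMS = {"virtue", "virtuous", "virtues"}

# Toy stand-in for a novel's body text (not a real quotation).
body = "Her virtue was tested, and the virtuous lady kept her virtue."
share = term_share(body, VIRTUE_TERMS)
print(f"{share:.1%} of words are virtue-related")
```

Running the same function over every novel with "virtue" in its title, and again over every novel without it, would give two sets of percentages to compare.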


Exercise 7: Topic Modeling

3 min read

Domestic life: house company gentleman person time acquaintance lordship great place conversation young lord received friend met day agreeable evening visit lived

Family: father son young daughter mother family child years wife fortune good children life age great year time married estate left

Woe is me: tears heart heaven death grief soul distress life eyes tender comfort melancholy despair pity unhappy moment sorrow alas days felt

The first Moby Dick: ship captain sea made men board found wind great land shore place boat richard water till voyage put sail immediately

Humanity: nature man human virtue life state god men world religion sense spirit natural power divine reason creatures mankind subject soul

Sophistication: country people men manner great find found order pleasure proper generally present english china society art world mankind taught peculiar

I kill, therefore I am: war army general french english enemy country men battle peace time enemies england number success forces command france troops part

Let’s get it on: passion love made found time person de horatio mistress lover affection loved knew louisa gave words nature tho thought thoughts

Someone studied for his SATs: peregrine pm pickle young hero consequence order pipes disposition sooner began view gentleman immediately satisfaction commodore great manner opportunity countenance

Authors who can’t spell: count ihe ed ft termes fee duchefs becaufe madam duke ihould mc fa paris myfelf cafardo mifs fe days wa

Coming into the assignment I wasn’t entirely convinced as to the usefulness of topic modeling; it seemed too mathematical and arbitrary to be of much help. After running the program, however, I was pretty impressed. Some topics were indiscernible, but some of them represented clear obsessions of the 18th century novel. And while some topics were compiled almost entirely from a single novel, some, such as the “Woe is me” topic, were sourced from a wide variety. I make this observation because topics from single novels are interesting if we are curious about that specific novel, but generally unhelpful for an overview of themes of the century. The fact that some of the themes come from a diverse set of novels, however, demonstrates the usefulness of the tool. I would be interested to do a similar thing on a set of contemporary novels. Many of these themes were pretty predictable and fit into my prior perception of novels of the period based on what we’ve read. I don’t have a similar sense of what topics I would predict for today’s novels, and I wonder if that’s because they’re less repetitive (doubtful), there are simply more of them, or just because I don’t have the benefit of hindsight.
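One way to put a number on the single-novel versus many-novels distinction is to ask how much of a topic's weight comes from its biggest contributor. The sketch below assumes the topic model can report a per-novel weight for each topic. The function name, the novel labels, and all of the weights are invented for illustration, and the guess that the "SATs" topic comes mostly from Peregrine Pickle is only my reading of its word list.

```python
def top_novel_share(weights_by_novel):
    """Return (novel, share): the top contributor to a topic and its share of the total weight."""
    total = sum(weights_by_novel.values())
    if total == 0:
        return None, 0.0
    novel = max(weights_by_novel, key=weights_by_novel.get)
    return novel, weights_by_novel[novel] / total


# Made-up weights purely for illustration; a real run would read them
# from the topic model's document-topic output.
pickle_topic = {"Peregrine Pickle": 0.9, "Novel B": 0.05, "Novel C": 0.05}
woe_topic = {"Novel A": 0.3, "Novel B": 0.25, "Novel C": 0.25, "Novel D": 0.2}

for name, topic in [("SATs", pickle_topic), ("Woe is me", woe_topic)]:
    novel, share = top_novel_share(topic)
    print(f"{name}: {share:.0%} of the topic's weight comes from {novel}")
```

A share near 100% would flag a topic that mostly describes one book, while a low share would suggest a theme spread across the corpus.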


Book Cube

2 min read

Staring at my desk looking for inspiration for the experimental bibliography project, my eyes landed on a photo cube my mom gave me at the beginning of freshman year. The concept is simple: it’s a normal six-sided cube, with each face of the cube featuring a different photograph. Each side of the cube represents a part of my life: there’s a picture of me with my mom; with my dad; with friends at prom; playing baseball; as a young kid; and with my high school cross-country team.

The idea that occurred to me then was the equivalent, but for a book. Each side could represent a part of the traditional bibliography. I’m not sure exactly how to divide it up, but I think it would have a side dedicated to the title, to the subtitle, to the city of publication, to the year of publication, and maybe two sides for two different notes (epistolary form, half pages, and illegible scribbles on the title page are possibilities). Of course, the representations of these aspects of the book wouldn’t be just words on the sides—they would be some kind of artistic interpretation of them. Ideas I’ve had so far include an old style photograph that I would stage and edit for the title, a drawing (though it’s definitely not my strong suit) for the city of publication (Big Ben or something of that sort), etc. I’ll have to think a bit more about the specifics, but I think it could be pretty cool. It ostensibly gives the same information as the traditional bibliography, but in a totally different way—it’s a physical manifestation of the information, and the physicality of the book is a big part of what you miss with a traditional bibliography.


Descriptive Bibliography

1 min read

Lady. The unfortunate union: or, the test of virtue. A story founded on facts, and calculated to promote the cause of virtue in younger minds. Written by a lady. London: printed for Richardson and Urquhart, 1778.

Lady. THE | UNFORTUNATE UNION: | OR, THE | TEST OF VIRTUE. | A | STORY founded on FACTS, | AND | Calculated to promote the Cause of VIRTUE | in Younger Minds. | Written by a LADY. | VOL. 1 | London, | Printed for RICHARDSON and URQUHART, | under the Royal Exchange, and at | No. 46, Pater-noster-Row | MDCCLXXVIII.

    I 197p; II 226p. 12mo.


    Vol 1. A1r title, B1r-K3v text. Vol 2. A1r title, B1r-L5v text.


    Source: Harvard University Houghton Library. Digital facsimile obtained from ECCO. Epistolary form. No half title, no advertisements, no dedication, no preface, no index. In Volume 1, pages 132 and 133 are cut off halfway and then repeated as full pages. The same is true of pages 152 and 153 in Volume 2. There are illegible handwritten scribbles on the title page next to the title.


Exercise 6: Metadata

3 min read

I thought it would be really interesting to see what places were mentioned in titles to help us think about the obsessions of the time: what places were interesting, what places people wanted to read about. The results of the mapping, though, weren’t all that surprising. The highest concentration was in Europe (mostly the UK), with a sprinkling of places in Africa and Asia and a good number in what is now the USA. The reliability of this map, however, should be questioned. My favorite example of the flaws of the geocoding is that “The adventures of Abdalia, son of Hanif, sent by the sultan of the Indies, to make a discovery of the island of Borico, where the fountain which restores past youth is supposed to be found. Also an account of the travels of Rouschen, a Persian lady, to the topsy-turvy island, undiscover'd to this. The whole intermix'd with several curious and instructive histories. Translated into French from an Arabick manuscript found at Batavia by Mr. de Sandison : and now done into English by William Hatchett, Gent. Adorn'd with cuts” somehow geocoded to Illinois. There are lots of locations in that title; how the geocoder decided it was in Illinois is beyond me. So while the mapping experiment is interesting, it should be taken with a grain or two of salt (much like the mapping Robinson Crusoe project).

As for most of the Fusion charts, I found it difficult to draw meaningful conclusions from them because the raw counts don’t mean much on their own. There is, for some reason, a big spike around 1770 in the number of novels, but I’m not aware of a significant reason for that, and it could simply be an artifact of the dataset from which we drew. There’s a fairly steady increase in epistolary novels and in the use of non-narrative forms, but since publications in general increased and the charts show counts rather than each form’s percentage of all publications, those increases are to be expected and don’t mean much. The pie chart points the same way: it shows a fairly even distribution among epistolary novels, third-person narrations, and first-person narrations.

I’m a fan of word clouds, so I found the last part interesting. A word cloud of titles revealed (more confirmed than revealed, I suppose) a tendency to give lots of information in the title. Words like “containing,” “price,” “life,” “edition,” “travels,” “volumes,” “history,” “adventures” all show up prominently and all imply a certain piece of information being given in the title beyond the kind of title we would expect from a contemporary novel. Overall, I think these programs are cool and fun to play around with, but drawing definitive (or even speculative) conclusions from them is difficult. I think the further research question I’d be most interested in given the tools (which wouldn’t be that hard) would be the share (percentage) of epistolary vs third person vs first person narratives over time.
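A small Python sketch of that share-over-time calculation follows, assuming the metadata can be exported as (year, narrative form) pairs. The function name and every row of sample data are invented for illustration. Grouping by decade is just one possible choice of bin.

```python
from collections import Counter, defaultdict


def form_shares_by_decade(records):
    """Map each decade to {form: share of that decade's novels}."""
    counts = defaultdict(Counter)
    for year, form in records:
        counts[year // 10 * 10][form] += 1
    return {
        decade: {form: n / sum(forms.values()) for form, n in forms.items()}
        for decade, forms in sorted(counts.items())
    }


# Invented (year, narrative form) rows standing in for the class metadata.
records = [
    (1741, "epistolary"), (1745, "first-person"),
    (1772, "epistolary"), (1774, "epistolary"),
    (1776, "third-person"), (1779, "first-person"),
]
for decade, shares in form_shares_by_decade(records).items():
    print(decade, {form: f"{s:.0%}" for form, s in shares.items()})
```

Because each decade's shares add up to 100%, a rise in one form here would reflect a real shift in preference rather than just more books being printed.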


Exercise 5: Bibliography

5 min read

Looking at this bibliography is, I think, particularly useful for novels of this time period because titles of novels in the late 18th century told so much about the novel itself. Looking at the titles of contemporary novels wouldn’t be much help in understanding trends; we could observe patterns in the titles themselves, perhaps, but not in the content of the novels. (Besides maybe The Brief Wondrous Life of Oscar Wao. That fits right in. How about The Brief Wondrous Life of Oscar Wao; or, The History of a Fat Nerd’s Futile Loves; A Virtuous History Intended to Instruct in Morality and Provide Amusement to the Fair-Sex; Necessary to be Had in All Households.) Extremely Loud and Incredibly Close? What? 10:04? Huh? These titles only make sense to someone who has read the novel. But the titles alone of these early novels tell us a lot about what they tried to accomplish.

One fairly obvious but interesting thing to note is that we are beginning to see the word “novel” in many of these titles. The novel as a form has been established in a way it hadn’t been in the early part of the century, and novels are now self-aware. They continue to define the genre, but it’s now deliberate, a kind of self-definition rather than experimentation. The influence of Pamela is painfully clear in most of these titles. They usually begin with a short title—Modern Seduction or The Unfortunate Union—followed by a long explanation of that title, one that generally lets the reader know what to expect—Modern Seduction, or Innocence Betrayed: Consisting of Several Histories of the Principal Magdalens, Received into that Charity Since its Establishment. Very Proper to be Read by All Young Persons; as They Exhibit a Faithful Picture of Those Arts Most Fatal to Youth and Innocence; and of Those Miseries that are the Never-Failing Consequences of a Departure from Virtue. By the Author of Lady Louisa Stroud or, The Unfortunate Union: or, the Test of Virtue. A Story Founded on Facts, and Calculated to Promote the Cause of Virtue in Younger Minds. Written by a Lady. We see in these titles continuations of trends we’ve seen in many earlier titles: an insistence on the text’s faithfulness to truth, an assurance that the text promotes virtuous morals, etc. One interesting trend that we haven’t seen is the emphasis that some texts were “Written by a Lady.” I suppose this stems from the idea that ladies are the best models for young women; who better to teach virtue than a lady?

Of course, not everyone was convinced by the titles. A review of The Unfortunate Union read, “There is something so exceedingly disgusting in the exhibition of characters, which have no tints of elegance or virtue… there is something so extremely painful, in seeing such characters employed in harassing, tormenting, and defaming an innocent and gentle spirit—that it is surprising such representations should be thought capable of affording entertainment, or calculated to promote the cause of virtue in young minds.” Pretty harsh, to say the least—and all the criticism is centered around the novel’s inability to teach virtue. By contrast, Evelina’s reviews are glowing—and it’s interesting that the same reviewer writes a lot about Burney’s impressive command of the language, not just the novel’s moral strengths.

There are a few titles I was not able to find on ECCO, including Modern Seduction, or Innocence Betrayed and Isabella: or, the Rewards of Good Nature and The Generous Sister. A Novel. In a Series of Letters. By Mrs. Cartwright. In Two Volumes. This could be my fault, but if not, I wonder why it’s happening. Eventually I settled on Friendship in a Nunnery; or, the American Fugitive in addition to Misplaced Confidence; or, Friendship Betrayed and John Buncle, Junior, Gentleman, and The Unfortunate Union: or, the Test of Virtue. Easily the most interesting title page of these selections belongs to John Buncle, Junior, Gentleman. Much sparser than the others, this title page also includes a patterned image on the front, which stands in stark contrast to the pure text of all the other title pages. John Buncle was published in Dublin while the rest were published in London, so perhaps that could have something to do with it. The differences don’t end there. John Buncle also includes a table of contents on the second page, something that all the other novels I looked at lack. This table includes such humorous (to us) labels as “Sentimental Writing,” “Talkative Woman,” and “Self-Importance.” This page also has a patterned image on it. A broader look at the differences between Dublin- and London-published works would be fascinating and would perhaps reveal more overarching trends in the publishing world.

I set my date-of-publication parameters from 1719 (when Robinson Crusoe was published) to 1800. Looking at the tile visualization of the term clusters, the prominence of religion really stands out. Words like “church,” “parish,” “sect,” “religion,” “Christian,” and “principles” are all among the most commonly used in titles, subjects, and beginnings of novels. Inspired by Pamela, I looked up the term frequency for “virtue” over the course of this time period; the graph shows a steady increase in the use of the word, from 824 documents in 1720 to 2816 in 1800. Though the 1740 publication of Pamela doesn’t spark a sharp increase as I had hoped, it is part of a trend of more and more frequent use of the word. I suppose it’s possible, though, that this is simply due to more novels being published, and not a higher percentage of novels being concerned with virtue; this seems to be a potential problem with Artemis.
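To separate "more documents" from "more virtue," the hit counts would need to be divided by the total number of documents for each year. The Python sketch below does that. The two hit counts are the ones reported above, but the yearly totals are placeholders I made up, since I don't have the real figures, and the function name is hypothetical too.

```python
def relative_frequency(hits, total):
    """Share of documents in a year that contain the term."""
    if total <= 0:
        raise ValueError("total must be positive")
    return hits / total


# Hit counts reported above for "virtue".
hits = {1720: 824, 1800: 2816}
# PLACEHOLDER totals, invented for illustration only; the real yearly
# totals would have to come from the database itself.
totals = {1720: 10_000, 1800: 40_000}

for year in sorted(hits):
    print(year, f"{relative_frequency(hits[year], totals[year]):.1%}")
```

With these invented totals the share would actually fall even though the raw count more than triples, which is exactly why the raw graph can't settle the question either way.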


Exercise 4: OCR

2 min read

I found this assignment very entertaining. The first OCR program I tried (it got this honor by virtue of being the first Google result; oh, all-powerful Google) failed in pretty impressive fashion. Here are some small excerpts from the first chapter:

4%.

TRISTRY rid_ SH.4A ND 17, Gent.

II4 I t 44;14084 Ir.1... 74 • % r-4 9.11 illr• mkt 47; A

For some reason it began to catch on for a bit:

With either my father or my mothcr, or indeed both of them, as they. were in duty both equally bound to it, had minded what they were about -when they begot me; had they duly confalciA. how much depended upon what thcv -were then doing; §--- that not: crIly thc procludic Et of a rational Ecincr was con-k:, , cern'd in it, Dut that pofnbly the

before falling apart again:

• .-1:,7 .11:1: 0 1 4 ,. . .1 F' iTY11:1 n :-‘11 Z.111C1 t.',..:;:fir

I wonder if its success has to do with the amount of random surrounding shapes. The first page, which the program struggled with, is dotted with stray ink that probably contributed to the confusion. Gdocs was actually fairly successful; beyond a few very strange series of characters (“སྐབླླ་མཟ † : : } -г fё;н А в, г”) and what looks like an attempt at an emoji (2- :), it was more or less spot-on.
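"More or less spot-on" could be made a little less impressionistic by scoring OCR output against a hand-typed reference. This is a minimal Python sketch using the standard library's difflib; the function name is hypothetical, and the reference line is my own transcription of the sentence, so it is an assumption about what the page says.

```python
from difflib import SequenceMatcher


def similarity(ocr_text, reference):
    """Return a 0-to-1 similarity ratio between OCR output and a reference transcription."""
    return SequenceMatcher(None, ocr_text, reference).ratio()


# Hand-typed reference for one clause (an assumption, not tool output).
reference = "had minded what they were about when they begot me;"
# Two snippets of OCR output copied from the excerpts above.
good_ocr = "had minded what they were about -when they begot me;"
bad_ocr = "II4 I t 44;14084 Ir.1... 74"

print(round(similarity(good_ocr, reference), 2))
print(round(similarity(bad_ocr, reference), 2))
```

A ratio like this only measures character overlap, so it says nothing about the intangibles, but it would let two OCR programs be compared on the same page.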

I transcribed a lot of interviews this summer for my internship, and this exercise brought back memories of that tedium (believe it or not, recordings of Israeli authors being interviewed in crowded sidewalk cafés are not easy to decipher). Doing that job, I was struck by the subjectivity of my task; there were so many different ways to transcribe speech while staying faithful to the words, and how I punctuated the conversation made a big difference in the tone conveyed. Though the OCR should theoretically stay faithful to the text and punctuation, it seems a similar situation of subjective transcription/translation in which an aspect of the original has been lost. It feels to me more like a summary, even though the words are nearly identical; it can stand in for the plot, but not necessarily the intangibles.


Exercise 3: Voyant

2 min read

The most surprising aspect of playing around with Voyant for me was the frequency of positive words. My word cloud was dominated mostly by words with good connotations—words like “good” (855 occurrences), “gentleman” (137), “goodness” (174), “hope” (372), “kind” (204), and “honour” (209), just to name a few. For a narrative dominated by the avoidance of rape, the word cloud was overwhelmingly positive (the main exception to this rule was the word “poor,” which is used 534 times). The “poor” exception is significant, because it is usually used to describe Pamela—“…found a place that your poor Pamela was fit for,” “poor daughter’s name,” “for what could he get by ruining such a poor young creature as me?” But even still, the positivity of the word cloud surprised me. I examined the uses of the word “kind,” and though it varies based on the context, it is often used to describe Mr. B. Instances of this include “my master has been very kind to me,” “Well, he is kinder and kinder, and, thank God, purely recovered,” and “good, kind, kind gentleman!” He’s trying to rape her the whole time! To me, that portrayal of Mr. B suggests that the novel isn’t quite as feminist as we might like to believe.
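Looking up every use of a word like "kind" is essentially a keyword-in-context search. Here is a rough Python sketch of that idea, not Voyant's actual method. The function name and the window size are my own choices, and the sample text is just two of the phrases quoted above stitched together.

```python
import re


def keyword_in_context(text, keyword, width=3):
    """Return each occurrence of `keyword` with up to `width` words on either side."""
    words = re.findall(r"[\w']+", text)
    hits = []
    for i, word in enumerate(words):
        if word.lower() == keyword:
            start = max(0, i - width)
            hits.append(" ".join(words[start:i + width + 1]))
    return hits


# Two of the phrases quoted above, joined into one toy text.
sample = "my master has been very kind to me. good, kind, kind gentleman!"
for line in keyword_in_context(sample, "kind"):
    print(line)
```

On the full text of Pamela, skimming the lines this returns would show at a glance how often "kind" sits next to "master" or "gentleman."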

I repeated the exercise for Shamela and found some interesting things—“Parson” and “Williams” are much more prominent, for example, suggesting that his character plays a more significant role in Shamela—but I’m not sure it worked correctly. Maybe the Project Gutenberg text is a different edition than the one we read, but I couldn’t find “vartue” anywhere and “virtue” was only used four times. Maybe I just exaggerated its use in our book in my head?

The fact that “honour” and similar words are used so often in Pamela supports Armstrong’s thesis that the qualifications for the desirability of a woman were in flux and shifting towards holding a woman’s moral character in high standing. It is also important to keep in mind, however, that “sir” is used 820 times and “master” 627, lest we get carried away with anointing Pamela a feminist novel or character.


Mapping Crusoe

2 min read

I had a good time with this—mostly laughing at MyMaps. Its mistakes are understandable, but amusing nonetheless. Of the eleven places MyMaps located in America, the only one it got right, as far as I can tell, was “America,” which apparently is on the border of Kansas and Oklahoma. Even better, according to the NER, “Blessed Virgin” is apparently a location. Understandably, MyMaps was unsure as to what to do with that. There were also numerous actual places that MyMaps wasn’t able to recognize due to spelling variations or other variables.

Jokes and shortcomings of the programs aside, however, this exercise was interesting to me mostly because of what we discussed in class. Though Robinson Crusoe involves its fair share of traveling, Crusoe does not actually visit nearly as many places as Defoe mentions; perhaps this reflects a restlessness or curiosity of the age. This is purely speculative, but it would seem to make sense that the early 18th century was, in many ways, a period in flux between today’s globalized world and the isolation of previous centuries. Travel was difficult and not available to the everyday person, but, simultaneously, there was an awareness of other worlds that hadn’t existed before. Crusoe reflects that; he satisfies some of the curiosity through travel, but, due to the limitations of the time, his curiosity still outpaces his experience.