The Ngram Viewer was very useful for me once I plugged in the words ‘Toledo,’ ‘casket,’ ‘bishop’ and ‘Visigoths,’ four very specific words from my research on Spain in the Middle Ages. After choosing a date span at the bottom of the viewer, I found a good ebook that relates to my area of study, and the book is now on my Google ‘shelf.’ It inspired me to look for editions of Gregory of Tours’ work, which is the sort of benefit I predict for scholars in their work: inspiration, encouragement and finding new specifics. Bookworm is my favorite tool, as a scholar, because I can access the books. With Google Books I’m never sure what to expect.
The book that I selected for the assignment was a collection of short stories from 1884 by Edgar Allan Poe. I know it falls very close to the cutoff for the “Pre-20th century” requirement, but when I was moving back and forth between the plain text and the scan, Google Play Books kept popping up for both the 16th-century Spanish play I had chosen and the 19th-century book. The play was originally in Castilian Spanish too, so I thought the process might be simpler with a text that was originally written in English. I copied the plain text of Poe’s The Gold-Bug into a Word doc.
I saw no major problems with the job that the OCR had done, just a few minor ones that were disorienting nonetheless. Some words were mistakenly joined, such as ‘witha.’ ‘Ise’ appeared once, but it was hard to tell whether that was an OCR mistake or Poe imitating African-American dialect, as he does in some passages. These things could be distracting for a researcher of the text, but they are not insurmountable difficulties. Strange combinations of numbers, letters and characters also appeared, such as Mags, t 1 . – _. …=.‘.=__. ‘, and it was a minor distraction to figure out whether they were significant. Again, this would not be hard to work around, and I would probably choose to read the plain text over the PDF, since the PDF window was too small compared to the plain text window.
Voyant was where the random strings became a problem, but I just needed to go through and erase some of them to make the program work. The overall presentation of the Voyant analysis was surprisingly pleasant, even though I had already seen some students’ screenshots: kudos to their front-end developers. Voyant showed that ‘Legrand’ was the most frequently used word, with ‘Jupiter’ next. Since these are two of the main characters in the text, it is not surprising that their names show up most, and in fact this helps strengthen an argument that Poe was anthropocentric, in keeping with his time. It was surprising to me that ‘massa,’ the dialect form of ‘master,’ was the next most frequently used. I knew it was there, but the statistical frequency of Poe’s use of Ebonics interested me, and now I think him even more anthropocentric and a product of his time. Voyant helps analyze Victorian-era language by picking out phrases with the tool on the left. The context tool is interesting because it almost retells the story, but in miniature. I realize I’ve made a case for a literary critique of the text, but hopefully historians understand how much is historically determined, and research Poe’s work in context.
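For anyone who would rather not erase the stray OCR strings by hand, the cleanup and the word counts described above can be roughly approximated with a short script. This is only a sketch and not how Voyant itself works: the function names, the small stopword list and the "at least half letters" rule are all my own assumptions, and a garbled token that is mostly letters (like ‘Mags,’) would still slip through.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list; Voyant uses its own, longer list.
STOPWORDS = {"the", "a", "and", "of", "to", "in", "was", "i", "it", "with"}

def strip_ocr_noise(text):
    """Drop whitespace-separated tokens that are mostly punctuation or digits."""
    kept = []
    for token in text.split():
        letters = sum(ch.isalpha() for ch in token)
        # Heuristic (an assumption): keep a token only if at least
        # half of its characters are letters.
        if letters and letters / len(token) >= 0.5:
            kept.append(token)
    return " ".join(kept)

def top_words(text, n=3):
    """Return the n most frequent words, ignoring stopwords and single letters."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 1)
    return counts.most_common(n)
```

Running strip_ocr_noise on the pasted plain text first, and then top_words on the result, should give a list comparable in spirit to Voyant's most frequent terms, though the exact counts will depend on the stopword list used.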