Text analysis is fundamental to humanities scholarship. Digital tools for text analysis give students a novel way to bring the underlying characteristics of texts to the surface. For example, text analysis tools can be used to create assignments that allow students to experiment with search terms, to juxtapose the style of one text against another, and to formulate questions that can inspire further research. Students can also use these tools to identify word frequencies and their contexts within a text and to compare multiple texts.
- Tools for text or data analysis are most often used to pose questions, bolster arguments, or test hypotheses. They typically generate data or visual representations that are then presented via other tools.
- Unlike many other tools commonly used for digital assignments, text analysis tools are most often used to test hypotheses or to gather evidence in support of an argument rather than to publish student work. Text analysis tools such as Voyant can work well as accompaniments to close reading exercises and to help formulate questions for further inquiry.
- Text analysis tools can also be used to quantitatively compare the language and style of one author’s writing against the style of another or to track how an author’s style changes over time.
- Concordance tools search a corpus to display the results of search queries in various ways. Although this may seem like an objective computational process, the selection of what to include in the underlying corpus is a subjective decision. Your selection of texts to analyze has a significant impact on the results and should be considered a primary step in this type of analysis.
- The simplest and most common form of text analysis, the frequency list, can be quite misleading. The prevalence of a term or phrase does not necessarily correlate with its contextual importance. However, text analysis tools offer functions beyond determining word frequency. Further insights can be drawn from comparisons between different types of analyses, which can then inspire class discussions.
- Perhaps the most common analysis function is the Keyword In Context (KWIC) search. This type of analysis shows each search result within the context of the text in which it occurs. Other analyses, such as Cluster, N-gram, and Collocates, provide information about which words tend to occur with other words.
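To make the frequency list, KWIC search, and N-gram ideas above more concrete, here is a minimal Python sketch. It is not how AntConc, Voyant, or any other tool listed here works internally; the function names (`tokenize`, `frequency_list`, `kwic`, `bigrams`), the sample sentence, and the very simple tokenization rule are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens.

    Assumption: a word is a run of ASCII letters or straight apostrophes.
    Real tools use more careful tokenization rules.
    """
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens, top=5):
    """Return the `top` most frequent tokens with their counts."""
    return Counter(tokens).most_common(top)


def kwic(tokens, keyword, window=3):
    """Return each occurrence of `keyword` with `window` words on either side."""
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


def bigrams(tokens):
    """Count adjacent word pairs (2-grams)."""
    return Counter(zip(tokens, tokens[1:]))


# Hypothetical sample text, invented for this example.
sample = (
    "The whale surfaced. The crew watched the whale, "
    "and the whale watched the crew."
)
tokens = tokenize(sample)

print(frequency_list(tokens, top=2))   # [('the', 5), ('whale', 3)]
for line in kwic(tokens, "crew", window=2):
    print(line)
print(bigrams(tokens)[("the", "whale")])  # 3
```

Even in this tiny sample, the most frequent word is "the", which says nothing about the subject of the text. The KWIC lines and bigram counts give a better sense of how "whale" and "crew" are actually used, and that contrast is the kind of comparison that can prompt class discussion.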
Each of the tools mentioned here requires some training to be used effectively. There are many good tutorials available online, but setting aside a portion of class time for in-person training is also advisable.
- AntConc, a desktop application for searching corpora and compiling concordances.
- Voyant, a suite of online tools for text analysis and visualization.
- Databasic.io, a collection of easy-to-use tools for simple data and text analysis.
- Google N-Gram Viewer, a tool for visualizing trends based on search terms across the Google Books corpus.
- Bookworm, a tool for visualizing trends based on search terms across the HathiTrust corpus of books.
- Recogito, an online collaborative platform for documenting the occurrence of named entities (people, places, events) in texts and images.