Tooling up for Digital History
This post follows a discussion of “The Future of the Discipline and the Profession” last week in the “Historian’s Craft” seminar at the University of Notre Dame. I thought I would follow up with a few suggestions for tooling up, in case anybody is inspired to explore some of the digital approaches we talked about.
The best places to get started with digital history are some really fine websites with tutorials, software, and examples. Although I could mention a million different useful sites, I’ll limit myself here to the most significant and, in my view, the most frequently helpful.
First and foremost, The Programming Historian contains tutorials and really helpful information on several different digital history applications, including a great tutorial on topic modeling and an equally good one on web scraping. Spend significant time here if you are trying to teach yourself some new methods!
Stanford’s series “Tooling up for the Digital Humanities” offers little lesson modules on different techniques.
For inspiration, “Digital Humanities Now” is a clearinghouse for recent work.
For workshops, announcements, and tools, see the CHNM at George Mason.
Finally, take a look at the websites of some individual researchers. You can find key digital history researchers through the Digital Humanities Now aggregator. You will be inspired by what some of the true leaders of this field are doing. For instance, I am always amazed by Ben Schmidt. His history work is great, AND he also finds creative uses for these tools to study things like higher ed and gendered language in course evaluations.
Social Network Analysis
The best software for social network analysis is UCINET, although many tools exist to do network analysis in R. To learn UCINET, which is much easier than R, you might consider the online tutorials found all over YouTube, the help files and tutorials built into the software itself, and the book Analyzing Social Networks, which the authors of the software sell on the website. The key to working with network analysis is understanding the basics of the matrix and figuring out how to generate matrices from your data. UCINET has a wonderful tool called the DL editor, which will help you turn your data into matrices. Another good piece of visualization software is Gephi, which can accept the matrix files that you generate in UCINET. Coursera once offered a course on Network Analysis, and I bet there are other online courses out there if you search.
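To make the idea of "generating matrices from your data" concrete, here is a minimal sketch in Python rather than UCINET. It assumes your research notes can be reduced to a list of pairs (who corresponded with whom); the names and the `adjacency_matrix` helper are hypothetical, invented only for illustration, and the ties are treated as undirected.

```python
# Hypothetical data: each pair records that two people exchanged letters.
edges = [
    ("Adams", "Jefferson"),
    ("Jefferson", "Madison"),
    ("Adams", "Madison"),
    ("Madison", "Monroe"),
]

def adjacency_matrix(edges):
    """Return (names, matrix) for an undirected network.

    matrix[i][j] is 1 if names[i] and names[j] are tied, else 0.
    """
    names = sorted({name for pair in edges for name in pair})
    index = {name: i for i, name in enumerate(names)}
    matrix = [[0] * len(names) for _ in names]
    for a, b in edges:
        # Undirected tie: mark both cells so the matrix is symmetric.
        matrix[index[a]][index[b]] = 1
        matrix[index[b]][index[a]] = 1
    return names, matrix

names, matrix = adjacency_matrix(edges)

# A person's degree (number of ties) is the sum of their row.
degree = {name: sum(row) for name, row in zip(names, matrix)}

for name, row in zip(names, matrix):
    print(f"{name:<10}", row)
```

Each row and column stands for one person, and a 1 marks a tie between them. This is the same square-matrix structure that network analysis software works with, whatever file format it expects.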
Topic Modeling and Text Mining
We talked quite a bit about Topic Modeling in class last week. To learn topic modeling, you’ll want to get a program called MALLET. The best place to get started is here. To continue understanding what topic modeling does, read these articles (here, here, and here), which are brilliant explanations of LDA, or Latent Dirichlet Allocation.
The umbrella term “Text Mining” refers to more than just Topic Modeling, however, and there are some really powerful tools that you can profitably use as you’re learning LDA, or even if you never want to do Topic Modeling per se. The best tool for getting started in “distant reading” generally is called “Voyant.” Voyant is actually a suite of powerful text analysis tools and you can easily master it in an afternoon. In my view, Voyant will give you the biggest payoff for the least amount of struggle as you are getting started. I’ve made a few videos to show you how you might use it.
NOTE: If you watch these videos, make sure to MAXIMIZE the screen size on the video player; otherwise the bottom half of the screen will probably be cut off.
Voyant tools demo Part 1
Voyant tools demo Part 2
Voyant tools demo Part 3
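If you are curious what the simplest form of this kind of text analysis looks like under the hood, counting word frequencies is a reasonable place to start. The sketch below is not Voyant's method, just a rough Python illustration; the sample text, the tiny stopword list, and the `word_frequencies` helper are all made up for the example.

```python
import re
from collections import Counter

# Hypothetical sample text and stopword list, for illustration only.
text = """The road to the village was long. The village was quiet,
and the road was empty."""
stopwords = {"the", "to", "was", "and"}

def word_frequencies(text, stopwords):
    """Count how often each non-stopword appears, ignoring case."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stopwords)

freqs = word_frequencies(text, stopwords)

# List the most frequent terms first.
for word, count in freqs.most_common(3):
    print(word, count)
```

Run over a whole archive instead of two sentences, a count like this is the raw material for the word clouds and trend lines you see in text analysis tools.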
If you really want to get fancy with text analysis and topic modeling type applications, this book is a crash course in learning to use “R.” R is a very flexible statistical software environment; many researchers have written packages for text analysis and analysis of unstructured data (which is what prose is). With R you can do lots of useful text analysis, as well as create some great visualizations.
Spatial History and GIS
GIS is the backbone of Spatial History, and it is probably the most formidable of the various tools. To learn GIS, you can start with the free software QGIS, which happily works on both PCs and Macs. (ArcGIS only works on PCs.) GIS takes a while to learn and it may be smart to take a class. However, I know historians who have trained themselves, and the results are very meaningful. Here’s John Randolph’s experiment in using GIS to study how roads functioned in imperial Russia. It can be done!
DH in a Box
One more thing I forgot to mention: as you’ll see when you start messing around with the most common DH applications, a big part of the initial challenge is getting everything installed on your computer correctly. Whatever kind of operating system you are using, you will need to get everything configured, get all your versions compatible, and get your files in the right places. This can be a pain, but it is necessary, of course. For anybody looking to teach some of the applications, CUNY now has a cloud-based instructional lab. Here you can conduct classes and workshops in using the various applications without the hassle of configuring everything yourself.
And More Basically…
Finally, while I do think that all of these tools can be mastered, and might be useful, remember that they all have their limitations and you have to decide what they will provide and whether they are worth your investment. You can only answer this by trying them out. But in the meantime, I urge you ALL to do one thing in general, which is to master Excel and get comfortable with spreadsheets. If you get in the habit of making spreadsheets early on, and if you then start coding certain kinds of data as you are doing your research, pretty soon you will have all sorts of ways to think systematically and creatively about your information. By using simple tools like pivot tables, you can explore this data in ways that will surprise you and often lead to discoveries. Sometimes this will answer your questions, but more often it will provoke new questions, which is just as valuable.
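For those who have not met a pivot table before, the underlying idea is just cross-tabulation: count your rows by one coded column against another. Here is a small sketch of that idea in Python rather than Excel. The rows, the column names ("decade" and "type"), and the `pivot_count` helper are hypothetical, standing in for whatever you have coded in your own spreadsheet.

```python
from collections import defaultdict

# Hypothetical coded research notes: one row per archival document.
rows = [
    {"decade": "1850s", "type": "petition"},
    {"decade": "1850s", "type": "letter"},
    {"decade": "1850s", "type": "petition"},
    {"decade": "1860s", "type": "letter"},
    {"decade": "1860s", "type": "letter"},
    {"decade": "1860s", "type": "petition"},
]

def pivot_count(rows, row_key, col_key):
    """Count rows for each combination of row_key and col_key values."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_key]][r[col_key]] += 1
    # Convert to plain dicts so the result is easy to inspect.
    return {k: dict(v) for k, v in table.items()}

pivot = pivot_count(rows, "decade", "type")

for decade in sorted(pivot):
    print(decade, pivot[decade])
```

In Excel you would get the same table by dragging one column into the rows area and the other into the columns area. Either way, the payoff depends on having coded your data consistently as you went along.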
For French readers, I highly recommend this book by Claire Lemercier and Claire Zalc as a starting place for thinking about quantitative history in general.
I hope some of this is helpful. Feel free to be in touch if you think I might be able to answer any questions!