I initially ran the topic model with 100 topics, without accounting for the fact that we were working with only about one-third of the content of the complete records. At that point, I had not calculated the total number of documents (or lines) we would be analyzing if we had ready access to the whole corpus; I assumed it would be large nonetheless. One hundred topics did not produce any clear or insightful distinctions. I then cut that number in half, with about the same results. I halved the number again (to 25% of the original number of topics), and then reduced it in increments of 5, to 20 and then 15. Twenty seems to display enough distinction, although, again, all topics have a land association.
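For anyone who wants to retrace the sequence of topic counts described above, here is a minimal sketch in Python. It only generates the list of counts I tried; it does not fit a topic model. The function name `topic_count_schedule` and its parameters are hypothetical, chosen for illustration, and are not part of any topic modeling tool.

```python
def topic_count_schedule(start=100, halvings=2, step=5, floor=15):
    """Return the topic counts to try, from largest to smallest.

    Hypothetical helper: halve the starting count `halvings` times,
    then step down by `step` until reaching `floor`.
    """
    counts = [start]
    # Halve the count the requested number of times (100 -> 50 -> 25).
    for _ in range(halvings):
        counts.append(counts[-1] // 2)
    # Then step down in fixed increments without going below the floor.
    while counts[-1] - step >= floor:
        counts.append(counts[-1] - step)
    return counts


schedule = topic_count_schedule()
print(schedule)
```

Each count in the list would then be passed to whatever topic modeling tool is in use, and the resulting topics compared by eye, as I did.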
Topic 11, “New Orleans riverbank title,” stood out the most to me because I personally recognize the “batture” as a unique location I have been to, and alluvion as a soil type. Although I connected with this topic, it does not relate to the Midwest and the Northwest Territory, which my teammate and I are focused on. For that, Topic 2, relating to authority in treaties and grants, will more likely address our subject. Topic 9, which appears to deal with heirs to claims, should also be relevant, although it has no geographic connection to the Midwest at this point. For the primary source material we are dealing with, the twenty-topic model captures a broad overview of land-related subjects, of which the Midwest is our focus.
Unfortunately, the topic model does not necessarily add any new information or insight to our research. I think I would have found topic modeling more useful as guidance if we had been able to use it much earlier in our research. It could likely have served one of its intended purposes: an informative gateway to the texts for people unfamiliar with the topics contained in the American State Papers. Because of the technical difficulties with access that we have encountered, we have become closely familiar with some individual documents and themes while having limited access to others. An early, broad look at the overall themes in our primary source documents would have offered insight and possibly improved our productivity by providing a more effective introduction to the subject matter. That route might also have led to better time management overall, since we would have gained some familiarity with the material at an earlier stage of the project.