There is growing interest in the digital humanities at UC Berkeley. I’m currently involved, as a computer scientist, in a project aimed at developing tools for historians and introducing them to the potential applications of computer analysis in their work.
I’d like to talk with humanities researchers, and historians in particular, about their specific computational needs. I would also like to go some way towards introducing them to the potential applications of natural language processing and text mining in their work.
I’m especially interested in seeing whether we can put large, existing digital corpora, such as the New York Times data collection (every article since 1987) and the Internet Archive’s (www.archive.org) corpus of ~1.9 million OCR-scanned books, to use for the humanities.