• Text mining and the digital humanities

    There is growing interest in the digital humanities at UC Berkeley. As a computer scientist, I’m currently involved in a project that aims to develop tools for historians and to show them how computer analysis could apply to their work.

    I’d like to talk with humanities researchers, and historians in particular, about their specific computational needs. I would also like to go some way towards showing them how natural language processing and text mining could apply to their work.

    I’m especially interested in seeing whether we can put large existing digital corpora, such as the New York Times data collection (every article since 1987) and the Internet Archive’s (www.archive.org) corpus of ~1.9 million OCR-scanned books, to use for the humanities.


  1. sikarskie says:

    Sounds like a great session. There definitely needs to be a lot more collaboration between humanists/historians and computer scientists in the future.

    We (MATRIX) are currently working on a project with computer scientists at the University of Illinois: http://grants.matrix.msu.edu/did/

  2. tgloege says:

    This looks great; I’d love to participate!


  1. Text mining and the digital humanities – Great Lakes THAT Camp « Information Mining – Zenorg R&D
