The RATOM project is engaged in two complementary development efforts.
The first – libratom – focuses on developing a toolset that scans PST and MBOX email sources, produces reports describing content and metadata, and applies NLP to extract and categorize entities discovered in message content. The results are exported to a SQLite database with a clearly documented schema to support data analytics and machine learning tasks.
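As a rough illustration of the scan-and-export idea described above, the sketch below reads an MBOX file and writes basic header metadata to a SQLite table. It uses only Python's standard `mailbox` and `sqlite3` modules. It is not libratom's API or schema: the function names `index_mbox` and `build_sample_mbox` and the `message` table layout are assumptions made for this example, and PST parsing and NLP entity extraction are left out.

```python
import mailbox
import sqlite3
import tempfile
from email.message import EmailMessage
from pathlib import Path


def build_sample_mbox(path):
    """Write a two-message MBOX file so the example has input to scan."""
    box = mailbox.mbox(str(path))
    for i in range(2):
        msg = EmailMessage()
        msg["From"] = f"sender{i}@example.org"
        msg["To"] = "archive@example.org"
        msg["Subject"] = f"Sample message {i}"
        msg.set_content("Hello from a sample message.")
        box.add(msg)
    box.flush()
    box.close()


def _text(value):
    """Convert a header value to str, keeping missing headers as None."""
    return None if value is None else str(value)


def index_mbox(mbox_path, db_path):
    """Copy sender, subject, and date headers into a SQLite table.

    The table layout is hypothetical and chosen for illustration only.
    Returns the number of messages indexed.
    """
    conn = sqlite3.connect(str(db_path))
    box = mailbox.mbox(str(mbox_path))
    count = 0
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS message ("
            "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, date TEXT)"
        )
        for msg in box:
            conn.execute(
                "INSERT INTO message (sender, subject, date) VALUES (?, ?, ?)",
                (_text(msg["From"]), _text(msg["Subject"]), _text(msg["Date"])),
            )
            count += 1
        conn.commit()
    finally:
        box.close()
        conn.close()
    return count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        mbox_file = Path(tmp) / "sample.mbox"
        db_file = Path(tmp) / "report.sqlite3"
        build_sample_mbox(mbox_file)
        print("messages indexed:", index_mbox(mbox_file, db_file))
```

Once the metadata is in SQLite, ordinary SQL queries (for example, counting messages per sender) can feed the kind of reporting and analytics the toolset is meant to support.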
You can find the code and detailed installation instructions for the RATOM email processing library, tools, and notebooks on GitHub:
RATOM Appraisal Tool (Web Interface and Cloud Deployment Tooling)
The second – a selection and appraisal web application – focuses on the development of an interface and service to meet the needs of archivists reviewing individual email messages for retention, redaction, and public release.
You can find the RATOM web app deployment repository on GitHub:
Additional repositories used in this deployment can be found at ratom-server and ratom-web.
Previous Development and Related Tools
RATOM builds on the efforts of previous projects conducted at UNC SILS and the State Archives of North Carolina, including BitCurator and TOMES. Selections from these projects are linked below.
TOMES Project Home: https://www.ncdcr.gov/resources/records-management/tomes
TOMES on GitHub: https://github.com/StateArchivesOfNorthCarolina/tomes-project
BitCurator Project Home: https://bitcurator.net/
BitCurator on GitHub: https://github.com/bitcurator