The RATOM project is engaged in two complementary development efforts.
The first – libratom – is a toolset that scans PST and MBOX email sources, produces reports describing content and metadata, and applies NLP to extract and categorize entities discovered in message content. The results are exported to a SQLite database with a clearly documented schema to support data analytics and machine learning tasks.
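To make the scan-and-export workflow concrete, the following is a minimal sketch that uses only the Python standard library. It is not libratom's API or its database schema: the function name `index_mbox`, the `message` table, and its columns are hypothetical and chosen for illustration only. It handles MBOX input only, and it omits PST parsing and NLP entity extraction entirely.

```python
import mailbox
import sqlite3
from email.utils import parsedate_to_datetime


def _header(msg, name):
    """Return a header as a string, or None if the header is absent."""
    value = msg.get(name)
    return str(value) if value is not None else None


def _iso_date(msg):
    """Parse the Date header to ISO 8601, or None if missing or malformed."""
    raw = _header(msg, "Date")
    if not raw:
        return None
    try:
        return parsedate_to_datetime(raw).isoformat()
    except (TypeError, ValueError):
        return None


def _plain_text_bytes(msg):
    """Total size in bytes of all decoded text/plain parts of a message."""
    total = 0
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload:
                total += len(payload)
    return total


def index_mbox(mbox_path, db_path):
    """Scan an MBOX file and store per-message metadata in SQLite.

    Hypothetical illustration: the table layout below is an assumption,
    not the schema that libratom produces.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS message ("
        "id INTEGER PRIMARY KEY, sender TEXT, subject TEXT, "
        "date TEXT, body_bytes INTEGER)"
    )
    box = mailbox.mbox(mbox_path, create=False)
    count = 0
    try:
        for msg in box:
            conn.execute(
                "INSERT INTO message (sender, subject, date, body_bytes) "
                "VALUES (?, ?, ?, ?)",
                (_header(msg, "From"), _header(msg, "Subject"),
                 _iso_date(msg), _plain_text_bytes(msg)),
            )
            count += 1
        conn.commit()
    finally:
        box.close()
        conn.close()
    return count
```

Calling `index_mbox("sample.mbox", "report.sqlite")` (both file names are placeholders) returns the number of messages recorded. The resulting table can then be queried with ordinary SQL, which is the kind of downstream analysis a documented SQLite export is meant to support.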
RATOM Appraisal Tool (Web Interface and Cloud Deployment Tooling)
The second – a selection and appraisal web application – focuses on the development of an interface and service to meet the needs of archivists reviewing individual email messages for retention, redaction, and public release.
Check out the RATOM web tool, server, and deployment repositories on GitHub:
Previous Development and Related Tools
RATOM builds on the efforts of previous projects conducted at UNC SILS and the State Archives of North Carolina, including BitCurator and TOMES. Selections from these projects are linked below.
TOMES Project Home: https://www.ncdcr.gov/resources/records-management/tomes
TOMES on GitHub: https://github.com/StateArchivesOfNorthCarolina/tomes-project