Summer 2026 | Graduate Research Assistant
We are seeking three Computer Science or Data Science graduate students to join an interdisciplinary digital humanities project. The project involves analyzing, linking, and visualizing demographic data (baptisms, marriages, and burials) from two 19th-century rural Peruvian localities (Aucará and Chipao).
To allow the GRAs to focus entirely on advanced data science and visualization, the core web architecture (SvelteKit/Quarto deployed on GitHub Pages) and data extraction pipelines (CSVs and daily JSON backups from a Baserow database) have already been established.
The three GRAs will work as a team rather than in silos, jointly conceptualizing, designing, and implementing an end-to-end data pipeline. They will share ideas regularly, troubleshoot together, and coordinate data handoffs across three interconnected areas. Each student will lead one of the following core areas while contributing to the overall system design:
- Building a probabilistic record linkage pipeline (entity resolution) to deduplicate historical personas and identify familial/social connections across datasets.
- Transforming the linked datasets into interactive network graphs (using tools like NetworkX and Gephi) to represent social and familial relationships.
- Generating interactive maps of geographical patterns and integrating the visual components (maps and networks) so that users can cross-reference between datasets.
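
For applicants unfamiliar with probabilistic record linkage, the first area can be illustrated with a minimal sketch in the style of Fellegi-Sunter match weights. It is not the project's method. The field names, the m/u probabilities, the similarity threshold, and the three sample records are all hypothetical and invented for illustration; in practice these parameters would be estimated from the data.

```python
import math
from difflib import SequenceMatcher

# Hypothetical per-field probabilities (assumptions, not estimated from project data):
# m = P(field agrees | same person), u = P(field agrees | different people).
FIELD_PARAMS = {
    "given_name": (0.90, 0.05),
    "surname": (0.95, 0.02),
    "locality": (0.85, 0.50),
}

def agrees(a, b, threshold=0.8):
    """Treat two strings as agreeing when their similarity ratio meets the threshold."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

def match_weight(rec_a, rec_b):
    """Sum log2 likelihood ratios over fields; higher means more likely the same person."""
    total = 0.0
    for field, (m, u) in FIELD_PARAMS.items():
        if agrees(rec_a[field], rec_b[field]):
            total += math.log2(m / u)              # agreement adds evidence for a match
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts evidence
    return total

# Invented example records; spelling variants such as Ysabel/Isabel are common in historical sources.
baptism = {"given_name": "Ysabel", "surname": "Quispe", "locality": "Aucará"}
marriage = {"given_name": "Isabel", "surname": "Quispe", "locality": "Aucará"}
burial = {"given_name": "Juan", "surname": "Huamani", "locality": "Chipao"}

likely_same = match_weight(baptism, marriage)
likely_different = match_weight(baptism, burial)
```

Pairs scoring above a chosen cutoff would be treated as the same person, and the resulting links could then serve as the edges of the network graphs described in the second area.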