Purpose-Driven Data

My first career after college was as the Data Manager for a nonprofit in Kentucky that provided services to survivors of domestic violence and sexual assault. As an overly ambitious 23-year-old, I was the inaugural holder of that title and role, which came about to give the organization more ownership of our data. At the time, we had a system for keeping client notes, and a system for collecting the data our grant funders were asking for. But we contracted an outside company to take the data we entered and turn it into grant reports. The goal of my position was to bring these systems together so that we could explore the data we were already generating for other purposes and collect data that we knew as an organization was important, even if no funders were asking for it.

The thing we quickly realized, however, was that taking control of our data did not release us from funders’ influence. We wanted to know about services provided. Funders wanted to know about services provided. But if the funders had certain breakdowns, certain labels, for those services, eventually their labels became—or had already become—our labels. So even when there was a collective interest in a piece of information, the funders’ language and specificity in “how” we were to measure it had an enduring impact on our data management structures.

The structure of the data collections mirrors the foreseen use of the data, particularly since the data collection process takes time, energy, and resources. Early on, I made some big errors, like thinking we could also input data in a way that aligned better with how we thought it should be used, not considering that this would just create additional burdens for the folks who actually had to enter these notes and fill out the ever-lengthening forms. There were many things we thought might be interesting to know, but unless we could justify the time it would take to collect and analyze the data (which we rarely had), we had to learn to set priorities. I knew we were not unique and therefore learned a valuable lesson: collected data is almost always a reflection of what is valued in some way by the data collectors.

Creating Historical Datasets

For the last few weeks, I have been working on transforming archival records transcribed by student researchers (Kleintop and Wigger 2025) into one cohesive dataset, and I’ve been sitting with this reality: the foreseen use of the data shapes how we put it together. This was true, first and foremost, for the War Department bureaucrats who gathered these records into their own record-keeping spreadsheets on behalf of the US government. These original files from the 1860s represent the intention of the data:

To track claims made by enslavers for the “value” of their “lost property” when formerly enslaved men enlisted in the military
To track determinations of loyalty of those enslavers
To record the rare payments for these claims

But as a project team, we have our own intentions, and they are many and at odds with those of the original creators.

To serve as an accessible record of archival documents that doesn’t require training in reading nineteenth-century cursive
To contribute to principles of openness and data sovereignty
To learn more about the people in the records, the process of emancipation, and compensation claims and payments, such as who was considered “loyal” and what distinguished paid versus unpaid claims

As we’ve worked on this project, it has become clear that these records can also serve an important purpose as a set of artifacts that show the interpretation of data over multiple phases. Looking at student transcriptions, I see the question marks and notes. This process of transcribing is imperfect, and the original files created by the student research assistants are themselves an artifact of the imperfect, human nature of data transformation and creation, which on its own holds many lessons for us as teachers and educators. Mia Arango (2025), one of the student research assistants transcribing the records, noted that through the process, they:

“learned to sit with uncertainty. Some records were nearly illegible or lacked full names. I didn’t try to fill in the blanks or ‘correct’ the original records. I documented my uncertainties and allowed space for future researchers to see both the presence and the absence. That transparency felt like an important ethical step in work tied to such painful histories.”

— Mia Arango (2025)

Another student, Elon Brown (2025), also reflected on this uncertainty, coupled with the emotion of the stories the data held, writing that the joint knowledge of both forced them:

“to acknowledge that mistakes are bound to take place, but when working with something as important as this, you can’t afford to be careless”

— Elon Brown (2025)

And perhaps most importantly, we have taken care in knowing throughout this process that we are holding lost records of genealogy. As we discussed with students during their transcription work, each name of a freed person holds the humanity of an entire story with shared history and lineage for descendants today. Preparing data to be accessible to descendants of formerly enslaved people requires names be maintained and notes be preserved. Dates and locations, which can provide important contextual identification, also take on new importance. The spelling of a name or precise date of enlistment may not be useful when trying to aggregate the data to find patterns in which claims were paid. But they matter for finding and sharing stories.

As I work to combine and “clean” this set of records, I frequently find myself pausing and asking why I’m making one change or another. Deciding how to transform this set of records is itself a process of foreseeing how it could be used. I am far more used to transforming data for a specific, empirical research question. But if that were our sole purpose here, a lot would look different. I would largely ignore the names, to be perfectly honest, because names disappear when the purpose is to aggregate. I almost hate to admit this, because I know that it may seem that to ignore the names is to ignore the humanity in the data. And while this is a slippery slope, I have never felt this way. Ignoring names ignores the individuality of the data, but allows us to explore the aggregation, which reveals something about our shared humanity. Aggregation is what allows us to see patterns and systems and injustices that go beyond individual instances and cross into shared experiences. And yet, some systems have blinded us, and continue to blind us, to the individual. In this project, the individuality of the data, the identity of formerly enslaved and freed men, takes precedence over understanding the system in aggregate.
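One way to hold both purposes in the same dataset can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`name_as_transcribed`, `transcriber_note`, `claim_paid`) and the three placeholder rows are hypothetical and are not the project's actual schema or records. The sketch keeps each name and note exactly as transcribed, and builds the aggregate view separately so that summarizing never overwrites the individual record.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: a record cannot be silently "corrected" later
class ClaimRecord:
    name_as_transcribed: str    # hypothetical field: kept verbatim, question marks included
    transcriber_note: str       # hypothetical field: the transcriber's documented uncertainty
    claim_paid: Optional[bool]  # hypothetical field: None when the record does not say


def paid_counts(records):
    """Aggregate view: count claims by payment status. Names play no role here."""
    labels = {True: "paid", False: "unpaid", None: "unknown"}
    return Counter(labels[r.claim_paid] for r in records)


# Placeholder rows only; these are not real transcriptions.
records = [
    ClaimRecord("Example Name 1", "", True),
    ClaimRecord("Example Name 2?", "surname partly illegible", False),
    ClaimRecord("[illegible]", "name could not be read", None),
]

# The summary is derived from the records; the records themselves are untouched.
summary = paid_counts(records)
```

The design choice is that aggregation reads from the individual records rather than replacing them, so the same file can support both pattern-finding and a descendant's search for a single name.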

Final Reflection

In many ways, I have come full circle to the lessons from that Data Manager role. Our data had to tell the stories of the individual clients and our collective organization, and balancing these purposes was a challenge then and remains a challenge now. At the time, I assumed this difficulty came from my own lack of experience, but engaging in this work now reveals that it is simply hard and requires tradeoffs. What I’ve learned most from this project is how to balance these purposes and how to recognize their relative importance. In this project, we are doing our best to preserve individuality while allowing for aggregation to emerge, but when these goals go head to head, the individuality of the data wins.

References

Arango, Mia. 2025. “Translating the Past: Reflections from Behind the Ledger.” Center for Engaged Learning (blog), Elon University. October 28, 2025. https://www.centerforengagedlearning.org/translating-the-past-reflections-from-behind-the-ledger/.

Brown, Elon. 2025. “More than Data.” Center for Engaged Learning (blog), Elon University. January 6, 2026. https://www.centerforengagedlearning.org/more-than-data.

Kleintop, Amanda, and Cora Wigger. 2025. “Engaging Students in Transcribing Historical Data: About the Project.” Center for Engaged Learning (blog), Elon University. October 21, 2025. https://www.centerforengagedlearning.org/engaging-students-in-transcribing-historical-data-about-the-project.

About the Author

Cora Wigger is an assistant professor of economics and a 2025–2027 CEL Scholar. Her research focuses on the intersections of education and housing policy, with an emphasis on racial inequality and desegregation. At Elon, she teaches statistics and data-driven courses and contributes to equity-centered initiatives like the “Quant4What? Collective” and the Data Nexus Faculty Advisory Committee.

How to Cite This Post

Wigger, Cora. 2025. “Purpose-Driven Data.” Center for Engaged Learning (blog). Elon University. May 12, 2026. https://www.centerforengagedlearning.org/purpose-driven-data.
