By: Sarah Fister Gale

How digitization of archives is transforming the academic research landscape, enabling faculty and students to deliver groundbreaking projects in months rather than years, and to debunk prior historical assumptions that were based on incomplete datasets.

Everyone loves to talk about the impact digitization and big data have had on the business world -- but the impact that these technologies have had on academia is equally profound. The ability to digitize archival records, and to search those rare data sets remotely, is enabling academics to ask new questions and validate hypotheses in ways that were unthinkable even a decade ago.

Ada Palmer, Assistant Professor of Early Modern European History in the Department of History and the College, notes that she traveled to five countries over three years to complete her dissertation on the societal impact of Lucretius's De Rerum Natura poem during the Renaissance. These days, that same research would take months and could be done remotely thanks to digitization, she says. “The availability of digital content has fundamentally changed the practicality of what students can study.”

It has also expanded the questions they can attempt to answer – and in some cases changed the results, says Ufuk Akcigit, Assistant Professor of Economics in the Kenneth C. Griffin Department of Economics and Graduate Admissions Chair. He notes that in the past economic research tended to focus on small and limited data sets, often culled from only the largest firms, which created the risk of bias. “Without understanding the broader context of the data, it is easy to end up at the wrong policy conclusion,” he says. But digitization has changed that. “It allows us to process large amounts of data to learn things we couldn’t before.”

Still, it is important to be thoughtful about how these records are digitized and shared, notes Mareike Winchell, Assistant Professor in the Department of Anthropology and the College, who uses ethnographic and archival sources to examine rural opposition to the Bolivian government's contemporary program of indigenous land titling. She recognizes concerns that documents proving land titles, maps, colonization history, and other vital records could be lost or appropriated by those who stand to gain power from them. “It raises new questions for how to conduct indigenous research projects,” she says.

While there are still uncertainties about how digital archives will shape the future of research, all three of these social scientists have already seen their own projects evolve due to digitization, and they are excited about the new opportunities this trend presents for their graduate students.

A new view of the Renaissance

Palmer has spent her academic career studying the history of ideas, and how history and our intellectual world shape each other over time. She focuses much of her work on the Italian Renaissance because it is a critical point in time when ideas about science, religion, and the world collide. “All at once many beliefs, scientific systems, and perceived worlds clashed, mixed, and produced an unprecedented range of new ideas, which in turn shaped the following centuries and, thereby, our current world,” she says.

Ada Palmer

Much of her research focuses on the evolution of marginally acceptable ideas that existed on the edges of society, such as atheism, witchcraft, homosexuality, and radical physics. “These were frequently things that were illegal to write about, but people did it anyway,” she says.

Until recently, finding and studying these marginalized documents required significant time and resources, which limited how many projects she could conduct, and what research she encouraged her students to pursue. She recalls a student five years ago who wanted to do a senior thesis on Seneca, the Roman philosopher, but Palmer discouraged her because it would be too time and resource intensive. “At that time, it was not a project that could be completed in a year,” she says. But because these texts are now digitized, those ‘once in a lifetime projects’ can be completed in months.

This is revolutionizing her own research and creating new opportunities for her students to pursue groundbreaking studies. She recently undertook a project to search a digitized database of classical Latin texts to understand how the authors talked about the Republic. “Finding every instance of the word ‘res publica’ took five minutes,” she says. In the past that would have taken years. She is also excited about how digital technologies are accelerating efforts in Italy to catalog uncatalogued Renaissance manuscripts, enabling researchers to find and sort through documents that haven’t been reviewed for hundreds of years.

Some people might question why this already well-documented period of history needs further research, Palmer notes. But she argues that these documents can help uncover gaps in our historical knowledge, and address misconceptions in previous work. For example, works by female authors tend in general to be reprinted less often than those by male authors, which creates a distorted version of history that looks more male-dominated than it actually was. “Whether you are looking at Renaissance Latin or golden age science fiction, the female voices that actually were present tended to fall out of circulation in subsequent decades,” Palmer says. Many scholars working with physical libraries are currently trying to bring female voices back into print, but digitizing first printings provides access far faster than re-publication can.

The capacity to digitize archives has also got many academics thinking about new ways to capture and share other forms of knowledge, including during classes. Palmer is considering a video podcast of a new course covering the history of skepticism and is pursuing grant funding to support all aspects of the production, including closed captioning.

At the same time, the ability to create such resources also generates new caveats and concerns, including the issue of having off-the-cuff remarks captured as a permanent record of someone’s opinions. Last year, for example, Palmer sat in on a class featuring a prominent visiting professor in which participants discussed the proposal of recording the class. “Almost everyone said they were uncomfortable with the idea,” she says.

Broadly, she is eager to see what research questions the next generation of academics and students tackle. “Whenever new innovations make research easier, it excites young scholars to re-examine questions with better metrics.”

How to quantify innovation

Akcigit has also experienced a sea change in the kinds of economic research questions he and his students can answer thanks to digitization of archival records. Over the past few years, Akcigit has been studying the intersection between economic status and innovation using digital records to prove previously unresolvable theories. “We know that in the long run economic growth only occurs with innovation and technology,” he says. But until recently, that knowledge was largely based on anecdotal evidence and isolated data sets. “The history of innovation is very hard to quantify.”

For a long time, economists have theorized that wealth, economic growth and innovation were inextricably linked, but it wasn’t until patent records were widely digitized that Akcigit and his colleagues could study this hypothesis on a grand scale. In 2017, he co-authored a paper, “The Rise of American Ingenuity: Innovation and Inventors of the Golden Age,” which tracks invention trends using historical U.S. patents and Federal Census data between 1880 and 1940, which the authors linked back to regional economic aggregates. The patent files allowed them to identify strong relationships between patented inventions and long-run economic growth, and to determine that patents were largely the outcome of collaborative work conducted by wealthy individuals with significant access to capital, education, and other talent. “Those with access to education and wealth were far more likely to innovate,” he says. A 2018 paper studying patents and economic trends in Europe found similar results.

Ufuk Akcigit

For Akcigit, the immediate implication of this research is clear: equal access to education is vital to generating innovation, which in turn helps to drive economic growth. “The data shows that people who can afford education are more likely to innovate.”

His research has also found that when young people are given the education, resources, and encouragement to innovate, it breaks down cultural barriers and fosters social mobility. His research into companies and patents shows that companies where managers are younger grow faster and produce more radical innovations. When young entrepreneurs face cultural barriers and competition from older incumbents, it is harder for innovative thinking to flourish.

None of these results could have been proven without digitization of those patent records, he argues. “It is generating a quiet revolution in the study of economics.”

However, there are downsides. Akcigit notes that in the short term, access to digital archives creates a temporary inequality in academia because not all researchers or institutions have access to the technology and resources to conduct these studies. “It’s a barrier to entry that we will need to democratize better in the future.”

He hopes that as access to such digital records expands, it will inspire graduate students to ask new questions, and to challenge policy conclusions that were defined by research using small and potentially biased data sets. “They have the time and energy to pursue this research,” he says. And now they have access to the data necessary to make it possible.

The benefits and risks of digitization

While many scholars are enthused about new research opportunities derived from archival digitization, not every archive owner is equally eager to digitize their records and share them with the world. Concerns about who will own the databases, who should have access to them, and how they will be used are causing some organizations to block such efforts, notes Winchell. “There are debates about the risks of undifferentiated access to these documents.”

Mareike Winchell

Winchell is a sociocultural anthropologist with interests in indigeneity and nationalism, and how histories of colonial violence linger in the present. She notes that in formerly colonized regions, archival data is often used as a tool of “discipline and control.” “Some of these archives are living documents, like land titles that can be used to legitimize land use,” she says. Making this information publicly accessible could pose risks to certain groups who want to control that knowledge. For example, in her current archival project, Just Documents: Property, Possession, and Bolivia's Decolonial Archive, Winchell explores how former domestic servants in Bolivia sought to leverage agrarian servitude as a basis for land claims. In this scenario, archival data becomes a valuable present-day tool to challenge land ownership and determine wealth distribution, she says. “These documents can set precedence for legal titles today.”

Digitizing these documents can ensure they are safely recorded and stored so that these claims can be made in the future – at least that is the theory. Winchell has seen many important documents disappear during the process of digitization. “It is supposed to protect against loss, but files can get damaged, misplaced or corrupted.” And there are those who argue that global access to this kind of data creates new risks, particularly if certain groups intend to use it to appropriate land or resources from those with less digital access or knowledge of the data.

As a solution, some organizations are opting to digitize documents but limit access to the databases to those who can prove a relevant need. For example, Winchell spent weeks going back and forth with the National Institute of Agrarian Reform to view digitized archives for her archival project, and part of her permission included agreeing not to share the data with anyone. “That put me in an awkward position of having to protect the data from people interested in accessing it to assist them in a land struggle,” she says. In the end she did not share the data because those groups couldn’t prove the legitimacy of their claims to it and she had agreed to abide by those rules. “This all raises important questions for indigenous research projects,” she says.

She argues that any effort to archive indigenous communities and their records must be carefully considered, with attention paid to how the data will be used and by whom. Even when people have the best intentions to assist an indigenous group by digitizing their cultural history, they have to consider what others could do with that data, she says. For example, charting the sacred sites of an indigenous group may help the community record their history, but could also open those sites up to economic extraction or misappropriation.

“It raises the question: at what point the promise of having a record becomes more perilous than liberatory,” she says. She believes the approach of digitization with restrained access may be a solution to address the misuse of these digital files.

A new era of research

Ultimately, digitization of archives is a breakthrough for researchers not because the technology is revolutionary, but because it makes it possible to answer new questions in less time, and to deliver more robust, less biased results. “It is changing the projects we pursue, how we collaborate, and what we leave in,” Palmer says.

While none of these faculty are working together directly on archival projects, they share the same excitement about the opportunities that come with expanded access to data, and about the chance for their students and colleagues to tackle new research projects. “There is now an unfathomable amount of unstudied material that is newly accessible,” Palmer says. And that data has the potential to reshape history.

Gerrymandering and the Geography of Violence in Chicago

In his 2016 book “Wounded City: Violent Turf Wars in a Chicago Barrio”, sociologist Robert Vargas found that violence in the Little Village neighborhood on the west side of Chicago was concentrated on blocks historically gerrymandered by the city council. In a new working paper, Vargas is using ward maps, crime statistics and census data from 1960 to the present to quantify the effects of ward redistricting on neighborhood violence. Comparing gerrymandered neighborhoods (like Little Village and Brighton Park) to neighborhoods (like Beverly and Bridgeport) that have remained in the same ward since 1961, he found that the murder rate was two-and-a-half times higher in gerrymandered areas. The findings, according to Vargas, show that violence persists in Chicago not only because of turf wars between gangs and police, but also because of turf wars among politicians over assigning blocks to wards.

Whether they are used to more accurately define historic events, validate land ownership, or prove economic and social trends, these digitized documents offer a rich and untapped source of knowledge that will spur a new generation of research by University faculty and students.

“The university has a great tradition of combining theory and data without bias,” Akcigit says. The digitization of new databases allows this tradition to continue -- and frees students to ask new questions that can now be answered. “Whenever a student presents a theory, the first question we ask is what data are you going to use to support it?” he says. “It makes us all more authoritative and confident in our work.”

Archival digitization trends are also spurring new collaborations across the University. Winchell notes that many of her students are producing films and other media using archival data in combination with other resources as a way to share anthropological stories with a broader audience. In many cases, these projects are bringing together multiple disciplines, including history, media, architecture, coding and other fields, to create new forms of content. As Winchell notes, concerns about the political implications of digitization span various fields including anthropology, media studies, and law. Fostering these conversations has been particularly fruitful for faculty working in Latin America, where documents figure centrally in ongoing processes of post-conflict reconciliation.

“So many interdisciplinary discussions are emerging across the University,” she says. “It is creating new bridges to support our work.”