The Ethics of Using AI to Study Literature and Reviews

By Joris J. van Zundert and Julia Neugarten

Two weeks ago, NBD Biblion, an independent non-profit organization that provides information services for Dutch libraries, announced that it would fire most of its reviewers. NBD Biblion had been using a team of about 700 reviewers who provided short descriptions of newly published books. Reviewers made 14 euros per review of 200 to 300 words. Librarians would then use these descriptions to decide which titles to add to their collections, although the descriptions were far from the only source that librarians consulted. Henceforth, NBD Biblion explained, libraries could base their purchasing decisions on AI-generated book metadata and descriptions.

What followed was an outcry of horror. In blog posts, on Twitter, and in literary journals, NBD Biblion’s decision was met with derision and dismay. These reactions illustrate the unease that many people in the literary domain feel about the prospect of computer technologies assessing literature. They also seem to rest on the assumption that such technologies are always geared towards ousting the human factor in publishing, and even in authoring books.

For us, as computational literary scholars, the incident was cause to consider some of the ethical aspects of our project. In Impact & Fiction, we try to correlate features of book reviews written by humans with features of novels written by humans, using computational approaches. We believe, but still have to show, that we can gain new insights by doing this on a large scale. Here, a large scale means using upward of 500,000 reviews and upward of 10,000 novels. These numbers indicate that we will not be reading every review or analyzing every novel manually; aggregating that much data is only feasible with automated tools. In this project, we will use tools that are probably very similar to those used to create NBD Biblion’s automated descriptions. We say “probably” because the technology that NBD Biblion’s subsidiary Bookarang uses to create these descriptions is proprietary and shrouded in secrecy.

What does it mean when, instead of reading novels and reviews to identify and evaluate relationships between the two, we let computers and algorithms do the “reading” for us? Large data sets can be skewed and biased, and it is easy to produce tools that play into expectation bias, prejudices, and model confirmation. On the other hand, individual human reviewers and researchers are just as biased and prejudiced in many respects.

What would it mean if computational tools indicated that the aggregate of reviews paints a particularly narrow or prejudiced view of a particular novel? How can we discern whether this result is an artifact of the technology or an accurate depiction of reviewers’ opinions? The discussion surrounding NBD Biblion reiterated the relevance of these ethical questions to our project. The online commotion reminded us that our project may well contribute to debates on the ethical aspects of using computational tools to study literature and reviews.

Over the course of this project, we want to make sure that our results and the insights we glean from them are not skewed by the computational tools we are using. At the same time, we want to examine how computational tools may influence the research process, change the research methods of literary studies, and challenge the biases and assumptions ingrained in non-computational literary studies. The insights we gain in the process could be just as valuable to the scientific field as the results we expect to draw based on quantitative analysis.

Ultimately, none of the insights we gain is intended to eradicate the human factor in the literary or academic process. Quite the opposite: we want our tools and results to benefit authors, readers, and publishers. We hope that our work will enable all involved to better understand each other’s needs and the way literature works.
