Critique of Pure Digital Reason: Validating Distant Reading from a Comparative Perspective

Israel Ministry of Science and Technology Grant, 2020, 300,000 NIS, 3 years

Prof. Ophir Münz-Manor, The Open University of Israel
Dr. Rennana Keydar, The Hebrew University of Jerusalem
Dr. Itay Marienberg-Milikowsky, Ben-Gurion University of the Negev

Since the beginning of the current century, the field of digital humanities has been viewed as innovative, promising to change the study of the humanities, open new paths to understanding society, and build a strong bridge between disciplines. As with other cultural, technological and academic developments, this new discipline has raised many hopes. It has, however, also been the subject of criticism: does it not entail a dangerous renunciation of the "good old humanistic tradition"?
Can computers "understand" phenomena in the same way as humans? What is the real value of what Moretti calls "distant reading"? Twenty years after this great and optimistic breakthrough in the field, it can be said that these questions do not stem from mere conservatism. Although almost no one still questions the analytical power of the computer, or the potential contribution of computation to the humanities, this potential is still a long way from being realized. Various scholars have pointed to problems in digital humanities research. The achievements already made in the field also suffer from what Adam Hammond called "the double bind of validation": on the one hand, many computational studies prove things we already knew without any advanced aids; on the other, leading-edge computational studies have produced results so surprising that we lack the ability to confirm or refute them.
The project proposed here follows the call of Hammond, Underwood, Meister, and other leading researchers for a new and more mature era of validation in the field: an era in which we speak not only in theoretical terms about what might be done in the future, but also about what is actually being done in the present; in which successes are weighed against failures in a balanced and careful critical test; and in which tools are developed for more effective, fruitful and meaningful research.
To this end, we will take a comparative and critical standpoint as we examine conventional distant reading practices applied to three Hebrew and Israeli corpora representing three different disciplines: prose, law and poetry. Each has its own unique characteristics and challenges.
In the field of prose, the Ben-Gurion University team will examine how computers model modern and post-modern short stories, what phenomena they emphasize, how their models differ from the interpretive process that literary readers experience when reading the same stories, and what connection can be made between the two beyond describing them as competing or contradictory.
In the field of law, the Hebrew University team will seek to discover how algorithms read and interpret legal evidence on two different narratological levels. The first level concerns the identification of thematic content in testimony. The second concerns the identification of the narrative sequence and plot events recounted in legal testimony. In law, unlike in literature, testimony is not merely a narrative act but serves as a factual basis for the entire legal process. As such, the legal field poses an extraordinary challenge to existing distant reading algorithms.
In the field of poetry, the Open University team will focus on the identification of poetic models by humans and by algorithms. Poetry, by its very nature, is rich in structural characteristics such as meter, rhyme, and strophic divisions, so repeating patterns are quite dominant. The initial phase of the study will be based on manual tagging of poetic and prosodic elements in a corpus of liturgical poetry (Piyyut), followed by distant reading of the tags in order to identify patterns. Human distant reading will rely on smart visualizations, while computational distant reading will rely on algorithms.
At the end of the process, human reading will be compared to computer reading in an attempt to produce a computational framework for identifying poetic models.