Brigitte Ouvry-Vial & Alessio Antonini

Read-it project website
Reading Europe Advanced Data Investigation Tool (READ-IT, 2018-2021) is a collaborative research project that focuses on regenerating lost connections about the cultural heritage of reading through an interdisciplinary perspective and the development of innovative tools. Building on expertise in book and reading history, literature, computer science, information science and digital humanities, it aims to collect and explore a vast corpus of readers’ testimonies in multiple languages from the 18th to the 21st century. It addresses the challenges set by large volumes of highly diverse representations of recreational reading, a complex and ancient activity that has greatly evolved across space and time and for which the relevant knowledge strands remain disconnected.

Forty years of scholarship on written culture have provided an understanding of what people have read from the past to the present day, and have established the act of reading as a dynamic interaction between text and reader. While striking changes affect the formats, content and modes of reading today, questions arise about the nature of reading experiences: Why do people read, and what need does it answer? How do they read, and which faculties and senses are triggered? What stimulates readers’ responses? Deep-seated reasons explain such aporias in the state of knowledge: reading is a mental activity that is hard to record (Darnton 1986). It is a multifaceted reaction to multiple stimuli, partly related to the object and content read, partly to readers’ personal dispositions. As the vast domain knowledge of reading studies merges findings in cultural history and cognitive studies and revisits established principles in the theory of literature, it is now arguably possible to confirm reading as a means of self-agency. Yet identifying its salient features is challenging: the subjective nature of readers’ testimonies requires exploring notions of feelings, judgments, aesthetic emotions and appraisal that belong to different research domains so far not clearly associated with reading studies.

Connecting the knowledge strands about reading culture

The study of reading as an enlightening inner experience of personal truth seeking emerges from a critical revival of landmark essays by pivotal literary figures (Proust 1906; Woolf 1932; Sartre 1948; Beauvoir 1958) that outlined the twofold perceptive-creative operation of reading. Changes of paradigm in reading behaviours were induced by seminal claims of the death of the author and the pleasure of the text (Barthes 1984; Foucault 1969). The critically important reader-response theory (Iser 1978; Jauss 1982) has generated a shift towards a reader-centred aesthetics of literature and demonstrated the changeable nature of reading, which affects the meaning of the text. In parallel, the field is indebted to founding historians (Febvre, Martin 1958; Chartier 1995) and to bibliographical (McKenzie 1999), anthropological (Certeau 1990) and sociological (Lahire 2004) approaches to reading, which paved the way for a vast array of focused studies, e.g. on the changing ways of reading after the 18th-century advent of silent reading and the 19th-century rise of a mass reading culture; and on modern national reading cultures and the new reading publics of the working class, women and the young.

With the digital revolution of the book and the transformation of the types of texts written, of their media and of their modes of appropriation, interest in observing reading on paper or on screen has never been higher. Mutations in the literary sphere compel research to evolve far beyond “book” culture. Scholarship shifts from historical contexts, communities and belief systems to ‘reading in the brain’ and to subjective reading as a defining means of empathy and a facet of identity and social standing. Topics roam the sociocultural landscape from the everlasting bookish lust to its renewed social circumstances; from generic effects on empathy and emotional skills to multimodal reading pleasures; from motivational aspects of digital reading to the prevailing intimacy of comments on platforms. Approaches to haptic reading, or to listening versus reading, reinforce the perception of its impact on our combined imaginations, intellects and emotions.

All substantiate the argument that reading triggers, but also mirrors, our minds. However, the knowledge strands are disconnected. There is so far no integrated framework of research (Mangen, van der Weel 2016). The sociocultural approach tends not to rely on paradigms from psychology, while psychologists use experimenter-devised text stimuli that are rarely encountered in naturalistic situations. Neuro-cognitive studies propose a wide angle on reading-learning brains, but a situated perspective on reading experiences and their related emotions within affective sciences is only just emerging. Recent contributions to a ‘global’ history of sensibilities scaffold the evolution of cultural emotions as involved in reading, yet definitions remain unsettled and heterogeneous. And there is no comprehensive study of the social modalities of reports by ordinary readers in the 20th and 21st centuries.

Exploring massive and unknown reports about the cultural experience of ordinary readers

That is why Reading Europe Advanced Data Investigation Tool (READ-IT), 2018-2021, an SSH-led, ICT-driven project funded by the EU Joint Programming Initiative on Cultural Heritage-H2020, sets out to build an open-access, enriched investigation tool to identify evidence about European reading experiences. Once completed, it will help search for the main features of reading (agent, resource, process) and for the main aspects of reading response. Furthermore, READ-IT tools aim to connect with previous and similar projects, addressing differences in the types of sources and the multi-language support needed for pan-European research. READ-IT builds on requirements emerging from annotation and sourcing, and on a plurality of research case studies, to develop practical solutions designed for convergence, reuse and synergy. READ-IT approaches the need, discussed above, to establish connections between the bodies of knowledge on reading from the ground up, starting from the operational level of how case studies are implemented (e.g. collection and analysis of sources) in order to identify and structure emerging synergies. Let us now see how…

Interest today is shifting from highbrow reading of classics by Gens de lettres to middlebrow fiction reading by ordinary readers. This change of focus is sustained by recent access to digitized archives about the cultural heritage of reading, along with the development of oral history projects tagged as Readers remember, Memories of fiction, etc. Studies of ordinary readers’ experiences suggest the interplay between lived experience and its verbal articulation, the ludic posture of readers in Fabula. More importantly, the archives reveal underexplored evidence about reading in 20th- and 21st-century Europe and result in databases (UK-RED; EuRED) that hold the largest datasets about reading to date. Digital Humanities, in turn, provide the means of exploring such archival sources through term extraction, opinion mining and emotion retrieval with the use of manually annotated training data. Open-source toolboxes and natural language processing also enable the automatic or semi-automatic navigation of the world of cultural emotions and help retrieve experiences through their verbal cues.

The READ-IT agenda is built on its ability to interconnect a cluster of research case studies ranging from Czech school diaries to WWI veterans’ diaries, and from correspondence, marginalia and authors’ notes to interviews and social media comments. The agenda, which addresses broad questions on the phenomenon of reading across centuries, nations and communities in Europe, is enabled by the integration and reuse “by design” of the research data generated in the context of the case studies. This is achieved through a toolbox that makes heterogeneous case studies generate comparable results and provides new opportunities for further research based on a common language of reading experience.

Case studies play a key role in refreshing, re-evaluating and reconnecting collections of cultural artefacts, by curating, analysing and encoding in the same formats sources that would otherwise remain disconnected. READ-IT provides the stage to refresh and re-evaluate cultural artefacts in light of their value as evidence of reading. READ-IT acts as a collector and hub to reconnect both the sources and the results of the research case studies within multiple research communities and, at the data level, with resources, projects and standards for cultural heritage and Digital Humanities.

Refresh, re-evaluate and reconnect case-studies about cultural artefacts

In the project, we adopted an approach “motivated by sources, informed by theory, validated through case studies”, which aims to overcome the lack of a common research framework on reading discussed above. The toolbox is a growing set of digital and conceptual solutions intended to support new findings on European history as well as on current reading habits.

The toolbox includes the following components, each of which addresses a specific issue in realizing the READ-IT vision: a network of independent case studies, supported by the reuse of sources, whose collective results enable a wider and deeper understanding of reading culture.

The first component is the Model of Reading Experience, which represents the core definitions of the salient facets of the reading phenomenon and results from a confrontation of the different disciplinary perspectives. The Reading Experience Ontology (REO) implements the model of reading as a formal ontology, used as the data schema for an Annotation Tool for text-based sources, which produces data in a CIDOC-CRM and FRBR compatible format and supports semantic data exploration. The ontology reveals the fluctuating and highly variable connections between the different poles of the cultural experience of reading: Reader, Resource, Circumstances, Response. The annotation tool integrates a Natural Language Processing Service capable of identifying and pre-annotating potential reading experiences contained in large texts, to be further structured through manual annotation.
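To make the four poles and the pre-annotation step more concrete, here is a minimal Python sketch. It is an illustration only: the flat `ReadingExperience` record, its field names, the cue list and the `pre_annotate` function are all hypothetical and far simpler than the formal, CIDOC-CRM compatible ontology and the NLP service described above, which presumably relies on trained models rather than a keyword pattern.

```python
import re
from dataclasses import dataclass


@dataclass
class ReadingExperience:
    """Hypothetical flat record holding the four poles of a reading experience."""
    reader: str
    resource: str
    circumstances: str = ""
    response: str = ""


# Assumed verbal cues; a stand-in for a trained pre-annotation model.
READING_CUES = re.compile(r"\b(read|reads|reading|reread|rereading)\b", re.IGNORECASE)


def pre_annotate(text: str) -> list[str]:
    """Return the sentences containing a reading cue, as candidates for manual annotation."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if READING_CUES.search(s)]


# Invented sample text, not taken from any READ-IT source.
diary = (
    "We marched all day. In the evening I read Dickens by candlelight. "
    "It made me forget the cold."
)
candidates = pre_annotate(diary)

# A human annotator would then structure a candidate into the four poles.
experience = ReadingExperience(
    reader="diary author",
    resource="Dickens (unspecified work)",
    circumstances="evening, by candlelight",
    response="forgot the cold",
)
```

In this sketch the machine only narrows down where to look; deciding who read what, under which circumstances and with what response remains a manual step, mirroring the division of labour described above.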

Reading Experiences Ontology
The Contribution Platform is a web application developed to collect and organize sources coming from archival research, donations, crowdsourcing campaigns and the output of the Testimonies Scraping Tool for specialised social networks (e.g. Goodreads). The contribution platform supports multiple forms of engagement, ranging from postcards and online questions to document uploads and conversations with the READ-IT chatbot. Collected sources are annotated using the Crowdsource of Experience Ontology (CEO), a data schema describing the conditions in which the testimonies of reading are generated, for example in response to a specific question or as part of a more spontaneous, intimate diary.
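The idea of recording the conditions under which a testimony was produced can be sketched as follows. This is a hypothetical simplification, not the CEO schema itself: the `Channel` values merely echo the forms of engagement listed above, and the `Testimony` fields and the `group_by_condition` helper are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Channel(Enum):
    """Assumed labels for the forms of engagement mentioned in the text."""
    POSTCARD = "postcard"
    ONLINE_QUESTION = "online question"
    DOCUMENT_UPLOAD = "document upload"
    CHATBOT = "chatbot conversation"
    SCRAPED = "scraped from a social network"


@dataclass(frozen=True)
class Testimony:
    """Hypothetical record pairing a testimony with its conditions of production."""
    text: str
    channel: Channel
    prompt: Optional[str] = None  # None means the testimony was spontaneous

    @property
    def elicited(self) -> bool:
        return self.prompt is not None


def group_by_condition(testimonies: list[Testimony]) -> dict[str, list[Testimony]]:
    """Separate testimonies given in answer to a question from spontaneous ones."""
    groups: dict[str, list[Testimony]] = defaultdict(list)
    for testimony in testimonies:
        key = "elicited" if testimony.elicited else "spontaneous"
        groups[key].append(testimony)
    return dict(groups)


# Invented sample data for illustration.
collected = [
    Testimony("I read it twice last summer.", Channel.ONLINE_QUESTION,
              prompt="Which book did you last reread?"),
    Testimony("Tonight I finished the novel in tears.", Channel.DOCUMENT_UPLOAD),
]
groups = group_by_condition(collected)
```

Keeping the prompt alongside the text lets later analysis compare like with like, for instance by treating answers to a survey question separately from diary entries.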

The READ-IT toolbox as a whole is designed to support the research case studies by facilitating the reuse of sources and data (see Figure 1). First, the contribution platform connects testimonies of reading by considering the specific conditions behind the observation of experience. Next, the annotation tool and NLP service support the mutual understanding of the research goals and the generation of source annotations in the REO format. Finally, the combination of all of the above supports the reuse of sources by guiding the mixing and recombination of annotations based on the characteristics of the experiences reported and the evidence found in the analysis of sources.

Figure 1. Interoperability of research data is supported through the integrated use of the READ-IT toolbox.

Refresh, re-evaluate technical and research perspectives from previous legacy projects

Beyond connecting case studies, READ-IT provided the opportunity to refresh, re-evaluate and reconnect with pre-existing DH resources and legacy projects, such as the UK Reading Experience Database (RED) and the Listening Experience Database (LED), see Figure 2.

Figure 2. The READ-IT toolkit is used to re-evaluate UK-RED and refresh its data, and to reconnect LED and RED within the READ-IT framework.

UK-RED is a project that pioneered the creation of a database of annotated sources of the reading experience. Compared to READ-IT, UK-RED’s model of reading and its annotations are outdated in terms of the granularity with which aspects of reading are analysed, and their value has been exhausted over the twenty-plus years of the project [6].

By contrast, LED focuses on the experience of music and concerts. LED experimented with an alternative approach to human curation based on a semantic-driven exploration of experiences, integrating existing data sources on concerts, participants, musical pieces and performers (Brown et al. 2014). While inspired by UK-RED and EuRED, LED did not retain or establish any formal or technical connection with them, e.g. in modelling experience or in ontological alignment. Through a retrospective analysis, both UK-RED and LED have been reconnected (with READ-IT and with each other) through ontology alignments, using REO and CEO as pivots.
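The principle of alignment through a pivot can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the `red:` and `led:` term names, the mappings to `reo:` terms and the `align_via_pivot` helper are invented, and the actual alignments are expressed between formal ontologies rather than dictionaries.

```python
# Hypothetical mappings from each legacy vocabulary to the pivot (REO-like) terms.
RED_TO_PIVOT = {
    "red:Reader": "reo:Reader",
    "red:TextRead": "reo:Resource",
    "red:Place": "reo:Circumstances",
}
LED_TO_PIVOT = {
    "led:Listener": "reo:Reader",
    "led:Performance": "reo:Resource",
    "led:Venue": "reo:Circumstances",
}


def align_via_pivot(a_to_pivot: dict[str, str],
                    b_to_pivot: dict[str, str]) -> dict[str, list[str]]:
    """Relate terms of vocabulary A to terms of vocabulary B that share a pivot term."""
    pivot_to_b: dict[str, list[str]] = {}
    for b_term, pivot in b_to_pivot.items():
        pivot_to_b.setdefault(pivot, []).append(b_term)
    return {a_term: pivot_to_b.get(pivot, []) for a_term, pivot in a_to_pivot.items()}


red_to_led = align_via_pivot(RED_TO_PIVOT, LED_TO_PIVOT)
```

The practical appeal of a pivot is that each legacy project needs only one mapping, to the shared model, instead of a separate mapping to every other project. Terms related in this way are comparable through the pivot, which is weaker than claiming they are equivalent.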

To conclude…

On the research side, previous work on cultural heritage sources has been reframed and interrelated by using the model of reading experience. From this new angle, heterogeneous case studies find new relations that support their combined (re)use in new integrated datasets, e.g. connecting social media comments with marginalia in authors’ libraries, or school diaries with authors’ correspondence.

While the case studies produce interesting results in their own right, the collective effort in the ICT-driven, SSH-led READ-IT project provides a unique viewpoint on how ordinary readers record, retrace and share their readings, through a massive exploration of historical and contemporary comments as well as the progressive enrichment of a much-needed reading lexicon.

Converging on terms and definitions on the “bleeding edge” of reading research is in itself a driver for generating new knowledge, which is already blooming in the form of spin-off projects. Among the many lessons, the importance of cross-mining cultural artefacts from an intersectoral perspective, which leads to improved adaptability of algorithms for automatic concept detection and linking, appears as a valuable method to unlock insights otherwise hidden in the many folds of disciplinary distinctions.

A last takeaway concerns the type and quality of the data produced and future directions for shaping the role of AI in DH projects such as READ-IT. The careful analysis of sources generates small and “thick” data, rich in information content but small in quantity. Small datasets are not compatible with computational approaches based on big data and statistical techniques. But this is not a limitation per se: it provides an opportunity to explore new AI solutions capable of discriminating between historical periods and linguistic registers and, overall, of making use of Humanities research outputs. The low-level data approach also increases the capacity of AI to meet the specific expectations and methodological needs of end-users such as scholars in SSH and, ultimately, the broad public interested in searching the cultural heritage of reading.

Dr Brigitte Ouvry-Vial is Professor of Literature and Information & Communication Sciences at Le Mans University (France) and a senior member of Institut Universitaire de France. She has led several national and international research programmes on literary publishing and reading practices in Europe and is the Project leader of READ-IT (https://readit-project.eu). She has recently published “About reading seen as a Commons”, Participations: Journal of Audience and Reception Studies, 2019, 16; “La conception éditoriale du lecteur”, Studies in Book Culture, 2019, A. Glinoer, J. Lefort-Favreau (dir.); and Lire en Europe : Textes, Formes, Lectures 18e-21e siècles, with Lodovica Braida (Eds), PUR, 2020.

Dr Alessio Antonini is a research associate at the Knowledge Media Institute of The Open University (UK), and a member of KMi’s Intelligent Systems and Data Science group. His work focuses on conceptual design and modelling, web and AI applications in the fields of Digital Humanities, Urban & Social Computing. He is responsible for the modelling activity in the EU Joint Programming Initiative on Cultural Heritage-H2020 “Reading Europe Advanced Data Investigation Tool” (https://readit-project.eu).