The study of history requires a careful approach to established facts. However, the social assumptions surrounding events give rise to different historiographical interpretations. Today, algorithms interact with the sources on which historians rely to make or unmake the past. The discipline, sometimes considered dusty, is nevertheless undergoing a fundamental shift with the digital humanities. Arnaud Chaniac, a doctoral student at the University of Montreal, explains how the human sciences are forging a close relationship with AI. In other words, how does AI fulfil its duty to remember?
PROLOGUE
Algorithms and computing resources open up new perspectives in historiography. Before trained software existed, historiographers conducted their research through manual, and therefore time-consuming, reading. Today, public institutions such as the National Institute for Research in Digital Sciences and Technologies are making artificial intelligence their spearhead. Scientists now have enough hindsight to recognize how efficient these tools are at, for example, retrieving handwritten texts.
It is thanks to the democratization of information technology that software has begun to establish itself. Pierre Mounier, research engineer at EHESS, explains that these practices cover every decisive stage: research, creation, processing and exploitation of results. According to him, the expression “digital humanities” works as a “big tent”, a term encompassing a large number of uses and techniques.
DIGITAL HUMANITIES
This auxiliary science gives historians the advantage of building corpora with a rigorous methodology. Powerful algorithms bring further refinement. Many tools exist, including optical character recognition (OCR) software, which converts images of printed or typed text into text files. Trained with deep learning, OCR works on the same principle as image recognition, except that what it recognizes is characters. The digital humanities make the best use of the data management technologies that computing offers to assemble big data, handle large quantities of data and draw firmer conclusions from a historical point of view.
“The use of digital data changes the way we produce knowledge.” – Milad Doueihi, historian at the Sorbonne in Paris
In a lecture given at Sorbonne University, Milad Doueihi, a specialist in religion, speaks of “conversion” in the technical sense of the term. According to him, we are constantly converting sources from one format to another to keep pace with progress in computer science. The computer’s learning capabilities are easy to apply to syntax (how language is constructed). Semantics, on the other hand, remains an obstacle to the modeling of information.
AI AT THE SERVICE OF HUMAN SCIENCES
Archives are the basic building block from which the scientific corpus restores coherence to history. To process these sources, Arnaud Chaniac, doctoral student and agrégé in history, uses optical character recognition software. Textometry software of this kind developed steadily and became influential shortly before the 2000s. Arnaud Chaniac works on war correspondence between Canada and France covering the entire 20th century. To optimize his research, he uses lexicometry software. The program processes the corpus for him and highlights salient ideas that might otherwise have gone under the radar.
“The AI is able to provide us with powerful tools to have raw data, but it does not draw any conclusions.” – Arnaud Chaniac, doctoral student in history at UDeM
Once the artificial intelligence has “OCRized” the set of texts, it can reveal the syntactic units that recur. Indeed, the machine spots patterns that the human eye may have missed. The interpretative stage, on the other hand, remains the domain of the researcher, who alone has knowledge of the context and of the scientific literature. The interpretation of the facts is not the machine’s responsibility; it remains that of the human brain.
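To make the idea of recurring units concrete, here is a minimal sketch of the kind of counting a lexicometry tool performs once texts have been through OCR. It is an illustration only, not the software Arnaud Chaniac uses: the function name `recurring_ngrams`, its parameters and the three sample sentences are all hypothetical, and real tools add lemmatization, stop-word handling and statistical weighting.

```python
import re
from collections import Counter


def recurring_ngrams(texts, n=2, min_count=2):
    """Count word sequences of length n across a corpus and keep those that recur.

    Hypothetical helper for illustration; not taken from any specific tool.
    """
    counts = Counter()
    for text in texts:
        # Lowercase the text and split it into word tokens.
        tokens = re.findall(r"\w+", text.lower())
        # Slide a window of n tokens over the text and count each sequence.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    # Keep only the sequences seen at least min_count times, most frequent first.
    return [(" ".join(gram), c) for gram, c in counts.most_common() if c >= min_count]


# Invented stand-ins for OCR output; not excerpts from the actual correspondence.
letters = [
    "My dear mother, the mail is slow at the front.",
    "My dear mother, I hope the mail reaches you.",
    "The mail is slow again this week.",
]

for phrase, count in recurring_ngrams(letters):
    print(phrase, count)
```

The program only surfaces what repeats, here phrases about the mail and a recurring salutation. Deciding what that repetition means for the history of wartime correspondence remains the researcher’s job.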
A MAJOR ISSUE: INTERPRETABILITY
Often called “black boxes”, artificial intelligence models are now proving their relevance. But the lack of transparency in how these models operate nevertheless arouses real skepticism among the general public. The interpretability of algorithms raises an ethical issue as artificial intelligence influences more and more decisions about the scientific corpus. Laurent Romary, researcher at the CNRS, brings a Cartesian perspective to this notion. He questions the nature of these traces and the impact that the results will have on humans.
“Our society should integrate IT as part of the necessary training for the honest man.” – Laurent Romary, researcher at CNRS
The medical field is very much concerned by this issue. Health professionals as well as the patients concerned must understand the reasoning of the algorithms that have influenced a decision. The notion of interpretability then becomes essential to establish a relationship of trust between artificial intelligence and the user.
Photo credit: Karolina Grabowska (Pexels)