In this article, drawing on our field experience as classicists involved in database projects about the ancient Greek and Roman world, we propose to examine the intersection between two approaches: a theoretical one, that of the philologist who studies texts (including material text media such as MSS, papyri, and inscriptions) as a vehicle of thought, and a technology-based one, that of computer science research and engineering. As far as we are concerned, this intersection consists of attempts (and difficulties) to speak a common language, and of philological practices changing under the influence of the computer-based approach. To put it differently, one of the most important questions we face as classicists is how to translate [2] a text-based approach into a data-based one.
Our aim is to show how this dual approach can be the starting point of a critical view of our practice and of our discipline as a whole, including some theoretical questions. [3]
The current development of Digital Philology shows that this issue is widely discussed by classicists whose concern is to enhance ancient studies, broaden their public including students, re-evaluate the nature and method of philology and therefore provide new tools for philologists, such as databases, text encoding and mining, metadata annotation etc. [4] According to the homepage of the Open Greek and Latin Project of the University of Leipzig, “the rise of Digital Technologies is an opportunity to re-assess and re-establish how the humanities can advance the understanding of the past and to support a dialogue among civilizations.” [5] Ambitious though this definition may be, its aim is to shed light on the paradigm shift which is one of the main challenges we face as classicists educated with the “old-fashioned” methods: to complete and update our education.
The modeling of data in order to create or to update a database of classical philology/philosophy can be regarded as an occasion to question the concept of “author/authorship”, to present it digitally, comparing ancient Greek/Latin and modern authorship, and finally to offer database users effective search tools in order for them to acquire reliable information about ancient authors.
The concept of “authenticity” is also to be examined: in a digital process, what criteria are to be used in order to distinguish a spurious from an authentic work, an author from a pseudo-author; also, how to model this difference in order to offer the user reliable information about related scholarly debates. Classification of disciplines pertaining to Antiquity has already been a controversial matter of discussion for classicists and more prominently for bibliographers: the digital transition could be the occasion to rethink the concept of “discipline” (ancient vs. modern) and to wonder to what extent classification models reliable for the print are to be adapted for use in a database.
Our aim is also to fuel the debate about a critical approach to digital humanities (DH) in general, by discussing their alleged power: are they to be regarded as the solution to our editorial or, more generally, scholarly problems, including the cost of print periodicals and books? What do databases actually contribute to the renewal of classical studies and the broadening of their audience?

1. Databases: Definitions and Questions

A database, in the computer sense, is a collection of structured data or pieces of information [6] stored persistently in a given computing format. Defining such a format is far beyond the scope of this article. This collection of data is managed by a database management system (DBMS), which allows data manipulation (read, insert, update, delete). SQL (Structured Query Language) is designed to manage data in a relational database management system (RDBMS); the relational model is indeed the one used by the databases discussed here (IPhiS and RSPA). A number of SQL-compliant RDBMSs are available to the humanist scholar. Like many humanities computing projects, we use MySQL, which is open source and guarantees interoperability.
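The four manipulations named above (read, insert, update, delete) can be illustrated with a minimal sketch. It uses Python's built-in sqlite3 module and an in-memory database purely as a stand-in for MySQL; the table ancient_work and its columns are hypothetical and do not reproduce the IPhiS or RSPA schema.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for a MySQL server.
conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per ancient work, with an optional author name.
conn.execute(
    "CREATE TABLE ancient_work ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " author TEXT)"
)

# Insert: one work with a named author, one without.
conn.execute(
    "INSERT INTO ancient_work (title, author) VALUES (?, ?)",
    ("De chorographia", "Mela Pomponius"),
)
conn.execute(
    "INSERT INTO ancient_work (title, author) VALUES (?, ?)",
    ("Oracula Sibyllina", None),
)

# Read: list all works in alphabetical order of title.
rows = conn.execute(
    "SELECT title, author FROM ancient_work ORDER BY title"
).fetchall()

# Update: record an explicit label for the work without a named author.
conn.execute(
    "UPDATE ancient_work SET author = ? WHERE title = ?",
    ("Anonymus", "Oracula Sibyllina"),
)

# Delete: remove one record.
conn.execute("DELETE FROM ancient_work WHERE title = ?", ("De chorographia",))

remaining = conn.execute("SELECT title, author FROM ancient_work").fetchall()
```

Even in so small a table, the choice between an empty author field and an explicit label is an editorial decision rather than a purely technical one.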
Stephen Ramsay describes the process of the design and implementation of a simple relational database. [7] He underlines the specific interest of this model for the data to which humanist scholars are accustomed:
Where the accountant might express relations in terms like ‘has insurance’ or ‘is the supervisor of’, the humanist interposes the suggestive uncertainties of ‘was influenced by’, ‘is simultaneous with’, ‘resembles’, ‘is derived from.’ Such relationships as these hold out the possibility not merely of an increased ability to store and retrieve information, but of an increased critical and methodological self-awareness.
Our definition of a database is also a more human-oriented one: a database is created by an institution to be used by a community of people belonging to a specific field of knowledge. [8] Our field is the Digital Humanities and, more specifically, so-called “digital philology”, including its technical aspects, namely digital editions using the TEI standards. Questions such as “to what extent is a corpus of TEI digital editions of texts an actual database?” are beyond the scope of the present article, but will eventually be part of our future reflection and scientific practice.

2. A Historical Survey of Two Databases

The heritage of the Index Thomisticus by Roberto Busa is mentioned by almost all the scholars dealing with digital humanities, and especially databases nowadays. [9] Yet each database has its own history, whose study could constitute a contribution to a better understanding of its evolution or its obsolescence in the digital era. This is the reason why we aim to present a brief overview of two databases: the Année Philologique (APh) “an analytical and critical bibliography of the Greek and Latin Antiquity”, and the Répertoire des sources philosophiques antiques (RSPA) in its initial form.
The APh, a continuation of Jules Marouzeau’s Dix années de bibliographie classique (1914–1924), actually started in 1929, when Juliette Ernst published the first volume; it has appeared every year since then. [10] The main idea of its creators was to provide scholars with a complete bibliographical tool, “not only a repertoire but a book intended for reading” (p. XXI), a book where one could find updated, verified, and classified international information about publications on every aspect of the Greek and Roman world, including non-philological disciplines such as archaeology or numismatics. What was then a pioneering idea – and still holds today as a kind of “interdisciplinarity” adapted to the classics – is that all those disciplines were interwoven, and therefore the most important contribution of such a bibliography, made by scholars for other scholars, was to promote not only specialized knowledge but also interconnections and interchanges between different fields. It was quite a difficult goal, and a significant achievement, at a time when only manual records were available. In the eighties, the first computer records were implemented in order to facilitate the work, followed by the complete internal “computerization” in 1995 [11], and by the public CD-ROM version, the Database of classical bibliography (DCB), launched in 1997.
In 2002, after a three-year experiment (the free website, 1999–2002), the first on-line (not yet actually digital) version, the website, was launched, exclusively by subscription. Its main purpose was to provide remote access to the data of the printed volumes, from the very first one (1929) up to the most recent. This was the reason why the presentation of the data followed exactly that of their print equivalent, including author, text and discipline classification, but not the detailed three-part index of the printed volume (and of the 4D computerized version): index geographicus, index nominum recentiorum, index nominum antiquorum. The implementation of these indexes in the website was delayed, due to their alleged complexity.
Despite some updates intended to facilitate research or to give access to citations of ancient texts, the website was not actually a digital database but, as mentioned on its homepage, “the digital version of the Année Philologique” (i.e. of the print book). In 2009–2010, the debate about how to manage this dichotomy and transform the website into a digital database meeting the new standards of accessibility and interoperability, without lowering quality, shed light on some unbridged disagreements among the project’s collaborators.
In 2013–2014, the former editorial team of the APh decided to abandon this project and create a new, independent and thoroughly digital database, IPhiS (Information philologique et scientifique), intended for those interested in editions, translations and commentaries of ancient Greek and Latin works, as well as their ancient (Hebrew, Syriac, Aramaic, Arabic), Medieval and Renaissance reception. [12]
This work is currently in progress. Its comprehensive description, including detailed information about software and the data model is to be found in the online technical report. [13] For us, it is an occasion to interact with our colleagues, the computer scientists who undertook the development of applications and the deployment of tailor-made solutions suitable for classics. Our prevailing idea is to enhance the digital applications so as to completely adapt them to the needs of scholars and students of classics (the short-term step), or to a larger public interested in classics (the long-term step).
The RSPA was created in 1994 as an exclusively on-line tool powered by the same 4D software as the APh and following the same logic of coexistence of a printed and a web version, even though the printed version never came to light. The RSPA aims to bring together the primary ancient philosophical sources, from the pre-Socratics to the late sixth century AD, providing, for each document, a list of editions, translations, commentaries and various on-the-spot research tools. Each book is first closely analyzed and then thoroughly presented. [14]
In 2016, the website of the Répertoire was entirely renewed from a technical point of view. The ultimate result was to abandon the 4D local application and to upgrade to a database following the most recent digital standards, focusing on open source tools. It could thus gradually become a web service interoperable with similar databases. At present, we are implementing changes that completely renew the web interface, in order to offer more pleasant and easier access to the data.

2.1 An online version which is a copycat of the print one: the example of APh

What was exactly “like the print book”? First of all, the very concept of an “online APh”, which was not a database. When the website was launched in 2001–2002, it was intended to be a communication tool, in order to increase the visibility of the bibliography (and hopefully encourage people to buy the book). The only notable difference was the existence of a full-text search option, while the other options reproduced the structure of the book, and not of the 4D work database: for example, concerning modern authors, 4D offers the opportunity of a tree structure which includes the authority record and the variants of the author’s name, while in the online APh the authority and the variant records appeared separately, in a linear structure, as if each one were the record of a different person.
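The contrast between the two structures can be made concrete with a minimal sketch. The class AuthorityRecord, the function find_person and the abbreviated name form “Ernst, J.” are hypothetical illustrations; they do not reproduce the actual 4D data model.

```python
from dataclasses import dataclass, field


@dataclass
class AuthorityRecord:
    """One modern author: a preferred name form plus its attached variants."""
    preferred_name: str
    variants: list[str] = field(default_factory=list)

    def matches(self, name: str) -> bool:
        """True if `name` is the preferred form or one of the variants."""
        wanted = name.casefold()
        forms = [self.preferred_name, *self.variants]
        return any(wanted == form.casefold() for form in forms)


# Tree structure: the variant hangs under the authority record.
tree = [AuthorityRecord("Ernst, Juliette", variants=["Ernst, J."])]

# Linear structure: each name form is an independent record,
# so one person looks like two.
linear = ["Ernst, Juliette", "Ernst, J."]


def find_person(records: list[AuthorityRecord], name: str) -> list[AuthorityRecord]:
    """Look a name form up in the tree and return the matching persons."""
    return [record for record in records if record.matches(name)]


hits = find_person(tree, "ernst, j.")
```

In the tree, a search on the variant leads back to a single person; in the linear list, nothing tells the user that the two forms designate the same scholar.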
One of our main concerns was the classification of the APh records. The “APh plan” was the basis of the structure, and this plan was initially conceived for the print book. Ancient authors were classified alphabetically, which was rather problematic in the case of “atypical” authors/texts: for the sake of this alphabetic classification, no distinction was made between a text such as Consultatio ueteris cuiusdam iurisconsulti and an author such as Corinna or Corippus [15]; or between a “collective rubric” such as Medicorum scripta and individual authors such as Megasthenes Historicus and Mela Pomponius. [16]
Discipline classification into the appropriate rubrics and sub-rubrics gave rise to much scholarly debate among the APh collaborators. When we first started to discuss the transformation of the APh online into an actual digital database, some colleagues urged us to abandon any classification in order to simplify the work and thus deliver updated bibliographic information more quickly; others were attached to the “old-fashioned” system of classification, the rubrics of the print book; finally, the majority of us, relying on the needs expressed by the readership of both the print and the online APh, were about to transform rubric classification into a well-selected list of keywords and metadata, following the XML standards for example.
Another similarity between the APh online and the print book was the workflow, an annual process of collection of data in periodical articles, book reviews, library catalogues, books or offprints sent by the authors etc. At the end of the process, all the records were reviewed by the editors, then exported for the print book and the website. An annual meeting with the participation of the local editors (USA, Italy, Spain, Switzerland and Germany) was organized by the chief editors (the French team) in order to discuss and update the general editorial policy. This workflow was changed from annual to quarterly with the argument that the online publication cannot suffer delays: a set of temporary records were exported to the website every three months, partially verified by the editors; the same records, re-examined by the editors, were published anew at the end of the academic year, their status changed from temporary to permanent. This “hybrid” system was unsatisfactory to the editors, considerably increasing their workload, and somehow unsatisfactory to the users, because of the temporary (not yet completely reliable) status of the records. The publication schedule of the print book remained annual. One can point out many similarities between our experience and the one related by Claire Warwick [17] concerning the publishing of their archives by the Portsmouth record office: the different schedule between online and print publication; the “concurrent” publication of the same data in order to resolve the problem of the delay; the advantage of data combined online with other information enriching them and making them more attractive to the user. 
Warwick’s suggestion to “publish [online] a release of the data without an introduction, even if temporarily” reminds us of some strong disagreements between the supporters of the traditional schedule in the name of reliability (but this claim led once or twice to the belated publication of the book), and the promoters of a direct online publication, less interested in reliability than in real-time information tracking. We totally agree with Warwick’s point of view about the “intellectual adjustments” necessary “on both sides”, especially the fact that “those interested in computing are reminded that more traditional elements of the intellectual culture we are working with cannot be ignored if the project is to keep the good will of the scholars who work on it.”

3. Examples of Databases as DH Tools for Classicists or Other Scholars Interested in Antiquity

The examples put forward in this part originate in our experience of the transition from a print copycat database to an actually digital one: how to add new content; how to value the old content, if any, or how to emancipate ourselves from the old content patterns; how to create a new design; which public is targeted and how to meet its needs.

3.1 Becoming familiar with the public of a database

The first survey in the history of the APh took place in 2012, eighty-two years after its creation. It was not a survey per se, but a detailed analysis of the comments left by the people who signed the petition in defense of the German office of the APh. [18] This office was then under threat of being closed down, because the German academic institutions were on the point of changing their funding policy and ceasing to fund long-term research. The petition was signed by 4,400 persons worldwide, not only scholars and students of classics. For us, it was the first step towards a community-based database that fulfills the desiderata of its public.
The majority of the comments focused on the importance of a database prepared by qualified scholars, offering accurate information and thus saving time for other scholars, classicists or people interested in the study of antiquity. Despite the huge amount of scholarly information available online (or perhaps because of it), classifying, verifying and analyzing the information, and providing guidelines and orientation, was a highly appreciated service. The academic character of this specific database, as opposed to corporate ones, was also underlined:
We live in an era in which academic databases have been purchased by or subsumed by corporate entities, where sales figures trump everything. It is critical to maintain the intellectual integrity of this core resource for the study of Greek and Latin antiquity. [19]
The second survey was a questionnaire survey launched in 2013. Its target public was the signatories of the aforementioned petition who received it by email; it aimed to improve the APh website according to their expectations and make it more user-friendly. Some of the questions concerned the different search options, the incorporation of more information, the design and intuitiveness, and other topics.
We received more than 500 answers to this questionnaire over four months (from May to August 2013). It was not surprising that the users’ expectations and suggestions coincided with our concerns: they asked for visible classification of the records, improvement in indexing, an interactive user’s manual, and, above all, links to other specialized databases (indexes of citations, open access online texts, MSS or bibliographical databases etc.). Although this survey did not reach its goal because of the reorganization of the APh project a few months later (cf. part 2 of this article), it was important to keep in touch with the community and to constantly update our database according to their demands. We plan to take some of them into account in the upcoming database IPhiS.
The RSPA survey took place at the beginning of 2014, within the framework of its complete renewal. It contained questions concerning the evaluation of different aspects of the database, proposals of improvement, and of course requested wishes or suggestions from the users. We noticed that the most satisfactory aspect of the RSPA was the accuracy of its content; the least satisfactory one, the interface design. Other aspects, more or less satisfactory, were: constantly updated content, ergonomics, and community work. The majority of the results have been taken into account in the new version of the RSPA database, including the creation of a mailing list.

3.2 Interaction with the users of a database

Does the data model have a noticeable impact on the user’s attitude? This question is fairly difficult for us to answer because, in the majority of cases, the contact between the administrators and the end users of a database is limited to the database interface, which is more often the task of a web designer. According to the TaDiRAH (Taxonomy of Digital Research Activities in the Humanities) [20] terminology, designing is “the development of a user interface with which the user is able to interact […]”, including “the user experience”, i.e. “the person’s perceptions of the practical aspects such as utility, ease of use, and efficiency”. As far as we are concerned, efficiency is of crucial importance, and therefore sophistication is to be avoided, even when it corresponds to the actual data model.
When the “fuzzy search” was implemented in the APh online, the main argument was that it would facilitate the user’s interaction with multilingual data (5 different languages), adapting the online version to the actual nature of the data. Yet it proved to be quite inefficient, firstly because of its complexity, and secondly, and most importantly, because there was no explanation of the use (and utility) of this kind of search. Feedback from users showed that the majority of them did not even know of the existence of such a possibility, let alone its expected (or unexpected) results. The implementation of the “fuzzy search” was decided by computer scientists who regarded it as the technical solution to the problem of multilingualism. Yet to what extent can the “best” technical solution also be the most suitable for the end user? As Claire Warwick points out [21], “we cannot, and must not, try to tell users what they ought to like, need or use. We also cannot expect people to abandon working practices instantly when they have suited them well over many years and, in some humanities fields, generations.” An almost perfect data model and a sophisticated interface taking into account the latest functionalities may turn out to be useless when, as in our case, the “functionality is designed for the kind of users the designers want or can imagine” [22], i.e. not necessarily the actual ones. Time is our “ally”: technicians and scholars must take time to exchange opinions and benefit from each other, to change practices if needed, and finally to conceive digital projects (databases, editions etc.) which meet the needs and wishes of both their “makers” and their users. A “belated” project is sometimes preferable to a hasty one.
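For readers unfamiliar with the notion, a fuzzy search returns entries that resemble the query instead of requiring an exact match. The following is a minimal sketch using the difflib module of the Python standard library; it is not the algorithm used in the APh online, and the list INDEX_NAMES and the cutoff value are assumptions chosen for illustration.

```python
import difflib

# Illustrative index: one ancient author as named in several modern
# languages, plus an unrelated name.
INDEX_NAMES = ["Aristoteles", "Aristote", "Aristotle", "Aristotele", "Platon"]


def fuzzy_search(query: str, candidates: list[str], cutoff: float = 0.8) -> list[str]:
    """Return the candidates whose spelling is close to the query.

    `cutoff` is a similarity threshold between 0 and 1; the value 0.8 is an
    arbitrary assumption and would need tuning on real data.
    """
    by_folded = {name.casefold(): name for name in candidates}
    close = difflib.get_close_matches(
        query.casefold(), list(by_folded), n=len(by_folded), cutoff=cutoff
    )
    return [by_folded[name] for name in close]


exact_hits = [name for name in INDEX_NAMES if name == "Aristotle"]
fuzzy_hits = fuzzy_search("Aristotle", INDEX_NAMES)
```

An exact search on “Aristotle” finds one form, while the fuzzy search also retrieves the other three language forms and leaves “Platon” aside. The sketch also shows why an unexplained fuzzy search can puzzle users: the result list depends on a threshold they never see.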

3.3 The content of our databases: some concrete examples

To abandon a database using proprietary software such as 4D and create a thoroughly new, open access, “home-made” database conceived by a computer scientist under the instructions of philologists is not an easy task. One of the core difficulties consists in decision-making concerning the new database: how to distinguish between “useful” and “useless” data? Is it a technical or a scientific choice? Who decides? What are the criteria, including the cost in terms of time and/or money? “Storage is cheaper than decision-making” [23] ; yet the problem is not merely the cost of decision-making, but the person who makes decisions and the one who depends on those decisions. It could be argued that a database is not a storage area; its purpose is to help scholars and students to save time and learn more about a given subject. This is the reason why creating a database takes time, and needs collaboration between classicists, computer scientists, and/or web designers, communication managers etc.
To make this general point of view more understandable, we will now focus on some examples we came across during our work on the two aforementioned databases.
Code shifting [24] in order to “translate” classical data in a computer-understandable language helps classicists to question some key concepts such as authorship, authenticity, as well as criteria – and reliability – of item classification.
This is not new: our experience of a database primarily conceived as a printable model (as already mentioned, its digital version was for a long time a kind of carbon copy of what was destined for the print book) shows that those questions have always raised scholarly debates among classicists, and not only among them.
The author topic and the author-text relationship are a part of the scholarly debate in modern literary theory. [25] This debate is still open. Yet our aim here is not to discuss the author question in general, but to focus on ancient authors and ancient texts, and, more importantly, to examine how the author question is to be answered through the digital process of modeling the data to be implemented in a database [26] . We are supposed to give answers suitable for the “formal language” of the computer. This code shifting helps us to deepen our understanding of the aforementioned key notions and to present them in a different way than in the past.
For example, a pseudo-author was then not supposed to have an entry different from that of the original author; the distinction was to be made at a second level (thanks to elements such as title, abstract, text citations etc.). This was not very complicated when one consulted the annual printed volume of the APh, but it becomes time-consuming, if not completely unproductive, to proceed in this way when consulting the larger amounts of information included in a database. We are thus obliged to sharpen our knowledge of who can be regarded as an author in the ancient world, in order to produce a more precise classification and more efficient research tools for the users of our database.
Let us examine some topics about ancient authors more closely: are ancient authors different from modern ones? What is the difference – or does such a difference actually exist – and, if so, how can it be represented digitally? Yet the authorship question is not a mere representation problem pertaining to the digital form of data. It is an epistemological and methodological one: what does it actually mean for a text to be attributed to an “author”? One can find similarities between ancient and modern authors, both “creators of texts” (but what about “authors without text”, such as Socrates or Pythagoras?); yet in the case of philological databases (such as the new database project IPhiS), the most important element to represent digitally is not the author, but the title of a work, which is supposed to be less misleading than the name of an author or a purported author. [27]
The question of authorship (the act of creation) does not depend exclusively on the existence of an identified author-person: there are indeed many anonymi in ancient literature, either unknown authors or purported authors bearing this name (e.g. the Anonymus Medicus Londiniensis [28] or the Anonymus Iamblichi, TLG 1134), or texts by unknown author(s) (e.g. the Oracula Sibyllina, TLG 1151 [29]).
How can we deal with such cases? One can use the association of the Latin term anonymus and an author name already included in a list (suitable for entries such as the Anonymus Iamblichi, linked to the Iamblichus entry); or include a list of anonymi in the general text/author list (suitable for entries such as the Anonymus Medicus Londiniensis, which cannot be linked to any existing entry). The question is how a user can find information about it.
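The two options can be combined, as the following minimal sketch suggests. The Entry class, its fields and the search function are hypothetical and are not taken from the IPhiS or RSPA data models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """An entry of the author/text list (hypothetical structure)."""
    name: str
    linked_to: Optional[str] = None  # name of an existing author entry, if any
    is_anonymus: bool = False


ENTRIES = [
    Entry("Iamblichus"),
    # Option 1: an anonymus associated with an author already in the list.
    Entry("Anonymus Iamblichi", linked_to="Iamblichus", is_anonymus=True),
    # Option 2: an anonymus that cannot be linked to any existing entry.
    Entry("Anonymus Medicus Londiniensis", is_anonymus=True),
]


def search(term: str) -> list[str]:
    """Names of entries whose own name or linked author name contains the term."""
    wanted = term.casefold()
    return [
        entry.name
        for entry in ENTRIES
        if wanted in entry.name.casefold()
        or (entry.linked_to is not None and wanted in entry.linked_to.casefold())
    ]


def list_anonymi() -> list[str]:
    """The separate list of anonymi, whether linked to an author or not."""
    return [entry.name for entry in ENTRIES if entry.is_anonymus]


by_author = search("Iamblichus")
by_label = search("anonymus")
```

Under these assumptions, a user starting from Iamblichus reaches the Anonymus Iamblichi through the link, while the Anonymus Medicus Londiniensis can only be reached through the list of anonymi or a search on its own name.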
How can we handle collections, compendia or anthologies, for example the Greek Anthology (TLG 7000), which contains epigrams by some identified authors and others by unidentified ones?
What about the distinction between “genuine” and “spurious”, concerning authors and texts? What about its reliability and its place in a database? What about pseudo-authors? Can we resolve the problem only by adding the prefix “pseudo-”? [30] There is considerable debate among scholars about “forgeries” in antiquity [31], and computer scientists should be aware of it.
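One possible way of going beyond the bare prefix is to record the attribution itself as an object carrying a status and pointers to the scholarly debate. The sketch below is only an assumption about how this could be modeled: the three status values, the field names and the placeholder data (“Author X”, “Work A”, “Work B”) are hypothetical and make no claim about any real work.

```python
from dataclasses import dataclass, field
from enum import Enum


class AttributionStatus(Enum):
    GENUINE = "genuine"
    SPURIOUS = "spurious"
    DISPUTED = "disputed"


@dataclass
class Attribution:
    """Link between a work and an author name, with its scholarly status."""
    work_title: str
    author_name: str
    status: AttributionStatus
    # Free-text pointers to the relevant scholarly discussion (placeholders here).
    debate_refs: list[str] = field(default_factory=list)

    def display_author(self) -> str:
        """Name form shown to the user, derived from the status."""
        if self.status is AttributionStatus.SPURIOUS:
            return f"Pseudo-{self.author_name}"
        if self.status is AttributionStatus.DISPUTED:
            return f"{self.author_name} (?)"
        return self.author_name


attributions = [
    Attribution("Work A", "Author X", AttributionStatus.GENUINE),
    Attribution("Work B", "Author X", AttributionStatus.SPURIOUS,
                debate_refs=["placeholder reference"]),
]

# Everything transmitted under the name, whatever its status.
under_name = [a.work_title for a in attributions if a.author_name == "Author X"]
labels = [a.display_author() for a in attributions]
```

In such a model the prefix is only a display form: the user can still retrieve everything transmitted under a given name and consult the references that justify the label.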
The methods used by ancient scholars in order to identify the authors of some works were not so different from modern philological investigation (or Quellenforschung). Blum [32] underlines that for the Greeks, “to discover the real author was […] tantamount to the unmasking of a forger”. Our modern philological concern, including the digital representation of the “author”, is more about pseudo-authors and pseudepigraphic texts than about real forgeries. In fact, practices of forgery are complex [33] and not easily classified; moreover, there are numerous literary phenomena closely related to forgery: literary fictions, pen names, homonymity, anonymity, false attributions, plagiarism, fabrications, or falsifications. [34]
The author debate can be related to the textual one, i.e. the question about the alleged “original” text of an ancient work or at least of an accurate and stabilized critical edition of this work, the closest possible to this “original” text. An interesting analysis on this issue is provided by Claire Clivaz, who examines the impact of the digital culture on the critical editions of the New Testament. [35] The IPhiS database project’s endeavor, like the one of the RSPA, is to put forward information about the editions of ancient texts, and put aside the other bibliographical data. Although these databases focus mainly on titles of works and not on authors, the question of the attribution of a given text to a given author and of its digital representation still remains. According to Clivaz, who also quotes Umberto Eco’s point of view on “variants”, philological study, and textual reconstruction, the existence of a “stabilized text attributed to a specific author goes back no further than the middle of the nineteenth century”, and is more suited to the print culture than to the digital one. [36]
Yet, even without looking for an “Ur-Text”, or focusing on the author entry as the main one in a database dealing with classics, the link between an author and a text, as well as the question of textual authenticity must not be disregarded.
Repositories of ancient texts encoded with the TEI/XML standards, such as the Open Greek and Latin Project, make texts accessible not only to scholars, but also to students and a broader public all over the world. Moreover, texts included in the OGLP philological repository “are published in GitHub on an ongoing basis” and the user is encouraged to check updates. Collaborative editions or repositories are quite different from the databases examined in the present article, but they face the same question concerning the text-author link, even though their main purpose is not to model this link digitally, as is the case in our databases.
Clivaz’s tantalizing concept [37] of the “triad authors-scribes-writers” which, according to her, is more accurate in the digital era than the classical (old-fashioned?) triad “author-text-reader” and upgrades the role of the reader as a text-maker and the role of the ancient scribe/scriptor as the one “who writes and reads, reads and writes, and rewrites” is hardly applicable to the kind of databases we are dealing with, but can be a challenge for us as classicists and “digital humanists”, because it introduces the concept of versatility and constant change: if there is no fixed text or author, what would be the pivotal table in a database dealing with Antiquity?
Fragmentary texts are another issue that is difficult to deal with: no work should be entitled fragmenta, since this is not a title but a type of text. This is the reason why we are planning to change it in the new version of the RSPA. A separate entry called fragmenta and/or testimonia in the author/text list, associated with a given author or work title, could provide useful information. In the IPhiS database project, a “level” selection (complete work, part of a text, or assemblage of texts) and a “characterization” selection (fragment, excerpt, isolated text, variant, anthology, florilegium or corpus) are included in each ancient text reference.
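The two selections just described lend themselves to closed lists of values. The following sketch uses the values named above; the class and field names, and the decision to make the characterization optional, are our assumptions and do not reproduce the actual IPhiS data model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Level(Enum):
    COMPLETE_WORK = "complete work"
    PART_OF_TEXT = "part of a text"
    ASSEMBLAGE = "assemblage of texts"


class Characterization(Enum):
    FRAGMENT = "fragment"
    EXCERPT = "excerpt"
    ISOLATED_TEXT = "isolated text"
    VARIANT = "variant"
    ANTHOLOGY = "anthology"
    FLORILEGIUM = "florilegium"
    CORPUS = "corpus"


@dataclass
class TextReference:
    """Reference to an ancient text (hypothetical structure)."""
    title: str
    level: Level
    # Assumption: a complete work may carry no characterization at all.
    characterization: Optional[Characterization] = None


references = [
    TextReference("Greek Anthology", Level.ASSEMBLAGE, Characterization.ANTHOLOGY),
    # Placeholder entry standing for any fragmentary text.
    TextReference("Placeholder title", Level.PART_OF_TEXT, Characterization.FRAGMENT),
]

fragment_titles = [
    ref.title for ref in references
    if ref.characterization is Characterization.FRAGMENT
]
```

With such a structure, “fragment” becomes a searchable property of a reference instead of a pseudo-title, which is the change envisaged above for the RSPA.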
Although the topic here is not to discuss electronic editions of fragmentary texts or collections of fragments but to find the adequate way to reference editions of this kind in our databases, we will take a brief detour into the Sharing Ancient Wisdom (SAWS) project aiming to use TEI/XML standards applied to Greek and Arabic collections of sayings, in order to demonstrate “the relationships between the collections and the texts on which they drew, and between the collections themselves”. [38]

3.4 Some cases of databases in other humanities fields

We will provide here some examples of databases which we regard as models (or counter-models) and whose subject is not Classical Antiquity, but various humanities fields such as: Renaissance studies (prosopographical and textual data); history of the print book; Arabic philosophy. We limited our selection to projects we know quite well, because they belong to our “research ecosystem” (philology, philosophy, textual history, history of art and musicology). Another interesting example, which is not a database strictly speaking, but can be included in this category, deals with the history of mathematics and focuses on the author issue from a non-antiquarian point of view.
Renaissance and early modern studies
Epistemon includes literary corpuses of the French Renaissance. Its first version (1995) included some texts in HTML; facsimile reproductions were added from 2001; and since 2007, “after some years of experimentation”, XML/TEI standards have been used for all new texts added to the database. Word index searches use the PhiloLogic™ tool developed by the ARTFL Project of the University of Chicago. The project is of interest for two reasons: first, the transition from facsimile (or OCR) versions to new encoding (of new texts, but also of the older ones) using the TEI standards; secondly, the need to work out how to encode texts from the beginnings of printing (sixteenth century) through its normalization (nineteenth century), a period when typographical standards were not yet stabilized. The aim of the “TEI encoding for Renaissance and early modern texts” manual that came out of this work is to enrich the general TEI standards. The content can also be displayed in image mode (reproduction of the ancient books).
BUDE: this “all-in-one” database, a part of the international networks Europa Humanistica and Tradlat, includes information about the transmission of texts “from Antiquity to Renaissance”, covering a period from the end of the Middle Ages to the seventeenth century. Its main aim (and interest for the scientific community) is to centralize all available information, in order to facilitate research on manuscripts, characters (prosopographical data), print editions, and geographical information related to the humanist activities, including a bank of images (portraits, samples of handwriting, plates of the MSS or print books). One must register in order to have access to the content (registration is free); users are encouraged to contact the administrators if they want to participate in the enhancement and enrichment of the data: the “community” is thus under control, because the nature and variety of the data require verification by specialists. Unfortunately, the database has not been updated since 2015, the year its chief administrator retired.
Arabic philosophy
ABJAD is the result of a European research project (funded by the ERC, the European Research Council), launched in 2011 and entitled “Philosophy in context: Arabic and Syriac manuscripts in the Mediterranean” (Phic). The aim of both the project and the database is to study the circulation of ancient Arabic and Syriac philosophy in the Islamic world and in Europe through the MSS. The database is supposed to provide not only a list of MSS discovered in libraries or other institutions, but also a “complete ID” of each one: codicological description, information about content, owners and places, incipit and explicit, marginalia by various hands. Bibliographical and biographical data are included. There are 6662 documents in the database: given the international nature of the project, collaborators in different countries add new content to the database. MSS being the main vehicle of knowledge in the Islamic world until the ninth century, such a database is a wonderful idea. Yet the interface is not at all user-friendly. Furthermore, although a registration is required in order to access the content, no information is available about how to register as an “invited member” of the group. This is a typical example of how invaluable information and rare data gathered in an extremely meticulous way by highly qualified volunteers in various countries remain accessible only to initiates.
Book history
Maguelone includes typographical ornaments used by printers in the eighteenth century, in order to identify the French and European printing houses which published books under fake addresses or names, and to retrieve the real locations and names by comparing the authentic with the clandestine publications issued by the same printing house. The database was conceived and developed by a book historian and image specialist as a part of the international bank of printer’s ornaments “Passe-Partout”. The interest of Maguelone lies in its originality: it combines prosopographical and geographical information about printers, classification by printing house or ornamental motif, as well as a list of publications issued by each house, including their title pages. Technical support is provided by the aforementioned international network, which is also responsible for the implementation of new data: although the “Passe-Partout” network is “based on cooperation and exchange” (free and open access, no registration required to search in the database), participation in the “community” is restricted to authorized “researchers or institutions”. Necessary though this restriction may be, it does not encourage amateur participation.
Music and performance
NEUMA is a digital library of musical corpora, i.e. scores that are rare or hardly accessible. It uses the MusicXML/MEI standards to provide reliable editions, and includes “innovative tools of analysis and treatment of the musical notation”. Its aim is to facilitate access to those corpora, and therefore to favor their dissemination, among musicians and teachers alike (pedagogical use). It is a collaborative database: one can open an account on the platform in order to “create new corpora, enrich them with new scores”, or “annotate the existing ones”. It includes sound and offers access to the MEI files of the scores, providing a complete overview of each work.
Chronopéra includes the repertoire of the Paris Opera from 1749 to 1989 (the year of the opening of the Bastille Opera). Its target public is: “opera and ballet lovers” and scholars, “historians of music, of dance, or performance”. Content is indexed in a quite simple way: date, title of work, composer, librettist, choreographer, information about a work (of course those criteria can be combined). The only problem with this interesting and original database is that only subscribers can access the whole content. Fortunately, subscription is free.
Bourbaki, Nicolas: an author’s name? The Bourbaki archives online
Historians of mathematics, philosophy, and science know what hides behind this name: the “Bourbaki archives” are not the archives of some famous mathematician, but of a group of French mathematicians, which was founded in Paris in 1935, and issued a series of books (first published in 1939) entitled Elements of mathematic. [39] There is neither an author (person) named “Bourbaki”, nor any “personal” archives to be digitized, but a body of collective archives of this group, including those documenting its everyday activity. The current “Bourbaki database” is the result of many years of reflection concerning the digitization of those 400 documents (16 000 pages): the PDF format was considered more appropriate to the nature and state of the original documents; the classification of the documents highlights the functioning of the Bourbaki group and its working method; an “up to date bibliography pertaining to the history of the Bourbaki group”, “as comprehensive as possible”, was implemented in the database; the appropriate keywords were added under the guidance of a specialist in the history of mathematics; and a brief introduction presents “Nicolas Bourbaki” and some links to the history of the database and its technical biases.

4. Interdisciplinarity

Creating a new database or adapting an old one to digital constraints and requirements is a difficult task that requires a paradigm shift; classicists must combine their traditional, “craft-oriented” working methods, where precision and detail matter (methods that take time and require in-depth analysis of many different sources or many different material supports), with a formal grammar and digital rules they are not familiar with. This is at the same time fascinating and frustrating. Yet classicists, even the most technically skilled, are not computer scientists: this is the reason why collaboration is indispensable.
Interdisciplinarity is not a mere juxtaposition of skills, but an actual interchange of methods of work and of points of view. [40] It takes time. Although digital projects nowadays are supposed to be achieved in less time than the old print indexes and the like, our experience leads us to put forward a kind of “slow DH” or “slow databases” [41] : we need to devote more time to this debate, not to skip it.
Our institution, the French CNRS (National Center for Scientific Research), has focused on interdisciplinarity and encouraged collaborative research since its creation in 1939. The interdisciplinary commission n°53, whose mission is to analyze “methods, practices, and communications of science and techniques”, as well as to recruit and evaluate scientists involved in these areas, was launched in 2012 within the National Committee for Scientific Research. [42] In our personal experience (one of us is an elected member of this commission, participating in its work and in the condition report published in 2014 [43] ), evaluating interdisciplinary projects and thinking about interdisciplinary practice are fascinating, despite their complexity and despite the difficulty of bridging such different fields as classics, engineering, biology, or IT. One of our main concerns then was not to be biased in favor of one discipline or another. Another is to question the very concept of interdisciplinarity as an enrichment and enlargement of a given disciplinary field, as well as a way to transcend disciplinary boundaries. [44]
Given the diversity of interpretations of this concept, we feel that the best way to discuss it is to refer to concrete attempts to describe what one can actually expect to find in this kind of research. The keywords describing the aforementioned interdisciplinary commission are quite useful guidelines. Let us give a (by no means exhaustive) list: “historical, sociological, philosophical, geographical approaches of science and technology; ethics and scientific responsibility; science and power; risks and controversies; IT governance and security; sociological and geopolitical approach of digital networks; participative science, co-elaboration of science”. [45] What is of particular interest here is that crossing the disciplines not only encourages synergies, but also contributes to the emergence of innovative theory and practice. [46]
To stay within the scope of the present article, classicists involved in databases often wonder if it is necessary to learn IT, namely programming languages or encoding techniques, in order to establish effective communication with computer scientists. [47] As classicists, we would like to reverse this challenging question: what about computer scientists being introduced to classics? The ways of thinking of classicists can indeed change under the influence of the digital humanities; what about those of computer scientists?
“Interdisciplinary humanities” [48] , more specifically the crossing of disciplines pertaining to the study of Antiquity, is a less widespread aspect of interdisciplinarity. Marouzeau’s pioneer conception of the APh as a bibliography whose main concern was to promote the crossing of disciplines pertaining to different fields (such as philology, archaeology, history, epigraphy, paleography, philosophy etc.) was the key to a good understanding of the ancient world, as well as the main scholarly contribution of the APh.

5. Crowdsourcing in the DH: Advantages and Limits. Community-based science [49]

Creating a community of scholars through users’ contribution can be a real advantage of a digital database, but also a rather risky path: what about verification of the reliability of all those contributions? What about security issues? What about evaluation and peer reviewing in an open access digital environment?
Despite those challenging questions whose answer is not yet conclusive, crowdsourcing can help us to face the tantalizing problem of exhaustiveness, which is inherent especially in bibliographical databases, without jeopardizing the qualifications and skills of the scholars or engineers working on it.
Although one of the first experiments in crowdsourcing was the Oxford English Dictionary, the process is not as widespread in the humanities as in environmental or astronomical research. Amateur observation allows not only the collection of an incredibly large amount of data, but also the enhancement of some existing databases or the creation of new ones suitable for new kinds of data. In social sciences, native knowledge is now regarded not as mere folklore, but as a piece of scientific evidence. [50]
As far as we are concerned, we first faced this problem in 2013, when we were about to upgrade the APh website and to launch the discussion on “the digital future” of this bibliography. The idea of a community-based collection of bibliographic information was undoubtedly the first (non-controversial) option to consider in the APh as well as in the RSPA. Then emerged the problem of the implementation of new records in the database: a quite seductive idea was to open it, in order to encourage not only scholarly participation, but also amateur interest in classics. Public-friendly though this solution may seem, it could lead to the marginalization of skilled specialists, whose task of data verification and editorial work could be minimized. In our opinion, a balanced cooperation between all the members of the community is to be encouraged.
When the database editorial team is limited to a few scholars who work together (physically or by remote connection), as is the case for the projects described here and for many other antiquarian databases, the security problems can easily be resolved. As Ramsay points out [51] , in “database management”, the “privileged users” who can perform any operation needed should be distinguished from the “broader but not unlimited” set of users who can add, delete, or change data. The implementation of the appropriate security systems is another field of cooperation between developers and scholars, because scholars indicate the nature of the data to be protected, and developers translate it into technical procedures. We agree that “the administrator will take time to study the security model […] and implement the appropriate procedures”. [52] Yet how can one combine a satisfactory level of security (that is, restricted access to fields or functions) with openness to a broader community of scholars and the collaborative building of a database thanks to the contribution of an increasing number of users? Time is needed to study how to encourage community building without compromising security and reliability of the data.
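The distinction between “privileged users” and a “broader but not unlimited” set of users can be illustrated with a minimal sketch of a role-to-permission table. Everything named here is our own assumption for the sake of illustration: the role names ("privileged", "editor", "reader"), the list of actions, and the function is_allowed do not describe the security model of any of the databases discussed, nor of a particular database management system. The read-only "reader" role is an addition of ours, standing for the wider community.

```python
from enum import Enum, auto


class Action(Enum):
    # Hypothetical operations on a bibliographical database.
    READ = auto()
    ADD = auto()
    CHANGE = auto()
    DELETE = auto()
    ALTER_SCHEMA = auto()
    MANAGE_USERS = auto()


# Role names are illustrative assumptions.
PERMISSIONS = {
    # "privileged users": any operation needed
    "privileged": frozenset(Action),
    # "broader but not unlimited" users: add, delete, or change data
    "editor": frozenset({Action.READ, Action.ADD, Action.CHANGE, Action.DELETE}),
    # wider community (our addition): consultation only
    "reader": frozenset({Action.READ}),
}


def is_allowed(role: str, action: Action) -> bool:
    """Return True if the role may perform the action; unknown roles get nothing."""
    return action in PERMISSIONS.get(role, frozenset())


print(is_allowed("editor", Action.DELETE))
print(is_allowed("editor", Action.MANAGE_USERS))
```

The table itself is where scholars and developers meet: scholars decide which data and operations need protection, and developers encode that decision as sets of permitted actions.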
Although it is difficult to draw a parallel between a database on classical antiquity and a “fab-lab”, an open community of specialists and enlightened amateurs could contribute to improving the quality and increasing the quantity of data, and to upgrading software or database management systems. Certainly this kind of community would not contribute to the substantial economies expected by those who would like to replace permanent scientist positions (whose skillful intervention can prevent errors or misinformation, especially in case of incessant workflow, i.e. daily implementation of new records in the database) by precarious ones, or to abandon crosschecking altogether.
Merging databases and increasing their visibility (in a context where all labs/scholars nowadays are conducting DH research), harmonizing software tools, coordinating and propagating good practices: this is the role of federative facilities such as the “Very Large Facility” Huma-Num, launched in 2013 and currently supported by the CNRS, the university of Aix-Marseille and the Campus Condorcet. Digital platforms like this provide technical support and safe storage, but also training and guidelines for data processing, interoperability etc. In other words, their mission is to bring and maintain “worlds in contact”. [53]

6. Critical Approach to DH

The work of Roland Reuss [54] has a provocative title and content. Its critique of the digital world, which is supposed to “hypnotize” us, is rather exaggerated and somewhat unfair, considering the advantages of digital technologies (spreading knowledge, minimizing spatial distance, facilitating access to a mass of data and works as well as communication between scholars, students, and public, saving space in libraries and preserving ancient books or MSS). It is admittedly difficult to follow Reuss in his absolute defense of print as opposed to the intrinsically harmful digital. The digital world is not absolutely undermined by big capitalistic corporations such as Google (or, in our field, the predatory editors or the providers of “customizable services and products [55] ”). Yet we think that the digital is neither the solution to our editorial problems, or to scholarly problems in general, nor the absolute remedy to the cost of a print periodical or book (quality digital publications are costly indeed): one must be fairly open-minded in order to evaluate the improvement due to digital technology and innovation applied to the humanities, the enlargement of perspective due to the dialogue with computer scientists, the emergence of interdisciplinary fields and the new services one can provide to the scientific community and to the public in general.
Certainly, databases, like any digital project and electronic device, are facing technological obsolescence despite the fact that computing research entails constant infrastructure research, in order to be up to date from a technological point of view. The solution to this problem is certainly not to call into question technology as such, but to take action in coordination with many other groups of people fighting built-in obsolescence and promoting durability. In our opinion, the main risk for our databases is less technological obsolescence than fragility and instability due to short-term funding of many a research project, turnover and precariousness of personnel, changing modes and trends, and proliferation of publications due to the disastrous effects of “publish or perish”, i.e. quantity to the detriment of quality.
“Digital is powerful”: some scientific policy makers regard technology as a means to increase their power through control of positions, profiling, funding etc. Some scholars are to some extent following them in this direction, hoping that, in times of underfunding of the humanities, projects featuring digital issues will increase their chances of receiving subsidies. This has nothing to do with the intrinsic value of the digital humanities as a research field, and not merely as a tool playing an ancillary role or as a “side-effect” of the introduction of technology in “traditional” erudite scholarship. Yet the question is not about how to increase “power” in a competitive world, but more about how to broaden our audience without falling into the pitfalls of “trendy” science.
“Digital humanities make classics more attractive to scholars and students”: this kind of assertion implies that classics are by definition boring or at least less attractive than other disciplines. Does digitization turn classics into a “sexy” discipline? To answer such questions one needs to take into account the role of classics in the high school and university curricula for decades. What made classics less “sexy” than, say, mathematics, engineering or biology, was first of all the idea that their target public was the student elite (classics as a means to select the best, because special requirements were needed for this kind of study; according to this point of view, classics were not for everyday people); then, the rather widespread idea that the study of antiquity is useless because it has no practical applications (a utilitarian view of which disciplines are worth teaching in schools or universities). Are digital humanities the face of the “modern” humanities in the “modern” era? Are they to be opposed to some “old-fashioned” philology doomed to extinction in the near future? [56]
In the aforementioned report of the interdisciplinary commission 53, a chapter is devoted to a “reflexive approach of digital humanities”. [57] This chapter focuses on data management (analysis of extended or perpetually upgradable textual corpuses, collection and preservation of oral languages without textual evidence), network analysis, diffusion of knowledge, and their influence on community formation or transformation. It also focuses on ethical questions such as privacy or data selection and opinion making… The majority of those questions are fruitfully addressed thanks to the interaction between the “traditional activities of the humanities, such as edition, archives, textual analysis” on the one hand, and, on the other hand, the “computational approaches” which introduce innovation [58] and provide new insights into those activities thanks to formal logic and models from computer science, physics, or mathematics. As classicists, we do think that such insights are highly beneficial to our scholarship, not only because they contribute to telling old stories from a new perspective, but also because they shed new light on interdisciplinarity and help classics to reach a new public of scholars, students, or amateurs. Digital is not about an alleged “power” to acquire, but about empowerment of classics and classicists.
To conclude, we do think that DH in general and databases in particular can open up new paths towards a brighter future for the study of antiquity. We do remain optimistic and feel comfortable about the evolution of classical scholarship in the digital era. Nevertheless, we can agree neither with those who insinuate that digital humanities are going to replace “traditional” humanities, nor with those who think that in the upcoming period DH will be a “must have” for any classicist. In 2010, the authors of the Manifesto of digital humanities [59] put forward their heuristic perspectives, the new opportunities offered by this “transdiscipline” (concerning, among others, preservation, valorization, and dissemination of knowledge), the building of “a solidary, open, welcoming and freely accessible” community, but they insisted that “DH are not tabula rasa”. In 2014, a report by Marin Dacos and Pierre Mounier provided a fairly detailed presentation of the international DH community. [60] An interesting point in this report is the geographical and cultural diversity of the DH community, as well as the need for “hybrid” groups combining research and engineering, scholarly exchange of opinions, and technical experimentation. This is more about coordination that gives new perspectives to scholarly work than about constraint and obligation.
Are the DH to be unconditionally identified with progress? To answer this question, we would like to cite Flanders’ conclusion [61] :
The most interesting papers and books we read, in any genre, are those that neither foretell doom nor glory, but give us instead an interesting idea about the world to play with. Methods and tools that combine what has been gained in power and scale with a real measure of scholarly effort and engagement can give us such an idea. But the intellectual outcomes will not be judged by their power or speed, but by the same criteria used in humanities scholarship all along: does it make us think? Does it make us keep thinking?
This conclusion shows that what we need as classicists would probably be to combine our “traditional” methods of scholarly work with the new digital paradigm, and face the questions raised by the new model with the same “craft-oriented” concern, the same attention to detail and precision that has always been a “trademark” of the humanities, and more specifically of the classical studies. Digital literacy does not replace thinking, it can – or should – help classicists to “keep thinking”, improve, expand, and popularize the knowledge of the ancient world.


