Multitext Technical Infrastructure


Technological Infrastructure

The texts, images, indices, and essays that compose the Homer Multitext are for researchers and any interested readers to use as they see fit. The Center for Hellenic Studies is committed to providing innovative and evolving interfaces to this data, some that will illuminate aspects of the Homeric tradition or illustrate scholarly arguments about that tradition, and some that will invite exploration among ancient and medieval witnesses to Homeric poetry, in Greek and in translation. Further, the Center for Hellenic Studies team hopes to encourage other projects to take advantage of this library of texts and data by incorporating them into other digital libraries as seamlessly as possible.

With these goals in mind, the editors of the Homer Multitext decided early in the project’s history to build on a generic technological infrastructure to serve its data. This infrastructure is described at Digital Incunabula, a series of technical papers initiated by the Center for Hellenic Studies.

The heart of the Homer Multitext’s infrastructure is the Canonical Text Services protocol (CTS). CTS defines a network service for identifying and working with texts. It joins the conceptual model of “texts” described by the Functional Requirements for Bibliographic Records (FRBR) with the hierarchical model of canonical citation that is traditional in many areas of the humanities.

The protocol defines a hierarchical scheme of “works.” As in FRBR, the “work” is a conceptual entity: an abstract idea of the content expressed in all versions of a work, in the original language or in translation. In CTS, however, the work’s original language is specified. CTS organizes works in “groups,” which have no direct parallel in FRBR. Groups organize works according to traditional citation practice. They may reflect authorship (e.g., a work entitled Huckleberry Finn might belong to a group named “Mark Twain”), or may represent some other kind of corpus (for example, a work numbered 1 belonging to a group named “Federalist Papers”). Works may include specific versions, called “expressions” in the FRBR model; in CTS, these are identified as either editions or translations, with the language of each translation explicitly identified. These versions may in turn be represented by specific exemplars, or “items” in FRBR parlance.

Beyond identifying works, as the FRBR model aims to do, CTS provides a hierarchical model for citing sections of a work. A prose work like Herodotus’ Histories might be organized in a book/chapter/section scheme, while an epic poem might be cited by book number and verse number. The protocol is designed so that canonical text servers accept citations at any level of the work hierarchy and at any level of the reference hierarchy, and interpret them as humanists working manually with printed materials would.
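The two hierarchies described above can be illustrated with a minimal Python sketch. It is not the CTS protocol or any CTS implementation: the class names, the dotted reference notation (such as "1.2" for book 1, verse 2), the resolve function, and the placeholder passage text are all assumptions made for this example. It shows only the general idea that a citation may stop at any level of the reference hierarchy and still be answered sensibly.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple, Union

# Hypothetical in-memory model; these names are illustrative, not part of CTS.
# A passage is either leaf text or a mapping of child labels to passages.
Passage = Union[str, Dict[str, "Passage"]]


@dataclass
class Version:
    label: str      # illustrative identifier for an edition or translation
    kind: str       # "edition" or "translation"
    language: str   # language of this version
    passages: Dict[str, Passage] = field(default_factory=dict)


@dataclass
class Work:
    title: str
    original_language: str   # CTS specifies the work's original language
    versions: Dict[str, Version] = field(default_factory=dict)


@dataclass
class Group:
    name: str                # e.g. an author or some other corpus
    works: Dict[str, Work] = field(default_factory=dict)


def _flatten(node: Passage, prefix: List[str]) -> List[Tuple[str, str]]:
    """Collect every leaf under node as (citation, text) pairs."""
    if isinstance(node, str):
        return [(".".join(prefix), node)]
    pairs: List[Tuple[str, str]] = []
    for label, child in node.items():
        pairs.extend(_flatten(child, prefix + [label]))
    return pairs


def resolve(version: Version, reference: str = "") -> List[Tuple[str, str]]:
    """Resolve a dotted reference at any depth of the citation hierarchy.

    An empty reference returns the whole version, "1" returns all of
    book 1, and "1.2" returns a single verse.
    """
    parts = [p for p in reference.split(".") if p]
    node: Passage = version.passages
    for part in parts:
        if not isinstance(node, dict) or part not in node:
            raise KeyError(f"no passage {reference!r} in {version.label}")
        node = node[part]
    return _flatten(node, parts)


# Placeholder content: two books cited by book number and verse number.
edition = Version(
    label="example-edition",
    kind="edition",
    language="grc",
    passages={
        "1": {"1": "text of 1.1", "2": "text of 1.2"},
        "2": {"1": "text of 2.1"},
    },
)
poem = Work(title="Example Epic", original_language="grc",
            versions={edition.label: edition})
group = Group(name="Example Poet", works={"epic": poem})

for citation, text in resolve(edition, "1"):
    print(citation, text)
```

Asking for "1" returns every verse of book 1, while "1.2" returns a single verse, and an empty reference returns the whole version. A real canonical text server answers such requests over a network and follows the protocol's own identifier syntax, which this sketch does not attempt to reproduce.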

CTS allows automatic discovery of and access to electronic texts in a digital library, with no prior knowledge of that library’s holdings or of the structure of texts. It enables sharing and mirroring among libraries, and feeds data to applications that present texts to human readers with the degree of multiformity required by the Homer Multitext.

In addition to CTS, the architects of the Homer Multitext have developed similar protocols and tools for publishing images, data about collections of objects (concrete objects like coins or fragments of pottery, or abstract objects like geographical references, personal names, or lexical lemmata), and indices creating associations among these things.