Linguistics

The Component Metadata Infrastructure (CMDI) provides a framework for creating and using self-defined metadata formats. It relies on a modular model of so-called metadata components, which can be assembled to improve reuse, interoperability and cooperation among metadata modelers. The model is standardised in ISO 24622-1 and ISO 24622-2. Metadata are typically serialized in XML and often distributed via OAI-PMH. Definitions of data categories are provided externally, for example by linking to schema.org or the CLARIN Concept Registry.

CRMtex is an extension of the CIDOC CRM that addresses the documentation and representation of text-related cultural heritage objects. It provides a structured framework for describing various types of textual objects, including inscriptions, papyri, manuscripts, and other ancient texts. By standardising how such objects are documented and described, it aims to make them easier to organize and retrieve, and it allows cultural heritage institutions, libraries, and research projects to manage and share textual resources effectively. It also enhances the interoperability of text-related data, so that textual information from diverse sources can be integrated to support scholarly research, textual analysis, and the long-term preservation of textual cultural entities.

The Text Encoding Initiative (TEI) Guidelines make recommendations about suitable ways of representing those features of textual resources that need to be identified explicitly in order to facilitate processing by computer programs. They specify a set of XML tags for marking up textual metadata, text structure, the relationship between images and transcriptions, and other features of interest. They therefore primarily define a data format, but the TEI Header in particular includes a native set of metadata and may also include metadata from other schemas.

Over decades of community-driven development, the TEI Guidelines have become a de facto standard for producing textual data in the humanities. Since the release of version P5, two to four new revisions have been released each year.
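The split between the TEI Header (metadata) and the text body (structure) can be illustrated with a minimal sketch. The sample document, the constant names and the function `summarize_tei` are invented for this illustration and are not part of the Guidelines; only the TEI namespace and the element names are taken from TEI P5. The sketch uses Python's standard library XML parser and makes no attempt to validate the document against a TEI schema.

```python
import xml.etree.ElementTree as ET

# Namespace used by TEI P5 documents.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical sample document, invented for this illustration.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample letter</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </body>
  </text>
</TEI>"""


def summarize_tei(xml_string):
    """Return the header title and the body paragraphs of a TEI document."""
    root = ET.fromstring(xml_string)
    # Metadata lives in the header: teiHeader/fileDesc/titleStmt/title.
    title = root.findtext(
        "tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:title",
        namespaces=TEI_NS,
    )
    # The transcription lives in text/body; collect each paragraph's text.
    paragraphs = [
        "".join(p.itertext()).strip()
        for p in root.findall("tei:text/tei:body/tei:p", TEI_NS)
    ]
    return title, paragraphs


if __name__ == "__main__":
    title, paragraphs = summarize_tei(SAMPLE_TEI)
    print(title)
    for paragraph in paragraphs:
        print("-", paragraph)
```

Real TEI documents are usually far richer than this sample, so production code would more likely rely on a schema-aware toolchain than on hand-written paths.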

TEI/EPIDOC is a collaborative effort that combines the expertise of EpiDoc and the Text Encoding Initiative (TEI). It establishes standardised guidelines and tools for encoding scholarly and educational editions of ancient documents, including inscriptions, papyri, manuscripts, and other text-bearing objects. By using a subset of the TEI standard, TEI/EPIDOC represents texts in digital form while also recording the historical context and materiality of the objects that carry them. Scholars can therefore publish digital editions that cover not only the transcription and editorial treatment of the texts but also the objects themselves. As a result, TEI/EPIDOC supports the study of ancient civilizations and the dissemination of knowledge about their tangible heritage. TEI/EPIDOC is currently employed by the EAGLE Project and Epigraphy.info.
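As a rough illustration of how such an encoding records editorial treatment alongside the transcription, the following sketch renders a tiny edition fragment as plain text, putting editorially restored text in square brackets in the manner of the Leiden conventions. The fragment and the function names are invented for this example. The sketch handles only line breaks (`lb`) and lost-but-restored text (`supplied` with `reason="lost"`), whereas real EpiDoc processing covers many more elements and is normally done with the project's own stylesheets.

```python
import xml.etree.ElementTree as ET

TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Hypothetical two-line fragment, invented for this illustration.
SAMPLE_EDITION = """<div xmlns="http://www.tei-c.org/ns/1.0" type="edition">
  <ab>
    <lb n="1"/>Dis <supplied reason="lost">Manibus</supplied>
    <lb n="2"/>sacrum
  </ab>
</div>"""


def render(elem):
    """Recursively flatten an element into text with simple Leiden markup."""
    parts = [elem.text or ""]
    for child in elem:
        tag = child.tag.split("}")[-1]  # strip the namespace prefix
        inner = render(child)
        if tag == "supplied" and child.get("reason") == "lost":
            inner = "[" + inner + "]"  # text restored by the editor
        elif tag == "lb":
            inner = "\n"  # a line beginning on the object
        parts.append(inner)
        parts.append(child.tail or "")
    return "".join(parts)


def edition_lines(xml_string):
    """Return the rendered lines of the first <ab> block in the fragment."""
    root = ET.fromstring(xml_string)
    block = root.find(".//tei:ab", TEI_NS)
    raw = render(block)
    # Collapse indentation whitespace and drop empty lines.
    lines = [" ".join(line.split()) for line in raw.split("\n")]
    return [line for line in lines if line]


if __name__ == "__main__":
    for line in edition_lines(SAMPLE_EDITION):
        print(line)
```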