Scope of the Catalog
The focus of the RDA Metadata Standards Catalog is on metadata standards that may be used to document research data, but this simple statement hides a large amount of complexity.
Metadata schemes and data formats
As explained in the RDA Metadata Principles, the distinction between metadata and data depends on the context, since ‘metadata’ really describes a role that a particular set of data plays in a given process. If you were analysing ocean currents, then your data would be observations and measurements, and your metadata would describe how, when and where those observations were taken. But if you were interested in how oceanography had evolved as a field of study over the years, then the information on how and where measurements were taken would form a subset of your data.
Just as the line between data and metadata is blurred, so the line between metadata schemes and data formats can be blurred too. While standalone metadata records are common in some domains and domain-agnostic contexts, other domains have settled on standards that tightly integrate data and metadata, in an effort to make data files self-describing. So as not to disadvantage those communities, the Catalog does admit data formats that have a standardized block of metadata incorporated within them, as well as schemes that define standalone metadata records.
Data formats with no metadata component, such as CSV, are not in scope.
Metadata schemes and vocabularies
One of the easier distinctions to make is between a standardized collection of metadata and a standardized way of representing an individual datum.
For the purposes of the Catalog, a metadata scheme should define a set of information to record about an entity. In Linked Data terms, it should define a set of predicates, where each one represents either
- a property of the entity that has a given literal or complex value;
- a relationship that the entity has with another entity.
There are a number of other standards that specify how the object in a subject–predicate–object triple should be represented. These are not within the scope of this Catalog. The following are examples of standards that are not in scope:
- controlled vocabularies, such as the UNESCO Thesaurus;
- encoding standards, such as ISO 8601 for representing dates and times;
- identifier schemes, such as Handles, InChIs, RRIDs and IGSNs, though where an identifier scheme enforces a metadata scheme for describing the identified entity, that metadata scheme might be in scope.
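The distinction drawn in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ex:` names, the `Triple`, `Literal` and `EntityRef` classes and the sample record are all hypothetical and are not part of the Catalog or of any real scheme. For simplicity, the sketch also treats every non-reference value as a literal and leaves complex values aside.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Literal:
    """A literal value, such as a title string or a date string."""
    value: str


@dataclass(frozen=True)
class EntityRef:
    """A reference to another entity, for example by an identifier."""
    identifier: str


@dataclass(frozen=True)
class Triple:
    """One subject-predicate-object statement about an entity."""
    subject: str
    predicate: str
    obj: Union[Literal, EntityRef]


def predicate_kind(triple: Triple) -> str:
    """Classify a predicate by the kind of object it points to."""
    return "relationship" if isinstance(triple.obj, EntityRef) else "property"


# Hypothetical record for a hypothetical dataset; every name is illustrative.
record = [
    Triple("ex:dataset1", "ex:title", Literal("Ocean current observations")),
    # The date string follows ISO 8601, an encoding standard (not in scope).
    Triple("ex:dataset1", "ex:dateCollected", Literal("2015-06-01")),
    Triple("ex:dataset1", "ex:creator", EntityRef("ex:person42")),
]

# What a metadata scheme defines is the set of predicates, not the rules
# for writing the objects.
scheme_predicates = sorted({t.predicate for t in record})
kinds = {t.predicate: predicate_kind(t) for t in record}
```

In this sketch the in-scope part is `scheme_predicates`, the set of things to record about the dataset. The conventions that govern the object values (a vocabulary for subject terms, ISO 8601 for the date, an identifier scheme for the creator) are separate standards and fall outside the Catalog.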
Metadata standards and ad-hoc metadata schemes
Formal approval of a metadata scheme as a standard is not an absolute guarantee of its quality or widespread use.
Similarly, a metadata scheme can act like a standard in a given domain without having gone through a formal standards approval process.
Because of this, the ‘Standards’ part of ‘Standards Catalog’ should be taken in a liberal and pragmatic sense: a metadata scheme is considered in scope if it has a formal specification that is used independently by at least two organizations, groups or teams, or is a publicly available profile of such a scheme. A scheme that has been designed for the sole use of a single group or application without reference to a wider standard is considered ad hoc and is not in scope.
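As a rough illustration, the rule above can be written as a small predicate. This is a sketch under assumptions, not an official test used by the Catalog: the `Scheme` class, its field names and the sample schemes are hypothetical, and real decisions about scope involve judgement that a boolean check cannot capture.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Scheme:
    """Hypothetical summary of the facts the rule depends on."""
    name: str
    has_formal_specification: bool
    independent_users: int  # organizations, groups or teams using it independently
    publicly_available: bool = True
    profile_of: Optional["Scheme"] = None  # the wider scheme this one profiles, if any


def meets_usage_test(scheme: Scheme) -> bool:
    """Formal specification used independently by at least two parties."""
    return scheme.has_formal_specification and scheme.independent_users >= 2


def is_standard_like(scheme: Scheme) -> bool:
    """Apply the pragmatic sense of 'standard' described in the text."""
    if meets_usage_test(scheme):
        return True
    parent = scheme.profile_of
    # A publicly available profile of a qualifying scheme also counts.
    return parent is not None and scheme.publicly_available and meets_usage_test(parent)


# Illustrative cases; none of these is a real scheme.
base = Scheme("Base scheme", has_formal_specification=True, independent_users=3)
profile = Scheme("Group profile", has_formal_specification=True,
                 independent_users=1, profile_of=base)
ad_hoc = Scheme("Local spreadsheet columns", has_formal_specification=False,
                independent_users=1)
```

Here `base` qualifies on usage, `profile` qualifies only because it is a public profile of `base`, and `ad_hoc` does not qualify. Whether a profile of a profile should count is not stated in the text, so the sketch deliberately checks only one level.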
The Catalog is primarily concerned with metadata about research data. The RDA Metadata Principles recognize that metadata can describe entities other than data while still being considered metadata, so, for instance, metadata about research facilities and instruments can be relevant.
For the purposes of determining scope, schemes in the Catalog should have demonstrable relevance for describing data generated or collected for the purpose of performing research. The scheme may include information about other entities, but if it solely concerns another type of entity without any reference to the research process, then it is out of scope. So, for example, FOAF (a generic standard for describing people) is out of scope, but a metadata profile that borrows FOAF to describe the creator of a dataset would be in scope.