UNC Digital Library Metadata Guidelines

Rev. 0.1: Hugh A. Cayless <hcayless@email.unc.edu>

Introduction: This page provides a text description of the structure and semantics of the UNC Digital Library Services base metadata schema. Metadata objects in this system are expressed in XML, and must validate against a registered XML Schema in order to be processed by the system. The current version of the schema is 1.1, and it is available at http://www.unc.edu/projects/diglib/schemata/DL_D/DC/1_1/DC.xsd. A DTD version may be found at http://www.unc.edu/projects/diglib/schemata/DL_D/DC/1_1/DC.dtd. The schema is based on the Dublin Core Element Set (see http://dublincore.org) and uses qualifying attributes to refine the semantics of the metadata.

Each resource indexed by the Digital Library Services database will have an associated metadata object corresponding to this schema or one of its derivatives. The system uses a one-to-one approach, so surrogates of a larger resource will have their own associated metadata objects, related to the primary version by means of a relation tag. The only exception to the one-to-one rule is thumbnails and similar surrogates, which are used to summarize resources visually.

<object>

The root element for a UNC digital library metadata object.

attributes:

objectid: the unique identifier of this object. The id is assigned by the digital library database.

1. <title> optional, repeatable, text content only

parent: <object>

Title is typically a name by which the object is formally known. Although title is optional according to the Dublin Core standard, in practice little can be done without one, so future schemas will probably make it required.

attributes:

titlequalifier: Denotes the type of title in the title element. The default is 'main'. Other options in version 1.1 are 'short', 'abbreviation', 'alternative', 'release', 'series', 'subtitle', and 'firstline'.

scope: Scope indicates whether a given element describes the original object, a physical surrogate (such as a slide or photograph), or a digital surrogate (such as a scan of that slide). Valid values are 'original', 'surrogate-physical', and 'surrogate-digital'.

language: The language of text in the element. Recommended best practice uses a two-letter Language Code, optionally followed by a two-letter Country Code. For example, en for English, fr for French, or en-GB for English used in the United Kingdom.
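
As an illustration of the attributes above, the following sketch shows a main title and a short title for the same object. The objectid value and the title text are invented for the example, and the sketch assumes the schema is used without a namespace. Only the element names, attribute names, and qualifier values come from this document.

```xml
<!-- Hypothetical object; the objectid format is an assumption. -->
<object objectid="example-0001">
  <!-- titlequalifier defaults to 'main', so it could be omitted here. -->
  <title titlequalifier="main" scope="original" language="en">Letters from a North Carolina Farm, 1861-1865</title>
  <title titlequalifier="short" scope="original" language="en">Farm Letters</title>
</object>
```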

2. <creator> optional, repeatable, text content only

parent: <object>

The entity primarily responsible for making the content of the object (e.g., person, organization, service, etc.).

attributes:

agenttype: The type of agent. Valid values are 'person' (the default), 'organization', 'event', and 'object'.

agentrole: The role the agent played with regard to the resource. The agentrole can be used to specify more detailed information about the activity implied by the element. For example, the creator of a sculpture would be entered in the Creator element, and could be given a role of Sculptor.

scope: see definition in <title>.
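
A sketch of how agenttype, agentrole, and scope might combine: the first creator made the original object, and the second made a physical surrogate of it. The names, the objectid, and the role 'Photographer' are invented for illustration. This document gives only 'Sculptor' as a sample role and does not list the permitted agentrole values.

```xml
<!-- Hypothetical object; names and the 'Photographer' role are assumptions. -->
<object objectid="example-0002">
  <!-- The sculptor of the original work. -->
  <creator agenttype="person" agentrole="Sculptor" scope="original">Doe, Jane</creator>
  <!-- The maker of a slide of that work. -->
  <creator agenttype="organization" agentrole="Photographer" scope="surrogate-physical">Example Studio</creator>
</object>
```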

3. <subject> optional, repeatable, text content only

parent: <object>

Keywords, key phrases or classification codes that describe a topic of the object. Recommended best practice is to select a value from a controlled vocabulary or formal classification scheme.

attributes:

vocabulary: The controlled vocabulary (if there is one) from which the subject terms were drawn, for example LCSH (Library of Congress Subject Headings).

scope: see definition in <title>.
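
For example, a subject drawn from a controlled vocabulary and a free keyword might be recorded as sketched below. The objectid and the subject terms are illustrative, and writing the vocabulary value as 'LCSH' is an assumption based on the example given above.

```xml
<!-- Hypothetical object; subject terms are illustrative. -->
<object objectid="example-0003">
  <!-- Term taken from a controlled vocabulary. -->
  <subject vocabulary="LCSH" scope="original">Sculpture, Greek</subject>
  <!-- Uncontrolled keyword, so no vocabulary attribute. -->
  <subject scope="original">marble relief</subject>
</object>
```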

4. <description> optional, repeatable, text content only

parent: <object>

An account of the content of the object. It could be an abstract, a table of contents, a reference to a graphical representation of the content, or a free-text account. Title and description together are the default summary representation of an object in the digital library interface, so description is the right place to put text that should show up in, for example, search results or browsing.

attributes:

externaldesc: The URL of an external document that describes the resource.

descriptionqualifier: Signals the type of description represented by the element contents. Valid values in version 1.1 are 'Abstract' and 'TOC'.

scope: see definition in <title>.

language: see definition in <title>.
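
The sketch below shows one inline abstract and one description that points to an external document. The objectid, the text, and the example.org URL are placeholders invented for illustration.

```xml
<!-- Hypothetical object; text and URL are placeholders. -->
<object objectid="example-0004">
  <!-- Inline abstract, a candidate for display in search results. -->
  <description descriptionqualifier="Abstract" language="en">A marble relief of a seated figure, photographed in 1998.</description>
  <!-- Description that points to an external document. -->
  <description externaldesc="http://www.example.org/finding-aid.html" language="en">Finding aid for the collection.</description>
</object>
```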

5. <publisher> optional, repeatable, text content only

parent: <object>

The entity responsible for making the object available (e.g., person, organization, service, etc.).

attributes:

agenttype: see definition in <creator>.

agentrole: see definition in <creator>.

scope: see definition in <title>.

6. <contributor> optional, repeatable, text content only

parent: <object>

An entity responsible for making contributions to the content of the object.

attributes:

agenttype: see definition in <creator>.

agentrole: see definition in <creator>.

scope: see definition in <title>.

7. <date> optional, repeatable, text content only

parent: <object>

A date associated with the creation or availability of the object. Recommended best practice follows the YYYY-MM-DD format.

attributes:

datetype: The type of date contained in the Date element. Valid values are "Created", "DataGathered", "Valid", "Issued", "Available", "Accepted", and "Acquired".

dateaccuracy: The accuracy of the date contained in the Date element. Valid values are "exact" and "approximate".

scope: see definition in <title>.
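
A sketch of two dates on one record, one for the original and one for its digital surrogate. The dates and the objectid are invented. The attribute values are taken from the lists above.

```xml
<!-- Hypothetical object; dates are illustrative. -->
<object objectid="example-0005">
  <!-- Approximate creation date of the original, in YYYY-MM-DD form. -->
  <date datetype="Created" dateaccuracy="approximate" scope="original">1862-07-04</date>
  <!-- Exact date on which the digital surrogate was issued. -->
  <date datetype="Issued" dateaccuracy="exact" scope="surrogate-digital">2002-03-15</date>
</object>
```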

8. <type> optional, repeatable, text content only

parent: <object>

Type includes terms describing general categories, functions, genres, or aggregation levels for content. Recommended best practice is to select a value from a controlled vocabulary.

attributes:

scope: see definition in <title>.

language: see definition in <title>.

9. <format> optional, repeatable, text content only

parent: <object>

Format may include the media-type or dimensions of the object. Format may be used to determine the software, hardware or other equipment needed to display or operate the object. Examples of dimensions include size and duration. Recommended best practice is to select a value from a controlled vocabulary (for example, the list of Internet Media Types defining computer media formats).

attributes:

formatscheme: The format scheme indicates how the contents of the format element are to be interpreted. For example, if the format element contains information about the size of an object, formatscheme could indicate the units in which that size is expressed, e.g., pixels or inches. Formatscheme may also be used for media types, e.g., MIME.

dimension: If the format element is being used to indicate the size of the object, dimension indicates which particular dimension the figure refers to, e.g. width, height, file size, etc.

scope: see definition in <title>.
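
As a sketch, the media type and pixel dimensions of a digital image could be recorded in three format elements. The spellings 'MIME', 'pixels', 'width', and 'height' follow the examples in the text above, but the exact values the schema permits are an assumption, as is the objectid.

```xml
<!-- Hypothetical object; attribute value spellings are assumptions. -->
<object objectid="example-0006">
  <!-- Media type, interpreted as an Internet Media Type. -->
  <format formatscheme="MIME" scope="surrogate-digital">image/jpeg</format>
  <!-- Size, with the unit in formatscheme and the axis in dimension. -->
  <format formatscheme="pixels" dimension="width" scope="surrogate-digital">1024</format>
  <format formatscheme="pixels" dimension="height" scope="surrogate-digital">768</format>
</object>
```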

10. <identifier> optional, repeatable, text content only

parent: <object>

An unambiguous reference to the object within UNC Digital Library Services.

attributes:

identifierscheme: The scheme to which the identifier belongs, for example URL or ISBN.

target: Indicates whether the identifier refers to the resource itself, or some surrogate, such as a thumbnail image which can be displayed in place of the object.

scope: see definition in <title>.
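
The sketch below pairs an identifier for the resource itself with one for its thumbnail. This document does not list the permitted target values, so 'resource' and 'thumbnail' are assumptions, as are the example.org URLs and the objectid.

```xml
<!-- Hypothetical object; target values and URLs are assumptions. -->
<object objectid="example-0007">
  <!-- Identifier of the resource itself. -->
  <identifier identifierscheme="URL" target="resource">http://www.example.org/images/0007.jpg</identifier>
  <!-- Identifier of a thumbnail shown in place of the object. -->
  <identifier identifierscheme="URL" target="thumbnail">http://www.example.org/images/0007_thumb.jpg</identifier>
</object>
```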

11. <source> optional, repeatable, text content only

parent: <object>

A reference to an object from which the present object is derived, in whole or in part.

attributes:

scope: see definition in <title>.

language: see definition in <title>.

12. <language> optional, repeatable, text content only

parent: <object>

The language of the intellectual content of the object. Recommended best practice uses a two-letter Language Code, optionally followed by a two-letter Country Code. For example, en for English, fr for French, or en-GB for English used in the United Kingdom.

attributes:

scope: see definition in <title>.

13. <relation> optional, repeatable, text content only

parent: <object>

A reference to a related metadata object.

attributes:

relationtype: The type of relationship being described. Valid values are 'IsPartOf', 'IsVersionOf', 'IsFormatOf', 'References', and 'IsInSubCollection'.

objectid: The digital library objectid of the related metadata object.

scope: see definition in <title>.
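
Because each surrogate has its own metadata object, a scan of a slide might point back to the slide's record as sketched here. Both objectid values and the titles are invented, and using the element's text content as a human-readable label for the related object is an assumption.

```xml
<!-- Hypothetical record for a scan; objectids are invented. -->
<object objectid="example-0009">
  <title>Scan of slide 12</title>
  <!-- The objectid attribute names the related metadata object. -->
  <relation relationtype="IsFormatOf" objectid="example-0008">Slide 12</relation>
</object>
```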

14. <coverage> optional, repeatable, text or element content

parent: <object>

Coverage will typically include spatial location (a place name or geographic coordinates), temporal period (a period label, date, or date range) or jurisdiction (such as a named administrative entity). Recommended best practice is to select a value from a controlled vocabulary and, where appropriate, to use named places or time periods in preference to numeric identifiers such as sets of coordinates or date ranges.

attributes:

coveragetype: indicates what type of coverage is contained by this element. Valid values include: placename, periodname, stylename, time, point, line, and polygon.

coveragescheme: Coveragescheme is an optional attribute that provides information on how the contents of the coverage element are to be interpreted. For example, if the coverage is describing a period name, the coveragescheme would indicate the culture to which that period belongs. A coverage of 'Classical', for example, could have a coveragescheme of 'Greek' or 'Mayan'. If the coverage is a placename, then the coveragescheme might indicate what kind of place is being referred to, e.g. 'country', 'findspot', etc.

coveragedatum: If the coverage uses a coordinate system, this attribute can be used to specify a datum for that coordinate system. Deprecated: future versions will use the coordinate element to supply a datum.

language: see definition in <title>.

scope: see definition in <title>.

a. <coordinate> optional, repeatable, empty

parent: <coverage>

The coordinate element is to be used for numeric coverages of time or space. Coordinate elements are always empty. The coordinate's value should be put in the coordinateval attribute.

attributes:

coordinateorder: the order in which coordinate elements should be processed (optional).

coordinateaccuracy: indicates whether a given coordinate is exact or an approximation. Valid values are 'exact' (the default) and 'approximate'.

coordinatetype: the type of coordinate and its axis. Valid values are 'x', 'xmin', 'xmax', 'y', 'ymin', 'ymax', 'z', 'zmin', 'zmax', 't', 'tmin', and 'tmax'.

coordinateval: the value of the coordinate.
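
The following sketch combines named coverages with a numeric point expressed through empty coordinate elements. The objectid is invented. Treating x as longitude and y as latitude in decimal degrees, and numbering coordinateorder from 1, are assumptions not stated in this document.

```xml
<!-- Hypothetical object; coordinate conventions are assumptions. -->
<object objectid="example-0010">
  <!-- Named place and period, preferred where appropriate. -->
  <coverage coveragetype="placename" coveragescheme="country" language="en">Greece</coverage>
  <coverage coveragetype="periodname" coveragescheme="Greek" language="en">Classical</coverage>
  <!-- Numeric point: each value goes in coordinateval, never in element text. -->
  <coverage coveragetype="point">
    <coordinate coordinateorder="1" coordinatetype="x" coordinateaccuracy="approximate" coordinateval="23.73"/>
    <coordinate coordinateorder="2" coordinatetype="y" coordinateaccuracy="approximate" coordinateval="37.98"/>
  </coverage>
</object>
```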

15. <rights> optional, repeatable, element content only

parent: <object>

Information about rights held in and over the object.

attributes:

scope: see definition in <title>.

a. <rightsstatement> optional, not repeatable, text content only

parent: <rights>

A statement specifying any copyright or licensing restrictions on use of the object.

b. <rightscontact> optional, repeatable, text content only

parent: <rights>

Information on how to contact the person or organization responsible for this object.

attributes:

contacttype: Describes the content of the rightscontact element. The contact type may be a URL, an email address, or free text.

c. <rightsrealm> required, repeatable, empty

parent: <rights>

This element is used to specify the actual permissions on the object and its descriptive metadata.

attributes:

entityid: the database key of the user or group with the permissions stated in this element.

discover: Can this user or group find the object in a search?

read: Can this user or group read the object's metadata?

write: Can this user or group modify the object itself?

execute: Can this user or group traverse the link to the object? That is, can the user or group retrieve the digital object referred to by this metadata record?

writemetadata: Can this user or group edit this metadata record?
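
A sketch of a complete rights element follows. This document does not say how the permission attributes are encoded, so the 'true' and 'false' values are an assumption, as are the entityid, the contacttype value 'email', the example.org address, and the order of the child elements.

```xml
<!-- Hypothetical object; permission encoding is an assumption. -->
<object objectid="example-0011">
  <rights scope="surrogate-digital">
    <rightsstatement>Copyright held by the donor. Reproduction requires permission.</rightsstatement>
    <rightscontact contacttype="email">archivist@example.org</rightscontact>
    <!-- This user or group may find, read, and retrieve the object, but not modify it or its metadata. -->
    <rightsrealm entityid="42" discover="true" read="true" write="false" execute="true" writemetadata="false"/>
  </rights>
</object>
```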

16. <meta>

parent: <object>

The "meta" element is designed to hold any additional metadata that cannot be mapped adequately to the basic schema. In the basic schema, the meta element does not appear, because there is no extra metadata vocabulary defined in that schema. Collections that require additional metadata will have a schema, defined for them in consultation with the Digital Library Services staff, that extends the basic model. When the diglib schemas are namespace-enabled, the contents of <meta> should be placed in a namespace other than the default diglib namespace.