Table of Contents
Ecological Metadata Language (EML) is a metadata standard developed by the ecology discipline and for the ecology discipline. It is based on prior work done by the Ecological Society of America and associated efforts (Michener et al., 1997, Ecological Applications). EML is implemented as a series of XML document types that can be used in a modular and extensible manner to document ecological data. Each EML module is designed to describe one logical part of the total metadata that should be included with any ecological dataset.
The purpose of EML is to provide the ecological community with an extensible, flexible metadata standard for use in data analysis and archiving that will allow automated machine processing, searching, and retrieval.
The architecture of EML was designed to serve the needs of the ecological community, and has benefitted from previous work in other related metadata languages. EML has adopted the strengths of many of these languages, but also addresses a number of shortcomings that have proved to inhibit the automated processing and integration of dataset resources via their metadata.
The following list represents some of the features of EML:
Modularity: EML was designed as a collection of modules rather than one large standard to facilitate future growth of the language in both breadth and depth. By implementing EML with an extensible architecture, groups may choose which of the core modules are pertinent to describing their data, literature, and software resources. Also, if EML falls short in a particular area, it may be extended by creating a new module that describes the resource (e.g. a detailed soils metadata profile that extends eml-dataset). The intent is to provide a common set of core modules for information exchange, but to allow for future customizations of the language without the need to go through a lengthy 'approval' process.
Detailed Structure: EML strives to balance the tradeoff of 'too much detail' with enough detail to enable advanced services in terms of processing data through the parsing of accompanied metadata. Therefore, a driving question throughout the design was: 'Will this particular piece of information be machine-processed, just human readable, or both?' Information was then broken down into more highly structured elements when the answer involved machine processing.
Compatibility: EML adopts much of its syntax from other metadata standards that have evolved from the expertise of groups in other disciplines. Whenever possible, EML adopted entire trees of information in order to facilitate conversion of EML documents into other metadata languages. EML was designed with the following standards in mind: the Dublin Core Metadata Initiative, the Content Standard for Digital Geospatial Metadata (CSDGM) from the US Geological Survey's Federal Geographic Data Committee (FGDC), the Biological Profile of the CSDGM (from the National Biological Information Infrastructure), the International Organization for Standardization's Geographic Information Standard (ISO 19115), the ISO 8601 Date and Time Standard, the OpenGIS Consortium's Geography Markup Language (GML), the Scientific, Technical, and Medical Markup Language (STMML), and the Extensible Scientific Interchange Language (XSIL).
Strong Typing: EML is implemented in XML Schema, an Extensible Markup Language (XML) vocabulary that defines the rules governing the EML syntax. XML Schema is an internet recommendation from the World Wide Web Consortium (http://www.w3.org), and so a metadata document that is said to comply with the syntax of EML will structurally meet the criteria defined in the XML Schema documents for EML. Over and above the structure (what elements can be nested within others, how many, etc.), XML Schema provides the ability to use strong data typing within elements. This allows for finer validation of the contents of the element, not just its structure. For instance, an element may be of type 'date', and so the value that is inserted in the field will be checked against XML Schema's definition of a date. Traditionally, XML documents have been validated against Document Type Definitions (DTDs), which do not provide a means to employ strong validation on field values through typing.
There is a distinction between the content model (i.e. the concepts behind the structure of a document: which fields go where, how many, etc.) and the syntactic implementation of that model (the technology used to express the concepts defined in the content model). The normative sections below define the content model and the XML Schema documents distributed with EML define the syntactic implementation. For the foreseeable future, XML Schema will be the syntactic specification, although it may change later.
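The difference between structural and typed validation can be sketched in a few lines of Python. This is only an illustration, not an XML Schema validator: the element name pubDate and the helper is_valid_date are hypothetical, and the check is a simplification of the XML Schema date type (it ignores time zone suffixes and negative years).

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

# Simplified lexical form of a date: YYYY-MM-DD (an assumption for this sketch).
_DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_valid_date(text):
    """Return True if text looks like YYYY-MM-DD and names a real calendar day."""
    if text is None or not _DATE_PATTERN.fullmatch(text):
        return False
    year, month, day = (int(part) for part in text.split("-"))
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


# Both fragments are well-formed and structurally identical;
# only the typed check on the content tells them apart.
good = ET.fromstring("<pubDate>2002-10-31</pubDate>")
bad = ET.fromstring("<pubDate>2002-13-45</pubDate>")

print(is_valid_date(good.text))
print(is_valid_date(bad.text))
```

A DTD-style check would accept both fragments, since each is a single element containing text; the typed check rejects the second because month 13 does not exist.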
The following section briefly describes each EML module and how they are logically designed in order to document ecological resources. Some of the modules are dependent on others, while others may be used as stand-alone descriptions. This section describes the modules from the "top down", starting from the top-level eml wrapper module, followed by modules of increasing detail. However, there are modules that may be used at many levels, such as eml-access. These modules are described when it is appropriate.
The eml module is a wrapper container that allows the inclusion of any metadata content in a single EML document. The eml module is used as a container to hold structured descriptions of ecological resources. In EML, the definition of a resource comes from the Dublin Core Metadata Initiative, which describes a general element set used to describe "networked digital resources". The top-level structure of EML has been designed to be compatible with the Dublin Core syntax. In general, dataset resources, literature resources, software resources, and protocol resources comprise the list of information that may be described in EML. EML is largely designed to describe digital resources; however, it may also be used to describe non-digital resources such as paper maps and other non-digital media. In EML, the definition of a "Data Package" is the combination of both the data and metadata for a resource. So, data packages are built by using the <eml> wrapper, which will include all of the metadata, and optionally the data (or references to them). All EML packages must begin with the <eml> tag and end with the </eml> tag.
The eml module may be extended to describe other resources by means of its optional sub-field, <additionalMetadata>. This field is largely reserved for the inclusion of metadata that may be highly discipline specific and not covered in this version of EML, or it may be used to internally extend fields within the EML standard.
The eml-resource module contains general information that describes dataset resources, literature resources, protocol resources, and software resources. Each of the above four types of resources shares a common set of information, but also has information that is unique to that particular resource type. Each resource type uses the eml-resource module to document the information common to all resources, but then extends eml-resource with modules that are specific to that particular resource type. For instance, all resources have creators, titles, and perhaps keywords, but only the dataset resource would have a "data table" within it. Likewise, a literature resource may have an "ISBN" number associated with it, whereas the other resource types would not.
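As a rough sketch of the wrapper idea, the following Python builds a skeletal package with the standard library. Only the <eml> and <additionalMetadata> names come from the text above; the dataset, title, and soilProfile children, the id value, and the text content are hypothetical, and namespaces and any attributes the schema may require are left out.

```python
import xml.etree.ElementTree as ET

# Root wrapper: every package begins with <eml> and ends with </eml>.
eml = ET.Element("eml")

# A resource described inside the wrapper (element names are assumptions).
dataset = ET.SubElement(eml, "dataset", id="ds.1")
ET.SubElement(dataset, "title").text = "Example vegetation survey"

# Discipline-specific metadata that the core modules do not cover.
extra = ET.SubElement(eml, "additionalMetadata")
ET.SubElement(extra, "soilProfile").text = "hypothetical extension content"

document = ET.tostring(eml, encoding="unicode")
print(document)
```

The point of the sketch is the shape: one root wrapper, resource descriptions inside it, and an optional extension area alongside them.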
The eml-resource module is exclusively used by other modules, and is therefore not a stand-alone module.
The following four modules are used to describe separate resources: datasets, literature, software, and protocols. However, note that the dataset module makes use of the other top-level modules by importing them at different levels. For instance, a dataset may have been produced using a particular protocol, and that protocol may come from a protocol document in a library of protocols. Likewise, citations are used throughout the top-level resource modules by importing the literature module.
Many sites may want to develop a library of protocol documents, a library of party (people) documents, and a library of software documents. These may then be used by reference in your dataset documents as reusable content.
The eml-dataset module contains general information that describes dataset resources. It is intended to provide overview information about the dataset: broad information such as the title, abstract, keywords, contacts, maintenance history, purpose, and distribution of the data themselves. The eml-dataset module also imports many other modules that are used to describe the dataset in fine detail. Specifically, it uses the eml-methods module to describe methodology used in collecting or processing the dataset, the eml-project module to describe the overarching research context and experimental design, the eml-access module to define access control rules for the data and metadata, and the eml-entity module to provide detailed information about the logical structure of the dataset. A dataset can be (and often is) composed of a series of data entities (tables) that are linked together by particular integrity constraints.
The eml-dataset module, like other modules, may be "referenced" via the <references> tag. This allows a dataset to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-literature module contains information that describes literature resources. It is intended to provide overview information about the literature citation, including title, abstract, keywords, and contacts. Citation types follow the conventions laid out by EndNote, and there is an attempt to represent a compatible subset of the EndNote citation types. These citation types include: article, book, chapter, edited book, manuscript, report, thesis, conference proceedings, personal communication, map, generic, audio visual, and presentation. The "generic" citation type would be used when one of the other types will not work.
The eml-literature module, like other modules, may be "referenced" via the <references> tag. This allows a citation to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-software module contains general information that describes software resources. This module is intended to fully document software that is needed in order to view a resource (such as a dataset) or to process a dataset. The software module is also imported into the eml-methods module in order to document what software was used to process or perform quality control procedures on a dataset.
The eml-software module, like other modules, may be "referenced" via the <references> tag. This allows a software resource to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-protocol module is used to describe the prescribed procedures that are used in the creation or the subsequent processing of a dataset. Likewise, eml-protocol is used to describe processes that have been used to define or improve the quality of a data file, or to identify potential problems with the data file. Note that the eml-protocol module is intended to be used to document a prescribed procedure, whereas the eml-methods module is used to describe procedures that were actually performed. The distinction is that the term "protocol" is used in the "prescriptive" sense, and the term "method" is used in the "descriptive" sense. This distinction allows managers to build a protocol library of well-known, established protocols (procedures), but also document what procedure was truly performed in relation to the established protocol. The method may have diverged from the protocol purposefully, or perhaps incidentally, but the procedural lineage is still preserved and understandable.
The eml-protocol module, like other modules, may be "referenced" via the <references> tag. This allows a protocol to be described once, and then used as a reference in other locations within the EML document via its ID.
The following six modules are used to qualify the resources being described in more detail. They are used to describe access control rules, distribution of the metadata and data themselves, parties associated with the resource, the geographic, temporal, and taxonomic extents of the resource, the overall research context of the resource, and detailed methodology used for creating the resource. Some of these modules are imported directly into the top-level resource modules, often in many locations in order to limit the scope of the description. For instance, the eml-coverage module may be used for a particular column of a dataset, rather than the entire dataset as a whole.
The eml-access module describes the level of access that is to be granted or denied to a resource or a subset of a resource for a particular user or group of users. A single eml-access document may be used to express access control for many resources, or for a given resource (e.g., a dataset or citation). The eml-access module represents a list of resources to be controlled in the context of a particular authentication system. That is, the authentication system determines the set of principals (users + groups) that can be used, and the membership of users in groups. The rules set in this module will determine the level of access to a resource for the defined users and groups. In EML, there are two mechanisms for including access control information via the eml-access module. 1) Each top-level resource module (eml-dataset, eml-literature, eml-software, and eml-protocol) includes an optional <access> element directly inline in the document. This is used to define access control at the resource level scope. 2) Finer grained access control may be applied to a subset of a resource via the <additionalMetadata> element in the eml module. An access control document may be defined, or referenced, from this location, and the <describes> element is used to point to the subset of the resource that is to be controlled via its "id" attribute. Applications that process EML documents must implement the access control rules from both mechanisms. Note that, although access control may be bound to any element with an "id" attribute, the processing involved may be very costly. For instance, it would not be recommended to apply access control to a column of a data file (eml-attribute), since every read/write operation on that column may not proceed until access is verified.
The eml-access module, like other modules, may be "referenced" via the <references> tag. This allows an access control document to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-physical module defines the structural characteristics of data formats as delivered over the wire or as found in a file system. One physical object (which can be a bytestream or an object in a file system) might contain multiple entities (for example, this would be typical in a MS Access file that contained multiple tables of data). However, it is typically used to describe a file or stream that is in some text-based format such as ASCII or UTF-8, and includes the information needed to parse the data stream to extract the entity and its attributes from the stream. The eml-physical module defines three distinct distribution types: 1) online, defined as either a URL, a "connection" comprised of a connection definition and a parameter list, or just a connection definition; 2) offline, which contains a number of fields to document data distribution on media such as CD-ROM, digital tape, etc.; or 3) inline, where the data are directly included in the eml-physical metadata document, perhaps in an XML syntax itself. The eml-physical module is used in two ways: 1) at the resource level, where one may define a general means of getting to the entire dataset using one of the three options above, and 2) in the eml-entity module, where each entity (e.g., each table in a database) is defined as either a specific URL or a specific connection (which may use the connection definition that is defined at the resource level).
The eml-physical module, like other modules, may be "referenced" via the <references> tag. This allows a physical distribution to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-party module describes a responsible party (person or organization), and is typically used to name the originator of a resource or metadata document. It contains detailed contact information for the party, be it an individual person, an organization, or a named position within an organization. The eml-party module is used throughout the other EML modules where detailed contact information is needed.
The eml-party module, like other modules, may be "referenced" via the <references> tag. This allows a party to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-coverage module contains fields for describing the coverage of a resource in terms of time, space, and taxonomy. These coverages (temporal, spatial, and taxonomic) represent the extent of applicability of the resource in those domains. The geographic coverage section allows for two means of expressing coverage on the surface of the earth: 1) via a set of bounding coordinates that define the North, South, East and West points in a rectangular area, optionally including a bounding altitude, and 2) using a G-Ring polygon definition, where an irregularly shaped area may be defined using an ordered list of latitude/longitude coordinates. A G-Ring may also include an "inner G-Ring" that defines one or more "cut-outs" in the area, i.e. the donut hole concept.
The temporal coverage section allows for the definition of either a single date/time, or a range of dates/times. These date/times may be expressed as a calendar date according to the ISO 8601 Date and Time Specification, or by using an alternate time scale, such as the geologic time scale. In order to express an "ongoing" time frame, the end date in the range would likely use the alternate time scale fields with a value of "ongoing", whereas the begin date would use the specific calendar date fields.
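The bounding-coordinates form of geographic coverage can be illustrated with a small Python sketch. The class and field names here are hypothetical and do not reproduce the EML element names; the sketch also assumes decimal degrees, ignores the optional bounding altitude, and does not handle rectangles that cross the 180 degree meridian.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingCoordinates:
    """Rectangular extent given by its four bounding points, in decimal degrees."""
    west: float
    east: float
    north: float
    south: float

    def contains(self, longitude, latitude):
        """Return True if the point lies inside or on the edge of the rectangle."""
        return (self.west <= longitude <= self.east
                and self.south <= latitude <= self.north)


# Hypothetical study area.
study_area = BoundingCoordinates(west=-120.5, east=-119.5, north=35.0, south=34.0)

print(study_area.contains(-120.0, 34.5))
print(study_area.contains(-118.0, 34.5))
```

A G-Ring would need a point-in-polygon test instead, with inner rings subtracted from the outer area.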
The taxonomic coverage section allows for detailed description of the taxonomic extent of the dataset or resource. The taxonomic classification consists of a recursive set of taxon rank names, their values, and their common names. This construct allows for a taxonomic hierarchy to be built to show the level of identification (e.g. Rank Name = Kingdom, Rank Value = Animalia, Common Name = Animals, and so on down the hierarchy.) The taxonomic coverage module also allows for the definition of the classification system in cases where alternative systems are used.
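The recursive shape of a taxonomic classification can be sketched as a small tree in Python. The class name Taxon, its field names, and the lineages helper are hypothetical, chosen only to mirror the rank name, rank value, and common name described above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Taxon:
    """One level of a classification, holding any lower levels as children."""
    rank_name: str
    rank_value: str
    common_name: Optional[str] = None
    children: List["Taxon"] = field(default_factory=list)


def lineages(taxon, prefix=()):
    """Yield each root-to-leaf path as a tuple of (rank name, rank value) pairs."""
    path = prefix + ((taxon.rank_name, taxon.rank_value),)
    if not taxon.children:
        yield path
    for child in taxon.children:
        yield from lineages(child, path)


animals = Taxon(
    "Kingdom", "Animalia", "Animals",
    children=[Taxon("Phylum", "Chordata", "Chordates")],
)

for lineage in lineages(animals):
    print(" > ".join(f"{rank}={value}" for rank, value in lineage))
```

The depth of each path shows the level of identification: a record identified only to phylum stops there, while a fully identified one continues down to species.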
The eml-coverage module, like other modules, may be "referenced" via the <references> tag. This allows the coverage extent to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-project module describes the research context in which the dataset was created, including descriptions of overall motivations and goals, funding, personnel, description of the study area, etc. This is also the module to describe the design of the project: the scientific questions being asked, the architecture of the design, etc. This module is used to place the dataset that is being documented into its larger research context.
The eml-project module, like other modules, may be "referenced" via the <references> tag. This allows a research project to be described once, and then used as a reference in other locations within the EML document via its ID.
The eml-methods module describes the methods followed in the creation of the dataset, including description of field, laboratory and processing steps, sampling methods and units, and quality control procedures. The eml-methods module is used to describe the actual procedures that are used in the creation or the subsequent processing of a dataset. Likewise, eml-methods is used to describe processes that have been used to define or improve the quality of a data file, or to identify potential problems with the data file. Note that the eml-protocol module is intended to be used to document a prescribed procedure, whereas the eml-methods module is used to describe procedures that were actually performed. The distinction is that the term "protocol" is used in the "prescriptive" sense, and the term "method" is used in the "descriptive" sense. This distinction allows managers to build a protocol library of well-known, established protocols (procedures), but also document what procedure was truly performed in relation to the established protocol. The method may have diverged from the protocol purposefully, or perhaps incidentally, but the procedural lineage is still preserved and understandable.
The eml-methods module, like other modules, may be "referenced" via the <references> tag. This allows a method to be described once, and then used as a reference in other locations within the EML document via its ID.
note schema specific modules
The eml-entity module defines the logical characteristics of every entity in the dataset. Entities are usually tables of data with a fixed logical structure, but could also be other types of data such as raster or vector image data. The "table-entity" element is used to describe all table entities, and the "other-entity" element would be used to describe all other types of entities. Specific details of non-tabular entities are left to other standards (e.g., for remote sensing data, one should include a spatial metadata module such as the FGDC Content Standard for Digital Geospatial Metadata [CSDGM]).
The eml-attribute module describes all attributes (known in various disciplines as variables, fields, columns, etc.) in a data entity (e.g., data table). The description includes the name and definition of each attribute, its type, its allowable range (if numeric), definitions of coded values, and other pertinent information.
The eml-constraint schema defines the integrity constraints between entities (e.g., data tables) as would be maintained in a relational database. These constraints include primary key constraints, foreign key constraints, unique key constraints, check constraints, and not null constraints, among potential others.
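To make two of these constraint types concrete, the following Python sketch checks a primary key and a foreign key across two small in-memory tables. It does not read or write EML; the table contents, column names, and helper functions are all hypothetical.

```python
def primary_key_violations(rows, key):
    """Return key values that occur more than once (one entry per repeat)."""
    seen = set()
    repeats = []
    for row in rows:
        value = row[key]
        if value in seen:
            repeats.append(value)
        seen.add(value)
    return repeats


def foreign_key_violations(child_rows, child_key, parent_rows, parent_key):
    """Return child key values that have no matching row in the parent table."""
    parent_values = {row[parent_key] for row in parent_rows}
    return [row[child_key] for row in child_rows
            if row[child_key] not in parent_values]


# Hypothetical entities: a table of sites and a table of observations.
sites = [{"site_id": "A"}, {"site_id": "B"}]
observations = [
    {"obs_id": 1, "site_id": "A"},
    {"obs_id": 2, "site_id": "C"},  # refers to a site that is not defined
    {"obs_id": 2, "site_id": "B"},  # repeats the primary key value 2
]

print(primary_key_violations(observations, "obs_id"))
print(foreign_key_violations(observations, "site_id", sites, "site_id"))
```

Documenting such constraints in the metadata lets a consumer run checks like these before joining the entities.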
The following six modules are used to describe a number of common types of entities found in datasets. Each entity type uses the eml-entity module elements as its base set of elements, but then extends the base with entity-specific elements. Note that the eml-spatialReference module is not an entity type, but is rather a common set of elements used to describe spatial reference systems in both eml-spatialRaster and eml-spatialVector. It is described here in relation to those two modules.
The eml-dataTable module is used to describe the logical characteristics of each tabular set of information in a dataset. A series of comma-separated text files may be considered a dataset, and each file would subsequently be considered a dataTable entity within the dataset. Since the eml-dataTable module extends the eml-entity module, it uses all of the common entity elements to describe the table, along with a few elements specific to just data table entities. The eml-dataTable module allows for the description of each attribute (column/field/variable) within the data table through the use of the eml-attribute module. Likewise, there are fields used to describe the physical distribution of the data table, its overall coverage, the methodology used in creating the data, and other logical structure information such as its orientation, case sensitivity, etc.
Note on eml utility schemas
The eml-text module is a wrapper container that allows general text descriptions to be used within the various modules of EML. It can include either structured or unstructured text blocks. It isn't really appropriate to use this module outside of the context of a parent module, because the parent module determines the appropriate context to which this text description applies. The eml-text module allows one to provide structure to a text description in order to convey concepts such as sections (paragraphs), hierarchy (ordered and unordered lists), emphasis (bold, superscript, subscript), etc.
The multiple modules in EML all depend on each other in complex ways. To easily see these dependencies see the EML Dependency Chart.
This section explains the rules of EML. There are some rules that cannot be written directly into the XML Schemas nor enforced by an XML parser. These are guidelines that every EML package must follow in order for it to be considered EML compliant.
Each EML module, with the exception of "eml" itself, has a top-level choice between the structured content of that module or a "references" field. This enables the reuse of content previously defined elsewhere in the document. Methods for defining and referencing content are described in the next section.
EML allows the reuse of previously defined structured content (DOM sub-trees) through the use of ID/IDREF type references. In order for an EML package to remain cohesive and to allow for the cross-platform compatibility of packages, the following rules with respect to packaging must be followed.
IDs are required on all modules that extend resource.
IDs are optional on all other modules.
If an ID is not provided, that content must be interpreted as representing a distinct object.
If an ID is provided for content then that content is distinct from all other content except for that content that references its ID.
If a user wants to reuse content to indicate the repetition of an object, a reference must be used. Two identical IDs cannot appear in a document.
"Local scope" is defined as identifiers unique only to a single instance document (if a document does not have a system or if scope is set to 'local' then all ids are defined as distinct content).
System scope is defined as identifiers unique to an entire data management system (if two documents share a system string, then any IDs in those two documents that are identical refer to the same object).
If an element references another element, it must not have an ID.
All EML packages must have the 'eml' module as the root.
The system and scope attributes are always optional, except on the 'eml' module, where the scope attribute is fixed as 'system'. The scope attribute defaults to 'local' for all other modules.
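Because an XML parser cannot enforce several of these rules, an application has to check them itself. The Python sketch below covers three of them for a single document: no duplicate IDs, no ID on an element that holds a reference, and every reference resolving to a defined ID. The function name, the sample element names (dataset, creator, contact, surName), and the treatment of all IDs as local to the document are assumptions; system scope is not handled.

```python
import xml.etree.ElementTree as ET


def check_id_rules(root):
    """Return a list of human-readable problems found in one document."""
    problems = []
    ids = {}
    for element in root.iter():
        element_id = element.get("id")
        if element_id is None:
            continue
        if element_id in ids:
            problems.append(f"duplicate id: {element_id}")
        ids[element_id] = element
        # An element that references other content must not carry its own ID.
        if element.find("references") is not None:
            problems.append(f"element with id {element_id} also holds a reference")
    for reference in root.iter("references"):
        target = (reference.text or "").strip()
        if target not in ids:
            problems.append(f"unresolved reference: {target}")
    return problems


VALID = """
<eml>
  <dataset id="ds.1">
    <creator id="person.1"><surName>Smith</surName></creator>
    <contact><references>person.1</references></contact>
  </dataset>
</eml>
"""

BROKEN = """
<eml>
  <dataset id="a">
    <creator id="a"/>
    <contact id="c"><references>missing</references></contact>
  </dataset>
</eml>
"""

print(check_id_rules(ET.fromstring(VALID)))
print(check_id_rules(ET.fromstring(BROKEN)))
```

In the valid sample the contact reuses the creator by reference rather than repeating the content, which is how the rules above express that both roles are filled by the same object.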
Recommended Usage: all datasets
Stand-alone: yes
Imports:
eml-documentation, eml-dataset, eml-literature, eml-software, eml-protocol, eml-resource
Imported By:
The eml module is a wrapper container that allows inclusion of any metadata content in a single EML document. In general, dataset resources, literature resources, and software resources, or another type that extends eml-resource are described using an eml document. The eml document represents a "package" that can contain both metadata and data. It can optionally include non-EML metadata through the flexibility of the "additionalMetadata" element. Any additional metadata that is provided can provide a pointer into the EML metadata indicating what the context of the additional metadata is. For example, a spatial raster image might be described in EML, and an FGDC CSDGM metadata document could be included in the additionalMetadata element with a pointer to the EML spatialRaster element to indicate that the FGDC metadata is providing supplemental documentation about that particular image entity.
Recommended Usage: all data where controlling user access to the dataset is an issue
Stand-alone: yes
Imports:
eml-documentation, eml-resource
Imported By:
eml-dataset, eml-literature, eml-protocol, eml-software
The EML Access Module describes the level of access that is to be granted or denied to a resource for a particular user or group of users. A single eml-access document may be used to express access control for many resources, or for a given resource (e.g., a dataset or document). The relationship between a resource and its access control document is defined in the eml-resource module. The EML Access Module represents a list of resources to be controlled in the context of a particular authentication system. That is, the authentication system determines the set of principals (users + groups) that can be used, and the membership of users in groups. The rules set in this module will determine the level of access to a resource for the defined users and groups.
Recommended Usage: any dataset that uses dataTable, spatialRaster, spatialVector, storedProcedure, view or otherEntity or in a custom module where one wants to document an attribute (variable)
Stand-alone: yes
Imports:
eml-documentation, eml-methods, eml-coverage, eml-literature, eml-resource
Imported By:
eml-entity, eml-storedProcedure, eml-view
The EML Attribute Module describes all attributes (variables) in a data entity (dataTable, spatialRaster, spatialVector, storedProcedure, view, or otherEntity). The description includes the name and definition of each attribute, its type, its allowable range (if numeric), definitions of coded values, and other pertinent information. Two structures exist in this module: 1) attribute, used to define a single attribute; and 2) attributeList, used to define a discrete list of attributes that go together in some logical way.
Recommended Usage: All datasets where there are logical constraints between entities
Stand-alone: no
Imports:
eml-documentation, eml-resource
Imported By:
eml-entity, eml-storedProcedure, eml-view
The eml-constraint schema defines the integrity constraints between entities (e.g., dataTables) as would be maintained in a relational database. These constraints include primary key constraints, foreign key constraints, unique key constraints, check constraints, and not null constraints, among potential others.
Recommended Usage: all datasets where spatial, temporal or taxonomic coverage is important
Stand-alone: no
Imports:
eml-literature, eml-documentation, eml-party, eml-resource
Imported By:
eml-attribute, eml-entity, eml-literature, eml-methods, eml-project, eml-resource, eml-spatialRaster, eml-storedProcedure, eml-view
The eml-coverage module contains fields for describing the coverage of a resource in terms of time, space, and taxonomy. These coverages (temporal, spatial, and taxonomic) represent the extent of applicability of the resource in those domains.
Recommended Usage: all datasets
Stand-alone: yes
Imports:
eml-documentation, eml-resource, eml-party, eml-access, eml-entity, eml-dataTable, eml-project, eml-methods, eml-spatialRaster, eml-spatialVector, eml-storedProcedure, eml-text, eml-view
Imported By:
eml, eml-methods
The eml-dataset module contains general information that describes dataset resources. It is intended to provide overview information about the dataset, including title, abstract, keywords, contacts, and the links to associated metadata for the given resource. It also describes the temporal, geographic, and taxonomic coverage of the overall dataset. A dataset can be (and often is) composed of a series of data entities (tables) that are linked together by particular integrity constraints.
Recommended Usage: all datasets with one or more data tables.
Stand-alone: yes
Imports:
eml-documentation, eml-resource, eml-entity
Imported By:
eml-dataset
The EML dataTable Module defines the logical characteristics of every dataTable in the dataset. Each attribute of the table is described in eml-attribute. This module is for basic table information only.
Recommended Usage: all datasets
Stand-alone: only when 'otherEntity' is used
Imports:
eml-documentation, eml-coverage, eml-physical, eml-methods, eml-attribute, eml-resource, eml-constraint, eml-text
Imported By:
eml-dataset, eml-dataTable, eml-spatialRaster, eml-spatialVector, eml-storedProcedure, eml-view
The EML Entity Module defines the logical characteristics of every entity in the dataset. Entities are usually tables of data with a fixed logical structure, but could also be other types of data such as raster or vector image data. The "table-entity" element is used to describe all table entities, and the "other-entity" element would be used to describe all other types of entities. Specific details of non-tabular entities are left to other standards (e.g., for remote sensing data, one should include a spatial metadata module such as the FGDC Content Standard for Digital Geospatial Metadata [CSDGM]).
Recommended Usage: All datasets with literature citations
Stand-alone: yes
Imports:
eml-documentation, eml-resource, eml-coverage, eml-party, eml-access, eml-project
Imported By:
eml, eml-attribute, eml-coverage, eml-methods, eml-physical, eml-project
The eml-literature module contains information that describes literature resources. It is intended to provide overview information about the literature citation, including title, abstract, keywords, and contacts. Citation types follow the conventions laid out by EndNote, and there is an attempt to represent a compatible subset of the EndNote citation types.
Recommended Usage: All datasets
Stand-alone: no
Imports:
eml-documentation, eml-dataset, eml-resource, eml-software, eml-protocol, eml-party, eml-text, eml-coverage, eml-literature
Imported By:
eml-attribute, eml-dataset, eml-entity, eml-protocol
The EML Methods Module describes the methods followed in the creation of the dataset, including descriptions of field, laboratory, and processing steps, sampling methods and units, and quality control procedures.
Recommended Usage: all datasets
Stand-alone: yes
Imports:
eml-documentation, eml-resource
Imported By:
eml-coverage, eml-dataset, eml-literature, eml-methods, eml-project, eml-resource
The eml-party module describes a responsible party (person or organization), and is typically used to name the originator of a resource or metadata document. It contains detailed contact information for the party, be it an individual person, an organization, or a named position within an organization.
Recommended Usage: All packages that contain data
Stand-alone: yes
Imports:
eml-documentation, eml-literature, eml-resource
Imported By:
eml-entity, eml-storedProcedure, eml-view
The eml-physical Module defines the structural characteristics of data formats as delivered over the wire or as found in a file system. One physical object (which can be a bytestream or an object in a file system) might contain multiple entities (for example, this would be typical in an MS Access file that contained multiple tables of data). However, the module is typically used to describe a file or stream that is in some text-based format such as ASCII or UTF-8, and includes the information needed to parse the data stream to extract the entity and its attributes from the stream.
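As an illustration of how physical metadata supports parsing, the sketch below reads a delimited text stream using a small format description. The `TextFormat` class, its field names, and the `parse_stream` function are hypothetical stand-ins for whatever a real tool would read from an eml-physical description; quoting and escaping are deliberately not handled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TextFormat:
    """Hypothetical subset of the information needed to parse a text stream."""
    header_lines: int       # number of leading records to skip
    record_delimiter: str   # separates records (rows)
    field_delimiter: str    # separates fields (attribute values)


def parse_stream(raw: bytes, encoding: str, fmt: TextFormat,
                 attribute_names: List[str]) -> List[Dict[str, str]]:
    """Decode the stream and split it into one dictionary per data record."""
    text = raw.decode(encoding)
    records = text.split(fmt.record_delimiter)
    if records and records[-1] == "":
        records.pop()  # drop the empty piece after a trailing delimiter
    rows = []
    for number, record in enumerate(records[fmt.header_lines:], start=1):
        values = record.split(fmt.field_delimiter)
        if len(values) != len(attribute_names):
            raise ValueError(
                f"record {number}: expected {len(attribute_names)} fields, "
                f"got {len(values)}"
            )
        rows.append(dict(zip(attribute_names, values)))
    return rows


# Made-up stream and format, for illustration only.
raw = b"site,count\nA,4\nB,7\n"
fmt = TextFormat(header_lines=1, record_delimiter="\n", field_delimiter=",")
rows = parse_stream(raw, "utf-8", fmt, ["site", "count"])
```

The attribute names are passed in separately because, in this sketch, they are assumed to come from the attribute descriptions rather than from the header line of the file.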
Recommended Usage: All datasets
Stand-alone: no
Imports:
eml-documentation, eml-resource, eml-party, eml-coverage, eml-literature, eml-text
Imported By:
eml-dataset, eml-literature, eml-software
The EML Project Module describes the research context in which the dataset was created, including descriptions of overall motivations and goals, funding, personnel, and the study area. This is also the module in which to describe overall sampling methods for the project.
Recommended Usage: all datasets
Stand-alone: yes
Imports:
eml-resource, eml-methods, eml-documentation, eml-access
Imported By:
eml, eml-methods, eml-storedProcedure, eml-view
The EML Protocol Module is used to describe methods and to identify the processes that have been used to define or improve the quality of a data file. It is also used to identify potential problems with the data file.
Recommended Usage: all datasets
Stand-alone: no
Imports:
eml-documentation, eml-party, eml-coverage, eml-text
Imported By:
eml, eml-access, eml-attribute, eml-constraint, eml-coverage, eml-dataset, eml-dataTable, eml-entity, eml-literature, eml-methods, eml-party, eml-physical, eml-project, eml-protocol, eml-software, eml-spatialRaster, eml-spatialReference, eml-spatialVector, eml-storedProcedure, eml-view
The eml-resource module contains general information that describes dataset resources, literature resources, collection resources, and software resources. It is intended to provide overview information about the resource, including title, abstract, keywords, contacts, and the links to associated metadata and data for the given resource.
Recommended Usage: All datasets where software was used in the analysis or creation of the dataset.
Stand-alone: yes
Imports:
eml-documentation, eml-resource, eml-access, eml-project
Imported By:
eml, eml-methods
The eml-software module contains general information that describes software resources.
Recommended Usage: all spatial datasets that use spatial gridded data
Stand-alone: yes
Imports:
eml-documentation, eml-spatialReference, eml-coverage, eml-entity, eml-resource
Imported By:
eml-dataset
The eml-spatialRaster module allows for the description of entities composed of rectangular grids of data values that are usually georeferenced to a portion of the earth's surface. Specific attributes of a spatial raster can be documented here including the spatial organization of the raster cells, the cell data values, and if derived via imaging sensors, characteristics about the image and its individual bands.
Recommended Usage: all entities that are georeferenced
Stand-alone: yes
Imports:
eml-resource
Imported By:
eml-spatialRaster, eml-spatialVector
The eml-spatialReference module allows for the description of the relationship between locational references made in the dataset and the real world. A spatial referencing system typically consists of the identification of a datum, a spheroid, and a set of parameters used in the mathematical projection from geographic to planar coordinates.
Recommended Usage: all spatial datasets that contain spatial data entities represented as vector features.
Stand-alone: yes
Imports:
eml-documentation, eml-spatialReference, eml-entity, eml-resource
Imported By:
eml-dataset
The eml-spatialVector module allows for the description of spatial vectors in a GIS. Specific attributes of a spatial vector can be documented here, including the vector's geometry type, count, and topology level.
Recommended Usage: all datasets that use storedProcedures to retrieve archived data
Stand-alone: yes
Imports:
eml-entity, eml-documentation, eml-attribute, eml-protocol, eml-physical, eml-coverage, eml-resource, eml-constraint
Imported By:
eml-dataset
The StoredProcedure module is meant to capture information on procedures that produce data output in the form of a data matrix. It allows the optional description of any parameters that are expected to be passed to the procedure when it is called.
Recommended Usage: any module
Stand-alone: no
Imports:
eml-documentation
Imported By:
eml-dataset, eml-entity, eml-methods, eml-project, eml-resource
The eml-text module is a wrapper container that allows general text descriptions to be used within the various modules of EML. It can include either structured or unstructured text blocks. It is not appropriate to use this module outside of the context of a parent module, because the parent module determines the context to which the text description applies.
Recommended Usage: all datasets that contain one or more views
Stand-alone: yes
Imports:
eml-entity, eml-documentation, eml-attribute, eml-protocol, eml-physical, eml-coverage, eml-resource, eml-constraint
Imported By:
eml-dataset
The EML View module describes a view from a database management system. A view is a query statement that is stored as a database object and executed each time the view is called.
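The behaviour described above, a stored query that is executed each time the view is called, can be demonstrated with a small example. This sketch uses Python's standard `sqlite3` module with an in-memory database; the table, view, and column names are made up for illustration and have no connection to the eml-view schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A base table with a few made-up observations.
conn.execute("CREATE TABLE observation (site TEXT, n INTEGER)")
conn.executemany(
    "INSERT INTO observation VALUES (?, ?)",
    [("A", 4), ("A", 6), ("B", 7)],
)

# The view stores the query text, not the query result.
conn.execute(
    "CREATE VIEW site_total AS "
    "SELECT site, SUM(n) AS total FROM observation GROUP BY site"
)


def read_view():
    """Run the stored query by selecting from the view."""
    return conn.execute(
        "SELECT site, total FROM site_total ORDER BY site"
    ).fetchall()


before = read_view()
conn.execute("INSERT INTO observation VALUES ('B', 3)")
after = read_view()  # reflects the new row without redefining the view
```

Because the view is re-evaluated on each call, the second read includes the newly inserted row. This is why metadata for a view needs to record the query definition as well as the structure of the output.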