Module Documentation: eml-physical
The eml-physical module defines the structural characteristics of data formats as delivered over the wire or as found in a file system. One physical object (which can be a bytestream or an object in a file system) might contain multiple entities; this would be typical, for example, of an MS Access file that contains multiple tables of data. However, the module is typically used to describe a file or stream in a text-based format such as ASCII or UTF-8, and it includes the information needed to parse the data stream and extract the entity and its attributes.

Element Definitions:

physical
Content of this field: Description of this field:
Type: PhysicalType
Attributes: Required?: Default Value:

Description:
Physical structure of an entity or entities. This generally is a detailed description of a text representation that shows how the columns and rows of a table are represented, or simply the name of a well-known binary or proprietary format (e.g., Microsoft Excel 2000).
Example:
dataObject
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
objectName | Optional | Multiple Times
size | Optional | Multiple Times
authentication | Optional | Multiple Times
compressionMethod | Optional | Multiple Times
encodingMethod | Optional | Multiple Times
characterEncoding | Optional | Multiple Times
)
Attributes: Required?: Default Value:

Description:
The dataObject element is a container for several items that identify the physical object described by this metadata document.
Example:
objectName
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:
size
Content of this field: Description of this field:
Elements: Required?: How many:
Attributes: Required?: Default Value:

Description:
This element contains information about the physical size of the entity, typically in bytes.
Example:
<size unit="bytes">13</size>
authentication
Content of this field: Description of this field:
Elements: Required?: How many:
Attributes: Required?: Default Value:

Description:
This element describes authentication procedures or techniques, typically by giving a checksum method (e.g., MD5) and checksum value for the bytestream.
Example:
<authentication method="MD5">f5b2177ea03aea73de12da81f896fe40</authentication>
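A consumer of this metadata might check a retrieved bytestream against the recorded checksum. The following is a minimal Python sketch, assuming the checksum is stored as a hexadecimal string and that the method name maps onto an algorithm known to Python's hashlib; the function name verify_checksum is hypothetical and not part of EML.

```python
import hashlib


def verify_checksum(data: bytes, expected_hex: str, method: str = "MD5") -> bool:
    """Return True if the digest of `data` matches `expected_hex`."""
    # Assumption: the method value (free text in EML) names an algorithm
    # that hashlib recognizes once lowercased, e.g. "MD5" -> "md5".
    digest = hashlib.new(method.lower(), data).hexdigest()
    # The checksum text in a metadata file may carry stray whitespace.
    return digest == expected_hex.strip().lower()
```

Methods that hashlib does not know (for example a CRC variant) would need separate handling.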
compressionMethod
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
This element describes any compression methods used to compress the entity, such as zip, compress, etc.
Example:
encodingMethod
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
This element describes the method used to encode the entity, such as MIME base64 encoding or binhex encoding.
Example:
characterEncoding
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
This element contains the name of the character encoding. This is typically ASCII or UTF-8, or one of the other common encodings.
Example:
<characterEncoding>UTF-8</characterEncoding>
dataFormat
Content of this field: Description of this field:
Elements: | Required?: | How many:
A choice of (
formatType | Optional | Multiple Times
OR
asciiDelimited | Optional | Multiple Times
OR
asciiFixed | Optional | Multiple Times
OR
binaryRasterInfo | Optional | Multiple Times
)
Attributes: Required?: Default Value:

Description:
This element contains the name of the file's format. The file's format is typically ASCII, Unicode, or some well-known binary format (e.g., Microsoft Excel 2000). It is recommended to include a complete MIME type here, such as image/jpeg or text/xml. Note that this is the format of the physical file itself.
Example:
<dataFormat>ASCII</dataFormat>
formatType
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
formatName | Optional | Multiple Times
formatVersion | Optional | Multiple Times
citation | Optional | Multiple Times
)
Attributes: Required?: Default Value:
formatName
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:
formatVersion
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:
citation
Content of this field: Description of this field:
Type: lit:CitationType
Attributes: Required?: Default Value:
asciiDelimited
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
fieldDelimiter | Optional | Multiple Times
numHeaderLines | Optional | Multiple Times
maxRecordLength | Optional | Multiple Times
quoteCharacter | Optional | Multiple Times
recordDelimiter | Optional | Multiple Times
literalCharacter | Optional | Multiple Times
)
Attributes: Required?: Default Value:
fieldDelimiter
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
Fields (attributes) in a variable-width format can vary in length, so the end of each field is marked by a special character called a field delimiter (typically a comma, a tab, or a space). Data sets are generally classified as fixedWidth format or variableWidth format, but this is actually a per-field classification, because fixedWidth fields can be mixed with variableWidth fields in the same data file. In this encoding scheme, each field is assumed to start in the column after the last column of the previous field (or in the first column if it is the first field in the data set), unless the starting column is explicitly given using the "fieldStartColumn" element. The end of each field is indicated either by a delimiter character, given in the fieldDelimiter element, or by a fixed field length, given in the "fieldWidth" element. The delimiter for the last field in the data set can be omitted. fixedWidth fields have a set length, so the end of the field can always be determined by adding the fieldWidth to the starting column number.
Here is an example. Assume we have the following data in a data set:
May,100aaaa,1.2,
April,200aaaa,3.4,
June,300bbbb,4.6,
The metadata indicating the physical layout of the 4 fields would include the following:
<delimiter>,</delimiter>
<fieldWidth>3</fieldWidth>
<fieldWidth>3</fieldWidth>
<delimiter>,</delimiter>
In a strictly fixed format file, the data and metadata would be slightly different:
May100aaaa1.2
Apr200aaaa3.4
Jun300bbbb4.6
<fieldWidth>3</fieldWidth>
<fieldWidth>3</fieldWidth>
<fieldWidth>4</fieldWidth>
<fieldWidth>3</fieldWidth>
or, one could explicitly describe the starting columns:
<fieldStartColumn>1</fieldStartColumn>
<fieldWidth>3</fieldWidth>
<fieldStartColumn>4</fieldStartColumn>
<fieldWidth>3</fieldWidth>
<fieldStartColumn>7</fieldStartColumn>
<fieldWidth>4</fieldWidth>
<fieldStartColumn>11</fieldStartColumn>
<fieldWidth>3</fieldWidth>
Example:
comma, tab, white space, etc.
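The rule that each field starts in the column after the previous field ends can be sketched in Python as follows. This is an illustration only, assuming 1-based column numbers as in the strictly fixed format example above; the function names start_columns and split_fixed are hypothetical.

```python
from itertools import accumulate


def start_columns(widths):
    """1-based start column of each field when fields are packed back to back."""
    # A field starts one column after the previous field ends, so the
    # start columns are the running sums of the preceding widths, plus 1.
    return [1 + offset for offset in accumulate([0] + list(widths[:-1]))]


def split_fixed(line, widths):
    """Cut one record into fields using only the field widths."""
    fields = []
    for start, width in zip(start_columns(widths), widths):
        # Convert the 1-based column number to a 0-based string index.
        fields.append(line[start - 1:start - 1 + width])
    return fields


widths = [3, 3, 4, 3]
print(start_columns(widths))                 # [1, 4, 7, 11]
print(split_fixed("May100aaaa1.2", widths))  # ['May', '100', 'aaaa', '1.2']
```

The computed start columns match the explicit fieldStartColumn values in the example above.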
numHeaderLines
Content of this field: Description of this field:
Type: xs:unsignedLong
Attributes: Required?: Default Value:

Description:
The number of header lines (lines of descriptive information) that precede the data.
Example:
<numHeaderLines>3</numHeaderLines>
maxRecordLength
Content of this field: Description of this field:
Type: xs:unsignedLong
Attributes: Required?: Default Value:
quoteCharacter
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
This element specifies a character to be used in the entity for quoting values so that field delimiters can be used within the value. This basically allows delimiter "escaping". The quoteCharacter is typically a " or '.
Example:
<quoteCharacter>"</quoteCharacter>
recordDelimiter
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
This element specifies the record delimiter character when the format is text. The record delimiter is usually a newline (\n) on UNIX, a carriage return (\r) on MacOS, or both (\r\n) on Windows/DOS. Multiline records are usually delimited with two line ending characters, for example on UNIX it would be two newline characters (\n\n).
Example:
<recordDelimiter>\r\n</recordDelimiter>
literalCharacter
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
This element specifies a character to be used for escaping character values so that the following character is treated as its literal value. This allows "escaping" for special characters like quotes, commas, and spaces when they aren't intended as a delimiter value. The literalCharacter is typically a \.
Example:
<literalCharacter>\</literalCharacter>
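The asciiDelimited settings correspond fairly closely to the options of a typical delimited-text reader. The following Python sketch, using the standard csv module, shows one possible mapping; the function name read_delimited and its parameter names are hypothetical. Note that csv.reader recognizes \r, \n, and \r\n line endings on its own, so recordDelimiter is not configurable in this sketch.

```python
import csv
import io


def read_delimited(text, field_delimiter=",", num_header_lines=0,
                   quote_character='"', literal_character="\\"):
    """Parse delimited text into rows of strings, skipping header lines."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=field_delimiter,     # fieldDelimiter
        quotechar=quote_character,     # quoteCharacter
        escapechar=literal_character,  # literalCharacter
    )
    rows = list(reader)
    # numHeaderLines: drop the leading lines that are not data records.
    return rows[num_header_lines:]


sample = 'month,count\nMay,"1,000"\nJune,2\\,000\n'
print(read_delimited(sample, num_header_lines=1))
# [['May', '1,000'], ['June', '2,000']]
```

Both the quoted value and the escaped value keep their embedded comma, because neither comma is treated as a field delimiter.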
asciiFixed
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
numHeaderLines | Optional | Multiple Times
numPhysicalLines | Optional | Multiple Times
fieldBounds | Optional | Multiple Times
)
Attributes: Required?: Default Value:
numHeaderLines
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:
numPhysicalLines
Content of this field: Description of this field:
Type: xs:unsignedInt
Attributes: Required?: Default Value:

Description:
A single logical data record may be written over several physical lines in a file, with no special marker to indicate the end of a record. In such cases, it is necessary to know the number of lines per record in order to correctly read them.
Example:
3
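Reading such a file amounts to grouping consecutive physical lines into logical records. A minimal Python sketch, assuming every record spans exactly numPhysicalLines lines (the function name group_records is hypothetical):

```python
def group_records(lines, num_physical_lines):
    """Group physical lines into logical records of a fixed number of lines."""
    if num_physical_lines < 1:
        raise ValueError("num_physical_lines must be at least 1")
    if len(lines) % num_physical_lines != 0:
        # Assumption: a trailing partial record indicates a damaged file.
        raise ValueError("file does not contain a whole number of records")
    return [
        lines[i:i + num_physical_lines]
        for i in range(0, len(lines), num_physical_lines)
    ]
```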
fieldBounds
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
lineNumber | Optional | Multiple Times
fieldStartColumn | Optional | Multiple Times
fieldWidth | Optional | Multiple Times
)
Attributes: Required?: Default Value:
lineNumber
Content of this field: Description of this field:
Type: xs:unsignedLong
Attributes: Required?: Default Value:

Description:
A single logical data record may be written over several physical lines in a file, with no special marker to indicate the end of a record. In such cases, the relative location of a data field must be indicated by both relative row and column number.
Example:
3
fieldStartColumn
Content of this field: Description of this field:
Type: xs:unsignedLong
Attributes: Required?: Default Value:

Description:
FixedWidth fields have a set length, thus the end of the field can always be determined by adding the fieldWidth to the starting column number.
Example:
any positive integer, see example in "fieldDelimiter" description
fieldWidth
Content of this field: Description of this field:
Type: xs:unsignedLong
Attributes: Required?: Default Value:

Description:
FixedWidth fields have a set length, thus the end of the field can always be determined by adding the fieldWidth to the starting column number.
Example:
any positive integer, see example in "fieldDelimiter" description
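Taken together, lineNumber, fieldStartColumn, and fieldWidth locate one field inside a multi-line record. The following Python sketch illustrates this, assuming that both line numbers and column numbers are 1-based (the 1-based line numbering is an assumption not stated above); the names FieldBounds and extract_fields are hypothetical.

```python
from typing import NamedTuple


class FieldBounds(NamedTuple):
    line_number: int         # physical line within the record (assumed 1-based)
    field_start_column: int  # first column of the field (1-based)
    field_width: int         # number of characters in the field


def extract_fields(record_lines, bounds):
    """Pull each field out of a logical record spread over several lines."""
    values = []
    for b in bounds:
        line = record_lines[b.line_number - 1]
        start = b.field_start_column - 1  # 0-based string index
        values.append(line[start:start + b.field_width])
    return values


record = ["May100", "aaaa1.2"]
bounds = [
    FieldBounds(1, 1, 3),
    FieldBounds(1, 4, 3),
    FieldBounds(2, 1, 4),
    FieldBounds(2, 5, 3),
]
print(extract_fields(record, bounds))  # ['May', '100', 'aaaa', '1.2']
```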
binaryRasterInfo
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
nrows | Optional | Multiple Times
ncols | Optional | Multiple Times
orientation | Optional | Multiple Times
nbands | Optional | Multiple Times
nbits | Optional | Multiple Times
byteorder | Optional | Multiple Times
layout | Optional | Multiple Times
skipbytes | Optional | Multiple Times
ulxmap | Optional | Multiple Times
ulymap | Optional | Multiple Times
xdim | Optional | Multiple Times
ydim | Optional | Multiple Times
bandrowbytes | Optional | Multiple Times
totalrowbytes | Optional | Multiple Times
bandgapbytes | Optional | Multiple Times
)
Attributes: Required?: Default Value:

Description:
The binaryRasterInfo element is a container for various parameters used to describe the contents of binary raster image files. In this case, it is based on a white paper on the ESRI site that describes the header information used for BIP and BIL files ("Extendable Image Formats for ArcView GIS 3.1 and 3.2").
Example:
nrows
Content of this field: Description of this field:
Type: xs:int
Attributes: Required?: Default Value:

Description:
The number of rows in the image. Rows are parallel to the x-axis of the map coordinate system. There is no default.
Example:
ncols
Content of this field: Description of this field:
Type: xs:int
Attributes: Required?: Default Value:

Description:
The number of columns in the image. Columns are parallel to the y-axis of the map coordinate system. There is no default.
Example:
orientation
Content of this field: Description of this field:
Elements: Required?: How many:
Attributes: | Required?: | Default Value:
columnOrRow | optional

Description:
This element specifies the binary raster entity's record orientation through its "columnOrRow" attribute. The binary raster is column major if the raster is to be displayed column by column from the byte stream, or row major if it is to be displayed row by row from the byte stream.
Example:
The valid attribute values are "columnMajor" or "rowMajor". If the attribute is not specified, "columnMajor" is used.
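The difference between the two orientations can be illustrated with a short Python sketch that arranges a flat sequence of values into a grid. The value spellings columnMajor and rowMajor follow the simple type definition listed at the end of this document; the function name to_grid is hypothetical.

```python
def to_grid(values, nrows, ncols, column_or_row="columnMajor"):
    """Arrange a flat sequence of values into a list of rows."""
    if len(values) != nrows * ncols:
        raise ValueError("number of values does not match nrows * ncols")
    if column_or_row == "rowMajor":
        # The stream holds row 0 first, then row 1, and so on.
        return [list(values[r * ncols:(r + 1) * ncols]) for r in range(nrows)]
    if column_or_row == "columnMajor":
        # The stream holds column 0 first, then column 1, and so on.
        return [[values[c * nrows + r] for c in range(ncols)] for r in range(nrows)]
    raise ValueError("unknown orientation: " + repr(column_or_row))


stream = [1, 2, 3, 4, 5, 6]
print(to_grid(stream, 2, 3, "rowMajor"))     # [[1, 2, 3], [4, 5, 6]]
print(to_grid(stream, 2, 3, "columnMajor"))  # [[1, 3, 5], [2, 4, 6]]
```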
nbands
Content of this field: Description of this field:
Type: xs:int
Attributes: Required?: Default Value:

Description:
The number of spectral bands in the image. The default is 1.
Example:
nbits
Content of this field: Description of this field:
Type: xs:int
Attributes: Required?: Default Value:

Description:
The number of bits per pixel per band. Acceptable values are 1, 4, 8, 16, and 32. The default value is eight bits per pixel per band. For a true color image with three bands (R, G, B) stored using eight bits for each pixel in each band, nbits equals eight and nbands equals three, for a total of twenty-four bits per pixel. For an image with nbits equal to one, nbands must also equal one.
Example:
byteorder
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The byte order in which image pixel values are stored. The byte order is important for sixteen-bit images, with two bytes per pixel. Acceptable values are:
  • I - Intel byte order (Silicon Graphics, DEC Alpha, PC), also known as little-endian.
  • M - Motorola byte order (Sun, HP, etc.), also known as big-endian.
The default byte order is the same as that of the host machine executing the software.
Example:
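As an illustration of why the byte order matters for sixteen-bit values, the following Python sketch decodes the same two bytes under each setting. The mapping from the I and M codes to struct format prefixes, and the function name read_uint16, are assumptions made for this example.

```python
import struct

# Assumed mapping from the byteorder codes to struct prefixes:
# "<" is little-endian (Intel), ">" is big-endian (Motorola).
BYTE_ORDER_PREFIX = {"I": "<", "M": ">"}


def read_uint16(raw, byteorder):
    """Decode two bytes as an unsigned 16-bit pixel value."""
    return struct.unpack(BYTE_ORDER_PREFIX[byteorder] + "H", raw)[0]


print(read_uint16(b"\x01\x02", "I"))  # 513
print(read_uint16(b"\x01\x02", "M"))  # 258
```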
layout
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The organization of the bands in the image file. Acceptable values are:
  • bil - Band interleaved by line.
  • bip - Band interleaved by pixel.
  • bsq - Band sequential.
The default layout is bil.
Example:
skipbytes
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The number of bytes of data in the image file to skip in order to reach the start of the image data. This keyword allows you to bypass any existing image header information in the file. The default value is zero bytes.
Example:
ulxmap
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The x-axis map coordinate of the center of the upper-left pixel. If this parameter is specified, ulymap must also be set, otherwise a default value is used.
Example:
ulymap
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The y-axis map coordinate of the center of the upper-left pixel. If this parameter is specified, ulxmap must also be set, otherwise a default value is used.
Example:
xdim
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The x-dimension of a pixel in map units. If this parameter is specified, ydim must also be set, otherwise a default value is used.
Example:
ydim
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The y-dimension of a pixel in map units. If this parameter is specified, xdim must also be set, otherwise a default value is used.
Example:
bandrowbytes
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The number of bytes per band per row. This must be an integer. This keyword is used only with BIL files when there are extra bits at the end of each band within a row that must be skipped.
Example:
totalrowbytes
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The total number of bytes of data per row. Use totalrowbytes when there are extra trailing bits at the end of each row.
Example:
bandgapbytes
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The number of bytes between bands in a BSQ format image. The default is zero.
Example:
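To show how the binaryRasterInfo parameters fit together, the following Python sketch computes the byte offset of a single sample under each layout. It is an interpretation of the parameter descriptions above rather than a definitive reader: it assumes 0-based row, column, and band indices, samples that occupy whole bytes, and defaults for bandrowbytes and totalrowbytes derived from ncols, nbands, and nbits. The function name sample_offset is hypothetical.

```python
def sample_offset(row, col, band, *, nrows, ncols, nbands=1, nbits=8,
                  layout="bil", skipbytes=0, bandrowbytes=None,
                  totalrowbytes=None, bandgapbytes=0):
    """Byte offset of one sample; row, col, and band are 0-based."""
    if nbits % 8 != 0:
        raise ValueError("this sketch handles whole-byte samples only")
    size = nbits // 8  # bytes per sample
    if layout == "bsq":
        # Each band is stored whole, optionally separated by a gap.
        band_bytes = nrows * ncols * size
        return (skipbytes + band * (band_bytes + bandgapbytes)
                + (row * ncols + col) * size)
    if layout == "bil":
        # Within a row, the bands follow one another line by line.
        per_band = bandrowbytes if bandrowbytes is not None else ncols * size
        per_row = totalrowbytes if totalrowbytes is not None else nbands * per_band
        return skipbytes + row * per_row + band * per_band + col * size
    if layout == "bip":
        # Within a row, all bands of a pixel are stored together.
        per_row = totalrowbytes if totalrowbytes is not None else ncols * nbands * size
        return skipbytes + row * per_row + (col * nbands + band) * size
    raise ValueError("unknown layout: " + repr(layout))


dims = dict(nrows=2, ncols=3, nbands=2, nbits=8)
print(sample_offset(0, 1, 1, layout="bil", **dims))  # 4
print(sample_offset(0, 1, 1, layout="bip", **dims))  # 3
print(sample_offset(0, 1, 1, layout="bsq", **dims))  # 7
```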
distribution
Content of this field: Description of this field:
Type: PhysicalDistributionType
Attributes: Required?: Default Value:

Description:
This element provides information on how the resource is distributed online and offline. Connections to online systems can be described as URLs and as a list of relevant connection parameters.
Example:
references
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:
online
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
A choice of (
url | Optional | Multiple Times
OR
connection | Optional | Multiple Times
)
)
Attributes: Required?: Default Value:

Description:
Distribution information for accessing the resource online, represented either as a URL or as a series of named parameters that are needed in order to connect. The URL field is provided for the simple cases where a file is available for download directly from a web server or other similar server and a complex connection protocol is not needed. The connection field provides an alternative where a complex protocol needs to be named and described, along with the necessary parameters needed for the connection.
url
Content of this field: Description of this field:
Elements: Required?: How many:
Attributes: Required?: Default Value:

Description:
A URL (Uniform Resource Locator) from which this resource can be downloaded or additional information can be obtained. If accessing the URL would directly return the data stream, then the "function" attribute should be set to "download". If the URL provides further information about downloading the object but does not directly return the data stream, then the "function" attribute should be set to "information". If the "function" attribute is omitted, then "download" is implied for the URL function. In more complex cases where a non-standard connection must be established that complies with application specific procedures beyond what can be described in the simple URL, then the "connection" element should be used instead of the URL element.
Example:
http://data.org/getdata?id=98332
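The defaulting rule for the "function" attribute can be sketched in Python with the standard library XML parser. This is an illustration only; the helper name url_function is hypothetical.

```python
import xml.etree.ElementTree as ET


def url_function(url_xml):
    """Return the function of a url element, applying the implied default."""
    element = ET.fromstring(url_xml)
    # When the attribute is omitted, "download" is implied.
    return element.get("function", "download")


print(url_function("<url>http://data.org/getdata?id=98332</url>"))
# download
print(url_function('<url function="information">http://data.org/getdata?id=98332</url>'))
# information
```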
connection
Content of this field: Description of this field:
Elements: | Required?: | How many:
A choice of (
A sequence of (
connectionDefinition | Optional | Multiple Times
parameter | Optional | Multiple Times
)
OR
references | Optional | Multiple Times
)
Attributes: | Required?: | Default Value:
id | optional
system | optional
scope | optional

Description:
A description of the information needed to make an application connection to a data service. The connection starts with a connectionDefinition, which lists all of the parameters needed for the connection and possible default values for each. It then includes a list of parameter values, one for each parameter, that override the defaults for this particular connection. One parameter element should exist for every parameterDefinition that is present in the connectionDefinition, except that parameters that were defined with a defaultValue in their parameterDefinition can be omitted from the connection, in which case the default will be used. All information about how to use the parameters to establish a session and extract data is present in the connectionDefinition, possibly implicitly by naming a connection schemeName that is well-known.
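The override rule for connection parameters can be sketched in Python as follows. The sketch represents parameter definitions and parameters as plain dictionaries rather than parsed EML, and the function name resolve_parameters and the "port" parameter are hypothetical.

```python
def resolve_parameters(parameter_definitions, parameters):
    """Combine defaults from the definition with values given in the connection.

    parameter_definitions maps each parameter name to its default value,
    or to None when no default was defined. parameters maps names to the
    values supplied for this particular connection.
    """
    resolved = {}
    for name, default in parameter_definitions.items():
        if name in parameters:
            # A value given in the connection overrides any default.
            resolved[name] = parameters[name]
        elif default is not None:
            resolved[name] = default
        else:
            raise ValueError("missing value for parameter " + repr(name))
    return resolved


definitions = {"hostname": None, "port": "80"}  # "port" is a made-up parameter
print(resolve_parameters(definitions, {"hostname": "nceas.ucsb.edu"}))
# {'hostname': 'nceas.ucsb.edu', 'port': '80'}
```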
connectionDefinition
Content of this field: Description of this field:
Type: res:ConnectionDefinitionType
Attributes: Required?: Default Value:

Description:
Definition of the connection protocol to be used for this connection. The definition has a "scheme" which identifies the protocol by name, and a detailed description of the scheme and its required parameters.
parameter
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
name | Optional | Multiple Times
value | Optional | Multiple Times
)
Attributes: Required?: Default Value:

Description:
A parameter to be used to make this connection. This value overrides any default value that may have been provided in the connection definition.
name
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The name of the parameter to be used to make this connection.
Example:
hostname
value
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The value of the parameter to be used to make this connection. This value overrides any default value that may have been provided in the connection definition.
Example:
nceas.ucsb.edu
references
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The id of another connection in this EML document to be used to provide the connection information. This is used instead of duplicating connection information when an identical connection needs to be used multiple times in an EML document.
offline
Content of this field: Description of this field:
Elements: | Required?: | How many:
A sequence of (
mediumName | Optional | Multiple Times
mediumDensity | Optional | Multiple Times
mediumDensityUnits | Optional | Multiple Times
mediumVolume | Optional | Multiple Times
mediumFormat | Optional | Multiple Times
mediumNote | Optional | Multiple Times
)
Attributes: Required?: Default Value:

Description:
The medium on which this resource is distributed digitally, such as a 3.5" floppy disk or various tape media types, or 'hardcopy'.
Example:
CD-ROM, 3.5 in. floppy disk, Zip disk
mediumName
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
Name of the medium on which this resource is distributed. Can be various digital media such as tapes and disks, or printed media which can collectively be termed 'hardcopy'.
Example:
Tape, 3.5 inch Floppy Disk, hardcopy
mediumDensity
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The density of the digital medium, if relevant. Used mainly for floppy disks or tape.
Example:
High Density (HD), Double Density (DD)
mediumDensityUnits
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
If a density is given numerically, the units should be given here.
Example:
B/cm
mediumVolume
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The total volume of the storage medium on which this resource is shipped.
Example:
650 MB
mediumFormat
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
The file system format of the medium on which the resource is shipped.
Example:
NTFS, FAT32, EXT2, QIK80
mediumNote
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Description:
Any additional pertinent information about the media.
Example:
inline
Content of this field: Description of this field:
Type: xs:anyType
Attributes: Required?: Default Value:
references
Content of this field: Description of this field:
Type: xs:string
Attributes: Required?: Default Value:

Attribute Definitions:

unit

Use: optional


Description:
This attribute gives the unit of measurement for the size of the entity, and is typically bytes.
Example:
<size unit="bytes">13</size>
method

Type: xs:string

Use: optional


Description:
This attribute names the method used to calculate an authentication checksum that can be used to validate a bytestream. Typical checksum methods include MD5 and CRC.
Example:
<authentication method="MD5">f5b2177ea03aea73de12da81f896fe40</authentication>
columnOrRow

Use: optional


Description:
This attribute specifies the entity's record orientation.
Example:
The valid attribute values are "columnMajor" or "rowMajor". If the attribute is not specified, "columnMajor" is used.
id

Type: xs:string

Use: optional

system

Type: xs:string

Use: optional

scope

Type: res:ScopeType

Use: optional

function

Type: res:FunctionType

Use: optional

Complex Type Definitions:

PhysicalType
Content of this field: Description of this field:
Elements: | Required?: | How many:
A choice of (
A sequence of (
dataObject | Optional | Multiple Times
dataFormat | Optional | Multiple Times
distribution | Optional | Multiple Times
)
OR
references | Optional | Multiple Times
)
Attributes: | Required?: | Default Value:
id | optional
system | optional
scope | optional

Description:
PhysicalDistributionType
Content of this field: Description of this field:
Elements: | Required?: | How many:
A choice of (
A choice of (
online | Optional | Multiple Times
OR
offline | Optional | Multiple Times
OR
inline | Optional | Multiple Times
)
references | Optional | Multiple Times
)
Attributes: | Required?: | Default Value:
id | optional
system | optional
scope | optional

Simple Type Definitions:

Derived from: xs:string (by xs:restriction)

Allowed values:

  • columnMajor
  • rowMajor

Web Contact: jones@nceas.ucsb.edu