
 
  EML Developer's Workshop
  October 4-5, 2001
National Center for Ecological Analysis and Synthesis,
Santa Barbara

Co-sponsored by:
"Networking our Research Legacy", NSF grant # 9983132 to Arizona State University.

"Knowledge Networking for Biodiversity", NSF grant # to National Center for Ecological Analysis and Synthesis (UCSB), LTER Network Office (UNM) and San Diego Supercomputing Center (UCSD).

Goals
Participants
Logistics
Agenda

Goals

The goal of this workshop is to promote collaboration among ecoinformatics development efforts. This will be accomplished by a series of brief overviews of existing development efforts in ecoinformatics and more extensive discussions and working sessions in which developers can share ideas, experiences and solutions to problems in ecoinformatics.

Participants

  • Matt Jones (NCEAS)
  • Mark Schildhauer (NCEAS)
  • Chad Berkley (NCEAS)
  • Jivka Bojilova (NCEAS)
  • Dan Higgins (NCEAS)
  • John Harris (NCEAS)
  • Peter McCartney (ASU)
  • Corinna Gries (ASU)
  • Robin Schoeninger (ASU)
  • Owen Eddins (LTER Network Office)
  • James Brunt (LTER Network Office)
  • Eric Ordway (Canopy Database)
  • Chris Jones (PISCO)
  • Rudolf Nottrott (NCEAS)

Logistics

You should all have received an email from Gail Stichler concerning travel, hotel and local transport, the body of which is as follows:

NCEAS IS UNABLE TO MAKE YOUR FLIGHT AND/OR HOTEL RESERVATIONS FOR YOU. However, we have secured a block of rooms in a small hotel, the Casa Del Mar Inn, which is one mile from the Center and one block to the beach area. By securing this block of hotel rooms, we have negotiated a lower rate than is usual. This BLOCK EXPIRES SEPTEMBER 11 and you will need to make your reservations before then, mentioning the UCSB Ecology Center when you register. This ensures that the corporate rate will apply when your meeting is held in October. If you do not make your reservations within this block period, you risk losing the corporate rate.

Casa Del Mar Inn 18 Bath Street 1-805-963-4418 or 1-800-433-3097

For many of you, reimbursement for your travel expenses will be through the ASU BDI grant "Networking our Research Legacy", and Peter McCartney will be sending you details about how to file for reimbursement soon. Because of our central location, it is not necessary to have a car in Santa Barbara. In fact, downtown parking is restricted and expensive. Please fly to Santa Barbara rather than to Los Angeles. Secure a ride with the SuperRide Shuttle outside the terminal and mention the Ecology Center when you state your destination. The cost should be approximately $15.00. Your participation will begin on Thursday morning, October 4 at 8:30 AM and end on Friday afternoon, October 5 at 5:00 PM.

My email address is stichler@nceas.ucsb.edu. Please use the subject heading "KDI in October" when you respond. If I can help you in any other way, please do not hesitate to email, call, or write me.

Draft Agenda:

Thursday, October 4th

8:30 - 9:45 : Introductions and Overview of project activities

A discussion/laundry list at the highest level of the projects and subprojects each person/group is working on to get an idea of where overlaps might be strongest. We should be able to produce a list of projects, their lead developer(s), and get a rough idea of correspondence with other subprojects. Each "group" should come with a list and a 10-15 minute summary already prepared for their team.

  • ASU
  • NCEAS
  • NET
  • CANOPY
  • PISCO
  • NRS

Break

10:15 - 12:00: Tools for collaboration (Matt)

  • shared source code system (CVS)
  • shared bug tracking database
  • a common GPL template?
  • the role of ecoinformatics.org in all of this
  • writing collaborations for things we have all been working together on.
  • communication - mailing lists, forum sw, irc, vnc, etc

Lunch

1:30 - 2:30 : Storage solutions for managing metadata and API considerations (Peter, Owen)

We can expect that many locations have invested in one or another storage solution and are expecting to interface these with the EML language and its associated software tools. We should briefly compare and contrast three viable solutions in terms of how we load, search and retrieve data. We should then consider what would be the specifications for an API that would ensure a minimum functionality regardless of the storage method.

  • ASU: RDBMS with some associated software for translating into EML
  • NCEAS: Metacat
  • ASU: dbXML
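
One way to picture the "minimum functionality" API raised above is a small interface that every storage backend implements. The sketch below is hypothetical: the names `MetadataStore`, `load`, `search` and `retrieve` are assumptions made for illustration, not part of EML, Metacat or dbXML, and the in-memory backend merely stands in for a real RDBMS or XML store. The element names in any test document are likewise made up.

```python
from abc import ABC, abstractmethod
from typing import Dict, List
import xml.etree.ElementTree as ET


class MetadataStore(ABC):
    """Hypothetical minimum interface a storage backend would expose."""

    @abstractmethod
    def load(self, doc_id: str, xml_text: str) -> None:
        """Store one metadata document under the given identifier."""

    @abstractmethod
    def search(self, path: str, text: str) -> List[str]:
        """Return ids of documents where an element at `path` contains `text`."""

    @abstractmethod
    def retrieve(self, doc_id: str) -> str:
        """Return the stored document as XML text."""


class InMemoryStore(MetadataStore):
    """Toy backend that keeps documents in a dictionary."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def load(self, doc_id: str, xml_text: str) -> None:
        ET.fromstring(xml_text)  # raises ParseError if not well-formed
        self._docs[doc_id] = xml_text

    def search(self, path: str, text: str) -> List[str]:
        hits = []
        for doc_id, xml_text in self._docs.items():
            root = ET.fromstring(xml_text)
            # `path` is resolved relative to the document's root element
            if any(text in (el.text or "") for el in root.findall(path)):
                hits.append(doc_id)
        return hits

    def retrieve(self, doc_id: str) -> str:
        return self._docs[doc_id]
```

A client written against `MetadataStore` would not need to know whether the documents live in relational tables or in a native XML store, which is the point of agreeing on such a specification.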

Break

2:15 - 3:15 : Persistence of package structure (Chad)

The modular approach to EML is a powerful feature, enabling extensibility and reuse of both structure and content. However, we see lingering challenges with ensuring package relationships beyond the scope of a single repository. Depending on the nature of the storage format, different methods might be used to define relationships between modules (foreign keys in an RDBMS, the proposed EML triple, or the XPointer specification). How can we easily move data from one repository to another? Are there applications where it is advantageous to 'resolve' all the content out into a single XML file?

Break

3:30 - 5:00 - Reverse engineering data structure (Corinna, Dan)

We all want to include some automated metadata capture in our metadata management tools, and we have all done some work in this area. What data sources are we working on now? Do we need some standard specifications for modules? Are there some tasks that might be better handled by server-based tools that our applications call remotely (for example, things that involve costly licensed libraries, or that would run faster remotely)?

  • NCEAS: ASCII files
  • ASU: relational databases via JDBC, GIS via MapObjects.
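
As a concrete picture of automated metadata capture from an ASCII file, the sketch below reads a delimited text table and guesses a name and type for each column. It is a minimal illustration under stated assumptions, not the method used by any of the projects listed: the function names `infer_type` and `describe_table` are hypothetical, the first row is assumed to hold column names, and only three types are distinguished.

```python
import csv
import io


def infer_type(values):
    """Guess the narrowest type that fits every non-empty value in a column."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "string"
    for type_name, cast in (("integer", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return type_name
        except ValueError:
            continue  # at least one value did not fit; try the next type
    return "string"


def describe_table(text, delimiter=","):
    """Return [(column_name, inferred_type), ...] for a delimited text table.

    Assumes the first row holds the column names.
    """
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header, data = rows[0], rows[1:]
    # Transpose rows into columns; with no data rows every column is empty.
    columns = zip(*data) if data else [[] for _ in header]
    return [(name, infer_type(col)) for name, col in zip(header, columns)]
```

The resulting list of names and types is the kind of information a tool could use to pre-fill attribute descriptions before a scientist reviews them.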

Friday, October 5

8:30 - 10:00 : XML Editing Interfaces (Dan)

Several of us have undertaken one or more efforts at producing usable
XML editors/metadata entry tools with which scientists can produce EML
and edit existing EML documents. This discussion would be an overview
of techniques and approaches used by the various projects for building
generic, extensible wizards and metadata forms.

  • NCEAS: Morpho
  • ASU: Metadata Wizard
  • Canopy: Web interface

Break

10:15 - 12:00 XML Query (Robin, Mark)

It seems XPath queries with Xalan are being used by both ASU and NCEAS for querying XML data. We also seem to have an interesting array of diverse interface solutions for helping users create XPath expressions: tree-based forms, specialized browsers for selecting and/or resolving certain kinds of search criteria such as taxon names, map GUIs, etc. There are also new things on the horizon like XQuery that we might want to talk about.

  • NCEAS: Morpho/Metacat query
  • ASU: Distributed XPath Query System
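
To illustrate the idea of an interface that builds a path expression from a user's selection, here is a small sketch. It uses Python's standard `xml.etree.ElementTree`, which supports only a subset of XPath, in place of Xalan. The sample records, their element names, and the functions `build_taxon_query` and `find_dataset_ids` are all assumptions made for the example and do not reflect the EML schema or either project's query system.

```python
import xml.etree.ElementTree as ET

# Made-up records; the element names are illustrative, not the EML schema.
CATALOG = """
<catalog>
  <dataset id="ds1"><title>Kelp forest survey</title><taxon>Macrocystis pyrifera</taxon></dataset>
  <dataset id="ds2"><title>Desert annuals</title><taxon>Larrea tridentata</taxon></dataset>
  <dataset id="ds3"><title>Kelp canopy imagery</title><taxon>Macrocystis pyrifera</taxon></dataset>
</catalog>
"""


def build_taxon_query(taxon_name):
    """Turn a user-selected taxon name into a path expression."""
    if "'" in taxon_name:
        raise ValueError("quote characters are not supported in this sketch")
    # Select every dataset that has a <taxon> child with exactly this text.
    return f".//dataset[taxon='{taxon_name}']"


def find_dataset_ids(xml_text, taxon_name):
    """Return the id attribute of each dataset matching the taxon name."""
    root = ET.fromstring(xml_text)
    return [el.get("id") for el in root.findall(build_taxon_query(taxon_name))]
```

A taxon browser or tree-based form would call something like `build_taxon_query` behind the scenes, so that users never have to type the expression themselves.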

Lunch

1:30 - 3:00 - Schema and DTD handling (Corinna)

We all have expressed a goal of having our applications driven by schemas to mitigate the problem of keeping code synchronized with ever-changing standards. While there are reasonably good libraries for handling DTD files, working with schemas is still a dimly lit road. What solutions are there for accessing schema information in our programs?

  • ASU: classes for retrieving and representing schema information as an xml file
  • NCEAS: DTD handling.
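
Because an XML Schema document is itself XML, one simple way to access schema information from a program is to parse it like any other document. The sketch below lists the elements a schema declares along with their declared types. It is a rough illustration only: the function name `list_declared_elements` is hypothetical, the tiny schema in the usage note is invented, and the code ignores imports, references and anonymous types that a real schema-driven application would need to handle.

```python
import xml.etree.ElementTree as ET

# Namespace of the W3C XML Schema language, in ElementTree's {uri}name form.
XS = "{http://www.w3.org/2001/XMLSchema}"


def list_declared_elements(xsd_text):
    """Return [(element_name, declared_type_or_None), ...] in document order."""
    root = ET.fromstring(xsd_text)
    return [
        (el.get("name"), el.get("type"))
        for el in root.iter(XS + "element")
        if el.get("name")  # skip declarations that only reference another element
    ]
```

Given a schema that declares a `dataset` element containing a `title` of type `xs:string`, the function returns the pairs `("dataset", None)` and `("title", "xs:string")`, which a form generator could then walk to decide what fields to display.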

Break

3:15 - 4:15 : XML interfaces with other data formats (Peter)

There are a variety of tools for moving data between XML and other formats such as relational databases. These include built-in vendor support for XML, XSLT, third-party tools like Cocoon, Turbine, Castor, etc. How useful are these relative to our EML standards and the kinds of things we want to do?

  • ASU: Server-side Java solution for porting data from relational databases.
  • NET: Cocoon

Break

4:30 - 5:00 - Wrap up

  • Workshop summary report
  • Action items for ecoinformatics.org
  • Next workshop

Web Contact: jones@nceas.ucsb.edu