2007/09/24

ADASS XVII: Long-Term Preservation of Astronomical Research Results (B. Hanisch)

Bob Hanisch is the VO Project Manager

Electronic Information in Astronomy
  • Astronomy pioneered electronic data publication (ApJ Letters, 1995)
  • E-journals link to underlying data and archives, so astronomers interact with connected resources. Research libraries play an important role
The data preservation problem
  • Articles describe highly processed data. Journals preserve some data, but results cannot be easily verified. Some legacy datasets, the data scientists used to produce their papers, are being lost
  • ADIL is trying to recover data of any form. This is a start, but the data capture process is not integrated into the journal. Data management is centralized
  • NED has done something similar to capture spectral data; however, data capture and curation are separate from journal submission
An example storyboard: pick a paper; I'd like to be able to save the data and analyze the data.
The data can be found, but actually retrieving it might not be trivial...

  • The approach we should take is to integrate digital data management into the publication process
  • Data storage appliances will store the data
  • These repositories will provide data access
  • Publishers will then link to this data
Data preservation requires partners
  • Metadata definition (VO, library)
  • Content management tool evaluation/selection
  • Physical storage replication
  • ...
Data Curation Challenge
  • Without metadata, data is useless
  • We can automate data curation (this should REALLY be the effort!)
  • Curation is an ongoing and significant cost for digital data (registry)
Data discovery and access are essential for the research community.

Q/A
Q. (Hogg) Page charges might hinder the process (particularly if one requires a tarball associated to each paper).
A. Evolutionary, volatile... Agencies will fund data preservation, though it is not clear how it will work. In a decade all data will be stored digitally.

Q. People are reluctant to provide a description of their data (only some journals require one). Librarians could really be involved, but heterogeneous data requires special training that librarians would have to undertake.
A. Librarians are not experts in detailed data curation. A new career will evolve.

Q. Need to preserve numerical simulations as well.
A. IVOA will have a session on this.