2007/09/24

ADASS XVII: Data Preservation and the VO (R. Mann)


I arrived at ADASS on Sunday morning, walked for about 3 hours from my hotel to Westminster Abbey and enjoyed a great autumn day with temperatures around 20C (68F or so).
Today, however, the weather is more London-like: pouring rain....

I will try to report from some of the sessions I attend. My comments are in italics.

The first talk at ADASS XVII is by R. Mann on Data Preservation and the VO.
  • Data Discovery in the VO
    • The VO requires rich, accurate and complete metadata that can be easily queried. It needs to be stored in simple structures
    • How can we provide content in simple structures? Solving this is crucial to the VO's success (something I really agree with!)
  • Data Access and Analysis in the VO
    • The current model (S*AP, etc) has astronomers downloading the data
    • Science driven by large statistical surveys is increasingly important
    • This puts constraints on the archives to provide analysis tools (a point often made by Alex Szalay and Jim Gray)
    • Data should be preserved for decades, and this raises technical and sociological concerns
    • Who will do the data preservation? National centers, universities, single groups with grants? Some funding agencies find it hard to address long-term issues... but to some extent they will be forced to do so. Shall we consider moving some of these archives to libraries for data curation?
  • Wider perspective
    • High-level policy statements, OECD principles?
      • Openness, flexibility, transparency, legal conformity, protection of intellectual property, formal responsibility, quality, professionalism, interoperability, security, efficiency, accountability, sustainability
    • Interdisciplinary research
      • Most work has a common starting point: the Open Archival Information System (OAIS) reference model
      • Influence extends into the commercial sector: IBM white paper "toward OAIS-based data preservation initiative"
  • Not all the data is digital
    • ROE ~19000 plates
    • Harvard plate collection ~500,000 plates
Conclusions:
  • Data preservation and VO go hand-in-hand
  • VO needs to actively enable data preservation
  • Action is needed NOW
    • Need the implementation of metadata standards
    • Interact with other communities and enjoy the benefits from new high-level data policies.
Q/A
Q: (Hogg) What about political unrest in some national funding agencies?
A: There is an initiative to spread and copy the data to multiple archives (LOCKSS?)

Q: Something is missing: we are missing the tools and the software systems used to produce the data
A: There are levels of "emulation" (virtualization?)

Q: (Rots) Do you have an example of this preservation model?
A: (passed on to collaborator). We have analyzed non-astronomical archives. OAIS models are trying to keep things understandable