2007/09/24

ADASS XVII: Building a framework for data preservation of large astronomical data

Kantor, LSST Corporation, Project Manager

LSST Data Archive
  • Millions of images
  • Billions of objects
  • Trillions of sources
  • Need to be preserved for decades
  • Data will be stored in file systems and databases (or something else)
LSST Data Flow and Data Management System (see here for more info)

Data Access Framework API
  • Data must exist in different formats (C++, FITS, RDBMS)
  • Challenge in providing an API that meets requirements
  • API is a work in progress; first release is planned for Data Challenge 2 (end of this year)
LSST API Data Diagram, Data Model
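
To make the "same data, different formats" requirement concrete, here is a minimal sketch of a storage-neutral access layer. It is not the LSST API: the class and function names (Storage, FileLikeStorage, DatabaseStorage, copy_record) are hypothetical, JSON text stands in for a file format such as FITS, and an in-memory SQLite table stands in for the RDBMS.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class Storage(ABC):
    """Hypothetical backend interface: persist and retrieve a record by key."""

    @abstractmethod
    def put(self, key: str, record: dict) -> None: ...

    @abstractmethod
    def get(self, key: str) -> dict: ...


class FileLikeStorage(Storage):
    """Stand-in for a file-based format such as FITS (here: JSON text in memory)."""

    def __init__(self):
        self._files = {}

    def put(self, key, record):
        self._files[key] = json.dumps(record)

    def get(self, key):
        # Raises KeyError if the key was never stored.
        return json.loads(self._files[key])


class DatabaseStorage(Storage):
    """Stand-in for an RDBMS backend, using an in-memory SQLite table."""

    def __init__(self):
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE records (rec_key TEXT PRIMARY KEY, body TEXT)"
        )

    def put(self, key, record):
        self._db.execute(
            "INSERT OR REPLACE INTO records (rec_key, body) VALUES (?, ?)",
            (key, json.dumps(record)),
        )

    def get(self, key):
        row = self._db.execute(
            "SELECT body FROM records WHERE rec_key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])


def copy_record(key, source: Storage, target: Storage):
    """Copy one record between backends without the caller knowing either format."""
    target.put(key, source.get(key))
```

Client code written against Storage does not change when a record moves from a file to a database table, which is the property such an API would need if the archive is to outlive any single storage technology.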

Status of work
  • All open source (not ready yet)
  • In system integration phase
  • Execute a test with 2.5 TB of test data from CFHT
Q/A
Q. How much data do you process daily? Do you use a distributed system?
A. 15TB/night, 30TB of raw data

Q. What is the plan after LSST ends for data preservation?
A. We intend to serve and preserve the data for decades. We need financial support. Data will be public from day one. There will be 3 data centers: 2 in the US, 1 in Chile. We need to provide open interfaces.

Q. Do you have plans to recalibrate the data?
A. We are sizing LSST to see if we can reprocess the data yearly. Data releases are annual.