|
|
- Why is the size of the full catalog data for a given release smaller than the catalog data size shown on the
sdss.org page for that release?
The catalog data size shown on sdss.org includes the size of the target dataset, which is a separate database
containing the version of the data that the spectroscopic targets were selected from. We do not distribute the target
dataset to mirror sites. It is required for only a very small fraction of queries, and we ask that you run
those queries on the main SDSS CAS site.
- What about the raw (FITS) image and spectroscopic data? How do I get the raw data for a given SDSS data release?
The raw data is not served by the CAS (Catalog Archive Server); it is available from a different Fermilab site
called the DAS (Data Archive Server). The raw data may become available at some point from
the SDSS NCDM data distribution site, but until then you can contact the
SDSS Help Desk to ask how you can get the raw imaging and spectroscopic data.
Be warned that the raw data is 3-4 times the size of the catalog data.
- Why does the data subsets table include a checksum for each subset?
The checksum is a quantity we generate based upon the schema and data in that particular version of each subset of the
database, and it provides a verification of that version. You can use the value of the checksum, found in the SiteConstants
or Versions table, to check that the version you have is exactly the same as what is distributed. Note that this checksum
is different from the MD5 checksum(s) used to verify the data transfer itself.
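For the transfer-level MD5 check mentioned above (as opposed to the database version checksum), a minimal Python sketch might look like the following. The function names and the file name in the usage note are hypothetical, and the expected digest is assumed to be the hex string published alongside the file you downloaded.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def transfer_is_intact(path, expected_md5):
    """Compare the computed digest with the published one (case-insensitive)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

You would call something like `transfer_is_intact("subset_file.dat", published_md5)` after the copy completes, where `subset_file.dat` is a placeholder name. A match only confirms that the file arrived unchanged; it says nothing about which database version the file contains.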
|