The spec2d Reduction Pipeline:
General Outline and Data Formats

This page is a brief introduction to the UCB DEIMOS spec2d reduction pipeline. The IDL code is under CVS control, and you can download a stamped version of the code; see our website for more details. The spec2d pipeline is modeled on the SDSS spectral pipeline of David Schlegel and Scott Burles. Most of the DEIMOS spec2d code was written by Doug Finkbeiner, Marc Davis, Jeff Newman, and Michael Cooper, with important contributions from Brian Gerke on the implementation of non-local sky subtraction. The pipeline operates in 5 separate stages, each of which produces its own output files. The code runs without supervision and usually does an excellent job of wavelength fitting, sky subtraction, and object fitting. To understand what the code is actually doing, start with the routine domask.pro (in ~/cvs/spec2d/pro/), a simple script that runs the full reduction for a given slitmask. Below we briefly discuss the 5 stages of the spec2d code and their outputs, preceded by a preliminary step (step 0) that must be completed before domask.pro is run.


0. The zeroth step in the reduction of a slitmask is the generation of a plan file. The plan file is NOT created by the routine domask.pro; it must be generated before running the domask script. The plan file can be generated automatically by analyzing the FITS headers of all files in a given raw data directory. It is a plain ASCII file and can be edited to tailor the reduction plan for a given mask. The file lists the mask name, the raw data directory, and the names of the flats, arcs, and science frames, as well as a few optional keywords. Refer to the routine read_planfile.pro for more details and see the example plan file. The spec2d cookbook also describes the construction of a plan file in far more detail.

1. The first stage of the data reduction processes the flat and arc files to determine where the slitlets fall on the CCD array and to find the 2-d lambda solution for each slitlet. deimos_mask_calibrate.pro is the main routine for this stage of the pipeline. The flats and arcs are processed chip by chip, with the information for each slitlet written to a separate FITS file in the form of a BINTABLE extension. The blue side and red side of each slitlet are written to different files, calibSlit.xxxx.sssB.fits and calibSlit.xxxx.sssR.fits, where sss is the slit number and xxxx is the mask name. First the flats are read and edges are detected; these edges are compared to the bluslits BINTABLE stored with the DEIMOS data files, which describes where each slitlet should appear (in units of millimeters). Next a smooth curve is fit to each of the two edges (X0 and X1) of each curving slitlet. The multiple flats are then processed to reject CRs and to measure the SLITFN, the 'throughput' of the slitlet as a function of row number (since the DEIMOS data are transposed, row is the spatial direction and column is the spectral direction). Next the curved slitlets are mapped into rectangular arrays by shifts and interpolation in the SPATIAL direction; this rectification greatly simplifies further processing. The flats are also used to generate a 2-d normalizing function for fringing corrections.
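
To make the rectification step concrete, here is a minimal sketch in Python (used only for illustration; the pipeline itself is IDL). It shifts each spectral column so that the lower slit edge lands on output row 0, interpolating along the spatial direction. The use of linear interpolation, the clamping at the image edges, and the function name rectify are assumptions made for this sketch, not a description of what deimos_mask_calibrate.pro actually does.

```python
def rectify(image, x0, nrows):
    """Resample a curved slitlet onto a rectangular grid.

    image : list of rows, image[row][col]; row is spatial, col is spectral
    x0    : fractional detector row of the lower slit edge in each column
    nrows : number of spatial rows in the rectified output
    """
    ncols = len(x0)
    last = len(image) - 1
    out = [[0.0] * ncols for _ in range(nrows)]
    for col in range(ncols):
        for r in range(nrows):
            # Detector position that maps onto output row r in this column,
            # clamped so that we never read outside the input image.
            pos = min(max(x0[col] + r, 0.0), float(last))
            lo = min(int(pos), last - 1) if last > 0 else 0
            hi = min(lo + 1, last)
            frac = pos - lo
            # Linear interpolation in the spatial direction only (assumption).
            out[r][col] = (1.0 - frac) * image[lo][col] + frac * image[hi][col]
    return out


# Hypothetical 6-row, 2-column image whose value equals its row index,
# with a slit edge that sits at a different fractional row in each column.
image = [[float(row)] * 2 for row in range(6)]
rectified = rectify(image, x0=[0.5, 1.25], nrows=3)
print(rectified[0])
```

Because the shift is applied column by column, the spectral sampling is left untouched; only the spatial axis is resampled.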

The next step is solving for the wavelength of each pixel in the 2-d data array of each rectified slitlet. Basic information on the grating and grating tilt is drawn from the FITS headers of the raw data files and at this stage we employ the DEIMOS optical model as designed by Drew Phillips to assist in the initial lambda fitting. The fitting is done in multiple stages and leads to a fit that is typically precise at the 0.005 Angstrom level over the entire 2-d slitlet. deimos_arcfit.pro is the routine that produces the final fits. (Two alternative wavelength fitting schemes are possible, selectable by flags within the plan file. These are the POLYFLAG and the TRACESET methods. The first is DEIMOS specific while the second is taken from SDSS routines. Note that the POLYFLAG method is the default fitting scheme, and should be employed by all users. The TRACESET scheme has not been maintained and should not be used.)

At this stage, the results are written to the calibSlit files by means of the mwrfits command, which saves the data in the form of a FITS BINTABLE. The files can be read with the mrdfits command (both routines are part of the GODDARD IDL/FITS library and are included in the SDSS IDLUTILS package). These routines write and read IDL structures (analogous to C structs). For example, the command
    IDL> cc = mrdfits('calibSlit.2200.004B.fits', 1)
reads the first HDU extension of this calibSlit file into the variable cc. To find out what cc contains, execute
    IDL> help, cc, /str
which will list the IDL structure 'tags' showing the type and the size of each tag (note that the number of rows, here 229, is specific to each slitlet):

    IDL> help, cc, /str
    ** Structure <81e73ac>, 11 tags, length=15980400, data length=15980400, refs=1:
       FLAT            FLOAT     Array[4096, 229]
       IVAR            FLOAT     Array[4096, 229]
       MASK            BYTE      Array[4096, 229]
       SLITFN          FLOAT     Array[229]
       X0              FLOAT     Array[4096]
       X1              FLOAT     Array[4096]
       DLAM            FLOAT     Array[229]
       LAMBDAX         DOUBLE    Array[6]
       TILTX           DOUBLE    Array[3]
       RAWFLAT         FLOAT     Array[4096, 229]
       RAWARC          FLOAT     Array[4096, 229]
To access the tags, one writes, e.g. cc.flat or cc.slitfn. For definitions of the calib structure tag variables or for more regarding the information contained in the headers of the calib files, please follow these links: calib structure tags and calib header info.

For a TRACESET wavelength solution, LAMBDAX and TILTX are set to zero or omitted, and the new parameters FUNC, COEFF, XMIN, and XMAX appear instead. TRACESETs are more flexible than the POLYFLAG format, but testing has shown that the POLYFLAG scheme is slightly more robust. In the TRACESET scheme, each row of COEFF is analogous to LAMBDAX; i.e., 6 coefficients defining a Legendre polynomial. The traceset deals with the tilted lines by having a different wavelength solution for each row, rather than a global TILTX function. Recall that the TRACESET method is no longer maintained; all users should employ the POLYFLAG wavelength fitting scheme, which is the default.
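
As an illustration of how a per-row Legendre solution of this kind can be evaluated, here is a minimal Python sketch (not part of the IDL pipeline). It assumes that pixel positions are rescaled from [XMIN, XMAX] onto [-1, 1] before the polynomial is evaluated, which is a common convention for Legendre tracesets but is not confirmed by this page. The function names and the coefficient values are hypothetical.

```python
def legendre_series(coeff, x, xmin, xmax):
    """Evaluate sum_n coeff[n] * P_n(t), with x rescaled onto t in [-1, 1]."""
    t = 2.0 * (x - xmin) / (xmax - xmin) - 1.0   # assumed rescaling
    total = coeff[0]
    if len(coeff) > 1:
        total += coeff[1] * t
    p_prev, p_curr = 1.0, t                      # P_0 and P_1
    for n in range(1, len(coeff) - 1):
        # Bonnet recurrence: (n+1) P_{n+1} = (2n+1) t P_n - n P_{n-1}
        p_next = ((2 * n + 1) * t * p_curr - n * p_prev) / (n + 1)
        total += coeff[n + 1] * p_next
        p_prev, p_curr = p_curr, p_next
    return total


def traceset_wavelengths(coeff_rows, ncols, xmin, xmax):
    """One independent wavelength solution per spatial row (wave[row][col])."""
    return [[legendre_series(c, x, xmin, xmax) for x in range(ncols)]
            for c in coeff_rows]


# Hypothetical two-row traceset with slightly different zero points,
# mimicking a small tilt of the lines across the slit.
coeff_rows = [[7000.0, 1300.0, 0.0, 0.0, 0.0, 0.0],
              [7000.2, 1300.0, 0.0, 0.0, 0.0, 0.0]]
wave = traceset_wavelengths(coeff_rows, ncols=4096, xmin=0.0, xmax=4095.0)
print(wave[0][4095], wave[1][4095])
```

Here the tilt shows up as a row-to-row change in the coefficients, whereas the POLYFLAG scheme carries it in the separate TILTX term.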

Note that a 2-d wavelength array is NOT contained in the calibSlit structure. But the full wavelength solution can be obtained by using the simple routine lambda_eval.pro. The syntax for the lambda_eval function is as follows:
    IDL> wv = lambda_eval(cc)
where cc is the calibSlit structure contained in the first HDU of a calibSlit file. The result, wv, is a 2-d float array ([4096, 229]) containing the wavelength value at each pixel in units of angstroms. The same way of handling the wavelength solution (TILTX, LAMBDAX, DLAM) is propagated through the pipeline to the subsequent output files (spSlit and slit files), so the same lambda_eval function can be used to compute 2-d wavelength arrays from the spSlit and slit structures.

2. The next stage of the reduction process reads the science frames, one by one for each chip, to extract the science data for each slitlet. In this step, the corresponding calibSlit files for each slit are used to flatfield and rectify the slitlet. The main routine for this stage of the pipeline is deimos_2dreduce.pro. The output from this routine goes to files named spSlit.xxxx.sssB(R).fits, with two HDUs per science frame. The first of these HDUs describes the information in the raw science frame, with tags:
    FLUX            FLOAT     Array[4096, 229]
    IVAR            FLOAT     Array[4096, 229]
    LAMBDAX         DOUBLE    Array[6]
    TILTX           DOUBLE    Array[3]
    MASK            BYTE      Array[4096, 229]
    SLITFN          FLOAT     Array[229]
    SKYROW          BYTE      Array[229]
    SLITWIDTH       FLOAT          0.54900
    DLAM            FLOAT     Array[229]
    INFOMASK        BYTE      Array[4096, 229]
Refer to these links for detailed information on the spSlit structure tags and the spSlit header keywords contained in the first HDU of the spSlit files.

The 2nd HDU written for each science frame describes the b-spline used to construct the sky model. This b-spline is a description of the mean sky for the selected rows as a function of wavelength, with CR rejection (see code for details). Subsequent science frames are written as additional HDUs to the same file, 2 HDUs per frame.

3. The next stage of the pipeline combines the separate science exposures into one mean, sky-subtracted, CR-cleaned 2-d spectrum of each slitlet. The main routine here is spslit_combine.pro, which reads the separate HDUs of each spSlit file, subtracts the b-spline model sky appropriate for each separate exposure, and then computes an inverse-variance weighted average of the residuals, pixel by pixel. Cosmic ray rejection is done at this stage as well, based on the time variability of a given pixel. The output of this process is a set of files slit.xxxx.sssB(R).fits. These files contain one HDU, and the tags of the structure it contains take the form:
    FLUX            FLOAT     Array[4096, 229]
    IVAR            FLOAT     Array[4096, 229]
    MASK            BYTE      Array[4096, 229]
    CRMASK          BYTE      Array[4096, 229]
    LAMBDA0         FLOAT     Array[4096]
    DLAMBDA         FLOAT     Array[4096, 229]
    LAMBDAX         DOUBLE    Array[6]
    TILTX           DOUBLE    Array[3]
    SLITFN          FLOAT     Array[229, 3]
    DLAM            FLOAT     Array[229, 3]
    INFOMASK        BYTE      Array[4096, 229]
In these files, the structure tags have the following meaning:
  FLUX: the mean flux (e-/hour) of the spectral image
  IVAR: the appropriate inverse variance of the sky-subtracted spectrum, pixel by pixel.
  CRMASK: describes which pixels had cosmic rays in a given frame. Bit N of the CRMASK is set if a CR registered on a given pixel in the Nth frame.
  MASK: inherited from the spSlit file.
  LAMBDAX: inherited from the spSlit file.
  TILTX: inherited from the spSlit file.
  SLITFN: inherited from the spSlit file.
  DLAM: inherited from the spSlit file.
  INFOMASK: inherited from the spSlit file.
  LAMBDA0: in conjunction with DLAMBDA, this field provides a secondary means for recovering the 2-d lambda solution. The LAMBDA0 array contains the wavelength value in angstroms for the zeroth pixel row in the slit.
  DLAMBDA: in conjunction with LAMBDA0, this field provides a secondary means for recovering the 2-d lambda solution. The DLAMBDA array contains the increment to add to LAMBDA0 to obtain the wavelength at pixel x,y: LAMBDA(x,y) = LAMBDA0[x] + DLAMBDA[x,y]. The wavelength info is encoded in this manner (rather than storing the full 2-d solution) so that the low order bits of all the large 2-D arrays can be set to 0 without loss of precision, allowing the files to be compressed.
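
The following minimal Python sketch (for illustration only; the pipeline itself is IDL) shows three of the operations described above for the slit files: an inverse-variance weighted mean of one pixel over several exposures, decoding which frames are flagged in a CRMASK value, and recovering a wavelength from LAMBDA0 and DLAMBDA. The function names are hypothetical. The sketch assumes that rejected samples are given zero inverse variance, that frames are numbered from bit 0, and that the errors in different exposures are independent; none of these details is stated on this page.

```python
def combine_pixel(residuals, ivars):
    """Inverse-variance weighted mean of one pixel over several exposures.

    Returns (mean, ivar_of_mean). Samples with ivar = 0 (e.g. CR hits,
    by assumption) contribute nothing to the average.
    """
    wsum = sum(ivars)
    if wsum == 0:
        return 0.0, 0.0                      # no usable data at this pixel
    mean = sum(r * w for r, w in zip(residuals, ivars)) / wsum
    # For independent errors the inverse variance of the mean is the sum.
    return mean, wsum


def frames_with_cr(crmask_value, nframes):
    """List the frames whose bit is set (bit 0 = first frame, an assumption)."""
    return [n for n in range(nframes) if crmask_value & (1 << n)]


def wavelength(lambda0, dlambda, x, y):
    """LAMBDA(x,y) = LAMBDA0[x] + DLAMBDA[x,y], as defined above."""
    return lambda0[x] + dlambda[x][y]


# Hypothetical values for a single pixel seen in three exposures;
# the second exposure was hit by a cosmic ray and carries zero weight.
mean, ivar = combine_pixel([10.0, 250.0, 14.0], [0.5, 0.0, 0.5])
print(mean, ivar)
print(frames_with_cr(0b010, nframes=3))
```

Note that a BYTE CRMASK can flag at most 8 frames under this bit convention.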

4. The fourth stage of the pipeline creates the "Allslits" files. These files are very helpful for examining many spectra at once. They show all of the 2-d slit spectra (flux as a function of wavelength) stacked in one large image and aligned in wavelength space. The files are generally very large (the size depends on the total number of slits on your mask), and for this reason half of the slits are placed in one file and the other half in a second file (Allslits0.xxxx.fits and Allslits1.xxxx.fits). In order to align the slits in wavelength, the tilt of each slit is removed via interpolation, so sky lines should appear vertical in the image. Because of this interpolation, the Allslits files should not be used for science; the slit files are the proper 2-d spectra to employ in scientific analysis. To view an Allslits file, simply use the following IDL syntax:
    IDL> atv, 'Allslits0.fits', min=-5, max=50
This will launch a GUI display device called ATV. Re-scaling of the image will likely be needed. Some useful markers are placed in the Allslits files. The number of each slit is marked in a bar in between the stacked 2-d slit spectra; a narrow strip above each spectrum should be filled with a value equal to (-1000 - slitno) where slitno is the slit number corresponding to the slit below the bar. At the far right and left ends of the images, the slit numbers are marked in a large vertical bar.
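
The marker convention described above can be decoded with a few lines of code. The Python sketch below (for illustration only; it is not part of the IDL pipeline) recovers slit numbers from pixel values of the form (-1000 - slitno) along one image column. The function names are hypothetical, and the sketch assumes that marker pixels hold exact integer values and that slit numbers are non-negative.

```python
def slit_number_from_marker(value):
    """Return the slit number encoded as (-1000 - slitno), or None."""
    if value != int(value):
        return None                 # markers are assumed to be whole numbers
    slitno = -1000 - int(value)
    return slitno if slitno >= 0 else None


def slit_labels(column):
    """Map each slit number found in a column to the first row of its marker."""
    labels = {}
    for row, value in enumerate(column):
        slitno = slit_number_from_marker(value)
        if slitno is not None and slitno not in labels:
            labels[slitno] = row
    return labels


# Hypothetical column through an Allslits image: flux values interleaved
# with marker strips for slits 4 and 5.
column = [1.5, 3.0, -1004.0, -1004.0, 2.5, -0.5, -1005.0]
print(slit_labels(column))
```

A strongly negative flux value that happens to be a whole number at or below -1000 would be misread as a marker, so a real tool would want to check that the value is constant along the strip.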

5. Finally, we extract 1-d spectra from each of the slitlets. The main routine for this stage is do_extract.pro. By default, the 1-d extraction is done along the locus of constant lambda, with a boxcar extraction as well as an optimal extraction (see Horne, K. 1986, PASP, 98, 609). Two additional extraction algorithms exist in the pipeline: a variant of the default boxcar and a variant of Horne's optimal algorithm. For more on the different extraction algorithms see the documentation for the routine extract1d.pro. Since individual slitlets can contain multiple objects, we write a separate output file for each object with name: spec1d.xxxx.sss.objno.fits where 'objno' is the object name carried in the raw data file (for DEEP2 this is an 8 digit number OBJNO as listed in the pcatxx.fits files). Each spec1d file generally contains 4 separate HDUs, B and R HDUs for the boxcar extraction, and B and R HDUs for the optimal extraction. Each HDU is a BINTABLE with the tags:
    SPEC            FLOAT     Array[4096]
    LAMBDA          FLOAT     Array[4096]
    IVAR            FLOAT     Array[4096]
    CRMASK          INT       Array[4096]
    BITMASK         INT       Array[4096]
    ORMASK          INT       Array[4096]
    NBADPIX         INT       Array[4096]
    INFOMASK        INT       Array[4096]
    OBJPOS          FLOAT          32.9567
    FWHM            FLOAT          10.04573
    NSIGMA          FLOAT          0.467127
    R1              LONG           22
    R2              LONG           44
    SKYSPEC         FLOAT     Array[4096]
    IVARFUDGE       FLOAT          1.24571
In these spec1d files, the structure tags are defined as follows:
  SPEC: the 1-d spectrum (at present, not flux-calibrated)
  IVAR: the appropriate 1-d inverse variance array.
  LAMBDA: the wavelength in units of angstroms.
  CRMASK: a 1-d array giving the number of pixels at each wavelength which were affected by a CR.
  BITMASK: a 1-d representation of the 2-d MASK field contained in the slit file. The slit MASK array has been evaluated in the spatial direction (over the extraction width R1 to R2) using the boolean operator AND.
  ORMASK: a 1-d representation of the 2-d MASK field contained in the slit file. The slit MASK array has been evaluated in the spatial direction (over the extraction width R1 to R2) using the boolean operator OR.
  NBADPIX: the total number of pixels at each wavelength (summed over the extraction width R1 to R2) for which the 2-d slit MASK is set (that is, for which the MASK is greater than 0).
  INFOMASK: a 1-d representation of the 2-d INFOMASK field contained in the slit file. The slit INFOMASK array has been evaluated in the spatial direction (over the extraction width R1 to R2) using the boolean operator OR.
  OBJPOS: the spatial position of the object in the slit in units of pixels (recall that the IDL arrays are zero-indexed). This is employed as the extraction center position.
  FWHM: the measured full-width at half maximum of the object in units of pixels (recall that the DEIMOS pixel scale is roughly 0.117371 arcsec/pixel).
  NSIGMA: the spatial extent of the extraction window in terms of a number of sigma. Recall that for a Gaussian function FWHM ~ 2.35482 * SIGMA.
  R1: in conjunction with R2, defines the spatial pixel rows in the 2-d flux array from which the 1-d spectrum was extracted (R2 > R1).
  R2: in conjunction with R1, defines the spatial pixel rows in the 2-d flux array from which the 1-d spectrum was extracted (R2 > R1).
  SKYSPEC: the average 1-d b-spline sky spectrum.
  IVARFUDGE: a fudge factor employed in incorporating the b-spline ivar values into the 1-d object ivar values.
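
To illustrate how the 1-d mask tags relate to the 2-d slit MASK, here is a minimal Python sketch (for illustration only; it is not taken from do_extract.pro or extract1d.pro). It collapses a 2-d mask over the extraction window with AND, with OR, and by counting flagged pixels, and converts a FWHM in pixels to a Gaussian sigma and to arcseconds using the constants quoted above. The function names are hypothetical, and treating the window as inclusive of both R1 and R2 is an assumption.

```python
FWHM_PER_SIGMA = 2.35482        # Gaussian: FWHM ~ 2.35482 * SIGMA
ARCSEC_PER_PIXEL = 0.117371     # approximate DEIMOS pixel scale


def collapse_mask(mask2d, r1, r2):
    """Collapse mask2d[row][col] over rows r1..r2 (inclusive, by assumption).

    Returns (bitmask, ormask, nbadpix), one value per spectral column.
    """
    bitmask, ormask, nbadpix = [], [], []
    for col in range(len(mask2d[0])):
        values = [mask2d[row][col] for row in range(r1, r2 + 1)]
        and_bits, or_bits = values[0], 0
        for v in values:
            and_bits &= v           # bit survives only if set in every row
            or_bits |= v            # bit is set if present in any row
        bitmask.append(and_bits)
        ormask.append(or_bits)
        nbadpix.append(sum(1 for v in values if v > 0))
    return bitmask, ormask, nbadpix


def fwhm_to_sigma_and_arcsec(fwhm_pixels):
    """Return (sigma in pixels, FWHM in arcsec) for a Gaussian profile."""
    return fwhm_pixels / FWHM_PER_SIGMA, fwhm_pixels * ARCSEC_PER_PIXEL


# Hypothetical 3-row, 2-column mask and the FWHM value from the listing above.
mask = [[0, 1],
        [2, 1],
        [2, 3]]
print(collapse_mask(mask, 0, 2))
print(fwhm_to_sigma_and_arcsec(10.04573))
```

A column is flagged in BITMASK only when every row of the window shares the flag, while ORMASK flags it when any row does, so the two arrays bracket how conservative a user wants to be.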

For more on running the pipeline, see the detailed instructions here.