Catalog
- class kszx.Catalog(cols=None, name=None, filename=None, size=0)
Represents a galaxy catalog, with one “row” per galaxy, and user-defined “columns” for RA, DEC, etc.
A design decision: do we actually need a Catalog class? Or should we phase it out, in favor of numpy.recarray or astropy.Table?
Constructor args:
cols: dictionary (string col_name) -> (1-d numpy array). Optional, since you can add columns later with add_column().
name: catalog name (optional)
filename: filename, if catalog is stored on disk (optional)
size: number of galaxies in catalog (optional, can be inferred from array dimensions instead).
Members:
self.size (integer): Number of galaxies in catalog.
self.col_names (list of strings): List of user-defined columns.
self.name (string or None): Catalog name (optional).
self.filename (string or None): Filename, if Catalog is stored on disk (optional).
Additionally, for each column name (in self.col_names), the Catalog contains a member with the corresponding name, whose value is a 1-d array of length self.size.
Column names are user-defined, but here are some frequently occurring column names (note that I always use lower case):
self.ra_deg: right ascension (degrees)
self.dec_deg: declination (degrees)
self.z: redshift
self.zerr: redshift error (if photometric)
self.rmag: r-band magnitude (and similarly for g-band, z-band, etc.)
Here are some functions which return Catalogs:
    kszx.sdss.read_galaxies()
    kszx.sdss.read_randoms()
    kszx.desils_lrg.read_galaxies()
    kszx.desils_lrg.read_randoms()
    kszx.Catalog.from_h5()    # static method
Here are some functions which make maps from catalogs:
    kszx.healpix_utils.map_from_catalog()   # catalog -> 2d healpix map
    kszx.pixell_utils.map_from_catalog()    # catalog -> 2d pixell maps
    kszx.grid_points()                      # catalog -> 3d map
- add_column(col_name, col_data)
Adds a new column to the catalog.
col_name (string)
col_data (1-d numpy array)
- remove_column(col_name)
Removes an existing column from the catalog.
- apply_boolean_mask(mask, name=None, in_place=True)
Reduces the size of the Catalog, by applying a caller-specified boolean mask.
This can be used to implement color cuts, redshift cuts, etc.
mask (1-d boolean array). Elements should be True for galaxies which are kept, False for galaxies which are discarded.
name (string or None). If specified, the mask fraction will be printed.
in_place (boolean). If False, the original Catalog is unmodified, and a new Catalog is returned.
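To make the masking semantics concrete, here is a minimal numpy-only sketch. It does not use kszx: the "catalog" is a plain dict of equal-length arrays, and `masked_columns` is a hypothetical helper that mimics the `in_place=False` behavior described above (inputs untouched, new columns returned).

```python
import numpy as np

# Stand-in for a catalog: a plain dict of equal-length 1-d arrays.
# (Illustration only; this is not a kszx.Catalog.)
cols = {
    "ra_deg": np.array([10.0, 20.0, 30.0, 40.0]),
    "dec_deg": np.array([-5.0, 0.0, 5.0, 10.0]),
    "rmag": np.array([21.5, 19.8, 20.4, 22.1]),
}

def masked_columns(cols, mask):
    """Hypothetical helper: keep rows where mask is True, leaving inputs unmodified."""
    mask = np.asarray(mask)
    if mask.dtype != bool or mask.ndim != 1:
        raise ValueError("mask must be a 1-d boolean array")
    for name, arr in cols.items():
        if arr.shape != mask.shape:
            raise ValueError(f"column {name!r} has a different length than the mask")
    # Boolean indexing copies the selected rows into new arrays.
    return {name: arr[mask] for name, arr in cols.items()}

# Magnitude cut: keep galaxies brighter than r = 21.
bright = masked_columns(cols, cols["rmag"] < 21.0)

# Fraction of rows that survive the cut.
kept_fraction = len(bright["rmag"]) / len(cols["rmag"])
```

The same mask is applied to every column, so rows stay aligned across columns after the cut.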
- apply_redshift_cut(zmin, zmax, in_place=True)
Reduces the size of the catalog, by keeping objects in redshift range \(z_{min} \le z \le z_{max}\).
This is a special case of apply_boolean_mask().
zmin (float or None). If None, then no lower redshift limit is imposed.
zmax (float or None). If None, then no upper redshift limit is imposed.
in_place (boolean). If False, the original Catalog is unmodified, and a new Catalog is returned.
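As a sketch of how a redshift cut reduces to a boolean mask, the snippet below builds the mask for an inclusive range, treating None as "no limit". The helper name `redshift_mask` is hypothetical and is not part of kszx.

```python
import numpy as np

def redshift_mask(z, zmin=None, zmax=None):
    """Hypothetical helper: boolean mask for zmin <= z <= zmax (None means no limit)."""
    z = np.asarray(z)
    mask = np.ones(z.shape, dtype=bool)
    if zmin is not None:
        mask &= (z >= zmin)
    if zmax is not None:
        mask &= (z <= zmax)
    return mask

z = np.array([0.10, 0.43, 0.55, 0.70, 0.90])

# Both endpoints are inclusive, matching z_min <= z <= z_max.
mask = redshift_mask(z, zmin=0.43, zmax=0.70)
```

The resulting mask could then be passed to a boolean-mask step like the one described for apply_boolean_mask().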
- get_xyz(cosmo, zcol_name='z')
Returns shape (N,3) array containing galaxy locations in Cartesian coords (observer at origin).
The Catalog must define columns named ra_deg, dec_deg, and z. The arguments are:
cosmo (Cosmology). Used to convert redshifts to distances.
zcol_name (string). Name of the redshift column (default 'z').
This function is super useful – here are some use cases for the points array that it returns:
To “grid” the galaxy field, pass the points array to kszx.grid_points().
To interpolate a real-space field to the galaxy locations, pass the points array to kszx.interpolate_points().
To create a bounding box for the galaxy field, pass the points array to the BoundingBox constructor.
This function is roughly equivalent to:
    # Compute radial distance from 'z' column of catalog
    chi = cosmo.chi(z=self.z)
    # Compute Cartesian coords from 'ra_deg' and 'dec_deg' cols of catalog.
    points = kszx.utils.ra_dec_to_xyz(self.ra_deg, self.dec_deg, r=chi)
    return points
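For readers who want to see the geometry, here is a numpy-only sketch of the spherical-to-Cartesian step. The axis convention (x toward RA = 0, DEC = 0, and z toward the north pole) is an assumption and may differ from what kszx.utils.ra_dec_to_xyz() actually uses. The distances are placeholder values, not the output of a Cosmology object.

```python
import numpy as np

def ra_dec_to_xyz_sketch(ra_deg, dec_deg, r):
    """Hypothetical stand-in: (RA, DEC, distance) -> shape (N,3) Cartesian array.

    Assumed convention: x points to (RA, DEC) = (0, 0), z points to DEC = +90.
    """
    ra = np.radians(ra_deg)
    dec = np.radians(dec_deg)
    x = r * np.cos(dec) * np.cos(ra)
    y = r * np.cos(dec) * np.sin(ra)
    z = r * np.sin(dec)
    return np.stack([x, y, z], axis=-1)

ra_deg = np.array([0.0, 90.0, 45.0])
dec_deg = np.array([0.0, 0.0, 90.0])
chi = np.array([100.0, 200.0, 300.0])   # placeholder radial distances

points = ra_dec_to_xyz_sketch(ra_deg, dec_deg, chi)
```

Whatever the axis convention, each row of the result has norm equal to the corresponding radial distance, since the observer sits at the origin.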
- generate_batches(batchsize, verbose=True)
Splits catalog into subcatalogs no larger than ‘batchsize’.
If batchsize is None, the catalog will be “split” into a single subcatalog.
If verbose is True, and the catalog is split into more than one batch, then a progress indicator will be shown.
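The batching rule can be sketched with a small generator over row ranges. This is an illustration in plain numpy, not the kszx implementation: `batch_slices` is a hypothetical helper, and the progress indicator is omitted.

```python
import numpy as np

def batch_slices(size, batchsize=None):
    """Hypothetical helper: yield slices covering range(size), each at most batchsize long."""
    if batchsize is None:
        # No batch size given: a single "batch" covering everything.
        yield slice(0, size)
        return
    if batchsize <= 0:
        raise ValueError("batchsize must be positive")
    for start in range(0, size, batchsize):
        yield slice(start, min(start + batchsize, size))

z = np.linspace(0.1, 1.0, 10)

# Ten rows in batches of at most four rows: the last batch is shorter.
batches = [z[s] for s in batch_slices(len(z), batchsize=4)]
```

Applying each slice to every column of a catalog would give one subcatalog per batch, with rows in their original order.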
- static concatenate(catalog_list, name=None, destructive=False)
Returns a new Catalog, obtained by concatenating (merging) multiple Catalogs.
catalog_list (sequence of Catalogs).
name (string or None). Name of new Catalog (optional).
destructive (boolean). If True, then the input Catalogs will be destroyed to save memory.
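Column-wise, concatenation amounts to joining the matching arrays end to end. The sketch below shows this on plain dicts of arrays. It assumes every input defines the same set of columns, which may or may not be what kszx requires, and `concatenate_columns` is a hypothetical helper.

```python
import numpy as np

def concatenate_columns(cols_list):
    """Hypothetical helper: merge several dict-of-columns 'catalogs' row-wise."""
    if not cols_list:
        raise ValueError("need at least one input")
    names = list(cols_list[0])
    for cols in cols_list[1:]:
        # Assumption: all inputs must define the same columns.
        if set(cols) != set(names):
            raise ValueError("all inputs must define the same columns")
    return {name: np.concatenate([cols[name] for cols in cols_list]) for name in names}

north = {"ra_deg": np.array([150.0, 160.0]), "z": np.array([0.5, 0.6])}
south = {"ra_deg": np.array([10.0]), "z": np.array([0.4])}

merged = concatenate_columns([north, south])
```

np.concatenate allocates new output arrays, so peak memory briefly holds both inputs and output. That is presumably the cost the destructive option is meant to reduce.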
- absorb_columns(src_catalog, destructive=False)
‘Absorbs’ all columns of ‘src_catalog’ into an existing Catalog (equivalent to calling add_column() multiple times).
If destructive=True, then all columns will be removed from src_catalog (saving some memory).
- make_random_subcatalog(new_size, name=None)
Returns a new Catalog, obtained by choosing a random subset (with size new_size) of an existing Catalog.
This method can be used to reduce the size of a random catalog. For example:
    # Random catalog with 100x the galaxy density
    rcat_large = kszx.sdss.read_randoms('CMASS_North')
    # Random catalog with 10x the galaxy density
    rcat_small = rcat_large.make_random_subcatalog(rcat_large.size // 10)
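A random subset can be sketched in plain numpy by drawing row indices without replacement and applying them to every column. This is an assumption about the general technique, not a description of how kszx draws its subset. `random_subset` and its `seed` argument are hypothetical.

```python
import numpy as np

def random_subset(cols, new_size, seed=None):
    """Hypothetical helper: pick new_size distinct rows at random from a dict of columns."""
    size = len(next(iter(cols.values())))
    if not 0 <= new_size <= size:
        raise ValueError("new_size must be between 0 and the catalog size")
    rng = np.random.default_rng(seed)
    # Sampling without replacement guarantees that no row appears twice.
    idx = rng.choice(size, size=new_size, replace=False)
    return {name: arr[idx] for name, arr in cols.items()}

randoms = {"ra_deg": np.arange(100.0), "dec_deg": -np.arange(100.0)}

# new_size must be an integer, hence the integer division.
small = random_subset(randoms, len(randoms["ra_deg"]) // 10, seed=0)
```

The same index array is used for all columns, so each selected row keeps its RA and DEC together.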
- static from_h5(filename)
Reads a file in HDF5 format (written by catalog.write_h5()), and returns a Catalog object.
- write_h5(filename)
Writes a Catalog to disk, in an HDF5 file format (readable with Catalog.from_h5()).
- static from_fits(filename, col_name_pairs, name=None)
Reads FITS file in SDSS/DESILS format, and returns a Catalog object.
The col_name_pairs arg should be a list of pairs (col_name, fits_col_name). See sdss.py and desils_lrg.py for examples.
- static from_text_file(filename, col_names, name=None)
Reads a text file (one row per catalog object) and returns a Catalog object. The col_names arg should be a list of column names. See sdss.py for examples.
- show()
Prints a summary of the Catalog to stdout, with one line per column showing min/mean/max.