rail.estimation.algos.bpz_lite module

A port of parts of BPZ rather than the entire codebase. Much of the code is ported directly from BPZ, written by Txitxo Benitez and Dan Coe (Benitez 2000), which Will Hartley and Sam Schmidt modified to make it Python 3 compatible. Joe Zuntz and Sam Schmidt then adapted it to work with TXPipe and ceci for BPZPipe. This version for RAIL removes a few features and concentrates on predicting the PDF.

Missing from full BPZ:

  • no tracking of ‘best’ type/TB

  • no “interp” between templates

  • no ODDS, chi^2, ML quantities

  • plotting utilities

  • no output of 2D probs (maybe later add back in)

  • no ‘cluster’ prior mods

  • no ‘ONLY_TYPE’ mode

class rail.estimation.algos.bpz_lite.BPZliteEstimator

Bases: CatEstimator

CatEstimator subclass that implements the basic marginalized PDF for BPZ. In addition to the marginalized redshift PDF, several ancillary quantities are computed and stored in the ensemble ancil data:

  • zmode: mode of the PDF

  • amean: mean of the PDF

  • tb: integer specifying the best-fit SED at the redshift mode

  • todds: fraction of the marginalized posterior probability held by the best template, so lower numbers mean other templates could be better fits, likely at other redshifts
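As an illustration of the first two ancillary quantities, the sketch below computes the mode and mean of a PDF tabulated on an evenly spaced redshift grid. The function name `pdf_mode_and_mean` and the toy inputs are hypothetical; this is not the code used by the estimator, only a minimal picture of what "mode" and "mean" of a gridded PDF mean.

```python
def pdf_mode_and_mean(zgrid, pdf):
    """Return (mode, mean) of a PDF tabulated on an evenly spaced grid.

    Hypothetical helper for illustration; not part of rail or qp.
    """
    # Mode: the grid point where the tabulated PDF is largest.
    i_max = max(range(len(pdf)), key=lambda i: pdf[i])
    zmode = zgrid[i_max]
    # Mean: probability-weighted average of the grid points.
    # On a uniform grid the bin width cancels in the ratio.
    total = sum(pdf)
    zmean = sum(z * p for z, p in zip(zgrid, pdf)) / total
    return zmode, zmean


# Toy, symmetric PDF on a coarse grid (values are made up).
zgrid = [0.0, 0.5, 1.0, 1.5, 2.0]
pdf = [0.0, 1.0, 2.0, 1.0, 0.0]
zmode, zmean = pdf_mode_and_mean(zgrid, pdf)
print(zmode, zmean)
```

For a symmetric, single-peaked PDF like this toy one the two estimates coincide; for skewed or multi-peaked PDFs they generally differ.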

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing

  • hdf5_groupname ([str] default=photometry) – name of hdf5 group for data, if None, then set to ‘’

  • zmin ([float] default=0.0)

  • zmax ([float] default=3.0)

  • nzbins ([int] default=301)

  • id_col ([str] default=object_id) – name of the object ID column

  • redshift_col ([str] default=redshift)

  • calc_summary_stats ([bool] default=False) – Compute summary statistics

  • calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.

  • recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates

  • nondetect_val ([float] default=99.0)

  • mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})

  • bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])

  • ref_band ([str] default=mag_i_lsst)

  • err_bands ([list] default=['mag_err_u_lsst', 'mag_err_g_lsst', 'mag_err_r_lsst', 'mag_err_i_lsst', 'mag_err_z_lsst', 'mag_err_y_lsst'])

  • dz ([float] default=0.01) – delta z in grid

  • unobserved_val ([float] default=-99.0) – value to be replaced with zero flux and given large errors for non-observed filters

  • bpz_ref_data_path ([str] default=None) – file path to the SED, FILTER, and AB directories. If left at the default of None, it will use the install directory for rail + ../examples_data/estimation_data/data

  • filter_list ([list] default=['DC2LSST_u', 'DC2LSST_g', 'DC2LSST_r', 'DC2LSST_i', 'DC2LSST_z', 'DC2LSST_y'])

  • spectra_file ([str] default=CWWSB4.list) – name of the file specifying the list of SEDs to use

  • madau_flag ([str] default=no) – set to ‘yes’ or ‘no’ to set whether to include intergalactic Madau reddening when constructing model fluxes

  • no_prior ([bool] default=False) – set to True if you want to run with no prior

  • p_min ([float] default=0.005) – BPZ sets all values of the PDF that are below p_min*peak_value to 0.0; p_min controls that fractional cutoff

  • gauss_kernel ([float] default=0.0) – BPZ convolves the PDF with a kernel if this is set to a non-zero number

  • zp_errors ([list] default=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1])

  • mag_err_min ([float] default=0.005) – a minimum floor for the magnitude errors to prevent a large chi^2 for very bright objects

  • model (ModelHandle (INPUT))

  • input (TableHandle (INPUT))

  • output (QPHandle (OUTPUT))
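To make the grid and cutoff parameters above concrete, here is a minimal sketch of a redshift grid built from zmin, zmax, and nzbins, and of the p_min cutoff as described in the parameter list. The helper names `make_zgrid` and `apply_p_min` are hypothetical and are not part of rail or BPZ. Whether the PDF is renormalized after the cutoff is not stated in this documentation, so the sketch does not renormalize.

```python
def make_zgrid(zmin=0.0, zmax=3.0, nzbins=301):
    """Evenly spaced redshift grid with nzbins points from zmin to zmax."""
    step = (zmax - zmin) / (nzbins - 1)
    return [zmin + i * step for i in range(nzbins)]


def apply_p_min(pdf, p_min=0.005):
    """Zero every PDF value that falls below p_min times the peak value."""
    threshold = p_min * max(pdf)
    return [p if p >= threshold else 0.0 for p in pdf]


zgrid = make_zgrid()
# With the documented defaults the grid spacing equals the default dz of 0.01.
spacing = zgrid[1] - zgrid[0]

# Toy PDF values (made up): the peak is 1.0, so the cutoff is 0.005.
trimmed = apply_p_min([0.001, 0.5, 1.0, 0.004])
print(len(zgrid), round(spacing, 6), trimmed)
```

The cutoff is relative to the peak of each individual PDF, so a broad, low-amplitude PDF and a sharply peaked one are trimmed at different absolute levels.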

__init__(args, **kwargs)

Constructor: builds the CatEstimator, then does BPZ-specific setup

entrypoint_function: str | None = 'estimate'
interactive_function: str | None = 'bpz_lite_estimator'
name = 'BPZliteEstimator'
open_model(**kwargs)

Load the model and/or attach it to this Stage

Parameters:
  • tag – Input tag associated to the model

  • **kwargs – Should include ‘model’, see notes

Notes

The keyword argument ‘model’ should be either

  1. an object with a trained model,

  2. a path pointing to a file that can be read to obtain the trained model,

  3. or a ModelHandle providing access to the trained model.

Returns:

The object encapsulating the trained model.

Return type:

Any
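The three accepted forms of the ‘model’ keyword can be pictured as a simple dispatch on the argument's type. The sketch below is not the RAIL implementation: `FakeModelHandle` and `resolve_model` are hypothetical stand-ins, and reading the file with pickle is an assumption made only so the example runs.

```python
import os
import pickle


class FakeModelHandle:
    """Hypothetical stand-in for a handle; NOT the RAIL ModelHandle API."""

    def __init__(self, data):
        self.data = data


def resolve_model(model):
    """Return a trained-model object from any of the three accepted forms."""
    # Case 3: a handle that already gives access to the trained model.
    if isinstance(model, FakeModelHandle):
        return model.data
    # Case 2: a path to a file holding the trained model.
    # Assumption: the file is a pickle. Note that a model object that is
    # itself a string would be mistaken for a path by this sketch.
    if isinstance(model, (str, os.PathLike)):
        with open(model, "rb") as fh:
            return pickle.load(fh)
    # Case 1: anything else is treated as the trained model itself.
    return model
```

Checking the handle and path cases first and falling through to "the object itself" keeps the function permissive about what a trained model looks like, which matches the `Any` return type above.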

class rail.estimation.algos.bpz_lite.BPZliteInformer

Bases: CatInformer

Inform stage for BPZliteEstimator. This stage assumes that you have a set of SED templates and that the training data has already been assigned a ‘best fit broad type’ (that is, something like elliptical, spiral, irregular, or starburst, similar to how the six SEDs in the CWW/SB set of Benitez (2000) are assigned 3 broad types). The informer then fits parameters for the evolving type fraction as a function of apparent magnitude in a reference band, p(T|m), as well as the redshift prior of finding a galaxy of a given broad type at a particular redshift, p(z|T,m), where z is redshift, m is apparent magnitude in the reference band, and T is the ‘broad type’. The same forms are used for these functions as parameterized in Benitez (2000).

For p(T|m) we have

` p(T|m) = exp(-kt*(m - m0)) `

where m0 is a constant and we fit for values of kt. For p(z|T,m) we have

` p(z|T,m) = f_x * z^a * exp(-(z/zm_x)^a) where zm_x = z0_x + km_x*(m - m0) `

where f_x is the type fraction from p(T|m), and we fit for values of z0, km, and a for each type. These parameters are then fed to the BPZ prior for use in the estimation stage.
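As a small numerical illustration of the type-fraction form p(T|m) = exp(-kt(m-m0)) given above, the sketch below evaluates it at a few magnitudes. The function name `type_fraction_shape` is hypothetical; the values kt=0.3 and m0=20.0 are simply the documented defaults of init_kt and m0, not fitted results, and any per-type normalization the implementation may apply is not shown.

```python
import math


def type_fraction_shape(m, kt, m0=20.0):
    """Evaluate exp(-kt * (m - m0)) for apparent magnitude m."""
    return math.exp(-kt * (m - m0))


# The expression equals 1 at m = m0 and decays toward fainter magnitudes
# when kt is positive.
for m in (20.0, 22.0, 24.0):
    print(m, type_fraction_shape(m, kt=0.3))
```

A larger kt makes the fraction of that broad type fall off more quickly with magnitude, which is the behaviour the fit for kt is capturing.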

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • hdf5_groupname ([str] default=photometry) – name of hdf5 group for data, if None, then set to ‘’

  • zmin ([float] default=0.0)

  • zmax ([float] default=3.0)

  • nzbins ([int] default=301)

  • nondetect_val ([float] default=99.0)

  • mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})

  • bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])

  • err_bands ([list] default=['mag_err_u_lsst', 'mag_err_g_lsst', 'mag_err_r_lsst', 'mag_err_i_lsst', 'mag_err_z_lsst', 'mag_err_y_lsst'])

  • ref_band ([str] default=mag_i_lsst)

  • redshift_col ([str] default=redshift)

  • bpz_ref_data_path ([str] default=None) – file path to the SED, FILTER, and AB directories. If left at the default of None, it will use the install directory for rail + rail/examples_data/estimation_data/data

  • spectra_file ([str] default=CWWSB4.list) – name of the file specifying the list of SEDs to use

  • m0 ([float] default=20.0) – reference apparent mag, used in prior param

  • nt_array ([list] default=[1, 2, 5]) – list of the integer number of templates per ‘broad type’; must be in the same order as the template set, and must sum to the number of templates in the spectra file

  • mmin ([float] default=18.0) – lowest apparent mag in ref band, lower values ignored

  • mmax ([float] default=29.0) – highest apparent mag in ref band, higher values ignored

  • init_kt ([float] default=0.3) – initial guess for kt in training

  • init_zo ([float] default=0.4) – initial guess for z0 in training

  • init_alpha ([float] default=1.8) – initial guess for alpha in training

  • init_km ([float] default=0.1) – initial guess for km in training

  • type_file ([str] default=) – name of file with the broad type fits for the training data

  • output_hdfn ([bool] default=True) – if True, just return the default HDFN prior params rather than fitting

  • input (TableHandle (INPUT))

  • model (ModelHandle (OUTPUT))
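The constraint on nt_array described above can be pictured as follows: each entry says how many consecutive templates belong to one broad type, and the entries must add up to the total number of templates. The helper `broad_type_of_templates` is hypothetical and only illustrates that bookkeeping; it is not taken from the informer.

```python
def broad_type_of_templates(nt_array, n_templates):
    """Map each template index to the index of its broad type.

    nt_array lists the number of templates per broad type, in the same
    order as the templates appear in the spectra file.
    """
    if sum(nt_array) != n_templates:
        raise ValueError(
            f"nt_array sums to {sum(nt_array)}, expected {n_templates}"
        )
    types = []
    for broad_type, count in enumerate(nt_array):
        # The next `count` templates all belong to this broad type.
        types.extend([broad_type] * count)
    return types


# The default nt_array [1, 2, 5] describes eight templates in three types.
mapping = broad_type_of_templates([1, 2, 5], 8)
print(mapping)
```

Because the mapping is positional, reordering the templates in the spectra file without reordering nt_array would silently assign templates to the wrong broad type, which is why the order requirement matters.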

__init__(args, **kwargs)

Constructor; sets up the stage configuration

entrypoint_function: str | None = 'inform'
interactive_function: str | None = 'bpz_lite_informer'
name = 'BPZliteInformer'
run()

Compute the best-fit prior parameters

rail.estimation.algos.bpz_lite.nzfunc(z, z0, alpha, km, m, m0)
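This function is listed without a description. Judging from its signature and from the Benitez (2000) parameterization the informer refers to, it plausibly evaluates the redshift-prior shape z^alpha * exp(-(z/zm)^alpha) with zm = z0 + km*(m - m0). The sketch below implements that form under exactly this assumption; `nzfunc_sketch` is a hypothetical name, and the body is not copied from the library.

```python
import math


def nzfunc_sketch(z, z0, alpha, km, m, m0):
    """Assumed prior shape: z**alpha * exp(-(z / zm)**alpha).

    zm = z0 + km * (m - m0) shifts the characteristic redshift with
    apparent magnitude m relative to the reference magnitude m0.
    This is an assumption about the form, not the library source.
    """
    zm = z0 + km * (m - m0)
    return z**alpha * math.exp(-((z / zm) ** alpha))


# Parameter values are the documented initial guesses (init_zo, init_alpha,
# init_km) and the default m0; they are starting points, not fitted values.
z0, alpha, km, m0 = 0.4, 1.8, 0.1, 20.0
m = 22.0
zm = z0 + km * (m - m0)
for z in (0.5 * zm, zm, 1.5 * zm):
    print(z, nzfunc_sketch(z, z0, alpha, km, m, m0))
```

Under this form the shape is zero at z = 0 and reaches its maximum at z = zm, so a fainter magnitude (larger m with positive km) moves the peak of the prior to higher redshift.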