rail.estimation.algos.nz_dir module

Implements a simple version of TxPipe's NZDir.

class rail.estimation.algos.nz_dir.NZDirInformer

Bases: CatInformer

Quick implementation of an NZ estimator that creates weights for each input object using sklearn’s NearestNeighbors. It is very basic; a more sophisticated SOM-based DIR method could probably be created in the future. This inform stage only builds a nearest-neighbor model of the spec-z data and computes the distances to the N-th neighbor, which are used in the estimate stage.

This creates a model: a dictionary containing the nearest-neighbor model and the parameters used by the estimate stage.
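As a rough illustration of what such a model could hold, the sketch below computes each spec-z object's distance to its N-th nearest neighbor in magnitude space and bundles the result into a dictionary. It is not the RAIL implementation: brute-force NumPy distances stand in for sklearn’s NearestNeighbors, the helper name `build_dir_model` and the dictionary keys are hypothetical, and counting each object as its own first neighbor is an assumption.

```python
import numpy as np


def build_dir_model(sz_mags, sz_redshift, n_neigh=10, distance_delta=1e-6):
    """Hypothetical helper: summarize a spec-z sample for a DIR-style estimate.

    sz_mags     : (n_obj, n_bands) magnitudes, e.g. the `usecols` columns.
    sz_redshift : (n_obj,) spectroscopic redshifts.
    """
    sz_mags = np.asarray(sz_mags, dtype=float)
    sz_redshift = np.asarray(sz_redshift, dtype=float)

    # Pairwise Euclidean distances (brute force; only sensible for small samples).
    diff = sz_mags[:, None, :] - sz_mags[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Sort each row. Column 0 is the object itself (distance 0), so column
    # n_neigh - 1 is the n_neigh-th neighbor when the object counts as the first.
    dist.sort(axis=1)
    nth_dist = dist[:, n_neigh - 1] + distance_delta  # pad to avoid zero radii

    return {
        "sz_mags": sz_mags,
        "sz_redshift": sz_redshift,
        "distances": nth_dist,
        "n_neigh": n_neigh,
    }


# Toy data: 50 objects with 6 magnitudes each (values are made up).
rng = np.random.default_rng(0)
toy_mags = rng.normal(loc=24.0, scale=1.0, size=(50, 6))
toy_z = rng.uniform(0.0, 3.0, size=50)
model = build_dir_model(toy_mags, toy_z, n_neigh=10)
```

The `distance_delta` padding keeps every radius strictly positive, even for duplicated objects.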

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • hdf5_groupname ([str] default=photometry) – name of hdf5 group for data, if None, then set to ‘’

  • usecols ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst']) – columns from sz_data for Neighbor calculation

  • n_neigh ([int] default=10) – number of neighbors to use

  • kalgo ([str] default=kd_tree) – Neighbor algorithm to use

  • kmetric ([str] default=euclidean) – Knn metric to use

  • sz_name ([str] default=redshift) – name of specz column in sz_data

  • szweightcol ([str] default=) – name of sz weight column

  • distance_delta ([float] default=1e-06) – padding for distance calculation

  • input (TableHandle (INPUT))

  • model (ModelHandle (OUTPUT))

bands = ['u', 'g', 'r', 'i', 'z', 'y']
default_usecols = ['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst']
entrypoint_function: str | None = 'inform'
interactive_function: str | None = 'nz_dir_informer'
name = 'NZDirInformer'
run()

Run the stage and return the execution status.

Subclasses must implement this method.

class rail.estimation.algos.nz_dir.NZDirSummarizer

Bases: CatEstimator

Quick implementation of a summarizer that creates weights for each input object using sklearn’s NearestNeighbors. It is very basic; a more sophisticated SOM-based DIR method could probably be created in the future.

Parameters:
  • output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.

  • chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing

  • hdf5_groupname ([str] default=photometry) – name of hdf5 group for data, if None, then set to ‘’

  • zmin ([float] default=0.0)

  • zmax ([float] default=3.0)

  • nzbins ([int] default=301)

  • id_col ([str] default=object_id) – name of the object ID column

  • redshift_col ([str] default=redshift) – name of redshift column

  • calc_summary_stats ([bool] default=False) – Compute summary statistics

  • calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.

  • recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates

  • seed ([int] default=87) – random seed

  • usecols ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst']) – columns from sz_data for Neighbor calculation

  • leafsize ([int] default=40) – leaf size for testdata KDTree

  • phot_weightcol ([str] default=) – name of photometry weight, if present

  • n_samples ([int] default=20) – number of bootstrap samples to generate

  • model (ModelHandle (INPUT))

  • input (TableHandle (INPUT))

  • output (QPHandle (OUTPUT))

  • single_NZ (QPHandle (OUTPUT))
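To make the roles of `zmin`, `zmax`, `nzbins`, `n_samples`, `seed`, and `phot_weightcol` more concrete, here is a minimal sketch of a DIR-style weighting followed by a bootstrapped, weighted redshift histogram. It rests on several assumptions: each spec-z object is weighted by the (optionally weighted) number of photometric objects inside its stored N-th-neighbor radius, `nzbins` is read as the number of histogram bins, and brute-force NumPy distances replace a KDTree. The function names `dir_weights` and `bootstrap_nz` are hypothetical and not part of RAIL.

```python
import numpy as np


def dir_weights(sz_mags, radii, phot_mags, phot_weights=None):
    """Hypothetical: weight each spec-z object by nearby photometric objects."""
    sz_mags = np.asarray(sz_mags, dtype=float)
    phot_mags = np.asarray(phot_mags, dtype=float)
    radii = np.asarray(radii, dtype=float)
    if phot_weights is None:
        phot_weights = np.ones(len(phot_mags))
    phot_weights = np.asarray(phot_weights, dtype=float)

    # Distance from every spec-z object to every photometric object.
    diff = sz_mags[:, None, :] - phot_mags[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))  # shape (n_spec, n_phot)

    # Sum the photometric weights that fall inside each object's radius.
    inside = dist <= radii[:, None]
    return (inside * phot_weights[None, :]).sum(axis=1)


def bootstrap_nz(sz_redshift, weights, zmin=0.0, zmax=3.0, nzbins=301,
                 n_samples=20, seed=87):
    """Hypothetical: weighted N(z) histogram plus bootstrap realizations."""
    z = np.asarray(sz_redshift, dtype=float)
    w = np.asarray(weights, dtype=float)
    rng = np.random.default_rng(seed)

    # Assumption: nzbins bins, hence nzbins + 1 edges.
    edges = np.linspace(zmin, zmax, nzbins + 1)
    single, _ = np.histogram(z, bins=edges, weights=w)

    samples = np.empty((n_samples, nzbins))
    for i in range(n_samples):
        # Resample the spec-z objects with replacement, keeping their weights.
        idx = rng.integers(0, len(z), size=len(z))
        samples[i], _ = np.histogram(z[idx], bins=edges, weights=w[idx])
    return edges, single, samples


# Toy example: two spec-z objects and four photometric objects in one band.
toy_w = dir_weights([[0.0], [10.0]], [1.0, 1.0], [[0.5], [0.2], [10.3], [50.0]])
toy_edges, toy_single, toy_samples = bootstrap_nz(
    [0.5, 2.5], toy_w, nzbins=3, n_samples=4
)
```

In this picture the single histogram and the stack of bootstrap histograms loosely correspond to the `single_NZ` and `output` handles, although that mapping is itself an assumption.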

__init__(args, **kwargs)

Initialize Estimator

bands = ['u', 'g', 'r', 'i', 'z', 'y']
default_usecols = ['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst']
entrypoint_function: str | None = 'estimate'
initialize_handle(tag, data, npdf)
interactive_function: str | None = 'nz_dir_summarizer'
join_histograms()
name = 'NZDirSummarizer'
open_model(**kwargs)

Load the model and/or attach it to this Stage

Parameters:
  • tag – Input tag associated to the model

  • **kwargs – Should include ‘model’, see notes

Notes

The keyword argument ‘model’ should be either

  1. an object with a trained model,

  2. a path pointing to a file that can be read to obtain the trained model,

  3. or a ModelHandle providing access to the trained model.

Returns:

The object encapsulating the trained model.

Return type:

Any
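The three accepted forms of ‘model’ can be pictured as a simple type dispatch. The sketch below is only an illustration of that idea, not the RAIL code: `ToyModelHandle` and `resolve_model` are hypothetical names, and reading the file with `pickle` is an assumption about the on-disk format.

```python
import os
import pickle
import tempfile


class ToyModelHandle:
    """Hypothetical stand-in for a handle that wraps a trained model."""

    def __init__(self, data):
        self.data = data


def resolve_model(model):
    """Return the trained model from an object, a file path, or a handle."""
    if isinstance(model, ToyModelHandle):
        # Case 3: a handle providing access to the trained model.
        return model.data
    if isinstance(model, (str, os.PathLike)):
        # Case 2: a path to a file holding the model (pickle assumed here).
        with open(model, "rb") as fin:
            return pickle.load(fin)
    # Case 1: already an object with a trained model.
    return model


trained = {"distances": [0.1, 0.2]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    with open(path, "wb") as fout:
        pickle.dump(trained, fout)
    from_path = resolve_model(path)

from_handle = resolve_model(ToyModelHandle(trained))
from_object = resolve_model(trained)
```

Checking for the handle type first keeps the fallback branch simple: anything that is neither a handle nor a path is treated as the model itself.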

outputs = [('output', <class 'rail.core.data.QPHandle'>), ('single_NZ', <class 'rail.core.data.QPHandle'>)]
run()

Run the stage and return the execution status.

Subclasses must implement this method.