rail.estimation.algos.k_nearneigh module
Quick implementation of a k-nearest-neighbor estimator. This first pass ignores photometric errors and works purely in terms of magnitudes; the implementation will be expanded in a future update.
- class rail.estimation.algos.k_nearneigh.KNearNeighEstimator
Bases: CatEstimator
KNN-based estimator
- Parameters:
output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing
hdf5_groupname ([str] default=photometry) – name of hdf5 group for data, if None, then set to ‘’
zmin ([float] default=0.0)
zmax ([float] default=3.0)
nzbins ([int] default=301)
id_col ([str] default=object_id) – name of the object ID column
redshift_col ([str] default=redshift)
calc_summary_stats ([bool] default=False) – Compute summary statistics
calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.
recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates
bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])
ref_band ([str] default=mag_i_lsst)
nondetect_val ([float] default=99.0)
mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})
model (ModelHandle (INPUT))
input (TableHandle (INPUT))
output (QPHandle (OUTPUT))
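The `zmin`, `zmax`, and `nzbins` parameters presumably define the redshift grid on which each output PDF is evaluated. The following is a minimal sketch, assuming an evenly spaced grid that includes both endpoints; the module may build its grid differently. `make_zgrid` is a hypothetical helper and not part of RAIL.

```python
def make_zgrid(zmin=0.0, zmax=3.0, nzbins=301):
    """Return nzbins evenly spaced redshift values from zmin to zmax.

    Assumption: both endpoints are included, as numpy.linspace would do.
    """
    if nzbins < 2:
        raise ValueError("nzbins must be at least 2")
    step = (zmax - zmin) / (nzbins - 1)
    return [zmin + i * step for i in range(nzbins)]


# With the documented defaults, the grid spacing works out to 0.01.
zgrid = make_zgrid()
```

Under this assumption, the default of 301 points rather than 300 is what gives a round spacing between 0.0 and 3.0.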
- __init__(args, **kwargs)
Constructor: Do Estimator specific initialization
- entrypoint_function: str | None = 'estimate'
- interactive_function: str | None = 'k_near_neigh_estimator'
- name = 'KNearNeighEstimator'
- open_model(**kwargs)
Load the model and/or attach it to this Stage
- Parameters:
tag – Input tag associated with the model
**kwargs – Should include ‘model’, see notes
Notes
The keyword argument ‘model’ should be one of the following:
an object with a trained model,
a path pointing to a file that can be read to obtain the trained model,
or a ModelHandle providing access to the trained model.
- Returns:
The object encapsulating the trained model.
- Return type:
Any
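The three accepted forms of the ‘model’ keyword described in the notes can be illustrated with a small dispatch function. This is only a sketch of the idea, not RAIL's implementation: `resolve_model` is a hypothetical name, the handle is assumed to expose a `read()` method, and the on-disk format is assumed to be a pickle file.

```python
import pickle
from pathlib import Path


def resolve_model(model):
    """Return a trained-model object from any of the three accepted forms.

    Hypothetical helper. Assumes a handle-like object has a read() method
    and that a model file on disk is a pickle.
    """
    if isinstance(model, (str, Path)):
        # A path: read the file to obtain the trained model.
        with open(model, "rb") as fh:
            return pickle.load(fh)
    if callable(getattr(model, "read", None)):
        # A handle-like object: ask it for the model it provides access to.
        return model.read()
    # Otherwise, treat the object itself as the trained model.
    return model
```

Checking for a path first matters here because a string would otherwise fall through to the last branch and be returned as if it were the model.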
- class rail.estimation.algos.k_nearneigh.KNearNeighInformer
Bases: CatInformer
Train a KNN-based estimator
- Parameters:
output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
hdf5_groupname ([str] default=photometry)
zmin ([float] default=0.0)
zmax ([float] default=3.0)
nzbins ([int] default=301)
nondetect_val ([float] default=99.0)
mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})
bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])
ref_band ([str] default=mag_i_lsst)
redshift_col ([str] default=redshift)
trainfrac ([float] default=0.75) – fraction of training data used to make tree, rest used to set best sigma
seed ([int] default=0) – Random number seed for NN training
sigma_grid_min ([float] default=0.01) – minimum value of sigma for grid check
sigma_grid_max ([float] default=0.075) – maximum value of sigma for grid check
ngrid_sigma ([int] default=10) – number of grid points in sigma check
leaf_size ([int] default=15) – min leaf size for KDTree
nneigh_min ([int] default=3) – min number of near neighbors to use for PDF fit
nneigh_max ([int] default=7) – max number of near neighbors to use for PDF fit
only_colors ([bool] default=False) – if only_colors True, then do not use ref_band mag, only use colors
input (TableHandle (INPUT))
model (ModelHandle (OUTPUT))
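The `bands`, `ref_band`, `nondetect_val`, `mag_limits`, and `only_colors` parameters suggest how the input magnitudes become the features used for neighbor matching. The sketch below is one plausible reading, not the module's confirmed behavior: it assumes non-detections (flagged by `nondetect_val` or NaN) are replaced by the band's limiting magnitude, and that colors are differences of adjacent bands. `build_features` is a hypothetical helper.

```python
import math


def build_features(mags, bands, ref_band, mag_limits,
                   nondetect_val=99.0, only_colors=False):
    """Turn one object's magnitudes into a feature vector.

    mags maps band name -> magnitude for a single object.
    Assumptions: non-detections are replaced by the band's limit, and
    colors are differences between adjacent entries of `bands`.
    """
    cleaned = {}
    for band in bands:
        value = mags[band]
        if math.isnan(value) or value == nondetect_val:
            value = mag_limits[band]
        cleaned[band] = value
    colors = [cleaned[a] - cleaned[b] for a, b in zip(bands[:-1], bands[1:])]
    if only_colors:
        # Colors only: the reference-band magnitude is left out.
        return colors
    return [cleaned[ref_band]] + colors
```

With the six default bands this gives five colors, plus the `ref_band` magnitude unless `only_colors` is True.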
- __init__(args, **kwargs)
Constructor: Do CatInformer specific initialization, then check on bands
- entrypoint_function: str | None = 'inform'
- interactive_function: str | None = 'k_near_neigh_informer'
- name = 'KNearNeighInformer'
- run()
Train a KDTree on a fraction of the training data (set by trainfrac); the remainder is used to set the best sigma.
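The training step can be sketched in plain Python to show how `trainfrac`, `seed`, the sigma grid, and the neighbor count fit together. This is an illustration under stated assumptions, not the module's code: a brute-force neighbor search stands in for the KDTree, each neighbor contributes an equally weighted Gaussian of width sigma, and all function names are hypothetical. How candidate sigma values are scored on the held-out data is not specified in the text above, so it is left out.

```python
import math
import random


def split_training(n_objects, trainfrac=0.75, seed=0):
    """Shuffle indices, then split into tree-building and held-out sets."""
    indices = list(range(n_objects))
    random.Random(seed).shuffle(indices)
    n_train = round(n_objects * trainfrac)
    return indices[:n_train], indices[n_train:]


def sigma_grid(sigma_grid_min=0.01, sigma_grid_max=0.075, ngrid_sigma=10):
    """Evenly spaced candidate sigma values (endpoints included, assumed)."""
    step = (sigma_grid_max - sigma_grid_min) / (ngrid_sigma - 1)
    return [sigma_grid_min + i * step for i in range(ngrid_sigma)]


def knn_pdf(query, train_feats, train_z, zgrid, k, sigma):
    """PDF on zgrid from Gaussians centered on the k nearest redshifts.

    Brute-force search replaces the KDTree; equal weights are an assumption.
    """
    order = sorted(range(len(train_feats)),
                   key=lambda i: math.dist(query, train_feats[i]))
    nearest_z = [train_z[i] for i in order[:k]]
    pdf = [sum(math.exp(-0.5 * ((z - zn) / sigma) ** 2) for zn in nearest_z)
           for z in zgrid]
    # Normalize with a simple rectangle rule on the evenly spaced grid.
    norm = sum(pdf) * (zgrid[1] - zgrid[0])
    return [p / norm for p in pdf]
```

In this reading, the informer would evaluate `knn_pdf` on the held-out objects for each sigma in the grid and each neighbor count from `nneigh_min` to `nneigh_max`, keeping the best-scoring combination in the saved model.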