rail.estimation.algos.gpz module
RAIL wrapping of Peter Hatfield’s version of GPz, which can be found at: https://github.com/pwhatfield/GPz_py3
- class rail.estimation.algos.gpz.GPzEstimator
Bases: CatEstimator
Estimate stage for GPz_v1
- Parameters:
output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
chunk_size ([int] default=10000) – Number of objects per chunk for parallel processing, or to evaluate per loop in single-node processing
hdf5_groupname ([str] default=photometry) – name of the HDF5 group for the data; if None, it is set to ‘’
zmin ([float] default=0.0)
zmax ([float] default=3.0)
nzbins ([int] default=301)
id_col ([str] default=object_id) – name of the object ID column
redshift_col ([str] default=redshift) – name of redshift column
calc_summary_stats ([bool] default=False) – Compute summary statistics
calculated_point_estimates ([list] default=[]) – List of strings defining which point estimates to automatically calculate using qp.Ensemble. Options include ‘mean’, ‘mode’, and ‘median’.
recompute_point_estimates ([bool] default=False) – Force recomputation of point estimates
nondetect_val ([float] default=99.0)
mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})
bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])
err_bands ([list] default=['mag_err_u_lsst', 'mag_err_g_lsst', 'mag_err_r_lsst', 'mag_err_i_lsst', 'mag_err_z_lsst', 'mag_err_y_lsst'])
ref_band ([str] default=mag_i_lsst)
log_errors ([bool] default=True) – if true, take log of magnitude errors
replace_error_vals ([list] default=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
model (ModelHandle (INPUT))
input (TableHandle (INPUT))
output (QPHandle (OUTPUT))
- __init__(args, **kwargs)
Constructor: Do CatEstimator specific initialization
- entrypoint_function: str | None = 'estimate'
- interactive_function: str | None = 'gpz_estimator'
- name = 'GPzEstimator'
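The parameters nondetect_val, mag_limits, log_errors, and replace_error_vals are listed above without descriptions. The sketch below shows one plausible reading of how they combine when building the feature vector for a single object. It is an illustration under stated assumptions, not RAIL's implementation: the helper name prepare_row is hypothetical, and the rule that a non-detected magnitude is swapped for its band limit while its error takes the matching entry of replace_error_vals is inferred from the parameter names and defaults.

```python
import math


def prepare_row(mags, errs, bands, err_bands, mag_limits,
                nondetect_val=99.0, log_errors=True, replace_error_vals=None):
    """Build [magnitudes..., errors...] for one object.

    Hypothetical helper for illustration; the replacement rules are assumptions.
    """
    if replace_error_vals is None:
        replace_error_vals = [0.1] * len(bands)
    out_mags, out_errs = [], []
    for band, eband, rval in zip(bands, err_bands, replace_error_vals):
        mag = mags[band]
        err = errs[eband]
        # Treat NaN or the sentinel value as a non-detection (assumption).
        nondetect = math.isnan(mag) or math.isclose(mag, nondetect_val)
        if nondetect:
            out_mags.append(mag_limits[band])  # fall back to the band's limit
            out_errs.append(rval)              # and to the replacement error
        else:
            out_mags.append(mag)
            out_errs.append(math.log(err) if log_errors else err)
    return out_mags + out_errs


# Two-band example: the u band is a non-detection, the g band is measured.
bands = ["mag_u_lsst", "mag_g_lsst"]
err_bands = ["mag_err_u_lsst", "mag_err_g_lsst"]
mag_limits = {"mag_u_lsst": 27.79, "mag_g_lsst": 29.04}
mags = {"mag_u_lsst": 99.0, "mag_g_lsst": 24.5}
errs = {"mag_err_u_lsst": 99.0, "mag_err_g_lsst": 0.05}
row = prepare_row(mags, errs, bands, err_bands, mag_limits)
```

Because the same six parameters appear on both GPzEstimator and GPzInformer, using identical values for the two stages would keep training and estimation inputs on the same footing.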
- class rail.estimation.algos.gpz.GPzInformer
Bases: CatInformer
Inform stage for GPz_v1
- Parameters:
output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
hdf5_groupname ([str] default=photometry) – name of the HDF5 group for the data; if None, it is set to ‘’
nondetect_val ([float] default=99.0)
mag_limits ([dict] default={'mag_u_lsst': 27.79, 'mag_g_lsst': 29.04, 'mag_r_lsst': 29.06, 'mag_i_lsst': 28.62, 'mag_z_lsst': 27.98, 'mag_y_lsst': 27.05})
train_frac ([float] default=0.75) – fraction of training data used to make tree, rest used to set best sigma
seed ([int] default=87) – random seed
bands ([list] default=['mag_u_lsst', 'mag_g_lsst', 'mag_r_lsst', 'mag_i_lsst', 'mag_z_lsst', 'mag_y_lsst'])
err_bands ([list] default=['mag_err_u_lsst', 'mag_err_g_lsst', 'mag_err_r_lsst', 'mag_err_i_lsst', 'mag_err_z_lsst', 'mag_err_y_lsst'])
redshift_col ([str] default=redshift)
gpz_method ([str] default=VC) – method to be used in GPz, options are ‘GL’, ‘VL’, ‘GD’, ‘VD’, ‘GC’, and ‘VC’
n_basis ([int] default=50) – number of basis functions used
learn_jointly ([bool] default=True) – if True, jointly learns prior linear mean function
hetero_noise ([bool] default=True) – if True, learns heteroscedastic noise process; set to False for point estimates
csl_method ([str] default=normal) – cost sensitive learning type, ‘balanced’, ‘normalized’, or ‘normal’
csl_binwidth ([float] default=0.1) – width of bin for ‘balanced’ cost sensitive learning
pca_decorrelate ([bool] default=True) – if True, decorrelate data using PCA as preprocessing stage
max_iter ([int] default=200) – max number of iterations
max_attempt ([int] default=100) – max iterations if no progress on validation
log_errors ([bool] default=True) – if true, take log of magnitude errors
replace_error_vals ([list] default=[0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
input (TableHandle (INPUT))
model (ModelHandle (OUTPUT))
- __init__(args, **kwargs)
Constructor: Do CatInformer specific initialization
- entrypoint_function: str | None = 'inform'
- interactive_function: str | None = 'gpz_informer'
- name = 'GPzInformer'
- run()
Train the GPz model after splitting the training data into training and validation sets
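The split performed by run() is governed by train_frac and seed. As a minimal sketch of what such a split could look like, the standard-library example below shuffles object indices reproducibly and divides them according to train_frac. The function name split_train_validation and the use of Python's random module are assumptions made for illustration; the actual stage may use a different random number generator and rounding rule.

```python
import random


def split_train_validation(n_objects, train_frac=0.75, seed=87):
    """Shuffle indices reproducibly, then split into train/validation lists.

    Hypothetical helper; defaults mirror the documented train_frac and seed.
    """
    rng = random.Random(seed)          # seeded generator for repeatability
    indices = list(range(n_objects))
    rng.shuffle(indices)
    n_train = int(round(train_frac * n_objects))
    return indices[:n_train], indices[n_train:]


train_idx, valid_idx = split_train_validation(1000)
```

With the documented defaults, 1000 objects give 750 training and 250 validation indices, and repeating the call with the same seed returns the same partition.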