rail.tools.table_tools module
Stages that implement utility functions
- class rail.tools.table_tools.ColumnMapper
Bases: RailStage
Utility stage that remaps the names of columns.
This operates on pandas DataFrames in parquet files.
In short, this does: output_data = input_data.rename(columns=self.config.columns, inplace=self.config.in_place)
- Parameters:
output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
columns ([dict] (required)) – Map of columns to rename
in_place ([bool] default=False) – Update file in place
input (PqHandle (INPUT))
output (PqHandle (OUTPUT))
- entrypoint_function: str | None = '__call__'
- inputs = [('input', <class 'rail.core.data.PqHandle'>)]
- interactive_function: str | None = 'column_mapper'
- name = 'ColumnMapper'
- outputs = [('output', <class 'rail.core.data.PqHandle'>)]
- run()
Run the stage and return the execution status.
Subclasses must implement this method.
- Return type:
None
- stage_columns: list[str] | None
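The renaming that ColumnMapper describes can be sketched with plain pandas. This is a minimal illustration, not the stage's own code: the table contents and the column map are hypothetical, and only `DataFrame.rename` is used. It also shows why the in-place case needs care, since pandas returns `None` when `inplace=True`.

```python
import pandas as pd

# Hypothetical input table and column map, for illustration only.
input_data = pd.DataFrame({"ra": [10.0, 20.0], "dec": [-5.0, 5.0]})
columns = {"ra": "right_ascension", "dec": "declination"}

# With inplace=False (the pandas default), rename returns a new DataFrame
# and leaves the input untouched.
output_data = input_data.rename(columns=columns, inplace=False)

# With inplace=True, rename modifies the frame itself and returns None,
# so the renamed table is the original object, not the return value.
scratch = input_data.copy()
result = scratch.rename(columns=columns, inplace=True)
```

Keys in the map that do not match any column are ignored by default, so a typo in a column name silently leaves that column unchanged.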
- class rail.tools.table_tools.RowSelector
Bases: RailStage
Utility stage that sub-selects rows from a table by index.
This operates on pandas DataFrames in parquet files.
In short, this does: output_data = input_data[self.config.start_row:self.config.stop_row]
- Parameters:
output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
start_row ([int] (required)) – starting row number
stop_row ([int] (required)) – Stopping row number
input (PqHandle (INPUT))
output (PqHandle (OUTPUT))
- entrypoint_function: str | None = '__call__'
- inputs = [('input', <class 'rail.core.data.PqHandle'>)]
- interactive_function: str | None = 'row_selector'
- name = 'RowSelector'
- outputs = [('output', <class 'rail.core.data.PqHandle'>)]
- run()
Run the stage and return the execution status.
Subclasses must implement this method.
- Return type:
None
- stage_columns: list[str] | None
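The row selection that RowSelector describes can be sketched with plain pandas slicing. This is a minimal illustration with a hypothetical table and hypothetical row bounds; it assumes the default integer index, where an integer slice is positional and excludes the stop row, as with Python lists.

```python
import pandas as pd

# Hypothetical ten-row table, for illustration only.
input_data = pd.DataFrame(
    {"id": list(range(10)), "mag": [20.0 + 0.5 * i for i in range(10)]}
)

# Hypothetical bounds standing in for the start_row and stop_row parameters.
start_row, stop_row = 2, 5

# Integer slicing selects rows by position; stop_row itself is excluded.
output_data = input_data[start_row:stop_row]

# The same selection written with explicit positional indexing.
same_rows = input_data.iloc[start_row:stop_row]
```

The selected rows keep their original index labels (2, 3, 4 here); call `reset_index(drop=True)` on the result if a fresh zero-based index is wanted.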
- class rail.tools.table_tools.TableConverter
Bases: RailStage
Utility stage that converts tables from one format to another.
FIXME, this is hardwired to convert parquet tables to Hdf5Tables. It would be nice to have more options here.
- Parameters:
output_mode ([str] default=default) – What to do with the outputs. The options are ‘default’, where outputs will be written to files and some returned, and ‘return’, where outputs will only be returned and not written.
output_format ([str] (required)) – Format of output table
input (PqHandle (INPUT))
output (Hdf5Handle (OUTPUT))
- entrypoint_function: str | None = '__call__'
- inputs = [('input', <class 'rail.core.data.PqHandle'>)]
- interactive_function: str | None = 'table_converter'
- name = 'TableConverter'
- outputs = [('output', <class 'rail.core.data.Hdf5Handle'>)]
- run()
Run the stage and return the execution status.
Subclasses must implement this method.
- Return type:
None
- stage_columns: list[str] | None
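The conversion code of TableConverter is not shown on this page, so the following is only a sketch of the general idea. It assumes that an HDF5-style table is held in memory as a mapping from column name to a one-dimensional array. The helper `dataframe_to_column_dict` is hypothetical, and no file is read or written.

```python
import numpy as np
import pandas as pd


def dataframe_to_column_dict(df: pd.DataFrame) -> dict[str, np.ndarray]:
    """Hypothetical helper: return {column name: 1-D numpy array}."""
    return {str(name): df[name].to_numpy() for name in df.columns}


# Hypothetical table standing in for data loaded from a parquet file.
table = pd.DataFrame({"id": [1, 2, 3], "mag": [21.5, 22.0, 23.1]})

# Column-wise view of the same data; each value is a numpy array.
column_dict = dataframe_to_column_dict(table)
```

A column-wise layout like this maps naturally onto HDF5, where each array can be stored as its own dataset. The row index of the DataFrame is dropped in the process.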