Feature: EventPreprocessor #2928
Merged
Changes from 9 commits
05fe242 add EventPreprocessor (kosack)
2567584 added altaz_to_fov helper (kosack)
4a553a1 added changelog (kosack)
e93db12 add alt_az_to_fov to init (kosack)
f1a15a3 add missing config=True tag (kosack)
65d1f01 fix some docstring/type annotation warnings (kosack)
6bea57c Don't use GADF FOV convention by default (kosack)
6e97c93 rename function in test too (kosack)
87b43dc fix test after GADF -> Nominal change (kosack)
5da834a fix links in changelog (kosack)
096fd64 pass parent to predefined QualityQuery (kosack)
3fd0513 remove old comment (kosack)
fa6de2a remove unnecessary conversion (kosack)
fdffce3 fix links in changelog (kosack)
88657f5 fix docstring typo and attribute (kosack)
eba8c25 fix wrong inputs for angular_separation (kosack)
1f7bbd8 use a FeatureSetRegistry for FeatureSets (kosack)
243ca62 update changelog (kosack)
9b32d86 show better example (kosack)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| Introduces the `~ctapipe.io.EventPreprocessor` class that can generically | ||
| transform an event table by applying the following steps: | ||
|
|
||
| * Generate new or rename existing columns with a `~ctapipe.core.FeatureGenerator` | ||
| * Select "good" event rows with a `~ctapipe.core.QualityQuery` | ||
| * Select which columns to output (by setting the ``features`` configuration | ||
| attribute of the `~ctapipe.io.EventPreprocessor`) | ||
|
|
||
| This is useful for doing the final steps of DL2 processing, and will eventually | ||
| replace what is in `DL2EventPreprocessor` and `DL2EventLoader`, which will be | ||
| deprecated in a future release. | ||
|
|
||
| The `~ctapipe.io.EventPreprocessor` also includes the ability to pre-configure | ||
| itself for specific use cases by setting the ``feature_set`` option. Currently | ||
| only two `~ctapipe.io.PreprocessorFeatureSet` values are implemented: | ||
| `feature_set=dl2_irf`, which defines the transforms, event selection, and output | ||
| features for processing simulated DL2 events, and `feature_set=custom`, which | ||
| has no pre-configuration and requires all parameters to be set by the user in a | ||
| config file. | ||
|
|
||
| The functionality of `DL2EventLoader` can be mimicked with the following: | ||
|
|
||
| .. code-block:: python | ||
|
|
||
| from ctapipe.io import TableLoader, EventPreprocessor | ||
| from astropy.table import QTable, vstack | ||
|
|
||
| DL2FILE = "some_dl2_file.h5" | ||
| loader = TableLoader(DL2FILE, dl2=True, simulated=True, observation_info=True) | ||
| preprocess = EventPreprocessor(feature_set="dl2_irf") | ||
| events = vstack( | ||
| [ | ||
| preprocess(QTable(c.data)) | ||
| for c in loader.read_subarray_events_chunked(chunk_size=100_000) | ||
| ] | ||
| ) | ||
|
|
||
|
|
||
| This also introduces a helper function `~ctapipe.coordinates.altaz_to_nominal` to | ||
| convert columns of alt/az coordinates to FOV coordinates in the | ||
| `~ctapipe.coordinates.NominalFrame`, which works with the | ||
| `~ctapipe.core.FeatureGenerator`. | ||
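The three steps listed in the changelog entry above (generate columns, select rows, select output columns) can be illustrated without ctapipe. The following is only a toy sketch on plain Python dictionaries. The `preprocess` function, its arguments, and the sample column names are hypothetical and are not part of the ctapipe API.

```python
"""Toy illustration of the generate -> filter -> select steps (not ctapipe code)."""


def preprocess(rows, generators, criteria, features):
    """Apply the three steps to a list of dict rows.

    generators: list of (new_name, function(row) -> value), applied in order
    criteria:   list of (description, function(row) -> bool)
    features:   names of the columns to keep in the output
    """
    output = []
    for row in rows:
        # 1. generate new columns (or rename existing ones); later
        #    generators may use columns created by earlier ones
        generated = dict(row)
        for name, func in generators:
            generated[name] = func(generated)

        # 2. keep only rows that pass every quality criterion
        if not all(check(generated) for _, check in criteria):
            continue

        # 3. keep only the requested output columns
        output.append({name: generated[name] for name in features})
    return output


# Hypothetical input rows standing in for an event table
rows = [
    {"event_id": 1, "Reco_energy": 1.5, "Reco_is_valid": True, "n_tels": 5},
    {"event_id": 2, "Reco_energy": 0.2, "Reco_is_valid": False, "n_tels": 6},
    {"event_id": 3, "Reco_energy": 3.0, "Reco_is_valid": True, "n_tels": 2},
]
generators = [
    ("reco_energy", lambda r: r["Reco_energy"]),
    ("multiplicity", lambda r: r["n_tels"]),
]
criteria = [
    ("valid energy", lambda r: r["Reco_is_valid"]),
    ("sufficient multiplicity", lambda r: r["multiplicity"] >= 4),
]
selected = preprocess(
    rows, generators, criteria, ["event_id", "reco_energy", "multiplicity"]
)
```

Only the first row survives here: the second fails the validity criterion and the third fails the multiplicity cut. As in the class under review, the criteria may refer to generated columns because the selection runs after the generation step.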
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,205 @@ | ||
| """Module containing classes related to event loading and preprocessing""" | ||
|
|
||
| from enum import StrEnum, auto | ||
|
|
||
| from astropy.coordinates import angular_separation | ||
| from traitlets import default | ||
|
|
||
| from ..coordinates import altaz_to_nominal | ||
| from ..core import ( | ||
| Component, | ||
| FeatureGenerator, | ||
| QualityQuery, | ||
| ToolConfigurationError, | ||
| traits, | ||
| ) | ||
|
|
||
| __all__ = ["EventPreprocessor"] | ||
|
|
||
|
|
||
| class PreprocessorFeatureSet(StrEnum): | ||
| """Pre-defined configurations for EventPreprocessor for specific use cases.""" | ||
|
|
||
| custom = auto() #: use user-supplied configuration | ||
| dl2_irf = auto() #: support IRF preprocessing use case | ||
|
|
||
|
|
||
| class EventPreprocessor(Component): | ||
| """ | ||
| Selects or generates features and filters tables of events. | ||
|
|
||
| In normal use, one only has to specify the ``feature_set`` option, which | ||
| will generate features that support standard use cases. For advanced usage, you | ||
| can set ``feature_set=custom`` and pass in a configured | ||
| `~ctapipe.core.FeatureGenerator` and set the ``features`` property of this | ||
| class with the columns you want to retain in the output table. | ||
|
|
||
| In the `~ctapipe.core.FeatureGenerator` used internally, you have access to several | ||
| additional functions useful for DL2 processing: | ||
|
|
||
| - `~astropy.coordinates.angular_separation` | ||
| - `~ctapipe.coordinates.altaz_to_nominal` | ||
| """ | ||
|
|
||
| energy_reconstructor = traits.Unicode( | ||
| default_value="RandomForestRegressor", | ||
| help="Prefix of the reco `_energy` column", | ||
| ).tag(config=True) | ||
|
|
||
| geometry_reconstructor = traits.Unicode( | ||
| default_value="HillasReconstructor", | ||
| help="Prefix of the `_alt` and `_az` reco geometry columns", | ||
| ).tag(config=True) | ||
|
|
||
| gammaness_reconstructor = traits.Unicode( | ||
| default_value="RandomForestClassifier", | ||
| help="Prefix of the classifier `_prediction` column", | ||
| ).tag(config=True) | ||
|
|
||
| feature_set = traits.UseEnum( | ||
| PreprocessorFeatureSet, | ||
| default_value=PreprocessorFeatureSet.dl2_irf, | ||
| help=( | ||
| "Set up the FeatureGenerator.features, output features, and quality criteria " | ||
| "based on standard use cases. " | ||
| "Specify 'custom' if you want to set your own in your config file. If this is set to " | ||
| "any value other than 'custom', the feature properties of the configuration " | ||
| "file you pass in will be overridden." | ||
| ), | ||
| ).tag(config=True) | ||
|
|
||
| features = traits.List( | ||
| traits.Unicode(), | ||
| help=( | ||
| "Features (columns) to retain in the output. " | ||
| "These can include columns generated by the FeatureGenerator. " | ||
| "If you set these, make sure feature_set=custom." | ||
| ), | ||
| ).tag(config=True) | ||
|
|
||
| def __init__(self, config=None, parent=None, **kwargs): | ||
| super().__init__(config=config, parent=parent, **kwargs) | ||
| if self.feature_set == PreprocessorFeatureSet.custom: | ||
| self.feature_generator = FeatureGenerator(parent=self) | ||
| self.quality_query = QualityQuery(parent=self) | ||
| else: | ||
| self.feature_generator = FeatureGenerator( | ||
| features=self._get_predefined_features_to_generate() | ||
| ) | ||
| self.quality_query = QualityQuery( | ||
| quality_criteria=self._get_predefined_quality_criteria() | ||
| ) | ||
| # sanity checks: | ||
| if len(self.features) == 0: | ||
| raise ToolConfigurationError( | ||
| "EventPreprocessor has no output features configured. " | ||
| "You have set `feature_set=custom`, but did not provide the list " | ||
| "of features in the configuration (EventPreprocessor.features)." | ||
| ) | ||
|
|
||
| def __call__(self, table): | ||
| """Return new table with only the columns in features.""" | ||
|
|
||
| # generate new features, which includes renaming columns: | ||
| generated = self.feature_generator( | ||
| table, | ||
| angular_separation=angular_separation, | ||
| altaz_to_nominal=altaz_to_nominal, | ||
| ) | ||
|
|
||
| # apply event selection on the resulting table | ||
|
|
||
| selected_mask = self.quality_query.get_table_mask(generated) | ||
|
|
||
| # return only the columns specified in `self.features`, and rows in | ||
| # `selected_mask` | ||
| return generated[self.features][selected_mask] | ||
|
|
||
| def _get_predefined_features_to_generate(self) -> list[tuple]: | ||
| """Return a default list of FeatureGenerator features.""" | ||
| if self.feature_set == PreprocessorFeatureSet.dl2_irf: | ||
| # Default features for DL2/Subarray events | ||
| return [ | ||
| ("reco_energy", f"{self.energy_reconstructor}_energy"), | ||
| ("reco_alt", f"{self.geometry_reconstructor}_alt"), | ||
| ("reco_az", f"{self.geometry_reconstructor}_az"), | ||
| ("gh_score", f"{self.gammaness_reconstructor}_prediction"), | ||
| ("theta", "angular_separation(reco_az, reco_alt, true_az, true_alt)"), | ||
| ( | ||
| "reco_fov_coord", | ||
| "altaz_to_nominal(reco_az, reco_alt, subarray_pointing_lon, subarray_pointing_lat)", | ||
| ), | ||
| ( | ||
| "reco_fov_lon", | ||
| "reco_fov_coord[:,0]", | ||
| ), # note: GADF IRFs use the negative of this | ||
| ("reco_fov_lat", "reco_fov_coord[:,1]"), | ||
| ( | ||
| "true_fov_coord", | ||
| "altaz_to_nominal(true_az, true_alt, subarray_pointing_lon, subarray_pointing_lat)", | ||
| ), | ||
| ( | ||
| "true_fov_lon", | ||
| "true_fov_coord[:,0]", | ||
| ), # note: GADF IRFs use the negative of this | ||
| ("true_fov_lat", "true_fov_coord[:,1]"), | ||
| ( | ||
| "true_fov_offset", | ||
| "angular_separation(true_fov_lon, true_fov_lat, 0*u.deg, 0*u.deg)", | ||
| ), | ||
| ( | ||
| "reco_fov_offset", | ||
| "angular_separation(reco_fov_lon, reco_fov_lat, 0*u.deg, 0*u.deg)", | ||
| ), | ||
| ( | ||
| "multiplicity", | ||
| f"np.count_nonzero({self.gammaness_reconstructor}_telescopes,axis=1)", | ||
| ), | ||
| ] | ||
| else: | ||
| raise NotImplementedError(f"unsupported feature_set: {self.feature_set}") | ||
|
|
||
| def _get_predefined_quality_criteria(self) -> list[tuple]: | ||
| """ | ||
| Return the quality criteria for a PreprocessorFeatureSet. | ||
|
|
||
| Here you can use any columns in the input table, or any that are | ||
| specified in the FeatureGenerator. | ||
| """ | ||
| if self.feature_set == PreprocessorFeatureSet.dl2_irf: | ||
| return [ | ||
| ("valid geometry", f"{self.geometry_reconstructor}_is_valid"), | ||
| ("valid energy", f"{self.energy_reconstructor}_is_valid"), | ||
| ("valid gammaness", f"{self.gammaness_reconstructor}_is_valid"), | ||
| ("sufficient multiplicity", "multiplicity >= 4"), | ||
| ] | ||
| else: | ||
| raise NotImplementedError(f"unsupported feature_set: {self.feature_set}") | ||
|
|
||
| @default("features") | ||
| def default_features(self): | ||
| """Set the columns to output, for a given FeatureSet.""" | ||
| if self.feature_set == PreprocessorFeatureSet.dl2_irf: | ||
| return [ | ||
| "event_id", | ||
| "obs_id", | ||
| "reco_energy", | ||
| "reco_alt", | ||
| "reco_az", | ||
| "gh_score", | ||
| "true_energy", | ||
| "true_alt", | ||
| "true_az", | ||
| "true_fov_offset", | ||
| "reco_fov_offset", | ||
| "theta", | ||
| "reco_fov_lat", | ||
| "true_fov_lat", | ||
| "reco_fov_lon", | ||
| "true_fov_lon", | ||
| "multiplicity", | ||
| ] | ||
| elif self.feature_set == PreprocessorFeatureSet.custom: | ||
| return [] | ||
| else: | ||
| raise NotImplementedError(f"unsupported feature_set: {self.feature_set}") | ||
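The two offset features in `_get_predefined_features_to_generate` are angular separations between a FOV coordinate and the FOV center at (0, 0). As a rough, dependency-free sketch of that calculation, the following uses plain `math` with the Vincenty formula and the `(lon1, lat1, lon2, lat2)` argument order of `astropy.coordinates.angular_separation`. The `fov_offset` helper is hypothetical, and the sketch works on floats in radians rather than on astropy quantities.

```python
import math


def angular_separation(lon1, lat1, lon2, lat2):
    """Great-circle separation in radians (Vincenty formula).

    All inputs are in radians. The argument order (lon1, lat1, lon2, lat2)
    follows astropy.coordinates.angular_separation.
    """
    dlon = lon2 - lon1
    num1 = math.cos(lat2) * math.sin(dlon)
    num2 = math.cos(lat1) * math.sin(lat2) - math.sin(lat1) * math.cos(
        lat2
    ) * math.cos(dlon)
    denom = math.sin(lat1) * math.sin(lat2) + math.cos(lat1) * math.cos(
        lat2
    ) * math.cos(dlon)
    return math.atan2(math.hypot(num1, num2), denom)


def fov_offset(fov_lon, fov_lat):
    """Offset of a FOV coordinate from the FOV center at (0, 0), in radians."""
    return angular_separation(fov_lon, fov_lat, 0.0, 0.0)


# The true offset must use the true coordinates and the reconstructed offset
# the reconstructed ones; mixing the two pairs gives a meaningless value.
true_fov_offset = fov_offset(math.radians(0.0), math.radians(1.0))
reco_fov_offset = fov_offset(math.radians(1.0), math.radians(0.0))
```

Both example points lie one degree from the center, one along the latitude axis and one along the longitude axis, so both offsets come out as one degree expressed in radians.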