I have a data set of particle physics events. An event can be seen as a training instance. Each event contains various particles, and these particles have various characteristics (energy, momentum, etc.). An example particle is an electron. Not every event contains an electron, but when one does, the characteristics of the electron are available. If an electron is in the event, its characteristic values, such as momentum, are saved (e.g. 107425.323473); if no electron is in the event, its characteristic values are set to some code number (e.g. -999).
How should data like this be preprocessed (e.g. with sklearn.preprocessing.MinMaxScaler(feature_range=(-1, 1)))? I am keen to use the data with a variety of deep learning algorithms in TensorFlow.
In a sense, I am asking how TensorFlow could be told that certain values of a tensor (or an image, or however the data is formulated) are inactive.
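To make the concern concrete, here is a minimal sketch of what I mean. It uses plain NumPy with a hand-written min-max scaling to [-1, 1], which I assume mirrors what MinMaxScaler(feature_range=(-1, 1)) does for a single column. The momentum values other than 107425.323473 are made up for illustration, and the helper name minmax_scale is hypothetical. The second half shows one workaround I can imagine (scale only the present values and carry an explicit presence flag), but I do not know whether this is the appropriate way to handle it.

```python
import numpy as np

SENTINEL = -999.0  # code number meaning "no electron in this event"


def minmax_scale(values, lo=-1.0, hi=1.0):
    """Linearly rescale a 1-D array so its min maps to lo and its max to hi."""
    vmin, vmax = values.min(), values.max()
    return lo + (values - vmin) * (hi - lo) / (vmax - vmin)


# Hypothetical electron momentum column: one entry per event.
# Only the first value is taken from the question; the rest are invented.
momentum = np.array([107425.323473, SENTINEL, 52000.0, SENTINEL, 250000.0])

# Naive scaling: the sentinel is treated as a real measurement, so it
# becomes the column minimum and defines the bottom of the scaled range.
naive = minmax_scale(momentum)

# Possible workaround (an assumption, not an established recipe):
# scale only the events that contain an electron, fill the others with 0,
# and add a 0/1 flag so a model can tell "absent" from a real mid-range value.
present = momentum != SENTINEL
scaled = np.zeros_like(momentum)
scaled[present] = minmax_scale(momentum[present])
features = np.stack([scaled, present.astype(momentum.dtype)], axis=1)

print(naive)
print(features)
```

In the naive version the code number -999 ends up at -1, as if it were a very low momentum, which is exactly what I would like to avoid.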