Advice to find faint periodic signals in time series data using deep learning methods

Question

We are working with a few petabytes of time series astronomy data. The general aim is to find very faint periodic signals within it.

Our current method of processing this data is to take a fast Fourier transform of the time series and look for peaks in Fourier space. However, most of the periodic signals are false positives caused by what we call radio frequency interference (RFI). Generally, 1 in 10,000 periodic signals is a true candidate. So the task of the neural network is to decide which of them are true candidates.
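A minimal sketch of the kind of Fourier peak search described above, assuming NumPy, an evenly sampled series, and roughly white noise. The names `find_spectral_peaks`, `sample_rate_hz`, and `power_threshold` are illustrative only and are not taken from the actual pipeline, which involves further steps.

```python
import numpy as np

def find_spectral_peaks(series, sample_rate_hz, power_threshold=20.0):
    """Return (frequency_hz, normalised_power) pairs above power_threshold,
    strongest first. The threshold value is an arbitrary illustration."""
    series = np.asarray(series, dtype=float)
    series = series - series.mean()              # remove the constant offset
    power = np.abs(np.fft.rfft(series)) ** 2     # power spectrum
    freqs = np.fft.rfftfreq(series.size, d=1.0 / sample_rate_hz)
    power, freqs = power[1:], freqs[1:]          # drop the zero-frequency bin
    # For white Gaussian noise the per-bin power is exponentially distributed,
    # so median / ln(2) is a robust estimate of the mean noise power.
    noise_level = np.median(power) / np.log(2.0)
    normalised = power / noise_level
    keep = np.flatnonzero(normalised > power_threshold)
    keep = keep[np.argsort(normalised[keep])[::-1]]
    return [(float(freqs[i]), float(normalised[i])) for i in keep]

# Usage: a weak 50 Hz sinusoid buried in unit-variance noise.
rng = np.random.default_rng(0)
t = np.arange(16384) / 1000.0
data = 0.2 * np.sin(2 * np.pi * 50.0 * t) + rng.normal(size=t.size)
candidates = find_spectral_peaks(data, sample_rate_hz=1000.0)
```

Every bin that passes the threshold is only a candidate: a search like this cannot tell a pulsar from interference, which is why a classification step is needed afterwards.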

Supervised learning methods have been explored in the past with mixed success, but well-labelled training data is generally hard to come by. The interference environment also depends on which telescope was used, and it can change with time. My question is whether reinforcement learning methods are well suited to this problem. Do you think the Monte Carlo tree search algorithm could help here? Any advice would be appreciated.

P.S. In case it is important to know, the current processing of the data involves several steps after the fast Fourier transform as well. If this needs to be expanded upon, please let me know.

Monte Carlo tree search and reinforcement learning still require some kind of outcome/reward in order to work, so if you can't get well-labeled data to use with supervised methods, I'm not sure they would help. (Actually, it sounds like you have a handful of true positives as well.) Are you looking for true periodic signals (say, planets orbiting a star), not semi-periodic signals? – Wayne, Dec 11 '17 at 00:27
I do have a handful of true positives (~2000 of them) and a lot of false positives (~20 million) for the neural net to train on. I am looking for true periodic signals (pulsars, i.e. rotating neutron stars). In simple cases, their periods are very regular. However, the periodicity can change, especially if the source is located in an accelerated binary system. – Vishnu, Dec 11 '17 at 00:46

andfor · Answer 1 · 2017-12-11T08:34:58.420

2

It is quite difficult to say something specific without access to the data. But if you have had problems separating true positives from false positives with supervised learning in the past, then perhaps the data itself may be the problem? Perhaps insufficient information makes it impossible to actually separate true and false positives.

I don't know how feasible it is in your specific field or with this data, but could it perhaps be possible to leverage knowledge about the physical laws of pulsars in order to separate them from RFI? Can you perhaps filter out false positives by checking whether each candidate follows those laws? Just throwing out some ideas.
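As one illustration of such a physical check (an example that is not taken from this thread): radio pulses that cross the interstellar medium arrive later at lower frequencies, with a delay proportional to the dispersion measure (DM), while interference generated near the telescope shows essentially no dispersion. The sketch below computes the standard cold-plasma delay. The helper `looks_terrestrial` and its `dm_floor` cut-off are hypothetical, and a real cut would need tuning per survey.

```python
# Dispersion constant in s MHz^2 pc^-1 cm^3 (standard cold-plasma value).
K_DM_S = 4.148808e3

def dispersion_delay_s(dm, f_low_mhz, f_high_mhz):
    """Arrival delay in seconds of f_low_mhz relative to f_high_mhz
    for a dispersion measure dm given in pc cm^-3."""
    return K_DM_S * dm * (f_low_mhz ** -2 - f_high_mhz ** -2)

def looks_terrestrial(best_dm, dm_floor=2.0):
    """Hypothetical filter: flag a candidate whose signal is strongest at
    (near) zero DM. The dm_floor value is an assumption, not a standard."""
    return best_dm < dm_floor

# Usage: delay across a 1200-1500 MHz band for DM = 100 pc cm^-3.
delay = dispersion_delay_s(100.0, 1200.0, 1500.0)
```

A check like this only rejects undispersed signals. It does not confirm that a dispersed candidate is a pulsar.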

There must be plenty of literature on filtering out RFI. Is there any reason why the methods in the current literature are not used?

edited Dec 11 '17 at 08:34

answered Dec 11 '17 at 08:29


There are methods we currently use to separate RFI from pulsars. One of them involves a human expert looking at a set of diagnostic plots and classifying sources based on them. This approach becomes infeasible as the data volume increases. We have used convolutional neural networks to identify features from these diagnostic plots. This does give us accuracy above 90%. This method has two drawbacks: – Vishnu Dec 11 '17 at 09:10
1. It needs well-labeled data from different telescopes, as the nature of RFI changes with location. 2. Getting to the point of having diagnostic plots involves a lot of computation. So my question is: if I am able to model the rules of pulsar searching, is it a good idea to apply reinforcement learning to this problem? – Vishnu Dec 11 '17 at 09:15
Do I understand you correctly that it is (generally) possible to separate pulsar data from RFI data, but the problem is that the RFI manifests in different ways depending on location? Is there a reason why you cannot use one-class supervised learning and just focus on learning the structure of pulsar data, if that data is quite constant? – andfor Dec 11 '17 at 09:29
In theory, if an expert human looks through the entire dataset, yes, we can mostly filter out RFI candidates. RFI is not only location dependent but in general also unpredictable. Someone could switch on a phone, and that could produce a false positive (RFI). There are other features we can use to filter out such candidates. You are right, we can use supervised learning, and this has been done before. I asked the question to understand whether it is possible to improve on this method by using reinforcement learning. – Vishnu Dec 11 '17 at 10:50
Say, can Monte Carlo tree search figure out on its own that it needs to do an FFT search if I define the reward as finding a pulsar? (Assuming I have modeled the problem properly.) – Vishnu Dec 11 '17 at 10:52
Sorry for using the abbreviation. FFT refers to fast Fourier transform. – Vishnu Dec 11 '17 at 11:09
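
To put the figures quoted in this thread side by side, here is a rough base-rate calculation. It assumes that "accuracy above 90%" means 90% sensitivity and 90% specificity, which the thread does not state, and it uses the 1 in 10,000 rate of true candidates from the question. The function name `precision` is illustrative.

```python
def precision(prevalence, sensitivity, specificity):
    """Fraction of candidates flagged as positive that are truly positive."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)

# Assumed numbers: 1 true signal per 10,000 candidates, 90% / 90% classifier.
p = precision(prevalence=1e-4, sensitivity=0.9, specificity=0.9)
```

Under these assumptions only about 0.09% of the flagged candidates would be real, so the false positive rate matters far more than the headline accuracy at this level of class imbalance.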
