I'm new to Machine Learning.

Q: How can I use ML to classify graphs?

My goal is to create a machine learning model that takes a function as input and returns a string classifying its graph. For example, given $\sin(x)$, the model should return "sinusoidal". Of course, I'll restrict the types of graphs that can be classified (i.e., only sinusoidals, exponentials, and Lorentzians). For instance: [image]

Even though this is a simple classification task, I'm not sure how to tackle it. Here are some of my questions:

  • Is there a specific Python ML library suited to this type of classification task?
  • Is there an existing scikit-learn model that I can adapt to my needs?
  • How should I acquire my training data? On the one hand, I could manually take hundreds of screenshots of different graphs, but that would be a lot of work. Alternatively, much like I did [here](https://stats.stackexchange.com/a/535061/328135), I can randomly generate graphs and then feed points sampled from them into the machine learning model. The second method seems much easier, but then I can't use an image-based CNN anymore. What can I use instead, if anything?
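
A minimal sketch of the second option (randomly generated curves, sampled points fed to a classifier) might look like the following. Everything specific in it is an illustrative assumption rather than something settled in this question: the 64-point grid on [-5, 5], the parameter ranges, the noise level, the min-max rescaling, and the choice of scikit-learn's `RandomForestClassifier` as the model.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
x = np.linspace(-5, 5, 64)  # shared sampling grid (assumed)

# One generator per family; all parameter ranges are hypothetical.
def sinusoid():
    a, w, phi = rng.uniform(0.5, 2), rng.uniform(1, 3), rng.uniform(0, 2 * np.pi)
    return a * np.sin(w * x + phi)

def exponential():
    a, k = rng.uniform(0.5, 2), rng.uniform(0.2, 1)
    return a * np.exp(k * x)

def lorentzian():
    a, x0, g = rng.uniform(0.5, 2), rng.uniform(-2, 2), rng.uniform(0.3, 1.5)
    return a * g**2 / ((x - x0) ** 2 + g**2)

generators = {
    "sinusoidal": sinusoid,
    "exponential": exponential,
    "lorentzian": lorentzian,
}

def rescale(curve):
    # Min-max scale to [0, 1] so the classifier sees shape, not amplitude.
    return (curve - curve.min()) / (curve.max() - curve.min())

def make_dataset(n_per_class=300, noise=0.02):
    # Each row of X is one noisy, rescaled curve; y holds the family name.
    X, y = [], []
    for label, generate in generators.items():
        for _ in range(n_per_class):
            curve = generate() + rng.normal(0, noise, x.size)
            X.append(rescale(curve))
            y.append(label)
    return np.array(X), np.array(y)

X, y = make_dataset()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")

# Classify a new curve sampled on the same grid.
predicted = clf.predict(rescale(np.sin(x)).reshape(1, -1))[0]
print(predicted)
```

Treating the sampled points as a fixed-length feature vector means any standard tabular classifier can be swapped in for the random forest; the catch is that every curve has to be sampled on the same grid and rescaled the same way at prediction time.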

Any help is appreciated. Thank you.

BR56
  • If you'd like to reframe an unanswered question by adding additional context or specificity, the best way to do that is via an [edit]. More information on how to use this website is in the [help]. – Sycorax Jul 18 '21 at 21:56
  • Hi @Sycorax, what should I do? Revise this question or append to the previous one? Galen informed me that I should open a new question, which is why I did so. – BR56 Jul 18 '21 at 21:58
  • @Sycorax I think the earlier question was about estimating parameters. This question is about training an ML classifier to determine family of model. – DifferentialPleiometry Jul 18 '21 at 22:01
  • I hadn't seen that comment thread. In light of your recently-added answer, I see that this question is no longer a refinement of an existing, unanswered question but instead a new question. I'll happily reopen this if you can be more specific about what it is that you want to know. As it stands, there's no explicit question in the post, so it's not clear if you're asking for code (which is off-topic) or if you have a statistical question (and if so, what it is). Please [edit] to clarify. – Sycorax Jul 18 '21 at 22:02
  • @Refath What is this classifier for? Is this something where you want the best-fitting model? In which case, I recommend you look into model-selection techniques. – DifferentialPleiometry Jul 18 '21 at 22:02
  • Hi @Sycorax, I clarified my question at the top. Let me know if I should revise it. Hi Galen, my ultimate goal is to create a "function classifier", which can take a limited set of graphs as an input, return what _type_ of graph it is (e.g., exponential) and then return the _parameters_ (e.g., amplitude, frequency, etc.) of that graph (which is what my previous question+answer was regarding) – BR56 Jul 18 '21 at 22:05
  • Thanks. This revision, which asks for a software library recommendation, is not specific enough to be suitable. With a greater or lesser amount of effort, any modern machine learning library could be employed. My recommendation is to read about different computer vision methods. [tag:conv-neural-network]s are one popular method, but HOG, SIFT, and other methods are also options. – Sycorax Jul 18 '21 at 22:11
  • Hi @Sycorax, I can clarify my question further as needed. To clarify, however, CNNs take images as inputs. Do you mean that I can train a program to classify graphs by looking at the _images_ themselves, or by extracting points from the graph and analyzing the shape of the distributions? – BR56 Jul 18 '21 at 22:13
  • I thought that the statement of the problem was about classifying images. If you have something else in mind, please edit to clarify. Once again, I refer you to the [help] which includes more information. – Sycorax Jul 18 '21 at 22:14
  • Yes, the goal is to classify graphs, and you recommended using a CNN. However, a CNN takes images as inputs. Does that mean I will be taking _screenshots of graphs_ and feeding them to the program so it can learn which type of graph it is from the image (supervised)? Or should I _extract a discrete number of points_ from the graph and give them to some machine learning model so it can analyze the distribution and tell me what kind of graph it is? Is either method applicable, or is only one of these suited to the task? – BR56 Jul 18 '21 at 22:16
  • It's up to you to decide what your research project is. Is the input an image or points sampled from a function? Or is it something else? – Sycorax Jul 18 '21 at 22:19
  • I don't have the training data yet, which is why I'm wondering what the best/fastest way to approach this problem is. On the one hand, I could manually take hundreds of screenshots of different graphs, but that would be a lot of work. Alternatively, much like I did [here](https://stats.stackexchange.com/a/535061/328135), I can randomly generate graphs using `numpy` and then feed points from the graphs into the machine learning model. The second method seems to be much easier, but then I can't use a CNN anymore. What can I use, if anything? – BR56 Jul 18 '21 at 22:23
  • I suggested convolutional networks when I had a different understanding of your question. If you're using sampling, then using a CNN on images might not be the best approach (but other kinds of CNNs exist and can solve other kinds of problems). – Sycorax Jul 18 '21 at 22:34
  • @Sycorax Thanks for letting me know, but that's too general to help. Would you mind reopening the question so that someone could perhaps give a helpful answer? I can rephrase my question if needed. – BR56 Jul 18 '21 at 22:37
  • It's not intended as an answer, it's intended to assist you in clarifying the question to be answerable. As it stands, I don't think this is sufficiently clear to be answerable because it's not even clear what the input data is. You will need to [edit] it. When a closed question is edited, it automatically populates the review queue, where other users with sufficient privileges will vote to reopen. – Sycorax Jul 18 '21 at 22:40
  • Hi @Sycorax, I've revised my questions and hopefully they're more specific now. Let me know if I need to clarify them any further. If you could open it up, that would be great so that I don't have to wait for another user, as I want to resolve these questions ASAP. This will enable me to start actually working on the problem. – BR56 Jul 18 '21 at 22:46
  • This post, which expresses at least three questions, remains too unfocused for our format. – whuber Jul 19 '21 at 12:39
  • Hi @Sycorax, my question is marked as a duplicate. I checked *"What should I do when my neural network doesn't learn?"* and it offered some good advice. However, with so many techniques, I am lost as to which one would apply to my program, and I would appreciate some individualized feedback on my code. – BR56 Jul 23 '21 at 16:11
  • The duplicate thread outlines a comprehensive debugging procedure. If you create a reproducible example with reproducible documentation that the suggestions do not solve your problem, then it will be eligible for reopening. – Sycorax Jul 23 '21 at 16:15
  • Thank you for letting me know @Sycorax. I will try the suggestions in the debugging procedure and return if they fail. – BR56 Jul 23 '21 at 16:17
  • Hi @Sycorax, I just tried unit testing my NN; no bugs there. I tried adjusting the hyperparameters recommended in your answer, including batch size, learning rate, etc. -- but none of them has improved the loss. I've added the results of these attempts to my question. Please consider reopening it. – BR56 Jul 23 '21 at 16:44

0 Answers