
TLDR: My 1D CNN is doing a really bad job classifying graphs. Here's more context:

Note: I've tried adopting the advice listed here and here, but my CNN hasn't stopped overfitting. I've already tried much of the advice listed there, including unit testing and changing the architecture. In fact, the CNN works just fine with two classes, but for some reason it fails with three.

What am I trying to do?

My time series are stored in a 90 x 100 table representing points sampled from three different types of graphs: Linear, Quadratic, and Cubic Sinusoidal. There are 30 rows for linear graphs, 30 for quadratics, and 30 for cubics, and I have sampled 100 points from every graph. Here's an image to illustrate my point: [figure: example linear, quadratic, and cubic sinusoidal curves]
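For anyone who wants to reproduce the setup without downloading my CSV, a toy version of the dataset can be sketched like this (the exact coefficients and noise level are my assumptions for illustration; the real data come from Training3.csv):

```python
import numpy as np

rng = np.random.default_rng(23)
n_per_class, n_points = 30, 100
x = np.linspace(0, 1, n_points)

def make_class(f, n):
    # each row: one curve with random coefficients plus a little noise
    return np.stack([f(x, rng) + rng.normal(0, 0.05, n_points) for _ in range(n)])

linear    = make_class(lambda x, r: r.uniform(-2, 2) * x + r.uniform(-1, 1), n_per_class)
quadratic = make_class(lambda x, r: r.uniform(-2, 2) * x**2 + r.uniform(-1, 1) * x, n_per_class)
cubic_sin = make_class(lambda x, r: r.uniform(-2, 2) * x**3 + 0.3 * np.sin(10 * x), n_per_class)

X = np.concatenate([linear, quadratic, cubic_sin])[:, None, :]  # (90, 1, 100)
y = np.repeat([1, 2, 3], n_per_class)                           # class ids as in class_map
print(X.shape, y.shape)  # (90, 1, 100) (90,)
```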

Here's the problem

I'm very confused about why my 1D FCN is performing so badly. I've been debugging it for the last two weeks, changing the training set, the number of classes, the architecture, and nothing works. Here's a quick overview:

  • The confusion matrix shows only 42% validation accuracy.

[figure: validation confusion matrix]

  • The loss curves also look disappointing: the model seems to start overfitting after only 10 epochs (I also tried 100 and 1000 epochs with the same results).

[figure: training and validation loss curves]

  • The probability graph shows that the network is extremely confident about the quadratic class, but very confused about the linear and cubic graphs.

[figure: predicted class probabilities]
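Given how early the overfitting starts, one knob I can still try is stronger regularization. fastai's fit_one_cycle accepts a wd (weight decay) argument, so a variant of the training call from my MWE below is easy to test (the 1e-2 value is just a guess, not something I have tuned):

```python
# sketch: same learner as in the MWE, but with weight decay (wd value is a guess)
learn = ts_learner(dls, FCN, metrics=accuracy)
learn.fit_one_cycle(30, lr_max=1e-4, wd=1e-2)
```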

Here's how I tried to fix it

Here are some of the strategies I used for debugging (none worked):

  • Made the problem simpler by giving the FCN only two classes to classify. At one point, the network actually reached 100% validation accuracy after 5 epochs; the exact same network on three classes failed catastrophically.
  • Changed the number of epochs (increased and decreased it, trying everything from 10 to 1000).
  • Decreased my training set from 30k to 3k to 300, all the way down to 60. This did not change the problem.
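Following a suggestion in the comments, a non-neural baseline is a useful sanity check on the data itself. Since the classes are (roughly) degree-1, degree-2, and degree-3 curves, even a crude rule like "classify by which polynomial degree fits best under a BIC penalty" should do well if the data are clean. This sketch uses pure NumPy with a toy stand-in for my dataset (the generating functions, coefficient ranges, and noise level are assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)

# toy stand-in for the real dataset: 30 curves per class, class ids 1/2/3
def curves(f, n=30):
    return np.stack([f(rng.uniform(1, 2) * x) + rng.normal(0, 0.01, x.size) for _ in range(n)])

X = np.concatenate([curves(lambda t: t), curves(lambda t: t**2), curves(lambda t: t**3)])
y = np.repeat([1, 2, 3], 30)

def classify(curve, n=x.size):
    # pick the polynomial degree (1..3) minimizing a BIC-style criterion,
    # so higher degrees pay a penalty for their extra parameters
    bics = []
    for deg in (1, 2, 3):
        mse = np.mean((np.polyval(np.polyfit(x, curve, deg), x) - curve) ** 2)
        bics.append(n * np.log(mse + 1e-12) + (deg + 1) * np.log(n))
    return int(np.argmin(bics)) + 1

preds = np.array([classify(c) for c in X])
print("baseline accuracy:", (preds == y).mean())
```

If a baseline like this scores near 100% on the real CSV but the FCN stays at 42%, the problem is in the network or training loop rather than the data.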

Code

Here is the full code: 1D CNN Time Series Classifier. Below is a quick MWE if you need it. I'm using tsai (Time Series AI), a library built on fastai, which is itself built on PyTorch.

import pandas as pd
from tsai.all import *  # df2xy, get_splits, get_ts_dls, ts_learner, FCN, etc.

url = "https://barisciencelab.tech/Training3.csv"
c = pd.read_csv(url)
c.head()

X, y = df2xy(c, data_cols=c.columns[1:-1], target_col='Class')
test_eq(X.shape, (3000, 1, 100))
test_eq(y.shape, (3000, ))

splits = get_splits(y, valid_size=.2, stratify=True, random_state=23, shuffle=True)

class_map = {
    1:'Linear',
    2:'Quadratic',
    3:'Cubic',
    }
class_map
labeler = ReLabeler(class_map)
new_y = labeler(y) # map to more descriptive labels
X.shape, new_y.shape, splits, new_y

tfms  = [None, TSClassification()] # TSClassification == Categorize
batch_tfms = TSStandardize()
dls = get_ts_dls(X, new_y, splits=splits, tfms=tfms, batch_tfms=batch_tfms, bs=[64, 128])
dls.dataset

learn = ts_learner(dls, FCN, metrics=accuracy, cbs=ShowGraph())
learn.fit_one_cycle(30, lr_max=1e-4)
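For completeness, the per-class errors can be inspected with fastai's standard interpretation API, which a tsai learner inherits (this is stock fastai, nothing custom):

```python
# sketch: standard fastai interpretation on the trained learner
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix()   # confusion matrix over the validation set
interp.most_confused()           # which class pairs get mixed up most often
```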

I would be very grateful if anyone could help me resolve this issue.

rb3652
  • Does this answer your question? [What should I do when my neural network doesn't learn?](https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn) – mhdadk Sep 23 '21 at 21:22
  • Or this? [What should I do when my neural network doesn't generalize well?](https://stats.stackexchange.com/q/365778/296197) – mhdadk Sep 23 '21 at 21:24
  • Hi @mhdadk, yes, I actually saw that response by Sycorax before. Unfortunately, I've already tried much of the advice listed there, including unit testing, changing the architecture, etc. – rb3652 Sep 23 '21 at 21:24
  • "Increasing training set from 300 to 3,000 to 30,000 images. Still the same problem." You should be decreasing your training set size, not increasing it. Please check the second answer to the question: https://stats.stackexchange.com/a/352190/296197 – mhdadk Sep 23 '21 at 21:28
  • Hi @mhdadk. Yes, I recognized that too. In fact, I decreased my training set from 30k to 3k to 300, all the way down to 60. However, this has not changed the problem. What *did* change the problem was changing my training set so it has only two classes, and so the CNN has to classify just two classes. – rb3652 Sep 23 '21 at 21:30
  • You need to scale two things: architecture and data. You start with a small dataset (1-2 examples) and a small architecture, and then start scaling each one slowly. It is highly unlikely that the net can perform well on $N+1$ examples if it cannot perform well on $N$ examples first. – mhdadk Sep 23 '21 at 21:32
  • Thank you for the advice @mhdadk; I'm using the default architecture for an FCN, which consists of 3 Convolution Blocks, 1 AvgPooling Layer, and 1 Fully Connected layer. When I made my training set size 60, 20% * 60 = 12 of the images were used for validation. The network perfectly classified all 12 images into their correct categories: 6 Linear, 6 Quadratic. In short, the network does indeed classify 2 classes well, but not 3 classes. – rb3652 Sep 23 '21 at 21:34
  • Have you tried classifying with LDA, SVM or whatever? If these perform well on 3 classes, at least you know it's not a problem with the dataset or how it's split. – N Blake Sep 24 '21 at 12:19
  • Hi @NBlake Thank you for the comment. No, I haven't tried LDA/SVM/etc. because I'm specifically trying 1D CNNs for Time Series Classification. Right now, I'm making a checklist of the debugging strategies I tried (i.e., Unit Testing, Training Set, etc.) – rb3652 Sep 24 '21 at 12:21

0 Answers