Recent comments in /f/deeplearning

BlacksmithNo4415 t1_j6uwn1n wrote

I can try to help you, though; I worked as a deep learning engineer in computer vision:

  1. Do you mean the dimension of one sample is [2000, 5]? That is a very unusual shape for an image; images usually have a shape of [h, w, 3] (or [h, w, 4]), and video data adds an extra temporal dimension.
  2. What do you want this model to classify? So far it sounds fairly simple, but depending on the target it might be a bit more complex.
  3. The more complex your task, the more complex your model must be, and the larger the dataset you will need.
  4. How are the labels distributed in your dataset? (See the sketch after this list for a quick way to check.)
  5. Do you use adversarial attacks for robustness? Don't do that at the beginning.
  6. Are you sure that a CNN is the proper model for signal classification?
  7. How do you want to represent your dataset? What information should the third axis carry?
  8. By the way, dropout also makes it harder for the model to overfit; you use it so the model learns to generalize.
  9. I think the model is far too complex if the task is actually simple, but I have never done any signal classification.
  10. A sigmoid feeding BCELoss can cause numerical and gradient problems once it saturates; the sketch below shows the usual BCEWithLogitsLoss alternative.
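
A minimal sketch of points 4 and 10 (tensor names and shapes are illustrative, not the OP's actual data): check the class balance of a binary label tensor, and hand raw logits to BCEWithLogitsLoss instead of putting nn.Sigmoid() inside the model.

        import torch
        import torch.nn as nn

        # point 4: check how the binary labels are distributed
        Y = torch.randint(0, 2, (5500,)).float()             # stand-in for the real label tensor
        print(f"positive fraction: {Y.mean().item():.3f}")   # far from 0.5 -> consider class weighting

        # point 10: keep the model output as raw logits and let the loss apply the sigmoid
        logits = torch.randn(32, 1)                          # stand-in for model(x) with no nn.Sigmoid()
        labels = torch.randint(0, 2, (32, 1)).float()
        criterion = nn.BCEWithLogitsLoss()                   # numerically safer than Sigmoid + BCELoss
        loss = criterion(logits, labels)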
−1

BlacksmithNo4415 t1_j6tfpkg wrote

Try formatting your code as a Markdown code block:

        plotter = DLPlotter()     # add this line
        model = MyModel()
        ...
        total_loss = 0
        for epoch in range(5):
            for step, (x, y) in enumerate(loader):
                ...
                output = model(x)
                loss = loss_func(output, y)
                total_loss += loss.item()
                ...
        config = dict(lr=0.001, batch_size=64, ...)
        plotter.collect_parameter("exp001", config, total_loss / (5 * len(loader)))     # add this line
        plotter.construct()     # add this line
1

BlacksmithNo4415 t1_j6tfgmy wrote

To me it also sounds like a bad learning rate. Have you checked the distribution of your weights for each layer at each step?

P.S.: Try hyperparameter optimization methods like grid search or Bayesian optimization; that way you will get an answer to your question faster. A minimal grid-search sketch is below.
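
A minimal grid-search sketch over learning rate and weight decay. It reuses the ECGClassifier posted further down in the thread; train_and_evaluate is an illustrative helper (train a few epochs, return validation loss), not an existing function:

        from itertools import product

        import torch

        learning_rates = [1e-2, 1e-3, 1e-4]
        weight_decays = [0.0, 1e-3]

        results = {}
        for lr, wd in product(learning_rates, weight_decays):
            model = ECGClassifier()                                    # the OP's model, posted below
            optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=wd)
            results[(lr, wd)] = train_and_evaluate(model, optimizer)   # illustrative helper

        best = min(results, key=results.get)
        print("best (lr, weight_decay):", best, "val loss:", results[best])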

1

International_Deer27 OP t1_j6rxjtd wrote

        import torch
        import torch.nn as nn
        from torch.utils.data import Dataset, DataLoader
        from sklearn.model_selection import train_test_split
        import numpy as np

        df_Y_MACE = np.array(df_Y_MACE)
        df_X_MACE = np.array(df_X_MACE)

        X = torch.from_numpy(df_X_MACE).float()
        Y = torch.from_numpy(df_Y_MACE).float()

        # Define the dataset
        class ECGDataset(Dataset):
            def __init__(self, data, labels):
                self.data = data
                self.labels = labels

            def __len__(self):
                return len(self.data)

            def __getitem__(self, idx):
                return self.data[idx], self.labels[idx]

        # Split the data into training and testing sets
        train_data, test_data, train_labels, test_labels = train_test_split(X, Y, test_size=0.2)

        # Create the dataset and data loader
        train_dataset = ECGDataset(train_data, train_labels)
        test_dataset = ECGDataset(test_data, test_labels)
        train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
        test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)

        # Define the CNN
        class ECGClassifier(nn.Module):
            def __init__(self):
                super(ECGClassifier, self).__init__()
                self.fc = nn.Linear(128*5, 1)
                self.act = nn.ReLU()
                self.sigmoid = nn.Sigmoid()
                self.dropout = nn.Dropout(0.5)
                self.layers = [[],[],[],[],[]]
                for i in range(5):
                    self.layers[i].append(nn.Conv1d(1, 32, kernel_size=20, stride=5))
                    self.layers[i].append(nn.BatchNorm1d(32))
                    self.layers[i].append(nn.MaxPool1d(7,2))
                    self.layers[i].append(nn.Conv1d(32, 64, kernel_size=16, stride=5))
                    self.layers[i].append(nn.BatchNorm1d(64))
                    self.layers[i].append(nn.MaxPool1d(7,3))
                    self.layers[i].append(nn.Conv1d(64, 128, kernel_size=2, stride=3))
                    self.layers[i].append(nn.BatchNorm1d(128))
                    self.layers[i].append(nn.Linear(4, 1))
                    self.layers[i].append(nn.BatchNorm1d(128))
                    self.layers[i].append(nn.Dropout(0.5))

            def forward(self, x):
                x_cols = [[], [], [], [], []]
                for i in range(5):
                    x_cols[i] = x[:,:,i].unsqueeze(1)
                    x_cols[i] = self.layers[i][0](x_cols[i])
                    x_cols[i] = self.layers[i][1](x_cols[i])
                    x_cols[i] = self.act(x_cols[i])
                    x_cols[i] = self.layers[i][2](x_cols[i])
                    x_cols[i] = self.layers[i][3](x_cols[i])
                    x_cols[i] = self.layers[i][4](x_cols[i])
                    x_cols[i] = self.act(x_cols[i])
                    x_cols[i] = self.layers[i][5](x_cols[i])
                    x_cols[i] = self.layers[i][6](x_cols[i])
                    x_cols[i] = self.layers[i][7](x_cols[i])
                    x_cols[i] = self.act(x_cols[i])
                    x_cols[i] = self.layers[i][8](x_cols[i])
                    x_cols[i] = self.layers[i][9](x_cols[i])
                    x_cols[i] = self.layers[i][10](x_cols[i])
                x = torch.cat((*x_cols, ), 1)
                x = x.view(-1, 128*5)
                x = self.fc(x)
                x = self.sigmoid(x)
                return x

        # Define the model and move it to the device
        device = torch.device('cpu')
        model = ECGClassifier()
        model = model.to(device)
        model = model.float()

        # Define the loss function and optimizer
        criterion = nn.BCELoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=0.001, weight_decay=0.001)

        # Train the model
        for epoch in range(5):
            for i, (data, labels) in enumerate(train_loader):
                data, labels = data.to(device), labels.to(device)

                # Forward pass
                with torch.set_grad_enabled(True):
                    outputs = model(data)
                    labels = labels.unsqueeze(1)
                    loss = criterion(outputs, labels)

                # Backward and optimize
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

            print('Epoch [{}/{}], Loss: {:.4f}'.format(epoch+1, 5, loss.item()))

1

Etodmitry22 t1_j6rpice wrote

The loss will always fluctuate, especially for complex networks/tasks. What you should care about is the loss decreasing overall and the metrics improving on the test set. No fluctuation in the loss and perfect convergence is a very rare thing that is mostly seen in ML tutorials, not real-world cases.

If you do not see any improvement overall, try to overfit a small subset of the training data; if your model cannot overfit a small dataset, that points to bugs in your model or data. A minimal sketch of this check follows.
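
A minimal sketch of that overfitting sanity check, assuming the OP's train_dataset and ECGClassifier from the code above (subset size, batch size, and epoch count are arbitrary):

        import torch
        from torch.utils.data import DataLoader, Subset

        # take a tiny slice of the training data
        small_loader = DataLoader(Subset(train_dataset, range(64)), batch_size=16, shuffle=True)

        model = ECGClassifier()
        criterion = torch.nn.BCELoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

        for epoch in range(200):                             # many passes over the same few samples
            for data, labels in small_loader:
                optimizer.zero_grad()
                loss = criterion(model(data), labels.unsqueeze(1))
                loss.backward()
                optimizer.step()
            if (epoch + 1) % 50 == 0:
                print(epoch + 1, loss.item())                # should head towards ~0 if the pipeline is sound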

1

like_a_tensor t1_j6q5cs1 wrote

Are you implementing the CNN from scratch? If so, the problem might be in your implementation.

Play with the batch size and batch norm. Try different optimizers. Your learning rate might also be too large; experiment with smaller learning rates or something like torch's ReduceLROnPlateau.

5500 samples is also pretty small, so maybe try a shallower network.
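
A minimal sketch of the ReduceLROnPlateau suggestion, assuming the OP's model and loaders; train_one_epoch and evaluate are illustrative helpers, while the scheduler itself is standard torch.optim.lr_scheduler API:

        import torch

        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
            optimizer, mode="min", factor=0.1, patience=2)

        for epoch in range(20):
            train_one_epoch(model, train_loader, optimizer)      # illustrative helper
            val_loss = evaluate(model, test_loader)              # illustrative helper returning mean loss
            scheduler.step(val_loss)                             # cuts the LR when val loss stops improving
            print(epoch, val_loss, optimizer.param_groups[0]["lr"])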

2

Virtual_Giraffe_5173 t1_j6o6g7e wrote

It is not surprising that the performance is as good as with 32-bit networks.

That they train faster is more surprising. What is the reason for this?

My next question is: which framework supports 16-bit networks? Or do you plan to implement everything from scratch?

1

msltoe t1_j6o4430 wrote

Since CNN weights are often 4D tensors, W × H × (#inputs) × (#outputs), they're hard to visualize directly. Instead, the trick is to ask what input images to the model fully activate the queried node and none of the other nodes in the same layer. There's a Keras example script that does this. The generic term is "deep dream."
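
A rough PyTorch sketch of that activation-maximization idea (not the Keras script mentioned; the model choice, layer slice, and channel index are arbitrary):

        import torch
        from torchvision.models import vgg16

        # any trained CNN works; VGG16's .features Sequential just makes it easy to slice
        model = vgg16(weights="IMAGENET1K_V1").eval()

        image = torch.randn(1, 3, 224, 224, requires_grad=True)   # start from noise
        optimizer = torch.optim.Adam([image], lr=0.05)

        for _ in range(100):
            optimizer.zero_grad()
            activations = model.features[:10](image)              # activations of an intermediate conv block
            loss = -activations[0, 7].mean()                      # gradient ascent on channel 7's mean activation
            loss.backward()
            optimizer.step()
        # image now approximates an input that strongly activates that channel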

1