Recent comments in /f/deeplearning

thebravescientist t1_iwxhe92 wrote

Great collection. This can help millions of others too. I had a couple of questions and suggestions.

Q1. Is the data collected manually, or is it processed from some source? (Asking for the sake of data reliability.)

Q2. How different is this from, say, 1mg.com and other sites where we get similar data?

Suggestions:

  1. It would have been good to include an image of each medicine.
  2. Some metadata, like side effects, would have been an added benefit.

Constant-Cranberry29 OP t1_iwo6vm1 wrote

```python
from tensorflow.keras.callbacks import LearningRateScheduler

initial_learning_rate = 0.02
epochs = 50
decay = initial_learning_rate / epochs

def lr_time_based_decay(epoch, lr):
    # Shrink the current learning rate a little more with each epoch
    return lr * 1 / (1 + decay * epoch)

# `model` is built and compiled earlier in the script
history = model.fit(
    x_train,
    y_train,
    epochs=epochs,
    validation_split=0.2,
    batch_size=64,
    callbacks=[LearningRateScheduler(lr_time_based_decay, verbose=2)],
)
```
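
For reference, a quick way to see what this schedule does over the first few epochs, using the definitions above (this loop is just an illustration, not part of the training code):

```python
# Evaluate the decay function standalone to print the per-epoch learning rate
lr = initial_learning_rate
for epoch in range(5):
    lr = lr_time_based_decay(epoch, lr)
    print(f"epoch {epoch}: lr = {lr:.5f}")
```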

Constant-Cranberry29 OP t1_iwo6ukg wrote

>Okay. So, as I understand it, your labels are usually either zero (before normalization) or negative, and only very rarely positive.
>
>With the abs, it's easy for the model to reproduce the "baseline" level, because it's still zero after normalization, and as long as the last Dense produces a large negative number, the sigmoid turns that number into zero.
>
>I think it would work even better if, instead of the abs, you set all positive labels to zero, then normalize. (After normalization, the "baseline" level will become 1, also easy to reproduce.)
>
>In both cases, the model will work for data points that originally had negative or zero labels, but it won't work for data points with originally positive labels.
>
>You have a problem without normalization, because the "baseline" level is no longer 0 or 1 and your model needs to converge on that number. I think it would get there eventually, but you'll need more training, and probably learning rate decay (replace the constant learning rate with a tf.keras.optimizers.schedules.LearningRateSchedule object and play with its settings).
>
>The question is: do you want to, and do you expect to be able to, reproduce the positive labels? Or are they just random noise? If you don't need to reproduce them, just set them to zero. If they are valid and you need to reproduce them, do more training.

I have tried using a tf.keras.optimizers.schedules.LearningRateSchedule object, and it still doesn't work.
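
A minimal sketch of one way such a schedule can be wired into the optimizer; the ExponentialDecay settings and the loss below are placeholder assumptions, not the values actually tried in this thread:

```python
import tensorflow as tf

# Sketch only: ExponentialDecay is one LearningRateSchedule subclass;
# decay_steps and decay_rate here are illustrative, not tuned values.
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.02,
    decay_steps=1000,
    decay_rate=0.9,
)
optimizer = tf.keras.optimizers.Adam(learning_rate=lr_schedule)
model.compile(optimizer=optimizer, loss='mse')  # loss is a placeholder
```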

Hamster729 t1_iwo4ma4 wrote

Okay. So, as I understand it, your labels are usually either zero (before normalization) or negative, and only very rarely positive.

With the abs, it's easy for the model to reproduce the "baseline" level, because it's still zero after normalization, and as long as the last Dense produces a large negative number, the sigmoid turns that number into zero.

I think it would work even better if, instead of the abs, you set all positive labels to zero, then normalize. (After normalization, the "baseline" level will become 1, also easy to reproduce.)

In both cases, the model will work for data points that originally had negative or zero labels, but it won't work for data points with originally positive labels.

You have a problem without normalization, because the "baseline" level is no longer 0 or 1 and your model needs to converge on that number. I think it would get there eventually, but you'll need more training, and probably learning rate decay (replace the constant learning rate with a tf.keras.optimizers.schedules.LearningRateSchedule object and play with its settings).

The question is: do you want to, and do you expect to be able to, reproduce the positive labels? Or are they just random noise? If you don't need to reproduce them, just set them to zero. If they are valid and you need to reproduce them, do more training.

P.S. There are other things you could try. Here's an easy one. Drop the abs, drop the normalization, and change the last layer to: model.add(Dense(1, activation=None, use_bias=False))
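
For context, a minimal sketch of what that last-layer change could look like in a toy Sequential model; the hidden layers and sizes are placeholders, not the OP's actual architecture (only the 89-feature input width is taken from elsewhere in the thread):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Sketch only: hidden layer sizes are illustrative placeholders
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(89,)))
model.add(Dense(32, activation='relu'))
# Linear, bias-free output so the model can emit raw positive and negative
# values instead of sigmoid outputs squashed into [0, 1]
model.add(Dense(1, activation=None, use_bias=False))
model.compile(optimizer='adam', loss='mse')
```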

Constant-Cranberry29 OP t1_iwnyhip wrote

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

df = pd.read_csv('1113_Rwalk40s1.csv', low_memory=False)

columns = ['Fx']
selected_df = df[columns]
FCDatas = selected_df[:2050]

# SIData (the insole sensor readings) is assumed to be loaded earlier
SmartInsole = np.array(SIData[:2050])
FCData = np.array(FCDatas)

Dataset = np.concatenate((SmartInsole, FCData), axis=1)

scaler_in = MinMaxScaler(feature_range=(0, 1))
scaler_out = MinMaxScaler(feature_range=(0, 1))
data_scaled_in = scaler_in.fit_transform(Dataset[:, 0:89])
data_scaled_out = scaler_out.fit_transform(Dataset[:, 89:90])
```
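
As a side note, a sketch of how scaled predictions could be mapped back to the original Fx scale with the same output scaler (assuming a trained `model`):

```python
# Undo the output scaling so predictions are comparable to the raw Fx values
pred_scaled = model.predict(data_scaled_in)               # values in [0, 1]
pred_original = scaler_out.inverse_transform(pred_scaled)
```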

sqweeeeeeeeeeeeeeeps t1_iwnu8yt wrote

You are misinterpreting what “normalizing” is. It converts your data to fit a standard normal distribution. That means you have positive and negative numbers centered around 0, which is optimal for most deep learning models. The interval [0, 1] is not good because you want some weights to be negative, since certain features negatively impact certain results.
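
To make the distinction concrete, a small sketch comparing standardization (zero mean, unit variance) with min-max scaling to [0, 1], using scikit-learn on made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Toy labels: mostly zero or negative, rarely positive, as in this thread
y = np.array([[-5.0], [-3.0], [0.0], [0.0], [2.0]])

standardized = StandardScaler().fit_transform(y)              # centered around 0
minmax = MinMaxScaler(feature_range=(0, 1)).fit_transform(y)  # squeezed into [0, 1]
```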

Hamster729 t1_iwmtbs9 wrote

It is not clear what you are doing, because your code does not match your plots. The model in your code outputs values in the 0..1 range, but your plots show large positive and negative values. To help you, we would need to understand exactly what is going on: please post either the complete model or an explanation of the physical significance of your data. Generally speaking, unless the signs in your data have no significance (e.g. a +5 and a -5 correspond to the same fundamental physical state), applying an abs to the data will only make the model perform worse.
