Recent comments in /f/deeplearning
soupstock123 OP t1_j3jx9ko wrote
Reply to comment by rikonaka in Building a 4x 3090 machine learning machine. Would love some feedback on my build. by soupstock123
Yeah, there's no way to add two PSUs on PCPartPicker, so that's meant to be 2 of the 1000W ones.
The B650 supports 4; it has enough slots. Blocking is not an issue because I'm going to be using GPU risers to fit the 4 GPUs.
To respond to your first comment, the Threadripper is also very expensive, and I'm waiting until Sept 2023, when the Threadripper 7000 series comes out and drops prices on existing Threadrippers.
rikonaka t1_j3jvvne wrote
Reply to comment by rikonaka in Building a 4x 3090 machine learning machine. Would love some feedback on my build. by soupstock123
I read your shopping list, and there are two problems: the motherboard and the power supply. One 3090 draws 350 watts, so four of them draw 1400 watts, which means your power supply should be at least 2000 watts (the exact calculation can be done once the CPU is decided). The problem with the motherboard is that the B650 does not support four 3090 video cards, it only has two video card slots.😉
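As a rough sanity check on that sizing, here is a back-of-the-envelope sketch; the CPU and "other" figures below are placeholder assumptions, not measured values:

# Rough PSU sizing sketch; CPU and "other" figures are assumptions, not measurements.
gpu_tdp_w = 350      # per-GPU TDP of an RTX 3090
num_gpus = 4
cpu_tdp_w = 280      # placeholder, depends on the CPU you actually pick
other_w = 150        # board, RAM, drives, fans (assumed)
headroom = 1.2       # ~20% margin for transient power spikes

total_w = (gpu_tdp_w * num_gpus + cpu_tdp_w + other_w) * headroom
print(f"Recommended PSU capacity: ~{total_w:.0f} W")  # roughly 2200 W with these assumptions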
rikonaka t1_j3jucus wrote
Reply to Building a 4x 3090 machine learning machine. Would love some feedback on my build. by soupstock123
I think the Threadripper 5990X is better, and for the motherboard you can use a Supermicro server board. I have a computer with an AMD 5950X and a 3090 Ti on an X570 motherboard; if you use 4x 3090, it is best to use a server motherboard for stability and performance.😉
Blasket_Basket t1_j3h24nj wrote
Reply to Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
Move experience above education since you have significant work experience. Similarly, move the team lead CV role to the top of that section, above the research assistant roles. Recruiters want to know you have work experience first and foremost. You come across as significantly less competent/senior to recruiters if the first thing they hear about is the stuff you're doing as a grad assistant.
animikhaich OP t1_j3h0ni3 wrote
Reply to comment by Screend in Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
Thanks!
If you are talking about the date: 06/19 to 06/22, then that's in MM/YY format. So, 3 years.
reap-521 OP t1_j3gp33l wrote
Thanks for all the responses, really appreciate it, very helpful!
[deleted] t1_j3gp0he wrote
[deleted]
reap-521 OP t1_j3gor0c wrote
Reply to comment by suflaj in What type of deep learning algorithm does pointnet++ use by reap-521
Thank you!
Screend t1_j3g4klf wrote
Reply to Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
Fabulous CV, well done! Am I reading it right though that you completed all of that work at Wobot in 3 days?
dtjon1 t1_j3g0zhe wrote
The pointnet family of NNs uses special mathematical functions called "symmetric" functions to leverage the unordered and unstructured nature of point clouds. For a given set of points, no affine transformation (rotation, scale, translation) nor any reordering of the points should have any effect on the output of the model. These symmetric functions enable pointnet to handle these cases.
It's hard to discuss pointnet in simple terms beyond this point, but the training process basically has two parts:
- We learn these symmetric functions for our dataset and use them to build a representation of the data that is meaningful to the model. This is called feature extraction, and it gives us a feature vector. Think of this as an abbreviated form of the data that the model can use.
- We can then use this feature vector to perform whatever task we want. For classification we typically just throw the feature vector into a second neural network (usually an MLP) which outputs a probability distribution over our classes.
All of this happens together during training - the model learns to extract meaningful features from your data and also learns to perform whatever task you have in mind. I really recommend watching the authors' presentation for more info.
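To make those two parts concrete, here is a minimal PyTorch-style sketch (not the authors' implementation; the layer sizes are made up): a shared per-point MLP for feature extraction, a max-pool as the symmetric function, and an MLP classifier on top.

import torch
import torch.nn as nn

class TinyPointNet(nn.Module):
    """Minimal PointNet-style classifier: shared MLP + symmetric max-pool + MLP head."""
    def __init__(self, num_classes):
        super().__init__()
        # Part 1: shared per-point MLP (1x1 convolutions applied over the point axis)
        self.point_mlp = nn.Sequential(
            nn.Conv1d(3, 64, 1), nn.BatchNorm1d(64), nn.ReLU(),
            nn.Conv1d(64, 128, 1), nn.BatchNorm1d(128), nn.ReLU(),
        )
        # Part 2: classifier over the pooled (order-invariant) feature vector
        self.head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, num_classes))

    def forward(self, points):                      # points: (batch, 3, num_points)
        per_point = self.point_mlp(points)          # (batch, 128, num_points)
        global_feat = per_point.max(dim=2).values   # max-pool = symmetric function, point order doesn't matter
        return self.head(global_feat)               # class logits

logits = TinyPointNet(num_classes=10)(torch.randn(2, 3, 1024))  # toy usage: 2 clouds of 1024 xyz points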
Pointnet is really easy to use with some programming experience and doesn't require massive compute, yet is still really powerful. Pointnet++ can be a little trickier, as the main implementation I've seen requires custom CUDA kernels.
BalanceStandard4941 t1_j3frjix wrote
Reply to comment by BalanceStandard4941 in What type of deep learning algorithm does pointnet++ use by reap-521
Btw, you can ask ChatGPT, it will give more intuitive and accurate answers.
BalanceStandard4941 t1_j3frfqe wrote
Because points don't live on a regular grid the way pixels do, PointNet++ first samples a few anchors from the point set. Then every anchor point finds its k nearest neighbors (like a CNN working on windows of pixels). Then, with shared MLP layers, each point gets a higher-dimensional latent feature. Last, to aggregate the features of local points, max-pooling is applied to every group of points that we clustered previously.
This is one layer, which they call a Set Abstraction (SA) layer, and it is repeated 4 times. After the SA layers, Feature Propagation layers can be used if your task is segmentation; they just upsample the points again.
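Roughly, one set abstraction layer could be sketched like this in PyTorch (a simplified illustration only: random sampling stands in for farthest point sampling, and the layer sizes are made up):

import torch
import torch.nn as nn

class SimpleSetAbstraction(nn.Module):
    """Sample anchors, group k nearest neighbors, run a shared MLP, max-pool per group."""
    def __init__(self, in_dim, out_dim, num_anchors=128, k=16):
        super().__init__()
        self.num_anchors, self.k = num_anchors, k
        self.mlp = nn.Sequential(nn.Linear(in_dim, out_dim), nn.ReLU(),
                                 nn.Linear(out_dim, out_dim), nn.ReLU())

    def forward(self, xyz, feats):                       # xyz: (B, N, 3), feats: (B, N, in_dim)
        B, N, _ = xyz.shape
        idx = torch.randperm(N, device=xyz.device)[: self.num_anchors]  # simplified: random anchors instead of FPS
        anchors = xyz[:, idx, :]                          # (B, A, 3)
        dists = torch.cdist(anchors, xyz)                 # (B, A, N) pairwise distances
        knn = dists.topk(self.k, dim=-1, largest=False).indices          # (B, A, k) neighbor indices
        grouped = torch.gather(                           # (B, A, k, in_dim) neighbor features
            feats.unsqueeze(1).expand(B, self.num_anchors, N, feats.shape[-1]),
            2, knn.unsqueeze(-1).expand(-1, -1, -1, feats.shape[-1]))
        pooled = self.mlp(grouped).max(dim=2).values      # shared MLP, then max-pool over the k neighbors
        return anchors, pooled                            # sparser point set and its aggregated features

# usage sketch: pts = torch.randn(2, 1024, 3); new_xyz, new_feats = SimpleSetAbstraction(3, 64)(pts, pts)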
suflaj t1_j3eoh9u wrote
Depends on what model you mean. From a quick glance it seems to be a very generic convolutional network with some linear layers. Type of stuff you'd create in an introductory DL course.
bitemenow999 t1_j3e5jms wrote
Reply to Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
I would suggest changing the publications to the standard format; nobody needs to know the journal/platform, all of them are bad... Also drop PhD from the prof.'s name, they are assumed to have a PhD by default. Drop "selected" from selected projects.
i_do_too_ t1_j3e0q70 wrote
Reply to Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
Good CV overall. I'd decrease the number of personal projects and increase the space allocated to experience. This is probably contrary to what you have heard, but when you go for full-time roles, you should put experience first and then education. I say this because you have good experience and you wouldn't want to join at entry level, but at least at L4. For that, you should promote your experience.
ASalvail t1_j3dxr6l wrote
Reply to Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
It looks a bit crammed, so I'd ditch the summary. I never read those anyway, and if I can't tell at a glance what you've worked on, something is wrong. If you want to keep it, I would emphasize which sub-branch of AI you're interested in and/or specialized in.
I would emphasize that full-time industry experience: it tells me I won't need to show you how to work in a team, and that MNIST isn't the usual dataset quality you should expect. Do point out it's full-time. You can deduce it from the dates, but I typically look at a CV for max 1 minute during initial triage.
Otherwise it looks pretty great!
animikhaich OP t1_j3drt4z wrote
Reply to comment by chengstark in Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
Thank you very much for that catch!!! I so missed it! :)
chengstark t1_j3dr7nw wrote
Reply to Review Request: MS in AI Grad Student with 3+ years of relevant experience trying to apply for Summer Internships '23 (posting here because I need domain-specific feedback) by animikhaich
You got two “performed” in the H2X lab item. Other than that it’s great
soicyboii t1_j3d1v2c wrote
trajo123 t1_j3c3cvf wrote
Reply to comment by trajo123 in Why didn't my convolutional image classifier network learn anything! by AKavun
...let me know if it works any better!
trajo123 t1_j3c38rx wrote
Reply to comment by trajo123 in Why didn't my convolutional image classifier network learn anything! by AKavun
Several things I noticed in your code:
- your model doesn't use any transfer (activation) function
- the combination of final activation function and loss function is incorrect
- for CNNs you should be using BatchNorm2d layers
The code should look something like this:
import torch
import torch.nn as nn

class CNNClassifier(nn.Module):
    def __init__(self, input_size, num_classes):
        super(CNNClassifier, self).__init__()
        self.input_size = input_size
        self.num_classes = num_classes
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, stride=1, padding=1)  # increase the number of channels
        self.bn1 = nn.BatchNorm2d(32)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=128, kernel_size=3, stride=1, padding=1)  # in_channels must match conv1's out_channels
        self.bn2 = nn.BatchNorm2d(128)
        self.fc1 = nn.Linear(128, 256)  # note the smaller numbers
        self.fc2 = nn.Linear(256, num_classes)
        self.final_pool = nn.AdaptiveAvgPool2d(1)  # before flatten, use AdaptiveMaxPool2d or AdaptiveAvgPool2d to get rid of the spatial dimensions, essentially treating each filter as one feature
        # self.softmax = nn.Softmax(dim=1) - not needed, see below. Also Softmax is not correct for use with NLLLoss, the correct one would be LogSoftmax(dim=1)
        self.f = nn.ReLU()

    def forward(self, x):
        x = self.conv1(x)
        x = self.pool(x)
        x = self.f(x)  # apply the transfer function
        x = self.bn1(x)  # apply batch norm (this can also be placed before the transfer function)
        x = self.conv2(x)
        x = self.pool(x)
        x = self.f(x)  # apply the transfer function
        x = self.bn2(x)  # apply batch norm (this can also be placed before the transfer function)
        # since you are now using batchnorm, you could add a few more blocks like the one above, vanishing gradients are less of a concern now
        x = self.final_pool(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = self.f(x)  # apply the transfer function, here you could try tanh as well
        x = self.fc2(x)
        # x = self.softmax(x)  # not needed, softmax is incorporated into the loss function for numerical/computational efficiency reasons
        return x
Also, the loss should be
# criterion = nn.NLLLoss()
criterion = nn.CrossEntropyLoss()  # the more natural choice of loss function for classification; actually, for binary classification the more natural choice would be BCEWithLogitsLoss, but then you need to set the number of output units to 1.
FastestLearner t1_j3c0yju wrote
You are not using non-linearity. Yours is just a linear model. Deep CNNs thrive on non-linearity. Try adding a ReLU layer after every MaxPool. Also, for better convergence, add BN layers after each Conv. Don’t use two Linear layers (mostly redundant). Use AvgPool instead of Flatten. Replace Softmax with LogSoftmax. Set Adam lr=1e-4, decay=1e-4.
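For the last two suggestions, a quick sketch of what that setup could look like (the tiny Sequential here is just a stand-in for your own network):

import torch.nn as nn
import torch.optim as optim

model = nn.Sequential(nn.Conv2d(3, 32, 3), nn.BatchNorm2d(32), nn.ReLU())  # placeholder for your network
log_softmax = nn.LogSoftmax(dim=1)  # use this instead of Softmax; pair it with NLLLoss
criterion = nn.NLLLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-4, weight_decay=1e-4)  # lr=1e-4, decay=1e-4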
PM me if you face any more issues.
trajo123 t1_j3busy6 wrote
First of all, the dataset size is way too small to train a model from scratch and get meaningful results on this relatively complex task (more complex than MNIST, for example, which has a training set of 60,000 images). Second, your model is way too small/simple for this task even if you had 100 times more data. I strongly suggest "Transfer Learning": fine-tuning a pre-trained model by replacing the classification head, freezing the rest of the model in place, and training on your dataset.
Something along these lines:
from torch import nn
from torchvision import transforms, models
# ...
model = models.swin_b(weights=models.Swin_B_Weights.IMAGENET1K_V1)
model.head = nn.Linear(model.head.in_features, 1, bias=True)  # replace the classification head (torchvision's Swin exposes it as .head)
# ...
In the pre-trained model documentation you will see what training recipe was used and what transforms were applied to the images. Typically:
transforms.Normalize(
mean=(0.485, 0.456, 0.406),
std=(0.229, 0.224, 0.225),
)
transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC)
See more at <https://pytorch.org/vision/stable/models.html#table-of-all-available-classification-weights>. You can also find pre-trained vision models on HuggingFace.
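Putting the pieces together, a fine-tuning setup could look roughly like the sketch below (not a full training loop; the transform values are the standard ImageNet ones listed above):

import torch
from torch import nn
from torchvision import transforms, models

# Load the pre-trained backbone and freeze all of its weights
model = models.swin_b(weights=models.Swin_B_Weights.IMAGENET1K_V1)
for p in model.parameters():
    p.requires_grad = False

# Replace the classification head with a fresh, trainable one (1 output unit for binary classification)
model.head = nn.Linear(model.head.in_features, 1, bias=True)

# Preprocessing matching the pre-trained weights
preprocess = transforms.Compose([
    transforms.Resize((224, 224), interpolation=transforms.InterpolationMode.BICUBIC),
    transforms.ToTensor(),
    transforms.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

criterion = nn.BCEWithLogitsLoss()                               # binary classification with a single logit
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)   # only the new head gets trained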
Hope this helps, good luck!
suflaj t1_j3bubtm wrote
Reply to comment by AKavun in Why didn't my convolutional image classifier network learn anything! by AKavun
Another problem you will likely have is your very small convolutions. Basically, output channels of 8 and 16 are probably only enough to solve MNIST. You should then probably use something more like 32 and 64, and use larger kernels and strides to hopefully reduce reliance on the linears to do the work for you.
Finally, you are not using nonlinear activations between layers. Your whole network essentially acts like one smaller convolutional layer with a flatten and softmax.
rikonaka t1_j3jyzos wrote
Reply to comment by soupstock123 in Building a 4x 3090 machine learning machine. Would love some feedback on my build. by soupstock123
Well, I'm not sure how feasible it is to run two power supplies in one host 😂, and the motherboard is still a problem. I can't comment on the stability of connecting four 3090s using riser/expansion cables (because I haven't done it myself). I think you should carefully consider your plan; the cost of trial and error is not low.