Recent comments in /f/MachineLearning
jakderrida t1_jdb95pw wrote
Reply to comment by mouldygoldie in [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
How about SPIT (Sparse Parameter Iso-FLOP Transformations)?
Or would SPLIT (Sparse Performance-focused Lightweight Iso-FLOP Transformations) work? Or let's choose whatever's SAFIST (Sparse Accuracy-focused FLOP-Isometric Structural Transformations)?
Who cares that I obviously had to shoehorn "Structural" in there just to get my pun across?
crt09 t1_jdb3wjc wrote
Reply to comment by Defiant-Ranger in [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
Thank you! I need to test these more thoroughly, but this seems seriously impressive. A paper (https://arxiv.org/abs/2303.03846) tested the ability of language models to do sentiment analysis with flipped labels, basically seeing whether in-context learning is strong enough to overpower the tendency to classify positive-sounding things as positive. It's apparently a very difficult task, so I'm leaning towards being very impressed.
Tejalapeno t1_jdb3u06 wrote
Reply to [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Man, it would be cool if the comments here actually focused on the paper's contents and not on the reuse of an acronym from an outdated algorithm, because the results are extremely important for future scaling.
abriec t1_jdb30kf wrote
I can definitely see it speeding up manual efforts along the data processing pipeline, but data engineering is more than schema generation, just as NLP is more than feature extraction from structured documents.
Very interested in what new tools and workflows would emerge from this though!
Defiant-Ranger t1_jdb0lf9 wrote
Reply to comment by crt09 in [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
Response: negative
KingsmanVince t1_jdazrlk wrote
Impressive (though it required a bit of human prompt tweaking): the small model got a similar answer to the big model.
BrotherAmazing t1_jdazqji wrote
Reply to comment by mouldygoldie in [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Came here to say that.
It’d almost be like choosing the name “IBM” for your company and then starting off with “Not to be confused with the publicly traded International Business Machines company, IBM,…”
NotARedditUser3 t1_jdazm1y wrote
Reply to [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
The difference I see between the Alpaca answer and the one you provided on consciousness looks like just a difference in answer length. That's a configurable hyperparameter with any of these models, and I'm not quite certain it's indicative of an improvement, but if so, good work.
Either way, a fun project.
Please feel free to add details such as what you fine-tuned it with, where the dataset came from, whether it's available for others, etc.
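To make the length-knob point concrete, here's a minimal sketch assuming a Hugging Face transformers setup (an assumption on my part; the post doesn't say what stack was used, and the checkpoint name is a placeholder). The only difference between the two runs is the generation-length cap:
```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; swap in whichever models are being compared.
tok = AutoTokenizer.from_pretrained("path/to/your-model")
model = AutoModelForCausalLM.from_pretrained("path/to/your-model")

inputs = tok("What is consciousness?", return_tensors="pt")

# Same model, same prompt: only the length cap differs, so a "short"
# vs. "long" answer gap says nothing about model quality by itself.
for cap in (64, 512):
    out = model.generate(**inputs, max_new_tokens=cap, do_sample=False)
    print(f"--- max_new_tokens={cap} ---")
    print(tok.decode(out[0], skip_special_tokens=True))
```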
brownmamba94 t1_jdawqp9 wrote
Reply to comment by Carrasco_Santo in [R] SPDF - Sparse Pre-training and Dense Fine-tuning for Large Language Models by CS-fan-101
That's a pretty interesting thought... it reminds me of this research from MIT that came out last summer: "How computationally complex is a single neuron?" Work like this can potentially help advance the field of analog deep learning. I think sparsity will play a role here at both the connection level and the neuron level, potentially further reducing energy consumption and allowing for better resource utilization.
MisterManuscript t1_jdawa9m wrote
Reply to [R] Introducing SIFT: A New Family of Sparse Iso-FLOP Transformations to Improve the Accuracy of Computer Vision and Language Models by CS-fan-101
Feels like the authors are trying to piggyback on the pre-existing fame of Scale-Invariant Feature Transform. Of all the other names that could have been chosen, why override an existing one?
Addendum: if you're lucky, Google just might cut you some slack. If not, then expect their lawyers to come at you with a cease-and-desist.
Addendum 2: in reply to a now-deleted comment from one of the authors (someone from Cerebras) asking why Google might come after them with a cease-and-desist: the SIFT patent is owned by Google, and they may consider this a trademark violation or something similar.
kross00 t1_jdaw9n6 wrote
Reply to [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
Hey, did you use a custom dataset or a public one?
crt09 t1_jdavxgb wrote
Reply to [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
Prompt:
Please classify the last statement according to the pattern in the following demonstrations:
"Really nice size if you’re lounging about and have it on your lap. Lightweight and has everything I need it for. Would recommend as great laptop and good value.": negative
"I really like this Laptop. I can't believe how good it is for the price. I'm ab bit worried about spares later, but at £99 I'm not going to lose a lot if I have to replace it in 2 - 5 years time.": negative
"Save your money and buy something better. Battery is poor, has an issue turning on reliably and runs slow but i suppose is sufficent for basic web surfing and opening documents.": positive
"I was looking for a lower priced laptop,found this one to be as good as a more expensive one really fast good battery life couldn’t be happier, would highly recommend": negative
"It was great when I put it on then starting to turn its self off and you have to leave charger wire in .They say buy cheap you get cheap A bit disappointed.": positive
"Brought this for my daughter and the mouse does not work on it.": positive
"Love this little machine, it’s cheap and great!": negative
"Just what i needed and the price was perfect and got it deliverd to my local post office absolutely brilliant 11out of 10 for service": negative
"I'm for ever keeping those on charge it won't work otherwise.": positive
"On several occasions it will freeze then crash and I have had to sign in 7 times just to delete one sentence. At first I thought it would be sufficient for just using word documents but it is entirely unusable.": positive
"Save your money and buy something better. Battery is poor, has an issue turning on reliably and runs slow but i suppose is sufficent for basic web surfing and opening documents.": positive
"Well worth the money, works really well. Ideal of kids school work.": negative
"Used for emailing invoices mainly. Arrived quickly and it's cheap. Brilliant back up system.": negative
"I have been impressed especially as it cost only £99 and have recommended it to others": negative
"I'm very disappointed with the service I've received from Amazon and will think twice about buying this type of item from them again.": positive
"Delivered yesterday. Nice product. Good performance so far. Good experience.":
Defiant-Ranger t1_jdavr18 wrote
Reply to comment by Seyka2 in [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
I see, thank you. I'll fine-tune the model even further, and will add some data from this field too.
Unlucky_Excitement_2 t1_jdavhcr wrote
Reply to comment by KerfuffleV2 in [Project] Alpaca-30B: Facebook's 30b parameter LLaMa fine-tuned on the Alpaca dataset by imgonnarelph
Bro, what are you talking about? LOL. It's context length he's discussing. There are multiple ways [all of which I'm experimenting with]:
- flash attention
- strided context window (sketched below)
- finetuning on a dataset with longer sequences
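To make the strided-window idea concrete, here's a minimal sketch of the masking trick in PyTorch — just the mask construction, not a drop-in patch for any particular model:
```
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Causal mask where each token attends only to the previous `window`
    tokens, so attention cost scales with the window, not the full context."""
    i = torch.arange(seq_len).unsqueeze(1)  # query positions (column vector)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions (row vector)
    causal = j <= i                         # never attend to future tokens
    local = (i - j) < window                # only the most recent `window` keys
    return causal & local                   # True = attend, False = masked

mask = sliding_window_mask(seq_len=8, window=3)
print(mask.int())  # row t has ones only at columns t-2..t
```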
APUsilicon t1_jdaui8z wrote
Can we run this locally?
Seyka2 t1_jdau93g wrote
Reply to comment by Defiant-Ranger in [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
Here is the GPT-3.5 response, which was actually what I expected:
A PPT (Probabilistic Polynomial Time) algorithm is a type of algorithm used in complexity theory and cryptography. A PPT algorithm is an algorithm that runs in polynomial time, with a small probability of error on a uniformly random input. Specifically, a PPT algorithm is an algorithm that can be run in time polynomial in the size of its input, and that outputs the correct result with probability at least 1/2 + ɛ, where ɛ is a small positive constant.
In cryptography, PPT algorithms are often used to model the behavior of efficient adversaries or attackers that have limited computational power. PPT algorithms are used to analyze the security of cryptographic schemes, such as encryption and digital signature schemes, under the assumption that attackers can only use PPT algorithms to break the security of the scheme. If a cryptographic scheme is secure against PPT attackers, it is considered to be computationally secure in practice.
In summary, a PPT algorithm is a type of algorithm that runs efficiently with a small probability of error, and is often used to model the behavior of efficient attackers in cryptography.
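As a concrete illustration of the definition above (my own example, not part of the GPT-3.5 answer): randomized primality testing is the textbook PPT algorithm — it runs in time polynomial in the bit-length of its input and errs with a small, bounded probability:
```
import random

def miller_rabin(n: int, rounds: int = 20) -> bool:
    """Probabilistic primality test: runs in time polynomial in the
    bit-length of n and, when it answers "prime", is wrong with
    probability at most 4**(-rounds)."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    # Write n - 1 as d * 2**r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # witness found: definitely composite
    return True  # prime with high probability

print(miller_rabin(2**61 - 1))  # True: 2**61 - 1 is a Mersenne prime
print(miller_rabin(2**61 + 1))  # False: composite (divisible by 3)
```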
Defiant-Ranger t1_jdatwq4 wrote
Reply to comment by YoAmoElTacos in [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
>Generate the Javascript code to play TicTacToe, with two players taking turns to input their moves.
It is not the best at coding, but here is the answer it gave: Here is the Javascript code to play TicTacToe, with two players taking turns to input their moves:
```
function playTicTacToe(board) {
// Initialize the board
board = [];
for (let i = 0; i < 3; i++) {
board[i] = [];
for (let j = 0; j < 3; j++) {
board[i][j] ='';
}
}
// Initialize the players
let player1 = 'X';
let player2 = 'O';
// Loop until the game is over
while (board[0][0]!== player1 || board[0][1]!== player2 || board[1][0]!== player1 || board[1][1]!== player2 || board[2][0]!== player1 || board[2][1]!== player2)
```

It stopped at that point. There is a token-limit issue I need to solve: the chatbot tends to cut off longer responses for some reason, and raising the token limit didn't help either.
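A minimal sketch of how I plan to debug it, assuming a Hugging Face transformers stack (the checkpoint name and variable names are placeholders, not my actual setup): check whether the model emitted an end-of-sequence token or just ran into the generation cap.
```
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint; substitute the actual chatbot model.
tok = AutoTokenizer.from_pretrained("path/to/your-model")
model = AutoModelForCausalLM.from_pretrained("path/to/your-model")

prompt = "Generate the Javascript code to play TicTacToe."
inputs = tok(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=1024)

# Slice off the prompt to look only at newly generated tokens.
new_tokens = out[0][inputs["input_ids"].shape[1]:]
if new_tokens[-1].item() == tok.eos_token_id:
    print("Model emitted EOS: it decided the answer was complete.")
else:
    print("No EOS: max_new_tokens truncated the answer mid-generation.")
```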
sore__ t1_jdaskxp wrote
Reply to [D] Simple Questions Thread by AutoModerator
I want to make an AI chatbot similar to OpenAI's DaVinci 3, but my own version and offline. I'm trying to use Python, but I don't know what intents I should add to it, because I want it to know basically everything. Is it possible to just feed it everything on Wikipedia? I'm VERY VERY new to machine learning, so this might be overambitious, but idk, it just seems fun. Anyways, if anyone has ideas, please reply :)
YoAmoElTacos t1_jdas8mb wrote
Reply to [P] One of the best ChatGPT-like models (possibly better than OpenAssistant, Stanford Alpaca, ChatGLM and others) by [deleted]
Prompt:
Generate the Javascript code to play TicTacToe, with two players taking turns to input their moves.
breadbrix t1_jdbdx1h wrote
Reply to GPT-4 For SQL Schema Generation + Unstructured Feature Extraction [D] by Mental-Egg-2078
Are you willing to bet your job/career on data pipelines created by GPT-4? In a PII/PHI/PCI-compliant environment? Where fines start at $10K per occurrence?
Unless the answer is a resounding "yes," then no, data engineering is not out the door.