Recent comments in /f/MachineLearning
rylo_ren_ t1_jcvak4c wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hi everyone! This is a simple troubleshooting question. I'm in a master's program that uses Python, and I keep running into an issue when I run this code for a linear regression model:
airfares_lm = LinearRegression(normalize=True)
airfares_lm.fit(train_X, train_y)
print('intercept ', airfares_lm.intercept_)
print(pd.DataFrame({'Predictor': X.columns, 'coefficient': airfares_lm.coef_}))
print('Training set')
regressionSummary(train_y, airfares_lm.predict(train_X))
print('Validation set')
regressionSummary(valid_y, airfares_lm.predict(valid_X))
It keeps returning this error:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/j1/1b6bkxw165zbtsk8tyf9y8dc0000gn/T/ipykernel_21423/2993181547.py in <cell line: 1>()
----> 1 airfares_lm = LinearRegression(normalize=True)
      2 airfares_lm.fit(train_X, train_y)
      3
      4 # print coefficients
      5 print('intercept ', airfares_lm.intercept_)

TypeError: __init__() got an unexpected keyword argument 'normalize'
I'm really lost; any help would be greatly appreciated! I know there are other ways to do this, but I was hoping to use this technique since it's the primary way my TA codes regression models. Thank you!
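(The likely cause, for anyone hitting the same error: scikit-learn deprecated LinearRegression's `normalize` argument in 1.0 and removed it in 1.2. A minimal sketch of the usual replacement, scaling features in a Pipeline; note it is not an exact drop-in, since `normalize=True` scaled columns by their L2 norm rather than their standard deviation. Variable names follow the snippet above.)

from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Scale inside a pipeline instead of passing normalize=True
airfares_lm = make_pipeline(StandardScaler(), LinearRegression())
airfares_lm.fit(train_X, train_y)

# The fitted LinearRegression is the last pipeline step
print('intercept ', airfares_lm[-1].intercept_)
print(pd.DataFrame({'Predictor': X.columns, 'coefficient': airfares_lm[-1].coef_}))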
BalorNG t1_jcv99cz wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
Just like humans, LLMs learn patterns and relationships, not "facts", unless you make them memorize by repeating training data over and over, and that degrades other aspects of the system.
So LLMs should be given all the tools humans use to augment their thought - spreadsheets, calculators, databases, CAD software, etc. - and allowed to interface with them quickly and efficiently.
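(A toy sketch of that interfacing idea; everything here is illustrative, not a real LLM API. The harness scans the model's output for calculator calls and splices in the results.)

import re

def run_tools(model_output):
    # Replace each CALC(...) marker with its evaluated result; a real system
    # would use a safe expression parser rather than eval().
    def evaluate(match):
        return str(eval(match.group(1), {"__builtins__": {}}, {}))
    return re.sub(r"CALC\(([^)]*)\)", evaluate, model_output)

print(run_tools("40 hours * 52 weeks = CALC(40 * 52) hours/year"))
# prints: 40 hours * 52 weeks = 2080 hours/year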
mike94025 t1_jcv94un wrote
Reply to comment by programmerChilli in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
SDPA is used by F.multi_head_attention_forward (if need_weights=False) which is used by nn.MHA and nn.Transformer* as well as other libraries. (source)
Public service announcement: need_weights defaults to True, and guts performance. (Because allocating and writing the attention weight tensor defeats the memory BW advantages of flash attention.)
Also, if `key_padding_mask is not None`, performance will suffer (because this is converted into an attention mask, and only the causal attention mask is supported by Flash Attention). Use Nested Tensors for variable sequence length batches.
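For reference, a minimal sketch of the need_weights flag on nn.MultiheadAttention (the sizes here are made up):

import torch
import torch.nn as nn

mha = nn.MultiheadAttention(embed_dim=256, num_heads=8, batch_first=True)
x = torch.randn(4, 1024, 256)
# need_weights=False skips allocating and writing the attention-weight
# tensor, allowing the fused SDPA / flash attention path to be used.
out, _ = mha(x, x, x, need_weights=False)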
mike94025 t1_jcv83hu wrote
Reply to comment by Competitive-Rub-1958 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
Yes - use the backend context manager to disable all other backends and confirm that you're running the one you want. (If that backend can't handle your input, you'll get an error, since all the others are disabled.)
SDPA context manager is intended to facilitate debug (for perf or correctness), and is not (and should not be) required for normal operational usage.
Check out the SDPA tutorial at https://pytorch.org/tutorials/intermediate/scaled_dot_product_attention_tutorial.html#explicit-dispatcher-control
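For example, a sketch assuming PyTorch 2.0's torch.backends.cuda.sdp_kernel and a CUDA device with half-precision inputs:

import torch
import torch.nn.functional as F
from torch.backends.cuda import sdp_kernel

q = k = v = torch.randn(8, 16, 1024, 64, device="cuda", dtype=torch.float16)
# With the math and mem-efficient backends disabled, this either runs flash
# attention or raises an error instead of silently falling back.
with sdp_kernel(enable_flash=True, enable_math=False, enable_mem_efficient=False):
    out = F.scaled_dot_product_attention(q, k, v)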
mike94025 t1_jcv7ltl wrote
Reply to comment by royalemate357 in [D] PyTorch 2.0 Native Flash Attention 32k Context Window by super_deap
You might look into https://github.com/pytorch/pytorch/pull/95793.
Secret-Fox-5238 t1_jcv5dhh wrote
Reply to comment by michaelthwan_ai in [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
This is completely false. Elastic was invented by SQL. You use things like “LIKE” and a few other choice keywords. Just google them, or go to Microsoft directly and look at SQL SELECT statements. You can string together CTEs, which immediately gives you elasticity. So, sorry, but this is a nonsensical response.
Secret-Fox-5238 t1_jcv4t2r wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
But you haven’t written a search engine????
rjog74 t1_jcv2mee wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
This is great !!!
1stuserhere t1_jcuyofc wrote
How fast is the model on android, u/simpleuserhere?
[deleted] t1_jcuxs8m wrote
Reply to comment by nenkoru in [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
[deleted]
nenkoru t1_jcus6rg wrote
Reply to comment by [deleted] in [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
I made a few issues and a pull request adding DuckDuckGo support to the source code. So if anyone is willing to ditch Bing as a dependency (and OpenAI, in the future), make sure to keep an eye on this project.
I liked the idea that it all runs in a terminal. No need to open a browser to ask questions; pretty useful for searching without switching cognitive context from a vim tab with the code to a browser. In December I did something similar with just a wrapper around the OpenAI completion API, asking it questions about coding. In combination with codequestion it was pretty useful. This one (XOXO) makes for a much more pleasant experience.
Cheers!
PositiveElectro t1_jcurjqy wrote
Reply to comment by millenial_wh00p in [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
Oh that’s so interesting, do you have particular references for someone new to data assurance?
millenial_wh00p t1_jcuq0zo wrote
Reply to comment by alfredr in [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
No, unfortunately most of my work is with tabular data, with a bit of computer vision; I haven’t looked into applications of language models in that area. In theory, the tokenization in language models shouldn’t be much different from features in tabular/imagery data. There are probably some parallels worth exploring there; I’m just not aware of any papers.
fuzwz t1_jcupmn6 wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
How many websites did you index in your search engine?
alfredr OP t1_jcuoqg4 wrote
Reply to comment by millenial_wh00p in [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
Point taken on the "gold rush". My background is CS theory, so the incorporation of combinatorial methods feels right at home. Along these lines, are you aware of any work incorporating (combinatorial) logic verification into generative language models? The end goal would be improved argument synthesis (e.g. mathematical proofs).
millenial_wh00p t1_jcun8jw wrote
Reply to comment by alfredr in [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
Well, beware of open-ended questions about AI/ML research in the current “gold rush” environment. If you’re into explainability and interpretability, some folks are looking into combinatorial methods for features and their interactions to predict data coverage. This, plus Anthropic’s papers, starts to open up some new ground in interpretability for CV.
Icy-Curve2747 t1_jcun3oo wrote
Reply to comment by millenial_wh00p in [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
I’m focusing on interpretability right now, I’d be interested in hearing more suggestions
Edit: Especially about computer vision
alfredr OP t1_jcumejc wrote
Reply to comment by millenial_wh00p in [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
I'm an outsider interested in learning the landscape, so my intent is to leave the question open-ended, but I'm broadly interested in architectural topics like layer design, attention mechanisms, regularization, and model compression, as well as bigger-picture considerations like interpretability, explainability, and fairness.
millenial_wh00p t1_jcuksz1 wrote
Reply to [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
What aspects? New models? Interpretability? Pipelines and scalability? Reinforcement learning? Data assurance? Too many subfields to narrow down in this question to produce a decent list, imo.
With that said, my subfield is assurance, and some of Anthropic’s work in interpretability and privileged bases is extremely interesting. Their toy models paper and the one they released last week about privileged bases in the transformer residual stream present a very novel way of thinking about model explainability.
MBle OP t1_jcujt4a wrote
Reply to comment by VelvetyPenus in [D] LLama model 65B - pay per prompt by MBle
Based on what information do you predict this?
Jonathan358 t1_jcuh7ya wrote
Reply to [D] Simple Questions Thread by AutoModerator
Hello, I have a very simple question but cannot find any info on:
How do I create an exponential (squared) range for hyperparameter values to be tuned? E.g. from 2 to 64, incrementing in steps of 2^2?
Not looking for a complicated solution involving lists, etc.
ff_dim=hp.Int('ff_dim', min_value=2, max_value=64, step=n^2)
edit: solved with sampling="log"
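For anyone searching later, a minimal sketch of that fix, assuming KerasTuner (where hp is the HyperParameters object passed to the model-building function, and with sampling="log" the step acts as a multiplier rather than an increment):

# Tries ff_dim in {2, 4, 8, 16, 32, 64}
ff_dim = hp.Int('ff_dim', min_value=2, max_value=64, step=2, sampling='log')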
ninjasaid13 t1_jcufsqf wrote
Reply to comment by simpleuserhere in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
I'm getting a new error
C:\Users\ninja\source\repos\alpaca.cpp>make chat
process_begin: CreateProcess(NULL, uname -s, ...) failed.
process_begin: CreateProcess(NULL, uname -p, ...) failed.
process_begin: CreateProcess(NULL, uname -m, ...) failed.
'cc' is not recognized as an internal or external command, operable program or batch file.
'g++' is not recognized as an internal or external command, operable program or batch file.
I llama.cpp build info:
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS:   -I. -O3 -DNDEBUG -std=c11 -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC
I LDFLAGS:
I CC:
I CXX:
g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp ggml.o utils.o -o chat
process_begin: CreateProcess(NULL, g++ -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC chat.cpp ggml.o utils.o -o chat, ...) failed.
make (e=2): The system cannot find the file specified.
Makefile:195: recipe for target 'chat' failed
make: *** [chat] Error 2
Educational_Ice151 t1_jcueag5 wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
This looks great 👍
Shared to r/aipromptprogramming
[deleted] t1_jcvbkg2 wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
[deleted]