Recent comments in /f/MachineLearning
evangelion-unit-two t1_jcwltmt wrote
Reply to comment by Art10001 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
And what are they going to do, ban me from entering China? Thank god.
suineg t1_jcwhvs3 wrote
Reply to [D] Simple Questions Thread by AutoModerator
I'm curious about the feasibility of a concept before I start going down that road. I'm also unsure whether there is already a project I should look into.
There is a fantasy book series that I enjoy; it's 10 books and 3.3M words (I don't have a character count). The world and characters are complicated, and their interactions with other characters are sometimes pretty obscure. I want to make a dynamic wiki and search tool for two things.
Phase 1 - Ingest all of the text and start building out character profiles, book profiles, etc. The front end would tag information by book, so if you've only read up to book 7 you don't get books 8-10 spoiled. You could give it a query like "list all the battles character a and character b are in together" (see the sketch below).
Phase 2 - This would be the difficult portion, much later on, and I'm not focused on it yet. You could ask it something like "give me a view of character b after event_32" and it would generate art based on the stored descriptions. You could also give it things like "give me a scene of characters b, d, and h at the battle of event_40" and it would generate one based on that stored event.
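If it helps, here's a minimal sketch of the spoiler-safe filtering idea from Phase 1. All of the field names and data here are hypothetical; the point is just that each extracted event carries a book tag, and queries filter on the reader's progress:

```python
# Hypothetical sketch (names and data made up): filter extracted
# events so a reader only sees content from books they've finished.
events = [
    {"book": 2, "type": "battle", "characters": {"a", "b"}},
    {"book": 8, "type": "battle", "characters": {"a", "b", "c"}},
]

def battles_together(events, char_x, char_y, last_book_read):
    """List battles involving both characters, hiding anything
    from books past the reader's progress (no spoilers)."""
    return [
        e for e in events
        if e["type"] == "battle"
        and {char_x, char_y} <= e["characters"]
        and e["book"] <= last_book_read
    ]

# A reader who has finished book 7 only sees the book-2 battle.
print(battles_together(events, "a", "b", last_book_read=7))
```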
Art10001 t1_jcwg7bv wrote
Reply to comment by ninjasaid13 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
Try installing MSYS2.
Art10001 t1_jcwg5zg wrote
Reply to comment by pkuba208 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
You can really see how modern phones beat 10-year-old computers, as their Geekbench 5 scores show.
Art10001 t1_jcwg2pl wrote
Reply to comment by Taenk in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
CoreML.
Please review this link and the associated research paper: https://github.com/apple/ml-ane-transformers
Art10001 t1_jcwfyw8 wrote
Reply to comment by baffo32 in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
I heard MoE is bad, but sadly I have no sources.
Art10001 t1_jcweykf wrote
Reply to comment by evangelion-unit-two in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
> Any dispute arising from or in connection with this License shall be submitted to Haidian District People's Court in Beijing.
crude2refined t1_jcwdk7j wrote
In Google Colab, I'm not able to reproduce the benefits of PyTorch 2 vs. 1 with scaled_dot_product_attention. Is there anything I'm missing? Please see the attached image: https://imgur.com/72FKcp1
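For reference, here's a minimal micro-benchmark sketch of that comparison; the shapes, dtypes, and iteration counts are arbitrary choices, not taken from the attached image:

```python
# Sketch: compare PyTorch 2's fused scaled_dot_product_attention
# against an unfused manual implementation.
import time
import torch
import torch.nn.functional as F

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
# (batch, heads, seq_len, head_dim)
q = torch.randn(8, 16, 1024, 64, device=device, dtype=dtype)
k, v = torch.randn_like(q), torch.randn_like(q)

def manual_attention(q, k, v):
    # Unfused baseline: explicit matmul -> softmax -> matmul.
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return scores.softmax(dim=-1) @ v

def bench(fn, iters=50):
    # Warm up, then time; synchronize so GPU kernels are counted.
    for _ in range(5):
        fn(q, k, v)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        fn(q, k, v)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

print(f"manual attention: {bench(manual_attention):.6f} s/iter")
print(f"fused SDPA:       {bench(F.scaled_dot_product_attention):.6f} s/iter")
```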
Trolann t1_jcwc0n0 wrote
Reply to comment by nenkoru in [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
A (scarier) command line tool that may be up your alley is pls. I'd only use it in a safe VM for now, but minor edits could give you a way to confirm before execution.
RX_Wild t1_jcw7zo3 wrote
Reply to [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
I got the 7B model running on my Android phone using Termux, but I can't get the 13B running because my phone only has 8 GB of RAM. I'm running on a Snapdragon 865.
iKlsR t1_jcw7jms wrote
Reply to comment by MBle in [D] LLama model 65B - pay per prompt by MBle
Maybe, based on how fast things have been moving recently... relevant: https://replicate.com/blog/llama-roundup
sardius02 t1_jcw5f1j wrote
Reply to [D] For those who have worked 5+ years in the field, what are you up to now? by NoSeaweed8543
I'm wondering what the basic ML assignment was; I'd love to get a better idea of what to expect from take-home assignments in the interview process.
sanxiyn t1_jcw2yoz wrote
Reply to comment by clueless1245 in [R] ChatGLM-6B - an open source 6.2 billion parameter Eng/Chinese bilingual LLM trained on 1T tokens, supplemented by supervised fine-tuning, feedback bootstrap, and RLHF. Runs on consumer grade GPUs by MysteryInc152
On the other hand, a commercial-use restriction is not compatible with the generally accepted definition of open source, for example The Open Source Definition:
> 6) No Discrimination Against Fields of Endeavor. The license must not restrict anyone from making use of the program in a specific field of endeavor. For example, it may not restrict the program from being used in a business, or from being used for genetic research.
michaelthwan_ai OP t1_jcw1cnv wrote
Reply to comment by hassan789_ in [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
Added some credits to it; they were all used up. I will monitor the usage.
hassan789_ t1_jcvuoze wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
On my first try, I got an error:
Encountered error You exceeded your current quota, please check your plan and billing details.
thedabking123 t1_jcvmwoq wrote
Reply to [D] For those who have worked 5+ years in the field, what are you up to now? by NoSeaweed8543
It happens in every career, in every field; I'm 37 and have been a PM of an AI-driven product for a while now.
I'm starting to care more about corporate leadership, and about using the money I've earned to enjoy life, than about the technical bits I've been cycling around for a while.
Some ideas that may apply to you:
- you could focus on people management and direct teams on larger projects
- you could try to find companies with new, interesting problems to solve (MLOps for LLMs at firms like Jasper, OpenAI, or Cohere, or perhaps HCI loops between humans and UI-embedded LLMs like at adept.ai?)
- you could find a truly deep R&D job in a crazy new area and go at something novel (I personally would love to spend 2-3 years exploring neurosymbolic computing or quantum computing once I get tired of corporate politics; haven't decided yet)
- etc.
No one job is a constant stream of discovery, joy, and focus. Everything gets old after a while, so be prepared to refresh your career again in 3-5 years.
pkuba208 t1_jcvmhhm wrote
Reply to comment by 1stuserhere in [Research] Alpaca 7B language model running on my Pixel 7 by simpleuserhere
Depends on the hardware
cthorrez t1_jcvhg41 wrote
Reply to comment by MysteryInc152 in [P] The next generation of Stanford Alpaca by [deleted]
RWKV is recurrent, right? Why is it token-limited?
VelvetyPenus t1_jcveijf wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
Encountered error You exceeded your current quota, please check your plan and billing details.
Euphoric-Escape-9492 t1_jcvdk04 wrote
Reply to comment by Either-Job-341 in [P] The next generation of Stanford Alpaca by [deleted]
Very sad, considering his account was deleted. I hope he still finds a way to post his results (if he decides to go through with the idea).
Expensive-Type2132 t1_jcvddj4 wrote
Reply to comment by alfredr in [R] What are the current must-read papers representing the state of the art in machine learning research? by alfredr
If you're outside of the community, it might be more beneficial to look at application papers to get an understanding of tasks, objective functions, datasets, training strategies, etc., especially during this period when there isn't much architectural diversity. But, nevertheless, read whatever you're motivated to read!
Euphoric-Escape-9492 t1_jcvcyoj wrote
Reply to comment by hapliniste in [P] The next generation of Stanford Alpaca by [deleted]
Very sad that the post and his account were deleted. I wonder whether he did this intentionally or not.
phazei t1_jcvcn08 wrote
Reply to comment by michaelthwan_ai in [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
If you can have it add a class with "white-space: pre" in the CSS, that should probably fix it, if it's just a frontend issue.
egoistpizza t1_jcvcl5h wrote
Reply to [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
Hi! Your project and other projects on this topic are a valid response to the active curiosity around the subject. It would be in society's interest for AI-powered search engines to enter active development and build their own user bases. My only worry is that, as OpenAI and other for-profit AI companies close their projects to external analysis and development over time (see GPT-4), AI-powered applications will become closed boxes and their development potential will be limited. The protests we can raise on this issue may lose their effect over time; the masses may close their eyes in the face of hype and demand products that are harmful to us in the long run. For this reason, I think the pushback should be made collectively, as soon as possible.
I may have stretched the subject a bit, but I liked your project and other similar projects quite a lot. Not only did it answer the test question I just asked, it also corrected the grammatical errors in my question, which surprised me somewhat. My request is that we, as a society, do not forget the potential we are losing by getting immersed in these leading projects. AI-powered applications are great, but we must not forget the rights these companies take away from us day by day.
michaelthwan_ai OP t1_jcws6h8 wrote
Reply to comment by Secret-Fox-5238 in [P] searchGPT - a bing-like LLM-based Grounded Search Engine (with Demo, github) by michaelthwan_ai
ChatGPT said what I wanted to say.
>I apologize for any confusion or misinformation in my previous response. You are correct that SQL databases do support various text search and similarity matching features, including the use of keywords like LIKE and CTE (Common Table Expressions) to enable more flexible and efficient querying.
>
>While it's true that specialized tools like Elasticsearch, Solr, or Algolia may offer additional features and performance benefits for certain natural language processing tasks, SQL databases can still be a powerful and effective tool for storing and querying structured and unstructured data, including text data.
>
>Thank you for bringing this to my attention and allowing me to clarify my previous response.
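For what it's worth, here's a tiny illustration of the LIKE-based text matching mentioned above, using Python's built-in sqlite3; the table and data are invented for the example:

```python
# Minimal illustration of SQL text search with LIKE, via sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)",
    [("grounded search engines cite their sources",),
     ("unrelated text about cooking",)],
)

# %...% matches the pattern anywhere in the body; LIKE is
# case-insensitive for ASCII text in SQLite by default.
rows = conn.execute(
    "SELECT id, body FROM docs WHERE body LIKE ?", ("%search%",)
).fetchall()
print(rows)  # [(1, 'grounded search engines cite their sources')]
```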