Recent comments in /f/deeplearning

elf7979 OP t1_j4o90u3 wrote

I think transcripts from company conference calls have certain characteristics, since business professionals tend to use particular verbs and expressions. I haven't checked out the w2v datasets you mentioned yet. Is there an existing corpus that's business-oriented?

What if the dataset size increases to 1 gigabyte? Is that big enough?

1

thatoneboii t1_j4jf8h5 wrote

Do you absolutely need to use deep learning? There are tons of way faster autocorrect implementations that use Levenshtein distance and other non-DL techniques, such as SymSpell or Norvig’s algorithm. DL is complicated, expensive, and requires tons of data to train on. I would stay away from it unless you’re doing it for your own enrichment or a school project.
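To give a rough idea of how small the non-DL route can be, here is a minimal sketch of Levenshtein-based correction in plain Python. It is a brute-force lookup against a word list, not SymSpell or Norvig's actual implementation (both are much faster), and the names `levenshtein`, `correct`, and `max_distance` are just ones I made up for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance where insert, delete, and substitute each cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def correct(word, vocabulary, max_distance=2):
    """Return the closest vocabulary word within max_distance, else the word itself.

    Hypothetical helper: scans the whole vocabulary, so it is only
    practical for small word lists.
    """
    if not vocabulary or word in vocabulary:
        return word
    # Ties are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda v: (levenshtein(word, v), v))
    return best if levenshtein(word, best) <= max_distance else word


vocab = {"transcript", "transport", "script"}
print(correct("trascript", vocab))
```

With a real dictionary you would want frequency counts to break ties and an index so you don't scan every word, which is roughly the gap SymSpell and Norvig's approach fill.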

3

vagartha OP t1_j4c36zw wrote

Haha, I live in CA, so sports gambling is out of the question...

I was actually hoping to maybe write a paper or something and submit it to something like the Sloan conference, or send it in to 538, as an add-on to my resume.

Also, my model uses data from seasons going back all the way to 2014 as of right now. Larger datasets would make a better model, right? So why not use more historical data?

1