The challenges of word embeddings

#DeepLearning techniques for #NLProc tasks:  #abdsc #BigData #DataScience #MachineLearning

  • For those of you who aren’t familiar with them, word embeddings are essentially dense vector representations of words.
  • Word embeddings can be trained and used to derive similarities and relations between words.
  • Relations between words show up directly in the embedding space, e.g. as vector offsets and nearest neighbours.
  • Word2vec represents every word as an independent vector, even when words are morphologically similar.
  • If your model hasn’t encountered a word before, it will have no idea how to interpret it or how to build a vector for it, as the sketch below illustrates.
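To make the last two bullets concrete, here is a minimal sketch using gensim’s Word2Vec (assuming gensim 4.x; the toy corpus and every parameter value below are illustrative, not taken from the article). Each in-vocabulary word gets its own independent vector, similarities are derived from those vectors, and a word the model has never seen simply has no vector:

```python
# Toy illustration of word2vec's per-word vectors and the OOV problem.
# Assumes gensim 4.x; corpus and hyperparameters are illustrative only.
from gensim.models import Word2Vec

corpus = [
    ["the", "dog", "barked", "at", "the", "cat"],
    ["the", "dogs", "barked", "at", "the", "cats"],
    ["a", "cat", "chased", "a", "dog"],
]

model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, epochs=200, seed=1)

# Similarities and relations are derived from the learned vectors.
print(model.wv.most_similar("dog", topn=3))

# "dog" and "dogs" are morphologically related, yet word2vec learns
# two completely independent vectors for them.
print(model.wv.similarity("dog", "dogs"))

# A word never seen during training has no vector at all.
try:
    model.wv["wolves"]
except KeyError:
    print("'wolves' is out of vocabulary: no vector exists for it")
```

Note that “dog” and “dogs” only end up near each other if the corpus happens to use them in similar contexts; word2vec itself never looks inside the words.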

Read the full article, click here.


@KirkDBorne: “#DeepLearning techniques for #NLProc tasks: #abdsc #BigData #DataScience #MachineLearning”


In recent times, deep learning techniques have become more and more prevalent in NLP tasks; just take a look at the list of accepted papers at this year’s NAAC…



GitHub

Very interesting code, lda2vec tools for interpreting natural language  #machinelearning #NLP

  • lda2vec builds both word and document topics, keeps them interpretable, can learn topics over clients, times, and documents, and allows topics to be supervised.
  • lda2vec also yields topics over clients.
  • In lda2vec, the topics can be ‘supervised’ and forced to predict another target.
  • It’s research software, and we’ve tried to make it simple to modify lda2vec and to play around with your own custom topic models.
  • LDA, on the other hand, is quite interpretable by humans, but doesn’t model local word relationships the way word2vec does (see the sketch after this list).
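To ground those bullets, here is a toy numpy sketch of the central lda2vec construction (context vector = word vector + document vector, where the document vector is a softmax mixture over topic vectors). All names, shapes, and values below are illustrative assumptions, not the repository’s actual API:

```python
# Toy sketch of the lda2vec context vector; illustrative only,
# not the API of the lda2vec repository.
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_topics, dim = 4, 3, 8

# Tiny stand-in vocabulary of word vectors (illustrative).
word_vectors = {
    "pizza": rng.normal(size=dim),
    "code": rng.normal(size=dim),
}
topic_vectors = rng.normal(size=(n_topics, dim))   # topics live in word-vector space
doc_weights = rng.normal(size=(n_docs, n_topics))  # per-document topic logits

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def context_vector(word, doc_id):
    # lda2vec's key move: context = word vector + document vector,
    # where the document vector is a mixture over shared topic vectors.
    topic_mixture = softmax(doc_weights[doc_id])   # e.g. [0.7, 0.2, 0.1]
    doc_vector = topic_mixture @ topic_vectors     # shape (dim,)
    return word_vectors[word] + doc_vector

print(context_vector("pizza", doc_id=0))

# Topics stay interpretable: describe topic k by its nearest words.
def describe_topic(k):
    t = topic_vectors[k]
    sims = {w: float(t @ v) / (np.linalg.norm(t) * np.linalg.norm(v))
            for w, v in word_vectors.items()}
    return sorted(sims, key=sims.get, reverse=True)

print(describe_topic(0))
```

In this framing, ‘supervising’ the topics amounts to adding a loss term that predicts another target (say, a per-document label) from the topic mixture, and topics stay human-interpretable because each topic vector can be described by its nearest-neighbour words.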

Read the full article, click here.


@andradeandrey: “Very interesting code, lda2vec tools for interpreting natural language #machinelearning #NLP”



