Questions tagged [natural-language]

Natural Language Processing (NLP) is a set of techniques from linguistics, artificial intelligence, machine learning, and statistics that aim to process and understand human languages.

NLP Tasks

Typical NLP tasks are:

  • Word Sense Disambiguation
  • Part-of-Speech Tagging
  • Named Entity Recognition
  • Machine Translation
  • Information Retrieval
  • Question Answering
  • Text Classification
  • Text Clustering
  • and others

NLP Resources

Lectures

  • Natural Language Processing, by Dan Jurafsky and Christopher Manning
  • Natural Language Processing, by Michael Collins

1054 questions
158 votes, 9 answers

What exactly are keys, queries, and values in attention mechanisms?

How should one understand the keys, queries, and values that are often mentioned in attention mechanisms? I've tried searching online, but all the resources I find only speak of them as if the reader already knows what they are. Judging by the paper…
asked by Sean (2,184)
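
For reference, the mechanism this question asks about is usually written as $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\big(QK^\top/\sqrt{d_k}\big)V$, as in "Attention Is All You Need". A minimal NumPy sketch of that formula (the function name and toy shapes here are illustrative, not from any particular library):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V: queries attend over key/value pairs."""
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)    # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                              # weighted sum of values

Q = np.random.randn(2, 4)  # 2 queries of dimension d_k = 4
K = np.random.randn(3, 4)  # 3 keys
V = np.random.randn(3, 4)  # 3 values
print(scaled_dot_product_attention(Q, K, V).shape)  # (2, 4)
```

Each output row is an average of the value vectors, weighted by how well the corresponding query matches each key.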
59 votes, 5 answers

Apply word embeddings to entire document, to get a feature vector

How do I use a word embedding to map a document to a feature vector, suitable for use with supervised learning? A word embedding maps each word $w$ to a vector $v \in \mathbb{R}^d$, where $d$ is some not-too-large number (e.g., 500). Popular word…
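
A common baseline answer to this question is mean pooling: average the vectors of the words that appear in the document. A minimal sketch under that assumption (the toy embedding table is hypothetical):

```python
import numpy as np

def document_vector(tokens, embeddings, d=500):
    """Mean-pool the vectors of the tokens present in the embedding table."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return np.zeros(d)  # fallback for fully out-of-vocabulary documents
    return np.mean(vectors, axis=0)

# toy embedding table; real tables come from word2vec, GloVe, etc.
emb = {"cats": np.random.randn(500), "purr": np.random.randn(500)}
print(document_vector("cats purr loudly".split(), emb).shape)  # (500,)
```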
59 votes, 4 answers

Recurrent vs Recursive Neural Networks: Which is better for NLP?

There are Recurrent Neural Networks and Recursive Neural Networks. Both are usually denoted by the same acronym: RNN. According to Wikipedia, Recurrent NNs are in fact Recursive NNs, but I don't really understand the explanation. Moreover, I don't…
57 votes, 1 answer

Should I normalize word2vec's word vectors before using them?

After training word vectors with word2vec, is it better to normalize them before using them for some downstream applications? I.e., what are the pros/cons of normalizing them?
asked by Franck Dernoncourt (42,093)
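
Whatever the pros and cons, the normalization itself is just a per-vector rescaling; afterwards, dot products equal cosine similarities. A minimal sketch:

```python
import numpy as np

def l2_normalize(vectors, eps=1e-12):
    """Scale each row to unit length; dot products then equal cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.maximum(norms, eps)  # eps guards against zero vectors
```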
47 votes, 3 answers

Intuitive difference between hidden Markov models and conditional random fields

I understand that HMMs (Hidden Markov Models) are generative models and CRFs (Conditional Random Fields) are discriminative models. I also understand how CRFs are designed and used. What I do not understand is how they differ from HMMs. I…
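
One way to make the generative/discriminative contrast concrete is to compare the factorizations: an HMM models the joint distribution of observations $x$ and labels $y$, while a linear-chain CRF models the conditional directly:

$$p_{\text{HMM}}(x, y) = \prod_{t=1}^{T} p(y_t \mid y_{t-1})\, p(x_t \mid y_t), \qquad p_{\text{CRF}}(y \mid x) = \frac{1}{Z(x)} \exp\Big(\sum_{t=1}^{T} w^\top f(y_t, y_{t-1}, x, t)\Big)$$

Because the CRF only normalizes over label sequences (via $Z(x)$), its features $f$ may look at the whole observation sequence $x$, which the HMM's per-state emission model cannot.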
43 votes, 6 answers

Neural network references (textbooks, online courses) for beginners

I want to learn Neural Networks. I am a Computational Linguist. I know statistical machine learning approaches and can code in Python. I am looking to start with the concepts and to get to know one or two popular models which may be useful from a…
43 votes, 5 answers

LDA vs word2vec

I am trying to understand the similarity between Latent Dirichlet Allocation and word2vec for calculating word similarity. As I understand it, LDA maps words to a vector of probabilities of latent topics, while word2vec maps them to a vector of…
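
To make the two representations in the question concrete: from a fitted LDA model with $K$ topics one can derive a topic-probability vector per word, whereas word2vec learns a dense vector by predicting contexts:

$$v^{\text{LDA}}_w = \big(p(z_1 \mid w), \dots, p(z_K \mid w)\big) \in \Delta^{K-1}, \qquad v^{\text{w2v}}_w \in \mathbb{R}^d$$

The LDA vector lives on the probability simplex and its coordinates are interpretable topics; the word2vec vector's coordinates have no individual meaning, only geometric relations to other words.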
42 votes, 4 answers

Is LSTM (Long Short-Term Memory) dead?

From my own experience, LSTM has a long training time and does not improve performance significantly in many real-world tasks. To make the question more specific, I want to ask when LSTM will work better than other deep NNs (maybe with real-world…
asked by Haitao Du (32,885)
35 votes, 2 answers

Is cosine similarity identical to L2-normalized Euclidean distance?

Identical meaning that it will produce identical results for a similarity ranking between a vector u and a set of vectors V. I have a vector space model with a distance measure (Euclidean distance, cosine similarity) and a normalization technique…
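
The identity behind this question is short enough to state here. For unit vectors $\|u\| = \|v\| = 1$:

$$\|u - v\|^2 = \|u\|^2 - 2\,u \cdot v + \|v\|^2 = 2 - 2\cos(u, v)$$

So after L2 normalization, Euclidean distance is a monotone decreasing function of cosine similarity, and the two produce identical similarity rankings (the distance values themselves differ).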
32 votes, 3 answers

Why do transformers use layer norm instead of batch norm?

Both batch norm and layer norm are common normalization techniques for neural network training. I am wondering why transformers primarily use layer norm.
asked by SantoshGupta7 (629)
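
The key mechanical difference is the axis of normalization: batch norm averages each feature over the batch, while layer norm averages over the features within each example or token. A minimal NumPy sketch (illustrative only, omitting the learned scale and shift parameters):

```python
import numpy as np

def layer_norm(x, eps=1e-5):
    """Normalize over features, independently per token: no batch statistics."""
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch: output depends on other examples."""
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.random.randn(8, 16)  # (tokens in a batch, feature dimension)
print(layer_norm(x).shape, batch_norm(x).shape)  # (8, 16) (8, 16)
```

Layer norm's independence from the batch is commonly cited as one reason it suits transformers, where sequence lengths and effective batch sizes vary.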
26 votes, 3 answers

Topic models and word co-occurrence methods

Popular topic models like LDA usually cluster words that tend to co-occur into the same topic (cluster). What is the main difference between such topic models and other simple co-occurrence-based clustering approaches like PMI? (PMI…
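
For reference, the PMI the question refers to measures how much more often two words co-occur than independence would predict:

$$\mathrm{PMI}(w_1, w_2) = \log \frac{p(w_1, w_2)}{p(w_1)\, p(w_2)}$$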
24 votes, 3 answers

Why is skip-gram better for infrequent words than CBOW?

I wonder why skip-gram is better for infrequent words than CBOW in word2vec. I have read the claim on https://code.google.com/p/word2vec/.
asked by Franck Dernoncourt (42,093)
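
The two training objectives make the usual explanation concrete. With context window $c$, skip-gram predicts each context word from the center word, while CBOW predicts the center word from its averaged context:

$$\text{skip-gram: } \sum_{t} \sum_{\substack{-c \le j \le c \\ j \ne 0}} \log p(w_{t+j} \mid w_t), \qquad \text{CBOW: } \sum_{t} \log p\big(w_t \mid w_{t-c}, \dots, w_{t+c}\big)$$

A common reading of the claim is that skip-gram gives every occurrence of a rare word its own training pairs, whereas CBOW averages context vectors together, so a rare word's contribution is smoothed out by its (usually more frequent) neighbors.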
23 votes, 1 answer

Has the reported state-of-the-art performance of using paragraph vectors for sentiment analysis been replicated?

I was impressed by the results in the ICML 2014 paper "Distributed Representations of Sentences and Documents" by Le and Mikolov. The technique they describe, called "paragraph vectors", learns unsupervised representations of arbitrarily-long…
19 votes, 2 answers

How is the .similarity method in spaCy computed?

Not sure if this is the right Stack site, but here goes. How does the .similarity method work? Wow, spaCy is great! Its tf-idf model could be easier, but w2v with only one line of code?! In his 10-line tutorial on spaCy, andrazhribernik shows us the…
asked by whs2k (451)
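
Per the spaCy documentation, .similarity returns the cosine between the two objects' vectors, and a Doc's vector defaults to the average of its token vectors. A minimal sketch of that computation (plain NumPy, not spaCy's internals):

```python
import numpy as np

def doc_vector(token_vectors):
    """A Doc's default vector: the average of its token vectors."""
    return np.mean(token_vectors, axis=0)

def similarity(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

tokens_a = np.random.randn(4, 300)  # toy 300-d vectors for a 4-token doc
tokens_b = np.random.randn(6, 300)
print(similarity(doc_vector(tokens_a), doc_vector(tokens_b)))
```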
18 votes, 2 answers

Why does Natural Language Processing not fall under the Machine Learning domain?

I encounter this in many books as well as on the web. Natural Language Processing and Machine Learning are said to be different subsets of Artificial Intelligence. Why is that? We can achieve results of Natural Language Processing by feeding sound patterns to…
asked by user931 (281)