Named-entity recognition (NER), also known as entity identification or entity extraction, is a subtask of information extraction that seeks to locate and classify elements in text into predefined categories such as names of persons, organizations, and locations, time expressions, quantities, monetary values, and percentages.
Questions tagged [named-entity-recognition]
24 questions
5
votes
2 answers
Evaluation metric for named entity recognition
I was watching Manning's lecture on evaluation of NER models, and I'm confused why between 4:07 and 5:13, he states that the error in not labeling one word in a sequence while correctly labeling the rest is considered as both a false positive (fp)…

Vivek Subramanian
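The counting described in the question above can be illustrated with a small sketch. It assumes exact-match, entity-level scoring over BIO tags, which is a common convention for NER evaluation; the lecture's exact setup is not shown in the excerpt, and the helper names `bio_spans` and `entity_counts` as well as the example tags are hypothetical. Under that assumption, a predicted span that misses one word matches no gold span, so it counts as one false positive, and the unmatched gold span counts as one false negative.

```python
def bio_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO tag sequence."""
    spans, start, etype = set(), None, None
    # A trailing "O" sentinel closes any span that runs to the end of the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if start is not None and not continues:
            spans.add((start, i, etype))  # end index is exclusive
            start, etype = None, None
        # Open a span on B-, or on a stray I- that has no span to continue.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, etype = i, tag[2:]
    return spans


def entity_counts(gold, pred):
    """Return (tp, fp, fn) under exact span-and-type matching."""
    g, p = bio_spans(gold), bio_spans(pred)
    return len(g & p), len(p - g), len(g - p)


# Hypothetical five-token sentence: the prediction drops the first word of the entity.
gold = ["B-ORG", "I-ORG", "I-ORG", "I-ORG", "O"]
pred = ["O", "B-ORG", "I-ORG", "I-ORG", "O"]
tp, fp, fn = entity_counts(gold, pred)
print(tp, fp, fn)
```

With token-level scoring instead, the same prediction would get credit for three of the four entity tokens, which is why the two conventions can disagree sharply on boundary errors.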
5
votes
1 answer
Sequence length when training a conditional random field (CRF)
I am training a conditional random field (CRF) to perform named entity recognition. I have 1000 documents, each containing from 100 to 500 sentences.
During the training phase, is it better to train sentence per sentence, or document per document?…

Franck Dernoncourt
3
votes
1 answer
Reinforcement Learning & Text Mining
I was wondering if one could use Reinforcement Learning (as it is becoming more and more trendy thanks to DeepMind and AlphaGo) to parse and extract information from text.
For example, could it be a competitive approach to structured…

mic
2
votes
1 answer
Best practice for named entity recognition on large texts
What are the best practices for applying NER to large texts (e.g., 20+ pages)?
One common piece of advice is to split the text before passing it as input to the model. However, this can require significant manual work to establish splitting rules, especially…

mobupu
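One rule-free way to do the splitting mentioned in the question above is fixed-size overlapping windows. The following is a minimal sketch under stated assumptions, not a recommended practice taken from the question or its answer: the functions `windows` and `merge_spans`, the `size` and `overlap` parameters, and the `(start, end, label)` span format are all hypothetical, and no real NER model is called.

```python
def windows(tokens, size=200, overlap=50):
    """Yield (offset, chunk) pairs that cover tokens with overlapping windows."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    start = 0
    while True:
        yield start, tokens[start:start + size]
        if start + size >= len(tokens):
            break  # this window already reaches the end of the text
        start += step


def merge_spans(chunk_results):
    """Shift chunk-relative (start, end, label) spans to document offsets.

    Spans found twice in an overlap region collapse into one entry.
    """
    return sorted({(offset + s, offset + e, label)
                   for offset, spans in chunk_results
                   for s, e, label in spans})


# Toy run: ten tokens, windows of four tokens sharing one token with the next.
chunks = list(windows(list(range(10)), size=4, overlap=1))

# Hand-written stand-ins for per-chunk model output (assumed format).
merged = merge_spans([(0, [(3, 4, "LOC")]),
                      (3, [(0, 1, "LOC")]),
                      (6, [(1, 3, "PER")])])
```

The overlap is there so that an entity cut by one window boundary can still appear whole in the neighbouring window; entities longer than the overlap, and conflicting predictions from two windows, would need extra handling that this sketch leaves out.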
2
votes
1 answer
Named entity recognition with only one pure entity (no context)?
We know that we can extract entities from a sentence using named entity recognition, but what if the sentence contains only an entity and no other context?
For example, we can use CRF for the following sentence:
Conditional random fields (CRFs)…

Lerner Zhang
2
votes
1 answer
Named entity recognition (NER) features
I'm new to Named Entity Recognition and I'm having some trouble understanding what/how features are used for this task.
Some papers I've read so far mention features used, but don't really explain them, for example in
Introduction to the…

Mr. Phil
2
votes
1 answer
Is there any advantage of using MEMM instead of CRF for named-entity recognition?
I wonder whether there is any advantage to using a maximum-entropy Markov model (MEMM), a.k.a. conditional Markov model (CMM), instead of conditional random fields (CRF) for named-entity recognition, aside from the training cost.

Franck Dernoncourt
1
vote
0 answers
Is NER suitable for selecting an entire sentence as an entity?
If I have a document with many paragraphs of text, would using a custom NER model be suitable to identify a sentence as a recognized entity? The desired sentences will be similar in semantic structure--but not similar enough to use RegEx!

Jason p
1
vote
1 answer
Referring multiple names to the same entity
I am working on the models of different product types and wish to generalize them to the same entity. For example, from the given list
Toshiba-A40C
Toshiba B30
Toshiba-Z40C411
Asus -X540
Asus R4
Dell XPS 15
Dell Inspiron 13
I would like to get a…

user_01
1
vote
1 answer
NER at sentence level or document level?
Should NER models (LSTM or CRF) take input training data at sentence level or paragraph level?
Let's say we have this input text, and we would like to do Named Entity Extraction:
GOP Sen. Rand Paul was assaulted in his home in Bowling Green,
…

Frank
1
vote
0 answers
Identifying locations in a difficult OCR-read English text with Python
My goal is to identify the city and (US) state of both the inventor and the assignee of US patents from the 1910s and 1920s. These patents are provided by Google and look like so, like so or like so. The information on the inventor's location is stored in the…

MERose
1
vote
0 answers
Sequence Tagging with Incomplete Labels
I have a few thousand sentences with their B/I/O NER tags. I also have access to millions of sentences for which only some words are tagged. That is, for the other words, I don't know whether they are B, I or O.
Is there some literature that…

mossaab
1
vote
0 answers
Ranking Text Posts based on NER Tags
I have a list of NER tags ranked by the number of times they appear in a database of articles, with the highest count listed first.
I am trying to figure out how to score the articles based on:
How many times the tags appear in them
The weight of…

David Kobia
1
vote
0 answers
Latent Dirichlet allocation for name disambiguation
I'm trying to implement latent Dirichlet allocation (LDA) on a name disambiguation project. My data set includes a corpus of documents. Each document looks like:
Author, co-author, title, institution
I understand that the input for LDA should be a…

casualprogrammer
0
votes
0 answers
Using NER to classify messages based on keywords and rank them by importance
I am new to NLP. I am trying to classify a set of messages by specific keywords, according to how important they are to a user. For example, you would be given a list of messages and then a person would come and choose the ones they think are important. I want…