Using other entity types

OpenNLP provides several pre-trained models for different entity types, as listed in the following table. These models can be downloaded from http://opennlp.sourceforge.net/models-1.5/.
The en prefix specifies English as the language, and ner indicates that the model is for NER:

English finder models             Filename
Location name finder model        en-ner-location.bin
Money name finder model           en-ner-money.bin
Organization name finder model    en-ner-organization.bin
Percentage name finder model      en-ner-percentage.bin
Person name finder model          en-ner-person.bin
Time name finder model            en-ner-time.bin
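
The naming convention described above (language prefix, then ner, then the entity type) can be captured in a small helper. This is only a sketch: the class ModelNames and the method nerModelFileName are hypothetical names introduced for illustration and are not part of OpenNLP.

```java
class ModelNames {
    // Hypothetical helper (not an OpenNLP API): builds a file name such as
    // "en-ner-person.bin" from a language prefix and an entity type.
    static String nerModelFileName(String language, String entityType) {
        return language + "-ner-" + entityType + ".bin";
    }

    public static void main(String[] args) {
        // The six entity types listed in the table of English finder models.
        String[] entityTypes = {
            "location", "money", "organization", "percentage", "person", "time"};
        for (String type : entityTypes) {
            System.out.println(nerModelFileName("en", type));
        }
    }
}
```

A helper like this makes it easy to loop over the entity types and load each model in turn rather than editing the file name by hand.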

If we modify the statement that opens the model file so that it uses a different model, we can see how each model performs against the sample sentences:

try (InputStream modelStream = new FileInputStream(
    new File(getModelDir(), "en-ner-time.bin"))) {

The various outputs are shown in the following table:

Model                      Output
en-ner-location.bin        Span: [4..5) location
                           Entity: Boston
                           Probability: 0.8656908776583051
                           Span: [5..6) location
                           Entity: Vermont
                           Probability: 0.9732488014011262
en-ner-money.bin           Span: [14..16) money
                           Entity: 2.45
                           Probability: 0.7200919701507937
en-ner-organization.bin    Span: [16..17) organization
                           Entity: IBM
                           Probability: 0.9256970736336729
en-ner-time.bin            The model was not able to detect time in this text sequence

When the en-ner-money.bin model is used, the index in the tokens array in the earlier code sequence has to be increased by 1. Otherwise, all that is returned is the dollar sign.
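
The reason is that a money span covers two tokens, the dollar sign and the amount, so reading only the token at the span's start index yields "$". The following minimal sketch shows the difference using plain integer indices in place of an OpenNLP Span. The token array is a made-up example rather than the chapter's sample text, and SpanText and spanText are hypothetical names.

```java
import java.util.Arrays;

class SpanText {
    // Joins tokens[start..end) with single spaces; end is exclusive,
    // matching the [start..end) notation used for spans.
    static String spanText(String[] tokens, int start, int end) {
        return String.join(" ", Arrays.copyOfRange(tokens, start, end));
    }

    public static void main(String[] args) {
        // Hypothetical token array, not the sample text from this chapter.
        String[] tokens = {"he", "paid", "$", "2.45", "for", "an", "ale"};
        int start = 2, end = 4; // a money span covering "$" and "2.45"

        System.out.println(tokens[start]);                // first token only: $
        System.out.println(tokens[start + 1]);            // index increased by 1: 2.45
        System.out.println(spanText(tokens, start, end)); // whole span: $ 2.45
    }
}
```

Joining every token in the span avoids special-casing the money model. OpenNLP's Span class also has a static spansToStrings method that performs a similar conversion for an array of spans.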

The time model found no entities in the sample text. This indicates that the model did not have enough confidence to label any part of the text as a time entity.
