LingPipe

LingPipe consists of a set of tools to perform common NLP tasks. It supports model training and testing. There are both royalty-free and licensed versions of the tool. The production use of the free version is limited.

To demonstrate the use of LingPipe, we will illustrate how it can be used to tokenize text using the Tokenizer class. Start by declaring two lists, one to hold the tokens and a second to hold the whitespace:

List<String> tokenList = new ArrayList<>(); 
List<String> whiteList = new ArrayList<>(); 
You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you.

Next, declare a string to hold the text to be tokenized:

String text = "A sample sentence processed 
by 	the " + 
    "LingPipe tokenizer."; 

Now, create an instance of the Tokenizer class. As shown in the following code block, a static tokenizer method is used to create an instance of the Tokenizer class based on an Indo-European factory class:

Tokenizer tokenizer = IndoEuropeanTokenizerFactory.INSTANCE. 
tokenizer(text.toCharArray(), 0, text.length()); 

The tokenize method of this class is then used to populate the two lists:

tokenizer.tokenize(tokenList, whiteList); 

Use a for-each statement to display the tokens:

for(String element : tokenList) { 
  System.out.print(element + " "); 
} 
System.out.println(); 

The output of this example is shown here:

A sample sentence processed by the LingPipe tokenizer

A list of LingPipe links can be found in the following table:

LingPipe

Website

Home



http://alias-i.com/lingpipe/index.html

Tutorials



http://alias-i.com/lingpipe/demos/tutorial/read-me.html

JavaDocs



http://alias-i.com/lingpipe/docs/api/index.html

Download



http://alias-i.com/lingpipe/web/install.html

Core



http://alias-i.com/lingpipe/web/download.html

Models



http://alias-i.com/lingpipe/web/models.html

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset