Using multiple cores with the Stanford pipeline

The annotate method can also take advantage of multiple cores. It is an overloaded method, and one version takes an Iterable<Annotation> as its parameter. This version processes each Annotation instance using the available processors.
We will use the previously defined pipeline object to demonstrate this version of the annotate method.
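First, we create four Annotation objects based on four short sentences. We then add them to a list, pass the list to the annotate method, and display the part-of-speech tags for the second sentence. To take full advantage of the technique, it would be better to use a larger set of data. The following is the working code snippet: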

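// Imports used by this snippet; the pipeline object was defined earlier
import java.util.ArrayList;
import java.util.List;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.SentencesAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.util.CoreMap;

Annotation annotation1 = new Annotation("The robber took the cash and ran.");
Annotation annotation2 = new Annotation("The policeman chased him down the street.");
Annotation annotation3 = new Annotation("A passerby, watching the action, tripped the thief "
    + "as he passed by.");
Annotation annotation4 = new Annotation("They all lived happily ever after, except for the thief "
    + "of course.");

// Collect the annotations so they can be passed to annotate as a single Iterable
ArrayList<Annotation> list = new ArrayList<>();
list.add(annotation1);
list.add(annotation2);
list.add(annotation3);
list.add(annotation4);
Iterable<Annotation> iterable = list;
pipeline.annotate(iterable);
// Each Annotation is updated in place, so results are read from the original objects
List<CoreMap> sentences1 = annotation2.get(SentencesAnnotation.class);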

for (CoreMap sentence : sentences1) {
    for (CoreLabel token : sentence.get(TokensAnnotation.class)) {
        String word = token.get(TextAnnotation.class);
        String pos = token.get(PartOfSpeechAnnotation.class);
        System.out.println("Word: " + word + " POS Tag: " + pos);
    }
}

The output is as follows:

Word: The POS Tag: DT
Word: policeman POS Tag: NN
Word: chased POS Tag: VBD
Word: him POS Tag: PRP
Word: down POS Tag: RP
Word: the POS Tag: DT
Word: street POS Tag: NN
Word: . POS Tag: .
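
To see what processing an Iterable across several cores involves, the following is a minimal sketch that uses only the standard java.util.concurrent classes. It is not how the Stanford pipeline is implemented. The Doc class and the process and processAll methods are hypothetical stand-ins for an Annotation, the per-document annotation work, and the Iterable version of annotate, respectively. As in the example above, each item is updated in place and read after the call returns:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelBatchSketch {

    // Hypothetical stand-in for an Annotation: text plus a result filled in later
    static class Doc {
        final String text;
        int tokenCount = -1;

        Doc(String text) {
            this.text = text;
        }
    }

    // Hypothetical stand-in for the per-document work (here, a whitespace token count)
    static void process(Doc doc) {
        doc.tokenCount = doc.text.trim().split("\\s+").length;
    }

    // Submits one task per item to a fixed-size thread pool and waits for all of them
    static void processAll(Iterable<Doc> docs, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (Doc doc : docs) {
                pending.add(pool.submit(() -> process(doc)));
            }
            // Future.get() blocks until the task is done and makes its writes visible
            for (Future<?> future : pending) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while waiting for tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("A task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Doc> docs = new ArrayList<>();
        docs.add(new Doc("The robber took the cash and ran."));
        docs.add(new Doc("The policeman chased him down the street."));

        // Use one worker thread per available core
        processAll(docs, Runtime.getRuntime().availableProcessors());

        for (Doc doc : docs) {
            System.out.println(doc.tokenCount + " tokens: " + doc.text);
        }
    }
}
```

With only a handful of short items, the cost of starting the threads will likely outweigh any gain, which is why a larger set of data is a better fit for this technique.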