Searching for the answer

Once we know the type of question, we can use the relations found in the text to answer it. To illustrate this process, we will develop the processWhoQuestion method. This method uses the TypedDependency list to gather the information needed to answer a "who" question about presidents. Specifically, we need to know which president the question is about, based on the president's ordinal rank.

We will also need a list of presidents to search for relevant information. The createPresidentList method builds this list. It reads a file, PresidentList, in which each line contains a president's name, inauguration year, and last year in office. The file uses the following format and can be downloaded from https://github.com/PacktPublishing/Natural-Language-Processing-with-Java-Second-Edition:

    George Washington   (1789-1797) 

The following createPresidentList method demonstrates the use of OpenNLP's SimpleTokenizer class to tokenize each line. A variable number of tokens make up a president's name. Once that is determined, the dates are easily extracted:

public List<President> createPresidentList() { 
    ArrayList<President> list = new ArrayList<>(); 
    String line = null; 
    try (FileReader reader = new FileReader("PresidentList"); 
            BufferedReader br = new BufferedReader(reader)) { 
        while ((line = br.readLine()) != null) { 
            SimpleTokenizer simpleTokenizer =  
                SimpleTokenizer.INSTANCE; 
            String tokens[] = simpleTokenizer.tokenize(line); 
            String name = ""; 
            String start = ""; 
            String end = ""; 
            int i = 0; 
            while (!"(".equals(tokens[i])) { 
                name += tokens[i] + " "; 
                i++; 
            } 
            start = tokens[i + 1]; 
            end = tokens[i + 3]; 
            if (end.equalsIgnoreCase("present")) { 
                end = start; 
            } 
            list.add(new President(name,  
                Integer.parseInt(start), 
                Integer.parseInt(end))); 
        } 
    } catch (IOException ex) { 
        // Handle exceptions 
    } 
    return list; 
} 
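
The token-walking loop above relies on the line format staying fixed. As a cross-check, the same three fields can be pulled out with a regular expression. This is a swapped-in technique rather than the tokenizer-based approach used in this section, and it is only a minimal sketch: the PresidentLineParser class and its parse method are hypothetical names, and the pattern assumes a plain hyphen between the years, as in the sample line.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the book's code.
class PresidentLineParser {
    // A name, then "(", a four-digit start year, "-", and either a
    // four-digit end year or the word "present", then ")".
    private static final Pattern LINE = Pattern.compile(
        "\\s*(.+?)\\s*\\((\\d{4})\\s*-\\s*(\\d{4}|present)\\)\\s*",
        Pattern.CASE_INSENSITIVE);

    /** Returns {name, start, end}, or null if the line does not match. */
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        String start = m.group(2);
        String end = m.group(3);
        // Mirror createPresidentList: an open-ended term reuses the start year.
        if (end.equalsIgnoreCase("present")) {
            end = start;
        }
        return new String[] { m.group(1), start, end };
    }

    public static void main(String[] args) {
        String[] fields = parse("George Washington   (1789-1797) ");
        System.out.println(String.join("|", fields));
    }
}
```

Unlike the tokenizer version, this keeps "D." intact in a name such as Franklin D. Roosevelt, and it returns null for a malformed or blank line instead of running past the end of the token array.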

The President class holds presidential information, as shown here. The getter methods have been left out:

public class President { 
    private String name; 
    private int start; 
    private int end; 
 
    public President(String name, int start, int end) { 
        this.name = name; 
        this.start = start; 
        this.end = end; 
    } 
    ... 
} 

The processWhoQuestion method follows. We use typed dependencies again to extract the ordinal value from the question. If the governor is "president" and the relation is an adjectival modifier, then the dependent word is the ordinal. This string is passed to the getOrder method, which returns the ordinal as an integer. We subtract 1 from it because ordinals start at one, whereas list indexes start at zero:

public void processWhoQuestion(List<TypedDependency> tdl) { 
    List<President> list = createPresidentList(); 
    for (TypedDependency dependency : tdl) { 
        if ("president".equalsIgnoreCase( 
                dependency.gov().originalText()) 
                && "adjectival modifier".equals( 
                  dependency.reln().getLongName())) { 
            String positionText =  
                dependency.dep().originalText(); 
            int position = getOrder(positionText)-1; 
            System.out.println("The president is "  
                + list.get(position).getName()); 
        } 
    } 
}
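
The call to list.get(position) above throws an exception if the ordinal falls outside the list, for example when the question asks about the "99th president". One way to guard the lookup is sketched below. The PresidentLookup class and its byOrdinal method are hypothetical names introduced for illustration, and the example uses plain strings in place of President objects so that it stands alone.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of the book's code.
class PresidentLookup {
    /** Maps a one-based ordinal to a zero-based index, if it is in range. */
    static <T> Optional<T> byOrdinal(List<T> list, int ordinal) {
        int index = ordinal - 1;
        if (index < 0 || index >= list.size()) {
            return Optional.empty();
        }
        return Optional.ofNullable(list.get(index));
    }

    public static void main(String[] args) {
        List<String> names = List.of("George Washington", "John Adams");
        System.out.println(byOrdinal(names, 2).orElse("unknown"));
        System.out.println(byOrdinal(names, 99).orElse("unknown"));
    }
}
```

Returning an Optional leaves it to the caller to decide how to respond when no president matches the requested ordinal.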

The getOrder method follows. It simply takes the leading digits of the string, such as the 32 in "32nd", and converts them to an integer. A more sophisticated version would handle other variations, including words such as "first" and "sixteenth":

private static int getOrder(String position) { 
    String tmp = ""; 
    int i = 0; 
    while (i < position.length() 
            && Character.isDigit(position.charAt(i))) { 
        tmp += position.charAt(i++); 
    } 
    return Integer.parseInt(tmp); 
} 
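
A version that also accepts spelled-out ordinals might look like the following. This is only a sketch under stated assumptions: the OrdinalParser class is a hypothetical name, the word table is a small illustrative subset rather than a complete list, and returning -1 for unrecognized input is a design choice made here, not behavior taken from the original method.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the book's code.
class OrdinalParser {
    // Illustrative subset only; a fuller table would cover every ordinal needed.
    private static final Map<String, Integer> WORDS = Map.of(
        "first", 1, "second", 2, "third", 3, "fourth", 4, "fifth", 5,
        "tenth", 10, "sixteenth", 16);

    /** Returns the ordinal as an integer, or -1 if it is not recognized. */
    static int getOrder(String position) {
        String text = position.trim().toLowerCase(Locale.ROOT);
        Integer word = WORDS.get(text);
        if (word != null) {
            return word;
        }
        // Fall back to leading digits, as in "32nd" or "32".
        int i = 0;
        while (i < text.length() && Character.isDigit(text.charAt(i))) {
            i++;
        }
        return i == 0 ? -1 : Integer.parseInt(text.substring(0, i));
    }

    public static void main(String[] args) {
        System.out.println(getOrder("32nd"));
        System.out.println(getOrder("Sixteenth"));
    }
}
```

A caller would need to check for -1 before using the result as a list index.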

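When executed, we get the following output. The space before the period appears because the tokenizer splits "D." into two tokens, which createPresidentList rejoins with spaces: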

The president is Franklin D . Roosevelt

This implementation is a simple example of how information can be extracted from a sentence and used to answer questions. The other types of questions can be implemented in a similar fashion and are left as an exercise for the reader.
