Extracting information through textual analysis

Using the text analytics API, we can analyze text in several ways. We will cover language detection, key-phrase extraction, and sentiment analysis. A newer feature is topic detection; because it requires a large amount of sample text, we will not go into detail on it.

For all our text-analysis tasks, we will be using a new View. Add a new View called TextAnalysisView.xaml to the View folder. This should contain a TextBox element for the input query and another TextBox element for the result. We will also need three Button elements, one for each analysis that we will perform.

We will also need a new ViewModel, so add TextAnalysisViewModel.cs to the ViewModel folder. In this, we need two string properties, one for each TextBox. Also add three ICommand properties, one for each of our buttons.
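As a rough illustration of the properties described above, the following is a minimal sketch. It assumes the ViewModel raises change notifications through INotifyPropertyChanged directly; the class name TextAnalysisViewModelSketch is hypothetical, and a real project may inherit this plumbing from a shared base class instead.

```csharp
using System.ComponentModel;
using System.Runtime.CompilerServices;
using System.Windows.Input;

// Hypothetical skeleton of the ViewModel; only the properties are shown.
public class TextAnalysisViewModelSketch : INotifyPropertyChanged
{
    private string _inputQuery;
    private string _result;

    public event PropertyChangedEventHandler PropertyChanged;

    // Bound to the TextBox that holds the input query.
    public string InputQuery
    {
        get { return _inputQuery; }
        set { _inputQuery = value; RaisePropertyChanged(); }
    }

    // Bound to the TextBox that displays the result.
    public string Result
    {
        get { return _result; }
        set { _result = value; RaisePropertyChanged(); }
    }

    // One command per button; these are assigned in the constructor.
    public ICommand DetectLanguageCommand { get; private set; }
    public ICommand DetectKeyPhrasesCommand { get; private set; }
    public ICommand DetectSentimentCommand { get; private set; }

    // Notifies the UI that the calling property has changed.
    private void RaisePropertyChanged([CallerMemberName] string propertyName = null)
    {
        PropertyChanged?.Invoke(this, new PropertyChangedEventArgs(propertyName));
    }
}
```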

If you have not already done so, register for an API key at https://portal.azure.com.

Add a private member called _webRequest of the WebRequest type. With that in place, we can create our constructor, as shown in the following code:

    public TextAnalysisViewModel()
    {
        _webRequest = new WebRequest("ROOT_URI", "API_KEY_HERE");
        DetectLanguageCommand = new DelegateCommand(DetectLanguage, CanExecuteOperation);
        DetectKeyPhrasesCommand = new DelegateCommand(DetectKeyPhrases, CanExecuteOperation);
        DetectSentimentCommand = new DelegateCommand(DetectSentiment, CanExecuteOperation);
    }

The constructor creates a new WebRequest object, specifying the API endpoint and API key. We then go on to create the DelegateCommand objects for our ICommand properties. The CanExecuteOperation function should return true if we have entered the input query and false otherwise.
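The CanExecuteOperation function is not listed here. A minimal sketch of the check described above might look as follows, assuming that DelegateCommand expects a predicate taking an object parameter; the wrapper class CanExecuteSketch is a hypothetical stand-in for the ViewModel.

```csharp
// Hypothetical stand-in for the ViewModel, reduced to the part that matters here.
public class CanExecuteSketch
{
    public string InputQuery { get; set; }

    // Returns true only when an input query has been entered.
    // The object parameter is assumed to be required by DelegateCommand and is unused.
    public bool CanExecuteOperation(object obj)
    {
        return !string.IsNullOrEmpty(InputQuery);
    }
}
```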

Detecting language

The API can detect which of over 120 languages a text is written in.

This is a POST call, so we need to send a request body. The body consists of documents: an array in which each entry contains a unique id and the text itself, as shown in the following code:

    private async void DetectLanguage(object obj)
    {
        var queryString = HttpUtility.ParseQueryString("languages");
        TextRequests request = new TextRequests
        {
            documents = new List<TextDocumentRequest>
            {
                new TextDocumentRequest {id="FirstId", text=InputQuery}                            
            }
        };

        TextResponse response = await _webRequest.MakeRequest<TextRequests, TextResponse>(HttpMethod.Post, queryString.ToString(), request);

We create a queryString specifying the REST endpoint that we want to reach. Then we go on to create a TextRequests contract, which contains documents. As we only want to check one piece of text, we add one TextDocumentRequest contract, specifying an id and the text.
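The data contracts themselves are not listed in this section. The following is a sketch of what they could look like, with the property names inferred from how the code above uses them. Serializing with System.Text.Json is an assumption made for illustration; the helper class RequestBodySketch is hypothetical, and the project may use a different serializer.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Property names are lowercase so that they serialize to the JSON field names as-is.
public class TextDocumentRequest
{
    public string id { get; set; }
    public string text { get; set; }
    public string language { get; set; }   // left null for language detection
}

public class TextRequests
{
    public List<TextDocumentRequest> documents { get; set; }
}

// Hypothetical helper that shows the shape of the request body.
public static class RequestBodySketch
{
    public static string ToJson(TextRequests request)
    {
        // Null properties are skipped, so an unset language is not sent.
        var options = new JsonSerializerOptions
        {
            DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
        };
        return JsonSerializer.Serialize(request, options);
    }
}
```

Passing a TextRequests object with a single document to ToJson yields a JSON object whose documents array holds one entry with the id and text fields.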

When the request is created, we call MakeRequest. We expect the response to be of a TextResponse type and the request body to be of a TextRequests type. We pass along POST as the call method, the queryString, and the request body.
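MakeRequest itself is not shown in this section. As a rough sketch of what such a helper could assemble before sending, the following builds an HttpRequestMessage from the method, endpoint, and JSON body. The class RequestSketch and its Build method are hypothetical, and the use of the Ocp-Apim-Subscription-Key header for the API key is an assumption about how the helper authenticates.

```csharp
using System;
using System.Net.Http;
using System.Text;

// Hypothetical helper; not the WebRequest class used in this chapter.
public static class RequestSketch
{
    public static HttpRequestMessage Build(
        HttpMethod method, string rootUri, string endpoint, string apiKey, string jsonBody)
    {
        // The endpoint (for example "languages") is appended to the root URI.
        var message = new HttpRequestMessage(method, new Uri(rootUri + endpoint));

        // Assumed authentication scheme: the API key travels in a request header.
        message.Headers.Add("Ocp-Apim-Subscription-Key", apiKey);

        // The serialized documents are sent as a JSON body.
        message.Content = new StringContent(jsonBody, Encoding.UTF8, "application/json");
        return message;
    }
}
```

The resulting message could then be passed to HttpClient.SendAsync, with the response body deserialized into the expected response type.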

If the response is successful, then we loop through the detectedLanguages. We add the languages to a StringBuilder, also outputting the probability of that language being correct. This is then displayed in the UI, as shown in the following code:

    if(response.documents == null || response.documents.Count == 0)
    {
        Result = "No languages were detected.";
        return;
    }

    StringBuilder sb = new StringBuilder();

    foreach (TextLanguageDocuments document in response.documents)
    {
        foreach (TextDetectedLanguages detectedLanguage in document.detectedLanguages)
        {
            sb.AppendFormat("Detected language: {0} with score {1}\n", detectedLanguage.name, detectedLanguage.score);
        }
    }

    Result = sb.ToString();

A successful response will contain the following JSON:

    {
        "documents": [
        {
            "id": "string",
            "detectedLanguages": [
            {
                "name": "string",
                "iso6391Name": "string",
                "score": 0.0
            }]
        }],
        "errors": [
        {
            "id": "string",
            "message": "string"
        }]
    }

This contains an array of documents, as many as were provided in the request. Each document is marked with a unique id and contains an array of detectedLanguages. Each detected language has a name, an iso6391Name, and the probability (score) of being correct.

If any errors occur for any document, we will get an array of errors. Each error will contain the id of the document where the error occurred and the message as a string.
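To make the mapping between this JSON and the response contracts concrete, here is a sketch that deserializes such a response. The names TextResponse, TextLanguageDocuments, and TextDetectedLanguages come from the code above, but their exact shape is inferred from the JSON. TextErrors is a hypothetical name for the error contract, and System.Text.Json is an assumed choice of serializer.

```csharp
using System.Collections.Generic;
using System.Text.Json;

public class TextDetectedLanguages
{
    public string name { get; set; }
    public string iso6391Name { get; set; }
    public double score { get; set; }
}

public class TextLanguageDocuments
{
    public string id { get; set; }
    public List<TextDetectedLanguages> detectedLanguages { get; set; }
}

// Hypothetical name; the error contract is not shown in this section.
public class TextErrors
{
    public string id { get; set; }
    public string message { get; set; }
}

public class TextResponse
{
    public List<TextLanguageDocuments> documents { get; set; }
    public List<TextErrors> errors { get; set; }
}

public static class ResponseSketch
{
    // Property names match the JSON fields exactly, so no naming policy is needed.
    public static TextResponse Parse(string json)
    {
        return JsonSerializer.Deserialize<TextResponse>(json);
    }
}
```

After parsing, the errors list can be checked alongside documents, so that a failure for one document is reported rather than silently dropped.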

A successful call will create a result similar to the one shown in the following screenshot:

[Screenshot: Detecting language]

Extracting key phrases from text

Extracting key phrases from text may be useful if we want our application to know key talking points. Using this, we can learn what people are discussing in articles, discussions, or other such sources of text.

This call also uses the POST method, which requires a request body. As with language detection, we need to specify documents. Each document will need a unique ID, the text, and the language used. At the time of writing, English, German, Spanish, and Japanese are the only languages that are supported.
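Because only a few languages are supported, it can be worth checking the language code before sending a request. The following is a small sketch of such a check; the class KeyPhraseLanguages is hypothetical, and the list reflects only the four languages named above, written as ISO 639-1 codes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the set would need updating as more languages are supported.
public static class KeyPhraseLanguages
{
    // ISO 639-1 codes for English, German, Spanish, and Japanese.
    private static readonly HashSet<string> Supported =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { "en", "de", "es", "ja" };

    // Returns true if key-phrase extraction can be requested for the given code.
    public static bool IsSupported(string languageCode)
    {
        return languageCode != null && Supported.Contains(languageCode);
    }
}
```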

To extract key phrases, we use the following code:

    private async void DetectKeyPhrases(object obj)
    {
        var queryString = HttpUtility.ParseQueryString("keyPhrases");
        TextRequests request = new TextRequests
        {
            documents = new List<TextDocumentRequest>
            {
                new TextDocumentRequest { id = "FirstId", text = InputQuery, language = "en" }
            }
        };

        TextKeyPhrasesResponse response = await _webRequest.MakeRequest<TextRequests, TextKeyPhrasesResponse>(HttpMethod.Post, queryString.ToString(), request);

As you can see, it is quite similar to detecting languages. We create a queryString using keyPhrases as the REST endpoint. We create a request object of the TextRequests type. We add the documents list, creating one new TextDocumentRequest. Again, we need the id and text, but we have also added a language tag. The rest of the method handles the response, as shown in the following code:

    if (response.documents == null || response.documents.Count == 0)
    {
        Result = "No key phrases found.";
        return;
    }

    StringBuilder sb = new StringBuilder();
            
    foreach (TextKeyPhrasesDocuments document in response.documents)
    {
        sb.Append("Key phrases found:\n");
        foreach (string phrase in document.keyPhrases)
        {
            sb.AppendFormat("{0}\n", phrase);
        }
    }

    Result = sb.ToString();

If the response contains any key phrases, we loop through them and output them to the UI. A successful response will provide the following JSON:

    {
        "documents": [{
            "keyPhrases": [
            "string" ],
            "id": "string"
        }],
        "errors": [
        {
            "id": "string",
            "message": "string"
        } ]
    }

Here we have an array of documents. Each document has a unique id, corresponding to the id in the request. Each document also contains keyPhrases, an array of strings.

As with language detection, any errors will be returned as well.

Learning whether a text is positive or negative

Using sentiment analysis, we can detect whether or not a text is positive. If you have a merchandise website where users can submit feedback, this feature can automatically analyze whether the feedback is generally positive or negative.

Each sentiment score is returned as a number between 0 and 1, where a high number indicates a positive sentiment.
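One way to turn such a score into a readable label is sketched below. It assumes a cutoff of 0.5, the same threshold that the response-handling code in this section uses; the class SentimentSketch is hypothetical, and the invariant culture is used so that the score is always printed with a decimal point.

```csharp
using System.Globalization;

// Hypothetical helper that maps a score in the range 0 to 1 to a description.
public static class SentimentSketch
{
    public static string Describe(double score)
    {
        // Scores of 0.5 and above are treated as positive (assumed cutoff).
        string label = score >= 0.5 ? "positive" : "negative";
        return string.Format(CultureInfo.InvariantCulture,
            "Sentiment is {0}, with a score of {1}", label, score);
    }
}
```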

As with the previous two analyses, this is a POST call, requiring a request body. Again, we need to specify the documents, and each document requires a unique ID, the text, and the language, as shown in the following code:

    private async void DetectSentiment(object obj)
    {
        var queryString = HttpUtility.ParseQueryString("sentiment");
        TextRequests request = new TextRequests
        {
            documents = new List<TextDocumentRequest>
            {
                new TextDocumentRequest { id = "FirstId", text = InputQuery, language = "en" }
            } 
        };

        TextSentimentResponse response = await _webRequest.MakeRequest<TextRequests, TextSentimentResponse>(HttpMethod.Post, queryString.ToString(), request);

We create a queryString pointing to sentiment as the REST endpoint. The data contract is TextRequests, containing documents. The document we pass on has a unique id, the text, and the language.

The call to MakeRequest takes a request body of the TextRequests type, and we expect the result to be of the TextSentimentResponse type.

If the response contains any documents, we loop through them. For each document, we check the score and output whether the text is positive or negative. This is then shown in the UI, as follows:

    if (response.documents == null || response.documents.Count == 0)
    {
        Result = "No sentiments detected";
        return;
    }

    StringBuilder sb = new StringBuilder();

    foreach (TextSentimentDocuments document in response.documents)
    {
        sb.AppendFormat("Document ID: {0}\n", document.id);

        if (document.score >= 0.5)
            sb.AppendFormat("Sentiment is positive, with a score of {0}\n", document.score);
        else
            sb.AppendFormat("Sentiment is negative, with a score of {0}\n", document.score);
    }

    Result = sb.ToString();

A successful response will result in the following JSON:

    {
        "documents": [
        {
            "score": 0.0,
            "id": "string"
        }],
        "errors": [
        {
            "id": "string",
            "message": "string"
        }]
    }

This is an array of documents. Each document has an id corresponding to the one in the request, along with the sentiment score. If any errors have occurred, they are returned in the same way as in the language and key-phrase detection sections.

A successful test can look like the following:

[Screenshot: Learning whether a text is positive or negative]