Part-of-speech (POS) tagging

On Tuesday, October 10, 2017, between 9:34 AM and 9:36 AM, the US Dow Jones newswire encountered a technical error that resulted in it posting some strange headlines. One of them was, "Google to buy Apple." These four words managed to send Apple stock up over two percent.

The algorithmic trading systems failed to recognize that such an acquisition would have been impossible: Apple had a market capitalization of $800 billion at the time, and the move would have been unlikely to win regulatory approval.

So why did the trading algorithms choose to buy stock based on these four words? The answer lies in part-of-speech (POS) tagging. POS tagging identifies which function each word takes in a sentence and how the words relate to each other.

spaCy comes with a handy, pretrained POS tagger. In this section we're going to apply this to the Google/Apple news story. To start the POS tagger, we need to run the following code:

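import spacy
from spacy import displacy

# 'en' is the spaCy 2.x shortcut; on spaCy 3.x, load 'en_core_web_sm' instead
nlp = spacy.load('en')

# Run the headline through the pretrained pipeline
doc = nlp('Google to buy Apple')

# Draw the dependency parse; 'distance' sets the spacing between words in pixels
displacy.render(doc, style='dep', jupyter=True, options={'distance': 120})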

Again, we will load the pretrained English model and run our sentence through it. Then we'll use displacy just as we did for NER.

To make the graphics fit better in this book, we will set the distance option to something shorter than the default, in this case 120, so that words are displayed closer together, as we can see in the following diagram:

Figure: spaCy POS tagger

As you can see, the POS tagger identified buy as a verb and Google and Apple as the nouns in the sentence. It also identified that Apple is the object the action is applied to and that Google is applying the action.
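The information shown in the diagram can also be read from the tokens directly. The following is a minimal sketch rather than a definitive recipe: `token_table` and `demo` are hypothetical helper names introduced here for illustration, and the exact tags printed depend on the spaCy version and model installed.

```python
def token_table(doc):
    """Return one (text, POS tag, dependency label, head text) row per token."""
    return [(tok.text, tok.pos_, tok.dep_, tok.head.text) for tok in doc]


def demo(model_name='en'):
    """Parse the headline and print one row per token (needs spaCy and a model)."""
    import spacy  # imported lazily so token_table stays usable on its own

    nlp = spacy.load(model_name)
    for row in token_table(nlp('Google to buy Apple')):
        print(*row, sep='\t')
```

Calling `demo()` prints each word with its part of speech, its dependency label, and the word it attaches to, which is the same structure the arrows in the diagram represent.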

We can access this information for nouns through this code:

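import spacy

nlp = spacy.load('en')
doc = nlp('Google to buy Apple')

# For each noun chunk, print its text, its root token, the root's dependency
# label, and the head word that the root attaches to
for chunk in doc.noun_chunks:
    print(chunk.text, chunk.root.text, chunk.root.dep_, chunk.root.head.text)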

Running the preceding code produces the following result, shown here as a table:

Text      Root Text   Root dep   Root Head Text
Google    Google      ROOT       Google
Apple     Apple       dobj       buy

In our example, Google is the root of the sentence, while Apple is its direct object (dobj). The verb applied to Apple is "buy."

From there, it takes only a hard-coded model of price developments under an acquisition (demand for the target stock goes up, and with it the price) and a stock lookup table to arrive at a simple event-driven trading algorithm. Making these algorithms understand context and plausibility is another story, however.
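To make the idea concrete, here is a deliberately naive sketch of such a rule. It is an illustration under stated assumptions, not how any real trading system works: `ACQUISITION_VERBS`, `TICKERS`, and `acquisition_signal` are hypothetical names, and the input rows are hard-coded to match the noun-chunk results above so that the example runs without spaCy.

```python
# Hypothetical trigger verbs for the hard-coded "acquisition" event model
ACQUISITION_VERBS = {"buy", "acquire"}

# Hypothetical stock lookup table mapping company names to ticker symbols
TICKERS = {"Apple": "AAPL", "Google": "GOOGL"}


def acquisition_signal(rows):
    """Return ("BUY", ticker) for the first acquisition target found, else None.

    rows: (chunk text, root dependency label, root head text) tuples,
    as printed by the noun-chunk loop.
    """
    for text, dep, head in rows:
        # The target is the direct object of an acquisition verb
        if dep == "dobj" and head.lower() in ACQUISITION_VERBS:
            ticker = TICKERS.get(text)
            if ticker is not None:
                return ("BUY", ticker)
    return None


rows = [("Google", "ROOT", "Google"), ("Apple", "dobj", "buy")]
print(acquisition_signal(rows))  # ('BUY', 'AAPL')
```

Nothing in this rule checks whether the acquirer could afford the target or whether the deal is plausible, which is exactly the gap the erroneous headline exposed.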
