Chapter 1. Analytical thinking and the AI-driven enterprise

This book is about improving our analytical skills to generate value by using the most recent advances in artificial intelligence (AI). In case you’re wondering about the relation between the two, don’t worry, that’s the topic of this chapter. But first we must understand what AI is and why everyone’s talking about it.

AI is all about making predictions

Even though AI is the buzzword, many practitioners prefer the less sexy machine learning (ML) label for what they do. If you search online, you may find Venn diagrams suggesting that ML is part of AI, and this may well be true on an academic level.1 But in practice, today, when someone says they’re using AI to solve a business problem, they are almost certainly using prediction algorithms, so at this point we can take the two terms as synonyms.

How does prediction work

Think about how we make predictions on a day-to-day basis. One very common situation that almost everyone can relate to is deciding what time to leave home to make it to work on time. The first time we did it, we may have asked others and taken their recommendation.

Alternatively, we may have played it safe and left really early, most likely overestimating the time it would take us to get to the office. After driving or taking the train over and over, we learned to make more reliable predictions. Here, as in any other prediction problem, we learn from our environment and adjust our predictions accordingly.

It shouldn’t come as a surprise, then, that machine-created predictions apply the same principle: we feed data into algorithms designed to provide accurate predictions when certain conditions hold.

Of course, we need to define precisely what “accurate” means, and if you’re interested, the Appendix provides a more formal introduction to ML. But for now it will suffice to say that the vast majority of learning algorithms we use at the enterprise work as follows:

  1. Start by creating a model of how the world works, or more precisely, of how the outcome we want to predict comes about.

  2. Feed the model data that relates closely to that outcome and use it to make a prediction.

  3. Compare the prediction with the actual outcome and define a criterion to identify how good the prediction is.

  4. Using some sensible way to update our predictions, calibrate the model until we feel comfortable with the prediction accuracy.

Let’s put this basic learning algorithm into practice using our daily commute example:

  1. The time it takes us to get to our destination depends on the distance and the speed. Speed itself depends on the amount of traffic and, we conjecture, follows a U shape as a function of departure time: it is high when there is no traffic, decreases to a minimum at peak hours, and increases afterward. It follows that commute time traces an inverted U shape as a function of departure time, reaching a maximum during rush hour.

  2. Try leaving at 6:30 am and measure the time it takes to get to the office. Repeat at 6:45 am, 6:57 am, 7:15 am, and so on. Our input data is the departure time, and the output we wish to predict is the commute time.

  3. Make an initial guess for our inverted-U model and use it to predict the commute time if we leave tomorrow at 6:50 am. Leave at 6:50 am, measure the actual time, and compare it with the prediction.

  4. Recalibrate the inverted-U model and repeat.

What I just described is called a “supervised” learning algorithm. In steps 3 and 4, our experience helps to supervise the calibration process since we are able to measure precisely how good or bad our predictions were.

This process is replicated, at scale, using computing power and smart updating rules when we use supervised machine learning algorithms. These are at the core of the modern data scientist’s toolkit.
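
To make this concrete, here is a minimal sketch in Python of what calibrating the commute model could look like. The departure times, the measured commute times, and the quadratic (inverted-U) functional form are all illustrative assumptions, not real data or the only possible choices.

    import numpy as np

    # Toy training data (made up for illustration): departure time in hours
    # after midnight and the observed commute time in minutes.
    departure = np.array([6.50, 6.75, 6.95, 7.25, 7.50, 8.00, 8.50])
    commute = np.array([32.0, 38.0, 45.0, 52.0, 50.0, 41.0, 30.0])

    # Step 1: a model of how the outcome works: an inverted-U (quadratic) curve.
    # Steps 2-4: feed the data and calibrate the parameters by minimizing the
    # squared difference between predicted and actual commute times.
    a, b, c = np.polyfit(departure, commute, deg=2)

    def predict_commute(time_of_day):
        """Predicted commute time (in minutes) if we leave at `time_of_day`."""
        return a * time_of_day**2 + b * time_of_day + c

    # Tomorrow: predict for a 6:50 am departure, measure the actual time,
    # append the new observation to the dataset, and refit (recalibrate).
    print(round(predict_commute(6.83), 1))

Real learning algorithms automate exactly this loop, only with far more data, more flexible models, and smarter update rules than a single quadratic fit.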

Why is everyone talking about AI today: Deep Learning

AI has become a buzzword thanks to the predictive power of deep learning algorithms, a shorthand for deep neural network learning algorithms. The Appendix describes how these work, so I won’t divert our focus right now by delving into the technical details.

The important thing to notice is that these algorithms have become very powerful at tackling problems that until recently (less than a decade ago) were very hard to solve using machines. And by “very hard” I mean that humans significantly outperformed the best existing algorithms. In many domains this is no longer the case, and the best deep learning algorithms have reached superhuman predictive power.

Two domains where deep learning excels are image recognition and language processing. It is interesting to discuss how prediction comes about in these problems, as this will reinforce the idea that current AI systems are just predictive algorithms.

Deep learning in image and speech recognition

Let’s think about understanding handwritten text. If it’s yours, most likely you’re very good at reading it, even though it may sometimes be challenging. Understanding someone else’s writing, that is, recognizing each letter, might seem insurmountable (think of the last time you got a prescription from your doctor). We can turn this into a predictive problem by assigning a measure of confidence that a glyph represents a given letter. Sometimes we are superconfident (we are willing to bet on it); other times we aren’t so sure (and are less willing to put money on the table).

Deep learning, as applied to image recognition, does exactly that: we train the algorithm with a labeled dataset, that is, one where each glyph in the training data has been assigned the correct letter it represents. You might imagine that the labeling process is difficult and time-consuming, and yes, the lack of high-quality labeled data in our enterprises is one of the problems that practitioners encounter on a day-to-day basis.

At this level we can think of a learning algorithm as a black box: we input the image of a glyph, and the algorithm outputs a probability estimate that it represents each letter in the alphabet. How do we convert probabilities into predictions, that is, letters? The usual way is to assign the letter with the largest probability. Say the probability that a given glyph is the letter “a” is 11%, the letter “b” 1%, and so on until we exhaust the alphabet: the decision rule picks the letter with the highest probability, say the letter “e,” with a predicted probability of 17%.
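
A minimal sketch of this decision rule, with made-up probabilities for a handful of letters (a real model would output one probability per letter of the alphabet):

    # Hypothetical output of an image-recognition model for one glyph:
    # an estimated probability for each candidate letter (values are made up).
    letter_probs = {"a": 0.11, "b": 0.01, "c": 0.05, "d": 0.08, "e": 0.17}

    # The usual decision rule: predict the letter with the highest probability.
    predicted_letter = max(letter_probs, key=letter_probs.get)
    print(predicted_letter)  # prints 'e'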

A similar process applies to speech recognition, but now our dataset consists of audio files that have been labeled or transcribed by humans. Think of talking with someone who has an accent you’re not familiar with or who speaks at a very low volume. Sometimes we are confident that we understood the word she said. Other times we are not, and we try to infer it from the context. Deep learning algorithms applied to natural language processing work like that: they predict probabilities that something was said. Note, in passing, that probabilities are the way we quantify our degree of confidence.

Have you seen how Gmail autocompletes your sentences? It is making a prediction from the context of the sentence (the previous words), having been trained on zillions of emails and texts written in the past by you and others. Unfortunately, our own past emails may not be enough to train these models, so you may end up using words proposed by Gmail that you would never use in your own conversations.
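
Gmail relies on large neural language models, but the underlying idea of predicting the next word from its context can be sketched with a toy frequency-based model; the three training sentences below are, of course, made up:

    from collections import Counter, defaultdict

    # Toy "past emails" to learn from (made up for illustration).
    corpus = [
        "thanks for your help",
        "thanks for your time",
        "thanks for the update",
    ]

    # Count how often each word follows a given previous word.
    next_word_counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            next_word_counts[prev][nxt] += 1

    def suggest(prev_word):
        """Suggest the most frequent next word seen after `prev_word`."""
        counts = next_word_counts[prev_word]
        return counts.most_common(1)[0][0] if counts else None

    print(suggest("for"))  # prints 'your', the most common continuation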

I’ve stripped away all of the fascinating technical intricacies of how these algorithms work, but I hope you got the point: as of today, artificial intelligence means predictive algorithms. Powerful and complex, but just that: predictive machines. If you don’t have a data science background and want a slightly deeper glimpse into the subject, please go through the Appendix.

How is AI transforming our businesses

Every single major media outlet has at some point in the recent past highlighted some news about the powers of AI. Let’s now look at different ways these powerful predictive algorithms are transforming the way we run our businesses.

Better predictions may produce better decisions

At a first level, predictive technologies will improve the accuracy of our decisions, since most relevant decisions are made under uncertainty, where we, consciously or not, solve a predictive problem. For instance, when you decided to buy this book you must have thought it was worth your money and your time. Your decision (buying) was informed by a prediction (quality), so if we make better predictions we should also make better decisions.

How you approached and solved this problem is unclear, but most likely you did not consciously pose it as a predictive question. Perhaps you used a set of heuristics or shortcuts to decide whether it was worth it: maybe O’Reilly’s reputation as a high-quality publisher gave you some confidence, you read some positive reviews, or you found the title catchy. These are all shortcuts we constantly use to solve potentially low-risk predictive questions. For higher-risk problems we may use more explicit, conscious reasoning.

Note that in the title of this section I emphasized the verb “may.” It is not automatic that improving the accuracy of our predictions generates better decisions: you must act on this evidence, that is, your decisions must depend on these predictions. Many companies hire armies of data scientists who create no value whatsoever because of this problem.

Data translators: identifying business cases for prediction

It seems safe to anticipate, then, that the ability to identify and prioritize predictive opportunities will soon become highly valuable to companies. Taking this skill one step further, if you can also translate these business problems into predictive problems that can be solved by data scientists, your value to the company will be much higher. One study from the McKinsey Global Institute estimates the demand for data translators in the United States alone to be between 2 and 4 million by 2026.2

Moreover, companies that adapt their business models early enough to take advantage of these advances should be able to capture a significant share of the market. We are already seeing this with many startups built on AI capabilities; the speed of adoption among more consolidated companies will depend on the competitive challenges they face from these newcomers, as well as on their executives’ and senior management’s vision of the future. Banking and telecommunications are two industries where disruption is forcing the traditional players to embark on widespread transformations.

Job automation and demand for complementary skills

At a second level, the process of task automation is on its way and will affect the way companies operate. Take Robotic Process Automation (RPA), where highly repeatable and stable actions are automated with the help of a computer program. RPA’s success comes from automation in low-uncertainty environments. More complex actions are also being automated with the help of the current wave of prediction technologies, and this trend is only going to accelerate. The 2018 World Economic Forum’s Future of Jobs Report shows that 50% of the companies interviewed expect automation to reduce their workforce by 2022.3 A study by the UK’s Office for National Statistics estimates that 1.5 million current jobs (7.4%) are at risk of automation.4 Another worldwide analysis puts the figure between 75 and 375 million jobs by 2030.5

As machines continue to substitute for humans in performing certain tasks, demand for complementary skills that are not easily automated will most likely increase. The above-mentioned Future of Jobs Report ranks “analytical thinking and innovation” as the top skill, both at the time of the study and up to 2022, the report’s farthest forecast horizon. Machines are very good at solving prediction problems once humans have translated the business problem, so analytic translation is one such complementary skill that will be on the rise.

The Data Revolution: or is it?

The AI revolution is underway and was made possible by the availability of the large datasets needed to train the algorithms, together with much larger and better-designed computers. Most of these algorithms were developed several decades ago, but researchers were only able to show their superior performance once they had enough data and computing power. The current AI hype replaced a previous one, the big data hype. Let’s look at the differences between the two.

The year 2004 is often said to mark the beginning of the big data revolution, with the publication of Google’s famous MapReduce paper.6 A few years later, in the second half of that decade, technology commentators and consulting firms alike claimed that the big data revolution would provide companies with endless opportunities for value creation. At the beginning this revolution was built around one pillar: having more, diverse, and fast-accessible data. As the hype matured, two more pillars were added: predictive algorithms and culture.

The three Vs

The first pillar involved the now well-known three Vs: volume, variety, and velocity. The internet transformation had provided companies with ever-increasing volumes of data. One 2018 estimate claims that 90% of the data created in the history of humankind had been generated in the previous two years, and many such calculations abound.7 Technology had to adapt if we wanted to analyze this apparently unlimited supply of information: we not only had to store and process larger amounts of data, but also needed to deal with new, unstructured types of data such as text, images, videos, and recordings that were not easily stored or processed with the data infrastructure available at the time.

Structured and unstructured data

The second V, variety, emphasizes the importance of analyzing all types of data, not only structured data. If you have never heard of this distinction, think of your favorite spreadsheet application (Excel, Google Sheets, etc.). Spreadsheets organize information in tabular arrangements of rows and columns that provide a lot of structure, allowing us to process the information efficiently. This is a simple example of structured data: anything you can store and analyze using rows and columns belongs to this class.

Have you ever copied and pasted an image into Excel? You certainly can: you can store images, videos, or complete texts in worksheets. But you can’t analyze them efficiently. Even storage is inefficient: you would use far less disk space with some type of compression or a purpose-built format. Unstructured datasets are those that cannot be efficiently stored or analyzed using tabular formats, and they include all types of multimedia (images, videos, tweets, etc.). Now, these provide a lot of valuable information for companies, so why not use them?
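
A minimal sketch of the distinction, using a hypothetical customer table (structured) and a free-text complaint (unstructured):

    import pandas as pd

    # Structured data: rows and columns with a fixed schema, easy to filter,
    # aggregate, and join (all values are made up).
    customers = pd.DataFrame({
        "customer_id": [1, 2, 3],
        "monthly_spend": [120.0, 80.5, 230.0],
        "churned": [False, True, False],
    })
    print(customers["monthly_spend"].mean())  # tabular analysis is straightforward

    # Unstructured data: a free-text complaint. We can store it, but rows and
    # columns tell us nothing about its content; extracting value requires
    # text-processing (or deep learning) techniques.
    complaint = "The app keeps crashing every time I try to pay my bill."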

Once the innovations were in place, consultants and vendors had to invent new ways to market the new technologies. For structured data we had the Enterprise Data Warehouse, so a new catchy term was needed: the Data Lake was thus born, with the promise of providing the flexibility and computational power to store and analyze big data.

Flexibility came in two flavors. Thanks to “linear scalability,” if twice the work needed to be done, we would just have to install twice the computing power to meet the same deadlines; similarly, for a given task, we could cut the current processing time in half by doubling the amount of infrastructure. Computing power could be easily added by way of commodity hardware, efficiently operated by ever-more-clever open source software readily available for us to use. And the data lake also allowed for quick access to a larger variety of data sources.
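
The promise of linear scalability boils down to simple arithmetic, sketched below with hypothetical numbers: under ideal conditions, processing time falls in proportion to the number of machines added.

    def processing_hours(records, records_per_hour_per_node, nodes):
        """Estimated wall-clock time under ideal linear scalability."""
        return records / (records_per_hour_per_node * nodes)

    # Hypothetical workload: 10 billion records, 50 million records/hour per node.
    print(processing_hours(10e9, 50e6, nodes=10))  # 20.0 hours
    print(processing_hours(10e9, 50e6, nodes=20))  # 10.0 hours: twice the nodes, half the time

In practice, coordination overhead means scaling is rarely perfectly linear, but the approximation was good enough to sell the promise.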

Once we had tackled the volume and variety problems, velocity was the next frontier, and our objective became the reduction of time-to-action and time-to-decision. We were now able to store and process large amounts of very diverse data in real time or near real time if necessary. The three Vs were readily achievable.

Data-driven culture

I have already devoted some time to the second pillar of the big data revolution, predictive technologies, so let’s talk about the third pillar: culture. A company can buy the best technology to process large amounts of data and can hire the best data scientists and data engineers to create and automate machine learning data flows or pipelines, but if it doesn’t act on this evidence, little can be accomplished. A data-driven organization is one that acts on data or, put differently, one that runs its business using evidence-based decision-making. I prefer the second definition since it shifts the emphasis from data to evidence and the scientific method. Core cultural values are experimentation, curiosity, empowerment, data democratization, and agility.

From pillars to recipes for success

The three Vs were physical descriptions of the data we could now process, but they were not recipes for creating value in the new data economy. The other two pillars were also enablers, but we needed further guidance. Consultants and vendors alike then provided us with maturity models, roadmaps to be followed to ascend to paradise. One such model is depicted in Figure 1-1, which I will now explain.

Figure 1-1. A possible data maturity model showing a hierarchy of value creation

Descriptive stage

Starting from the left, one thing was apparent from the outset: having more, better, and timelier data could provide a more granular view of our businesses’ current and past performance, and our ability to react quickly would certainly allow us to create some value. A health analogy may help us understand why.

Imagine you install sensors on your body, either externally through wearables or by means of other soon-to-be-invented internal devices. These provide you with more, better, and timelier data on your health status, but how can you use this data to create value for yourself, that is, to improve your health over time? Since you now know when your heart rate or your blood pressure rises above some critical level, you can take whatever measures are needed to bring it back to normal. Similarly, you can track your sleeping patterns and adjust your daily habits. If you react fast enough, this newly available data may even save your life. This kind of descriptive analysis of past data may provide some insights about your health, but the creation of value depends critically on your ability to react quickly enough.

Predictive stage

But more often than not, by the time we react it’s too late. Can we do better? One approach is to replace reaction with predictive action. As long as predictive power is high enough, this layer should buy us time to find better actions and, thus, new opportunities to create value.

Thanks to this ability to look into the future (that is what we’re attempting with predictive technologies), we were now able to create value by developing better data products, such as recommendation engines, but also by using our internal data to provide actionable insights to other companies, a process known as data monetization.

The online advertising business was born: companies would not sell their data to advertisers or publishers; rather, they would create mechanisms to share insights derived from their data, with the promise of delivering higher conversion rates. Given a customer’s online behavior, these companies create predictive models of the likelihood of purchasing different goods or services and then auction off ad opportunities for third parties to display their banners.

Many other monetization opportunities exist, of course. For example, telecommunications companies create insights that are of high value to other firms by using the mobility and behavioral information generated by mobile users as they navigate the internet on their phones. Companies can now optimize the location of physical ads for specific market segments and customers, much as their online counterparts did earlier.

Prescriptive stage

The top rank in this hierarchy of value creation is taken by our ability to automate and design intelligent systems. To achieve this we need to move to the prescriptive layer: once you have enough predictive power, you can start finding the best actions. This is the layer where firms move from prediction to optimization, the throne on the data Olympus. Interestingly enough, this is the step least explored by most maturity models, most likely because the power of AI has narrowed our focus to the prediction stage, but also because, arguably, optimization is harder than prediction.
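
A minimal sketch of what moving from prediction to optimization means: given a model’s predicted probability that a customer churns, choose the retention action with the highest expected profit. All the numbers, action names, and effects below are hypothetical.

    # Predicted churn probability produced by some upstream ML model (hypothetical).
    churn_prob = 0.35
    customer_value = 500.0  # value of retaining the customer

    # Candidate actions with assumed costs and assumed reductions in churn probability.
    actions = {
        "do_nothing": {"cost": 0.0, "churn_reduction": 0.00},
        "send_discount": {"cost": 50.0, "churn_reduction": 0.40},
        "personal_call": {"cost": 20.0, "churn_reduction": 0.15},
    }

    def expected_profit(action):
        """Expected profit of an action, given the predicted churn probability."""
        p_churn = churn_prob * (1 - action["churn_reduction"])
        return (1 - p_churn) * customer_value - action["cost"]

    # The prescriptive step: pick the action that maximizes expected profit.
    best = max(actions, key=lambda name: expected_profit(actions[name]))
    print(best)  # 'send_discount' under these made-up numbers

The prediction is only an input; the value is created when the decision rule acts on it.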

In a sense we are following the natural path: big data was necessary for the AI renaissance (the term and the first developments date back to the 1950s), and we are now starting to see the possibilities in the prescriptive arena. But this requires new analytical skills.

A tale of unrealized promises

The three pillars, along with data maturity models similar to the one just described, have been part of the conversation for the last 15 years or so. This conversation has been largely driven by consulting firms, and given the size of the promised economic gains, most companies have followed suit.

Have the promises of unlimited data-driven value been realized? Is data the new oil, as The Economist famously claimed a couple of years ago?8 Let’s start with what the market thinks: Figure 1-2 shows the evolution of the top 10 companies by market capitalization, that is, total outstanding shares valued at their market prices. With the probable exceptions of Berkshire Hathaway (Warren Buffett’s conglomerate), Johnson & Johnson (J&J), and JP Morgan, all of the remaining companies are in the technology sector and all have embraced the data and AI revolution.9 This does not prove that big data is the cause of this unprecedented wealth, but at a minimum it shows that the market’s top performers have fully adopted the new credo.

Figure 1-2. Evolution of market capitalization top-10 ranking

One 2011 report by the McKinsey Global Institute estimated sector-specific additional productivity growth rates of between 0.5 and 1% from the use of big data.10 Its Big Data Value Potential Index puts the finance, government, and information sectors at the top, followed by wholesale trade and real estate. As the name suggests, these are estimates of potential, not realized, gains. Another study estimates that becoming more data-driven increases a firm’s productivity by 5-6%.11

Yet almost 15 years after the data revolution was kick-started, most companies have not realized the promise of unprecedented value creation. According to one recent survey of more than 60 executives from leading firms, even though 92% report speeding up their AI and data investments, only 31% self-identify as data-driven (down from 37% two years earlier), and more than half admit that data is not treated as an asset in their enterprises. Accordingly, 37% report that they have not yet found measurable results from their AI and data initiatives, and 77% say that adoption remains a challenge.12

The study also cites some factors that explain this lack of results: lack of organizational alignment, cultural resistance, misunderstanding of the role data plays as an asset for the company, faulty executive leadership, and technological barriers. Similar findings abound in a growing literature that tries to explain this paradoxical state: if everyone agrees that the data and AI revolutions can potentially generate considerable value, why can’t companies reach this potential? The promised fortunes are like the never-reachable tortoise in Zeno’s paradox, and we are, of course, the frustrated Achilles.

The central tenet of this book is that one often-overlooked cause of this lack of results is the general shortage of analytical skills within organizations. And the modern AI-driven economy needs a better definition of what thinking analytically means.

Analytical reasoning for the modern AI-driven enterprise

Tom Davenport’s now classic Competing on Analytics pretty much equates analytical thinking with what later came to be known as data-drivenness: “By analytics we mean the extensive use of data, statistical and quantitative analysis, explanatory and predictive models, and fact-based management to drive decisions and actions.” An alternative definition can be found in Albert Rutherford’s The Analytical Mind: “Analytical skills are, simply put, problem-solving skills. They are characteristics and abilities that allow you to approach problems in a logical, rational manner in an effort to sort out the best solution.”

In this book I will define analytical reasoning as the ability to translate business problems into prescriptive solutions. This ability entails both being data-driven and being able to solve problems rationally and logically, so the definition is in fact consistent with the two described above.

To make things practical, I will equate business problems with business decisions. Other problems that are purely informative and do not entail actions may have intrinsic value for some companies, but I will not treat them here, as my interest is in creating value through analytical decision-making. Since most decisions are made without knowing their actual consequences, AI will be our weapon for embracing this intrinsic uncertainty. Notice that under this approach, prediction technologies are important inputs into our decision-making process, not the end goal. Improvements in the quality of predictions can have first- or second-order effects depending on whether we are already making near-optimal choices.

Key takeaways

  • Current AI is about predictive power: thanks to advances in computing power and the availability of more and better labeled data, a renaissance in algorithm research and development has been possible, especially in the area of deep learning.

  • Prediction is powerful if it allows us to make better decisions: prediction is only one component in our decision processes, and companies create value by making decisions. We therefore need to learn to make better decisions, taking advantage of the advances in prediction technologies.

  • The current AI hype was preceded by a big data hype: it’s been almost 15 years since the big data revolution started, and in that time we have been promised over and over that companies would create substantial value from the new developments. By focusing only on prediction and not on decision-making, there is a risk that, with the new hype, these promises will only be attained by the current market top performers.

  • Specific skills are needed to create value from predictive technologies: we not only need to learn to identify and translate predictive problems into business opportunities, but we must also change our mindset: start with the business question, move backward to identify the levers, and then move forward to optimize those levers. In between, the skills of simplification and dealing with uncertainty will help us pose and solve problems that apply to general business questions.

The next chapter describes in detail the general framework used to decompose decisions. We will see that we must always start with the business and move back to the actual levers we control. Actions or levers create uncertain consequences, and these consequences are what ultimately affect our KPIs.

Further Reading

Nick Bostrom’s Superintelligence: Paths, Dangers, Strategies discusses at great length and depth what intelligence is and how superintelligence could emerge, as well as the dangers of this development and how it could affect society. Similar discussions can be found in Max Tegmark’s Life 3.0: Being Human in the Age of Artificial Intelligence.

On the topic of how AI will affect businesses, I highly recommend Ajay Agrawal, Joshua Gans, and Avi Goldfarb’s Prediction Machines: The Simple Economics of Artificial Intelligence. Written by three economists and AI strategists, it provides a much-needed, away-from-the-hype, down-to-earth account of current AI. Their key takeaway is that, thanks to current developments, the cost of predictive solutions within the firm has fallen considerably while quality has kept increasing, providing great opportunities for companies to transform their business models. Also written by economists, Andrew McAfee and Erik Brynjolfsson’s Machine, Platform, Crowd: Harnessing Our Digital Future discusses more generally how the data, artificial intelligence, and digital transformations are affecting our businesses, the economy, and society as a whole.

1 See https://www.quora.com/How-are-AI-and-ML-different-and-what-could-be-a-possible-Venn-diagram-of-how-AI-and-machine-learning-overlap for one such example.

2 See https://www.mckinsey.com/business-functions/mckinsey-analytics/our-insights/analytics-translator and https://mck.co/30MA4ws

3 http://www3.weforum.org/docs/WEF_Future_of_Jobs_2018.pdf

4 https://www.technologyreview.com/f/613166/15-million-jobs-in-the-uk-are-at-high-risk-of-being-automated/

5 https://www.mckinsey.com/featured-insights/future-of-work/how-will-automation-affect-jobs-skills-and-wages

6 https://static.googleusercontent.com/media/research.google.com/es//archive/mapreduce-osdi04.pdf

7 https://www.forbes.com/sites/bernardmarr/2018/05/21/how-much-data-do-we-create-every-day-the-mind-blowing-stats-everyone-should-read/#4cd02c7d60ba

8 https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data

9 Data retrieved from https://en.wikipedia.org/wiki/List_of_public_corporations_by_market_capitalization. January 2019.

10 https://mck.co/2wLmJJZ

11 https://hbr.org/2012/10/big-data-the-management-revolution

12 https://hbr.org/2019/02/companies-are-failing-in-their-efforts-to-become-data-driven
