Chapter 12
The Role of AI in Data Science

Although Artificial Intelligence (AI) has been around for a few decades now, it is only since it was adopted in data science that it has become mainstream. Before that, it was an esoteric technology, seen mostly in relation to robotics or to some highly sophisticated computer system poised to destroy humanity. In reality, AI is mainly a set of fairly intelligent algorithms that are useful wherever they apply, as well as the field of computer science involved in their development and application.

Even if there are a few success stories out there that help AI make the news and fuel marketing campaigns, more often than not, AI methods are not the best resource in data science, since other resources are better and more widely applicable (e.g. some dimensionality reduction methods and feature engineering techniques). Plus, when they do make sense in a data science project, they require a lot of fine-tuning. AI is not a panacea, though it can be a useful tool to have in your toolbox, particularly if you find yourself working for a large company with access to lots of decent data.

In this chapter, we will take a look at various aspects of AI and how AI relates to data science, including the problems AI solves, the different types of AI systems applied in data science, and considerations about AI’s role in data science. This is not going to be an in-depth treatise on the AI techniques used in data science. Rather, it will be more of an overview and a set of guidelines related to AI in data science. The information provided can serve as a good opportunity to develop a sense of perspective about AI and its relationship to data science.

Problems AI Solves

AI is an important group of technologies. It has offered a holistic perspective on problem-solving since its inception. The idea is that with AI, a computer system can frame the problem it needs to solve and solve it without having anyone hard-code the solution into a program. This is why AI programs have always been mostly practical, down-to-earth systems that attempt, sometimes successfully, to emulate human reasoning (not just a predetermined set of rules). This is especially useful if you think about the problems engineers and scientists have faced in the past few decades, problems that are highly complex and practically impossible to solve analytically.

For example, the famous traveling salesman problem (TSP) has been a recurring problem that logistics organizations have been tackling for years. Even if its framing is quite straightforward, an exact solution to it is hard to find (nearly impossible) for real-life scenarios, where the number of locations the traveler plans to visit is non-trivial. Given enough computing resources, it is possible to find an exact solution, yet most people opt for an AI-based one. Such a solution may not be the best route out there, but it is close enough to be both valuable and practical. What good would a solution be if it took a whole day and a bunch of computers to compute, even if it were the most accurate solution out there? Would such an approach be scalable or cost-effective? In other words, would it be intelligent?
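
To make the heuristic approach concrete, here is a minimal sketch of the nearest-neighbor heuristic for the TSP in Python; the four-city distance matrix is purely hypothetical:

    import numpy as np

    def nearest_neighbor_tour(dist, start=0):
        """Greedy TSP heuristic: always visit the closest unvisited city.
        `dist` is a symmetric matrix of pairwise distances."""
        n = len(dist)
        unvisited = set(range(n)) - {start}
        tour, current = [start], start
        while unvisited:
            # pick the nearest city not yet visited
            nxt = min(unvisited, key=lambda c: dist[current][c])
            tour.append(nxt)
            unvisited.remove(nxt)
            current = nxt
        return tour

    # toy 4-city distance matrix (hypothetical values)
    dist = np.array([[0, 2, 9, 10],
                     [2, 0, 6, 4],
                     [9, 6, 0, 8],
                     [10, 4, 8, 0]])
    print(nearest_neighbor_tour(dist))  # [0, 1, 3, 2]

The resulting tour is rarely optimal, but it is computed in a fraction of the time an exhaustive search would take, which is exactly the trade-off described above.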

Most AI systems tackle more sophisticated problems, where obtaining an ideal solution is not only impractical but effectively impossible. In fact, the solutions to the majority of problems in applied science are nothing but approximations, and that’s perfectly acceptable. Nowadays, it is usually mathematicians who opt for analytical solutions, and even among this group, some are willing to compromise for the sake of practicality. Opting for an analytical solution may have its appeal, but there are many cases where it’s just not worth it, especially if there are numeric methods that accomplish a good enough result in a fraction of the time.

AI is more than mathematics though, even if at its core it deals with optimization in one way or another. It is also about connecting the macro-level with the micro-level. This is why it is ideal for solving complex problems that often lack the definition required to tackle them efficiently. As the interface of AI becomes closer to what we are used to (e.g. everyday language), this is bound to become more obvious, and the corresponding AI systems more mainstream. The amount of data that needs to be crunched for this idea to have a shot at becoming a possibility is mind-boggling. This is where data science comes in.

Data science problems that benefit from AI are those with highly non-linear search spaces or complex relationships among the variables involved. Also, problems where performance is of paramount importance tend to lend themselves to AI-based approaches.

Types of AI Systems Used in Data Science

There are several types of AI systems utilized in data science. Most of them fall under one of two categories: deep learning networks and autoencoders. All of these AI systems are some form of artificial neural network (ANN), a robust system that is generally assumption-free. There are also AI systems that are not ANNs, and we will briefly take a look at them too.

An ANN is a graph that maps the flow of data as it undergoes certain, usually non-linear, transformations. The nodes of this graph are called neurons, and the function that transforms the data as it passes through a neuron is called the transfer (or activation) function. The neurons are organized in layers, each of which can represent the inputs of the ANN (the features), a transformation of these features (meta-features), or the outputs. In the case of predictive analytics ANNs, the outputs are related to the target variable. Also, the connections among the various neurons carry weights, whose exact values are figured out in the training phase of the system.

ANNs have been proven to be able to approximate any continuous function, though more complex functions require more neurons and usually more layers too. The most widely used ANNs, the feed-forward kind, are also the predecessors of deep learning networks.
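
To tie this terminology together, here is a minimal sketch of a single forward pass through a small feed-forward ANN, using NumPy with hypothetical, untrained weights:

    import numpy as np

    def sigmoid(z):
        # a common transfer (activation) function
        return 1.0 / (1.0 + np.exp(-z))

    # hypothetical network: 3 input features -> 4 hidden neurons -> 1 output
    rng = np.random.default_rng(0)
    W1 = rng.normal(size=(3, 4))   # weights: input layer -> hidden layer
    W2 = rng.normal(size=(4, 1))   # weights: hidden layer -> output layer

    x = np.array([0.5, -1.2, 0.3])  # one data point (the features)
    hidden = sigmoid(x @ W1)        # meta-features in the hidden layer
    output = sigmoid(hidden @ W2)   # the network's prediction
    print(output)

Training the network consists of adjusting W1 and W2 (typically via backpropagation) so that the outputs match the targets.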

Deep Learning Networks

This is the most popular AI system used in data science, as it covers a series of ANNs designed to tackle a variety of problems. What all of these ANNs have in common is a large number of layers, which allows them to build a series of progressively higher-level features and go deeper into the data being analyzed. This kind of architecture is not new, but only recently has the computing infrastructure been able to catch up with the computational cost that these systems accrue. Also, the advent of parallel computing and the low cost of GPUs have enabled this AI technology to become more widespread and accessible to data scientists everywhere. The use of this technology in data science is referred to as deep learning (DL).

Deep learning networks come in all sorts, ranging from basic ones that perform conventional classification and regression tasks to more specialized ones designed for tasks that are not possible with conventional ANNs. For example, recurrent neural networks (RNNs) are a useful kind of DL network, focusing on capturing the signal in time-series data. This is done by having connections that go both forward and backward in the network, generating loops in the flow of data through the system. This architecture makes RNNs particularly useful for word prediction and other NLP-related applications (e.g. language translation), image processing, and finding appropriate captions for pictures or video clips. However, this does not mean that RNNs cannot be used in other areas not particularly related to dynamic data.
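
The loops mentioned above boil down to feeding the previous hidden state back into the network at every time step. Here is a minimal sketch of that recurrence in NumPy, with hypothetical, untrained weights:

    import numpy as np

    rng = np.random.default_rng(1)
    n_in, n_hid = 5, 8                    # input size, hidden-state size
    Wx = rng.normal(size=(n_in, n_hid))   # input -> hidden weights
    Wh = rng.normal(size=(n_hid, n_hid))  # hidden -> hidden (the recurrent loop)

    def rnn_forward(sequence):
        h = np.zeros(n_hid)               # initial hidden state
        for x_t in sequence:              # one time step per element
            # the new state depends on both the input and the previous state
            h = np.tanh(x_t @ Wx + h @ Wh)
        return h                          # a summary of the whole sequence

    sequence = rng.normal(size=(10, n_in))  # a toy time series of 10 steps
    print(rnn_forward(sequence))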

When it comes to analyzing highly complex data consisting of a large number of features, many of which are somewhat correlated, convolutional neural networks (CNNs) are among the best tools to use. The idea is to combine multi-level analysis with resampling in the same system, thereby optimizing the system’s performance without depending on the sophisticated data engineering that would otherwise be essential for this kind of data. If this sounds convoluted, you can think of a CNN as an AI system that is fed a multi-dimensional array of data at a time, rather than a single matrix, as is usually the case with other ANNs. This allows it to build a series of feature sets based on its inputs, gradually growing in terms of abstraction.

So, in the case of an image (having three distinct channels, one for each primary color), the first CNN layers include features corresponding to crude characteristics of the image, such as the dominance of a particular color on one side of the image, or the presence of some linear pattern. All of these features may look very similar to the human eye. Once these features are analyzed further, the network starts to differentiate a bit. The more sophisticated features (present in the next layer) correspond to subtler patterns, such as the presence of two or more colors in the image, each having a particular shape. In the layer that follows, the features have an even higher level of differentiation, capturing specific shapes and line/color patterns that may resonate with our understanding of the image. These layers are called convolution layers.

In a CNN, there are also specialized sets of features called pooling layers. The role of these layers is to reduce the size of the feature representation in the other layers, making the process of abstraction more manageable and efficient. The most common kind of pooling takes the maximum value of a set of neurons from the previous layer; this is called max pooling. CNNs are ideal for image and video processing, as well as certain NLP applications.
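
Max pooling in particular is easy to demonstrate. The following NumPy sketch halves each dimension of a toy 2D feature map by keeping only the maximum value of every 2x2 block:

    import numpy as np

    def max_pool_2x2(feature_map):
        """Reduce a 2D feature map by taking the max of each 2x2 block."""
        h, w = feature_map.shape
        return feature_map[:h - h % 2, :w - w % 2] \
            .reshape(h // 2, 2, w // 2, 2) \
            .max(axis=(1, 3))

    fm = np.array([[1, 3, 2, 0],
                   [4, 2, 1, 5],
                   [7, 8, 3, 1],
                   [0, 6, 2, 9]])
    print(max_pool_2x2(fm))
    # [[4 5]
    #  [8 9]]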

Autoencoders

These are a particular kind of ANN that, although akin to DL networks, focuses on dimensionality reduction through a better feature representation. As we saw in the previous section, the inner layers of a DL network correspond to features it creates from the original features. Also known as meta-features, these can be used either for finding a good enough mapping to the targets, or for reconstructing the original features. The latter is what autoencoders do, with the output of the innermost layer being the actual result of the process once the system is fully trained.

So why is all of this a big deal? After all, you can perform dimensionality reduction with other methods and not have to worry about fine-tuning parameters to do so. However, statistical methods, which have traditionally been the norm for this task, are often impractical and loaded with assumptions. For example, the covariance matrix, which is the basis of PCA, one of the most popular dimensionality reduction methods, comprises all the pairwise covariances of the features in the original set. These covariances are not by any means a reliable metric for establishing the strength of the similarity of the features. For them to work well, the features need a great deal of engineering beforehand, and even then, the PCA method may not yield the best reduced feature set possible. Also, methods like PCA (as well as ICA and SVD) take a lot of time on large datasets, making them highly impractical for many data science projects. Autoencoders bypass these issues, though you may still need to set the number of meta-features as a parameter, corresponding to the number of neurons in the innermost layer.
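
As a minimal sketch of this idea, assuming the Keras API and a hypothetical dataset of 50 features reduced to 10 meta-features:

    import numpy as np
    from tensorflow.keras import layers, models

    # hypothetical data: 1000 samples, 50 features, scaled to [0, 1]
    X = np.random.rand(1000, 50).astype("float32")

    inputs = layers.Input(shape=(50,))
    encoded = layers.Dense(10, activation="relu")(inputs)    # innermost layer: 10 meta-features
    decoded = layers.Dense(50, activation="sigmoid")(encoded)  # reconstruction of the inputs

    autoencoder = models.Model(inputs, decoded)
    encoder = models.Model(inputs, encoded)

    # the network is trained to reproduce its own inputs
    autoencoder.compile(optimizer="adam", loss="mse")
    autoencoder.fit(X, X, epochs=20, batch_size=32, verbose=0)

    meta_features = encoder.predict(X)   # the reduced feature set

Once the system is trained, the encoder’s output is the reduced representation described above, ready to be fed into another model.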

Other Types of AI Systems

Apart from these powerful AI systems for data science, there are also some other, less popular ones. Also, note that although there are many other ANN-based systems (see http://bit.ly/2dKrPbQ for a comprehensive overview of them), AI methods for data science include other architectures as well.

For example, fuzzy logic systems have been around since the 1970s and were among the first AI frameworks. They are versatile enough to be useful in applications beyond data science. Even if they are not used much today (mainly because they are not easy to calibrate), they are a viable alternative for certain problems where interpretability is of paramount importance and there is reliable expert knowledge that can be encoded into fuzzy logic rules.

Another kind of AI system that is useful, though not data-science-specific, is optimization systems, or optimizers. These are algorithms that aim to find the best value of a function (i.e. its maximum or minimum) given a set of conditions. Optimizers are essential as parts of other systems, including most machine learning systems. However, optimizers are applicable in data engineering processes too, such as feature selection. As we saw in the previous chapters, optimizers rely heavily on heuristics in order to function.
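
As a toy illustration of such a heuristic-driven optimizer, here is a minimal stochastic hill climber; the objective function and all parameter values are arbitrary examples:

    import numpy as np

    def hill_climb(f, x0, step=0.1, iters=1000, seed=0):
        """Stochastic hill climbing: keep a random move only if it improves f."""
        rng = np.random.default_rng(seed)
        x = np.asarray(x0, dtype=float)
        fx = f(x)
        for _ in range(iters):
            candidate = x + rng.normal(scale=step, size=x.shape)
            fc = f(candidate)
            if fc < fx:                  # accept only improvements (minimization)
                x, fx = candidate, fc
        return x, fx

    # toy objective with its minimum at (3, -2)
    f = lambda v: (v[0] - 3) ** 2 + (v[1] + 2) ** 2
    best_x, best_f = hill_climb(f, x0=[0.0, 0.0])
    print(best_x, best_f)                # should end up close to [3, -2] and 0

Unlike an analytical approach, this heuristic makes no assumptions about the function’s form; it just needs to be able to evaluate it.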

Extreme Learning Machines (ELMs) are another AI system designed for data science. They may share a similar architecture with ANNs, but their training is completely different: they optimize the weights of only the last layer’s connections with the outputs, while the remaining weights stay random. This unique approach to learning makes them extremely fast and simple to work with. Also, given enough hidden neurons, ELMs can perform exceptionally well in terms of accuracy, making them a viable alternative to other high-end data science systems. Unfortunately, ELMs are not as popular as they could be, since not that many people know about them.
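
Here is a minimal sketch of an ELM for regression in NumPy, following the standard recipe (random hidden weights that are never trained, output weights solved in a single step via the pseudo-inverse); the data and layer sizes are hypothetical:

    import numpy as np

    rng = np.random.default_rng(42)
    X = rng.normal(size=(200, 5))            # 200 samples, 5 features (toy data)
    y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=200)

    n_hidden = 100
    W = rng.normal(size=(5, n_hidden))       # random input weights, never trained
    b = rng.normal(size=n_hidden)            # random biases, never trained

    H = np.tanh(X @ W + b)                   # hidden-layer outputs
    beta = np.linalg.pinv(H) @ y             # only these weights are "learned"

    y_hat = np.tanh(X @ W + b) @ beta        # predictions on the training data
    print(np.mean((y - y_hat) ** 2))         # training error

Since the only "training" is a single least-squares solve, the whole process takes a tiny fraction of the time that iterative backpropagation would require.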

AI Systems Using Data Science

Beyond these AI systems, there are others that do not contribute to data science directly, but make use of it instead. These are more like applications of AI, which are equally important to the systems that facilitate data science processes. As these applications gain more ground, it could be that your work as a data scientist is geared toward them, with data science being a tool in their pipeline. This illustrates the variety of data science applications and its general usefulness in the AI domain.

Computer Vision

Computer vision is the field of computer science that involves the perception of visual stimuli by computer systems, especially robots. This is done by analyzing the data from visual sensors using AI systems, performing some pattern recognition on them, and passing that information to a computer in order to facilitate other tasks (e.g. movement, in the case of a robot). One of the greatest challenges of computer vision is being able to do all that in real time. Analyzing an image is not hard if you are familiar with image processing methods, but performing such an analysis in a fraction of a second is a different story. This is why practical computer vision had been infeasible before AI took off in data science.

Although computer vision focuses primarily on robotics, it has many other applications. For example, it can be used in CCTV systems, drones, and most interestingly, self-driving cars. Also, it would not be far-fetched to one day see such systems making an appearance in phone cameras. This way, augmented reality add-ons can evolve into something beyond just a novelty and offer very practical benefits, thanks to computer vision. Since the development of CNNs and other AI systems, computer vision has become highly practical and is bound to remain a powerful application of AI for many years to come.

Chatbots

Chatbots are all the rage when it comes to AI applications, especially among those who use them as personal assistants (e.g. Amazon Echo). Even though a voice-operated system may seem different from a conventional chatbot, which only understands text, they are in essence the same technology under the hood. A chatbot is any AI system that can communicate with its user in natural language (usually English) and carry out basic tasks. Chatbots are particularly useful in information retrieval and administrative tasks. However, they can also do more complex things, such as place an order for you (e.g. in the case of Alexa, the Amazon virtual assistant chatbot). Also, chatbots are able to ask their own questions whenever they find that the user’s input is noisy or easy to misinterpret.

Chatbots are made possible by a number of systems. First of all, they have an NLP system in place that analyzes the user’s text, allowing the chatbot to identify the key objects in it. In this pipeline, there is also what is called an intent identifier, which aims to figure out what the user’s intention is when interacting with the chatbot. Based on the results of this, the chatbot can then carry out the task that seems most relevant, or respond that it is unable to carry out the task. If it is programmed accordingly, it can even make small talk, though its responses are limited. After the chatbot carries out the task, it provides the user with a confirmation and usually prompts for additional input. Some chatbots also introduce small delays in the conversation to make it appear more realistic, while sophisticated ones can even pick up new words as they go.
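
As a toy sketch of the intent identification step, here is a simple keyword-based scorer; the intents and keywords are hypothetical, and a production chatbot would use a trained classifier instead:

    import re

    # hypothetical intents, each with keywords that hint at it
    INTENTS = {
        "order_status": {"order", "shipped", "tracking", "delivery"},
        "place_order":  {"buy", "purchase", "want", "need"},
        "small_talk":   {"hello", "hi", "thanks", "bye"},
    }

    def identify_intent(user_text):
        """Return the intent whose keywords overlap most with the user's words."""
        words = set(re.findall(r"[a-z']+", user_text.lower()))
        scores = {intent: len(words & kws) for intent, kws in INTENTS.items()}
        best = max(scores, key=scores.get)
        # if nothing matches, the chatbot should report its inability to help
        return best if scores[best] > 0 else "unknown"

    print(identify_intent("Where is my order? I need the tracking number."))
    # order_status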

The fact that a chatbot’s whole operation is feasible in real time is remarkable, and it is made possible by incorporating data science into how the chatbot analyzes the inputs it receives. Synthesizing an answer based on the results it wants to convey is fairly easy (often relying on a template), but figuring out the intent and the objects involved may not be so straightforward, considering how many different users may interact with the chatbot. Also, in the case of a voice-operated chatbot, an additional layer exists in the pipeline, involving the analysis of the user’s voice and its transcription into text.

Artificial Creativity

Artificial creativity is an umbrella term for various applications of AI that have to do with the creation of works of art or the solution of highly complex problems, such as the design of car parts or the better use of resources in a data center. Artificial creativity is not something new, though it has only recently reached such levels that it is virtually indistinguishable from human creativity. In some cases (e.g. the tackling of complex problems), it performs even better than the creativity of human domain experts. An example of artificial creativity in the domain of painting is to train a DL network on various images from the works of a famous artist, and then feed another image through this network so that parts of that image are changed to resemble the images the network was trained on. This creates a new image that is similar to both, but emulates the artistic style of the training set with excellence, as if the new image were created using the same technique.

RNNs are great for artificial creativity, especially when the domain is text. Although the result in this case may not be as easy to appreciate as in the visual domain, it is definitely interesting. At the very least, it can help people better comprehend the system’s functionality, as such a system is often perceived as a black box (just like any other ANN-based system).

Other AI Systems Using Data Science

Beyond these applications of AI that make use of data science, there are several more, too many to mention in this chapter. I will focus on the ones that stand out, mainly due to the impact they have on our lives.

First of all, we have navigation systems. We may have come to take these for granted, but they are in reality AI systems based on geolocation data and a set of heuristics. Some people think of them as simple optimizers of a path in a graph, but these days they are more sophisticated than that. Many navigation systems take into account other factors, such as traffic, road blockages, and even user preferences (such as avoiding certain roads). The optimization may be over total time or distance, and they often provide an estimate of fuel consumption for a motor vehicle the user predefines. Also, doing all the essential calculations in real time is a challenge of its own, which these systems tackle gracefully. What’s more, many of them can operate offline, which is even more impressive, as the resources available on a mobile device are significantly limited compared to those on the cloud.
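
At their core, though, these systems still solve a shortest-path problem on a weighted graph. Here is a minimal sketch of Dijkstra’s algorithm, with hypothetical travel times (in minutes) as edge weights:

    import heapq

    # hypothetical road graph: node -> {neighbor: travel time in minutes}
    roads = {
        "home":   {"mall": 7, "bridge": 4},
        "bridge": {"mall": 2, "office": 9},
        "mall":   {"office": 5},
        "office": {},
    }

    def shortest_time(graph, start, goal):
        """Dijkstra's algorithm: smallest total travel time from start to goal."""
        queue = [(0, start)]                 # (time so far, node)
        best = {start: 0}
        while queue:
            t, node = heapq.heappop(queue)
            if node == goal:
                return t
            for nxt, w in graph[node].items():
                if t + w < best.get(nxt, float("inf")):
                    best[nxt] = t + w
                    heapq.heappush(queue, (t + w, nxt))
        return float("inf")

    print(shortest_time(roads, "home", "office"))  # 11 (home -> bridge -> mall -> office)

Real navigation systems layer time-varying weights for traffic and user preferences on top of this basic scheme.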

Another AI application related to navigation systems is voice synthesizers; the latter are a common component of the former. Yet voice synthesizers have grown beyond the requirements of a navigation system. They are used in other frameworks as well, such as ebook readers. Synthesizing voice accurately, without a robotic feel to it, is a challenge that has been met through sophisticated analysis of audio data and its reproduction using DL networks.

Automatic translation systems are another kind of AI application based on data science, particularly NLP. Translation is not as simple as looking up words in a dictionary and replacing them. No matter how many rules are used in that approach, the result is bound to be mechanical, not “feeling right” to the end user.

Modern translation systems, by contrast, make use of sophisticated methods that look at the sentence as a whole before attempting to translate it. Also, they try to understand what is going on and take into account translations of similar text by human translators. As a result, the translated text is not only accurate, but also comprehensible, even if it is not always as good as that of a professional translator.

Some Final Considerations on AI

AI systems have great potential, especially when used in tandem with data science processes. However, just like data science in its first years, they are surrounded by a lot of hype, making it difficult to discern fact from fiction. It is easy to succumb to the excessive optimism about these technologies and adopt the idea that AI is a panacea that will solve all of our problems, data science related or otherwise. Some people have even built a faith system around Artificial Intelligence. As data scientists, we need to see things for what they are instead of getting lost in other people’s interpretations. AI is a great field, and its systems are very useful. However, they are just algorithms and computer systems built on these algorithms. They may be linked to various devices and make their abilities easy to sense, but this does not change the fact that they are just another technology.

Maybe one day, if everything evolves smoothly and we take enough careful steps in that direction, we will have something in AI that more closely resembles human thinking and reasoning. Let us not confuse this possibility with the certainty of what we observe. The latter we can measure and reason with, while the former we can only speculate about. So, let’s make the most of AI systems, wherever they apply, without getting carried away. Taking the human factor out of the equation may not only be difficult, but also dangerous, especially when it comes to liability matters. More on that in the chapter that follows.

Summary

AI is a field of computer science dealing with the emulation of human intelligence using computer systems, and with its applications in a variety of domains, including data science. AI is important, particularly in data science, as it allows for the tackling of more complex problems, some of which cannot be solved through conventional approaches.

The problems that are most relevant to AI technologies are those that have one or more of the following characteristics: highly non-linear search spaces, complex relationships among their variables, and performance being a key factor.

Artificial Neural Networks (ANNs) are a key system category for many AI systems used in data science.

There are several types of AI systems focusing on data science, grouped into the following categories:

  • Deep learning networks – These are sophisticated ANNs, having multiple layers, and being able to provide a more robust performance for a variety of tasks. This category of AI systems includes Recurrent Neural Networks (RNNs) and Convolutional Neural Networks (CNNs), among others.
  • Autoencoders – Similar to DL networks, these AI systems are ideal for dimensionality reduction and able to handle large datasets too
  • Other – This includes Fuzzy Logic based systems, optimizers, Extreme Learning Machines, and more

There are various AI systems employing data science on the back-end, such as:

  • Computer vision – This kind of AI system involves the perception of visual data by computer systems, especially robots
  • Chatbots – These are useful AI systems that interact with humans in natural language
  • Artificial creativity – This is not so much an AI system, but an application of AI related to the use of sophisticated AI systems for the creation of artistic works or for solving highly complex problems
  • Other – There are also other AI systems employing data science, such as navigation systems, voice synthesizers, and automatic translation systems

AI systems are a great resource, but they are not a panacea. It is good to be mindful about their usefulness without getting too overzealous about AI and its potential.
