8

The Future of Big Data

The semantic web, or web 3.0, is often referred to as the next phase of the Internet. Led by the World Wide Web Consortium (W3C), the ambition is to transform the current web of unstructured and semistructured data into a “web of data.” According to W3C, the semantic web will make it easier to share and reuse data across application, community, and enterprise boundaries.

In 1998, the inventor of the web, Tim Berners-Lee, already called the semantic web “a web of data, in some ways like a global database.”1 This database will comprise all the unstructured, semistructured, and structured data currently online but still residing in silos. In that same white paper, Berners-Lee described the rationale for developing the semantic web as follows: “the Web was designed as an information space, with the goal that it should be useful not only for human–human communication, but also that machines would be able to participate and help. One of the major obstacles to this has been the fact that most information on the Web is designed for human consumption, and even if it was derived from a database with well-defined meanings (in at least some terms) for its columns, that the structure of the data is not evident to a robot browsing the web.”

The semantic web will enable all people and Internet-connected devices (think: the Internet of Things) to communicate with each other as well as share and reuse data in different forms across different applications and organizations in real time. Obviously, this has everything to do with Big Data.

In Chapter 1, I discussed the 7 Vs of Big Data: volume, velocity, variety, veracity, value, visualization, and variability. Velocity is the speed at which data is created; volume is the sheer amount of it; variety is the different forms of data; veracity is the accuracy of the data; value is the economic benefit the data brings to companies, organizations, and societies; visualization is the art of making the data easy to understand; and variability is the changing meaning of data over time.

Together these 7 Vs define Big Data, and they immediately show the challenge of the semantic web and the future of Big Data: how to connect, link, and make available all the data on the web that is created in large volumes at high speed in different formats with different variables, while still ensuring the correctness, quality, and understandability of that data for people and machines. They also show how Big Data can help create the semantic web. All the technologies currently being developed for Big Data, such as Hadoop, open-source tools, and the technology developed by startups, will enable the development of the semantic web. Its development requires that processing, linking, and analyzing data become better and less expensive.

In a blog post, Ramani Pandurangan, Vice President of Network Engineering at XO Communications, describes the semantic web as “essentially a framework to link metadata (data about data) of data stored in disparate databases on the Web so that it will allow machines to query these databases and yield enriched results.”2 When all databases that are currently still in silos are connected, it will become possible for machines to find, connect, and communicate with that information.
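The idea of linking metadata from disparate databases so that machines can query across them can be sketched in a few lines. The following toy example, with invented identifiers and data, represents facts from two separate "silos" as (subject, predicate, object) triples, in the spirit of RDF, merges them, and lets a program follow links across what were separate databases:

```python
# Minimal sketch of the semantic web idea: facts from two separate
# "databases" are expressed as (subject, predicate, object) triples,
# merged, and queried by a machine. All names and data are illustrative.

def query(triples, subject=None, predicate=None, obj=None):
    """Return every triple matching the given pattern (None = wildcard)."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

# Two silos publish metadata about related entities using shared identifiers.
film_db = [("film:Gravity", "directedBy", "person:Cuaron")]
people_db = [("person:Cuaron", "bornIn", "place:MexicoCity")]

linked = film_db + people_db  # linking the silos into one "web of data"

# A machine can now hop across what used to be separate databases:
director = query(linked, subject="film:Gravity", predicate="directedBy")[0][2]
birthplace = query(linked, subject=director, predicate="bornIn")[0][2]
print(birthplace)  # place:MexicoCity
```

The key point is the shared identifier (`person:Cuaron`): because both silos name the same entity the same way, a machine can traverse the link without human help, which is exactly what siloed databases prevent today.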

A great example of this, and a glimpse of the future of Big Data, is Google's Knowledge Graph, which was introduced in May 2012.3 Google calls it the future of search: indexing things, not strings. The Knowledge Graph is very promising, but as mentioned by Larry Page, cofounder of Google, they “are still at one percent of where Google wants to be.”4 In 2013, the semantic network created by Google contained 570 million objects and over 18 billion facts about relationships between different objects, designed to understand the meaning of the keywords entered for search purposes. The objective is to develop a Star Trek experience, in which users can simply ask computers questions in natural language. Google is not the only organization working on semantic search. Big Data startups like Ontology also aim to connect things instead of strings, but Ontology focuses on enterprise applications rather than the web.

The semantic web as discussed by Tim Berners-Lee in 1998 focuses especially on machines connecting and communicating with the web. Nowadays, we call this the Internet of Things. Connecting 25 to 50 billion interoperable devices can only happen when those devices can browse, connect, and communicate with the web as people do. This will become even more important when, in about a decade, a trillion sensors are connected to the web in real time.5

Therefore, the technologies involved with Big Data not only stimulate the development of the semantic web; they also need a correctly working semantic web to take full advantage of the promises of Big Data. When a semantic web is operational, the future of Big Data can really unfold, and it promises to be a bright future.

Jason Hoffman, CTO of Joyent, predicts that the future of Big Data will be about the convergence of data, computing, and networks.6 The personal computer represented the convergence of computing and networks; the convergence of computing and data will result in analyses of exabytes of raw data, enabling ad hoc questions to be asked of very large datasets.

The future of Big Data will allow us to ask questions and find answers more easily by asking questions in natural language. At the moment, users still need to know what they want to know, but in the future, it will all be about the things that you don't know.7

The real advance will come when organizations no longer have to ask questions to obtain answers, but will simply find answers to questions never thought of. Advanced pattern discovery and categorization of patterns will enable algorithms to perform decision-making for organizations. Extensive and beautiful visualizations will become more important and will help organizations understand the brontobytes of data.

THE FUTURE OF BUSINESS ANALYTICS

The future of Big Data will also transform the way we analyze the vast amounts of new data. In the past, we tried to understand how the organization or the world around us behaved by analyzing the available data using descriptive analytics. This answered the question: “What happened in the past with the business?” With the availability of Big Data, we entered the new era of predictive analytics, which focuses on answering the question: “What is probably going to happen in the future?” However, the real advantage of analytics comes with its final stage, which can be called the future of business analytics: prescriptive analytics. This type of analytics tries to answer the question: “Now what?” or “So what?” It tries to offer a recommendation for key decisions based on future outcomes. To understand what this will mean for your organization, let's look at the differences among these three types of analytics and how they affect your organization.

First, these three types of analytics should coexist: descriptive, predictive, and prescriptive. One is not better than the other; they are just different. All, however, are necessary to obtain a complete overview of your organization. In fact, they should be implemented consecutively. All of them contribute to the objective of improved decision making.

Descriptive Analytics Is About the Past

Descriptive analytics helps organizations understand the past. In this context, the past can be one minute ago or a few years back. Descriptive analytics helps to understand the relationship between customers and products; the objective is to learn what approach to take in the future. In other words, learn from past behavior to influence future outcomes.

Common examples of descriptive analytics are management reports that provide information regarding sales, customers, operations, and finance, and then look to find correlations among the variables. Netflix, for example, uses descriptive analytics to find correlations among different movies that subscribers rent. To improve its recommendation engine, Netflix uses historical sales and customer data.8
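Finding correlations among historical variables, as described above, can be illustrated with a few lines of code. This sketch computes the Pearson correlation between two invented series (say, ads run versus units sold); the figures and names are purely illustrative:

```python
# A small sketch of descriptive analytics: measuring how strongly two
# historical variables move together. The data below is invented.

from statistics import mean, pstdev

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (pstdev(xs) * pstdev(ys))

ads_run = [10, 20, 30, 40, 50]       # past marketing activity
units_sold = [12, 25, 31, 41, 55]    # past sales

r = pearson(ads_run, units_sold)
print(round(r, 3))  # close to 1.0: the variables are strongly correlated
```

A coefficient near 1.0 tells the analyst the two variables moved together in the past, which is exactly the kind of backward-looking insight a management report or a recommendation engine starts from.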

Descriptive analytics is therefore an important source for determining what to do next. With predictive analytics, such data can be turned into information regarding the likely future outcome of an event.

Predictive Analytics Is About the Future

As is clear from the examples in this book, predictive analytics provides organizations with actionable insights based on data. It offers an estimation regarding the likelihood of a future outcome. To do this, a variety of techniques are used, including machine learning, data mining, modeling, and game theory. Predictive analytics can, for example, help to identify future risks and opportunities.

Historical and transactional data are used to identify patterns, while statistical models and algorithms are used to capture relationships in various datasets. Predictive analytics has really taken off in the Big Data era, and there are many tools available for organizations to predict future outcomes. With predictive analytics, it is important to have as much data as possible. More data means better predictions.
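The step from historical patterns to a statistical model that estimates a future outcome can be sketched minimally. This toy example, with hypothetical data, fits a least-squares trend line to past quarterly sales and extrapolates one quarter ahead:

```python
# A minimal sketch of predictive analytics: fit a model to historical
# data, then use it to estimate a future outcome. Data is invented.

from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b

quarters = [1, 2, 3, 4]
sales = [100.0, 110.0, 120.0, 130.0]  # historical sales with a clean trend

a, b = fit_line(quarters, sales)
forecast = a + b * 5  # predict sales for quarter 5
print(forecast)  # 140.0
```

Real predictive analytics uses far richer models (machine learning, data mining) and far more data, which is why more data generally means better predictions, but the structure is the same: learn the relationship from the past, apply it to the future.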

Prescriptive Analytics Provides Advice Based on Predictions

Prescriptive analytics is the final stage in understanding your business, but it is still in its infancy. In the 2013 Hype Cycle of Emerging Technologies by Gartner, prescriptive analytics was mentioned as an “Innovation Trigger” that will take another five to ten years to reach its plateau of productivity.9 Prescriptive analytics not only foresees what will happen and when, but also why it will happen. It provides recommendations for acting on this information to take advantage of the predictions.

It uses a combination of many different techniques and tools, such as mathematical sciences, business rule algorithms, machine learning, and computational modeling techniques, as well as many different datasets ranging from historical and transactional data to public and social datasets.10 Prescriptive analytics tries to see what the effect of future decisions will be in order to adjust the decisions before they are actually made.11 This will significantly improve decision making, as future outcomes are taken into consideration in predictions.
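The core prescriptive idea, evaluating the effect of each candidate decision before any decision is made, can be sketched in a few lines. In this toy example, a predictive model (an invented demand curve) scores every candidate price, and the prescriptive step recommends the one with the best predicted outcome:

```python
# A minimal sketch of prescriptive analytics: simulate the predicted
# outcome of each candidate decision, then recommend the best one.
# The demand model and all numbers are invented for illustration.

def predicted_profit(price):
    """Toy predictive model: demand falls as price rises."""
    demand = max(0.0, 1000 - 8 * price)
    unit_cost = 20.0
    return (price - unit_cost) * demand

candidate_prices = [30, 50, 70, 90]

# Prescriptive step: evaluate every decision against the predictive
# model *before* acting, and recommend the one with the best outcome.
best_price = max(candidate_prices, key=predicted_profit)
print(best_price)  # 70
```

The contrast with predictive analytics is the final line: instead of only forecasting an outcome, the analysis compares outcomes across decisions and turns the forecast into a recommendation.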

Prescriptive analytics is very new; it has only been around since 2003, and it is so complex that there are very few best practices on the market. Only three percent of companies use this technique, and it is still error-prone.12 One of the best examples is Google's self-driving car, which makes decisions based on various predictions of future outcomes. These cars need to anticipate what is coming and what the effect of a possible decision will be before they make that decision, in order to prevent an accident.

Prescriptive analytics could have a very large impact on business and how decisions are made. It can impact any industry and any organization and help organizations become more effective and efficient.

With an understanding of descriptive, predictive, and prescriptive analytics, your business will find it easier to make better informed decisions that take into account future outcomes. Prescriptive analytics is the future, and IBM has already called it “the final phase” in business analytics.13

The Brontobytes Era

Big Data Scientists will be in very high demand in the coming decades, but the real winners in the startup field will be those companies that can make Big Data so easy to understand, implement, and use that Big Data Scientists are no longer necessary. Large corporations will always hire Big Data Scientists, but the much larger market of SMEs cannot afford to do this. Those startups that enable Big Data for SMEs without the need to hire experts will have a competitive advantage.

The algorithms developed by those Big Data startups will become ever smarter, smartphones will become better, and everyone will be able to have a supercomputer in a pocket that can perform daunting computing tasks in real time and visualize them on the small screen in your hand. And, with the Internet of Things and the Industrial Internet and trillions of sensors, the amount of data that needs to be processed by these devices will grow exponentially.

Big Data will only become bigger, and brontobytes will become common language in the boardroom. Fortunately, data storage will also become more widely available and less expensive. Brontobytes will become so common that eventually the term Big Data will disappear. Big Data will become just data again.

However, before we have reached that stage, the growing amount of data that is processed by companies and governments will create privacy concerns. Those organizations that stick to ethical guidelines will survive; other organizations that take privacy lightly will disappear, as privacy will become self-regulating. The problem will remain, however, with governments, as citizens cannot choose not to deal with them. Important public debates about the effects of Big Data on consumer privacy are inevitable; together, we have to ensure that we do not end up in Minority Report 2.0.

The future of Big Data is still uncertain, as the Big Data era is still unfolding. It is clear, however, that future changes will transform organizations and societies. Hopefully, this book has made clear that Big Data is here to stay, and organizations will have to adapt to the new paradigm. Organizations might be able to postpone their Big Data strategy for a while, but we have seen that organizations that have already implemented a Big Data strategy outperform their competitors. Therefore, start developing your Big Data strategy, as there is no time to waste if your organization also wants to provide products and services in the upcoming Big Data era. Good luck!
