Chapter 4

WHAT ARE THE STRATEGY AND BUSINESS APPLICATIONS OF BIG DATA?

LEARNING OBJECTIVES

After completing this chapter, you should be able to do the following:

     Distinguish the root reason for a Big Data strategy.

     Recall the goals of Big Data.

     Identify the strategic implications of Big Data.

INTRODUCTION

Figure 4-1: Big Data Goals

image

All business endeavors must increase value for shareholders or stakeholders. In this chapter, we’re going to consider the strategic implications of Big Data for an organization. What goals should an organization have with regard to Big Data? What types of strategies should be created to achieve those goals? How does an organization begin the journey of implementing Big Data? The first step in answering all of these questions is to determine the overall objective of the organization and how Big Data will be used to help achieve that objective.

For-profit organizations focus on increasing enterprise value as their main objective or purpose. The main objective or purpose for not-for-profits must be to enhance the lives, services, and opportunities for the customers or stakeholders. In figure 4-1, notice that the center purpose is money, time, or value.

The first circle outside money, time, or value represents business insights. For the organization to enhance the enterprise or stakeholder’s value, it must consider business insights that affect the customer, operations, or predictions as to future results. When acted upon, the insights should yield an increase in the organization’s value.

The second circle represents Big Data architecture. To achieve business insights, the organization will most likely need to invest in Big Data architecture. That architecture involves a myriad of hardware, software, and consulting options, including software tools such as Hadoop, R, and MapReduce. It is safe to assume that without an investment in Big Data architecture, it will be very difficult to generate business insights from Big Data that yield increased value.

The final circle represents Big Data sources. This level encompasses all of the structured and unstructured sources of Big Data that have been covered previously. If an organization is unwilling or unable to access and analyze Big Data sources, it will not be able to generate insights that lead to increased enterprise or stakeholder value.

KNOWLEDGE CHECK

1.     Big Data requires that all business endeavors add as their root objective the goal of

a.     Increasing value for shareholders or stakeholders.

b.     Increasing scalable architecture.

c.     Increasing Big Data capacity.

d.     Data aggregation.

GOALS OF BIG DATA

The overarching goal of Big Data analytics is to inform businesses. By analyzing data and trends, businesses can gain valuable insights that can be applied in nearly every area of operations. Some specific ways that Big Data can contribute to achieving the goal of informing businesses include the following:

     Monetize data or interpret data to realize competitive advantage which can be monetized.

     Analyze operational effectiveness—machine sensors, product failures, and traffic patterns.

     Create a reliable, scalable, and capable infrastructure that aids the data gathering, analysis, and inferences.

     Access and use internal and external data that are structured, unstructured, and streaming.

     Predict business, social, political, economic, technological, and environmental trends.

     Take action based on prescribed scenarios.

BUSINESS INSIGHTS ASSOCIATED WITH BIG DATA

The following list attempts to convey the wide variety of applications of Big Data. Many of the data components relate to multiple industries.

1.     Customer analytics. As noted in a previous chapter, one of the main sources of Big Data comes from transactions. This kind of information from customers can be used to gain insights into many aspects of customer behavior, including the following:

     Dropping a product or service

     Analyzing customer behavior while on company website

     Monitoring customer usage of products to detect manufacturing or design problems

     Identifying high-value customers

     Identifying cross-selling opportunities as well as up-selling opportunities

     Determining which customers not to engage

     Identifying, targeting, and retaining customers

     Combining clickstream data with transactional data to improve customer profile

     Limiting product offerings to those that interest the customer

     Determining any aspect of customer behavior or product preference

     Identifying customer segmentation

     Recording and analyzing customer service and support issues

     Engaging brand advocates and changing the perception of brand antagonists

     Empowering customers to sell your products

     Enabling customers to locate items more quickly

     Improving loyalty or net promoter scores

     Analyzing smartphone or mobile data—call detail record processing, social analysis, churn prediction, geo mapping

     Analyzing point of sale data

     Creating forums—crowd creativity, crowd solutions

2.     Manufacturing. Manufacturers now have access to real-time data from a variety of process activities that allow them to gain insight into many factors, including the following:

     Tracking of product quality or defects

     Supply chain management and planning

     Optimizing machines

     Engineering analytics

     Predictive maintenance

     Process and quality analysis

     Warranty claim potential (based on social media comments or complaints)

     Enterprise resource planning—operations, service delivery, supply chain management, and automation of routine decisions

     Continuous improvement to processes and procedures

3.     Research and development. Recently, the federal government recognized the potential value that lies in Big Data for research and development. A new initiative was launched that intends to extract knowledge from collections of digital data to help solve challenges on a national level. On a business level, Big Data can help with the following:

     Monitoring product quality

     Identifying customer needs for potential new products

     Soliciting input from customers regarding products

     Improving products based on call center data

4.     Distribution. As warehouses and distribution centers become increasingly high-tech, they now generate information that can be used to monitor and track labor, inventory, and equipment, including the following:

     Monitoring product shipments

     Identifying variances in logistic costs

     Determining inventory levels

     Using location data such as GPS

     Using radio-frequency identification (RFID)

     Using distribution optimization

5.     Logistics. The "Internet of Things" is a huge new source of data in logistics. It can be used to track goods and provide insight into the following:

     Demand forecasting

     Supply chain analytics

     Tracking

     Delivery forecasting

     Travel industry—searches, pricing, bundling (air, hotel, car, ship, entertainment)

6.     Marketing. Marketing departments are no strangers to using data to determine customers’ habits. With greater access to data, they can apply their insights to the following:

     Determining marketing campaign effectiveness

     Determining channel effectiveness

     Monitoring and improving customer experience

     Tailoring marketing campaigns based on location and demographic data

     Providing advertising and public relations campaigns—demand signaling, targeted advertising, sentiment analysis, customer acquisition, promotions, and other advertising mediums

     Offering brand sentiment analysis

     Providing product placement optimization

     Providing response modeling

     Providing retention modeling

     Providing market-basket analysis

     Providing net promoter scores

     Providing customer segmentation

7.     Predictions. Big Data may enable predictions to be made in areas other than business. Some of these include the following:

     Crimes, threat analysis

     Weather

     Investments

     Mineral location

     Astrophysics

     Health

     Relationships

8.     Operations analysis. Operations analysis leverages information from machine sensors to improve operations in many ways, including the following:

     More accurate and timely decision making

     Deviation analysis of logs and operational data

     Facility layout—either in manufacturing or retail

     Supply chain optimization

     Dynamic pricing

9.     Human Resources. Some retail companies use wearable technology to track their employees’ communications and movements within stores. Although that’s an extreme example of using Big Data for human resources, many companies can use information about their employees to do some of the following:

     Identify employees at risk of leaving the company

     Monitor recruitment activities

     Identify and recruit external candidates

     Résumé data

     Employee search

     Employee future team

10.     Accounting. We will discuss some of the following applications in other chapters of this course:

     Measuring risk

     Credit risk

     Market risk

     Operational risk

     Budgeting, forecasting, planning

     Fraud detection

     Detecting multiparty fraud

     Real-time fraud prevention

     Algorithmic trading

     Customer analysis

     Duplicate payments

     Pricing, business intelligence, and data mining

11.     Competition

     Tracking competitors’ prices

     Tracking competitors’ sales

     Tracking competitors’ marketing initiatives

     Mapping out the competitive landscape

12.     Media and telecommunications. Network optimization, customer scoring, churn prevention, fraud prevention

13.     Energy. Smart grid analysis, exploration, operational modeling, power line sensors

14.     Healthcare and life sciences. Bioinformatics, pharmaceutical research, clinical outcomes research, pharmacogenomics, neonatal ICU monitoring, epidemic early warning systems, remote healthcare monitoring, and predicting a patient's likely return to the hospital.

     Drug discovery

     Health cures

     Health diagnosis

15.     Government

     Regulatory compliance

     Threat analysis

     Law enforcement, defense, and cyber security (for example, real-time surveillance, situational awareness, cyber security detection, license plate tracking, GPS tracking)

     Natural systems—wildfire management, water management, wildlife management

     Transportation—intelligent traffic management

     Tax avoidance, Social Security fraud, money laundering, terrorist detection, communication surveillance and monitoring, market governance, weapons systems and counterterrorism, econometrics, health informatics

16.     Unstructured data. Related to many of the preceding sections

     Sensor data—automotive, appliance, machine, temperature, security, vending machine

     Social networking—sentiment data from user-generated comments on ratings, reviews, and blogs

     Text messaging (SMS)

     Software—application logs

     Internet search—text and documents, mining

     Digital images and videos

     Voice data

     Web—web analytics, social media analytics, multivariate testing (Multivariate testing is a technique for testing a hypothesis in which multiple variables are modified. The goal of multivariate testing is to determine which combination of variations performs the best out of all of the possible combinations. Websites and mobile apps are made of combinations of changeable elements.)1

     Other—text analytics, business process analytics

     Clickstream—a virtual trail that a user leaves behind while surfing the Internet. A clickstream is a record of a user’s activity on the Internet, including every website and every page of every website that the user visits, how long the user was on a page or site, in what order the pages were visited, any newsgroups that the user participates in, and even the email addresses of mail that the user sends and receives. Both Internet service providers and individual websites are capable of tracking a user’s clickstream.2

17.     Stock market analysis. For example, the impact of weather on security prices or analysis of market data latencies
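The multivariate testing technique described under item 16 can be illustrated with a short sketch. The Python below is only an illustration: the page elements, their variants, and the visitor and conversion counts are hypothetical values invented for this example. It enumerates every possible combination of variants and then selects the combination with the highest observed conversion rate.

```python
from itertools import product

# Hypothetical changeable page elements and their variants (assumed values).
elements = {
    "headline": ["A", "B"],
    "button_color": ["green", "orange"],
    "image": ["photo", "drawing"],
}


def all_combinations(elements):
    """Return every combination of variants as a list of dicts."""
    names = list(elements)
    return [dict(zip(names, combo)) for combo in product(*elements.values())]


def best_combination(results):
    """Pick the combination with the highest conversion rate.

    `results` maps a tuple of variants to (visitors, conversions).
    """
    return max(results, key=lambda key: results[key][1] / results[key][0])


combos = all_combinations(elements)

# Made-up observations: each combination is shown to 1,000 visitors.
results = {
    tuple(combo.values()): (1000, 30 + 5 * i) for i, combo in enumerate(combos)
}

winner = best_combination(results)
print(len(combos), "combinations tested; best:", winner)
```

Note that the number of combinations grows multiplicatively with each added element, so a real test needs enough traffic per combination, and a statistical significance check that this sketch omits, before a winner can be declared.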

KNOWLEDGE CHECK

2.     Which of the following was not mentioned as a Big Data insight as it relates to research and development?

a.     Monitoring product quality.

b.     Identifying customer needs.

c.     Creating third- and fourth-generation products.

d.     Soliciting input from customers.

BIG DATA ARCHITECTURE

Figure 4-2

image

Source: Adapted from information found on bigdataandanalysis.blogspot.com/

To provide all of the insights that businesses need, a Big Data platform must accomplish many goals. Big Data architecture must be designed so that data are analyzed in their native environment rather than being re-created in voluminous data tables. The architecture must allow for reading and accessing a variety of sources such as email, financial, audio, images, GPS, and the like. The architecture should be created to accommodate the four Vs—volume, velocity, variety, and veracity—as described in chapter 2. The architecture must also be economically scalable, have an adequate response time, offer multiple hardware options (to tolerate hardware failures), and have built-in security to prevent unauthorized access to confidential detailed data.

STRATEGIC IMPLICATIONS OF BIG DATA—CHALLENGES

Before an organization pursues Big Data as a tool, some key strategic issues must be addressed. If these issues are overlooked at the beginning of the process, it may be difficult to implement Big Data successfully. The following list can be used as a starting point for thinking about implementing a Big Data platform:

     Strategic challenges

     Establishing suitability for purpose

     Providing an overall system architectural plan

     Technological challenges

     Gaining access to data

     Gaining access to the associated methodology and metadata

     Establishing provenance and lineage of data sets

     Establishing data set quality with respect to veracity (accuracy, fidelity), uncertainty, error, bias, reliability, and calibration

     Addressing security concerns

     Technological feasibility

     Existing data warehouse architecture

     Immature new systems or reliability of selected data

     Lack of metadata and schema for the Big Data

     Lack of tooling

     Availability of enterprise-ready products and tools

     High latency (Hadoop)

     Running inside the cluster

     Resource or capacity challenges

     Ability to implement a wide-scale Big Data initiative

     Consolidation of disparate data

     Quality and cost of collecting data

     Budget constraints

     Cost too high

     Staff challenges

     Experiment and trial testing big analytics

     Integrity of network transmission

     Poor data quality

     Ability to deal with real-time data

     Project management challenges

     Reliance on multiple consultants that may not work in harmony

     Starting with the right project

     Change management challenges

     Institutional change management

     Ensuring inter-jurisdictional collaboration and common standards

     Different department systems that inhibit collection and organization of Big Data

     Acquiring technically competent staff

     Steep technical learning curve

     Hiring qualified people

     Barriers between departments that are cultural in nature

     Data that are not accepted or believed

     Data ownership, especially as it ties to organizational culture

     Lack of business sponsorship

     Lack of belief in a business case

     Partnership challenges

     Forming strategic alliances with Big Data producers

     Legal and regulatory issues

Table 4-1

image

Source: Adapted from www.businesszcommunity.com/big-data/drive-real-time-revenue-world-big-data-01109279

KNOWLEDGE CHECK

3.     According to TDWI, what is the biggest challenge for Big Data?

a.     Lack of business sponsorship.

b.     Lack of skills for IT staff.

c.     Data integration complexity.

d.     Poor data quality.

DANGERS OF WRONG DATA

In addition to the strategic challenges of using Big Data, there is also significant potential for an organization to use Big Data incorrectly. Many situations could jeopardize the integrity of the decision-making process if an organization uses Big Data without fully understanding statistical pitfalls. Any issues with small amounts of data will be magnified in larger quantities. Sampling error or bias could create data that are not representative of the situation, a common inaccuracy found in polling. False assumptions could compromise the collection of data from the very beginning. For example, a company may assume that a variable is a strong driver of customer retention, but in reality, that variable is only correlated with retention.

There’s a major danger that organizations become entranced with aggregating vast amounts of data only to draw improper conclusions regarding that data. This is especially important for manufacturing environments which may make crucial production decisions based on prescriptive analytics.

Wrong data

Let us consider an example that Ari Zoldan wrote about in Wired magazine, which discussed drawing conclusions from Twitter data collected during Hurricane Sandy.3

In an intriguing study from Rutgers University, scientists set out to understand people’s decision-making related to Hurricane Sandy. From October 27th to November 1st, over 20 million tweets were recorded that pertained to the super storm. Tweets concerning preparedness peaked the night before, and tweets about partying peaked after the storm subsided.

The majority of the tweets originated from Manhattan, largely because of the high concentration of smartphone and Twitter usage. Due to the high concentration of power outages, and diminishing cell phone batteries, very few tweets were made from the hardest hit areas, such as Seaside Heights and Midland Beach. From the data, one could infer that the Manhattan borough bore the brunt of the storm; however, we know that wasn’t the case. There was actually a lot going on in those outlying areas. Given the way the data was presented, there was a huge data gap from communities unrepresented in the Twittersphere.

This example illustrates several points. First, it refutes the myth that more data will create greater insights. It demonstrates the importance of not being overly influenced by volumes of data or statistics. As you look at data, be objective, critical, and independent of any outcome.
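The sampling gap in the Hurricane Sandy example can be sketched in a few lines of Python. Every figure below (residents, share of residents who tweet, share with power or battery, and the damage score) is a hypothetical value invented for illustration; none comes from the Rutgers study or the Wired article. The sketch shows how the area producing the most tweets can differ from the area that suffered the most damage.

```python
# Hypothetical figures, invented purely for illustration; they are NOT
# taken from the Rutgers study or the Wired article.
areas = {
    # name: (residents, share who tweet, share with power/battery, damage 0-10)
    "Manhattan": (1_600_000, 0.20, 0.90, 3),
    "Seaside Heights": (3_000, 0.10, 0.05, 10),
    "Midland Beach": (20_000, 0.10, 0.05, 9),
}


def expected_tweets(residents, tweet_share, able_share):
    """Number of tweets we would expect to observe from an area."""
    return residents * tweet_share * able_share


tweets = {
    name: expected_tweets(residents, tweet_share, able_share)
    for name, (residents, tweet_share, able_share, _) in areas.items()
}
damage = {name: score for name, (_, _, _, score) in areas.items()}

loudest = max(tweets, key=tweets.get)      # area with the most observed tweets
hardest_hit = max(damage, key=damage.get)  # area with the highest damage score

print("Most tweets:", loudest)
print("Most damage:", hardest_hit)
```

Under these assumed numbers, tweet volume tracks population and connectivity rather than storm impact, which is the same trap the example warns about.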

Statistics are not the same as facts. Big Data may appear to be factual when it is just more volume. When it is raw, Big Data is large and unorganized, and organizing data for analysis is difficult.

You should also be wary of biases and missing context. Confirmation bias is the phenomenon in which people search the data to confirm their preexisting viewpoint. Also, when data conflict with underlying assumptions, there is a tendency to ignore them. Just because the data can be charted or analyzed by an algorithm does not mean the interpretation is valid. Faster and more powerful systems mean that we can also make the wrong interpretation and prescription faster than ever.

When evaluating data, keep the following three cautions in mind:

1.     People tend to find what they seek. More data and speed do not necessarily mean that the results will be improved.

2.     There are two types of data—quantitative and qualitative. Qualitative analysis is necessary to explain the quantitative analysis. Consider the announcement of public earnings reports. The numbers are announced. Then they are put into context by explaining or "verbally" adjusting earnings to present in the best light.

3.     Remember that the context of data is very important. Consider how global warming has been interpreted, reinterpreted, and reinterpreted again. For every data set, it is important to understand the analyst’s bias, such as data presented, data modified, and data excluded. For example, the first quarter gross domestic product in 2015 did not come in as high as hoped. The first quarter has been lower than expected each of the past couple of years. Analysts originally credited this performance to harsh winters. Unfortunately, the weather still didn’t account for all of the disappointing results, so analysts stated that there was a "first quarter residual seasonality" and soft readings in other variables. It should be noted that the economists of the Federal Reserve did not find significant statistical evidence for such distortions on the aggregate GDP.4

FIVE IT BIG DATA MISTAKES

How big is the problem of Big Data for the information technology manager? According to Infochimps:

55% of Big Data projects don’t get completed, and many others fall short of their objectives.

Though there may be many reasons for this, undoubtedly one of the biggest factors is a lack of communication between top managers, who provide the overall project vision, and those charged with implementing it. Far too frequently the opinions of the IT staff doing the heavy lifting necessary to develop a Big Data project are taken as an afterthought and consequently considered only when projects veer off-course.5

According to that quote, more than half of Big Data projects are never finished. Of those remaining, a large subset will not add value to the organization or stakeholders. In addition to potential mistakes with data selection and processing, IT can add complications. Subramanian Iyer of Oracle wrote about the five Big Data mistakes that IT makes:6

1.     Too much emphasis on the technology needed rather than the business need.

2.     Many times IT management focuses on the wrong business cases assuming that the payback will be the same as others in the industry.

3.     Management may launch multiple initiatives in parallel as part of a big bang approach to implementing Big Data. This approach may lessen the chances of success with Big Data projects.

4.     Many times IT management does not complete a proper cost-benefit analysis to determine what the payback on the Big Data project will be.

5.     Placing a Big Data application under the same process requirements (mechanism for authentication, access, data isolation, and management of environments) as traditional applications may jeopardize the project.
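The cost-benefit analysis mentioned in the fourth mistake can be as simple as a payback and net present value calculation. The following Python is a minimal sketch under stated assumptions: the cost, the annual benefit, the five-year horizon, and the 10 percent discount rate are hypothetical inputs, and the function names are introduced here only for illustration.

```python
def payback_period_years(upfront_cost, annual_net_benefit):
    """Simple payback: years needed for net benefits to repay the upfront cost."""
    if annual_net_benefit <= 0:
        return None  # the project never pays back
    return upfront_cost / annual_net_benefit


def net_present_value(upfront_cost, annual_net_benefit, years, discount_rate):
    """NPV of equal annual net benefits received at the end of each year."""
    discounted = sum(
        annual_net_benefit / (1 + discount_rate) ** year
        for year in range(1, years + 1)
    )
    return discounted - upfront_cost


# Hypothetical project figures (assumptions, not data from the chapter).
cost = 500_000
benefit = 200_000

payback = payback_period_years(cost, benefit)
npv = net_present_value(cost, benefit, years=5, discount_rate=0.10)
print(f"Payback: {payback:.1f} years; 5-year NPV: {npv:,.0f}")
```

Simple payback ignores the time value of money, so it is shown here alongside net present value. Using the project's own estimates, rather than borrowing payback figures from others in the industry, also addresses the second mistake in the list.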

KNOWLEDGE CHECK

4.     Which of these is NOT a Big Data mistake?

a.     Using an iterative implementation strategy.

b.     Focusing on technology instead of the business need.

c.     Not executing a cost-benefit analysis.

d.     Executing multiple initiatives in parallel as part of a "big bang" approach or pilot implementations.

Practice Questions

1.     What is the root reason for developing Big Data?

2.     What are some of the change management issues confronting Big Data implementations?

3.     A cautionary tale of Big Data and tweets was related to Hurricane Sandy. What occurred with related tweets that, taken out of context, might produce a false conclusion?

Notes
