Chapter 9

ETHICS AND PRIVACY WITH BIG DATA

LEARNING OBJECTIVES

After completing this chapter, you should be able to do the following:

     Identify how Big Data can be misused even when there is no intent to do harm.

     Recognize how everyone is being tracked and recorded in Big Data systems.

     Recall the implications of large-scale video monitoring tools and the ability to track individuals across the country.

INTRODUCTION

One of the biggest drawbacks of Big Data is the potential for violation of individual privacy. In this chapter, we will examine examples of Big Data that have been used in ways that might be hurtful to an individual or intrusive from a privacy standpoint. It is important that every organization evaluate how Big Data affects its ethics policies.

ETHICAL QUESTIONS

Throughout this text, we have considered many different examples of Big Data. Unfortunately, attention to the ethical implications of the use, accumulation, or sharing of that data has not kept pace with the vast increase in the volume of Big Data. Consider the following situations:

     Data that is pushed from the owner to a provider (even with access permission) without the data owner being aware of it.

     Data that the user is aware of but has not given permission to access, such as personal photos.

     Data that the user is aware of but never thought would be used for such purposes, for example, insurance and health information, or texting status (especially when there is an accident).

     Data that is collected from Internet surfing patterns.

     Data embedded in other objects such as video, audio, or photo.

     Ownership of Big Data generated by the individual which becomes part of a larger database.

     Are the assumptions and conclusions that you or your company make about employees or prospective employees appropriate?

     What unstructured Big Data does your organization have that it is not using but that could be used by someone else who obtained access to it? (Think email.)

KNOWLEDGE CHECK

1.     Which of the following best describes the current state of ethics policies as they relate to Big Data?

a.     Ethics policies are fine as they exist today.

b.     Ethics policies have not kept pace with the explosion of Big Data.

c.     Ethics policies just need minor tweaks to incorporate Big Data.

d.     Big Data policies are already addressed by IT policies.

ETHICAL IMPLICATIONS [1]

Forbes shared an article on the ethics of Big Data in March 2014. The article raised interesting implications that could be used to create a framework for Big Data policies.

The prevalence of Big Data does not mean that privacy no longer exists. It is imperative that the rules for privacy are defined. Where does the data come from? Who owns the data? What rights must a gatherer have to accumulate, use, and maintain data? Individuals ought to be able to manage the flow of their private data across various communication systems. Determine what the difference between shared data and public data is. Unfortunately, many of our modern conveniences are designed to produce private information (GPS, Wi-Fi, cell towers, and the like). However, just because we (or our equipment) generate health, financial, and location information, and so forth, it does not mean that anyone can assume ownership or use of that data.

Big Data requires full disclosure as to its access, archiving, and usage. This can be especially confusing when companies have access to information that can be put to a new use to gain insights that they never had before (consider the Target pregnancy example). For Big Data to fit within generally accepted mores, the accumulators of Big Data must be forthright about how individual information is being utilized or even sold.

Identities are vulnerable with Big Data. It is possible to determine an individual’s identity without their knowledge or permission. There also is an individual responsibility to understand the impact that one’s identity could have should it be revealed or theorized. For example, what would be the effect if we could identify the IPO clients of investment bankers based on Big Data? Would it not be possible to make assumptions that would be advantageous in stock transactions? But would it be ethical?

What is your organization doing to protect its data? Consider Edward Snowden and Chelsea (formerly Bradley) Manning, who leaked documents to WikiLeaks. What resides in your corporate email?

EXAMPLES OF BIG DATA ETHICAL LAPSES

Let’s briefly examine four major events in Big Data lapses:

     Target and customer information

     Google’s location history

     Edward Snowden

     Julian Assange and WikiLeaks

Target and the Danger of Predictive Analytics [2]

As noted in chapter seven, Target was able to use Big Data for customer information and then use predictive analytics to predict the likelihood of pregnancy. How did Target obtain information from customers without spying on them, and how did it take advantage of that information?

According to The New York Times, Target hired Andrew Pole as a statistician in 2002. Pole had a master’s degree in statistics and another in economics. Staff from Target’s marketing department approached Pole and asked him if he could determine whether a customer was expecting a baby. If this could be achieved, Target could market to the customer prior to the birth and hopefully garner a larger portion of "future spending" related to the baby’s needs.

What type of data does Target acquire on its customers? When the opportunity arises, Target assigns each shopper a unique code—known internally as the guest ID number—that keeps tabs on everything purchased.

The following activities are linked to the Guest ID:

     Using a credit card

     Using a coupon

     Completing a survey

     Mailing in a refund

     Calling the customer helpline

     Opening an email

     Visiting the Target website

The following additional demographic information is also linked to the Guest ID: age, marital status, children, address, driving time to the store, estimated salary, recent relocation history, credit cards, and websites visited.

Target can also buy data such as ethnicity, job history, the magazines you read, bankruptcy history, marital (divorce) history, the year you bought (or lost) your house, your college, online topics you participate in, preferred brands of coffee, paper towels, cereal, or applesauce, your political perspectives, reading habits, charitable giving, and the vehicles that you own.

To appreciate the relationship between predictive analytics and buying habits, the writer of the New York Times article, Charles Duhigg, highlighted some foundational research conducted in the 1980s by a team of researchers led by UCLA professor Alan Andreasen. The research studied purchases such as soap, toothpaste, trash bags, and toilet paper. Most shoppers paid little attention to how they bought these products. These were habitual purchases and did not involve any complex decision making.

The researchers found that when some customers were going through a major life event like graduation, a new job, moving, or the like, shopping habits became flexible in predictable ways. If the changes were predictable, retailers could capitalize on this knowledge. The researchers also discovered that newlyweds changed coffee brands, that the purchase of a new house resulted in new breakfast cereal choices, and that divorce resulted in buying different brands of beer. Therefore, shopping habits would also be expected to change for a mother expecting a child.

Target had traditional indicators of motherhood such as a baby-shower registry, which became a source of data for Pole. Pole’s research identified about 25 products that, when analyzed together, allowed him to assign each shopper a "pregnancy prediction" score. He could also estimate the due date so Target could send coupons timed to very specific stages of the shopper’s pregnancy.

Pole applied the "pregnancy prediction" score to every regular female shopper in Target’s national database and created a list of tens of thousands of women who were most likely pregnant. These shoppers could now be targeted for specific marketing programs.
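The scoring idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the product names, the weights, the 0.5 threshold, and the function names are all hypothetical, and the source does not describe Target’s actual model.

```python
# Hypothetical signal products and weights, chosen for illustration only.
# They are NOT Target's actual products or model.
SIGNAL_WEIGHTS = {
    "unscented lotion": 0.30,
    "vitamin supplements": 0.25,
    "cotton balls": 0.15,
    "large tote bag": 0.10,
}


def prediction_score(purchases):
    """Sum the weights of the distinct signal products a shopper bought (capped at 1.0)."""
    found = SIGNAL_WEIGHTS.keys() & set(purchases)
    return min(1.0, sum(SIGNAL_WEIGHTS[item] for item in found))


def flag_shoppers(baskets, threshold=0.5):
    """Return the guest IDs whose score meets the (hypothetical) threshold."""
    return sorted(
        guest_id
        for guest_id, items in baskets.items()
        if prediction_score(items) >= threshold
    )
```

Calling flag_shoppers with a dictionary that maps each guest ID to a list of purchased items returns the IDs that would be selected for a campaign. The point of the sketch is that no single purchase is revealing; it is the combination of ordinary items that produces a sensitive inference.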

Supposedly, about a year after the pregnancy-prediction model was created, an irate man walked into a Minneapolis Target to see the manager. He was angry that his daughter had received coupons intended for an expectant mother and wondered whether the store was encouraging teenage pregnancy (his daughter was still in high school).

The manager apologized and then called a few days later to apologize again. On the phone, though, the father was somewhat abashed. "I had a talk with my daughter," he said. "It turns out there has been some activities in my house I haven’t been completely aware of. She’s due in August. I owe you an apology." [3]

Exercise: What are the implications of companies gathering data on customers, applying predictive analytics, and then sending advertising material based on the resulting marketing data?


KNOWLEDGE CHECK

2.     Target assigns a unique code to each shopper known as a

a.     Guest ID.

b.     Shopper ID.

c.     Customer ID.

d.     Target Partner ID.

3.     Target’s research team found that approximately how many products, when taken as a group, could be a good pregnancy indicator?

a.     25.

b.     30.

c.     35.

d.     37.

GOOGLE’S LOCATION HISTORY IS RECORDING

If you have an Android device, Google could be following and recording your locations and movement.

The tracking is a result of an overlooked component in Android called Google Location History. The application itself is not surprising. It uses cell towers and Wi-Fi to find your device and you. Apple and Microsoft also use similar applications in their devices.

The issue with the Google Location service is that, although the standard Android setup routine asks whether you want to activate it, it does not mention that you can turn it off.

Google Location History is another matter altogether.

Consider what Google says beneath the Location choices:

"Google’s location program utilizes Wi-Fi and other signals to determine your location. The program may store some of this information on your device and may gather other information from your device at the same time."

GPS is not always necessary, as the device can use cell sites and Wi-Fi signals. These can result in lower battery usage and, therefore, be more desirable. However, the description of this process indicates that the signals will be both "anonymous" and "collected." The data is not anonymous, as the information is tied to your account, and your activities are being recorded.

To check if you have Location History enabled, go to the Google Maps Location History page (https://maps.google.com/locationhistory/). Use the gear-icon button to access history settings and choose disabled or enabled.

Note that disabling location history does not remove existing history. It is possible to erase the past 30 days of information from the location history page. The default view shows location history for the current day, so you may not see any plots on the map.

The technology review website CNET has laid out the technique to find this information. Use the calendar on the left to show your history for up to 30 days. If you have been tracked during this timeframe, the specific points will appear on the map. Below the calendar, there are options to delete your history from the time you have chosen or to delete all history. [4] It is possible to shut off this tracking or delete portions of the history.

image
image

image

It is possible to zoom in and see all of the points that your mobile device connected to during a particular day.

image

Every dot on the map is a point where Google used the Wi-Fi Positioning System (WPS) to locate this cell phone. Every time the cell phone connected to a Wi-Fi access point, the MAC address and SSID were sent to Google’s servers. This information, together with GPS and cell ID data, is then collected and stored. This becomes the source of the Google Location History map. [5]

There are some benefits to using this service. For instance, to make airline travel easier, your boarding pass can be automatically displayed on your phone as the phone detects your arrival at the airport.
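To make the idea of locating a phone from nearby access points concrete, here is a minimal sketch. It assumes a hypothetical lookup table from access point MAC addresses to known coordinates and uses a simple signal-weighted average; the source does not describe the method Google actually uses, and the addresses and coordinates below are invented for illustration.

```python
# Hypothetical lookup table: access point MAC address -> (latitude, longitude).
# The addresses and coordinates are made up for illustration.
KNOWN_ACCESS_POINTS = {
    "00:11:22:33:44:55": (44.9778, -93.2650),
    "66:77:88:99:aa:bb": (44.9780, -93.2640),
}


def estimate_position(scan):
    """Estimate (lat, lon) as a weighted average of the known access points in a scan.

    scan maps MAC address -> signal weight (a larger value means a stronger signal).
    Returns None when no access point in the scan is known.
    """
    total = lat_sum = lon_sum = 0.0
    for mac, weight in scan.items():
        if mac not in KNOWN_ACCESS_POINTS or weight <= 0:
            continue  # skip unknown access points and unusable readings
        lat, lon = KNOWN_ACCESS_POINTS[mac]
        lat_sum += lat * weight
        lon_sum += lon * weight
        total += weight
    if total == 0:
        return None
    return (lat_sum / total, lon_sum / total)
```

Even this simplified version shows why GPS is not required: a list of visible access points, matched against a database of where those access points are, is enough to place a device on a map.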

Keeping the history can also be helpful for your daily commute so that weather and traffic information is available before you begin the drive.

Unfortunately, there is little control to customize the information retained by Google. It is an "all or nothing" type of data storage. The following items are not controllable:

     It is not possible to limit how long location data is retained.

     Location data does not expire automatically.

     History cannot be set to be retained for only one day, one week, or one month (note: history for any period, from one to 30 days, can be erased).
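To show what the missing control might look like, here is a minimal sketch of an automatic expiry rule. The function name, the record layout, and the retention window are hypothetical; nothing here reflects an actual Google setting or API.

```python
from datetime import datetime, timedelta


def prune_history(points, retention_days, now):
    """Keep only the points recorded within the last retention_days.

    points is a list of (timestamp, latitude, longitude) tuples; this record
    layout is an assumption made for the sketch.
    """
    cutoff = now - timedelta(days=retention_days)
    return [point for point in points if point[0] >= cutoff]
```

A user who chose a seven-day window would call prune_history(points, 7, datetime.now()) on a schedule, and anything older would be dropped without manual deletion.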

With the Android device, the location services settings are easy to change. The first step is to open the location settings as shown in the following screenshot.

image

It is possible to simply turn off the location setting, which stops anything on your device from reporting your location. Turning off location will render certain apps unusable.

image

There is the option to turn off location reporting completely or to leave location reporting enabled and turn off only location history.

image

It is possible to wipe your entire location history by clicking on the "Delete Location History" button.

Apple device users can turn off Google’s location reporting but cannot wipe the history from the mobile device. To do so, it will be necessary to visit the Google Maps Location History page mentioned previously. Remember that additional Android devices may also be recording history (see the preceding image for other connected devices).

The Google Location History has practical uses, such as tracking and watching family members, tracking your work mileage, or following trip progressions from location to location.

The problem is that Google does not let users know in a simple form that they are being tracked. Thus, confidential information can be shared without users realizing that it is being collected. Google should consider adding an option during setup that explains the implications of location history and makes it easy to opt out.

An unscrupulous person with access to the mobile device can add another account, turn off the syncing process or indications from that account and then track the real owner of the device. If this is done without the knowledge of the original owner of the device, the implications could be disturbing.

image

KNOWLEDGE CHECK

4.     If you disable the Google Location History, what happens?

a.     All past history is removed.

b.     All past history for the last 30 days remains.

c.     All past history is unaffected.

d.     Google Location will not allow you to use the map feature.

5.     Which of these was listed as a benefit of Location History being active?

a.     Obtaining your boarding pass as you arrive at the airport.

b.     Finding a lost device.

c.     Recognizing network associates that may be nearby.

d.     Determining whether sales staff are visiting customers.

6.     Which was a Location History danger outlined previously?

a.     An unscrupulous person may find a way to track your movement.

b.     Your boss could ask for your phone to account for your whereabouts.

c.     Your location based on your history may become part of a legal investigation.

d.     Your location history may provide clues about mergers and acquisitions.

Exercise: How could this mapping feature be used to benefit your company? How could it be misused to hurt your business? What are the implications for our private lives?


Loss of Privacy–Big Data Security Leaks

The last couple of years have been nothing short of astounding, with the theft of confidential information from email systems and its subsequent release to the public. As a country, we have been shocked, intrigued, frightened, and apprehensive, and we are still in a state of uncertainty as to what information will be revealed by either Edward Snowden or Julian Assange via WikiLeaks.

Julian Assange and WikiLeaks

Julian Assange is the editor-in-chief of the website WikiLeaks. He has a background in computer programming and hacking. WikiLeaks distributes sensitive data, news leaks, and whistleblower information from unidentified sources. WikiLeaks achieved notoriety in 2010 when it distributed U.S. military and political reports received from Pfc. Bradley (now Chelsea) Manning, an Army intelligence analyst. Manning was able to acquire U.S. State Department cables. In September 2011, WikiLeaks posted all of the cables unredacted. This resulted in more than 100,000 secret U.S. diplomatic cables becoming available on the Internet.

According to "Democracy Now!," WikiLeaks recently published leaked chapters of the secret Trans-Pacific Partnership (TPP), a global trade deal between the United States and 11 other countries. The TPP would cover 40 percent of the global economy, but details have been concealed from the public. A recently disclosed "Investment Chapter" highlights the intent of U.S.-led negotiators to create a tribunal in which corporations can sue governments if their laws interfere with a company’s claimed future profits. Assange warns the plan could chill the adoption of health and environmental regulations. [6]

DNC leaks: During the 2016 presidential election, WikiLeaks released documents from the Democratic National Committee. The documents disclosed opinions and positions that angered candidates who were competing in the primaries for the Democratic presidential nomination.

CIA Vault 7 (another whistleblower)—The CIA appears to have been hacked by another individual, and confidential documents are currently being leaked to the Internet.

Exercise: Obtaining information in an unethical fashion is not the central issue in this exercise. What is the implication of the details contained in companies’ systems in both document and email forms? How could the data be used against a company today?


KNOWLEDGE CHECK

7.     According to the text, which of the following did Assange’s group recently leak?

a.     Affordable Care Act (ACA).

b.     Chapters of the Trans-Pacific Partnership (TPP).

c.     Benghazi papers.

d.     Petraeus affair.

Snowden

Edward Snowden is best known for his role in stealing top-secret electronic documents from the National Security Agency. At the time, Snowden worked as an intelligence contractor for Booz Allen Hamilton in Hawaii.

This is the leak that "continues to give" as journalists have released more than 7,000 top-secret documents that Snowden entrusted them with, which some believe is less than 1 percent of the entire archive.

Snowden downloaded up to 1.5 million files, according to national intelligence officials, before jetting from Hawaii to Hong Kong to meet with journalists Glenn Greenwald and Laura Poitras. After he had handed off illegally obtained documents, he flew from Hong Kong and later became stranded in Moscow. His future is still far from certain, as the journalists he trusted started revealing his secrets. [7]

Your Company

What is the impact for your organization? Your company may have plenty of secret intangible assets (think about Coke’s formula). In addition, the company has a wealth of information contained in documents and emails that disclose strategies, M&A activity, employment issues, and, this author bets, racial and sexual harassment issues if your company is of any significant size.

Consider the following questions next time your company addresses risks or threats:

1.     What information does the company maintain that could prove damaging if it were released?

a.     E-mail

b.     Electronic files

c.     Security cameras

d.     Travel information

e.     Smartphone information

f.     Conversations (knowingly and unknowingly taped)

g.     Intangible assets

h.     Contracts, pricing information, and so on.

i.     Customer lists

j.     Possession of electronic company information

2.     What policies are in place that specifically address the preceding items?

3.     What security processes are in place for the general company, individual employees, IT employees and contractors?

4.     When were the IT systems last checked for vulnerabilities by a trusted, outside contractor?

KNOWLEDGE CHECK

8.     Journalists have purportedly released over 7,000 of Snowden’s documents, which some believe is less than what percentage of the total files that he stole?

a.     6 percent.

b.     8 percent.

c.     1 percent.

d.     12 percent.

9.     According to the text, WikiLeaks released approximately how many unredacted U.S. State Department documents?

a.     100,000.

b.     50,000.

c.     200,000.

d.     250,000.

Ethics Policy Considerations [8]

What type of policies should an organization add to its existing ethics and code of conduct to address the needs of Big Data? Mark van Rijmenam of Datafloq had these suggestions:

Radical Transparency

Tell your customers in real time what information is being collected. Give users the option to remove any data that can be traced back to them. If you offer a free service, make the user aware of how any information they share in exchange for "free information" will be used. As an alternative, consider charging a fee for a service or product option that does not collect any data.

Simplicity by Design

Give users a simple option to adjust privacy settings, as well as options to determine what types of information they want to share.

Preparation and Security are Key

Determine what information is necessary for the business. Understand that data is valuable and that criminals may be interested in stealing it. Create a crisis strategy with a contingency plan in case the company gets hacked. Remind your staff of Manning and Snowden and the damage caused.

Make Privacy Part of the DNA

Consider hiring a Chief Privacy Officer (CPO) or a Chief Data Officer (CDO) who is responsible for data privacy and ethics. This individual will be accountable for data that are collected, archived, shared, or sold. Discuss the privacy and ethical issues of Big Data at the top levels of the organization.

Practice Questions

1.     Provide some examples of areas that would be discussed under the topic of "Rules for Privacy."

2.     What were the ethical implications of Target’s pregnancy prediction score?

3.     How could someone secretly track your movements (or those of an employee, a family member, and the like)?

4.     What were the four policy considerations for Big Data ethics at the end of the chapter?

Notes
