Chapter 13. When your data science project fails

This chapter covers

  • Why data science projects tend to fail
  • What you can do when your project fails
  • How to handle the negative emotions from failure

Most data science projects are high-risk ventures. You’re trying to predict something no one has predicted before, optimize something no one has optimized before, or understand data that no one has looked at before. No matter what you’re doing, you’re the first person doing it; the work is almost always exploratory. Because data scientists are continuously doing new things, you will inevitably hit a point where you find out that what you hoped for just isn’t possible. We all must grapple with our ideas not succeeding. Failure is heartbreaking and gut-wrenching; it can make you want to stop thinking about data science and daydream about leaving the field altogether.

As an example, consider a company building a machine learning model to recommend products on the website. The likely course of events starts with some set of meetings in which the data science team convinces executives that the project is a good idea. The team believes that by using information about customers and their transactions, they can predict what customers want to buy next. The executives buy into the idea and green-light the project. Many other companies have these models, and the approach seems straightforward, so the project should work.

Unfortunately, once the team starts working, reality sets in. Maybe they find out that because the company recently switched systems, transaction data is available only for the past few months. Or maybe the team runs an experiment and finds that the people who see the recommendation engine don’t buy anything more than the people who don’t. Problems such as these build up; eventually the team abandons the project, dismayed.

In this chapter, we define a project as a failure when it doesn’t meet its objective. In the case of an analysis, the project might fail when it doesn’t help the stakeholder answer the business question. For a production machine learning problem, a project could fail when it isn’t deployed or doesn’t work when deployed. Projects can fail in many ways.

Data scientists tend to not talk about projects failing, although it happens extremely often. When a project fails, a data scientist can feel vulnerable. If your project fails, you may think “Had I been a better data scientist, this wouldn’t have happened.” Few people are comfortable sharing stories about questioning their own abilities.

At its core, data science is research and development. Every day, data scientists take data that has never been analyzed before and search for a trend that may or may not be there. Data scientists set out to build machine learning models on data where there may not be a signal. It’s impossible for these tasks to always succeed, because new trends and signals are very rarely found in any field. In a field such as software engineering, however, it’s usually possible to complete a task (although it may take more time and resources than planned).

Understanding how data science projects fail and what to do when they fail is important. The better you understand a failed project, the more future failures you can avoid. Investigating which parts of a failed project did work can also give you insight into what will succeed. And with a little work, you may be able to adjust a failed project into something useful within the organization.

In this chapter, we cover three topics: why data science projects fail, how to think about project risk, and what to do when a project is failing. We discuss three main reasons why most projects fail, what to do with the project, and how to handle the emotions you may feel.

13.1. Why data science projects fail

Data science projects seem to fail for an endless list of reasons: budget, technology, tasks that take far longer to complete than expected. Ultimately, though, these many types of failures break down into a few core themes.

13.1.1. The data isn’t what you wanted

You can’t look into every possible data source before starting a project. It’s imperative to make informed assumptions about what is available based on what you know of the company. When the project starts, you often find out that many of your assumptions don’t hold true. Perhaps data doesn’t exist, isn’t stored in a useful format, or isn’t stored in a place you can access. If you’re doing an analysis to understand how a customer’s age affects their use of a loyalty program, for example, you may find out that customers are never asked their ages when joining the program. That failure can end a project very quickly.

Example failure: Analysis of loyalty-program status

A director in the marketing department of a large restaurant chain wants to understand whether customers spend differently as they increase in status in the company’s loyalty program. The program has silver, gold, and platinum levels, and the director wants to know whether someone who hits platinum bought in the same way when they were merely at silver.

The data science team agrees to look into this request because the task should be fairly straightforward and they haven’t worked with loyalty data before. They’re shocked to find that the antiquated loyalty-program database doesn’t keep track of historic program levels—just where customers are now. If a customer is currently at the platinum level, there’s no way to know when they used to be silver or gold. Thus, the analysis is impossible to do.

The data science team recommends that the system be adjusted, but changing a loyalty-program database architecture requires millions of dollars, and there’s little demand for it in the company, so no changes are made, and the analysis idea is abandoned.

Because you need data before you can do anything, these types of problems are the first major ones that arise. A common reaction to running into this issue is internal bargaining, in which you try to engineer around the holes in your data. You say things like “Well, we don’t have a decade of data like we wanted, but maybe a year of data will be sufficient for the model” and hope for the best. Sometimes, this approach can work, but the alternate solutions aren’t always adequate to make the project feasible.

When you pitch a project, you don’t always have access to the data or even a full understanding of what it is (a special problem in consulting, where you don’t get access to the data until the work for the project is sold). Further, the data may exist but have a critical flaw that renders it useless: it may sit in a database table, but the customer IDs are corrupted and unusable. There are so many ways a dataset could have problems that it’s extremely difficult to check them all before starting a project. For this reason, it’s common for data science projects to barely get past the launch phase.

The faster you can get access to the data and explore it, the faster you can mitigate the risk of inadequate data. The best-case scenario is getting samples of the data before starting a project. If that isn’t feasible, the next-best scenario is designing the project timeline around the possibility that the data will be poor. By building in an early “go/no go” step at which everyone agrees to reassess the project’s feasibility, you reduce the chance that stakeholders will be surprised to learn the data is bad.

If you find yourself struggling with a lack of good data, you have limited options. You can try to find alternative data sources to substitute, for example. Maybe you don’t have data on which products were purchased, but you do know what product volume was manufactured, and you can use that instead. The catch is that these substitutes are often different enough from the original data to cause real problems with the analysis.

When you can’t find a viable substitute, sometimes all you can do is start a separate project to begin collecting better data. Adding instrumentation and telemetry to websites and apps, creating databases to store data instead of throwing it out, and similar efforts can put the team in a position to take on the project again in the future with better data in hand.

13.1.2. The data doesn’t have a signal

Suppose that a gambler hires a data scientist, hoping to use statistics to win a dice game. The gambler rolls a six-sided die 10,000 times and records the rolls; then he pays the data scientist to create a model that will predict the next die roll. Despite having an immense amount of data, the data scientist has no way to predict the next roll beyond assigning each side a 1/6 probability (if the die is fair). The data contains no signal about which side will come up next.

This problem of not having a signal in the data is extremely common in data science. Suppose that you’re running an e-commerce website and want to create a model to predict which customers will order products based on their browser, device, and operating system. There is no way to know before starting a project whether those data points could actually be used to predict whether a customer will order, or whether the data lacks a signal, just like the die-roll data did. The act of creating a machine learning model to make a prediction is testing the data to see whether it has a signal within it, and there very well may not be. In fact, in many situations, it would be more surprising for there to be a signal than for there not to be a signal.
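
To see what “no signal” looks like in practice, here is a minimal sketch (simulated rolls only; the three-roll window, the random seed, and the choice of a random forest are all arbitrary) that trains a model on fair die rolls. No matter how much data you feed it, its accuracy hovers at the 1/6 chance baseline.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
rolls = rng.integers(1, 7, size=10_000)   # 10,000 fair six-sided die rolls

# Use the previous three rolls as "features" for predicting the next roll
window = 3
X = np.array([rolls[i:i + window] for i in range(len(rolls) - window)])
y = rolls[window:]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

print(f"model accuracy:  {model.score(X_test, y_test):.3f}")
print(f"chance baseline: {1 / 6:.3f}")
# Both numbers land near 0.167: there is no signal for the model to find.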

Example failure: Detecting bugs on a website with sales data

A hypothetical e-commerce company has a problem: the website keeps having errors and bugs. Worse, the errors aren’t always detected by DevOps or the software engineering team. Once, an error was detected by the marketing team, which noticed that daily revenue was too low. When marketing notices a bug instead of DevOps or engineering, that’s a bad situation.

The data science team sets out to use statistical quality-control techniques on the sales data so that they can have alerts when revenue is so low that there must be a bug on the site. They have a list of days when bugs were detected and historic revenue data. It seems straightforward to use sales to predict bugs.

Unfortunately, the number of reasons why revenue can change on a daily basis makes detecting bugs almost impossible. Revenue could be low because of the day of the week, the point in the year, promotions from marketing, global events, or any number of other things. Although marketing was once able to see a bug, that fact wasn’t generalizable because there wasn’t a signal for it in the data.
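
To make that concrete, here is a minimal sketch with made-up numbers (synthetic revenue, an arbitrary weekend effect, and a hypothetical 15% bug-induced drop; nothing here is real company data). A control limit loose enough to tolerate ordinary weekend dips is far too loose to notice a bug-sized drop on a weekday.

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.date_range("2023-01-01", periods=365, freq="D")
weekend = dates.dayofweek >= 5
revenue = rng.normal(100_000, 8_000, size=365) * np.where(weekend, 0.6, 1.0)
revenue[45] *= 0.85   # hypothetical bug: a 15% drop on one weekday

daily = pd.DataFrame({"date": dates, "revenue": revenue})
limit = daily["revenue"].mean() - 2 * daily["revenue"].std()   # naive control limit

print(f"control limit:   {limit:,.0f}")
print(f"bug-day revenue: {daily.loc[45, 'revenue']:,.0f}")      # still well above the limit, so no alert
print(f"days flagged:    {(daily['revenue'] < limit).sum()}")   # any flags are just ordinary slow weekends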

Unfortunately, not having a signal in the data can be the end of the project. If a project is built around trying to find a relationship in the data and make a prediction based on it, and there is no relationship there, the prediction cannot be made. An analysis may turn up nothing new or interesting, or a machine learning model may fail to have any results that are better than random chance.

If you can’t seem to find the signal in the noise, you have a couple of possible ways out:

  • Reframe the problem. You can try to reframe the problem to see whether a different signal exists. Suppose that you have a set of articles, and you’re trying to predict a relevance score for each article to show the user. You could instead frame the problem as a classification task: predicting which article in the set the user is most likely to click.
  • Change the data source. If nothing seems to pull a signal out of the data, you can try changing the data source. As with the previous failure point of not having good data, adding a new data source to the problem sometimes creates an unexpected signal. Unfortunately, you usually start with the dataset that had the highest chance of being useful, so the odds that this strategy will save you are fairly limited.

It’s common for data scientists who are stuck in this situation to try a more powerful model to find a signal. If a logistic regression can’t make a meaningful prediction, they try a random forest model. If a random forest model doesn’t work, they try a neural network. Each method ends up being more time-consuming and more complex. Although these methods can be useful for getting more accurate predictions, they can’t make something out of nothing.

Most often, if the simplest method cannot detect any signal, the more complex ones won’t be able to either. Thus, it’s best to start with simple modeling methods to validate the feasibility of the project and then move to more complex and time-consuming ones, rather than starting with the complex ones and going simpler. Don’t get lost spending months building increasingly complicated models, hoping that just maybe the next one will be the one that saves the project.
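
One way to keep yourself honest is to make the simple model an explicit feasibility check. The sketch below illustrates that workflow with random stand-in data (the feature matrix, labels, and cross-validation setup are all placeholders, so there is deliberately nothing to find): if a logistic regression can’t beat a trivial baseline, treat that as evidence about the data rather than as a reason to reach for a heavier model.

import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Random stand-ins for your real features and labels (e.g., browser/device/OS vs. ordered-or-not)
rng = np.random.default_rng(0)
X = rng.normal(size=(2_000, 10))
y = rng.integers(0, 2, size=2_000)

chance = cross_val_score(DummyClassifier(strategy="most_frequent"), X, y, cv=5).mean()
simple = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()

print(f"chance baseline:     {chance:.3f}")
print(f"logistic regression: {simple:.3f}")
# If the simple model can't beat the chance baseline, a random forest or neural
# network is unlikely to conjure a signal that isn't there; revisit the data instead.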

13.1.3. The customer didn’t end up wanting it

No matter how accurate a model or analysis is, what matters is that it provides value to the stakeholder. An analysis can have findings that are incredibly interesting to the data scientist but not to the businessperson who requested it. A machine learning model can make highly accurate predictions, but if that model isn’t deployed and used, it won’t provide much value. Many data science projects fail even after the data science work has been done.

Ultimately, a data science analysis, model, or dashboard is a product. Designing and creating a product is a practice that many people have put hundreds of years of collective thought into. Despite all that, every year, billions of dollars are spent creating products that people don’t end up wanting. From New Coke to Google Glass, some high-profile products don’t land with customers, and some low-profile ones don’t either. Just as Microsoft and Nokia put enormous effort into Windows Phone, which customers didn’t end up buying, a data scientist can create products that aren’t used.

Example failure: Sales and marketing campaign value prediction

A retail company started a project to create a machine learning model predicting how much return on investment (ROI) future advertising campaigns would bring. The data science team decided to build the model after seeing how much the marketing and sales teams struggled with making Excel spreadsheets that predicted the overall value. By using machine learning and modeling at the customer level, the data science team created a Python-based model that predicted campaign ROI more accurately than the spreadsheets.

Later, the data science team found out that the only reason why the marketing and sales teams created Excel spreadsheets with ROI predictions was to get the finance department to sign off on them. The finance team refused to work with anything but Excel; Python was too much of a black box for them. Thus, the tool wasn't used because the data science team didn't consider the customer's needs. The need wasn't for the most accurate prediction possible; it was for a prediction that would convince the finance team that the campaigns were financially feasible.

The universal guidance on creating products that customers will like is to spend lots of time talking to and working with customers. The more you understand their needs, their desires, and their problems, the more likely you are to make a product that they want. The fields of market research and user experience research are different ways of understanding the customer, through surveys and focus groups in market research or through user stories, personas, and testing in user experience research. Many other fields have come up with their own methods and have been using them for years.

Despite all the good thinking people have done, data science as a field is especially susceptible to failing because of not understanding the customer needs. For whatever reason, data scientists are much more comfortable looking at tables and plots than they are going out and talking to people. Many data science projects have failed because the data scientists didn’t put enough effort into talking to customers and stakeholders to understand what their true problems were. Instead, the data scientists jumped into building interesting models and exploring data. In fact, this situation is one of the main reasons why we chose to devote chapter 12 to managing stakeholders. We hope that you already have a better understanding of how to think through stakeholder relationships from reading that chapter, but if you skipped it, maybe you should check it out.

If you find yourself in the situation of having a product that doesn't seem to be landing, the single best thing you can do is talk to your customers. It’s never too late to talk to your customers. Whether your customer is a business stakeholder or customers of your company, communication and understanding can be helpful. If your product isn’t useful to them, can they tell you why it isn’t? Could you potentially fix the problems by adding new features to the product? Maybe you could change an analysis by joining a different dataset to it. Maybe you could improve a machine learning model by adjusting the format of the output or how quickly it runs. You’ll never know until you talk to people.

This also feeds into the concept of a minimum viable product (MVP), which is used heavily in software development. The idea is that the more quickly you can get a product working and to market, the more quickly you can get feedback on what works or doesn’t and then iterate on that feedback. In data science, the faster you have any model working or any analysis done, the faster you can show it to customers or stakeholders and get their feedback. Spending months iterating on a model prevents you from getting that feedback.

The better you understand customers throughout the design and build processes of your work, the less likely you are to get a failure from a customer not wanting the product. And if you end up failing in this way, the best way forward is to start communicating to try to find a solution.

13.2. Managing risk

Some projects are riskier than others. Taking data the team has worked with before and making a standard dashboard in a standard way is pretty likely to succeed. Finding a new dataset in the company, building a machine learning model around it that will run in real time, and displaying it to the customer in a pleasant user interface is a riskier project. As a data scientist, you have some control over how much risk you take on at any time.

One big consideration with risk is how many projects you are working on at the same time. If you are working on a single risky project, and that project fails, it can be quite difficult to handle that failure careerwise. If, however, you are able to work on several projects at the same time, you will be able to mitigate the risk. If one of those projects fails, you have other projects to fall back on. If one project is an extremely complex machine learning model that has a limited chance of success, you could simultaneously be working on simpler dashboarding and reporting; then, if the machine learning project fails, your stakeholders may still be happy with the reports.

Having multiple projects can also be beneficial from a utilization standpoint. Data science projects have many starts and stops, from waiting for data to waiting for stakeholders to respond and even waiting for models to fit. If you find yourself stuck on one project for some reason, you’ll have an opening to make progress on another. This can even help with mental blocks; distracting yourself when you’re stuck can be a great way to refresh your thinking.

Another way to mitigate risk is to bake early stopping points into a project. Ideally, a project that seems like it may fail should be designed with the expectation that if by a certain point it isn’t successful, it’ll be cut off. In a project in which it’s unclear whether the data exists, for example, the project can be scoped so that if, after a month of searching, good data can’t be found, it’s considered to be infeasible and scuttled. If the expectation that it might not work out is presented early, ending the project is less surprising and less costly.

In a sense, having the project end early codifies the fact that data science is research and development. Because data science is filled with so many unknowns, it makes sense to plan in the possibility that as more is learned through exploratory work, the idea may not pan out.

Although it’s worthwhile to minimize the risk in a project portfolio, you don’t want to remove it entirely. Data science is all about taking risks: almost any sufficiently interesting project is going to have plenty of uncertainty and unknowns. Those risky unknowns can occur because no one has used a new dataset before, no one in a company has tried a certain methodology before, or the stakeholder is from a part of the company that has never used data science before. Plenty of valuable data science contributions at companies have come from people trying something new, and if as a data scientist you try to avoid projects that could fail, you’re also avoiding potentially big successes.

Although this chapter covers many ways that individual data science projects fail, a data science team can also fail in aggregate by not taking enough risks. Consider a team that comes up with some new project ideas and reports, finds them successful, and then stagnates by only updating and refreshing the previous work. Those projects might not fail, in the sense that they keep delivering work to the company, but the team would miss potential new areas for data science.

13.3. What you can do when your projects fail

If your data science project has failed, that doesn’t mean all the time you spent working on it was wasted. Earlier in this chapter, we outlined some potential actions you can take to turn a project around. But even if there’s no way the project can succeed, there are still steps you can take to get the most out of what’s left of it. In the following sections, we cover what to do with the failed project and how to handle your emotions when a project fails.

13.3.1. What to do with the project

Although the project may have failed, there is likely still a lot to be gained from it, in both knowledge and technology. The following steps can help you retain many of those gains.

Document lessons learned

The first thing to do with a project that failed is assess what you can learn from it. Some important questions to ask yourself and the team are

  • Why did it fail? This question seems almost obvious, yet it’s often the case that you can’t understand why a project failed until you step back and look at the bigger picture. By having a discussion with all the people involved in the project, you can better diagnose what went wrong. The company Etsy popularized the concept of a blameless postmortem—a discussion held after something failed in which a team can diagnose the problem without blaming a person. By thinking of a problem as being caused by a flaw in the way the team works (instead of a person’s mistakes), you’re more likely to find a solution. Without the fear of punishment, people will be more willing to talk openly about what happened.
  • What could have been done to prevent the failure? When you understand the factors that contributed to the failure, you can understand how to avoid similar situations in the future. If the data wasn’t good enough for the project to work, for example, the failure could have been prevented by a longer exploratory phase. These sorts of lessons help your team grow and mature.
  • What did you learn about the data and the problem? Even if the project is a failure, you often learn things that will be valuable in the future. Maybe the data didn’t have a signal in it, but to get to that point, you still had to join a bunch of new datasets; now you can do those same joins more easily in other projects. These questions can help you brainstorm possible things that can be salvaged from the project and help you come up with alternative project ideas.

By having a meeting in which the team works through these questions and then saving the results in a shared location, you’ll get a lot more value out of the failed project.

Consider pivoting the project

Although the project itself may have been a failure, there may be ways to pivot it into something useful. If you’re trying to build a tool to detect anomalies in company revenue, for example, and it fails, you may still be able to use that same model as a pretty decent forecasting tool. Whole companies have been built on taking an idea that was a failure and repurposing it into something successful.

Pivoting a product requires a lot of communication with stakeholders and customers. You’re essentially back at the beginning of the product design process, trying to figure out a good use for your work. By talking to stakeholders and customers, you can understand their problems and see whether your work is useful for anything new.

End the project (cut and run)

If you can’t pivot the project, the best thing you can do is end it. By definitively canceling the project, you allow yourself and the team to move on to new, more promising work. It’s extremely easy for a data scientist to want to keep working on a project forever in the hope that someday it’ll work. (There are thousands of algorithms out there; eventually one will work, right?) But if you get stuck trying to get something to work, you end up spending unnecessary effort. Also, it’s not fun to work on the same thing until the end of time! Although cutting a project is hard, as it requires you to admit that it’s no longer worth the effort, it pays off in the long run.

Communicate with your stakeholders

A data scientist should be communicating with their stakeholders throughout the course of a data science project (see chapter 12), but they should increase the amount of communication if the project is failing. Although it may feel comfortable to hide risks and troubles from stakeholders to avoid disappointing them, running into the situation where a stakeholder is surprised to find that the project has failed can be catastrophic for a career. By letting the stakeholders know that problems are occurring or that the project can no longer move forward, you’re being transparent with the stakeholders and inspiring trust. After helping them understand the project state, you can work together to decide the next steps.

If you’re uncertain how to communicate the problems with a stakeholder, your manager should be a good resource. They can either brainstorm an approach to delivering the message or potentially take the lead in delivering it themselves. Different people and organizations like to have messages delivered in different ways, from spreadsheets that lay out the issues with green/yellow/red color coding to conversations over coffee. Your manager or other people on your team should have insight into what works best.

It’s common for you, as a data scientist, to feel anxious when communicating that a project is failing; you feel emotionally vulnerable and think that you’re in a position of weakness. Although there are occasions when the news is received poorly, other people are often willing to work with you to resolve issues and decide next steps. After communicating the project failure, you may well feel relief rather than dread.

13.3.2. Handling negative emotions

Forget the project and the company for a bit: you also need to think about your own well-being. Having a project fail is emotionally difficult! It’s the worst! If you’re not careful, a failed project can be a real drain and haunt you long after the project is over. By being thoughtful about how you react to the failure and the story you craft about it, you can set yourself up for more long-term success.

A natural internal monologue at the end of a failed project is “If only I were a better data scientist, the project wouldn’t have failed.” This thought is a fallacy: most data science projects fail because data science inherently involves trying things that may not work. Most great data scientists have been involved with, or even led, projects that haven’t succeeded. By blaming the failure on yourself and your supposed deficiencies as a data scientist, you’re putting the weight of the whole project on your own shoulders. But as discussed earlier in this chapter, there are many reasons why data science projects fail, and it’s very rare that the issue is the competency of the data scientist. It’s very common to be anxious that the project is failing because of you, but that anxiety is in your head and isn’t a reflection of reality.

If you allow yourself to fail and accept that failure isn’t a sign of weakness, you’ll be better able to learn from the experience. Being confident in yourself and your skills makes it easier to think about the failure and what contributed to it, because it won’t hurt as much. That said, the ability to be confident and own a failure takes time, patience, and practice to gain, so don’t be surprised if you struggle with it at first. It’s okay!

The key point here is that the best thing you can do for yourself when a project fails is understand that failure is not a reflection on your skills. Projects fail for reasons outside your control, and you’ll be able to move on from failure. The more you’re able to hold those things close, the easier the failure will be to accept.

We’ll end this chapter with a metaphor for data science. It’s common for aspiring and junior data scientists to think of a professional data scientist as being like an architect of buildings. A novice architect may design simple homes, and an experienced architect can build skyscrapers, but if either of them has a building collapse, it’s a career-ending failure. Similarly, one way to view a data scientist is that they build more and more complex models, but if one fails, their career is jeopardized. After you read this chapter, we hope that you recognize that this isn’t an accurate model of a professional data scientist.

A better metaphor is that a data scientist is like a treasure hunter (figure 13.1). A treasure hunter sets out looking for lost valuables, and if they’re lucky, they’ll find some! A novice treasure hunter may search for ordinary valuables, while an experienced hunter finds the most legendary treasures. A data scientist is much the same: they seek out successful models, and once in a while, their models and analyses work! Although a senior data scientist may work on more complicated or tricky projects, everyone is continuously failing, and that’s just part of the job.

Figure 13.1. Two metaphors for data science: architecture and treasure-hunting

13.4. Interview with Michelle Keim, head of data science and machine learning at Pluralsight

Michelle Keim leads the Data Science and Machine Learning team at Pluralsight, an enterprise-technology learning platform with a mission to democratize technology skills. Having previously grown and led data science teams at a range of companies including Boeing, T-Mobile, and Bridgepoint Education, she has a deep understanding of why data science projects can fail and how to handle failure.

When was a time you experienced a failure in your career?

I got pulled in to lead a project to build a set of customer retention models. I thought I had talked with all the right stakeholders and understood the business need, how the team worked, and why the models were necessary. We built the models but soon learned that there was no interest in them. The problem was we hadn’t sat down with the customer care agents who would actually be using the output; I had only ever talked to the leaders. We delivered a list of probabilities of whether a customer would leave, but the care agents didn’t know what to do with that. They needed to know what they should do when a customer is at risk of leaving, which is a very different problem than the one we had tackled. The biggest lesson learned for me was that you really have to get down into the weeds and understand the use case of the problem. What’s the problem being solved by the people who will use the output?

Are there red flags you can see before a project starts?

I think partly it’s an instinct that you gain from experience. The more things you see that go wrong and the more you take the opportunity to learn from the failures, the more you know what red flags to look for. The key is keeping your cycle short so that you have a chance to see them sooner; you need to bring feedback in at a frequent rate.

Data scientists tend to get excited about their work and forget to pull their heads up. It’s really important to have not only an understanding of where you want to go at the end of the day, but also what success looks like at different points along the way. That way, you can check your work against that, get feedback, and be able to pivot if necessary. Checkpoints let you quickly know when you’ve missed or misunderstood something and correct course, rather than learning that at the end and having to backtrack.

How does the way a failure is handled differ between companies?

It’s highly tied to the culture of the company. I would advise folks when job searching to try to find out if the company has a culture of learning and ongoing feedback. When you’re interviewing, you have an opportunity to ask the interviewer: What are you learning yourself? How did that opportunity arise for you? If I were to take this role, how would I get feedback? Is it something that you have to seek out, or is it formalized? Getting the feel for how employees respond to these questions is very, very telling.

When you’re already at a company, there are questions you can try to answer for yourself to see if there’s a healthy culture. After a project is finished, is there an opportunity to pause and look back? Do you try to retrospectively learn at the end of projects? Do you see leadership communicating openly and taking ownership of failures at various levels in the company? You get a sense for fear too when a strong culture isn’t there. You start to see behaviors that are more self-serving than mission-serving, and that unhealthy behavior kind of just hits you.

How can you tell if a project you’re on is failing?

You can’t know if you’re failing if you haven’t from the outset defined what success is. What are the objectives that you’re trying to attain, and what do checkpoints along the way towards that success look like? If you don’t know this, you’re just taking a stab at whether the project is going well or not. To set yourself up for success, you have to make sure you’ve collaborated with your stakeholders to have a well-defined answer to those questions. You need to know why you’re doing this project and what problem you’re trying to solve, or you won’t know the value of what you’re delivering and if your approach is correct. Part of the role of a data scientist is to bring your expertise to the table and help frame the problem and define the success metrics.

How can you get over a fear of failing?

You need to remember that you actually want some failures; if everything went perfectly, you’d never learn anything. How would you ever grow? Those experiences are necessary, as there’s no replacement for dealing with failure and becoming okay with it. It’s true that failure can be painful, and you may ask “Oh, my gosh, what am I going to do?” But after you see yourself bounce back, learn from it, and turn it into the next thing, that resiliency you gain will snowball into confidence. If you know to expect some things to go wrong, that makes it easier the next time around. And if you make sure you’re frequently getting feedback, you’re going to catch your failures before they become devastating. No one expects perfection. What is expected is for you to be honest about what you don’t know and to keep learning by asking questions and looking for feedback.

Summary

  • Data science projects usually fail because of inadequate data, a lack of signal, or not being right for the customer.
  • After a project fails, catalog why and consider pivoting or ending it.
  • A project failure isn’t a reflection on the quality of the data scientist.
  • A data scientist isn’t solely responsible for a project failure.