The Intelligence Engine: January 2015

Saturday, January 31, 2015

The Game Outcomes Project, Part 5: What Great Teams Do

This article is the conclusion to a 5-part series:

Part 1: The Best and the Rest is also available here: (Gamasutra) (BlogSpot) (in Chinese)
Part 2: Building Effective Teams is available here: (Gamasutra) (BlogSpot) (in Chinese)
Part 3: Game Development Factors is available here: (Gamasutra) (BlogSpot) (in Chinese)
Part 4: Crunch Makes Games Worse is available here: (Gamasutra) (BlogSpot) (in Chinese)
Part 5: What Great Teams Do is available here: (Gamasutra) (BlogSpot) (in Chinese)
For extended notes on our survey methodology, see our Methodology blog page.
Our raw survey data (minus confidential info) is now available here if you'd like to verify our results or perform your own analysis.

The Game Outcomes Project team includes Paul Tozour, David Wegbreit, Lucien Parsons, Zhenghua “Z” Yang, NDark Teng, Eric Byron, Julianna Pillemer, Ben Weber, and Karen Buro.

The following is a summary of our top 40 findings in the 2014 Game Outcomes Project. We’ve listed the factors in order from most to least important below, sorted by the strength of their correlations.

Our study was based on a 120-question survey of 273 developers. The resulting data led to some surprising conclusions: we discovered enormous cultural differences between teams, and most of the cultural factors we looked at correlated very strongly with project outcomes. This article explains what differentiates the most successful game development teams from the rest, according to our survey data.

A large part of our survey was based on team effectiveness models defined in the books Leading Teams: Setting the Stage for Great Performances, The Five Dysfunctions of a Team, and 12: The Elements of Great Managing. Our first surprise was that out of the 46 questions we asked related to these models, 39 (85%) were strongly correlated with every single outcome factor in exactly the way the models predicted, and 5 of the remaining 6 questions correlated with all but one or two outcome factors.

So before you do anything else, pick up those books and read them. This goes double if you’re in any sort of leadership position.

Just because we’re doing game development doesn’t mean the fundamentals of organizational effectiveness are somehow different from other industries, or that the decades of validated management research done elsewhere are magically rendered irrelevant.

We are not a unique and special snowflake.

Bear in mind that this article is an opinion piece – an editorial that extrapolates from our analysis of the Game Outcomes Project data. It includes our own subjective interpretations of what our correlations and other analyses actually imply, but only where we felt it was justified by our data (or the models that inspired our survey design, where the data agreed strongly with those models).

If you disagree with any of our interpretations, read the articles (Parts 1, 2, 3, & 4) and form your own conclusions. You can also download the raw data here in case you’d like to double-check our analysis or investigate the data on your own.

The Top 40: What Great Teams Do

1. Great game development teams have a clear, shared vision of the game design and the development plan and an infectious enthusiasm for that vision [1, 2, 3]. Having a viable, compelling, clear, and well-communicated shared vision was more important than any other factor we looked at. Make absolutely certain that the vision for the final version of the game is clear and well-communicated throughout the team, and that team members share a similar vision of the game throughout development. Leads in particular need to communicate a consistent shared vision for the game, carefully communicate changes to the design or the development plan, and resolve any conflicts of vision swiftly and professionally.

Great development teams care deeply about the vision for the game [4]. They have an infectious enthusiasm that sharpens their focus. This enthusiasm is a huge driver of positive outcomes, and a lack of enthusiasm is a clear warning sign of problems that need to be addressed. It may mean the vision needs work, or it could be a people problem. Investigate carefully and don't jump to conclusions.

2. Great game development teams carefully manage the risks to the design vision and the development plan [1].

They are very cautious about making changes in development that deviate too far from the vision. Fundamental design changes in development are a major driver of increased costs and risks, and are often a sign of deeper problems.

When they disagree about game design, they resolve the disagreement swiftly. They do not ignore it.

If core design elements DO change, great gamedev teams clearly communicate those changes to the team and justify them.

Some believe that a leader’s job is to “hire talented people and get out of the way.” This is a deeply flawed notion. A leader must constantly and proactively work to identify and mitigate any potential threats to the project or the team.

3. Members of great game development teams buy into the decisions that are made [1]. If there’s no buy-in, it's a clear warning sign of deeper problems.

4. Great game development teams avoid crunch [1].

Extended overtime seems to actually make games worse overall. As far as our data is able to tell, crunch absolutely does NOT make games better. Our data provides no convincing evidence that any overtime is helpful in any way. Not even a little bit. It has negative correlations with outcomes and positive correlations with poor planning, miscommunication, turnover, and a disrespectful working environment, especially when crunch is mandatory instead of voluntary.

But even if you don’t buy our conclusion that crunch itself makes games worse, you should take a good, hard look at all of the other evidence that crunch increases burnout, disengagement, turnover, and project error rates … along with the extensive evidence in the broader management research showing that it also harms employees’ health, productivity, relationships, morale, engagement, and decision-making ability, while increasing the risk of alcohol abuse.

The very best results in our study came from teams that were focused and cohesive and worked the LEAST amount of overtime.

If you lead a team, try this exercise: ask your team to work no more than 40 hours a week for 3 months, with the specific goal of increasing productivity and focus as much as possible in those 40 hours. Genuinely work to optimize your team's productivity during normal working hours and see how much more you can do with less.

And even if crunching were effective, it’s pathetic to ask your team to put in more hours per week before you’ve genuinely done everything in your power to maximize productivity in the first 40.

5. Great gamedev teams build an environment where it's safe to take a risk and stick your neck out to say what needs to be said [1].

If team members don't feel safe and comfortable speaking openly, or harbor any worries about political blowback from speaking their minds, you're likely to miss some very important things. There could be a hole in your boat, but if people are afraid to call attention to it, you might never find out, and the boat could sink.

Don't let the boat sink. Don’t let holes go unpatched. Build an environment where everyone can give and receive honest feedback without getting defensive or political, and respect others’ right to do so.

Some very compelling validated management science shows that this kind of "psychological safety" is essential for building high-performing, learning teams.

6. Great gamedev teams do everything they can to minimize turnover [1] and avoid changing the team composition [2] except for growing it when needed. This includes avoiding disruptive re-organizations as much as possible [3].

7. Great gamedev teams resolve interpersonal conflicts swiftly and professionally [1, 2].

If you have to bring in outsiders to resolve internal conflicts, you have a problem.

That's not to say all conflict is bad. Respectful, professional disagreement – or “creative conflict” – should be embraced [3].

But confrontations, politics, and disrespect [4] should not. Foster constructive politics and ensure that teams stay focused on attacking the problem, not the individual.

8. Great gamedev teams have a clearly-defined mission statement and/or set of values, which they genuinely buy into and believe in [1]. This matters FAR more than you might think.

If your team doesn't have a mission or doesn’t genuinely believe in the stated mission, consider pulling the team together and rewriting your mission statement.

9. Great gamedev teams keep the feedback loop going strong [1]. No one should go too long without receiving feedback on their work.

As part of this, they also practice "no-surprises management" [2]. Give IMMEDIATE feedback and ensure that team members always know how well they are doing. If there are problems, don't ever wait for a meeting or a performance review to bring them up.

10. Great gamedev teams celebrate novel ideas, even if they don't achieve their intended result [1].

All team members need the freedom to fail, especially creative ones.

Team members are more likely to experiment when they can see that the team and the leads have their back. This experimentation is key to creativity, and it's an absolutely essential part of building a learning, growing team.

The best teams understand that mistakes are opportunities.

At the same time, this needs to be balanced against the need to carefully manage design risk (point #2 above). Keep the creative experimentation focused on the right areas, especially on what's needed to complete the game or resolve gameplay problems. Avoid wasteful design thrashing.

11. Great gamedev teams hold each other to high standards for their particular discipline (art, design, engineering, etc) [1]. Embrace respectful collaboration – including code reviews, design reviews, art reviews, etc. – as opportunities for learning.

12. Great gamedev teams build an environment of mutual respect [1].

Some compelling management research shows that employees who feel respected are significantly more engaged. This engagement has a direct and measurable impact on project outcomes.

Make sure this respect is maintained as professional dialogue even during passionate disagreements.

Make sure that leads and managers set an example by respecting all team members.

Ensure that respectful behavior is rewarded and disrespectful behavior is swiftly discouraged.

Don't keep team members on board who are unwilling or unable to behave respectfully. They will only poison the well.

13. Great gamedev teams deal with personnel / HR issues on the team swiftly, professionally, and appropriately [1].

14. On great gamedev teams, everyone on the team is committed to making a great game [1].

15. Great gamedev teams empower team members by ensuring that their opinions count [1].

Ensure that everyone is respectfully listened to and has a chance to change your mind – especially if you're the boss.

16. Great gamedev teams estimate task durations as accurately as possible [1]. This can be difficult, but has a significant impact on outcomes. Re-estimating task durations on a regular basis to maintain the accuracy of the schedule also seems to have clear positive benefits.

17. Great gamedev teams strive to minimize internal politics and foster an environment where political shenanigans are not acceptable [1].

Don’t let an environment of accountability deteriorate into a culture of blame and finger-pointing. Your team should be focused on making a great game, not one-upmanship or internecine conflict.

Don't let anyone on the team deliberately act in a way that undermines anyone else's efforts [2].

18. Great gamedev teams discuss failures openly [1]. This helps create an environment of psychological safety.

Failed ideas can be the seed of successful ones. Sometimes a seemingly bad idea can be inches away from a very good one.

When people keep failed ideas to themselves, it’s a missed opportunity and a possible sign of team dysfunction. Mistakes are an opportunity for learning that should be shared, and knowledge hoarding comes with a very high organizational cost.

19. Great gamedev teams don't let any team members put their own priorities above the collective goals of the game project [1].

Team members must never put their own ego, their career, their need for recognition, or their own sub-team or discipline ahead of the team. If any of these happens, it's a sign of a deeper problem. Resolve it swiftly, and if necessary, remove the team members in question.

Great teams hold one another accountable and call their peers out on actions and behaviors that are counterproductive to the greater good of the team [2].

20. Great gamedev teams value and utilize the unique skills and talents of all team members [1, 2]. Make sure team members' responsibilities and job roles are carefully matched with their particular skills and abilities.

21. Great gamedev teams enlist all studio stakeholders in decisions to make significant changes to the core game design or architecture [1].

22. Great gamedev teams offer ample praise [1]. Don’t hesitate to call out when someone does a task well.

23. Great gamedev teams keep an open-door policy [1]. Everyone on the team should have easy access to senior leadership to raise concerns, offer feedback, or discuss personnel issues.

24. Great gamedev teams ensure that all team members understand clearly what is expected of them [1].

Team members' tasks should be well-defined and clearly specified [2].

It should always be clear what a team member is supposed to do, and who is supposed to be doing what on the project.

25. Great gamedev teams make the organizational structure and membership of the team clear from the outset and carefully communicate any changes to that structure [1].

26. Great gamedev teams ensure that all team members are well-trained in the studio's production methodology [1]. They also make a deliberate effort to continually hone and improve their production techniques throughout the development process.

Having said that, we see no statistically significant differences between agile, agile using Scrum, or waterfall production techniques [2]. The only production methodology that shows a difference is not having one: our study shows this is disastrous for teams of any significant size.

There seems to be no universal right or wrong answer here. Pick the methodology that you think will work best for your team and your project.

27. Great gamedev teams don't let important things go unsaid [1]. They point out the elephant in the room.

And they ensure that team members have the psychological safety they need to point out big problems and offer challenging feedback.

28. Great gamedev teams give team members opportunities to learn, grow, and improve their skill set [1, 2], and they ensure that someone in the organization encourages each team member to develop his or her skills further [3].

Ideally, this should include both on-the-job development and access to external training, coaching, and mentoring.

29. Great gamedev teams ensure that their team's tools (both software and hardware) work well and allow them to be productive [1]. They keep their game engine, tool chain, and asset pipeline running smoothly at all times.

30. Great gamedev teams give team members the authority to determine their own tasks on a day-to-day basis [1]. They also ensure that the person responsible for performing a task is involved in determining how much time is allocated to it [2].

31. Great gamedev teams carefully manage technology changes in development [1], especially large ones. Switching to a new game engine or making deep changes to an existing engine can be very risky, and great teams take extra care to manage those risks.

32. Great game development teams involve the entire team in prioritizing the work to be done for each milestone or sprint [1].

33. Great gamedev teams meet regularly to discuss topics of interest, ask questions, and identify production bottlenecks [1].

34. Great gamedev teams hold team members accountable for meeting their deadlines [1].

At the same time, they DON'T treat deadlines as matters of life and death, and they don't crucify team members for missing a deadline. Sometimes expectations aren't reasonable, or design features or new technology features don't work out or take much longer than expected.

Avoid sacrificing team cohesion and morale on the altar of individual deadlines. The former are worth far more in the long run.

35. Great gamedev teams foster an environment of helpfulness [1]. They reward team members for asking for help and offering support to others. A "sink or swim" environment will guarantee that everyone sinks in the long run.

36. It's a good idea to have some specs or design documents that describe the vision for the game at the outset [1]. Although these can never replace the work of careful, day-to-day design, design documents show a positive correlation with project timeliness and goal achievement.

37. Great gamedev teams genuinely care about one another as human beings [1]. Treating staff like robots is counterproductive and hurts ROI.

Don’t tolerate “brilliant jerks.” Or any jerks, for that matter.

38. Great gamedev teams use individually-tailored financial incentives [1].

Financial performance incentives have surprisingly little impact, and they ONLY seem to work at all when directly linked to individual performance. Royalties appear to have no impact on outcomes. Incentives tied to team performance or MetaCritic scores appear similarly useless. If you're going to offer financial incentives, consider using Pay For Performance (PFP) plans or similar individual performance incentives.

39. Great gamedev teams – especially large ones – conduct code reviews, pair programming, or peer-reviewed code checkins [1]. These showed a positive correlation with schedule timeliness and meeting project goals, especially for larger teams, and there’s significant evidence that they reduce defects and improve a team’s programming skills.

40. Great gamedev teams recognize that even the best-laid plans sometimes require adjustment. Any predetermined plan grows increasingly out-of-date as a game project evolves. Most of the best teams in our survey determined the priorities for each new milestone or sprint based on the current status of the project each time [1].

Experience also matters enormously, but you probably already knew that. What you may not have known is that it’s about as important as #36 on this list – the first 35 factors we listed all showed a stronger correlation with project outcomes than a team’s average level of experience.

Conclusion

Despite the inevitable risks involved in game development, the clearest result of our study is that the lion's share of our destiny is in our own hands. It comes down to a culture that consciously and deliberately fosters, cultivates, and supports effective teamwork.

We spend enormous amounts of effort optimizing our code and our art assets. There's no reason we shouldn't spend just as much effort optimizing our teams, and we hope this study has pointed the way toward some of the tools to help with that process.

The Game Outcomes Project team would like to thank the hundreds of current and former game developers who made this study possible through their participation in the survey. We would also like to thank IGDA Production SIG members Clinton Keith and Chuck Hoover for their assistance with survey design; Kate Edwards, Tristin Hightower, and the IGDA for assistance with promotion; and Christian Nutt and the Gamasutra editorial team for their assistance in promoting the survey.

For further announcements regarding our project, follow us on Twitter at @GameOutcomes

Monday, January 26, 2015

The Game Outcomes Project, Part 4: Crunch Makes Games Worse

This article is the fourth in a 5-part series.

The Game Outcomes Project team includes Paul Tozour, David Wegbreit, Lucien Parsons, Zhenghua “Z” Yang, NDark Teng, Eric Byron, Julianna Pillemer, Ben Weber, and Karen Buro.

Extended overtime (“crunch”) is a deeply controversial topic in our industry. Countless studios have undertaken crunch, sometimes extending to mandatory 80-100 hour work weeks for years at a time. If you ask anyone in the industry about crunch, you’re likely to hear opinions stated very strongly and matter-of-factly based on that person’s individual experience.

And yet such opinions are almost invariably put forth with zero reference to any actual data.

If we truly want to analyze the impact of extended overtime in any scientific and objective way, we should start by recognizing that any individual game project must be considered meaningless by itself – it is a single data point, or anecdotal evidence. We can learn absolutely nothing from whether a single successful or unsuccessful game involved crunch or not, because we cannot know how the project might have turned out if the opposite path had been chosen – that is, if a project that crunched had not done so, or if a project that did not employ crunch had decided to use it.

As the saying goes, you can’t prove (or disprove) a counterfactual – you’d need a time machine to actually know how things would have turned out if you’d chosen differently.

Furthermore, there have undeniably been many successful and unsuccessful games created both with and without crunch. So we can’t give crunch the exclusive credit or blame for a particular outcome on a single project when much of the credit or blame is clearly owed to other aspects of the game’s development. To truly measure the effect of crunch, we would need to look at a large sample, ideally involving hundreds of game projects.

Thankfully, the Game Outcomes Project survey has given us exactly that. In previous articles, we discussed the origin of the Game Outcomes Project and our preliminary findings, and our findings related to team effectiveness and many additional factors we looked at specific to game development. We also wrote up a separate blog post describing the technical details of our methodology.

In this article, we present our findings on extended overtime based directly on our survey data.

Attitudes Toward Crunch

Developers have surprisingly divergent attitudes toward the practice of crunch. An interview on gamesindustry.biz quoted well-known industry figures Warren Spector and Jason Rubin:

“Crunch sucks, but if it is seen by the team members as a fair cost of participating in an otherwise fantastic employment experience, if they value ownership of the resulting creative success more than the hardship, if the team feels like long hours of collaboration with close friends is ultimately rewarding, and if they feel fairly compensated, then who are we to tell them otherwise?" asked Rubin.

[…] "Look, I'm sure there have been games made without crunch. I've never worked on one or led one, but I'm sure examples exist. That tells me something about myself and a lot about the business I'm in," said Spector.

[…] "What I'm saying is that games - I'm talking about non-sequels, non-imitative games - are inherently unknowable, unpredictable, unmanageable things. A game development process with no crunch? I'm not sure that's possible unless you're working on a rip-off of another game or a low-ambition sequel.

“[…] Crunch is the result of working with a host of unknown factors in creative mediums. Since game development is always full of unknowns, crunch will always exist in studios that strive for quality […] After 30 years of making games I'm still waiting to find the wizard who can avoid crunch entirely without compromising at a level I'm unwilling to accept.”

On the other side of the fence is Derek Paxton of Stardock, who said in an interview with Gameranx:

“Crunch makes zero sense because it makes games worse. Companies crunch to push through on a specific game, but the long-term effect is that talented developers, artists, producers and designers burn out and leave the industry.

“Companies and individuals should stop wearing their time spent crunching as a badge of honor. Crunch is a symptom of broken management and process. Crunch is the sacrifice of your employees. I would ask them why crunch isn’t an issue with other industries. Why isn’t crunch an issue at all game studios?

“Employees should see it as a failure. Gamers should be concerned about it, because in the long term the hobby they love is losing talent because of it. Companies should do everything in their power to improve their processes to avoid these consequences.”

So who is right – Spector and Rubin, or Paxton?

[Full disclosure: team member Paul Tozour leads Mothership Entertainment, whose flagship game is being published by Stardock.]

In the Game Outcomes Project survey, we provided 3 text boxes at the end that respondents could use to tell us about their industry experiences. Where they mention crunch, they invariably mention it as a net negative. One respondent wrote:

“The biggest issue we had was that the lead said ‘Overtime is part of game development’ and never TRIED to improve. As sleep was lost, motivation dropped and the staff lost hope ... everything fell apart. Hundred-hour weeks for nine months, and I'm not exaggerating. Humans can't function under these conditions ... If you want to mention my answer feel free. I'm sure it'd be familiar to many devs.”

Another developer put it more bluntly:

“Schedule 40 hours a week and you get 38. Schedule 50 and you get 39 and everyone hates work, life, and you. Schedule 60 and you get 32 and wives start demanding you send out resumes. Schedule 80 and you’re [redacted] and get sued, jackass.”

In this article, we will be getting a final word on the subject from the one source that has yet to be interviewed: the data.

The “Extraordinary Effort” Argument

We’ll begin by formulating the “pro-crunch” side of the discourse into testable hypotheses. Although no one directly claims that crunch is good per se, and no one denies that it can have harmful effects, Spector and Rubin clearly make the case in the article above that crunch is often (if not usually, or even always) a necessary evil.

According to this line of thinking, ordinary development with ordinary schedules cannot produce extraordinary results. We believe an accurate characterization of this viewpoint from the gamesindustry.biz article quoted above would be: “Extraordinary results require extraordinary effort, and extraordinary effort demands long hours.”

This position (we’ll call it the “extraordinary effort argument”) leads directly to two falsifiable hypotheses:

1. If the “extraordinary effort argument” is correct, there should be a positive correlation between crunch and game outcomes, and higher levels of crunch should show a measurable improvement in the outcomes of game projects.

2. If the “extraordinary effort argument” is correct, there should be relatively few, if any, highly successful projects without crunch.

Luckily for us, we have data from hundreds of developers who took our survey with no preconceptions as to what the study was designed to test, and which we can use to verify both of these statements. We’ll agree to declare victory for the pro-crunch side if EITHER of these hypotheses remains standing after we put it in the ring with our data set.

Crunching the Numbers

We’ll approach our analysis in several phases, carefully determining what the data does and does not tell us.

Our 2014 survey asked the following five questions related to crunch, which were randomly scattered throughout the survey:

“I worked a lot of overtime or ‘crunched’ on this project.”
“I often worked overtime because I was required or felt pressured to.”
“Our team sometimes seemed to be stuck in a cycle of never-ending crunch / overtime work.”
“If we worked overtime, I believe it was because studio leaders or producers failed to scope the project properly (e.g. insufficient manpower, deadlines that were too tight, over-promised features).”
“If I worked overtime, it was only when I volunteered to do so.”

Here’s how the answers to those questions correlate with our aggregate project outcome score (described on our Methodology page). On the horizontal axis, a score of -1.0 is “disagree completely” and a score of +1.0 is “agree completely.”

Figure 1. Correlation of each crunch-related question with that project’s actual outcome (aggregate score). Each of the 5 questions is shown, as an animated GIF with a 4-second delay. Only the horizontal axis changes.

The correlations are as follows: -0.24, -0.30, -0.47, -0.36, +0.36 (in the same order as the five questions listed above). All five of these correlations have statistical p-values well below 0.001, indicating that they are statistically significant. Note how all the correlations are strongly negative except for the final question, which asked whether crunch was solely voluntary.
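For readers who would like to run this kind of check on the raw data themselves, here is a minimal Python sketch of a Pearson correlation and a rough significance test. It is an illustration rather than the exact procedure behind the figures above: the function names and the toy data are made up, the sample size of 273 is an assumption (it is the total number of respondents, and the count behind each individual correlation may be smaller), and the p-value uses a normal approximation to the t distribution, which is only reasonable for large samples.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for testing a correlation r against zero with n paired samples."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def approx_two_sided_p(t):
    """Two-sided p-value from a normal approximation (an assumption; fine for large n)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


# Toy data (invented): answers on the -1..+1 agreement scale, outcome scores on 0..100.
answers = [-1.0, -0.5, 0.0, 0.5, 1.0]
outcomes = [80, 70, 65, 50, 45]
toy_r = pearson_r(answers, outcomes)

# Sanity check of one reported value, ASSUMING all 273 respondents answered.
t = t_statistic(-0.24, 273)
p = approx_two_sided_p(t)
print(f"toy r = {toy_r:.2f}, t = {t:.2f}, approximate p = {p:.6f}")
```

Under that assumed sample size, even the weakest of the five correlations lands comfortably below the 0.001 threshold, which is consistent with the statement above. With fewer responses the same correlation would produce a larger p-value.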

“But wait,” a proponent of crunch might say. “Surely that’s only because you’re using a combined score. That score combines the values of questions like ‘this project met its internal goals,’ which are going to give you lower values, because they're subjective fluff. Of course people who are unhappy about crunch are going to give that factor low scores – and that’s going to lower the combined score a lot. It’s a fudge factor, and it’s skewing your results. Throw it out! You should throw away the critical success, delays, and internal goals outcomes and JUST look at return on investment and I bet you’ll see a totally different picture.”

OK, let’s do that:

Figure 2. Correlation of each of the 5 crunch-related questions with that project’s return on investment (ROI). As with Figure 1, each of the 5 questions is shown, as an animated GIF with a 4-second delay. Only the horizontal axis changes. Note that many of the points shown represent multiple coincident points. See our Methodology page for an explanation of the vertical axis scale.

Notice how the lines have essentially the same slopes as in the previous figure. The correlations with ROI are as follows (in the same order): -0.18, -0.26, -0.34, -0.23, and +0.28. All of these correlations have p-values below 0.012.

Still not convinced? Here are the same graphs again, correlated against aggregate reviews / MetaCritic scores.

Figure 3. Correlation of each of the 5 crunch-related questions with the project’s aggregate reviews / MetaCritic score (note that the vertical axis does not represent actual MetaCritic scores but is a normalized representation of the answers to this question; see our Methodology page for more info). As with Figures 1 and 2, each of the 5 questions is shown, as an animated GIF with a 4-second delay. Note that many of the points shown represent multiple coincident points. Only the horizontal axis changes.

The results are essentially identical, and all have p-values under 0.05.

So if our combined score has a negative correlation with ALL our crunch questions except the one about crunch being purely voluntary (which itself does not imply any particular level of crunch), that means that we’ve disproven the first part of the “extraordinary effort argument” – the correlation is clearly negative, not positive.

Now let’s look at the second testable hypothesis of the “extraordinary effort argument.”

In Figure 4 (below), we’re looking at the two most relevant questions related to overall crunch for a project. The vertical axis is the aggregate outcome score, while the horizontal axis represents the scale from “disagree completely” (-1) to “agree completely” (+1). The black lines are trend lines. As you can see, in both cases, higher agreement with each statement corresponds to inferior project outcomes.

Figure 4. The two most relevant questions related to crunch compared to the aggregate project outcome score.

We’ve added horizontal blue and orange lines to both images. The blue line represents a score of 80, which will be our subjective threshold for “very successful” projects. The orange line represents a score of 40, which will be our threshold for “very unsuccessful” projects.

The dots above the blue line tell a clear story: in each case, there were more successful games made without crunch than with crunch.

However, these charts don’t tell the full story by themselves; many of the data points are clustered at the exact same spot, meaning that each dot can actually represent several data points. So a statistical deep-dive is necessary. We’re particularly interested in the four corners of the chart – the data points above the blue line on the extreme left and right sides of each chart (below -0.6 and above +0.6 on the horizontal axis) and below the orange line on the left and right sides.

Looking solely at the chart on the top of Figure 4 (“I worked a lot of overtime or ‘crunched’ on this project”), we observed the following pattern. Note that the percentages are given in terms of the total data points in each vertical grouping (under -0.6 or above 0.6 on the horizontal axis).

We can see clearly that a higher percentage of no-crunch projects succeed than fail (17% vs 10%), and a much larger percentage of high-crunch projects fail than succeed (32% vs 13%). Additionally, no-crunch projects are more likely than high-crunch projects to be very successful (17% vs 13%), while high-crunch projects are far more likely than no-crunch projects to be very unsuccessful (32% vs 10%).
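
The “four corners” grouping described above can be sketched in a few lines of Python. This is a hedged illustration rather than the study's actual code: the function name `corner_shares` and the toy responses are invented, while the cut-offs (-0.6 and +0.6 on the horizontal axis, 80 and 40 on the outcome score) are the ones given in the text.

```python
def corner_shares(responses, low=-0.6, high=0.6, success=80, failure=40):
    """Share of very successful / very unsuccessful projects at each extreme.

    responses: iterable of (agreement, outcome_score) pairs, where agreement
    runs from -1 (disagree completely) to +1 (agree completely).
    Percentages are relative to each vertical grouping, as in the article.
    """
    groups = {"no_crunch": [], "high_crunch": []}
    for agreement, score in responses:
        if agreement < low:
            groups["no_crunch"].append(score)
        elif agreement > high:
            groups["high_crunch"].append(score)
        # Answers between the two cut-offs are ignored here.

    result = {}
    for name, scores in groups.items():
        n = len(scores)
        result[name] = {
            "n": n,
            "very_successful_pct": 100 * sum(s > success for s in scores) / n if n else 0.0,
            "very_unsuccessful_pct": 100 * sum(s < failure for s in scores) / n if n else 0.0,
        }
    return result

# Invented toy responses: (agreement with the crunch question, outcome score).
toy = [(-1.0, 85), (-1.0, 60), (-0.67, 90), (-0.67, 35), (0.0, 70),
       (0.67, 30), (1.0, 82), (1.0, 20), (1.0, 55)]

for group, stats in corner_shares(toy).items():
    print(group, stats)
```

Because each percentage is computed within its own vertical grouping, the two figures for a group need not add up to 100%: projects scoring between 40 and 80 fall into neither corner.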

Here’s the same chart, but this time looking at the bottom question, “Our team sometimes seemed to be stuck in a cycle of never-ending crunch / overtime work.”

These results are even more remarkable. The respondents that answered “disagree strongly” or “disagree completely” were 2.5 times more likely to be working on very successful projects (23% vs 9%), while the respondents who answered “agree strongly” or “agree completely” were, incredibly, more than 10 times more likely to be on unsuccessful projects than successful ones (41% vs 4%).

Some might object to this way of measuring the responses, as it is an aggregate outcome score which takes internal achievement of the project goals into account – and this is a somewhat subjective measure. What if we looked at return on investment (ROI) alone? Surely that would paint a different picture.

Here is ROI:

Figure 5. The two most relevant questions related to crunch compared to return on investment (ROI).

The first question (top chart) gives us the following results:

The second question (bottom chart) gives us:

These results are essentially equivalent to what we got with Figure 4 -- the probabilities have shifted a little bit but the conclusions haven't changed at all. The same results hold if we look at MetaCritic scores or any of the other outcome factors we investigated.

For further verification, we did a deep-dive statistical analysis of the data in Figures 4 and 5, treating the left and right sides of each graph in each figure (all data points < -0.6 and all those > +0.6) as two separate populations and performing a Wilcoxon rank sum test to compare them.

All of these comparisons are highly statistically significant, with the top two rows having p-values under 0.006 and the bottom two rows having p-values that round to 0.
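
For those unfamiliar with the Wilcoxon rank sum test, here is a minimal standard-library Python sketch of how it compares two populations. It is an illustration under stated assumptions, not the analysis code used for this article: the function names and toy scores are invented, and the p-value uses the large-sample normal approximation without a tie correction, which may differ slightly from what a statistics package reports.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(sample_a, sample_b):
    """Wilcoxon rank sum test; returns (z, two_sided_p).

    Uses the normal approximation with no tie correction (an assumption
    made to keep this sketch short).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    w = sum(ranks[:n_a])                      # rank sum of the first sample
    mean_w = n_a * (n_a + n_b + 1) / 2        # expected rank sum if no difference
    sd_w = math.sqrt(n_a * n_b * (n_a + n_b + 1) / 12)
    z = (w - mean_w) / sd_w
    return z, math.erfc(abs(z) / math.sqrt(2))

# Invented toy outcome scores for the two extremes of a crunch question.
low_crunch_scores = [85, 60, 90, 72, 66, 78]
high_crunch_scores = [30, 55, 20, 61, 45, 38]

z, p = rank_sum_test(low_crunch_scores, high_crunch_scores)
print(f"z = {z:.2f}, approximate p = {p:.4f}")
```

A positive z here means the first sample tends to rank higher than the second. For real work, a tested library routine such as SciPy's `scipy.stats.ranksums` is a safer choice than a hand-rolled version.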

It should be clear that our data set contradicts both of the testable hypotheses that we derived from the “extraordinary effort argument.” But before declaring victory for Paxton and the anti-crunch side, let’s take a look at the counter-argument.

The “Crunch Salvage Hypothesis”

The counter-argument goes something like this:

“Your correlation is bogus, because crunch is more likely to happen on projects that are in trouble in the first place. So there’s already an underlying correlation between crunch and struggling projects, and this is skewing your results. You seem to be saying that crunch causes poorer outcomes, but the causality actually works differently – there’s a third, hidden causal factor (“project being in trouble”) that causes both crunch and lower outcomes. And although crunch helps improve the situation, it’s never quite enough to compensate for the problems in the first place, which is why you get the negative correlation.”

This position warrants further investigation. As the Spector/Rubin interview linked above makes clear, there are some developers who are willing to demand crunch even in cases where their projects are not in trouble (“crunch will always exist in studios that strive for quality,” according to Spector), so it’s clear that at least in some cases, crunch is used on projects that are not yet having problems. But the notion that crunch is more likely on struggling projects is entirely plausible.

Let’s test this counter-argument. Let’s assume the causation is not A -> B but C -> (A and B), where “A”=crunch, “B”=poorer project outcomes, and “C” represents some vaguely-defined set of factors representing troubled projects.

We’ll call this the “crunch salvage hypothesis” – the idea that crunch is more likely to be used on projects in trouble, and that this “trouble” is itself the cause of the poorer project outcomes, and that when crunch is used in this way, it leads to outcomes that are less poor than would otherwise be the case.

We don’t really care about every part of this hypothesis: we’ll simply accept the first two parts (that trouble can arise on a project, and that crunch often happens as a reaction to this trouble) as self-evident truths (although whether they are correct or not isn't really relevant to this article).

What we really care about, and what we can test, is the third part of this hypothesis – that when crunch is used in this case, it leads to outcomes that are less poor than would otherwise be the case. In other words, if a project is in trouble, is crunch an effective response?

If the “crunch salvage hypothesis” is correct, then crunch should provide an improved project outcome score beyond what we would expect to see if crunch were not used, all else being equal.

In order to test this conjecture, we calculated a linear regression model that specifically excludes all 5 questions related to crunch/overtime. We’ll call this model the “crunch-free model.”

Figure 6. Correlations for the “crunch-free model” (a linear regression that excludes crunch-related questions) with aggregate game outcome scores.

This “crunch-free model” correlates with our overall outcome score with a correlation value of 0.811 (and a p-value under 0.001). This is, by any measure, an extremely strong correlation.

We then computed the crunch-free model’s error term – that is, we compared the actual aggregate outcome score to the predicted outcome score given by the crunch-free model for each response by subtracting the predicted score from the actual aggregate outcome score. A positive error value indicates that the project turned out better than the model predicted, while a negative error value indicates that the project turned out worse than it predicted.
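
The error-term calculation can be sketched as follows. This is a deliberately simplified illustration, not the study's model: the real crunch-free model is a linear regression over many survey questions, whereas this sketch assumes a single invented predictor (`team_factor`) standing in for all of the non-crunch inputs. All names and data are hypothetical.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def model_errors(predictor, actual):
    """Error term: actual outcome minus the outcome predicted by the fitted line."""
    slope, intercept = fit_line(predictor, actual)
    return [y - (slope * x + intercept) for x, y in zip(predictor, actual)]

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented toy data, one entry per respondent.
team_factor = [0.1, 0.3, 0.5, 0.7, 0.9, 0.2, 0.6, 0.8]          # stand-in for non-crunch inputs
outcome = [30, 45, 55, 70, 85, 40, 58, 72]                      # aggregate outcome score
crunch = [1.0, 0.33, -0.33, 0.0, -1.0, 0.67, -0.67, 0.33]       # agreement with a crunch question

errors = model_errors(team_factor, outcome)   # positive = better than predicted
lift = pearson_r(crunch, errors)              # does more crunch go with positive errors?
print(f"correlation between crunch and model error: {lift:+.2f}")
```

The quantity of interest is the sign and size of that final correlation: a clearly positive value would mean that projects with more crunch beat the model's predictions, while a value near zero or below would mean they did not.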

If we accept that the crunch-free predictive model is a good predictor of game outcomes (and the extremely high correlation and tiny p-value suggest that it is), then the “crunch salvage hypothesis” tells us to expect crunch to improve the outcomes of the game projects where it is used, at least to some tiny, observable extent … and the more it is used, the more it should improve game project outcomes.

In other words, if crunch works, it should provide a “lift,” and for projects that involved more crunch, we should see a positive error term (that is, game projects that crunched should have turned out better than the crunch-free model predicts), while for projects that involved little or no crunch, we should see a negative error term.

So according to this worldview, there should be a clear, positive correlation between more crunch and a greater positive error value for the crunch-free model.

Here is the correlation for the error term with the answers to each of the two primary crunch-related questions:

Figure 7. The two most relevant questions related to crunch, compared to the error value of the crunch-free model. The vertical axis is the error of the crunch-free model (positive = better than model predicts; negative = worse), and the horizontal axis indicates agreement with each question (-1.0 = disagree completely, +1.0 = agree completely).

As you can see, there is a slight negative correlation. However, it is not statistically significant (p-value = 0.24 for the upper graph, and 0.1 for the lower one). And even if it were statistically significant, the correlations – at -0.07 and -0.1, respectively – are negative.

So where the “crunch salvage hypothesis” tells us to expect correlations that are strong, positive, and statistically significant, we see correlations that are weak, negative, and statistically insignificant.

Testing all of the other crunch-related questions in this way gives us similar results.

If we accept the assumptions that went into calculating these correlations, then we must conclude that more crunch did not, to any extent that we can detect, help the projects in our study achieve better outcomes than they otherwise would have experienced … and in many ways appears to have actually made them worse.

We are left to conclude that crunch does not in any way improve game project outcomes and cannot help a troubled game project work its way out of trouble.

Voluntary Crunch

But what about when crunch is voluntary? Our analysis has already indicated that when crunch is entirely voluntary, outcomes significantly improve. Does a lack of mandatory crunch then eliminate the negative effects of the quantity of crunch? In other words, do higher levels of voluntary crunch then turn crunch from a net negative into a net positive?

In short, no. We compared the two extremes of our primary crunch question (we categorized the highest two answers to “I worked a lot of overtime …” as “High” crunch, and the lowest two as “Low” crunch) against our question about whether crunch was purely voluntary (where we condensed all 7 answers into three broad categories -- the top two as “Voluntary,” the bottom two as “Mandatory,” and the middle 3 as “Mixed”). We also compared these categories using Kruskal-Wallis to test for statistical significance.

Our analysis shows that although crunch seems to be significantly less harmful when it’s voluntary, low levels of crunch in each case above (voluntary, mandatory, and mixed) are consistently associated with better outcomes than high levels of crunch.
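
As a rough illustration of the Kruskal-Wallis comparison mentioned above, here is a standard-library Python sketch for exactly three groups. It is not the study's code: the function name and the toy scores are invented, and the p-value relies on the chi-square approximation, which for three groups (2 degrees of freedom) has the closed form exp(-H/2).

```python
import math
from collections import Counter

def kruskal_wallis_three_groups(group_a, group_b, group_c):
    """Kruskal-Wallis H test for three groups; returns (h, approximate_p).

    Ties receive average ranks and H is tie-corrected. The p-value uses the
    chi-square approximation with 2 degrees of freedom.
    """
    groups = [list(group_a), list(group_b), list(group_c)]
    pooled = [v for g in groups for v in g]
    n_total = len(pooled)
    counts = Counter(pooled)

    # Average rank of each distinct value in the pooled, sorted data.
    first_pos = {}
    for idx, v in enumerate(sorted(pooled)):
        first_pos.setdefault(v, idx + 1)
    rank_of = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}

    h = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * h - 3 * (n_total + 1)

    tie_correction = 1 - sum(c ** 3 - c for c in counts.values()) / (n_total ** 3 - n_total)
    if tie_correction == 0:
        raise ValueError("all values are identical; the test is undefined")
    h /= tie_correction
    return h, math.exp(-h / 2)  # chi-square survival function, 2 degrees of freedom

# Invented toy outcome scores for the three categories.
voluntary = [82, 75, 90, 68]
mixed = [60, 71, 55, 64]
mandatory = [40, 52, 35, 48]

h, p = kruskal_wallis_three_groups(voluntary, mixed, mandatory)
print(f"H = {h:.2f}, approximate p = {p:.4f}")
```

With samples this small the chi-square approximation is crude, so the toy p-value should not be read as meaningful. A library routine such as SciPy's `scipy.stats.kruskal` handles any number of groups.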

What Causes Crunch?

The conclusions above led us to ask: what actually causes crunch? The Spector/Rubin interview above clearly illustrates the attitudes that cause at least some developers to demand extended overtime, but we were curious what the data said.

If crunch doesn’t correlate with better outcomes, what does it correlate with? Does it really derive from a desire for excellence, or is it a reaction to a project being in trouble, or do its roots lie elsewhere?

To find out, we analyzed the correlations of all the input factors in our survey against one another, looking specifically at how factors outside of our group of five crunch-related questions correlated with the five crunch questions. The four strongest correlations with our crunch-related questions were:

+0.51: “There was a lot of turnover on this project.”
+0.50: “Team members would often work for weeks at a time without receiving feedback from project leads or managers.”
+0.49: “The team’s leads and managers did not have a respectful relationship with the team’s developers.”
-0.49: “The development plan for the game was clear and well-communicated to the team.”

(The three positive correlations indicate factors that were associated with more crunch; the negative correlation indicates a factor that was associated with less crunch.)
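
The ranking step above, correlating every non-crunch factor against crunch and sorting by strength, might look something like this in Python. It is a hypothetical sketch: the helper names, the factor labels, and the toy answers are all invented, and it assumes a single series of crunch answers per respondent, whereas the article does not say how the five crunch questions were combined.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def rank_by_correlation(crunch_answers, factors):
    """Return (factor_name, r) pairs sorted from strongest to weakest |r|."""
    scored = [(name, pearson_r(answers, crunch_answers))
              for name, answers in factors.items()]
    return sorted(scored, key=lambda pair: abs(pair[1]), reverse=True)

# Invented toy answers on the -1 .. +1 agreement scale, one per respondent.
crunch = [-1.0, -0.33, 0.0, 0.33, 0.67, 1.0]
factors = {
    "turnover": [-1.0, -0.67, 0.0, 0.0, 0.67, 1.0],
    "clear_plan": [1.0, 0.67, 0.33, 0.0, -0.33, -1.0],
    "unrelated": [0.33, -0.33, 0.33, -0.33, 0.33, -0.33],
}

ranked = rank_by_correlation(crunch, factors)
for name, r in ranked:
    print(f"{r:+.2f}: {name}")
```

Sorting by absolute value matters here, because a strongly negative correlation is just as informative as a strongly positive one.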

This seems to indicate that crunch does not, in fact, derive from any sort of fundamental drive for excellence, which would have resulted in higher correlations with completely different input factors on our survey. Rather, it appears to stem from inadequate planning, disorganization, high turnover, and a basic lack of respect for developers.

Conclusion: We Are Not a Unique And Special Snowflake

We should be clear that we are not attempting to write an academic paper, and our results have not been peer-reviewed. Therefore, we walk a fine line between analyzing the data and interpreting it.

However, no matter how we analyze our data, we find that it loudly and unequivocally supports the anti-crunch side. Our results are clear enough and strong enough that we believe it’s important to step over that fine line, and transition from objective analysis to open advocacy.

There is an extensive body of validated management research available showing that extended overtime harms health, productivity, relationships, morale, employee engagement, and decision-making ability, and that it even increases the risk of alcohol abuse.

An enormous amount of validated management research demonstrates that net employee productivity turns negative after just a few weeks of overtime. Total productivity actually declines by 16-20% as we increase our work days from 8 hours to 9 hours. Even just a few weeks of working 50 hours per week reduces cumulative output below what it would have been working only 40 hours per week – those 10 extra hours of work actually have a negative impact on productivity. All of that while also increasing employee stress, straining relationships, and increasing product defect rates.

However, the game industry is remarkably insular for such a cutting-edge and successful industry, and it seems generally unaware of this data. We tend to ignore such evidence or blithely assume it doesn't apply to us. As a broad generalization, our industry tends to value industry experience highly while undervaluing fundamental management skills. As a result, we usually promote managers from within while rarely offering the kind of management training that would enable insiders to perform their jobs adequately.

Is it any wonder, then, that we find ourselves completely cut off from the plethora of validated management research clearly showing that crunch is harmful?

The hundreds of anonymous respondents who participated in our survey answered various questions about game development factors and outcomes separately and individually, without any real clue as to the broader objectives of our study. Simply correlating their aggregate answers shows overwhelmingly that crunch is a net negative no matter how we analyze the data. It’s not even a case of small amounts of crunch being helpful and then turning harmful; we see no convincing evidence of hormesis.

It’s common knowledge that crunch leads to higher industry turnover and loss of critical talent, higher stress levels, increased health problems, and higher defect rates – and quite often, broken or deeply impaired personal relationships. Those who feel that crunch is justified freely admit to knowing this, but they don’t necessarily care about any of these harmful side-effects enough to avoid using it, as they continue to cling to the notion that “extraordinary results require extraordinary effort.”

However, this notion appears to be a fallacy, and our analysis suggests that if the industry is to mature, we must cast it aside.

Our results clearly demonstrate that crunch doesn't lead to extraordinary results. In fact, on the whole, crunch makes games LESS successful wherever it is used, and when projects try to dig themselves out of a hole by crunching, it only digs the hole deeper.

Perhaps the notion that “extraordinary results require extraordinary effort” is misguided.

Perhaps “effort” – as defined by working extra hours in an attempt to accomplish more – is actually counterproductive.

Our study seems to reveal that what actually generates “extraordinary results” – the factors that actually make great games great – has nothing to do with mere “effort” and everything to do with focus, team cohesion, a compelling direction, psychological safety, risk management, and a large number of other cultural factors that enhance team effectiveness.

And we suggest that abuse of overtime makes that level of focus and team cohesion increasingly difficult to achieve, eliminating any possible positive effects from overtime.

We welcome open discourse and debate on this subject. Anyone who wishes to double-check our results is welcome to download our data set and perform their own analysis and contact us at @GameOutcomes on Twitter with any questions.