The Game Outcomes Project, Part 1: The Best and the Rest
Blog version: http://wp.me/pBAPd-pz
Original article:
http://gamasutra.com/blogs/PaulTozour/20141216/232023/The_Game_Outcomes_Project_Part_1_The_Best_and_the_Rest.php
http://tinyurl.com/kwcza9y
Author: Paul Tozour
Traditional Chinese translation: NDark
20141216
Translator's note: This is a specialized statistics article; where the
translation is inaccurate, please defer to the original text.
This article is the first in a 4-part series. The remaining 3 articles will
be published in January 2015.
For extended notes on our survey methodology, see our Methodology blog page
(link).
The Game Outcomes Project team includes Paul Tozour, David Wegbreit, Lucien
Parsons, Zhenghua “Z” Yang, NDark Teng, Eric Byron, Julianna Pillemer, Ben
Weber, and Karen Buro.
The Methodology blog page, "Game Outcomes Project Methodology":
http://intelligenceengine.blogspot.tw/2014/11/game-outcomes-project-methodology-in.html
http://tinyurl.com/kdmcv7t
What makes the best teams so effective?
Veteran developers who have worked on many different teams often remark that
they see vast cultural differences between them. Some teams seem to run like
clockwork, and are able to craft world-class games while apparently staying
happy and well-rested. Other teams struggle mightily and work themselves to
the bone in nightmarish overtime and crunch of 80-90 hour weeks for years at
a time, or in the worst case, burn themselves out in a chaotic mess. Some
teams are friendly, collaborative, focused, and supportive; others are
unfocused and antagonistic. A few even seem to be hostile working
environments or political minefields with enough sniping and backstabbing to
put Team Fortress 2 to shame.
What causes the differences between those teams? What factors separate the
best from the rest?
As an industry, are we even trying to figure that out?
Are we even asking the right questions?
These are the kinds of questions that led to the development of the Game
Outcomes Project. In October and November of 2014, our team conducted a
large-scale survey of hundreds of game developers. The survey included
roughly 120 questions on teamwork, culture, production, and project
management. We suspected that we could learn more from a side-by-side
comparison of many game projects than from any single project by itself, and
we were convinced that finding out what great teams do that lesser teams
don’t do – and vice versa – could help everyone raise their game.
Our survey was inspired by several of the classic works on team
effectiveness. We began with the 5-factor team effectiveness model described
in the book Leading Teams: Setting the Stage for Great Performances. We also
incorporated the 5-factor team effectiveness model from the famous management
book The Five Dysfunctions of a Team: A Leadership Fable and the 12-factor
model from 12: The Elements of Great Managing, which is derived from
aggregate Gallup data from 10 million employee and manager interviews. We
felt certain that at least one of these three models would surely turn out to
be relevant to game development in some way.
We also added several categories with questions specific to the game industry
that we felt were likely to show interesting differences.
On the second page of the survey, we added a number of more generic
background questions. These asked about team size, project duration, job
role, game genre, target platform, financial incentives offered to the team,
and the team’s production methodology.
We then faced the broader problem of how to quantitatively measure a game
project’s outcome.
Ask any five game developers what constitutes “success,” and you’ll likely
get five different answers. Some developers care only about the bottom line;
others care far more about their game’s critical reception. Small indie
developers may regard “success” as simply shipping their first game as
designed regardless of revenues or critical reception, while developers
working under government contract, free from any market pressures, might
define “success” simply as getting it done on time (and we did receive a
few such responses in our survey).
Lacking any objective way to define “success,” we decided to quantify the
outcome through the lenses of four different kinds of outcomes. We asked the
following four outcome questions, each with a 6-point or 7-point scale:
"To the best of your knowledge, what was the game's financial return on
investment (ROI)? In other words, what kind of profit or loss did the company
developing the game take as a result of publication?"
"For the game's primary target platform, was the project ever delayed
from its original release date, or was it cancelled?"
"What level of critical success did the game achieve?"
"Finally, did the game meet its internal goals? In other words, to what
extent did the team feel it achieved something at least as good as it was
trying to create?"
We hoped that we could correlate the answers to these four outcome questions
against all the other questions in the survey to see which input factors had
the most actual influence over these four outcomes. We were somewhat
concerned that all of the “noise” in project outcomes (fickle consumer
tastes, the moods of game reviewers, the often unpredictable challenges
inherent in creating high-quality games, and various acts of God) would make
it difficult to find meaningful correlations. But with enough responses,
perhaps the correlations would shine through the inevitable noise.
We then created an aggregate “outcome” value that combined the results of
all four of the outcome questions as a broader representation of a game
project’s level of success. This turned out to work nicely, as it
correlated very strongly with the results of each of the individual outcome
questions. Our Methodology blog page has a detailed description of how we
calculated this aggregate score.
We worked carefully to refine the survey through many iterations, and we
solicited responses through forum posts, Gamasutra posts, Twitter, and IGDA
mailers. We received 771 responses, of which 302 were completed, and 273
were related to completed projects that were not cancelled or abandoned in
development.
The Results
So what did we find?
In short, a gold mine. The results were staggering.
More than 85% of our 120 questions showed a statistically significant
correlation with our aggregate outcome score, with a p-value under 0.05 (the
p-value gives the probability of observing such data as in our sample if the
variables were truly independent; therefore, a small p-value can be
interpreted as evidence against the assumption that the data is independent).
This correlation was moderate or strong in most cases (absolute value >
0.2), and most of the p-values were in fact well below 0.001. We were even
able to develop a linear regression model that showed an astonishing 0.82
correlation with the combined outcome score (shown in Figure 1 below).
(Translator's note: for background on p-values, see
http://en.wikipedia.org/wiki/P-value )
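To make the p-value definition above concrete, here is a minimal sketch of a permutation test for a correlation. It is not the procedure the project team used (the article does not say how its p-values were computed), and the data below is synthetic, not the survey data. The function names, the 1-to-7 answer scale, and the noise levels are assumptions chosen only for illustration.

```python
import random
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def permutation_p_value(xs, ys, n_permutations=1000, seed=0):
    """Two-sided p-value: the share of random shuffles of ys whose |r|
    is at least as large as the |r| observed in the real pairing."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # breaks any real link between xs and ys
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Synthetic stand-in for 273 responses (NOT the survey data).
rng = random.Random(42)
answers = [rng.randint(1, 7) for _ in range(273)]           # hypothetical survey answers
outcome = [10 * a + rng.gauss(0, 25) for a in answers]      # depends on the answers
unrelated = [rng.gauss(50, 18) for _ in range(273)]         # independent of the answers

p_related = permutation_p_value(answers, outcome)
p_unrelated = permutation_p_value(answers, unrelated)
print(f"related: p = {p_related:.4f}, unrelated: p = {p_unrelated:.4f}")
```

If the two variables were independent, shuffling one of them should produce correlations as large as the observed one fairly often; a small p-value says that almost never happens.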
Figure 1. Our linear regression model (horizontal axis) plotted against the
composite game outcome score (vertical axis). The black diagonal line is a
best-fit trend line. 273 data points are shown. (Images are not included
here; see the original article.)
To varying extents, all three of the team effectiveness models (Hackman's
“Leading Teams” model, Lencioni's “Five Dysfunctions” model, and the Gallup
“12” model) proved to correlate strongly with game project outcomes.
We can’t say for certain how many relevant questions we didn’t ask. There
may well be many more questions waiting to be asked that would have shined an
even stronger light on the differences between the best teams and the rest.
But the correlations and statistical significance we discovered are strong
enough that it’s very clear that we have, at the very least, discovered an
excellent partial answer to the question of what makes the best game
development teams so successful.
The Game Outcomes Project Series
Due to space constraints, we’ll be releasing our analysis as a series of
several articles, with the remaining 3 articles released at 1-week intervals
beginning in January 2015. We’ll leave off detailed discussion of our three
team effectiveness models until the second article in our series to allow
these topics the thorough analysis they deserve.
This article will focus solely on introducing the survey and combing through
the background questions asked on the second survey page. And although we
found relatively few correlations in this part of the survey, the areas where
we didn’t find a correlation are just as interesting as the areas where we
did.
Project Genre and Platform Target(s)
First, we asked respondents to tell us what genre of game their team had
worked on. Here, the results are all across the board.
Figure 2. Game genre (horizontal axis) vs. composite game outcome score
(vertical axis). Higher data points (green dots) represent more successful
projects, as determined by our composite game outcome score.
We see remarkably little correlation between game genre and outcome. In the
few cases where a game genre appears to skew in one direction or another, the
sample size is far too small to draw any conclusions, with all but a handful
of genres having fewer than 30 responses.
(Note that Figure 2 uses a box-and-whisker plot.)
We also asked a similar question regarding the product’s target platform(s),
including responses for desktop (PC or Mac), console (Xbox/PlayStation),
mobile, handheld, and/or web/Facebook. We found no statistically significant
results for any of these platforms, nor for the total number of platforms a
game targeted.
Project Duration and Team Size
We asked about the total months and years in development; based on this, we
were able to calculate each project’s total development time in months:
Figure 3. Total months in development (horizontal axis) vs. game outcome
score (vertical axis). The black diagonal line is a trend line.
As you can see, there’s a small negative correlation (-0.229, using the
Spearman correlation coefficient), and the p-value is 0.003. This negative
correlation is not too surprising, as troubled projects are more likely to be
delayed than projects that are going smoothly.
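For readers unfamiliar with the Spearman coefficient mentioned above, the following is a minimal sketch of how it can be computed: rank both variables (averaging ranks for ties), then take the ordinary correlation of the ranks. The month and score values are made up for illustration and are not taken from the survey.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_rank = (len(rx) + 1) / 2
    cov = sum((a - mean_rank) * (b - mean_rank) for a, b in zip(rx, ry))
    var_x = sum((a - mean_rank) ** 2 for a in rx)
    var_y = sum((b - mean_rank) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical projects: months in development and outcome score.
months = [6, 12, 18, 24, 36, 48]
scores = [70, 72, 60, 65, 50, 40]
rho = spearman_rho(months, scores)
print(f"Spearman rho = {rho:.3f}")
```

Because it works on ranks rather than raw values, the coefficient is less sensitive to a few very long projects than a plain linear correlation would be.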
We also asked about the size of the team, both in terms of the average team
size and the final team size. Average team size was between 1 and 11 with an
average of 5.7; final team size was between 1 and 500 with an average of
48.6. Both showed a slight positive correlation with project outcomes, as
shown below, but in both cases the p-value is over 0.1, indicating there’s
not enough statistical significance to make this correlation useful or
noteworthy. We suspect that the small positive correlation can be explained
by the fact that a struggling project is less likely to receive additional
resources over time than one that’s going well. So the result is not too
surprising.
Figure 4. Average team size correlated against game project outcome
(vertical axis).
Figure 5. Final team size correlated against game project outcome (vertical
axis).
Figure 6. Percent change in team size (final divided by average) correlated
against game project outcome (vertical axis).
Game Engines
We asked about the technology solution used: whether it was a new engine
built from scratch; core technology from a previous version of a similar game
or another game in the same series; an in-house / proprietary engine (such as
EA Frostbite); or an externally-developed engine (such as Unity, Unreal, or
CryEngine).
The results are as follows:
Figure 7. Game engine / core technology used (horizontal axis) vs. game
project outcome (vertical axis), using a box-and-whisker plot.
Engine / core technology                                    Average           Standard    Number of
                                                            composite score   deviation   responses
New engine/tech                                             53.3              18.3        41
Engine from previous version of same or similar game        64.8              15.8        58
Internal/proprietary engine / tech (such as EA Frostbite)   60.7              19.4        46
Licensed game engine (Unreal, Unity, etc.)                  55.6              17.5        113
Other                                                       55.5              19.5        15
The results here are less striking the more you look at them. The highest
score was for projects that used an engine from a previous version of the
same game or a similar one – but that’s exactly what one would expect to be
the case, given that teams in this category clearly already had a head start
in production, much of the technical risk had already been stamped out, and
there was probably already a veteran team in place that knew how to make that
type of game!
We analyzed these results using a Kruskal-Wallis one-way analysis of
variance, and we found that this question was only statistically significant
on account of that very option (engine from a previous version of the same
game or similar), with a p-value of 0.006. Removing the data points related
to this answer category caused the p-value for the remaining categories to
shoot up above 0.3.
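As a rough illustration of the Kruskal-Wallis test named above, the sketch below computes the H statistic (with the standard tie correction) and an approximate p-value from the chi-square distribution. It is a self-contained teaching version under stated assumptions, not the team's analysis code: the helper names are invented here, the chi-square tail uses a closed form that only holds for integer degrees of freedom, and the three small groups are toy data.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def chi2_sf(x, df):
    """Upper-tail probability of the chi-square distribution (integer df)."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    total = sum(half ** (i - 0.5) / math.gamma(i + 0.5)
                for i in range(1, (df - 1) // 2 + 1))
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total


def kruskal_wallis(groups):
    """Return (H, approximate p-value) for a list of samples.
    Assumes the pooled data is not all one identical value."""
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    h /= 1 - ties / (n ** 3 - n)
    return h, chi2_sf(h, len(groups) - 1)


# Toy data: clearly separated groups vs. thoroughly interleaved groups.
h_apart, p_apart = kruskal_wallis([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
h_mixed, p_mixed = kruskal_wallis([[1, 4, 7], [2, 5, 8], [3, 6, 9]])
print(f"separated: H = {h_apart:.2f}, p = {p_apart:.3f}")
print(f"interleaved: H = {h_mixed:.2f}, p = {p_mixed:.3f}")
```

The chi-square approximation is coarse for groups this small; it is more reasonable at the group sizes reported in the table above.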
Our interpretation of the data is that the best option for the game engine
depends entirely on the game being made and what options are available for
it, and that any one of these options can be the “best” choice given the
right set of circumstances. In other words, the most reasonable conclusion
is there is no universally “correct” answer separate from the actual game
being made, the team making it, and the circumstances surrounding the game's
development. That’s not to say the choice of engine isn’t terrifically
important, but the data clearly shows that there are plenty of successes and
failures in all categories with only minimal differences in outcomes between
them, clearly indicating that each of these four options is entirely viable
in some situations.
We also did not ask which specific technology solution a respondent’s dev
team was using. Future versions of the study may include questions on the
specific game engine being used (Unity, Unreal, CryEngine, etc.)
Team Experience
We also asked a question on this page regarding the team’s average
experience level, along a scale from 1 to 5 (with a ‘1’ indicating less
than 2 years of average development experience, and a ‘5’ indicating a team
of grizzled game industry veterans with an average of 8 or more years of
experience).
Figure 8. Team experience level ranking (horizontal axis, by category listed
above) mapped against game outcome score (vertical axis).
Here, we see a correlation of 0.19 (and p-value under 0.001). Note in
particular the complete absence of dots in the upper-left corner (which would
indicate wildly successful teams with no experience) and the lower-right
corner (which would indicate very experienced teams that failed
catastrophically).
So our study clearly confirms the common knowledge in the industry that
experienced teams are significantly more likely to succeed. This is not at
all surprising, but it's reassuring that the data makes the point so clearly.
And as much as we may all enjoy stories of random individuals with minimal game
development experience becoming wildly successful with games developed in
just a few days (as with Flappy Bird), our study shows clearly that such
cases are extreme outliers.
The Surprises: Production and Incentives
This first page of our survey also revealed two major surprises.
The first surprise was financial incentives. The survey included a question:
“Was the team offered any financial incentives tied to the performance of
the game, the team, or your performance as individuals? Select all that
apply.” We offered multiple check boxes to say “yes” or “no” to any
combination of financial incentives that were offered to the team.
The correlations are as follows:
Figure 9. Incentives (horizontal axis) plotted against game outcome score
(vertical axis) for the five different types of financial incentives, using a
box-and-whisker plot. From left to right: incentives based on individual
performance, team performance, royalties, incentives based on game
reviews/MetaCritic scores, and miscellaneous other incentives. For each
category, we split all 273 data points into those excluding the incentive
(left side of each box) and those including the incentive (right side of each
box).
Of these five forms of incentives, only individual incentives showed
statistical significance. Game projects offering individually-tailored
compensation (64 out of the 273 responses) had an average score of 63.2
(standard deviation 18.6), while those that did not offer individual
compensation had a mean game outcome score of 56.5 (standard deviation 17.7).
A Wilcoxon rank-sum test for individual incentives gave a p-value of 0.017
for this comparison.
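The summary figures quoted above are enough for a rough back-of-the-envelope check of how large the individual-incentive effect is. The sketch below computes Welch's t statistic and Cohen's d from the reported means, standard deviations, and group sizes. Note that this is a different, parametric calculation from the Wilcoxon rank-sum test the article reports, and it assumes the group without individual incentives contains the remaining 273 - 64 = 209 responses.

```python
import math


def welch_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Welch's t statistic for two groups described by summary statistics."""
    standard_error = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return (mean_a - mean_b) / standard_error


def cohens_d(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Standardized mean difference using the pooled standard deviation."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


# (mean outcome score, standard deviation, number of responses) from the article.
with_incentive = (63.2, 18.6, 64)
without_incentive = (56.5, 17.7, 273 - 64)  # assumed: all remaining responses

t_stat = welch_t(*with_incentive, *without_incentive)
effect = cohens_d(*with_incentive, *without_incentive)
print(f"Welch t = {t_stat:.2f}, Cohen's d = {effect:.2f}")
```

A standardized difference in this range is usually described as small to moderate, which is in line with the article's remark that the effect is real but smaller than one might expect.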
All the other forms of incentives – those based on team performance, based
on royalties, based on reviews and/or MetaCritic ratings, and any
miscellaneous “other” incentives – show p-values that indicate that there
was no meaningful correlation with project outcomes (p-values 0.33, 0.77,
0.98, and 0.90, respectively, again using a Wilcoxon rank-sum test).
This is a very surprising finding. Incentives are usually offered under the
assumption that they are a huge motivator for a team. However, our results
indicate that only individual incentives seem to have the desired effect, and
even then, to a much smaller degree than expected.
One possible explanation is that perhaps the psychological phenomenon
popularized by Dan Pink may be playing itself out in the game industry –
that financial rewards are (according to a great deal of recent research)
usually a completely ineffective motivational tool, and actually backfire in
many cases.
We also speculate that in the case of royalties and MetaCritic reviews in
particular, the sense of helplessness that game developers can feel when
dealing with factors beyond their control – such as design decisions they
disagree with, or other team members falling down on the job – potentially
offsets any motivating effect that incentives may have had. With
individual incentives, on the other hand, individuals may feel that their
individual efforts are more likely to be noticed and rewarded appropriately.
However, without more data, this all remains pure speculation on our part.
Whatever the reason, our results seem to indicate that individually tailored
incentives, such as Pay For Performance (PFP) plans, seem to achieve
meaningful results where royalties, team incentives, and other forms of
financial incentives do not.
Our second big surprise was in the area of production methodologies, a topic
of frequent discussion in the game industry.
We asked what production methodology the team used – 0 (don’t know), 1
(waterfall), 2 (agile), 3 (agile using “Scrum”), and 4 (other/ad-hoc). We
also provided a detailed description with each answer so that respondents
could pick the closest match according to the description even if they didn’t
know the exact name of the production methodology. The results were
shocking.
Figure 10. Production methodology vs. game outcome score.
Here's a more detailed breakdown showing the mean and standard deviation for
each category, along with the number of responses in each:
Production methodology   Average           Standard    Number of
                         composite score   deviation   responses
Unknown                  50.6              17.4        7
Waterfall                55.4              17.9        53
Agile                    59.1              19.4        94
Agile using Scrum        59.7              16.9        75
Other / Ad-hoc           57.6              17.6        44
What’s remarkable is just how tiny these differences are. They almost don’t
even exist.
Furthermore, a Kruskal-Wallis H test indicates a very high p-value of 0.46
for this category, meaning that we truly can’t infer any relationship
between production methodology and game outcome. Further testing of the
production methodology against each of the four game project outcome factors
individually gives identical results.
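One way to see how small these differences are is to ask what share of the total variation in outcome scores lies between the methodology groups rather than within them. The sketch below estimates that share (eta squared) from the summary table alone. This is a parametric, ANOVA-style calculation offered only as an illustration; it is not the Kruskal-Wallis test the article reports, and it relies on the rounded means and standard deviations shown in the table.

```python
# (methodology, mean composite score, standard deviation, responses) from the table.
rows = [
    ("Unknown", 50.6, 17.4, 7),
    ("Waterfall", 55.4, 17.9, 53),
    ("Agile", 59.1, 19.4, 94),
    ("Agile using Scrum", 59.7, 16.9, 75),
    ("Other / Ad-hoc", 57.6, 17.6, 44),
]


def eta_squared(summary_rows):
    """Share of total variance explained by group membership,
    reconstructed from per-group means, standard deviations, and sizes."""
    total_n = sum(n for _, _, _, n in summary_rows)
    grand_mean = sum(m * n for _, m, _, n in summary_rows) / total_n
    # Between-group sum of squares: spread of group means around the grand mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for _, m, _, n in summary_rows)
    # Within-group sum of squares: recovered from each group's sample variance.
    ss_within = sum((n - 1) * sd ** 2 for _, _, sd, n in summary_rows)
    return ss_between / (ss_between + ss_within)


total_responses = sum(n for _, _, _, n in rows)
share = eta_squared(rows)
print(f"{total_responses} responses, eta squared = {share:.3f}")
```

Under these assumptions, production methodology accounts for only around one percent of the variation in outcome scores, with nearly all of the spread occurring inside each group.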
Given that production methodologies seem to be a game development holy grail
for some, one would expect to see major differences, and that Scrum in
particular would be far out in the lead. But these differences are tiny,
with a huge amount of variation in each category, and the correlations
between the production methodology and the score have a p-value too high for
us to reject the assumption that the data is independent. Scrum, agile, and
“other” in particular are essentially indistinguishable from one another.
“Unknown” is far higher than one would expect, while “Other/ad-hoc” is also
remarkably high, indicating that there are effective production methodologies
available that aren’t on our list (interestingly, we asked those in the
“other” category for more detail, and the Cerny method was listed as the
production methodology for the top-scoring game project in that category).
Also, unlike our question regarding game engines, we can't simply write this
off as some methodologies being more appropriate for certain kinds of teams.
Production methodologies are generally intended to be universally useful,
and our results show no meaningful correlations between the methodology and
the game genre, team size, experience level, or any other factors.
This raises the question: where’s the payoff?
We’ve seen several significant correlations in this article, and we will
describe many more throughout our study. Articles 2 and 3 in particular will
illustrate many remarkable correlations between many different cultural
factors and game outcomes, with more than 85% of our questions showing a
statistically significant correlation.
So it’s very clear that where there were significant drivers of project
outcomes, they stood out very clearly. Our results were not shy. And if the
specific production methodology a team uses is really vitally important, we
would expect that it absolutely should have shown up in the outcome
correlations as well.
But it’s simply not there.
It seems that in spite of all the attention paid to the subject, the
particular type of production methodology a team uses is not terribly
important, and it is not a significant driver of outcomes. Even the
much-maligned “Waterfall” approach can apparently be made to work well.
Our third article will detail a number of additional questions we asked
around production that give some hints as to what aspects of production
actually impact project outcomes regardless of the specific methodology the
team uses.