16 December 2016

Why I can't recommend Git for game projects.

As mentioned before on this blog, I have worked with many source control systems in the past 15 years: SVN, CVS (boy, that's old; now that I look up that link, I was a student when I last had to use it), Plastic SCM, Git and Perforce.

At work, the idea of using Git for our projects has come up. Most of these projects are websites, written in PHP, JavaScript, HTML and CSS. I am very fond of the idea of using Git for those; it opens many doors to an easier life.

We can use Travis CI, we can deploy everything automagically on AWS, we can work offline and, most importantly, we can branch all we want.

Definitely read this article, "A successful Git branching model". This branching model, known as "Git-Flow", does indeed ensure that you have very few merge conflicts. It is the basic model that is also recommended by Sourcetree.

Basically, it boils down to this:

  • There is a master branch
  • There is a dev branch
  • Create branches per feature, only merge them with dev
  • Create branches per release, merge them with dev and master
  • Create branches per hotfix, merge them with master and with dev (or the active release branch)
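
The merge rules in the list above can be sketched as a small lookup. This is only an illustration: the branch prefixes ("feature/", "release/", "hotfix/") and the `active_release` parameter are assumptions, not something the model prescribes in this form. As in the example further down, a hotfix also goes back to master.

```python
# Sketch only: the prefixes and the `active_release` parameter are assumptions.
MERGE_TARGETS = {
    "feature": ["dev"],            # features only ever go to dev
    "release": ["dev", "master"],  # releases go to dev and master
}


def merge_targets(branch, active_release=None):
    """Return the branches that `branch` gets merged into."""
    kind = branch.split("/", 1)[0]
    if kind == "hotfix":
        # A hotfix goes to master, plus the active release branch if there
        # is one, otherwise dev.
        return ["master", active_release or "dev"]
    if kind in MERGE_TARGETS:
        return list(MERGE_TARGETS[kind])
    raise ValueError(f"not a feature, release or hotfix branch: {branch!r}")


print(merge_targets("feature/treehouse"))
print(merge_targets("hotfix/apples"))
print(merge_targets("hotfix/apples", active_release="release/1.0"))
```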

And that looks like this:

I believe this is a decent model for using git. In web development.

In game development, however, I don't think it's the right tool for the job. Since the discussion about using Git started at work, I have found myself defending this point of view more and more often. I believe Git isn't fit for game development, but it is hard to come up with a compelling case for why that would be when challenged on the spot.

Notice that I write "I don't think it's the right tool". I am not sure. And by writing this article and doing research for it, I hope to be more sure and build a good argument for this thesis.

The situation

First, try and follow me on this. Say you have a Photoshop file, a psd. This psd file consists of one hand-painted layer. This psd is the product that we want to ship, build features for, fix mistakes in, etc.

(I agree this is a contrived example, since you would never create a psd like that, but I'll make an analogy later that will justify this.)

I want this psd in source control, so I don't lose any progress. Thus, first commit on the master branch, we push to dev and start working on the psd. We draw a landscape, we add a tree, we add the sky. Cool, first version is finished, we merge with master.

We start working on the next feature: a tree house in the tree. We do this on a feature branch.

[Figure: Feature]

But! We also need to hotfix the release while we're working on the treehouse! The tree was required to have leaves and seven apples, but we forgot. So we create a hotfix branch to add the apples, and merge that back to master and to dev.

[Figure: Hotfix]

Up until now, no problemo. But when we want to merge the feature branch with the treehouse into dev, we have an issue: binary files don't merge, so we need to choose one of the two, the apples or the treehouse. And even if binary files could be merged, how would you merge hand-painted strokes in an image? It is probably best if we just redo the apples on the feature branch and merge that into dev.

[Figure: Do we just multiply?]
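
To make the "binary files don't merge" point concrete, here is a minimal sketch of a three-way merge that treats a file as an opaque blob, which is roughly all a version control system can do with a flat psd. The function name and the byte strings are made up for illustration; this is not Git's actual implementation.

```python
from typing import Optional


def merge_blob(base: bytes, ours: bytes, theirs: bytes) -> Optional[bytes]:
    """Three-way merge of a file that can only be compared as a whole.

    Returns the merged content, or None when both sides changed the file
    in different ways and somebody has to pick one version.
    """
    if ours == theirs:
        return ours    # both sides agree
    if ours == base:
        return theirs  # only their side changed
    if theirs == base:
        return ours    # only our side changed
    return None        # both changed: conflict, no automatic result


# Hypothetical stand-ins for the painted psd at different points in time.
first_release = b"landscape+tree+sky"
dev_with_apples = b"landscape+tree+sky+apples"
feature_with_treehouse = b"landscape+tree+sky+treehouse"

# The hotfix alone merges fine, because dev had not touched the file.
print(merge_blob(first_release, first_release, dev_with_apples))

# The feature branch and dev both changed the psd: nothing to return.
print(merge_blob(first_release, dev_with_apples, feature_with_treehouse))
```

For text files the same decision is made per region of lines instead of per file, which is why code usually merges and a painting does not.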

A possible "solution"

One psd file containing the whole project clearly won't work, so let's split the features in the file into layers, and let each layer reference another psd in which we draw each separate feature. To follow the same example, we start with a psd in the master branch, push into dev and add a layer with the landscape, a layer with the sky and a layer with the tree. This goes back to master, first version released.

Now the same thing happens: we start a feature branch where we add a tree house. We also do the hotfix and create a layer with apples. Both get merged into dev but we have a merge conflict. We need to add both the apples layer and the treehouse layer into the same file. But which one goes on top of the other? It may very well be that there is a possible order, but it could also be that this doesn't work either way:

[Figure: No 7 apples]
[Figure: The house is behind the leaves]
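
The ordering problem can be written down as a tiny constraint check. This is a sketch under assumptions: the layer names and the two "must be above" rules are my reading of the two pictures, where the apples layer also holds the leaves.

```python
from itertools import permutations

# Hypothetical constraints: (upper, lower) means `upper` must be drawn
# above `lower`.
MUST_BE_ABOVE = [
    ("apples", "treehouse"),  # otherwise the house hides some of the 7 apples
    ("treehouse", "apples"),  # otherwise the house ends up behind the leaves
]


def valid_orders(layers, constraints):
    """Return every bottom-to-top layer order that satisfies all constraints."""
    result = []
    for order in permutations(layers):
        height = {name: index for index, name in enumerate(order)}
        if all(height[upper] > height[lower] for upper, lower in constraints):
            result.append(order)
    return result


# With both requirements, no order works: the result is an empty list.
print(valid_orders(["apples", "treehouse"], MUST_BE_ABOVE))

# Drop one requirement and exactly one order remains.
print(valid_orders(["apples", "treehouse"], MUST_BE_ABOVE[:1]))
```

No merge tool can resolve this; someone has to repaint one of the layers.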

It's the same for games

Now the idea is as follows: the psd is a lot like a game. In fact, the situation for a game is a lot worse than with the psd.

  • A game consists of many textures, which all have the same problem as the layer-less psd. They can't be merged.
  • Animations: ditto
  • Models: ditto
  • Audio files: ditto
  • Prefabs: ditto
  • Scenes: ditto

Prefabs and scenes are a lot like the psd with the layers: they reference other assets. Even if you were able to merge scenes and prefabs, changing parts of a scene or a prefab in different branches can easily break the scene or prefab.

Point is, code files and text files are easily merged, but colors, images, shapes and sound are impossible to merge in a reliable way.

Are there any solutions?

(I focus on Unity here, but most of the items here are applicable to all other game engines too.)

  • A typical workaround is to submit these files only when they're completely done, so you will never need to merge them. But this is only a workaround: when you do need an unforeseen change, you're back in the same situation. And if the files never change, why are they in version control in the first place?
  • Another workaround is to try to have as many files as possible in a mergeable format. Scenes and prefabs, for example, can be stored as text instead of binary data. With Unity's SmartMerge you can even easily merge the complex YAML files in the editor! You would have to, because a merge conflict in Git makes the YAML file unparseable, and you can no longer review the changes in the editor.
    This tool is a godsend, but it still won't fix all issues. It can make a file parseable again, but it won't stop treehouses being placed in the wrong tree. The artist will have to redo a part of his work.
  • Another often-used workaround is to put as little as possible in a scene and split everything into small prefabs. This will indeed avoid many problems, but it can be very cumbersome to manage. And it's just another case of the psd with layers: at some point the parts won't mix and you'll have to redo work.
  • The same article presents another workaround: ensure that only one person at a time can work on a given file. Git has no support for this. And even if it did (as Perforce does), that still wouldn't make it impossible to change the same file in different branches. Even Perforce can't prevent that.
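
The last point, that exclusive checkouts don't help across branches, can be illustrated with a toy lock table. This is a simplified model and not Perforce's API: I assume a lock is held per file per branch, so the same scene in another branch counts as a different file.

```python
class ExclusiveLocks:
    """Toy model of exclusive checkouts, keyed by (branch, path)."""

    def __init__(self):
        self._owner = {}  # (branch, path) -> user holding the lock

    def checkout(self, user, branch, path):
        """Try to take the lock; return True on success."""
        holder = self._owner.get((branch, path))
        if holder is not None and holder != user:
            return False  # somebody else is editing this file in this branch
        self._owner[(branch, path)] = user
        return True

    def checkin(self, user, branch, path):
        """Release the lock if `user` holds it."""
        if self._owner.get((branch, path)) == user:
            del self._owner[(branch, path)]


locks = ExclusiveLocks()
print(locks.checkout("alice", "dev", "Scenes/Main.unity"))   # True
print(locks.checkout("bob", "dev", "Scenes/Main.unity"))     # False
print(locks.checkout("bob", "master", "Scenes/Main.unity"))  # True
```

Alice and Bob never collide inside one branch, yet both now hold a changed copy of the same scene that has to be reconciled when the branches are integrated.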

Other issues

  • Games can grow into large asset bases and codebases. Kweetet, for example, has more than 300K files, 25 GB in size on the client and more than 250 GB on the server. And that's only counting the game, without all related websites, assets, scripts, etc. Divinity II also has more than 300K files, 70 GB in size on the client and more than 500 GB on the server. As this article points out, Git needs to examine all files to check whether they changed, so it is not ideal for large codebases. (Actually, reading that article, I might want to try Mercurial!)
  • Another often-heard issue is that Git can't handle big files very well. Since every client stores the entire history of every file, a repository with many large files can grow out of control. This has recently been solved with git-lfs, which stores only the latest revision of a file on the client and keeps the rest in a centralized location. (Thus breaking the whole decentralized idea of Git...)
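
A rough back-of-the-envelope sketch of the large-file point. The numbers are invented, and the calculation assumes the worst case in which every revision of a binary file is stored in full (no useful delta compression), so treat it as an upper bound and not as a measurement.

```python
def client_storage_mb(files):
    """Estimate client-side storage for a set of binary files.

    `files` holds (size_mb_per_revision, revision_count) pairs.
    Returns (full_history_mb, latest_only_mb).
    """
    files = list(files)
    full_history = sum(size * revisions for size, revisions in files)
    latest_only = sum(size for size, _revisions in files)
    return full_history, latest_only


# Hypothetical project: 100 textures of 50 MB, each saved 20 times.
textures = [(50, 20)] * 100
full, latest = client_storage_mb(textures)
print(full, latest)  # 100000 5000
```

Under these assumptions, a client holding the full history needs about 100 GB, while one that keeps only the latest revisions needs about 5 GB.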

Conclusion

In conclusion, I cannot recommend Git to be used for large game development projects with medium to large teams. Large projects with many large binary files cannot be easily merged, so you need to keep branching to a minimum. On the other hand, if you have a small and short-lived game project and a small team (fourish people) Git would be a valid choice.

So then what?

So what do we do for Kweetet? We use Perforce for version control and have two branches: master and dev. We only develop on dev, and work in cycles from stable dev state to stable dev state. Each time the dev branch is stable (usually when a new feature is ready), we integrate to the master branch. We clean up any issues we find on the master branch and integrate those back into dev. These fixes are mostly small, so we rarely have conflicts. In practice, we only work on one branch.

One of the advantages of using Perforce is that the developers can open a scene exclusively, do what they need to do and check it back in. That way, there are never conflicts in the scenes.

Another big advantage is that Perforce does not need to check which files changed; you need to tell Perforce that. Of course you don't do that by hand: all our tools do this automatically when we change a file. Unity, Visual Studio, Notepad++, Sublime and others all have Perforce plugins.
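
The difference between the two approaches can be sketched as follows. Both halves are simplified models with made-up names: the "scan" stands in for a tool that has to look at every file to find changes, and `OpenedFiles` stands in for tools announcing an edit up front. Neither is the real Git or Perforce implementation.

```python
def scan_for_changes(recorded, working):
    """Compare every file's fingerprint with the recorded one.

    Returns (changed_paths, number_of_files_examined).
    """
    examined = 0
    changed = []
    for path, fingerprint in working.items():
        examined += 1
        if recorded.get(path) != fingerprint:
            changed.append(path)
    return sorted(changed), examined


class OpenedFiles:
    """Changes are registered when an editor opens a file for edit."""

    def __init__(self):
        self._opened = set()

    def open_for_edit(self, path):
        self._opened.add(path)

    def changed(self):
        return sorted(self._opened)


# Hypothetical workspace of 300K files in which one texture was edited.
recorded = {f"Assets/tex_{i}.png": 1 for i in range(300_000)}
working = dict(recorded)
working["Assets/tex_42.png"] = 2

changed, examined = scan_for_changes(recorded, working)
print(changed, examined)  # ['Assets/tex_42.png'] 300000

opened = OpenedFiles()
opened.open_for_edit("Assets/tex_42.png")
print(opened.changed())   # ['Assets/tex_42.png']
```

Both find the same change, but the scan's cost grows with the size of the workspace, while the explicit list only grows with the number of files actually edited.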

If you know of a better way to deal with the issues I mentioned in this post, I'm more than happy to hear it, because I would love to give Git a second chance, or improve our Perforce workflow.
