Finding Evidence for GPL Violation with Ghidra and Friends
Let’s say you’ve worked a few years on an open source project released under the GNU General Public License (GPL)
and let’s say you’ve spied an application that is eerily similar to your project but whose owner claims no infringement of your rights.
How would you find evidence for such infringement or lack thereof?
Obviously, I don’t know the answer to this question as it will vary depending on your case.
I can, however, tell what I did when faced with this situation, which I hope you will find educational or even entertaining.
This post is somewhere between a write-up and a tutorial. Hopefully it is interesting as both.
Hang on to your trousers, we’re going in!
Ingredients
- An open source project on which you’ve worked for the past few years, released under a copyleft license such as the GPL.
- Some rascal who released a highly suspicious proprietary application that might contain code from your project.
- PEid
- OllyDbg
- Ghidra
Background
Since 2015 I’ve been a maintainer of the Spring RTS Engine project. During this time I’ve written some code, got into many heated arguments and introduced quite a few bugs. All in all, I had a lot of fun and I highly recommend getting involved in open source projects.
Part of the charm is that you have to learn new things all the time, because if you don’t do something, it’s quite probable no one else will. A game engine developer investigating an alleged GPL violation is a good example for this.
A few months ago we’ve discovered a newly released proprietary game - [MARS] Total Warfare (will be referred to as MARS from now on) that looked and felt extremely similar to games based on Spring.
These similarities weren’t limited to look-and-feel but also included technical characteristics, such as using similar assets and containing a lua scripting engine with the exact same API up to the Spring.Foo()
calling convention.
The download itself is a small <6MB executable, but every time the game is started it “streams” (read: downloads) a few dozen MBs of assets that can be found here.
In this script, for instance, you can find a number of function calls (Spring.Echo(...)
, Spring.IsCheatingEnabled()
, Spring.SendMessageToPlayer(...)
) conveniently described in Spring’s documentation.
In fact this script was originally written for Spring games but now is used within MARS. This can be explained by one of two options:
- The developer of MARS recreated Spring’s lua API, asset loading code etc. without using Spring’s source. This is not entirely improbable, after all Spring itself started as a clone of the 1997 game Total Annihilation.
- The developer of MARS used Spring’s source code while making MARS without releasing the code of MARS, in violation of the GPL.
How can we distinguish between the two? The easiest way is to just ask the dev!
Alas, we have to delve into the executable and compare it with Spring’s.
Scouting
The easiest things to look for are strings, just fire up your favourite hex editor or viewer and search for them.
In our case, since MARS implements Spring’s lua API which mostly looks like Spring.FunctionName()
, we can expect to find both "FunctionName"
and "Spring"
in the binaries.
How could this be? We know for sure that the game uses these strings.
A voice in the back of your head has a suggestion: The executable might be packed!
But why would someone want to shave a few MB off of an executable when it downloads a few dozen more every single time it starts?
Might it be that they wanted to hide something? But let us not rush into conclusions which will present themselves in due time on their own accord.
PEid to the rescue! It should tell us what packer was used, so we can unpack the executable and start working already!
Wait! There’s “Extra Information”!
What this means is that while we don’t know which packer was used, statistical analysis shows that the entropy is higher than expected for a normal executable (Spring’s entropy, for instance, is 6.26). This makes a lot of sense, in a non-packed executable there are many strings, there are instructions which are more common and ones which are less, so the distribution is less flat => the entropy is lower.
While we don’t know the packer, we do know of a program that unpacks this executable - the executable itself! It must unpack the packed instructions into memory before the CPU can execute them, so all we need to do is run it and then dump the memory where the unpacked instructions reside. For this we use OllyDbg.
We start the game, we start OllyDbg and we attach it to the game’s process. If we are already running unpacked code, we should be able to see this in the call stack:
We can see that the code being ran is between offsets 0x017XXXXX-0x01BXXXXX
so let’s look where these reside in the memory map:
Voilà! We’ve found a good candidate - note how it has E (execute) permissions which is exactly what we would expect. Can we find our strings inside?
Time to dump this segment (Right-Click->Backup->Save data to file…) and press on forward.
Don’t forget to write down the base address of this segment (OllyDbg automatically appends it to the file’s name), we will need it very soon.
You can also close the game and delete it now, unless you want to play it, but do you, really?
Raiding
It is now time to bring the big guns. We will use Ghidra to reverse engineer the segment we’ve just dumped. Fire Ghidra up, import the file and make sure you set the base address in the options to the one you have written down.
Now double click the file and let Ghidra analyse it. While it’s analysing feel free to go for a walk, play a set of tunes on your accordion, put the kettle on and/or skim the front page of reddit. This takes some time.
Our foot-in-the-door here is the lua API. If you want to expose a C function to lua code, it will be done more or less like this:
So if we search for functionName
we should find a reference to it from the glue code, and right under it we should find the pointer to the function itself!
Follow the XREF to find something familiar:
In the right you can see the decompiled code, it’s not always 100% accurate, but it’s surprisingly good.
We can now rename functions to make the code easier to read. Bonus points if you change the signature of lua_rawset
so it shows -3
instead of 0xfffffffd
.
As we planned before, we now follow LAB_0154b010
where we expect to find Spring.GetUnitDefID()
. When we get there, press F to create a function.
Making heads or tails from this code might have been hard if not for the fact we have a reference! In Spring this function gets the unique ID of an ingame unit from lua, checks whether the lua state asking for its type (unitDef) is allowed to have this information and if it is, returns the ID of the unitDef to lua. Like homework, reverse engineering becomes much easier when you have the solution in front of you. We expect to see something similar to:
Most of the differences can be explained by the inlining of ParseUnit()
, IsAllyUnit()
, etc. so we will use them as reference as well.
Well, that’s weird… In Spring’s code we have one input error message but MARS has two? Maybe they’re not related?
Hold your horses, Who said MARS based itself on the latest version? Let’s go a little bit back in time to 2015:
Now that looks much more similar! Let’s rename things accordingly in the decompiled code:
We now continue to IsUnitVisible
and the rest
Here is the decompiled, renamed and annotated code (I’m not sure if it’s entirely correct, but that doesn’t matter much)
We can see the following things (check comments in code):
- MARS uses lua contexts to save visibility permissions about the lua state in the exact same way as Spring. A boolean value controls whether the state has fullread (can see everything) and a number represents the team (allyteam in Spring jargon) this lua state belongs to (basically limiting it to see what the player can see). If the state has fullread, the allyteam number will be negative.
- MARS is saving the line of sight status of units in the exact same way as Spring. A packed bitfield contains flags such as INLOS, PREVLOS and CONTRADAR. The tests whether a unit’s unitDef is known are identical to Spring’s.
- MARS is handling decoys (units that appear to be a different unit to the enemy) in the exact same way as Spring.
This should be enough to convince us we’re not entirely paranoid. It’s unlikely MARS was coded using clean-room design. One might say, however, that the lua API is insignificant, coding it according to the documentation is quite straightforward and the similarities are inevitable.
Alright, we can look for some inner function, something that exists not only as part of the lua API.
Attacking
Our candidate is Spring.GetUnitsInRectangle()
.
By now you should be able to find it, look for a reference and rename some vars to get this:
Seems very similar to:
But that’s to be expected. What we actually want to investigate is how GetUnitsExact()
is implemented.
While the lua API has to be quite similar, games that were developed separately will have subtly different ways of finding out which units reside inside a map rectangle.
To see what differences exist, we can compare Spring’s method (reference) with other open source strategy games such as 0AD (reference) and OpenRA (reference). All examples use the same general idea:
- Partition the map into a grid of squares (quads in Spring jargon).
- For each square keep a list of units in that square.
- When queried about a map rectangle, check which squares intersect with the given rectangle and return all units within them.
But there are also implementation differences:
- The square size (Spring: 128, 0AD: 20, OpenRA: 1024). Obviously that also depends on the scale of the game, but having an identical size should raise an eyebrow or two.
- The method of iterating over the squares. Both OpenRA and 0AD go with the straightforward way of calculating the max. and min. row and column of squares intersecting the rectangle, then doing a triple loop over row/col/unit. Spring (for some arcane reason) calculates the max. and min. row and column just the same, but instead of doing a triple loop, it does two double loops. In the first it iterates over row/col and populates a list of squares. In the second it iterates over <squares_in_list>/units.
- Since units in Spring have a radius, they can exist in more than one square. This requires a system for preventing duplicates.
Here is Spring’s code:
The two double loops and the list of quads are immediately visible. The next thing that grabs your eye is tempNum - the duplicates prevention mechanism. In short, every unit that was inserted to the returned vector is coloured with a unique colour (a running global counter). Units that have the current colour must have been already added and are therefore skipped.
Time to check MARS. We will start with GetQuadsRectangle()
:
In the start we can see the standard clamping of mins and maxs together with the conversion from world coordinates to row and column.
This conversion is simply coord >> 7
or in other words coord / 128
.
At this moment in time your eyebrow should be in a raised position.
For some code generation and/or compilation reason, it looks like we have three nested loops, but that’s only because it’s populating the square (quad) vector two squares at a time which requires dealing with an odd number of columns.
Ultimately, this function does the exact same thing as Spring’s GetQuadsRectangle()
.
Onwards to GetUnitsExact()
:
Even though the decompilation isn’t entirely coherent, the structure is clear. There are two nested loops in the same structure as Spring, first on the quad vector and then on the units within the quad. The same method (tempNum) is used to prevent duplicates.
The main difference is the branching of the external if
according to the result of FUN_01437cb0()
.
This might be necessary if our function is called from another function that uses tempNum or if two such functions can run in parallel.
Probably one of the changes to Spring made by the MARS developer.
To conclude:
- MARS uses the same square size as Spring.
- MARS has the same (non-standard) two double-nested loops implementation as spring for checking which units are in a given map rectangle.
- MARS uses the same colour (tempNum) mechanism for preventing duplicates.
Winning
This was enough to convince me that MARS was not coded using clean-room design and it is very likely that its developer copied and adapted code from Spring.
I hope it was enough to convince you as well.
Since the sources of MARS weren’t released and since the developer denied the above, I fear that it is knowingly in violation of the GPL.
It’s important to say that the Spring community is very happy that Spring is used in many projects (commercial or otherwise) and the best outcome in our opinion would be a source code release of MARS.
Thank you for reading!