Investigating Unity iOS executable bloat

In this article I dig into the iOS binary generated by Unity, and look at some tools we can use to understand what's in there and where the size comes from.

Sam Izzo, Blogger

July 17, 2017


This article is a cross-post from the Polyphonic blog!

We've been busy getting Resynth ready for submission, and part of that process has involved doing a lot of optimisation for both runtime performance and binary size. We made a lot of fixes to improve performance on low end iOS devices, and crunched down textures and sounds where we could to reduce the installed app size. The last remaining item on my list is the size of the iOS executable binary.

According to the Unity build logs, our data size breakdown looks like this:


Textures      26.1 mb    71.1%
Meshes        0.0 kb     0.0%
Animations    156.0 kb   0.4%
Sounds        1.6 mb     4.3%
Shaders       47.7 kb    0.1%
Other Assets  3.4 mb     9.2%
Levels        324.1 kb   0.9%
Scripts       1.1 mb     3.1%
Included DLLs 3.9 mb     10.7%
File headers  77.1 kb    0.2%
Complete size 36.8 mb    100.0%

We've got a bunch of textures that we want looking crisp on high resolution devices so there's not much more we can do here. It's pretty small compared to a lot of games anyway.

The iTunes Connect file size estimate for iPhone 6S is 61.3 MB. That's the full install size after the IPA file is uncompressed on the device. If we subtract the total data size Unity gave us (36.8 MB), that leaves 24.5 MB for the code and any other miscellaneous files in the IPA, such as icons and launch screens.

Looking again at the Unity breakdown, we see that scripts seem to contribute only 1.1 MB. That's a far cry from our estimate of 24.5 MB! Scripts are compiled to native code, of course, and the 24.5 MB remainder covers the whole executable, including the Unity engine code, plus launch screens and icons. Let's see if we can get a more detailed breakdown, work out what's going on, and maybe even reduce the iOS executable binary size.
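The subtraction above is easy to script if you want to track it from build to build. Here is a minimal Python sketch. The function name, the idea of pasting the log in as a string, and the assumption that Unity's "kb" and "mb" are 1024-based are all mine, and the parsing only handles the line layout shown in the build log above.

```python
import re

# Size breakdown copied from the Unity build log shown earlier.
BUILD_LOG = """\
Textures      26.1 mb    71.1%
Meshes        0.0 kb     0.0%
Animations    156.0 kb   0.4%
Sounds        1.6 mb     4.3%
Shaders       47.7 kb    0.1%
Other Assets  3.4 mb     9.2%
Levels        324.1 kb   0.9%
Scripts       1.1 mb     3.1%
Included DLLs 3.9 mb     10.7%
File headers  77.1 kb    0.2%
Complete size 36.8 mb    100.0%
"""

# Assumption: the log's units are 1024-based.
UNIT_TO_MB = {"kb": 1 / 1024, "mb": 1.0}

LINE_RE = re.compile(r"^(.*?)\s+([\d.]+) (kb|mb)\s+[\d.]+%$")


def parse_build_log(text):
    """Return {category: size in MB} for each line of the size breakdown."""
    sizes = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            name, value, unit = match.groups()
            sizes[name] = float(value) * UNIT_TO_MB[unit]
    return sizes


sizes = parse_build_log(BUILD_LOG)
install_estimate_mb = 61.3  # iTunes Connect estimate for iPhone 6S
non_data_mb = install_estimate_mb - sizes["Complete size"]
print(f"Code and miscellaneous files: {non_data_mb:.1f} MB")
```

The remainder is only an upper bound on the code size, since it still contains icons, launch screens, and anything else in the IPA that Unity doesn't count as data.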

Examining the IPA

To investigate this, let's start with the build that we uploaded to iTunes Connect. The IPA file is really just a zip file with its contents arranged in a specific way. If we unzip the IPA and take a look in the Payload/resynth.app directory, we see this:


-rw-r--r--   1 buildbot  staff       2371 Jun  9 14:14 AppIcon57x57.png
                               ..etc
-rw-r--r--   1 buildbot  staff      13313 Jun  9 14:14 AppIcon83.5x83.5@2x~ipad.png
drwxr-xr-x  32 buildbot  staff       1088 Jun  9 14:14 Data
-rw-r--r--   1 buildbot  staff       2763 Jun  9 14:15 Info.plist
-rw-r--r--   1 buildbot  staff       4617 Jun  9 14:14 [email protected]
                               ..etc
-rw-r--r--   1 buildbot  staff       3935 Jun  9 14:14 [email protected]
-rw-r--r--   1 buildbot  staff       2053 Jun  9 14:15 LaunchScreen-iPad.nib
-rw-r--r--   1 buildbot  staff     221960 Jun  9 14:14 LaunchScreen-iPad.png
-rw-r--r--   1 buildbot  staff       4357 Jun  9 14:15 LaunchScreen-iPhone.nib
-rw-r--r--   1 buildbot  staff     133646 Jun  9 14:14 LaunchScreen-iPhoneLandscape.png
-rw-r--r--   1 buildbot  staff     133646 Jun  9 14:14 LaunchScreen-iPhonePortrait.png
-rw-r--r--   1 buildbot  staff          8 Jun  9 14:15 PkgInfo
drwxr-xr-x   3 buildbot  staff        102 Jun  9 14:20 _CodeSignature
-rw-------   1 buildbot  staff        667 Jun  9 14:20 archived-expanded-entitlements.xcent
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 de.lproj
-rw-------   1 buildbot  staff       8247 Jun  9 14:21 embedded.mobileprovision
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 en.lproj
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 es.lproj
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 fr.lproj
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 it.lproj
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 ja.lproj
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 ko.lproj
-rwx------   1 buildbot  staff  271362480 Jun  9 14:21 resynth
drwxr-xr-x   4 buildbot  staff        136 Jun  9 14:14 zh.lproj

There are multiple variations of the app icon and launch screens, so I've removed those lines for brevity.

The executable binary itself is the resynth file near the end of the listing, and it clocks in at 271 MB! That doesn't seem right!

Well actually, there are two reasons this binary is so big. Firstly, I built a universal app, which means the binary includes two versions of the executable: one for armv7 (32-bit) CPUs and one for arm64 (64-bit) CPUs. Secondly, I enabled bitcode, which increases the binary size significantly. Bitcode is an intermediate representation of the program, though, not the final machine code. Apple's servers recompile the bitcode into an armv7 or arm64 binary depending on the user's device, so you don't have to worry: your users aren't going to get a massive binary file!
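Since the IPA is just a zip file, the size survey from the directory listing can also be done without unzipping anything. The following is a rough Python sketch using the standard zipfile module. The function name is my own, and the demo runs on a tiny in-memory archive with made-up file names standing in for a real IPA.

```python
import io
import zipfile


def largest_entries(ipa, count=5):
    """Return the `count` largest files in an IPA as (uncompressed size, name).

    `ipa` can be a path to an .ipa file or an open binary file object.
    """
    with zipfile.ZipFile(ipa) as archive:
        entries = [(info.file_size, info.filename)
                   for info in archive.infolist()
                   if not info.is_dir()]
    return sorted(entries, reverse=True)[:count]


# Demo on a tiny in-memory archive (hypothetical contents, not a real IPA).
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("Payload/demo.app/demo", b"\0" * 4096)
    archive.writestr("Payload/demo.app/Info.plist", b"\0" * 128)
demo.seek(0)

for size, name in largest_entries(demo):
    print(f"{size:>12,d}  {name}")
```

For a real build you would pass the path to the IPA instead of the in-memory buffer. The sizes reported are the uncompressed ones, which is what matters for the installed size.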

Non-bitcode builds

We can get more useful information from a non-bitcode build. This will give us a better approximation of the final binary size on a user's device. Here is the binary from a universal build with bitcode disabled:


-rwx------   1 buildbot  staff  38247008 Jun 15 15:19 resynth

38 MB, much more reasonable! We can use Apple's otool program to examine this in a bit more detail. At a shell prompt we can run:


$ otool -fv resynth

This will show us the headers for the binary:


Fat headers
fat_magic FAT_MAGIC
nfat_arch 2
architecture armv7
    cputype CPU_TYPE_ARM
    cpusubtype CPU_SUBTYPE_ARM_V7
    capabilities 0x0
    offset 16384
    size 17976384
    align 2^14 (16384)
architecture arm64
    cputype CPU_TYPE_ARM64
    cpusubtype CPU_SUBTYPE_ARM64_ALL
    capabilities 0x0
    offset 18006016
    size 20240992
    align 2^14 (16384)

Here we see that for the arm64 architecture, the size is around 20 MB. This is much closer to the 24.5 MB we estimated for the iPhone 6S. Note that the header gives us the full size of the arm64 slice, which includes code as well as constant, static, and global data in the binary.
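The fat header that otool prints is simple enough to read by hand: a big-endian magic number and architecture count, followed by one 20-byte record per architecture. Here is a minimal Python sketch of that, assuming the 32-bit fat header layout from Apple's mach-o/fat.h. The function name is my own, and the demo bytes are rebuilt from the otool output above rather than read from the real binary.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
CPU_ARCH_ABI64 = 0x01000000
# CPU_TYPE_ARM is 12; CPU_TYPE_ARM64 is the same value with the 64-bit flag set.
CPU_TYPE_NAMES = {12: "arm", 12 | CPU_ARCH_ABI64: "arm64"}


def read_fat_archs(data):
    """Parse a 32-bit fat header and return a list of (cpu name, offset, size)."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat (universal) Mach-O binary")
    archs = []
    for i in range(nfat_arch):
        # struct fat_arch: cputype, cpusubtype, offset, size, align
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i)
        archs.append((CPU_TYPE_NAMES.get(cputype, hex(cputype)), offset, size))
    return archs


# Header rebuilt from the otool output above (subtype 9 is ARM_V7, 0 is ARM64_ALL).
header = struct.pack(">II", FAT_MAGIC, 2)
header += struct.pack(">iiIII", 12, 9, 16384, 17976384, 14)
header += struct.pack(">iiIII", 12 | CPU_ARCH_ABI64, 0, 18006016, 20240992, 14)

for name, offset, size in read_fat_archs(header):
    print(f"{name:6s} offset={offset:>9d} size={size / 1e6:.1f} MB")
```

To run it against a real universal binary, read the first few hundred bytes of the file in binary mode and pass them to read_fat_archs. A thin (single-architecture) binary has a different magic number and will raise the ValueError.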

Remember that this size also includes the Unity engine itself. Let's see if we can figure out what code is actually included in our binary, and if Unity is doing a good job of only including the code that we use.

Unity compilation refresher

Before we go on, let's talk about how Unity generates your iOS binary.

When you build your project, Unity takes all your non-editor C# scripts and compiles them with the Mono compiler. This process generates several DLLs. Game scripts will end up in two DLLs called Assembly-CSharp.dll and Assembly-CSharp-firstpass.dll.

These DLLs are actually not in any machine-specific binary format; they are in an intermediate language (IL), which is a bytecode that is interpreted by the Mono (or .NET) runtime and turned into platform-specific binary code dynamically, while the program is running. This process is called just-in-time (JIT) compiling.

Unfortunately Apple doesn't allow binary code to be dynamically generated and executed, so for iOS Unity must convert the bytecode into a platform-specific, machine-readable binary format ahead-of-time (AOT), that is, at build time. Before IL2CPP (Intermediate Language 2 CPP) existed, Unity used Mono's AOT compiler to do this. IL2CPP sidesteps that process and instead turns the IL bytecode into C++ code which can be compiled with any C++ compiler.

This makes it much easier for the Unity engine to be ported to new platforms. Unity no longer has to add support for the platform to Mono's AOT and JIT compilers; instead they just rely on the platform's native C++ compiler. This also results in better performance.

For iOS, Unity writes out an Xcode project that includes all the generated C++ files. This is then compiled and linked, and the result is your iOS executable. The Xcode toolchain optimises the code and, at link time, removes any code that is not referenced.

Digging in to symbols

The symbol files contain the names of all symbols (functions, global variables, classes, etc.) in the app and where they are located in memory. This is useful because the binary that Apple distributes to users is stripped of most symbols, so if we get a crash report from a user or from Apple, it will contain a bunch of meaningless memory addresses. We need to convert the memory addresses to human readable names to get anything useful out of the crash info, and we can do this using the symbol files (a process known as symbolication).

In iTunes Connect we can download the symbols (dSYMs) for the IPA that we uploaded. If our app is a universal app, when we download and extract the symbols we should find two directories named with UUIDs, one for each of the two architectures, armv7 and arm64 (if it's not a universal app there will be only one directory). Drilling all the way down into the directories we should find a single binary. We can determine what architecture the binary is for using the file command:


$ file resynth

This will output:


resynth: Mach-O 64-bit dSYM companion file

This tells us that this binary is the symbol file for the arm64 version of the app. It's an iOS Mach-O binary, but it doesn't contain any executable code, only symbol information.

We can also get the symbols after building in Xcode. In the Xcode DerivedData directory for your game, find the Release-iphoneos directory. There should be a directory in there with a name like resynth.app.dSYM. The full path will be something like:


~/Library/Developer/Xcode/DerivedData/Unity-iPhone/Build/Products/Release-iphoneos/resynth.app.dSYM/Contents/Resources/DWARF

To view all symbols in the binary we can use the nm command:


$ nm -U -arch=arm64 resynth | less

This will pipe the output through the less program, which will let you scroll through the list using the arrow, home, end, page up, and page down keys (or 'j' and 'k' if your terminal is not configured correctly!).

Here's a part of the output for Resynth:


000000010006f0c8 t _PackButton_get_IsComingSoon_m833010487
000000010006f0b8 t _PackButton_get_IsUnlocked_m755995454
000000010006f0a8 t _PackButton_get_Pack_m3924171162
000000010006f0b0 t _PackButton_set_Pack_m3180976721
00000001000720f0 t _PackManager_Awake_m1581176858
0000000100071998 t _PackManager_CanPlayerBuyLevelPack_m1143483345
00000001000719a0 t _PackManager_CanPlayerBuyPack_m3419497270
0000000100071aa8 t _PackManager_CanPlayerBuyThemePack_m3160178262
0000000100073834 t _PackManager_CreatePackButton_m3207230420

This is showing some of the symbols for Resynth's DLC pack management code. Although IL2CPP has generated native code with mangled names, we can still make out the C# classes and methods that correspond to the above code (PackButton and PackManager).

These are things that we'd expect to see in here, because this is code that is used!
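Scrolling through thousands of symbols gets tedious, so it can help to summarise the nm output per class. Below is a rough Python sketch that counts methods by class name. It assumes the _Class_Method_m<digits> naming pattern visible in the listing above, so it will mis-split any class whose own name contains an underscore. The function name is my own.

```python
import re
from collections import Counter

# Matches lines like: 000000010006f0a8 t _PackButton_get_Pack_m3924171162
# Assumption: the class name is everything up to the first underscore.
SYMBOL_RE = re.compile(r"^[0-9a-f]+ [tT] _(\w+?)_(\w+)_m\d+$")

# The excerpt of nm output shown above.
NM_OUTPUT = """\
000000010006f0c8 t _PackButton_get_IsComingSoon_m833010487
000000010006f0b8 t _PackButton_get_IsUnlocked_m755995454
000000010006f0a8 t _PackButton_get_Pack_m3924171162
000000010006f0b0 t _PackButton_set_Pack_m3180976721
00000001000720f0 t _PackManager_Awake_m1581176858
0000000100071998 t _PackManager_CanPlayerBuyLevelPack_m1143483345
00000001000719a0 t _PackManager_CanPlayerBuyPack_m3419497270
0000000100071aa8 t _PackManager_CanPlayerBuyThemePack_m3160178262
0000000100073834 t _PackManager_CreatePackButton_m3207230420
"""


def count_methods_by_class(nm_output):
    """Count text-section symbols per (apparent) C# class name."""
    counts = Counter()
    for line in nm_output.splitlines():
        match = SYMBOL_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


for class_name, count in count_methods_by_class(NM_OUTPUT).most_common():
    print(f"{count:5d}  {class_name}")
```

Fed the full nm output instead of this excerpt, the classes at the top of the list are a reasonable place to start looking for code you didn't expect to ship.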

Just for sanity's sake, let's look for some other things that probably shouldn't be in here:


$ nm -U -arch=arm64 resynth | grep UnityWebRequest | wc -l
     367

Resynth doesn't use UnityWebRequest, so it's strange that it's included here.

Let's try physics colliders, which Resynth also does not use:


$ nm -U -arch=arm64 resynth | grep Collider | wc -l
     203

Okay, also a bit strange. It seems that Unity isn't stripping out some unused components. Additionally, even a lot of core .NET functionality is still being included: things like ArrayList, Hashtable, TLS code, and FTP code are still present, and the game definitely doesn't use these.
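Counting symbols tells us how many functions matched, but not how many bytes they occupy. One rough way to estimate that from nm output alone is to sort the symbols by address and treat the gap to the next symbol as the size. The Python sketch below does this. It is a heuristic of mine rather than anything Unity or Apple provide: the gaps include alignment padding, the last symbol gets no size, and the numbers are only meaningful when the listing is complete. The demo uses four PackButton accessors from the earlier listing, which I'm assuming are adjacent in the binary.

```python
def estimate_sizes(nm_output):
    """Return [(name, estimated size)] using the gap to the next higher address."""
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split()
        # Keep only text-section symbols: "<hex address> t|T <name>".
        if len(parts) == 3 and parts[1] in ("t", "T"):
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return [(name, next_addr - addr)
            for (addr, name), (next_addr, _next_name) in zip(symbols, symbols[1:])]


def bytes_matching(sizes, keyword):
    """Sum the estimated sizes of all symbols whose name contains `keyword`."""
    return sum(size for name, size in sizes if keyword in name)


# Four symbols from the earlier listing, assumed to be contiguous.
SAMPLE = """\
000000010006f0c8 t _PackButton_get_IsComingSoon_m833010487
000000010006f0b8 t _PackButton_get_IsUnlocked_m755995454
000000010006f0a8 t _PackButton_get_Pack_m3924171162
000000010006f0b0 t _PackButton_set_Pack_m3180976721
"""

sizes = estimate_sizes(SAMPLE)
for name, size in sizes:
    print(f"{size:6d}  {name}")
print("PackButton total:", bytes_matching(sizes, "PackButton"))
```

On the full symbol list, calling bytes_matching with keywords like UnityWebRequest or Collider gives a ballpark figure for how much there is to gain before spending any time on stripping.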

Unused code

Before going any further, we should make sure that we have removed all unnecessary non-editor scripts from the project. Unfortunately Unity will compile and link all non-editor scripts even if no GameObjects reference them. This means that if, for example, you imported a plugin from the asset store and it had some example code, that code might end up in your final build on the app store!

It's a shame that Unity has no way to disable these files or exclude them from being built. For now the only way seems to be to either delete them entirely or move them into an "editor" folder.

Method tables

One of the auto-generated C++ files that Unity writes out is a file called Il2CppMethodPointerTable.cpp. At the bottom of this file is a big array of pointers to methods called the "method pointer table". We can perform our own "stripping" by replacing entries in this table with a NULL or 0. This will cause the linker to avoid linking the code, and reduce the size of the final binary!

We have to be very careful though: if we remove methods that are used, the game will crash, so we have to test carefully. It's not obvious that some methods are required. For example, I wanted to remove all ArrayList and Hashtable methods since Resynth only uses the generic collections, but it turns out that these are used deep in the Mono base class libraries during application startup (fortunately I found this out quickly!).

We can do this stripping process semi-automatically in a build post-process step with some code like this:


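// Editor-only code: place these members in a class inside an "Editor" folder.
// Requires: using System.IO; using UnityEditor; using UnityEditor.Callbacks;
[PostProcessBuild]
public static void OnPostProcessBuild(BuildTarget buildTarget, string pathToBuiltProject)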
{
    if (buildTarget != BuildTarget.iOS)
        return;
        
    StripMethods(pathToBuiltProject);
}

private static string[] s_stripList =
{
    // Strip some unused Unity methods.
    "UnityWebRequest_",
    "Collision_",
    "Collision2D_",
    "Physics_",
    "Physics2D_",
    
    // Strip some unused .NET class library methods.
    "FtpWebRequest_",
    "FtpWebResponse_",
    "FileWebRequest_",
    "X509",
    "BigInteger_"
};

private static void StripMethods(string pathToBuiltProject)
{
    var methodPointerTable = Path.Combine(pathToBuiltProject,
        "Classes/Native/Il2CppMethodPointerTable.cpp");
    var lines = File.ReadAllLines(methodPointerTable);
    var inMethodPointerTable = false;
    for (var i = 0; i < lines.Length; i++)
    {
        var line = lines[i];
        if (line.StartsWith("extern const Il2CppMethodPointer g_MethodPointers"))
        {
            inMethodPointerTable = true;
            // Skip the opening brace.
            i++;
        }
        else if (inMethodPointerTable)
        {
            if (line.StartsWith("};"))
            {
                // Finished parsing the method pointer table!
                inMethodPointerTable = false;
            }
            else
            {
                lines[i] = PatchLine(line);
            }
        }
    }

    File.WriteAllLines(methodPointerTable, lines);
}

private static string PatchLine(string line)
{
    var trimmed = line.Trim();
    foreach (var s in s_stripList)
    {
        if (trimmed.StartsWith(s))
        {
            line = "\t0, //" + line;
            return line;
        }
    }

    return line;
}
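Because stripping a method that is actually used will crash the game, it's worth reviewing exactly which entries a strip list would hit before patching anything. Here is a small Python sketch of such a dry run. It mirrors the prefix matching in PatchLine above and assumes, as that code does, that each table line starts with the method name. The function name and the sample entries (including their hash suffixes) are made up for illustration.

```python
def entries_to_strip(table_lines, strip_list):
    """Return the method-table entries whose names start with a strip-list prefix."""
    prefixes = tuple(strip_list)
    matches = []
    for line in table_lines:
        entry = line.strip().rstrip(",")
        if entry.startswith(prefixes):
            matches.append(entry)
    return matches


# Hypothetical table lines; real entries come from Il2CppMethodPointerTable.cpp.
sample_table = [
    "\tUnityWebRequest_Send_m1,",
    "\tPackManager_Awake_m1581176858,",
    "\tPhysics2D_Raycast_m2,",
    "\tBoxCollider_get_size_m3,",
]
strip_list = ["UnityWebRequest_", "Physics_", "Physics2D_", "Collision_"]

for entry in entries_to_strip(sample_table, strip_list):
    print("would strip:", entry)
```

Note that prefix matching is deliberately narrow: "Collision_" does not match the BoxCollider entry, and "Physics_" does not match Physics2D methods, which is why both appear in the strip list.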

Final thoughts

In the code above I've only stripped a handful of methods, but when I was testing in Resynth, I removed quite a few more. Unfortunately I was only able to reduce the binary by around 630 KB. There is also a lot of corresponding metadata for types and methods which is still included. Removing this by parsing the generated C++ files would be very messy and difficult.

Unfortunately, stripping code manually from the method table doesn't seem worth it. It's a time-consuming process, plus it has the added potential of crashing your game if you aren't careful! Although it's obvious that some methods are not used, for others, you have to play the entire game and ensure that all code paths are tested just in case you strip out something that is used. It's not worth it for the small gain.
