In this reprinted #altdevblogaday (http://altdevblogaday.com/) in-depth piece, Gamer Camp's Alex Darby continues his series on a C/C++ Low Level Curriculum by looking at the conditional operator and switch statements.
Hello humans. Welcome to the seventh part of the C/C++ Low Level Curriculum series I've been writing. This post covers the conditional operator, and switch statements. As per usual I will be showing snippets of C++ code and throwing the corresponding x86 assembler at you (as produced by VS2010) to show you what your high level code is actually doing at the assembler level.

Disclaimer: in an ideal world, I'd like to try to avoid assumed knowledge, but keeping up the level of detail in each post that this entails is, frankly, too much work. Consequently, I will from now on point you at post 6 as a "how to" and then get on with it…

Here are the backlinks for preceding articles in the series (warning: it might take you a while, the first few are quite long):
C / C++ Low Level Curriculum Part 6: Conditionals [see near the top of this post for details on compiling & running the code snippets]
The conditional operator

I assume that everyone's familiar with the conditional operator, also known as the "question mark", or the ternary operator ("ternary" because it's the only C/C++ operator that takes three operands). If you're not, here's a link so you can catch up (I predict that you will be so stoked to find out about it that you will be over-using it within the week).

Personally I heartily approve of the conditional operator when used judiciously, but it's not always great for source level debugging because it's basically a single line if-else and can be hard to follow in the debugger (in fact I've heard of it being banned under the coding standards at more than one company, but there you are, we can't all be sane, can we?).

Anyway, let's have a quick look at it with some code:
#include "stdafx.h"

int main(int argc, char* argv[])
{
    // the line after this comment is logically equivalent to the following line of code:
    // int iLocal; if( argc > 2 ){ iLocal = 3; }else{ iLocal = 7; }
    int iLocal = (argc > 2) ? 3 : 7;
    return 0;
}
If you remember the assembler that a basic if-else generated in the last article, then the assembler generated here will probably bust your mind gaskets… Note:
I've deliberately left the function prologue and epilogue out of the asm below, and just left the assembler involved with the conditional assignment
if your disassembly view doesn't show the variable names, then you need to right click the window and check "Show Symbol Names"
     5:     int iLocal = (argc > 2) ? 3 : 7;
01311249  xor         eax,eax
0131124B  cmp         dword ptr [argc],2
0131124F  setle       al
01311252  lea         eax,[eax*4+3]
01311259  mov         dword ptr [iLocal],eax
Clearly this is not very much like the code for the simple if-else that we looked at previously. This is because there is trickery afoot and the compiler has chosen to do sneaky branchless code to implement the logic specified by the C++ code. So, let's examine it line by line:
line 1 – uses the xor instruction to set eax to 0. Anything XORed with itself is 0.
line 2 – as in the previous if examples this uses cmp to test the condition, setting flags in a special purpose CPU register based on the result of the comparison.
line 3 – this is a new one! The instruction setle (set if less than or equal) sets its operand to 1 if the 1st operand of the preceding cmp was less than or equal to the 2nd operand, and to 0 if it was greater. We've not seen the operand al before; it's a legacy register name (dating back to the 8086) which now maps to the lowest byte of the eax register (if you're a sensible person and are stepping through this code in your debugger with the register window open, you will see that this instruction causes the eax register to be set to 1 – also note that this only works because eax has already been set to 0).
line 4 – uses the load effective address instruction to do some sneaky maths that relies on the value of eax set by setle in line 3.
line 5 – moves the value from eax into the memory address storing the value of iLocal
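Written back out as C++, the five instructions above amount to a little arithmetic on a 0-or-1 flag. This is only a sketch of what the disassembly computes: the function names (selectWithOperator, selectBranchless) and the checkSelect helper are made up for illustration, and a compiler is free to generate something quite different for either function.

```cpp
#include <cassert>

// The C++ from the sample, wrapped in a function (hypothetical name).
int selectWithOperator(int argc)
{
    return (argc > 2) ? 3 : 7;
}

// Step-by-step paraphrase of the five instructions; the local variable is
// named after the register it stands in for.
int selectBranchless(int argc)
{
    int eax = 0;                        // xor eax,eax
    eax = static_cast<int>(argc <= 2);  // cmp [argc],2 then setle al (0 or 1)
    eax = eax * 4 + 3;                  // lea eax,[eax*4+3]
    return eax;                         // mov [iLocal],eax
}

// Usage: both versions should agree for any value of argc.
void checkSelect()
{
    for (int argc = -2; argc <= 6; ++argc)
        assert(selectBranchless(argc) == selectWithOperator(argc));
}
```

Note that the multiplier 4 is simply the difference between the two constants (7 - 3), which is presumably why the compiler could fold the whole choice into a single lea.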
That's all fine, but how does it work? Firstly, note that at the assembler level the comparative instruction setle is (as in the previous post's examples) testing the opposite condition to the conditional specified in the C++ code. This means that the eax register will be set to 0 in line 3 if argc is greater than 2, which in turn means that the eax*4+3 part of line 4 will evaluate to (0*4)+3 - i.e. 3. Conversely, if argc is less than or equal to 2, the eax register will be set to 1 which in turn means that line 4 will evaluate to (1*4)+3 - i.e. 7.

So, as you can see, the assembler is doing the same branchless set of instructions regardless of the condition, but using the 0 or 1 result of the conditional instruction in the maths to cancel out or include one of the terms and give what I like to call a "mathematical if". Clever.

Incidentally this sort of branchless-but-still-conditional code has been a bit / lot of a hot topic over the last few years, especially on consoles since their CPUs are particularly branch mis-prediction sensitive. Judicious use of the "branchless conditional" idiom is a tool that can be used to combat branch (mis-)prediction related performance issues – for an example of this, see the use of the fsel PPU instruction in this ADBAD post by Tony Albrecht, and for a brief discussion of branch prediction issues (primarily PC related) see this article by Igor Ostrovsky (who works for Microsoft).

The conditional operator (part deux)

So, clearly our above super-simple-sample resulted in the compiler generating clever assembler because of the constant values in it; interesting certainly, but not necessarily representative of most "real world" assembler. Let's see what happens if we use variables with the conditional operator…
#include "stdafx.h"

int main(int argc, char* argv[])
{
    int iOperandTwo = 3;
    int iOperandThree = 7;
    int iLocal = (argc > 2) ? iOperandTwo : iOperandThree;
    return 0;
}
And, here's the relevant disassembly:
     5:     int iOperandTwo = 3;
00CF1619  mov         dword ptr [iOperandTwo],3
     6:     int iOperandThree = 7;
00CF1620  mov         dword ptr [iOperandThree],7
     7:     int iLocal = (argc > 2) ? iOperandTwo : iOperandThree;
00CF1627  cmp         dword ptr [argc],2
00CF162B  jle         main+25h (0CF1635h)
00CF162D  mov         eax,dword ptr [iOperandTwo]
00CF1630  mov         dword ptr [ebp-50h],eax
00CF1633  jmp         main+2Bh (0CF163Bh)
00CF1635  mov         ecx,dword ptr [iOperandThree]
00CF1638  mov         dword ptr [ebp-50h],ecx
00CF163B  mov         edx,dword ptr [ebp-50h]
00CF163E  mov         dword ptr [iLocal],edx
Since the conditional operator is now assigning from variables we'd expect it to generate something that looks more like the sort of code we saw from the basic if-else we looked at last time, which it has. We have the expected cmp followed by a conditional jump testing against the opposite of the conditional, then two blocks of assembler, the first of which (the three instructions from 00CF162D to 00CF1633) unconditionally jumps over the second (the two instructions at 00CF1635 and 00CF1638) if it executes, so essentially it's behaving more or less as expected; however there's clearly some interesting stuff happening in there:
the two branches use different registers to store their intermediate values; the first uses eax, the second uses ecx
both branches store their result to the same memory address in the Stack (see this post if you don't know or can't remember about Stack Frames) – i.e. [ebp-50h]
the code that assigns the value to iLocal (the final two instructions, at 00CF163B and 00CF163E) only exists once and is executed regardless of which branch was taken; it takes the value from [ebp-50h] and writes it into iLocal using a third register (edx)
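The three observations above can be paraphrased in C++ as "pick a value into a temporary, then assign from the temporary". The sketch below is only an illustration of that shape: iTemp is a hypothetical name standing in for the Stack slot at [ebp-50h], and assignViaTemporary is a made-up wrapper so the result can be returned and checked.

```cpp
#include <cassert>

// Rough C++ paraphrase of the disassembly for:
//   int iLocal = (argc > 2) ? iOperandTwo : iOperandThree;
int assignViaTemporary(int argc)
{
    int iOperandTwo = 3;
    int iOperandThree = 7;

    int iTemp;                  // hypothetical: the slot at [ebp-50h]
    if (argc > 2)               // cmp / jle (the asm tests the opposite)
        iTemp = iOperandTwo;    // copied via eax
    else
        iTemp = iOperandThree;  // copied via ecx

    int iLocal = iTemp;         // single shared assignment, via edx
    return iLocal;
}

// Usage: same results as the conditional operator in the sample.
void checkAssignViaTemporary()
{
    assert(assignViaTemporary(1) == 7);
    assert(assignViaTemporary(3) == 3);
}
```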
The use of different registers for the different branches (the first point above) looks like it might be significant but (according to several expert sources) this is apparently perfectly normal compiler behavior and not anything to read into. The second and third points show that the assembler generated from the conditional operator (at least with VS2010) isn't directly equivalent to the intuitively equivalent if-else statement:
// intuitively equivalent if-else of
// int iLocal = (argc > 2 ) ? iOperandTwo : iOperandThree;
int iLocal;
if( argc > 2 )
{
    iLocal = iOperandTwo;
}
else
{
    iLocal = iOperandThree;
}
Rather than choosing between one of two assignments like this if-else, the assembler generated for our use of the conditional operator does exactly what we told it to: choose one of two values (store it temporarily in the Stack) and assign iLocal from it.

Switch Statements

The final type of conditional statement we'll be looking at is the switch statement. Like the conditional operator, the switch statement is an often abused and maligned construct that you wouldn't want to live without.

To be 100% fair to the switch statement it's never the fault of the switch statement that it's possible for maniacs to write brittle and insane code using them; I would like to say that the maniacs in question know who they are, but in fact it's pretty unlikely that they do know or they wouldn't do it… In any case, rest assured that whilst you may never know whether you are one of these maniacs, everyone else in your team will know whether you are because they've looked at your switch statements; and don't think using if-else if-else instead of switch will help you evade detection, because that'll just make it even more obvious ;)

Anyway, sniping aside, the switch statement is particularly interesting because when used in certain ways it can produce some pretty cool assembler. So let's take a look at a switch statement…
#include "stdafx.h"

int main(int argc, char* argv[])
{
    int iLocal = 0;

    // n.b. no "break" in case 1 so we can
    // see what "fall through" looks like
    switch( argc )
    {
    case 1:
        iLocal = 6;
    case 3:
        iLocal = 7;
        break;
    case 5:
        iLocal = 8;
        break;
    default:
        iLocal = 9;
        break;
    }
    return 0;
}
And here's the disassembly…
     9:     switch( argc )
00C61620  mov         eax,dword ptr [argc]
00C61623  mov         dword ptr [ebp-48h],eax
00C61626  cmp         dword ptr [ebp-48h],1
00C6162A  je          main+2Ah (0C6163Ah)
00C6162C  cmp         dword ptr [ebp-48h],3
00C61630  je          main+31h (0C61641h)
00C61632  cmp         dword ptr [ebp-48h],5
00C61636  je          main+3Ah (0C6164Ah)
00C61638  jmp         main+43h (0C61653h)
    10:     {
    11:     case 1:
    12:         iLocal = 6;
00C6163A  mov         dword ptr [iLocal],6
    13:     case 3:
    14:         iLocal = 7;
00C61641  mov         dword ptr [iLocal],7
    15:         break;
00C61648  jmp         main+4Ah (0C6165Ah)
    16:     case 5:
    17:         iLocal = 8;
00C6164A  mov         dword ptr [iLocal],8
    18:         break;
00C61651  jmp         main+4Ah (0C6165Ah)
    19:     default:
    20:         iLocal = 9;
00C61653  mov         dword ptr [iLocal],9
    21:         break;
    22:     }
This is more or less exactly what you'd expect:
the first two instructions store argc into the Stack at [ebp-48h] (by way of eax)
the block from 00C61626 to 00C61638 then implements the logic of the switch by a series of comparisons of this value against the constants specified in the case statements and associated conditional jumps to the assembler generated by the code in the corresponding case statement
if none of the conditional jumps are triggered, the logic causes an unconditional jump to the default: case.
in particular, note that:
wherever the break keyword is used this causes an unconditional jump past the end of the assembler generated by the switch
the "drop through" from case 1: into case 3: in the high level code happens at the assembler level as a by-product of the organization of the adjacent blocks of instructions generated for the switch by the compiler, and the lack of an unconditional jump at the end of the assembler for case 1:
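To make the layout described above concrete, here is a sketch that mirrors the disassembly using goto: all the comparisons sit at the top and the case bodies follow in source order. The function name switchAsCompareChain, the labels, and the iTemp variable (standing in for [ebp-48h]) are all hypothetical, and this is not something you would write by hand.

```cpp
#include <cassert>

// goto-based paraphrase of the assembler generated for the switch.
int switchAsCompareChain(int argc)
{
    int iLocal = 0;
    const int iTemp = argc;        // mov eax,[argc] / mov [ebp-48h],eax

    if (iTemp == 1) goto case1;    // cmp [ebp-48h],1 / je
    if (iTemp == 3) goto case3;    // cmp [ebp-48h],3 / je
    if (iTemp == 5) goto case5;    // cmp [ebp-48h],5 / je
    goto caseDefault;              // jmp (nothing matched)

case1:
    iLocal = 6;                    // no jump here, so execution drops through
case3:
    iLocal = 7;
    goto done;                     // break
case5:
    iLocal = 8;
    goto done;                     // break
caseDefault:
    iLocal = 9;                    // last block, so no jump is needed
done:
    return iLocal;
}

// Usage: case 1 drops through into case 3, so both give 7.
void checkCompareChain()
{
    assert(switchAsCompareChain(1) == 7);
    assert(switchAsCompareChain(3) == 7);
    assert(switchAsCompareChain(5) == 8);
    assert(switchAsCompareChain(2) == 9);
}
```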
If you look at the assembler from the sample if-else-if-else in the last article, you should be able to see that the assembler generated for this switch is (more or less) what would happen if we had written the switch as an if-else-if-else and then re-organized the assembler so all the logic was in one place at the top, and the assembler generated for each code block was left where it was.

So other than the fact that the switch statement is a very useful C/C++ language convenience for managing what would often otherwise be messy looking and error prone chains of if-else-if-else statements, based on this example it doesn't appear to be doing anything which might offer a significant advantage at the assembler level – so why would I have claimed that the compiler might generate "pretty cool assembler" for a switch?

Before we assume we've seen it all, let's try using a contiguous range of values for the constants in the cases of the switch. You know, just for fun – and for the sake of simplicity let's start at 0.
#include "stdafx.h"

int main(int argc, char* argv[])
{
    int iLocal = 0;
    switch( argc )
    {
    case 0:
        iLocal = 4;
        break;
    case 1:
        iLocal = 5;
        break;
    case 2:
        iLocal = 6;
        break;
    case 3:
        iLocal = 7;
        break;
    }
    return 0;
}
And here's the disassembly it generates…

[Screenshot: VS2010 disassembly of the switch, with a memory window to its left]

Ok, so this time something more interesting is definitely going on – n.b. I've used a screenshot rather than just pasting the text because we need to look in a memory window to make sense of it. So what exactly is it doing?
it moves argc into eax, then stores it into the Stack at [ebp-48h]
it then compares the value stored in the address [ebp-48h] with 3 (i.e. our maximum case constant)
if this value is greater than 3 then ja (jump above) on the next line will cause execution to jump to 8D1658h – the 1st instruction after the code generated by the case blocks, skipping the switch
if the value is less than or equal to 3 then the value is moved into ecx, and we then have an unconditional jump to … somewhere :-/
Ok, so that final unconditional jump has some syntax we've not yet seen for its address operand, and which clearly isn't a constant:
jmp dword ptr (8D1664h)[ecx*4]
This says "jump to the location stored in the memory address at an offset of 4 times the value of ecx from the memory address 8D1664h", so how is this implementing the logic of the C++ switch statement? To answer this question we need to look in a memory window at the address 8D1664h (n.b. to open a memory window from the menu in VS2010 when debugging go Debug -> Windows -> Memory -> … and choose one of the memory windows. To set the address just copy and paste it from the disassembly into the "Address:" input box. You will also need to right click and choose "4-byte integer" and set the "Columns:" list box to 1 to have it look like the screenshot above). So, if you cast your eyes up to the memory window on the left of the screenshot above, you will see that the top 4 rows are highlighted, these values start at address 8D1664h and are 4 byte integers (hence the ecx*4 in the operand) – which specifically in this case are pointers. The instruction jmp dword ptr (8D1664h)[ecx*4] will jump to the value stored in the address:
8D1664h + 0 = 8D1664h if the value in ecx is 0
8D1664h + 4 = 8D1668h if the value of ecx is 1
8D1664h + 8 = 8D166Ch if the value of ecx is 2
8D1664h + Ch = 8D1670h if the value of ecx is 3
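The address arithmetic above behaves much like indexing an array of pointers and jumping through the chosen element. The sketch below approximates that in C++ with an array of function pointers; this is an assumption-laden illustration (kCaseTable, CaseHandler, the caseN functions and switchViaTable are all invented names), and the real generated code jumps to blocks inside main rather than calling separate functions.

```cpp
#include <cassert>

namespace
{
    // One handler per case block; each returns the value assigned to iLocal.
    int case0() { return 4; }
    int case1() { return 5; }
    int case2() { return 6; }
    int case3() { return 7; }

    typedef int (*CaseHandler)();

    // Stand-in for the four 4-byte pointers seen in the memory window.
    const CaseHandler kCaseTable[4] = { case0, case1, case2, case3 };
}

int switchViaTable(int argc)
{
    int iLocal = 0;

    // cmp [ebp-48h],3 / ja: "above" is an unsigned comparison, so a
    // negative argc also fails the range check and skips the switch.
    if (static_cast<unsigned int>(argc) > 3u)
        return iLocal;

    // jmp dword ptr (table)[ecx*4]: pick the entry by index and go there.
    iLocal = kCaseTable[argc]();
    return iLocal;
}

// Usage: in-range values hit a case, everything else leaves iLocal at 0.
void checkSwitchViaTable()
{
    assert(switchViaTable(0) == 4);
    assert(switchViaTable(3) == 7);
    assert(switchViaTable(4) == 0);
    assert(switchViaTable(-1) == 0);
}
```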
So, the four highlighted rows make up a jump table – since our case constant's range is from 0 to 3 it is an array of 4 pointers – with each element of the array pointing to the execution address of the case block matching its array index. You can verify this by checking the addresses of the first instruction generated for each case against the 4 values stored in the array.

Maybe it's just me, but I think this is some pretty cool assembler. It's certainly more elegant than the assembler generated by the first switch we looked at, but what – if anything – is the advantage of this over the assembler that was generated for the previous switch statement?

In theory this jump table form reaches the code in constant time for all cases, whereas in the if-else-if-else form the time to reach the code corresponding to each case will be proportional to the number of previous cases in the switch statement. You're pretty unlikely to find that a switch statement is a performance bottleneck in your code (unless you've done something silly) but, all things being equal, the jump table approach uses fewer instructions to get to the code for the matching case, which is normally A Good Thing and – in theory – should make it faster on average.

One final note on switch statements; I am reliably informed that in addition to the if-else-if-else alike linear search behavior for resolving the correct case to execute, most modern compilers are also capable of generating a binary search for the cases of switch statements with appropriate ranges of case constant values. Using a binary search rather than a linear search will improve average search time from linear to logarithmic (i.e. O(n) to O(log n)). However, in the average case a binary searched switch will still almost always take more instructions and branches to reach the correct case than a jump table switch.
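As a rough illustration of the binary search idea, the sketch below hand-writes the kind of comparison tree a compiler might produce for seven widely spaced case constants. Everything here is hypothetical (the constants, the function names, and the shape of the tree), and it has not been checked against the output of any particular compiler.

```cpp
#include <cassert>

// Reference version: a plain switch over sparse case constants.
int dispatchWithSwitch(int value)
{
    switch (value)
    {
    case 1:       return 0;
    case 10:      return 1;
    case 100:     return 2;
    case 1000:    return 3;
    case 10000:   return 4;
    case 100000:  return 5;
    case 1000000: return 6;
    default:      return -1;
    }
}

// Hand-written comparison tree: start at the middle constant and halve the
// remaining candidates with each test, rather than testing all seven in turn.
int dispatchWithBinarySearch(int value)
{
    if (value == 1000) return 3;
    if (value < 1000)
    {
        if (value == 10) return 1;
        if (value < 10)  return (value == 1) ? 0 : -1;
        return (value == 100) ? 2 : -1;
    }
    if (value == 100000) return 5;
    if (value < 100000)  return (value == 10000) ? 4 : -1;
    return (value == 1000000) ? 6 : -1;
}

// Usage: the two versions should agree on hits and misses alike.
void checkBinarySearchDispatch()
{
    const int samples[] = { -5, 0, 1, 5, 10, 100, 500, 1000,
                            10000, 50000, 100000, 1000000, 2000000 };
    for (int i = 0; i < 13; ++i)
        assert(dispatchWithBinarySearch(samples[i]) == dispatchWithSwitch(samples[i]));
}
```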
It's also possible that the compiler might choose to combine more than one of these methods in a single switch, though this would probably require a large number of cases in the switch and ranges of case constants with very specific properties so it's not likely you will come across these very often.

Summary

So, this concludes our look at conditionals; hopefully you've found it interesting and illuminating ;) We've now covered enough ground that you should be finding that you can apply the information I've given you to everyday programming problems such as debugging release code, or code you don't have debugging information for. The main things I'd like you to take away from our look at conditionals are all things that will help you when debugging without symbols:
any time you see cmp followed by a jxx to a nearby address in the disassembly you're probably looking at code generated by a conditional statement in the C/C++ code
if the address operand to the jump instruction is lower than the current instruction's address (i.e. it's jumping backwards) you're most likely looking at a loop
assembler generated from conditionals generally tests the opposite of the test being done in the C / C++ code
By using these heuristics, looking at the values in the registers, the values in the Stack that have been written by the assembler, and by looking up your current address in the symbol file to tell you which function you're in (if you're not generating a symbol file for all your builds you probably should be – look in the documentation for your platform's compiler toolchain to find out how), you should be able to make an educated guess at what variables in the C/C++ code are likely to be causing the current issue. This will usually tell you why it crashed, or give you a lead so you can Sherlock Holmes your way to the root of the problem – it's certainly a lot quicker than the ubiquitous insertion of the many printf()…

Our next topic will be loops, which obviously also use conditional jumps (which is why we covered conditionals first…)

One final thing…

Thanks to Tony, Bruce, and Fabian for extra information, advice, and proof reading. And, for those of you who like to go off and look for yourselves (hopefully most of you!), I've recently discovered this wiki book on x86 Assembler. It has a large overlap with this series of articles and also covers programming in x86 assembler. Highly recommended – I've certainly found it pretty useful.

[This piece was reprinted from #AltDevBlogADay, a shared blog initiative started by @mike_acton devoted to giving game developers of all disciplines a place to motivate each other to write regularly about their personal game development passions.]