Optimizations That Work and 'Optimizations' That Do Not Work

I’ve been running into this more and more: I try to optimize Flash code with my old C habits and end up ‘pessimizing’ it instead.

Even in native C code, many of the dirty old tricks being dug up from old ‘How to Program Games’ books and applied to Flash are obsolete. They tend to turn up when people write ‘demos’ for Flash (as in the days of yore) that try to resurrect those tricks for nifty visual effects.

In native code optimization today, the biggest worry is bulky code that causes cache misses, or even page faults. That handy ‘unrolled loop’ from the 80s and early 90s can now be a major performance liability, because fitting everything in the cache is the most crucial thing. That big ‘lookup table’ meant to save a few cycles in an inner loop can be ‘swapped out’ and have to be read back from the hard drive.

Pre-calculated sin/cos/tan tables were very helpful until around 1995, when 386 and 486 machines started being replaced by Pentium machines. They don’t help at all in native code on modern desktop CPUs, which all have efficient floating-point coprocessors that work in parallel with the CPU. On some embedded CPUs without FPUs, they still help. In Flash, running on a desktop machine or a game console, they’re worse than useless: they add extra operations that consume more time than letting the FPU do its work, and they make the code much harder to read and maintain. Unless that table has *several* operations canned in the math, it isn’t going to save you time, and it will increase the memory requirements of your application.
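To make the trade-off concrete, here is a minimal sketch comparing a table lookup with a direct call. It is written in TypeScript rather than ActionScript so it can be run as-is, and every name in it (`TABLE_SIZE`, `tableSin`, `directSin`, `maxTableError`) is made up for illustration. It does not measure speed; it only shows how much extra work and memory the table version carries, and how much accuracy it gives up.

```typescript
// Hypothetical names throughout; a sketch for comparison, not a benchmark.
const TABLE_SIZE = 1024; // assumed table resolution
const TWO_PI = Math.PI * 2;

// Build the table once up front: TABLE_SIZE numbers that stay in memory.
const SIN_TABLE: number[] = [];
for (let i = 0; i < TABLE_SIZE; i++) {
  SIN_TABLE.push(Math.sin((i / TABLE_SIZE) * TWO_PI));
}

// Table version: a divide, a multiply, a round, a wrap and an array read,
// and the answer is only as accurate as the table is fine.
function tableSin(radians: number): number {
  const index = Math.round((radians / TWO_PI) * TABLE_SIZE);
  // Wrap into [0, TABLE_SIZE), including for negative angles.
  const wrapped = ((index % TABLE_SIZE) + TABLE_SIZE) % TABLE_SIZE;
  return SIN_TABLE[wrapped];
}

// Direct version: one call, full precision, nothing to maintain.
function directSin(radians: number): number {
  return Math.sin(radians);
}

// Largest difference between the two over a sweep from -4*PI to 4*PI.
function maxTableError(samples: number): number {
  let worst = 0;
  for (let i = 0; i < samples; i++) {
    const angle = (i / samples) * TWO_PI * 4 - TWO_PI * 2;
    worst = Math.max(worst, Math.abs(tableSin(angle) - directSin(angle)));
  }
  return worst;
}
```

Whether the table is actually slower on a given player or CPU is something to measure on that target. The sketch only shows that the table is not free: it adds index arithmetic, memory, and rounding error to every call.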

When coding cross-platform (as with Flash), anything like binary shifting done ‘for performance’ instead of simply multiplying or dividing may help on some platforms and hurt on others, while also making it more difficult to define constants and making the code a little less readable for many people. Bitwise operations are certainly the most efficient way to operate on bits, but not to do math. They just make things harder to read for most people (not for me, I’m a bit-head from way back), and unsigned math occasionally introduces nasty little bugs that take a long time to track down.
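The kind of bug meant here can be sketched in a few lines. The example is TypeScript, whose shift operators follow the ECMAScript rules that ActionScript is also based on; treat it as an illustration of the pitfall, not as a statement about any particular Flash Player. The function names are invented for the example.

```typescript
// Plain arithmetic: the intent is obvious and the constant is easy to change.
function halveByDivide(n: number): number {
  return Math.trunc(n / 2);
}

// Signed shift: same answer for non-negative integers, but it rounds toward
// negative infinity for negative ones.
function halveBySignedShift(n: number): number {
  return n >> 1;
}

// Unsigned shift: reinterprets the value as an unsigned 32-bit integer first.
function halveByUnsignedShift(n: number): number {
  return n >>> 1;
}

// "Multiply by 2" as a shift silently wraps at 32 bits.
function doubleByShift(n: number): number {
  return n << 1;
}

console.log(halveByDivide(-7));         // -3
console.log(halveBySignedShift(-7));    // -4
console.log(halveByUnsignedShift(-8));  // 2147483644
console.log(doubleByShift(0x40000000)); // -2147483648, while 0x40000000 * 2 is 2147483648
```

All four functions agree with plain multiplication and division for small non-negative integers, which is why the differences tend to surface late and take a long time to track down.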

Some things still work.[LIST]
[*]Saving results that are used in more than one place in a function instead of recalculating them obviously helps, but sometimes it can make the code a little harder to follow.
[*]‘Finding’ a repeatedly called function or referenced variable attached to a class ahead of time and saving it in a local helps in AS2, where every such reference is a sorted binary string lookup, every time. The Flex 2 compiler.optimize setting appears to do some of this for you (it doesn’t always make as much of a difference as you’d expect).
[*]Keeping your class and member names concise and unique at the beginning helps a little. For locals within a function, this doesn’t matter (they’re not searched for by name). “MyBigClassThatDoesUsefulThingsEverywhere” and “MyBigClassThatDoesUsefulThingsSomeplace” will have to be compared nearly all the way through, and if you have a lot of classes/members that start off “MyBigClassThat”, this can tend to add up. Using ‘packages’ can somewhat reduce the amount of clutter that needs to be sorted through in the namespace. Also keep in mind that people who decompile your Flash code will get a big leg-up if your labels are all self-explanatory, but balance that with the need not to be working with gibberish.
[*]Keeping the number of members in classes down helps a little bit, too. Watch the inheritance. It’s always tempting to write a ‘Vector’ class that has a hundred handy little 3D functions in it, but consider that you might end up with an Object carrying a hundred function references in each instance just to store ‘X,Y,Z’. Static functions may be a better fit.
[*]Doing things computationally in fewer logical steps ALWAYS saves a lot.
[*]Choosing better algorithms is naturally a mighty big help, and sort of the same as the last point. A qsort in C almost always outperforms a bubble sort in ASM (the incremental frame-by-frame sort is an exception, but ASM didn’t save much over C there anyway). This has always allowed me to ‘out-perform’ hand-tweaked ASM code with C code, because I could more easily implement better algorithms that ultimately saved time. Most of the ASM code I’ve ever encountered has been very brute-force and readily replaced with more portable and readable C code.
[*]Search hard for native library calls and techniques to do your work for you, but then TEST to make sure they really do save time. Not all of the ‘handy new things’ are written in natively compiled code.[LIST]
[*]BitmapData is super-handy for 2D integer arrays of all kinds, so long as your integer values are less than (1<<24). The highest 8 bits are ‘alpha’, and you generally don’t want alpha operations applied to indexes and numbers.
[*]Assuming the ‘DisplacementMapFilter’ is native (I have not tested this), it could have a lot more interesting uses than the examples already on the web. Perhaps a Super Nintendo-style ‘Mode 7’ emulation that needs only one or two bitmap passes and a DisplacementMapFilter or two.
[*]Rather than build an array of tiles and piece them together with interpreted code and data, paste the tiles onto the stage and let Flash do it. It’s very easy to embed a Flash ‘Movie’ within another one that references your ‘tiles’, and you can easily make far more interesting and robust levels from common parts layered together. You also get your ‘level editor’ done for you: it’s just Flash.
[*]BitmapData.lock() and BitmapData.unlock() are very useful. Rather than maintaining two BitmapDatas to draw on and display, you can let Flash handle that with its internal caching.
[/LIST]
[/LIST]

Any technique can be over-used, and any code can be made into pure drivel by over-applying some techniques. Innermost loops requiring lots of operations need the most attention. Outer loops require far less.
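The BitmapData item in the list above depends on keeping values below (1<<24) so they never spill into the alpha byte. A minimal sketch of that packing follows, in TypeScript with a plain number standing in for a 32-bit pixel. BitmapData itself is not used here, and `packValue`, `unpackValue` and the constants are hypothetical names for illustration only.

```typescript
const MAX_VALUE = 1 << 24;   // 16777216: the first value that no longer fits
const RGB_MASK = 0x00ffffff; // low 24 bits carry the data
const OPAQUE = 0xff000000;   // high 8 bits are alpha, set here to fully opaque

// Store an integer in the low 24 bits of a 32-bit pixel value.
function packValue(value: number): number {
  if (Math.floor(value) !== value || value < 0 || value >= MAX_VALUE) {
    throw new RangeError("value must be an integer in [0, 1<<24)");
  }
  // >>> 0 keeps the result an unsigned 32-bit number; without it the OR
  // comes back negative because bit 31 is set.
  return (OPAQUE | value) >>> 0;
}

// Recover the stored integer by masking off the alpha byte.
function unpackValue(pixel: number): number {
  return pixel & RGB_MASK;
}

const pixel = packValue(0x123456);
console.log(pixel.toString(16));              // ff123456
console.log(unpackValue(pixel).toString(16)); // 123456
```

Forcing the alpha byte to fully opaque is an assumption of this sketch. The point is only that the data lives in the low 24 bits and the top byte is left alone.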

Remember: /comments/ are FREE. Use them. They don’t add one bit to the size of your resulting Flash applet, but they make the source code so much more readable when you need to go back in. You may be surprised how often you go back into code you wrote five or ten years ago, and it doesn’t make much sense anymore. Even one year ago. Do it for yourself.