Wednesday, June 24, 2009


Well, a couple of things have come up recently, so I haven't had much time to work on the cache algorithm or do much coding...

I've accepted a job at Edge of Reality in Austin, TX. Not quite sure what I'll be doing yet, but the crew seems awesome and the work looks very interesting. I start in early July and just found an apartment (yay!), so I'm sort of freaking out trying to get everything together for the move.

Hopefully I'll have some time to do some coding for fun (don't laugh! it's fun for me!) in the next couple of days. :)

Friday, June 12, 2009

Current Project:: Wait-Free Cache Allocator (pt 2)

So this post is a little... no, wait, a lot more directed at programmers. If you're not a programmer... you might want to stop now 'cause the next bit gets fairly scary.

Traditional multithreaded programming uses mutual exclusion (a mutex) or critical sections to guarantee that only one thread at a time is using a piece of memory. A mutex works by saying "I want to be the only one using this piece of memory," and the thread waits until it is the exclusive owner. When done, the thread releases the mutex, allowing other threads to do something with that memory.

Another way to do this is through something called a "spin lock." A mutex uses efficient "no CPU" waits, where the OS puts the entire thread to sleep until the mutex can be obtained. Spin locks, on the other hand, burn CPU time constantly checking a variable until it becomes available.

Lock-free algorithms typically use atomic operations as a replacement for mutexes. An atomic operation lets a program modify a single piece of memory with the guarantee that no other thread can observe it half-changed. Under Windows, one of the functions to do this is InterlockedCompareExchange. These atomic operations can fail if another thread modified the data in the meantime (the value used during the compare-and-exchange has gone stale), so some sort of wait (a spin lock, for example) is used to pass the time until the data can be changed.

Wait-free algorithms are lock-free algorithms, but they're written in such a way that if a wait is ever needed, they can do something else instead. Essentially, they never sit around and wait: they perform some other operation, then come back and try again later. These types of algorithms are extremely complicated to set up. You've got to guarantee that you always have something else to do in case an atomic operation can't happen. Essentially, it's loosely organized chaos.

Another problem with wait-free algorithms is that they're rarely cheap, in either development or execution. So much time and effort goes into making the algorithm "truly" wait-free that it gets very complicated. Not to mention that it takes an incredible amount of time to write one of these correctly.

The cache algorithm I created walks a fine line between wait-free and lock-free. It is a lock-free algorithm in that it guarantees that some progress can be made even if all other threads are suspended (don't ask me, this is someone else's definition). It is sort of a wait-free algorithm in that the steps used in the cache are broken up to be parallelized as much as possible. It is not, however, a true wait-free algorithm, because it uses spin locks to set data and to wait if no work is available. I did it this way for performance, both in execution and development: the algorithm is compact and very efficient, and it causes very few hits on the spin locks.

In the next post, I'll go back to the box analogy and explain how atomic operations can be used in a cached memory allocation scheme.

Current Project:: Wait-Free Cache Allocator (pt 1)

About a week ago, I started looking into block-free, lock-free and wait-free algorithms. I've been having an absolute blast with them and thought I'd post what I've learned and done so far.

(This first post is the English version. I'll get into WAY more detail in later posts.)

Imagine, for a moment, that you need a series of boxes for packing. Every time you need a new box, you have to make it. Work with me here... imagine you've got the skills/facilities to do this...

Method A is to create a new box every time you need one. You stop packing, make a new box, then fill it. It takes you about 1 minute to make each box this way because you have to gather the materials, do the work, then clean up afterwards.

Method B is to create a bunch of boxes whenever you run out, fill them all up, then create a bunch more boxes. This method avoids a lot of the startup and cleanup, and you can cut boxes in groups; all in all, this is way more efficient. This method costs about 0.04 minutes, or 2.4 seconds, per box. A very significant improvement! The only problem with this approach is that you must have only one person packing and creating boxes (read on and you'll understand why).

Method C is to create a bunch of boxes up front, then every time you use a box, do a little work towards making the next set of boxes. Nothing significant, mind you... put a piece of cardboard near the machine. Make one cut in the group of boxes. Maybe pick up a scrap or two off of the floor and throw it away. We also avoid the downtime of making the entire group wait while one person makes boxes. Doing it this way takes 0.08 minutes, or 4.8 seconds, per box. While that's still way more efficient than creating boxes one at a time, it's less efficient than Method B. It's also much more complicated.

The single worker approach:

With only one person doing all of the work, any of these approaches works. Method B is obviously the best, but they all at least work.

The multiple worker approach:

With multiple workers, our problems multiply. In the single worker approach, all we had to worry about was the one person making and packing boxes: work as fast as you can and that's it. With multiple workers, anything that stalls the workers quickly puts huge dents in our productivity. Another problem is that we only have one box-making machine, so only one person can use it at a time.

The problem with Method A is that it will inevitably create a backup at the machine. The more workers you have, the larger the backup at the box machine grows, until your throughput is capped at one box per minute, the time it takes to create a single box. Definitely not ideal.

Method B has an issue in that it wastes a lot of man-hours. When the group runs out of boxes, only one person can make new ones and everyone else has to sit around waiting. This only gets worse the more workers you have: with 10 workers and 0.04 minutes of box-making time per box, every box made during a stall costs you an additional 0.4 man-minutes, or 24 seconds, of idle time. In other words, the more workers you have, the worse your efficiency will be.

Method C looks pretty good. None of the workers ever get stalled, and they all contribute a bit toward making the next set of boxes. This method also has the nice little feature that every time you run out of boxes, the next set is already finished and waiting. As an added bonus, this version doesn't care whether there is one worker or a thousand; it'll work just as well either way.

As you can probably guess, the method I'm working on is C. More info on how, exactly, this works later on.

Thursday, June 11, 2009


Well, the time has come for me to finally accept a job offer. I've got to admit, I've got quite a bit of trepidation in choosing... mostly because I barely know the companies I'm choosing between. The only way for me to really get to know them, however, is to go work there. If it turns out to be a Bad Thing then I'll need to continue the search or grind it out until my first title.

On the other hand, I've met a lot of really great people. Most everyone I've met is really into making a great product and doing the best job they can do. That's a major bonus for me as it reflects on the industry as a whole.

When it comes right down to it, though, I have to make a decision. My criteria are fairly simple: Is the company credible? Do I get along with them and they with me? Do they have smart people I can learn from? Are there other studios around them in case things go south?

Some of the criteria are a bit odd... I am not ranking based on pay (although that obviously has something to do with it...) and I'm not limiting myself much by genre. I know that this is my first game development job, so I can't be too picky. Especially in this economy; I'm just happy to get offers.

I'll post by this weekend who I've decided on... until then, all of us will just wonder.

Tuesday, June 9, 2009

My Specialization

I've been on a lot... no, a LOT of job interviews recently, and one topic comes up again and again: what is my specialty?

It took me a while to figure out what this was. Not because I don't know what I can do, but because I like to do so many things. I sort of stressed over this for a while because I didn't want to pigeonhole myself. To find an answer, I went back to my old projects and looked at what I focused on time after time...

Parallel Processing

Using my gremlin analogy, imagine that your computer contains all sorts of different kinds of gremlins. Some are really small and only work in groups of four. Some are medium sized and work well with others. Still others are really big but prefer to work alone.

My specialty is to get all of the different types of gremlins to work together and communicate very efficiently all at the same time.

What is a computer?

I guess before I can get started in deep coding stuffs I need to create an analogy for what a computer is and how it works. Don't worry, this is for the uninitiated; you don't need to know anything about computers to understand this.

Your computer is a box. Yeah, yeah, I know... difficult stuff here. In fact, it is a "magic" box to most people. It just plows along doing what it does. Most people don't pay any attention to how it works or what it does, they just click the buttons and do what they need to do.

But how does it work?

That's actually a huge topic that very few people can truly answer... so I'll create an analogy that I'll likely reuse a lot in the future.

Imagine that your computer is actually a home for mathematically inclined gremlins. Perhaps smart chupacabras, but we'll go with gremlins because it's easier to type. These gremlins know how to add, subtract, multiply and divide, and they're very good at it. Incredibly good; in fact, they can do billions of these simple math operations per second.

The gremlins have other talents too. They can work in groups or on their own and, most importantly, they can read and follow instructions.

Their drawback, however, is that they will only do what they're told. Nothing more, nothing less. They don't think for themselves and they don't understand the concept of common sense. If we tell them to lick their finger and jam it into a light socket they'll do it without question or hesitation. Even more odd, they'll keep their finger in the light socket until we tell them to take it out! They're cute, but not too smart.

Writing a computer program (programming/coding) is essentially telling the gremlins what to do. Add this, multiply it with this, divide it by this, etc. The gremlins will follow the instructions exactly as the programmer told them to.

Some instructions can cause the gremlins to get confused. If the programmer tells the gremlins to do something and put the result in a box but they fail to give the gremlins a box, bad things can happen. In fact, the gremlins will freak out because they have to put the result in a box but they don't have a box but they have to have a box but, but, but...

This is your traditional "crash." The gremlins have to do something but they can't. The net effect is that they do the only thing they can do: pull the fire alarm.

A programmer's job is to tell the gremlins what to do in such a way that the gremlins can never be confused. The job of the user (You!) is to never have to worry about how the gremlins do what they do.

Computers are STUPID!

Computers are stupid creatures. They do exactly what we tell them to do, even if it's not what we intended, and nothing more. A programmer's job, then, is to take a generalized statement and break it down into individual actions.

Writing code isn't hard. In fact, it's really easy. It's like learning a foreign language that uses just a small subset of English. Admittedly this is much easier if you already speak English...

The hard part is wrapping your brain around every nook and cranny of what's going on. You've got to have both a 10,000 foot view and a 1 inch view of what's going on in order to write good code. What's the overall purpose of the piece you're writing? What's the purpose of this one single line? What's the purpose of this one single character on this line?

A pondering: how to do this...

So I want to post a bunch of stuff about coding (programming), but the majority of people out there, let's face it, don't know anything about it. It's "magic" to them and something they consider way above their heads. With that in mind, when I post, I'll try to:

1) Write in plain English. As Albert Einstein said, "If you can't explain it to a 6 year old, you don't understand it yourself."
2) Give the in-depth code analysis for those that want it.
3) Ask Questions! I love to learn and I like to think I take criticism well so I'll be asking for feedback. This, of course, doesn't mean that you have to comment... I'm just hoping you will. :)

Wow, I'm blogging...

Yep, I never thought this day would come... when I actually start my own blog.

So why am I doing it? No, it's not because I think I have the ultimate view on anything. I just want to share things I've learned and (hopefully) get feedback. :)

We'll see how it goes. It'll probably mostly revolve around coding with the occasional random thought thrown in.