Monday, August 30, 2010

Personal project: in-place list and tree management

Time is a valuable resource. Time spent cannot be regained no matter how hard you try. The only thing you can try to do is be more efficient with your time. Spend it wisely.

Since I started at Obsidian (wow, that was like 6 months ago, time flies!) I've been working non-stop. As a result, some of my personal projects have, shall we say, lagged behind the times. Interestingly enough, it's given me time to gather my thoughts on my own stuff and come up with some interesting new ideas.

One little project I've been toying around with is in-place list and tree management. This is a mundane little project that can actually have some significant impact.

Take a doubly linked list, for example. The difficult and time consuming part of doubly linked lists is getting the insert/remove operations correct. Even if you're an expert, it still takes brain power to trace the code and ensure that it's correct. After that, it's pretty mundane as it's just iteration and allocation of node space. The in-place functions (actually just inline templates) assume that you're providing the memory for the storage and do direct management of the data.

So why, you might ask, is this important or even need addressing? I mean seriously, you could either:
  1. Just write the doubly linked list when you need it and be-done-wif-it!
  2. Use a pre-canned doubly linked list and be-done-wif-it!
Say you go route number 2 and use the pre-canned doubly linked list. It allocates a node, manages next/prev pointers, and is nice and clean and happy from the front end. And then you use it in a memory allocator. Oops! You've now got the memory manager that is supposed to be managing a block of memory allocating. This causes the memory manager to call itself again, which causes a list operation, which calls the memory manager... which makes you pull your hair out as you try to resolve the circular reference.

Ok, screw that. Go with option number 1 so we can avoid the recursion. I need to write a doubly linked list using space that I've provided. Let's see, I'll create a node with next and previous pointers. Now I'll figure out the pointers to manipulate the nodes... ah that's right, three special cases, head, tail and middle... handle those... and crap now it's broke because I missed something stupid.

Ok, no, really, screw all of that. That's a recipe for disaster both from a maintenance perspective as well as from initially writing the code. K.I.S.S. How much time do you want to spend debugging something that isn't the core of your work?

Let's break it down just a touch... Option number 1 is nice in that you've got this reusable doubly linked list that allocates nodes and manages the list. Option number 2 is necessary because there are lots of situations when you don't want to allocate the node. The similarity between the two of these is that the list is managed in some known block of memory.

Easy, write this and have both Option 1 and Option 2 call it with the data to manage!

template <typename NodeType>
inline void insert_after(NodeType *target, NodeType **list)
{
NodeType *position = *list;

if (position)
{
target->next(position->next());
target->prev(position);

if (position->next())
{
position->next()->prev(target);
}

position->next(target);
}
else
{
*list = target;
target->next(0);
target->prev(0);
}
}


Not a groundbreaking idea, I know, but there are so many possibilities here... you can specialize the function for different arguments. For instance, if you know that the node that you're inserting into is not null and that the next node in the list is not null, you can do a branchless version:

template <typename NodeType>
inline void insert_after_ln(NodeType *target, NodeType *list)
{
target->next(list->next());
target->prev(list);
list->next()->prev(target);
list->next(target);
}


But wait, order now and I'll throw in this fancy little tidbit:

You can use this functionality to make a BRANCHLESS doubly linked list. Imagine, a doubly linked list that never hits a conditional. Ever! Ok, so maybe this is only really interesting to me, but on platforms like the PS3 SPU which absolutely *hate* branch prediction misses, this could be a significant gain in performance.

As to how, I'll leave that as an exercise to you... or maybe post it later today, we'll see. :)

No comments:

Post a Comment