Since I started at Obsidian (wow, that was like 6 months ago, time flies!) I've been working non-stop. As a result, some of my personal projects have, shall we say, lagged behind the times. Interestingly enough, it's given me time to gather my thoughts on my own stuff and come up with some interesting new ideas.
One little project I've been toying around with is in-place list and tree management. This is a mundane little project that can actually have some significant impact.
Take a doubly linked list, for example. The difficult and time consuming part of doubly linked lists is getting the insert/remove operations correct. Even if you're an expert, it still takes brain power to trace the code and ensure that it's correct. After that, it's pretty mundane as it's just iteration and allocation of node space. The in-place functions (actually just inline templates) assume that you're providing the memory for the storage and do direct management of the data.
So why, you might ask, is this important or even need addressing? I mean seriously, you could either:
- Just write the doubly linked list when you need it and be-done-wif-it!
- Use a pre-canned doubly linked list and be-done-wif-it!
Ok, screw that. Go with option number 1 so we can avoid the recursion. I need to write a doubly linked list using space that I've provided. Let's see, I'll create a node with next and previous pointers. Now I'll figure out the pointers to manipulate the nodes... ah that's right, three special cases, head, tail and middle... handle those... and crap now it's broke because I missed something stupid.
Ok, no, really, screw all of that. That's a recipe for disaster both from a maintenance perspective as well as from initially writing the code. K.I.S.S. How much time do you want to spend debugging something that isn't the core of your work?
Let's break it down just a touch... Option number 1 is nice in that you've got this reusable doubly linked list that allocates nodes and manages the list. Option number 2 is necessary because there are lots of situations when you don't want to allocate the node. The similarity between the two of these is that the list is managed in some known block of memory.
Easy, write this and have both Option 1 and Option 2 call it with the data to manage!
template <typename NodeType>
inline void insert_after(NodeType *target, NodeType **list)
{
NodeType *position = *list;
if (position)
{
target->next(position->next());
target->prev(position);
if (position->next())
{
position->next()->prev(target);
}
position->next(target);
}
else
{
*list = target;
target->next(0);
target->prev(0);
}
}
Not a groundbreaking idea, I know, but there are so many possibilities here... you can specialize the function for different arguments. For instance, if you know that the node that you're inserting into is not null and that the next node in the list is not null, you can do a branchless version:
template <typename NodeType>
inline void insert_after_ln(NodeType *target, NodeType *list)
{
target->next(list->next());
target->prev(list);
list->next()->prev(target);
list->next(target);
}
But wait, order now and I'll throw in this fancy little tidbit:
You can use this functionality to make a BRANCHLESS doubly linked list. Imagine, a doubly linked list that never hits a conditional. Ever! Ok, so maybe this is only really interesting to me, but on platforms like the PS3 SPU which absolutely *hate* branch prediction misses, this could be a significant gain in performance.
As to how, I'll leave that as an exercise to you... or maybe post it later today, we'll see. :)