Posts tagged ‘arrays’

An interesting data structure

Admittedly, I can’t think of an uncontrived use case when I would prefer this to a hash table, union find, or traditional bit field, but I still think this is a pretty cool concept. It’s a data structure to hold n items, with constant time initialization (i.e. you can’t initialize all n items at the beginning), and constant time random access. Moreover, it can be used as a set: on top of the constant time random access, you can store and remove items as well as get the number of items in it in constant time, and you can iterate through all the inserted items in O(n) time (where n is the number of items currently stored in the set; not the maximum number of items it can have). and on top of that, if you don’t need to call the items’ destructor (for instance, you’re storing doubles or ints), you can empty the set in constant time! The only drawback to using it as a set is that it takes memory proportional to the size of the keyspace (for instance, if you’re trying to store items indexed by b-bit numbers, it will take O(2^b) space). However, it can be used as a bit vector: to store u bits, it takes a mere O(u) space, and you don’t need to zero all the bits out at the beginning! This implies that if you only end up using a sparse subset of the bits, you don’t have to waste time initializing all the bits you didn’t use. To be fair, the constant hidden behind that O(u) space is bigger than the one in a normal bit vector, but I think it’s still a pretty neat idea. Besides, space is cheap these days; people already often use a byte or four for every bit needed.

Anywho, here’s a great write-up of it, along with the original 1993 paper that introduces it (see the top of page 4 to compare the running times of this to a traditional bit field). On top of what is written in those links, the space required can be cut in half by only having a single array instead of two: just have it point to other parts of itself. Insertion/deletion gets a little more complicated, but it totally works. Neat!