The Revolving HashSet

A project I was working on had a fairly unique storage requirement: I needed a storage medium in which I could query its contents VERY quickly with Contains(), even when the set was relatively large; and when it reached a set size, it should start replacing its oldest values with incoming values – essentially a queue where the oldest values get trimmed once the queue reaches a certain size. Lets take a look at the solution achieved.

The first requirement is easily satisfied by a HashSet. However the second requirement exceeds the HashSet’s capabilities – it implies some sort of indexing is available on the collection; which HashSet does not (and was never intended to) provide.

What I ended up creating was an extended HashSet, utilising a dictionary to track the hashset values. The dictionary is used to facilitate the ‘revolving queue’ type behaviour of the collection. When the size of the collection reaches a set maximumBufferSize (supplied upon creation) it will start trimming the oldest values of the HashSet and inserting the newer values incoming.

Code Snippet
  1. internal class RevolvingHashSet<T> : HashSet<T>
  2.     {
  3.         private Dictionary<int, T> _keys;
  4.         private int _maximumBufferSize;
  5.         private bool _revolve;
  6.         private int _insertionIndex;
  7.  
  8.         public RevolvingHashSet(int maximumBufferSize)
  9.         {
  10.             _maximumBufferSize = maximumBufferSize;
  11.             _keys = new Dictionary<int, T>();
  12.             _revolve = false;
  13.             _insertionIndex = 0;
  14.         }
  15.  
  16.         public new bool Add(T value)
  17.         {
  18.             if (_revolve)
  19.             {
  20.                 this.Remove(_keys[_insertionIndex]);
  21.             }
  22.  
  23.             base.Add(value);
  24.             _keys[_insertionIndex] = value;
  25.             _insertionIndex++;
  26.  
  27.             if (_insertionIndex == _maximumBufferSize)
  28.             {
  29.                 _revolve = true;
  30.                 _insertionIndex = 0;
  31.             }
  32.  
  33.             return true;
  34.         }
  35.     }

So we have introduced a little overhead when adding values (Although not much I have found through empirical testing), but achieved the main goals set out for the collection. Enjoy.

Print | posted on Friday, 22 July 2011 2:39 PM

Feedback

No comments posted yet.
Title  
Name
Email (never displayed)
Url
Comments   
Please add 4 and 7 and type the answer here: