A project I was working on had a fairly unique storage requirement: I needed a storage medium in which I could query its contents VERY quickly with Contains(), even when the set was relatively large; and when it reached a set size, it should start replacing its oldest values with incoming values – essentially a queue where the oldest values get trimmed once the queue reaches a certain size. Lets take a look at the solution achieved.
The first requirement is easily satisfied by a HashSet. However the second requirement exceeds the HashSet’s capabilities – it implies some sort of indexing is available on the collection; which HashSet does not (and was never intended to) provide.
What I ended up creating was an extended HashSet, utilising a dictionary to track the hashset values. The dictionary is used to facilitate the ‘revolving queue’ type behaviour of the collection. When the size of the collection reaches a set maximumBufferSize (supplied upon creation) it will start trimming the oldest values of the HashSet and inserting the newer values incoming.
Code Snippet
- internal class RevolvingHashSet<T> : HashSet<T>
- {
- private Dictionary<int, T> _keys;
- private int _maximumBufferSize;
- private bool _revolve;
- private int _insertionIndex;
-
- public RevolvingHashSet(int maximumBufferSize)
- {
- _maximumBufferSize = maximumBufferSize;
- _keys = new Dictionary<int, T>();
- _revolve = false;
- _insertionIndex = 0;
- }
-
- public new bool Add(T value)
- {
- if (_revolve)
- {
- this.Remove(_keys[_insertionIndex]);
- }
-
- base.Add(value);
- _keys[_insertionIndex] = value;
- _insertionIndex++;
-
- if (_insertionIndex == _maximumBufferSize)
- {
- _revolve = true;
- _insertionIndex = 0;
- }
-
- return true;
- }
- }
So we have introduced a little overhead when adding values (Although not much I have found through empirical testing), but achieved the main goals set out for the collection. Enjoy.
Print | posted on Friday, 22 July 2011 2:39 PM