Wednesday, May 11, 2016

Impress your colleagues with your knowledge about… HashSet

Sometimes when working with .NET you discover some hidden gems. Some of them are very useful; for others it is a little harder to find a good way to benefit from their functionality.

.NET offers a lot of collection types out-of-the-box. Most developers just use List<T> without much thought. However, there are a lot of (better?) alternatives available. One of the collection types that doesn’t receive a lot of love is HashSet<T>.

What makes a HashSet<T> different from a regular List<T>?

HashSet<T> is an unordered collection of unique elements. It offers the standard collection operations Add, Remove, and Contains, and because it uses a hash-based implementation, these operations cost O(1) on average. (Compare this to List<T>, where Contains and Remove cost O(n).) This means that, no matter how many elements a HashSet contains, checking whether a given element is present takes roughly the same amount of time. HashSet also provides the standard set operations such as union, intersection, and symmetric difference.
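As a minimal sketch of the operations described above (the element values and variable names are made up for illustration), the following uses only members of System.Collections.Generic.HashSet<T>. Note that the set operations modify the set they are called on, so the sketch works on copies:

```csharp
using System;
using System.Collections.Generic;

// Illustrative data; any type with sensible Equals/GetHashCode would do.
var languages = new HashSet<string> { "C#", "F#", "VB" };

// Add returns false when the element is already present, so duplicates are ignored.
bool addedNew = languages.Add("Rust");
bool addedDuplicate = languages.Add("C#");

// Contains and Remove are hash lookups rather than linear scans.
bool hasFSharp = languages.Contains("F#");
bool removedVb = languages.Remove("VB");

Console.WriteLine($"added new: {addedNew}, added duplicate: {addedDuplicate}");
Console.WriteLine($"contains F#: {hasFSharp}, removed VB: {removedVb}");

// UnionWith, IntersectWith and SymmetricExceptWith change the set in place,
// so each operation below starts from a fresh copy of 'languages'.
var other = new HashSet<string> { "F#", "Go" };

var union = new HashSet<string>(languages);
union.UnionWith(other);                          // elements in either set

var intersection = new HashSet<string>(languages);
intersection.IntersectWith(other);               // elements in both sets

var symmetricDifference = new HashSet<string>(languages);
symmetricDifference.SymmetricExceptWith(other);  // elements in exactly one set

Console.WriteLine($"union: {union.Count}, intersection: {intersection.Count}, " +
                  $"symmetric difference: {symmetricDifference.Count}");
```

The Boolean return value of Add is a convenient way to detect duplicates without a separate Contains call.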

Most programming languages have their own (Hash)Set implementation. The HashSet class in C# does not preserve the order of elements. The hash-based storage that makes lookups much faster than in a regular List also means there is no access by index. To access elements, you can either use an enumerator or convert the HashSet into a List (for example, with LINQ's ToList()) and iterate through that.
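The two access patterns mentioned above can be sketched as follows. The values are made up for illustration, and the sort step is an addition of this sketch, used only to give the resulting List a defined order:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative data.
var ids = new HashSet<int> { 42, 7, 19 };

// foreach uses the set's enumerator; do not rely on the order
// in which the elements come back.
int sum = 0;
foreach (int id in ids)
{
    sum += id;
}
Console.WriteLine($"sum: {sum}");

// ids[0] would not compile: HashSet<T> has no indexer.
// Copy to a List<T> when positional access is needed; sorting first
// (an assumption of this sketch) makes the positions predictable.
List<int> sorted = ids.OrderBy(id => id).ToList();
Console.WriteLine($"smallest: {sorted[0]}");
```

Keep in mind that ToList() copies every element, so it is best done once rather than inside a loop.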

Here are some performance benchmarks:
