Collections & Generics with the .NET framework
Lesson 1: Collections
Name | Description |
---|---|
ArrayList | Resizeable index based collection |
SortedList | Sorted collection of name / value pairs |
Queue | FIFO collection |
Stack | LIFO collection |
Hashtable | Name / value collection that allows retrieval by name or index |
BitArray | Compact collection of boolean values |
StringCollection | Resizeable collection of strings |
StringDictionary | Name / value pairs of strings, retrieval by name or value |
ListDictionary | Stores small lists of objects |
HybridDictionary | Uses ListDictionary when number of items in collection is small and migrates to Hashtable for large collections |
NamedValueCollection | Name / Value pairs of strings that allow retrieval by name or index |
Adding / removing items
ArrayList supports two addition methods - Add and AddRange, former for single object latter for collection. Can add objects that exist as variables or are created inline to the Add method. Even value types (e.g. Number 50) can be stored in collections, but must be wrapped in object reference (boxing). The AddRange method adds range of items, usually from array or another collection, from object that supports ICollection interface.
The Add and AddRange methods add to end of collection.
Insert and InsertRange place at specific location in collection.
Can use indexer to set specific object in collection, in C# this is represented as []
Using indexer is not same as calling Insert as it sets the item at the specified location by overwriting the old object at that position.
Three methods for removal - Remove, RemoveAt and RemoveRange. The Remove method deletes a specific object, e.g.
coll.Add("Hello");
coll.Remove("Hello");
The Remove method does not indicate if removal was successful.
RemoveAt removes item at specified index, whilst RemoveRange moves objects as a range of indexes.
Clear will empty a collection of all its items.
IndexOf will return the index of a specific object
Contains will test if a particular object exists in a collection.
Iteration
Supports numeric indexer allowing items to be accessed in order.
Supports IEnumerable interface that provides GetEnumerator to access the IEnumerator interface for the enumerator object of the collection. The IEnumerator interface provides simple mechanism for iterating in forwards direction, supports Current property (gets current item in collection being enumerated), MoveNext methods which moves to next item in collection (return value indicates if end of collection reached) and Reset method that moves enumerator to before the first item in collection (allowing MoveNext to get the first item in collection), e.g.
IEnumerator enum = coll.GetEnumerator();
while (enumerator.MoveNext())
{
Console.WriteLine(enumerator.Current);
}
The same can be achieved using the foreach construct...
foreach(object item in coll)
{
Console.Writeline(item);
}
If know that all types in collection are same then can specify type in foreach construct, e.g. foreach(string item in coll). Note if any non strings encountered then the framework would throw a casting exception.
Consistent interfaces
IEnumerable provides common way to iterate over collection.
Collections should also implement ICollection (which derives from IEnumerable) that provides common way to access and copy collection.
Supported properties:
- Count - number items in collection
- IsSynchronized - indicates in collection is thread safe
- SyncRoot - provide object that synchronizes collection
Supported methods:
- CopyTo - copies contents of collection to array
For simple collections should support IList (that derives from ICollection).
Supported properties:
- IsFixedSize - can collection be resized
- IsReadOnly - can collection be changed
- Item - gets / sets item at specific index in collection
Supported methods:
- Add
- Clear
- Contains
- IndexOf
- Insert
- Remove
- RemoveAt
Sorting
ArrayList supports Sort method that works by using Comparer class. The Compare class is a default implementation of IComparer. The Sort method also allows caller to specify their own IComparer object which performs custom comparisons (e.g. Case sensitive compare).
Writing own comparer is straightforward. Only have to implement Compare method of IComparer, e.g.
public class DescendingComparer : Icomparer
{
CaseInsensitiveComparer _comparer = new CaseInsensitiveComparer();
public int Compare(object x, object y)
{
return _comparer.Compare(y,x);
}
}
Lesson 2: Sequential Lists
Collection of objects to be dealt with in orderly way. Having access to items in middle of list is unlikely to be required. Instead of using ArrayList, consider Queue or Stack.
Queues
FIFO handling of objects.
Interface supports putting items into queue and pulling them out.
In ArrayList accessing and removing items from collection were different operations, in queue they are combined into Dequeue method.
Use Enqueue to add items to queue.
Permits addition of duplicate items and null values - can't use result of Dequeue or Peek to determine if queue empty, instead check Count property.
Stacks
LIFO handling of objects.
Interface supports pushing items onto stack and popping them off.
Similar to queue, but instead of enqueueing and dequeuing objects are being pushed and popped.
Use Push to add items.
As with queue can not rely on result of Pop or Peek, instead check Count property.
Lesson 3: Dictionaries
Usage
Store lists of key / value pairs.
Most basic form is Hashtable.
Dictionaries always require tow pieces of information for an add operation - a key and a value, e.g.
emailLookup.Add("[email protected]", "Bishop, Scott");
or
emailLookup["[email protected]"] = "Bishop, Scott";
To retrieve item call indexer with key required, e.g.
Console.WriteLine(emailLookup["[email protected]"]);
Dictionaries designed for looking up key / value pairs - thus iterating is more complicated. Each element in dictionary is a DictionaryEntry object which is a container for a Key and Value pair. This to iterate over values in a dictionary perform the following:
foreach(DictionaryEntry entry in emailLookup)
{
Console.WriteLine(entry.Value);
}
All dictionary classes support the IDictionary interface (derived form ICollection) that provides, amongst others, the Keys property that gets an ICollection of the keys in the collection and Values that provides an ICollection of the values in the collection.
IDictionary somewhat like IList, but does not provide access by index, only key. Supports two methods to test for existence of key (ContainsKey) or value (ContainsValue).
Equality
Hashtable uses integer value (a hash) to aid in key storage. Every object in framework derives from Object and so supports GetHashCode method. To test for equality the Hashtable checks that hash value of the objects presented. Hashtable only supports unique hashes of values- not unique values. Consequently the following code will only store one item (i.e. the second entry overwrites the first as the hash of "First" is the same.
duplicates["First"] = "1st";
duplicates["First"] = "the first";
The Hashtable does not rely on the hash value alone for equality tests. If it finds two objects with the same hash it will go on to test that the two objects are equal (by calling the Equals method implemented by Object). If these two tests pass then the keys are identical.
May need to override the implementations of GetHashCode and Equals to get desired behaviour - e.g. Basing generation of hash on a user defined name, etc.
IEqualityComparer interface
Provides equality test outside of class, e.g. Want to store string keys in Hashtable, but ignore case. Changing string class to be case insensitive would be painful, therefore use external class to calculate equality. Hashtable class can accept instance of IEqualityComparer as argument.
Supports two methods: GetHashCode and Equals.
public class InsensitiveComparer : IEqualityComparer
{
CaseInsensitiveComparer _comparer = new CaseInsensitiveComparer();
public int GetHashCode(object obj)
{
return obj.ToString.ToLowerInvariant().GetHashCode();
}
public new bool Equals(object x, object y)
{
if (_comparer.Compare(x,y) == 0) return true;
return false;
}
}
SortedList
A dictionary class that shares some behaviour with list. Can access items within SortedList in order, e.g.
sort["first"] = "1st";
sort["second"] = "2nd";
sort["third"] = "3rd";
sort["fourth"] = "4th";
foreach(DictionaryEntry entry in sort)
{
Console.WriteLine("{0} {1}", entry.Key, entry.Value);
}
generates...
first 1st
fourth 4th
second 2nd
third 3rd
i.e. the list is sorted by the key
SortedList supports additional methods that allow key and value access by index number, e.g. GetByIndex, IndexOfKey and IndexOfValue.
Specialised Dictionaries
ListDictionary
Hashtable is efficient, but requires overhead and so is not good for small collections (less than 10 items). ListDictionary has same interface as Hashtable, but is implemented internally as an array.
HybridDictionary
Using ListDictionary for larger collections is not good. The HybridDictionary is implemented as a ListDictionary until the list becomes too large in which case it converts itself to a Hashtable.
OrderedDictionary
Provides functionality of Hashtable but with ability to control order of elements. Could use SortedList to achieve same, but don't always want to sort on the order provided by the keys. Extends the Hashtable interface to provide
Item property that supports access by index and Insert and RemoveAt methods that allow key / value pairs at specified indexes to be added / removed.
Lesson 4: Specialized Collections
Working With Bits
Two classes simplify bit operations - BitArray and BitVector32.
BitArray
Resizeable collection. When create must specify it size. This can be changed later via the Length property. Does not support Add or Remove methods - missing because each entry can only be true or false, so idea of adding / removing does not really apply. Once initialised collection of boolean values all set to false. Power of BitArray resides in ability to perform boolean operations on two BitArrays of same size, e.g.
BitArray bits = new BitArray(3);
bits[1] = true;
BitArray moreBits = new BitArray(3);
moreBits[0] = true;
BitArray xorBits = bits.Xor(moreBits);
BitVector32
Useful for managing individual bits in larger number. Data stored in 3-bit integer. All operations change the value of the integer held within the structure. At any time can get value of integer via Data property.
Create masks in sequential order by calling static CreateMask method. Calling without any parameters creates mask for first bit in structure, calling again with integer value (representing previous mask created) will provide a mask for the next bit, e.g.
int firstBit = BitVectore32.CreateMask();
int secondBit = BitVector32.CreateMask(firstBit);
int thirdBit = BitVector32.CreateMask(thirdBit);
BitVector32 vector = new BitVector32(0); // Init to all false
vector[firstBit] = true;
vector[thirdBit] = true;
Besides dealing with individual bits, also supports bit packing. Bit packing is the process of taking several small numbers and packing them into one large number - used to decrease storage requirements.
BitVector32 allows sections to be created within structure to store numbers up to certain values, e.g. To store maximum value of 10, maximum value of 50 and maximum value of 500 use the following:
BitVector32.Section firstSection = BitVector32.CreateSection(10);
BitVector32.Section secondSection = BitVector32.CreateSection(50, firstSection);
BitVector32.Section thirdSection = BitVector32.CreateSection(500, secondSection);
Like CreateMask the CreateSection uses the last section to determine where to pack the new number. Can gain access using indexer as follows:
packedBits[firstSection] = 10;
packedBits[thirdtSection] = 192;
Can use Data property to gain access to underlying number that contains the three packed numbers.
Collecting Strings
StringCollection
Dynamically sized collection that only stores strings. Working with it is virtually identical to ArrayList. Adding a non-string generates a compilation error. When retrieving a value the code is working with strings (not objects) - reduces the number of casts required.
StringDictionary
Strongly typed dictionary. Use like Hashtable, except both key and value must be string. Keys are case-insensitive, i.e. "Four" and "four" are the same.
Case-insensitive Collections
Previously case insensitive collections were produced by using the IComparer and IEqualityComparer interfaces. This is a common requirement and so the CollectionsUtil class provides static methods to create case insensitive Hashtables and SortedLists.
Culture-Invariant Collections
Default collection behaviour is to use current threads culture. Sometimes need to do comparisons without regard to current culture, e.g. Web applications or applications that store information across cultures. Cannot be created via CollectionsUtil class, instead specify collection with a StringComparer using the comparison rules of the invariant culture, e.g.
Hashtable hash = new Hashtable(StringComparer.InvariantCulture);
NameValueCollection
Similar to StringDictionary, but allows multiple values per key and retrieval by index as well as key. To retrieve all values for a key use the GetValues method, e.g.
NamsValueCollection nv = new NameValueCollection();
nv.Add("key", "value 1");
nv.Add("key", "value 2");
foreach (string s in nv.GetValues("key"))
Console.WriteLine(s);
Note, adding values via the indexer will cause them to be overwritten, i.e. The following code will only have 1 value ("first") in nv:
nv["First"] = "1st";
nv["First"] = "first";
Can access values by index, if more than one value is associated with an index then they are returned as a comma-delimited list, e,.g.
nv.Add("First", "1st");
nv.Add("Second", "2nd");
nv.Add("Second, "not first");
for(int x=0; x < nv.Count; x++) Console.WriteLine(nv[x]);
generates...
1st
2nd, not first
Lesson 5: Generic Collections
Programming is about solving problems - often the same solution is common to lots of situations, e.g. The need to collect ordered list of items. .NET offers ArrayList to solve this problem as it stores lists of objects, but this introduces problems. For example can store list of integers in ArrayList. Consumer of ArrayList assumes only integers are present in the list - but there is nothing to stop non-integers being placed into the ArrayList. When the consumer code casts the object from the ArrayList to an integer a run-time error will occur. Could write wrapper class around ArrayList that only accepts integer values, but this is a lot of work - especially when need to implement similar class for other data types.
Generic types are types that take other type names to define them. For example instead of creating collection strongly typed to a specific type (as in the example before), can write a collection that can use any type, e.g.
public class MyList<T&> : ICollection, IEnumerable
{
private ArrayList _innerList = new ArrayList();
public void Add(T val)
{
_innerList.Add(val);
}
public T this[int index]
{
get { return (T)_innerList[index];}
}
}
Instead of making it a collection of integers, the code uses a generic type parameter T. Every place that would refer to an integer now refers to T. Instantiate as follows...
MyList<int> myIntList = new MyList<int>;
MyList<string> myStringList = new MyList<string>;
Generics used in many places within framework, but most frequently see in collection classes.
Generic classes exist for most of the non-generic collections present in .NET 1.0
Type | Generic Type |
---|---|
ArrayList | List<> |
Queue | Queue<> |
Stack | Stack<> |
Hashtable | Dictionary<> |
SortedList | SortedList<> |
ListDictionary | Dictionary<> |
HybridDictionary<> | Dictionary<> |
OrderedDictionary | Dictionary<> |
SortedDictionary | SortedDictionary<> |
NameValueCollection | Dictionary<> |
DictionaryEntry | KeyValuePair<> |
StringCollection | List<String> |
StringDictionary | Dictionary<String> |
N/A | LinkedList<> |
Generic List
Simple, type safe ordered list of objects
Use Add to add items to list, items must be same type as that specified in the generic type parameter of the list.
Can use indexer to access list items
Can use foreach syntax to iterate over list contents.
The Sort method supports a generic delegate. A generic delegate uses generic parameters to define the calling convention of the delegate. For example the generic Comparison delegate for the Sort method is defined as:
public delegate int Comparison<T>(T x, T y);
To sort list in reverse order could write entire Comparer class, or just a method that matches the generic delegate...
static int ReverseIntComparison(int x, int y){ return y - x;}
Note the method is not generic, but it matches the generic Comparison delegate - the List in this example is composed of integers so the Comparison must use integers. To invoke the sort issue the following command...
intList.Sort(ReverseIntComparison);
Generic Queue and Stack
When adding items (via Enqueue or Push) the type of the item must match the type specified in the generic type parameter.
Items retrieved (via Dequeue or Pop) will be of the type specified in the generic type parameter.
Generic Dictionary
Closely resembles Hashtable, ListDictionary and HybridDictionary.
Stores key / value pair in a collection - need to specify two generic type parameters when creating an instance of the Dictionary class - one for the key, the other the value.
Can use indexer syntax to access the entries, but types used must match the generic type parameters passed to the constructor of the dictionary.
When retrieving entries they are presented as a KeyValuePair, not a DictionaryEntry as in the non-generic Dictionary.
The KeyValuePair class takes two types (like the Dictionary) - one for the key and the other the value. Ordinarily instances of this type are not created directly, instead they are returned from the generic dictionary class, e.g. when iterating
foreach(KeyValuePair<int, string> entry in dict)
{
Console.WriteLine("{0} = {1}", entry.Key, entry.Value);
}
Generic SortedList and SortedDictionary
Similar to generic dictionary except items sorted by the key of the collection.
Generic LinkedList
A LinkedList is a set of items that are linked to each other. From each item can navigate to the next or previous item without having to access the collection itself. Very useful when passing objects around that must known about their siblings. Important properties include:
- Count - number of nodes in LinkedList
- First - first node in LinkedList
- Last - last node in LinkedList
Important methods include:
- AddAfter - add node after existing node in list
- AddBefore - add node before existing node in list
- AddFirst - add node at beginning of list
- AddLast - add node at end of list
- Find - find first node containing specified value
- FindLast - find last node containing specified value
- Remove - remove first node containing specified value from list
- RemoveFirst - remove first node from list
- RemoveLast - remove last node from list
LinkedList contains collection of LinkedListNode objects. When working with list primarily getting and walking down the LinkedListNodes. Important properties include:
- List - get the LinkedList the node belongs to
- Next - the next node in the LinkedList
- Previous - the previous node in the LinkedList
- Value - the value of the node
The LinkedList enumerator does not use LinkedListNode objects - unlike the behaviour of the generic Dictionary class (which returns a KeyValuePair object). The difference is because the LinkedListNode object can be used to walk the list of items, but only one piece of data is held in each node. Therefore there is no need to return the nodes during enumeration - only their values.
Generic Collection Class Structure
The generic collections work in similar ways in many areas. These common areas include the use of generic collection interfaces, enumerators and comparisons.
Interfaces
Besides implementing the standard IEnumerable, ICollection, IList, interfaces etc., the generic classes implement generic variations that provide type safe interface methods, e.g.
IList theList = (IList)) stringList;
object firstItem = theList[0];
Ilist<string> typeSafeList = (Ilist<string>) stringList;
String firstString = typeSafeList[0];
Enumerators
To facilitate iteration each collection supports generic nested Enumerator structure, e.g.
List<string>.Enumerator e = stringList.GetEnumerator();
while (e.MoveNext()) { string s = e.Current; }
Comparisons
In cases where need to write own implementations of the IComparer and IEqualityComparer there are generic base classes that do much of the required work. Implement any abstract methods and override default behaviour as appropriate, e.g.:
class MyComparer<T> : Comparere<T>
{
public override int Compare(T x, T y)
{
return x.GetHashCode() - y.GetHashCode();
}
}