Archive for the ‘Development’ Category

Ordered Dictionaries, Keyed Lists and .NET : A review

by Alex Mekelburg

(close)


See other posts by Alex Mekelburg
Friday, April 4th, 2008

As far as .NET generic collections go, you’ve pretty much got two basic types. There’s List<T>, which gives you an array-style type-safe collection of objects. And then there’s Dictionary<tkey, tval> (and a number of other related types), which you can use for hash-style type-safe collections. But what if you want a collection of objects that you access by key and index? Or, more subtly, take this example:

class Thing {

  private string txt;

  public Thing(string text)
  {
    txt = text;
  }

  public override string  ToString()
  {
    return txt;
  }
}

class Stuff : Thing
{
  public Stuff(string text) :  base(text) { }
}

class Program
{
  static void Main(string [] args)
  {
    Dictionary<Stuff, Thing> stuffAndThings = new Dictionary<Stuff, Thing>();
    stuffAndThings.Add(new Stuff("1: "), new Thing("One Thing"));
    stuffAndThings.Add(new Stuff("2: "), new Thing("Two Thing"));
    stuffAndThings.Add(new Stuff("3: "), new Thing("Red Thing"));
    stuffAndThings.Add(new Stuff("4: "), new Thing("Blue Thing"));

    foreach (KeyValuePair<Stuff, Thing> pair in stuffAndThings)
    {
      Console.WriteLine(pair.Key.ToString() + pair.Value.ToString());
    }
}

Which Outputs:

1: One Thing
2: Two Thing
3: Red Thing
4: Blue Thing

Right? Well, maybe not…
In all the MSDN descriptions of .NET Generic hash-type collections we see this perfectly clear little statement “The order in which the items are returned is undefined.” Fantastic. Although I’ve never seen items returned in any order other than the one they were entered, I guess we can’t rely on it. If anyone has any details of how/why/when items will be returned in any other order, I’d love to hear it.

So, now we’re looking for a class that will store key-value pairs that’s accessible by index (which also implies it will return items in the order entered). There is the System.Collections.Specialized.NameValueCollection class, which does exactly this, but only for string-string pairs. Unfortunately there is no .NET framework class to do this generically. The closest thing is the KeyedCollection<tkey, tval> class, which is abstract (and has some other interesting qualities that we’ll look at later), and the IOrderedDictionary class, which, as you can probably guess, is just an interface. So it looks like we’re up to writing our own class. Or search around and see if someone’s already done it for us (of course they have!).

After a bit of searching I found a couple implementations that do the job nicely, and discuss some of the performance plusses and minuses.

MarcClifton.com – KeyedList<tkey, tval>
CommonGenius.com – OrderedDictionary<tkey, tval>

It turns out that these two implementations are pretty similar.

The KeyedList class implements IDictionary<TKey, TVal> and IList<KeyValuePair<TKey, TVal>>.
The OrderedDictionary class implements IOrderedDictionary and IDictionary<TKey, TVal>.
Both classes have an internal Dictionary<Tkey, TVal> field and List<KeyValuePair<TKey, TVal>> field for storing all the data. This means that all the data stored by this class is doubled (or at least the number of references to each piece of data is doubled). Additionally, all of the add and remove actions need to be applied to each internal object, so the time for these operations is increased. As David Nelson (the OrderedDictionary<tkey, tval> author) points out, “OrderedDictionary should only be used when insert/delete operations will be minimal, AND there is a need to efficiently access elements by index and/or key.”

It feels like there should be an easier or better way to do this. Or at least somehow not have to store two separate datasets. One thing to note is that in these implementations, all the necessary data is stored, in order, in a List<KeyValuePair<TKey, TVal>>. So a very simple option would be to just use this list of pairs. For exmaple:

List<KeyValuePair<Stuff, Things>> stuffAndThings =
          new List<KeyValuePair<Stuff, Things>>

Or if you want a new type for it:

public class PairList<TKey, TVal> : List<KeyValuePair<TKey, TVal>

That’s it! Now you have a “leaner” implementation of ordered key-value pairs. But you can’t look things up by key. And there’s nothing to validate that the keys are all unique. And you have to use it like this:

PairList<Stuff, Things> stuffAndThings = new PairList<Stuff, Things>
stuffAndThings.Add(new KeyValuePair<Stuff, Things>("stuff","things"));

Ew. This isn’t at all what we were talking about, but now is a good time to really think about what you actually need to do. For our initial example (putting together a list of key-value pairs and getting them out in order) this does just fine. And you can solve the ugly ‘Add’ problem easily by creating a new Add method:

public void Add(TKey key, TValue value)
{
  base.Add(new KeyValuePair<TKey,TValue>(key, value));
}

Not bad.
But if we want to add the ability to look up by key, we have to do something like this:

public List<TKey> Keys
{
  get
  {
    List<TKey> keys = new List<TKey>();
    foreach(KeyValuePair<TKey, TValue> kvp in this)
    {
       keys.Add(kvp.Key);
    }
    return keys;
  }
}

public TValue this[TKey key]
{
  get
  {
    return this[Keys.IndexOf(key)].Value;
  }

  set
  {
    this[Keys.IndexOf(key)] = new KeyValuePair<TKey, TValue>(key, value);
  }
}

Now it’s getting ugly. Especially because every time you access by key you have to generate a new list of keys. Super slow. It definitely looks like this is not the road to take for a true generic “OrderedDictionary” kind of functionality.

So lets look back at this abstract KeyedCollection<tkey, tval> class.
If you derive this class you have to implement:

protected override tkey GetKeyForItem(tval  item)

Basically this means that it expects you to be using objects where the key is stored as part of the object (or can be systematically derived from it). This seems annoying at first, but might be a really good idea – you always have the object’s key value any time you need to access the object.

In particular, we can genericize things a bit to make it so we don’t have to create separate concrete types for every kind of thing we want to store:

public interface IHasKey<TKey>
{
  TKey Key
  {
    get ;
    set ;
  }
}

public class KeyedObjectCollection<TKey, TVal> :
   System.Collections.ObjectModel.KeyedCollection<TKey, TVal>
   where TVal : IHasKey <TKey>
{
  protected override TKey GetKeyForItem(TVal item)
  {
    return item.Key;
  }

  public void Add(TKey key, TVal item)
  {
    item.Key = key;
    base.Add(item);
  }
}

Lets try this out with our stuffAndThings example:

class KeyedThing: Thing, IHasKey< string>
{
  public string key;

  public KeyedThing(string keyText, string valueText)
    : base(valueText)
  {
    key = keyText;
  }

  public string Key
  {
    get { return  key; }
    set { key = value ; }
  }
}

class Program
{
  static void Main(string [] args)
  {
    KeyedObjectCollection<string, KeyedThing> stuffAndThingsTwo =
         new KeyedObjectCollection<string, KeyedThing>();
    stuffAndThingsTwo.Add( new KeyedThing ("1: ", "One Thing"));
    stuffAndThingsTwo.Add( new KeyedThing ("2: ", "Two Thing"));
    stuffAndThingsTwo.Add( new KeyedThing ("3: ", "Red Thing"));
    stuffAndThingsTwo.Add( new KeyedThing ("4: ", "Blue Thing"));

    //using the for instead of foreach to show that the items
    //_are_ available by index
    for (int i = 0; i < stuffAndThingsTwo.Count; i++)
    {
      Console.WriteLine(stuffAndThingsTwo[i].Key +
         stuffAndThingsTwo[i].ToString());
    }
  }
}

Not bad…except we have to declare the key type for the KeyedObjectCollection as string, even though the KeyedThing class inherently require’s its key type to be string. That gets a bit redundant.

It turns out that’s fixable by being even more generic:

public class KeyedCollection<tkey, tval> :
       System.Collections.ObjectModel.KeyedCollection<tkey, KeyValuePair<tkey, tval>>
{
  protected override tkey GetKeyForItem(KeyValuePair<tkey, tval> item)
  {
    return item.Key;
  }

  public void Add(TKey key, TVal item)
  {
    item.Key = key;
    base.Add(item);
  }
}

Now when you access the collection by key, it will return a KeyValuePair. This will probably be OK, but it also means you have to add everything by KeyValuePair, so there are a few extra methods we may want to add:

public ICollection<tkey> Keys
{
  get { return base.Dictionary.Keys; }
}

//This one's particularly ugly.
public ICollection<tval> Values
{
  get
  {
    ICollection <tval> values =  new List<tval>();
    foreach(KeyValuePair<tkey, tval> kvp in base.Dictionary.Values)
    {
      values.Add(kvp.Value);
    }
    return values;
  }
}

public void Add(tkey key, tval val)
{  base.Add( new KeyValuePair<tkey, tval>(key, val)); }

public void Insert(int index, tkey key, tval val)
{  base.Insert(index, new KeyValuePair<tkey, tval>(key, val)); }

public int IndexOf(tkey key)
{
  for (int i = 0; i < base.Count; i++)
  {
    if (this[i].Key.Equals(key))
        return  i;
  }
  return -1;
}

Here we see that the base abstract KeyedCollection class contains internal Dictionary (base.Dictionary) and internal List (base.Items) objects. It’s looking quite familiar to what we were doing with the OrderedDictionary and KeyedList, but the Dictionary in the KeyedCollection class is storing Keys and KeyValuePairs instead of just Keys and Values, which is not ideal.

However, there’s a lot less code and room for error. And, for better or worse, it’s using a lot more functionality provided in the .NET framework class library. Why might this be good?

Looking at the OrderedDictionary implementation, there’s this property:

private List<KeyValuePair<TKey, TValue>> List
{
  get
  {
    //_list is an internal field
    if  (null == List) //oops! Looks like we've got an infinite loop
    {
      _list = new List<KeyValuePair <TKey,TValue>>(_initialCapacity);
    }
    return  _list;
  }
}

Obviously it’s supposed to be if(null == _list) so it’s just a typo. But it’s one that doesn’t get caught until you actually run some code that calls the right methods on the object. This wouldn’t have happened if Microsoft had given you the class you needed.

After all debugging and testing is done, the OrderedDictionary and KeyedCollection classes are much cleaner implementations of a generic class for collections that is accessible by key and by index. However, there are times when the simple List<KeyValuePair<tkey, tval>> does everything we need, and others when some particular implementation of KeyedCollection may do the job best. This just point out again how a proper design process may save us a lot of refactoring down the road.

Breathe and Bend with Air and Flex

by George White

(close)


See other posts by George White
Monday, March 3rd, 2008

Adobe has released version 1.0 of Adobe AIR as well as the Flex 3.0 platform. I got a chance to work with betas pf AIR and Flex 3.0 on an engagement at MatchMine, and I’ve been looking forward to the final releases for a while.

AIR is a new runtime for building cross-platform desktop applications, based on the excellent WebKit browser engine (which also powers Apple’s Safari browser and the was derived from the open source KHTML engine). AIR allows development of desktop applications using the similar tools and techniques to Web application development: Flash, Flex, and HTML + AJAX. This means that Web developers can transition their existing skills to the world of desktop applications.

Flex is (or rather was, with the release of AIR) a Web application platform based on the browser-based Flash VM. It’s still based on Flash, but the 3.0 release can deploy applications to the AIR runtime as well, meaning that Flex plays in both the Web and desktop domains. 3.0 is an evolution of the 2.0 platform, rather than the huge server seen going for 1.x to 2.0, but the latest release is more mature and capable than it’s predecessor.

Adobe has also taken steps to move the Flex SDK, then free version of the Flex development to tools, to an open source model. Most of the core components are open source now, but there’s still a pretty hefty package of Adobe “add-ons” which are still closed (including the sources for the AIR SDK components). Still, this is a nice first step towards a more open set of tools for the Flash family. More info can be found at the Adobe Open Source site.

Beyond open source, Adobe is providing a wide range of tools for building AIR apps. You can use Flash, Flex Builder, or Dreamweaver to produce AIR applications. There also support for HTML +AJAX development in Aptana Studio. Or you can forgo the Adobe GUI tools and break out your favorite text editor to build apps with the Flex SDK or AIR SDK. The latter is included in the former. These freeware command-line based toolsets provide the bare bones tools needed to build, test and package AIR apps using whichever of the technologies it supports. The AIR SDK is focused on HTML+AJAX apps and the Flex SDK does it all. There’s support for ant as a build tool, which is a nice touch.

One of the most interesting things about AIR and other platforms like it is the explicit support for desktop applications. A couple of years ago, the question seemed to be when the Web would kill the desktop and the traditional OSes. A more rational pattern seems to be returning to the fore and folks are starting to realize that it’s not a question of Web apps OR Desktop apps, but rather an exciting melange of both. There’s a place for both and each type of application fits a certain domain of problems better than the other. And the place where things get really exciting is that area where they meet. Adobe AIR is an interesting take on building desktop apps that can easily leverage the power of the Web, too.