InfoQ

新闻

.NET集合类型的发展

作者 Jonathan Allen译者 朱永光 发布于 2008年7月4日 下午10时29分

社区
.NET
主题
.NET框架,
开放源代码,
编程

.NET Framework中的集合类型在这几年经历了显著的发展。得益于微软新的开放策略,让我们可以展现一个常用数据结构的两个版本:在.NET和Mono中的哈希表。

理论上,哈希表是一个非常简单的构造,就是数组或链表的集合被划分到有限数量的存储体中。然而,即使你擅长于多线程逻辑,在获取该数据结构的元素项时,仍然有些复杂。

//   Copyright (c) Microsoft Corporation.  All rights reserved.

public virtual Object this[Object key] {
         get {
             if (key == null) {
                 throw new ArgumentNullException("key", Environment.GetResourceString("ArgumentNull_Key"));
             }
             uint seed;
             uint incr;

             // Take a snapshot of buckets, in case another thread does a resize
             bucket[] lbuckets = buckets;
             uint hashcode = InitHash(key, lbuckets.Length, out seed, out incr);
             int  ntry = 0;
             bucket b;
             int bucketNumber = (int) (seed % (uint)lbuckets.Length);
             do
             {
                 int currentversion;
                 //     A read operation on hashtable has three steps:
                 //        (1) calculate the hash and find the slot number.
                 //        (2) compare the hashcode, if equal, go to step 3. Otherwise end.
                 //        (3) compare the key, if equal, go to step 4. Otherwise end.
                 //        (4) return the value contained in the bucket.
                 //     After step 3 and before step 4. A writer can kick in a remove the old item and add a new one
                 //     in the same bukcet. So in the reader we need to check if the hash table is modified during above steps.
                 //
                 // Writers (Insert, Remove, Clear) will set 'isWriterInProgress' flag before it starts modifying
                 // the hashtable and will ckear the flag when it is done.  When the flag is cleared, the 'version'
                 // will be increased.  We will repeat the reading if a writer is in progress or done with the modification
                 // during the read.
                 //
                 // Our memory model guarantee if we pick up the change in bucket from another processor,
                 // we will see the 'isWriterProgress' flag to be true or 'version' is changed in the reader.
                 //
                 int spinCount = 0;
                 do {
                     // this is violate read, following memory accesses can not be moved ahead of it.
                     currentversion = version;
                     b = lbuckets[bucketNumber];

                     // The contention between reader and writer shouldn't happen frequently.
                     // But just in case this will burn CPU, yield the control of CPU if we spinned a few times.
                     // 8 is just a random number I pick.
                     if( (++spinCount) % 8 == 0 ) {
                         Thread.Sleep(1);   // 1 means we are yeilding control to all threads, including low-priority ones.
                     }
                 } while ( isWriterInProgress || (currentversion != version) );
                 if (b.key == null) {
                     return null;
                 }
                 if (((b.hash_coll & 0x7FFFFFFF) == hashcode) &&
                     KeyEquals (b.key, key))
                     return b.val;
                 bucketNumber = (int) (((long)bucketNumber + incr)% (uint)lbuckets.Length);
             } while (b.hash_coll < 0 && ++ntry < lbuckets.Length);
             return null;
         }

是的,各位读者请注意,这里存在一个伪自循环锁(pseudo-spin lock),并调用了Threading.Sleep。

而在.NET 2.0和泛型集合中,微软决定放弃集合的内部锁定机制。相反,要求任何锁定都必须在外部调用。我们在Generic.Dictionary类中可以发现更为简洁的代码。

//   Copyright (c) Microsoft Corporation.  All rights reserved.

private int FindEntry(TKey key) {
    if( key == null) {
        ThrowHelper.ThrowArgumentNullException(ExceptionArgument.key);
    }
    if (buckets != null) {
        int hashCode = comparer.GetHashCode(key) & 0x7FFFFFFF;
        for (int i = buckets[hashCode % buckets.Length]; i >= 0; i = entries[i].next) {
            if (entries[i].hashCode == hashCode && comparer.Equals(entries[i].key, key)) return i;
        }
    }
    return -1;
}

Mono版本的Hashtable和Dictionary在实现上仍然是大相径庭,而且也不同于微软的实现。

// Copyright (C) 2004-2005 Novell, Inc (http://www.novell.com)

public virtual Object this[Object key]
{
     get
     {
         if (key == null)
             throw new ArgumentNullException("key", "null key");

         Slot[] table = this.table;
         int[] hashes = this.hashes;
         uint size = (uint)table.Length;
         int h = this.GetHash(key) & Int32.MaxValue;
         uint indx = (uint)h;
         uint step = (uint)((h >> 5) + 1) % (size - 1) + 1;

         for (uint i = size; i > 0; i--)
         {
             indx %= size;
             Slot entry = table[indx];
             int hashMix = hashes[indx];
             Object k = entry.key;
             if (k == null)
                 break;

             if (k == key || ((hashMix & Int32.MaxValue) == h
                 && this.KeyEquals(key, k)))
             {
                 return entry.value;
             }

             if ((hashMix & CHAIN_MARKER) == 0)
                 break;

             indx += step;
         }

         return null;
     }

而Mono的dictionary实现则为:

// Copyright (C) 2004 Novell, Inc (http://www.novell.com)
// Copyright (C) 2005 David Waite
// Copyright (C) 2007 HotFeet GmbH (
http://www.hotfeet.ch)

public TValue this [TKey key] {
    get {
        if (key == null)
            throw new ArgumentNullException ("key");

        // get first item of linked list corresponding to given key
        int hashCode = hcp.GetHashCode (key);
        int cur = table [(hashCode & int.MaxValue) % table.Length] - 1;
        // walk linked list until right slot is found or end is reached
        while (cur != NO_SLOT) {
            // The ordering is important for compatibility with MS and strange
            // Object.Equals () implementations
            if (linkSlots [cur].HashCode == hashCode && hcp.Equals (keySlots [cur], key))
                return valueSlots [cur];
            cur = linkSlots [cur].Next;
        }
        throw new KeyNotFoundException ();
    }

好了,现在你可以看到,四种方法本质上实现了同样的功能。毋庸置疑,使用Sleep命令的那种是最差的,至于哪一个最好,我想还得由你来决定。

查看英文原文:On the Evolution of the .NET Collections

相关赞助商

InfoQ中文站.NET社区,关注.NET和微软的其他企业开发解决方案,通过新闻、文章、视频访谈和演讲以及迷你书等为中国.NET社区提供一流资讯。

没有回复

回复

独家内容

书评:敏捷模式──指向成功的路标

Ryan Cooper对Amr Elssamadisy的新书发表了评价,并认为书中提供了一种为实施敏捷量身定做的框架。本书并没有给出一种人人可用的敏捷方法,而是为读者提供一些模式和工具,用以找出哪些敏捷实践可以最有效地达到该组织机构的特定目标。

构建的可伸缩性和达到的性能:一个虚拟座谈会

这个由业界主要专家们参加的座谈会探究了在使应用程序具备尽可能好的伸缩性及性能的过程中所面临的挑战和思考过程。

OpenSocial的分析与实现

本视频主要对OpenSocial进行了分析,并对实现的方式进行了介绍。其中包括:OpenSocial的开发经验、Container Provider的技术准备、平台的构成要素、具体的规范、以及对未来的展望。

缓存系统MemCached的Java客户端优化历程

Memcached在大型网站被应用得越来越广泛,但是Java客户端并不多,本文作者基于现有的开源客户端进行了封装优化,并翔实记录了这一过程。

超越SOA:动态业务应用的新企业应用框架(2)

在他们文章的第二部分,作者探讨了动态业务应用的架构并介绍了资源容器的概念。他们示范了如何在JEE之上构建这个架构,以及它如何影响实现生产力。

使用ClickOnce细分发布版本

ClickOnce让WinForms应用程序的部署轻而易举。David Cooksey演示了如何在ASP.NET中编写一个HttpHandler来实现对ClickOnce部署的版本细分。

敏捷教练,从A到Z

敏捷带来了新的领导者角色,“敏捷教练”。它是不是跟“部门经理”或“技术领导”一样,只是换汤不换药呢?教练Pat Kua在这篇启蒙文章中对敏捷教练一职做了概述。

利用Ruby简化你的Java测试(进阶篇)

本文是Productive Java with Ruby系列文章的第二篇,通过上一篇的介绍,我想大家对如何利用Ruby进行单元测试有了一个基本的了解,从这里开始,我将和大家一起讨论一些利用Ruby进行单元测试时的高级话题。