Saturday, September 4, 2010

Working With ArrayList: What is the best Iterator choice?

When you first look at which statement to use to iterate through an ArrayList collection, the foreach statement stands out.  It's syntax is concise and readable as well as no extra lines of code are needed to extract the element you want to work with.  But...[pause for effect]...if the collection is changed (e.g. adding, modifying, or deleting elements and even copying to it), you'll get a runtime error that the "Collection was modified."

What's going on? Why does it throw an error?
The foreach statement implements the System.Collections.IEnumerable or System.Collections.IEnumerable<T> interface.  The enumerator is used to read the collection's data, but it cannot be used to modify the underlying collection.  It remains valid as long as the collection remains unchanged.  If changes occur, the enumerator is invalidated and throws the error.  Also note that the enumerator is not thread-safe because it does not have exclusive access to the collection. See this article from MSDN.

What constitutes "modifying" the collection?
The obvious modifiers are adding or deleting elements from the collection.  Another is modifying the value of any of the elements.  The one that got me chasing my tail for too many hours was copying a private ArrayList field to the public Property in the object.  The data was dynamically updated each time the property was called, which meant that each time the foreach loop executed another loop (i.e. NextMove() in GetEnumerator()) to the enumerator the collection had changed.  It drove me nuts as the data and number of elements were the same.

Here was the code in the SolverEngine:

            foreach (int digit in gridObj.DigitsNeeded)
            {
                // Work done here
            }
Here is the gridObj's property:
        private ArrayList digitsNeeded = new ArrayList();

        public ArrayList DigitsNeeded
        {
            get
            {
                CheckDigitsNeeded(); //Dynamically checks the board
                return digitsNeeded;
            }
        }
Solution for the foreach "Collection changed error"
I tried several methods to solve this:

  1. Copying the ArrayList to another ArrayList so that the foreach locked down control.  But that causes extra overhead and for my application I need to deal with the real-time board information by dynamically checking what is needed when called and not an old copy of it.
  2. Changed to a for statement and iterated forward through it.  Same error popped from time-to-time as the .Count value changed when elements were modified (e.g. a tile was solved within the for loop which decreased the NeededDigits ArrayList).
  3. Iterated in reverse through the for statement.  Ahha that works.  No more "collection changed" errors.
Using #3 above, here was the code:
            int totalDigitsNeeded = gridObj.DigitsNeeded.Count;

            for (int d = totalDigitsNeeded - 1; d >= 0; d--)
            {
               digit = (int)gridObj.DigitsNeeded[d];

                // Does some work
            }


Do you see a problem?  What happens when a digit gets solved and stored in a tile, meaning that DigitsNeeded ArrayList has an element removed? The index (d) could be out of range for the ArrayList. (Did you guess right?)

Solution:  I had to implement a method to first validate d to .Count and then get the next element in the ArrayList.
      private void SomeClass(ref Grid gridObj)
      {

            //Get the sector number for the missing digit
            int totalDigitsNeeded = gridObj.DigitsNeeded.Count;
            for (int d = totalDigitsNeeded - 1; d >= 0; d--)
            {
                //Valid the ArrayList index & then get the next digit
                //If -1, then there are no more digits needed; so return.
                digit = GetNextElement(gridObj.DigitsNeeded, ref d, ref totalDigitsNeeded);
                // Does some work
            }
        }

        #region "GetNextElement"

        private int GetNextElement(ArrayList objArrayList, ref int index, ref int origCount)
        {
            int nextElement = -1;

            int count = objArrayList.Count;
            if (count == 0)
                return nextElement;

            if ((count != origCount) && ((count - 1) < index))
            {
                origCount = count;
                index = count - 1;
            }
            nextElement = (int)objArrayList[index];
          
            return nextElement;
        }
        #endregion




Final Thoughts:
I'm still not entirely understanding IEnumerable and why it gets confused.  But I accept that it does and have seen the result of it.

kick it on DotNetKicks.com

Shout it

3 comments:

  1. Sorry but in your case, you're doing it wrong.

    By using foreach, no collection change is implied. Even if there is no exception, what should foreach do with an added item, or when the current item is removed? foreach means no change of collection.

    Since you want to process a real-time change of data, you need a more robust process, instead of using a collection or array for processing. Every time the state of data changes, the collection changes, and there's noway to indicate which one was processed, which not with a collection. You should use a queue or a processing line for that.

    Trying to use a collection for your case is a disaster waiting to happen. Sooner or later, you'll spend way more time to find out why some item as not correctly processed, at random time.

    ReplyDelete
  2. Van, Thank you for your comment. My point exactly, as I discovered, is that foreach requires no changes to the collection...period. Looking at various forums, it appeared to be a common question of how to deal with it.

    For my application, ArrayList seemed to fit best with the tools that I have thusfar, remembering that I'm still learning. I attempted to build some robustness around the core issue of IEnumerable. I believe though that queue is also a System.Collection class that uses IEnumerable. Therefore, wouldn't it cause the same issues of "robustness"?

    I'm uncertain as to what you mean by "processing line." I'd appreciate whatever advice you can give.

    Thank you.

    ReplyDelete
  3. I meant you should have a separate queue/stack for incoming data.

    You're not looping through this queue/stack though. Instead, you remove an item from the queue/stack (queue.Dequeue/stack.Pop). Whatever in the queue/stack is not processed. The processed data can be saved in whatever list you want.

    ReplyDelete

About

My Photo
Welcome to my blog! This blog is about my journey to become a Master .NET coder and teacher.

Followers

Share it