Performance Metrics for a Design Driven by Tests

Recently, I found an interesting programming exercise, which read:

"A robot is located at the bottom-left corner of a 5x5 grid. The robot can move either up, down, left, or right, but can not visit the same spot twice. The robot is trying to reach the top-right corner of the grid."

Problem solved. We have a simple application that maps unique paths on a square board. However, there's one correctly held suspicion that this cannot scale. The reason for this is likely to be down to how memory is managed in .NET and that robot references are held as the controller recurses down to spawn more robots to explore the grid. This is an entirely crude implementation, and remember, one that had no prior design.

With the creation of each robot, a little more memory is required. For a grid of 3x3, for example, only ~250KB is used for the 50 or so robots used to map out the 12 unique paths. Once the grid increases to 5x5, however, there are 90,000+ robots chewing through ~65MB! of RAM. Despite this the Garbage Collector cleans up generation 0 only 47 times and the answer appears in less than half a second!

The way memory management occurs in .NET hasn't changed significantly since an article in the MSDN Magazine from November 2000. Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework. Essentially, every so often the garbage collector will remove objects from memory that no longer have objects referencing them. How often this clean up process occurs is determined by how often the memory allocated for the shortest living objects (Generation Zero) is filled. Recursion and deep stacks are a sure way for this to happen more frequently and to reduce the amount of memory available to quickly multiplying machines. There isn't even enough memory to calculate paths for an 8x8 grid so the design will need some attention!

First we allow for some instrumentation:

[code:c#]    public class ProgramInstrumentation
    {
        public static int CollectionCount;
        public static long TotalMemory;
    }

    public struct BoardExplorerControllerInstrumentation
    {
        public int CollectionCount;
        public long TotalMemory;

        public BoardExplorerControllerInstrumentation(int collectionCount, long totalMemory)
        {
            this.CollectionCount = collectionCount;
            this.TotalMemory = totalMemory;
        }
    }

[/code]

And adding a property on the BoardExplorerController to keep score.

[code:c#]    // allows memory comparisons between board explorer controllers
        public BoardExplorerControllerInstrumentation Instrumentation { get; private set; }

[/code]

In the previous version each robot had a property offering current location information

[code:c#]    private LinkedListNode CurrentLocation;

[/code]

It's replaced by the last node of the journey because they point to the same location and this is redundant!

[code:c#]    internal int LocationId 
    { 
        get { return Journey.Last.Value; }
    }

[/code]

No celebration just yet, losing one reference type only buys us 16 bytes on x64 so we press on. If the problem is that we have controllers that create new controllers, each with the possibility of creating four new robots (one for each direction) then fixing this might mean separating creation from the responsibility for maintaining each bot. The controller passes off responsibility for newing up bots to a factory and then can place them on a queue. No further need for a robot's controller once this has been done then!

So let's use a BoardExplorerController with static methods rather than an instance of the object; no reference is held to either Board or BoardExplorer. A controller stripped bare:

[code:c#]    public static Queue<LinkedList> CompleteJourneys = new Queue<LinkedList>();

    internal static IEnumerable GetValidMoves(BoardExplorer explorer)
    {
        var availableMoves = Program.Board.GetAvailableMovesFromLocation(explorer.Location);
        while (availableMoves.Count > 0)
        {
            var location = availableMoves.Pop();
            if (!explorer.HasVisited(location))
                yield return location;
        }
    }

    internal static BoardExplorer Move(BoardExplorer explorer)
    {
        explorer.Move();

        if (explorer.Location == Program.Board.FinalSquare)
        {
            var journey = explorer.Journey;
            explorer = null;
            CompleteJourneys.Enqueue(journey);
        }
            
        return explorer;
    }

[/code]

Before being placed on the queue, each robot now needs to know where it's going to be moved next so a property is added to the BoardExplorer.

[code:c#]    public int NextLocation { get; set; }

[/code]

The result is somewhat expected. Memory use is improved requiring only ~10MB, down from ~65MB and performance is comparable. Go team! The secret source is in the Factory and the main Program refactored to perform more coordination:

[code:c#]    class BoardExplorerCloneFactory
    {
        public static int Count = 0;
        private Stack explorers;
        public bool HasLocations { get; private set; }

        public BoardExplorerCloneFactory(Stack explorers)
        {
            this.HasLocations = true;
            this.explorers = explorers;
        }

        internal IEnumerable GetExplorers()
        {
            HasLocations = false;
            while (explorers.Count > 0)
            {
                var explorer = explorers.Pop();
                foreach (var location in BoardExplorerController.GetValidMoves(explorer))
                {
                    HasLocations = true;
                    Count++;
                    yield return explorer.Clone(location);
                }
            }
        }
    }

[/code]

The Main method now resembles the following:

[code:c#]    static void Main(string[] args)
    {
        Board = new Board(int.Parse(args[0]));
        var explorer = new BoardExplorer();
        var explorers = new Stack();
        var factory = new BoardExplorerCloneFactory(explorers);
        var warmTimer = GetStartedStopWatch();

        explorers.Push(explorer);

        while (factory.HasLocations)
        {
            var explorersCount = 0;
            UpdateInstrumentation();
            foreach (var newExplorer in factory.GetExplorers())
            {
                var movedExplorer = BoardExplorerController.Move(newExplorer);
                if (movedExplorer != null)
                    explorers.Push(movedExplorer);

                movedExplorer = null;
                explorersCount++;
            }
        }
    }
[/code]

So our project stakeholders are thrilled that we now use less memory! Actually no. They don't care particularly. They want a solution that is capable of doing 10x10 grids and up. The metrics for a smaller grid 6x6 suggest that this solution won't get us there:


    There are 1,262,816 unique paths from the top-left corner to the bottom-right corner of a 6x6 grid!
    There were 18,470,410 robots used to discover this!
    Total memory: 995,957,488
    Generation Zero Garbage Collections: 2,936
    Time elapsed: 157,840ms

It's impressive to be creating and managing over 100,000 robots a second but at the expense of ~100MB this approach doesn't inspire much confidence. There must be more we can do in terms of the data structure we're using and that's a segue way into the next post.

Add comment