For When You Absolutely Need a Bigger Stack Than RAM Allows

by Kofi Sarfo 6. April 2011 10:32

Recently, I found an interesting programming exercise, which read:

"A robot is located at the bottom-left corner of a 5x5 grid. The robot can move either up, down, left, or right, but can not visit the same spot twice. The robot is trying to reach the top-right corner of the grid."

In a series of blog posts we've managed to solve this simply and reasonably quickly whilst demonstrating the process of going from problem to solution. However, we cannot say we've done this very efficiently. Shredding through MB of memory in order to complete calculations like this one above meant we tired quickly of the OutOfMemoryException when attempting this for grids larger than 6x6. To help put this in perspective, that's only 36 squares! It shouldn't be so difficult.

The method in a nutshell: The paths being taken were stored on a Stack and at each juncture where, for the next step, there was more than one square to choose from a path was added to the stack for each after the current path had been removed from the stack. See, is this why few bother with pseudo-code. Thankfully, nobody's reading this so no clearer explanation is necessary.

Having worked out that some kind of database is required and having ruled out a relational database we turn to Redis, an open source, advanced key-value store. The best part is that apart from being lightening fast it implements a stack. Perfect!

Whilst we're adding code to use Redis stack functionality we're interested in being able to preserve and swap out the different stack implementations: Memory, SQL Server & Redis. To do this we abstract towards an interface:

interface IStack { int Count(); int MaxCount(); void Push(T journey); T Pop(); }

The IStack of T is introduced to allow List and List to be used indiscriminately for journeys as we've not settled on a representation for our disk based stack. The implementation for each of the three now looks like this:

class MemStack : IStack { private Stack journeys; public MemStack() { journeys = new Stack(); } public void Push(T journey) { journeys.Push(journey); } public T Pop() { var journey = journeys.Pop(); return journey; } private int maxCount; public int MaxCount() { return maxCount; } public int Count() { var count = journeys.Count; if (count > maxCount) maxCount = count; return journeys.Count; } } class RediStack : IStack<List<int>> { RedisNativeClient redis = null; private const string explorerKey = "Whitt-Ku.Journeys"; public RediStack() { redis = new RedisNativeClient(); redis.FlushAll(); } public void Push(List<int> journey) { redis.RPush(explorerKey, journey.GetBytes()); } public List<int> Pop() { var bytes = redis.RPop(explorerKey); return bytes.ToList<int>(); } private int maxCount; public int MaxCount() { return maxCount; } public int Count() { var count = redis.LLen(explorerKey); if (count > maxCount) maxCount = count; return count; } }

The RedisNativeClient is in fact a class within the ServiceStack.Redis library. It's an "Open Source C# Redis client based on Miguel de Icaza previous efforts with redis-sharp." We change the program for the last time:

public static Board Board; public static IStack<List<int>> Journeys; static void Main(string[] args) { var boardLength = int.Parse(args[0]); Board = new Board(boardLength); Journeys = new RediStack(); var inaugural = new List<int>(); inaugural.Add(Board.StartLocation); Journeys.Push(inaugural); var factory = new JourneyFactory(); while (factory.HasIncompleteJourneys) { var explorersCount = 0; foreach (var journey in factory.GetJourneys()) { if (journey[journey.Count - 1] == Board.EndLocation) JourneyController.CompleteJourneys.Enqueue(journey); else Journeys.Push(journey); } } }

We run this with fingers crossed for a 5x5 grid and find that it's ~70x slower than using RAM:


	There are 8,512 unique paths from the top-left corner to the bottom-right corner of a 5x5 grid!
	There were 90,110 robots used to discover this!
        Max stack: 16
        Total memory: 3,740,864
        Time elapsed: 18,538ms
        Garbage collection gen0: 701
        Garbage collection gen1: 10
        Garbage collection gen2: 0

The most recent results using RAM and List Generics implementation:


	There are 8,512 unique paths from the top-left corner to the bottom-right corner of a 5x5 grid!
	There were 90,110 robots used to discover this!
        Max stack: 16
	Total memory: 2,704,944
	Time elapsed: 256ms
	Garbage collection gen0: 25
	Garbage collection gen1: 6
	Garbage collection gen2: 0

The reason why the same implementation using List was never used as the approach for Redis is shown below:


	There are 8,512 unique paths from the top-left corner to the bottom-right corner of a 5x5 grid!
	There were 90,110 robots used to discover this!
        Max stack: 16
	Total memory: 3,798,576
	Time elapsed: 1,302ms
	Garbage collection gen0: 49
	Garbage collection gen1: 4
	Garbage collection gen2: 0

The code responsible for converting List<int> to List<string> has terrible performance so a binary representation of List<int> is used instead as the value stored in Redis.

Finally, with little more performance to squeeze out using the current architecture we invite you to use this let us know how many unique paths there are on a 10x10 grid. Thanks.

SQL Server Based Stacks. Why Would You, Really?

by Kofi Sarfo 5. April 2011 07:47

Recently, I found an interesting programming exercise, which read:

"A robot is located at the bottom-left corner of a 5x5 grid. The robot can move either up, down, left, or right, but can not visit the same spot twice. The robot is trying to reach the top-right corner of the grid."

In previous posts we'd seen crude solutions to the problem above. Good news is that there's little respite to be found in this post also and it's more of the same sort of entertainment. Those concerns we've had so far remain, a burgeoning memory requirement, whilst looking to apply this same calculation for a 10x10 grid. The problem has been that our stack grows and then explodes resulting in an OutOfMemoryException. Naturally, the answer must be to use a disk-based stack. You can't help but think that the right answer for this involves NoSQL but, dammit, we're going to prove a relational database can't handle it before removing our tin foil hat!

First there's just the small matter of some unexpected performance gains to be made by replacing the LinkedList<int> with List<int> when representing robot journeys. Another little refactoring results in a decent performance improvement! Our Program class stripped of validation and instrumentation looks like this:

public static Board Board; static void Main(string[] args) { var boardLength = int.Parse(args[0]); Board = new Board(boardLength); var newJourney = new List<int>(){ Board.StartLocation }; var explorer = new BoardExplorer(newJourney); var journeys = new Stack<List<int>>(); var factory = new BoardExplorerCloneFactory(journeys); journeys.Push(explorer.Journey); while (factory.HasClones) { var explorersCount = 0; UpdateInstrumentation(); foreach (var movedExplorer in factory.GetExplorers()) { if (movedExplorer.IsExploring) journeys.Push(movedExplorer.Journey); explorersCount++; } } }

Changes to the BoardExplorer are almost too trivial to show once again. More fluff than stuff, cue our newest, lighter-weight robot:

public struct BoardExplorer { public List<int> Journey; public BoardExplorer(List<int> journey) { this.Journey = journey; } internal int GetLocation() { return Journey[Journey.Count-1]; } public bool IsExploring { get { return GetLocation() != Program.Board.EndLocation; } } internal bool HasVisited(int location) { foreach (var visitedLocation in Journey) if (visitedLocation == location) return true; return false; } public BoardExplorer Move(int location) { var journeyBeyond = new List<int>(Journey); journeyBeyond.Add(location); var moved = new BoardExplorer(journeyBeyond); if (!moved.IsExploring) BoardExplorerController.CompleteJourneys.Enqueue(moved.Journey); return moved; } }

The Controller continues to look as anaemic as ever.

public class BoardExplorerController { public static Queue<List<int>> CompleteJourneys = new Queue<List<int>>(); internal static IEnumerable<int> GetValidMoves(BoardExplorer explorer) { var availableMoves = Program.Board.GetAvailableMovesFromLocation(explorer.GetLocation()); while (availableMoves.Count > 0) { var location = availableMoves.Pop(); if (!explorer.HasVisited(location)) yield return location; } } }

And whoops of delight followed. Not really. Better numbers though.


	There are 1,262,816 unique paths from the top-left corner to the bottom-right corner of a 6x6 grid!
	There were 18,470,410 robots used to discover this!
	Total memory: 321,723,412
	Time elapsed: 44,092ms
	Garbage collection gen0: 2,923
	Garbage collection gen1: 1,019
	Garbage collection gen2: 0

We managed to knock off 30 seconds from the time it takes to calculate those paths. Almost twice the performance greets muted celebration. Is this good enough for a grid almost three times the size? Few seem to think so. First, let's see how this copes with a 7x7 grid.


	Out of Memory exception! 4,884,621 paths found before I capitulated.

Looking at the current state of our robot affairs we believe there's one final optimization we can make before we resort to storing the journey stack on disk. The 'epiphany' is this: Why do we need the robot at all? It was suggested by the way the problem was phrased and had 'robot' been omitted from the problem statement then R2 might never have made it into our code. Essentially, we care only about unique paths so we're about to send R2 into retirement. Our new Program minus BoardExplorer results in impressive performance gains:


	There are 1,262,816 unique paths from the top-left corner to the bottom-right corner of a 6x6 grid!
	There were 18,470,410 robots used to discover this!
	Total memory: 319,371,768
	Time elapsed: 34,679ms
	Garbage collection gen0: 2,906
	Garbage collection gen1: 1,029
	Garbage collection gen2: 0

Something to cheer also is that we've reduced the size of the application. Fewer lines of readable code is always better. I can't think of an exception right now so it must be true. Our new Program without robot participation:

static void Main(string[] args) { var boardLength = int.Parse(args[0]); Board = new Board(boardLength); Journeys = new Stack<List<int>>(); var inaugural = new List<int>(); inaugural.Add(Board.StartLocation); Journeys.Push(inaugural); var factory = new JourneyFactory(); var warmTimer = GetStartedStopWatch(); while (factory.HasIncompleteJourneys) { var explorersCount = 0; UpdateInstrumentation(); foreach (var journey in factory.GetJourneys()) { if (journey[journey.Count - 1] == Board.EndLocation) JourneyController.CompleteJourneys.Enqueue(journey); else Journeys.Push(journey); explorersCount++; } } }

Without the BoardExplorer it no longer makes sense to have a BoardExplorerController so the controller becomes a JourneyController:

public class JourneyController { public static Queue<List<int>> CompleteJourneys = new Queue<List<int>>(); internal static IEnumerable<int> GetValidMoves(List<int> journey) { var availableMoves = Program.Board.GetAvailableMovesFromLocation(journey[journey.Count - 1]); while (availableMoves.Count > 0) { var location = availableMoves.Pop(); if (!journey.Contains(location)) yield return location; } } }

Our BoardExplorerCloneFactory is renamed to the more pleasant JourneyFactory:

class JourneyFactory { public static int Count = 0; public bool HasIncompleteJourneys { get; private set; } public JourneyFactory() { this.HasIncompleteJourneys = true; } internal IEnumerable<List<int>> GetJourneys() { HasIncompleteJourneys = false; while (Program.Journeys.Count > 0) { var explorer = Program.Journeys.Pop(); foreach (var location in JourneyController.GetValidMoves(explorer)) { HasIncompleteJourneys = true; Count++; var journey = new List<int>(explorer); journey.Add(location); yield return journey; } } } }

You know, we're quietly optimistic that we've a decent chance now at taking on this 7x7 grid. And our journey planner says:


	Out of Memory exception! 4,839,550 paths found before I croaked.

Okay, we've clearly found the limit. 'Clearly' because we've said so. It's time for our relational database management system and there's just one more change to make before we're good to call on SQL Server. We'll transform the List<int> into a comma separated list of integers in string format for database table insertion. In order to be able to easily switch between using a RAM based stack and a disk based stack we'll introduce an interface 'IStack':

interface IStack { int GetCount(); void Push(List<int> journey); List<int> Pop(); }

The RAM based implementation embraces composition over inheritence:

class MemStack : IStack { private Stack<List<int>> journeys; public MemStack() { journeys = new Stack<List<int>>(); } public void Push(List<int> journey) { journeys.Push(journey); } public List<int> Pop() { var journey = journeys.Pop(); return journey; } public int GetCount() { return journeys.Count; } }

In order to use the new MemStack there are three lines of code that change in the main Program:

public static Board Board; public static IStack Journeys; // <- Change #1 here static void Main(string[] args) { var boardLength = int.Parse(args[0]); Board = new Board(boardLength); Journeys = new MemStack(); // <- Change #2 here var inaugural = new List<int>(); inaugural.Add(Board.StartLocation); Journeys.Push(inaugural); var factory = new JourneyFactory(); while (factory.HasIncompleteJourneys) { var explorersCount = 0; foreach (var journey in factory.GetJourneys()) // <- Change #3 here { if (journey[journey.Count - 1] == Board.EndLocation) JourneyController.CompleteJourneys.Enqueue(journey); else Journeys.Push(journey); explorersCount++; } } }

Before going any further we make sure that we've not broken anything by running the application to test performance. It's interesting to note here that we've moved away from TDD and we're focused on PDD (Performance Driven Development). I didn't coin this phrase so if you've not heard of it then it's because the phrase never really caught on! The reason unit tests are of minimal importance here is because we're consistently able to guarantee the units work because if they didn't then we'd see an incorrect count for the number of unique paths. The performance after this refactoring show we're still good.


	There are 1,262,816 unique paths from the top-left corner to the bottom-right corner of a 6x6 grid!
	There were 18,470,410 robots used to discover this!
	Total memory: 321,396,728
	Time elapsed: 32,512ms
	Garbage collection gen0: 2,920
	Garbage collection gen1: 1,021
	Garbage collection gen2: 0

Now let's turn attention to the database and the SQL script for pursuing this absurd course:

CREATE TABLE [dbo].[Journey]( [Id] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED , [Path] [varchar](128) NOT NULL ) GO CREATE PROCEDURE [dbo].[Push] @path varchar(128) AS DECLARE @id int INSERT INTO [Journey] ([Path]) SELECT @path SELECT @id = SCOPE_IDENTITY() GO CREATE PROCEDURE [dbo].[Pop] AS DECLARE @id int, @path varchar(128) SELECT TOP 1 @id = [Id], @path = [Path] FROM [Journey] ORDER BY id DESC DELETE FROM [Journey] WHERE id = @id SELECT @path 'Path' GO CREATE PROCEDURE [dbo].[CountPaths] AS SELECT COUNT(*) 'Count' FROM [Journey] GO

The database based stack implementation implements the IStack interface and uses Dapper.NET, a simple object mapper for .Net, for data access.

class DbStack : IStack { private string connectionString = "Server=.;Database=Wookie;Trusted_Connection=True;"; public void Push(List<int> journey) { using (var connection = new SqlConnection(connectionString)) { connection.Open(); var param = journey.ToCommaSeperatedIntegers(); var sql = string.Format("EXEC Push '{0}'", param); var result = connection.ExecuteMapperQuery(sql); } } private int maxCount = 0; public int MaxCount() { return maxCount; } public int Count() { using (var connection = new SqlConnection(connectionString)) { connection.Open(); var sql = string.Format("EXEC CountPaths"); var result = connection.ExecuteMapperQuery(sql); var count = result[0].Count; if (count > maxCount) maxCount = count; return count; } } public List<int> Pop() { using (var connection = new SqlConnection(connectionString)) { connection.Open(); var sql = string.Format("EXEC pop"); var results = connection.ExecuteMapperQuery(sql); return results.ToList(); } } }

In order to find out how large the stack grew, because this has implications on database performance, we've added a MaxCount method to the IStack interface.

interface IStack { int Count(); int MaxCount(); void Push(List<int> journey); List<int> Pop(); }

We're finally good to go so what's the reason for the pessimism that this is going to give less than stellar results? Well, for example:


	There are 8,512 unique paths from the top-left corner to the bottom-right corner of a 5x5 grid!
	There were 90,110 robots used to discover this!
	Total memory: 24,096,248
	Max stack: 16
	Time elapsed: 884,238ms

At this point we can forget about metrics in milliseconds. Using the new DbStack to calculate the unique paths in a 5x5 grid took almost fifteen minutes. Using MemStack, however, the same calculation takes ~200 milliseconds.


	There are 8,512 unique paths from the top-left corner to the bottom-right corner of a 5x5 grid!
	There were 90,110 robots used to discover this!
	Total memory: 24,096,248
	Max stack: 16
	Time elapsed: 163ms

DbStack is an incredible 5000x slower than MemStack in this instance so it appears we're not going to be able to calculate the paths for a 10x10 grid using a relational database to maintain the stack after all. In the next post we look at a more hopeful approach that involves NoSQL.

Structs For a More Satisfying Robot Factory

by Kofi Sarfo 4. April 2011 12:52

Recently, I found an interesting programming exercise, which read:

"A robot is located at the bottom-left corner of a 5x5 grid. The robot can move either up, down, left, or right, but can not visit the same spot twice. The robot is trying to reach the top-right corner of the grid."

So far we've come up with a solution but not altogether happy it's fit for future purpose. We're currently optimizing and the title gives away the fact that the next optimisation will be to use a Struct rather than a Class for the BoardExplorer. If we're going to be creating well over hundreds of millions of these robots then ideally we'd like them as lightweight as possible. The key aspects noteworthy parts of our new improved robot are shown below:

public struct BoardExplorer { public LinkedList<int> Journey; public int NextLocation; public BoardExplorer(int location, int nextLocation, LinkedList<int> journey) { this.Journey = journey; this.NextLocation = nextLocation; } internal int GetLocation() { return Journey.Last.Value; } internal void Move() { Journey.AddAfter(Journey.Last, NextLocation); } public bool IsExploring { get { return GetLocation() != Program.Board.EndLocation; } } internal LinkedList<int> CopyJourney() { var copy = new LinkedList<int>(); var steps = new int[Journey.Count]; Journey.CopyTo(steps, 0); return new LinkedList<int>(steps); } public BoardExplorer Clone(int location) { return new BoardExplorer(GetLocation(), location, CopyJourney()); } }

And the metrics for this change confirm this as being the way to go.

There are 1,262,816 unique paths from the top-left corner to the bottom-right corner of a 6x6 grid!
There were 18,470,410 robots used to discover this!
Total memory: 869,039,780
Time elapsed: 74,598ms
Garbage collection gen0: 4740
Garbage collection gen1: 1627
Garbage collection gen2: 0

The surprise here is the performance improvement. Memory use hasn't been affected much. Things to note here is that because a struct is a value type rather than a reference type these BoardExplorers aren't being stored on the heap so we need to explain why there is increased garbage collection activity!

Though our robots aren't being allocated memory on the heap their journeys do appear there as their of type LinkedList. Significantly, these are more transient and are removed more quickly than the earlier type of robots being stored and none of these robots too remain in memory by the time the calculation has been performed. Whilst this is certainly preferable it doesn't promise being able to use this improved solution for a 10x10 grid.

I think we can do better still and so we consider the next refactoring. We consider that the robot need not end on the queue as their journey is all we really care about so we make the following changes without tests driving these. In a contained system as small as this we have the luxury of being able to hold all the code and dependencies inside our developer's head. Ordinarily this may not be the case at all.

It's been a about a minute. How about some more refactoring? The BoardExplorerController is about to lose further responsibility!

public class BoardExplorerController { public static Queue<LinkedList<int>> CompleteJourneys = new Queue<LinkedList<int>>(); internal static IEnumerable<int> GetValidMoves(BoardExplorer explorer) { var availableMoves = Program.Board.GetAvailableMovesFromLocation(explorer.GetLocation()); while (availableMoves.Count > 0) { var location = availableMoves.Pop(); if (!explorer.HasVisited(location)) yield return location; } } }

The BoardExplorer has their Clone method replaced by a Move method which now also performs the action of propelling the robot onto the next square and then checking whether they've reached their destination.

public BoardExplorer Move(int location) { var journeyBeyond = CopyJourney(); journeyBeyond.AddLast(location); var moved = new BoardExplorer(journeyBeyond); if (!moved.IsExploring) BoardExplorerController.CompleteJourneys.Enqueue(moved.Journey); return moved; }

This looks like we've reintroduced recursion but the stack for maintaining robots only ever drops one level deep so there's little concern about additional memory requirement. New metrics will show this we're sure. Also the BoardExplorerCloneFactory needs to spit out BoardExplorers based on journeys found on the stack maintaining journeys in progress.

class BoardExplorerCloneFactory { public static int Count = 0; private Stack<LinkedList<int>> journeys; // <- changed from Stack public bool HasClones { get; private set; } public BoardExplorerCloneFactory(Stack<LinkedList<int>> journeys) { this.HasClones = true; this.journeys = journeys; } internal IEnumerable GetExplorers() { HasClones = false; while (journeys.Count > 0) { var journey = journeys.Pop(); var explorer = new BoardExplorer(journey); // <- new addition here too! foreach (var location in BoardExplorerController.GetValidMoves(explorer)) { HasClones = true; Count++; yield return explorer.Move(location); } } } }

I didn't think the additional step of constructing a robot from a saved journey would add much overhead either and it didn't:

There are 1262,816 unique paths from the top-left corner to the bottom-right corner of a 6x6 grid!
There were 184,70,410 robots used to discover this!
Total memory: 870,567,144
Time elapsed: 70,736ms
Garbage collection gen0: 4,694
Garbage collection gen1: 1,610
Garbage collection gen2: 0

The four second performance gain is spectacularly inconsequential. The memory requirement hasn't been dented either. To show that this three minute refactoring didn't need a test-first approach the changes to the main program appear below and this compiled first time without errors/warnings:

Board = new Board(boardLength); var newJourney = new LinkedList<int>(); newJourney.AddFirst(Board.StartLocation); var explorer = new BoardExplorer(newJourney); var journeys = new Stack<LinkedList<int>>(); var factory = new BoardExplorerCloneFactory(journeys); var warmTimer = GetStartedStopWatch(); journeys.Push(explorer.Journey); while (factory.HasClones) { var explorersCount = 0; UpdateInstrumentation(); foreach (var movedExplorer in factory.GetExplorers()) { if (movedExplorer.IsExploring) journeys.Push(movedExplorer.Journey); explorersCount++; } }

So we've abandoned fixing tests until the very end and we're happy with performance. This is because a contrived example allows it. Also, we know that the 'architecture' is going to keep changing and we don't yet know what it will resemble ultimately so many of the tests we write en-route will i) be thrown away almost immediately; and ii) slow us down. These robots are no Mars Rover!

To recap the series of changes, replacing instances of BoardExplorerControllers with a single static version yielded some gains. Replacing the BoardExplorer Class with a Struct had greater benefit too but it's really the BoardExplorerCloneFactory that's had the greatest impact. Storing journeys instead of robots helped only marginally in terms of performance and next we look at changing the representation stored once more and look outside for help in order to allow the size of the stack to grow without having the amount of memory consumed explode. Eventually, we may even be able to calculate the number of paths for a 10x10 grid.

Performance Metrics for a Design Driven by Tests

by Kofi Sarfo 2. April 2011 08:03

Recently, I found an interesting programming exercise, which read:

"A robot is located at the bottom-left corner of a 5x5 grid. The robot can move either up, down, left, or right, but can not visit the same spot twice. The robot is trying to reach the top-right corner of the grid."

Problem solved. We have a simple application that maps unique paths on a square board. However, there's one correctly held suspicion that this cannot scale. The reason for this is likely to be down to how memory is managed in .NET and that robot references are held as the controller recurses down to spawn more robots to explore the grid. This is an entirely crude implementation, and remember, one that had no prior design.

With the creation of each robot, a little more memory is required. For a grid of 3x3, for example, only ~250KB is used for the 50 or so robots used to map out the 12 unique paths. Once the grid increases to 5x5, however, there are 90,000+ robots chewing through ~65MB! of RAM. Despite this the Garbage Collector cleans up generation 0 only 47 times and the answer appears in less than half a second!

The way memory management occurs in .NET hasn't changed significantly since an article in the MSDN Magazine from November 2000. Garbage Collection: Automatic Memory Management in the Microsoft .NET Framework. Essentially, every so often the garbage collector will remove objects from memory that no longer have objects referencing them. How often this clean up process occurs is determined by how often the memory allocated for the shortest living objects (Generation Zero) is filled. Recursion and deep stacks are a sure way for this to happen more frequently and to reduce the amount of memory available to quickly multiplying machines. There isn't even enough memory to calculate paths for an 8x8 grid so the design will need some attention!

First we allow for some instrumentation:

public class ProgramInstrumentation { public static int CollectionCount; public static long TotalMemory; } public struct BoardExplorerControllerInstrumentation { public int CollectionCount; public long TotalMemory; public BoardExplorerControllerInstrumentation(int collectionCount, long totalMemory) { this.CollectionCount = collectionCount; this.TotalMemory = totalMemory; } }

And adding a property on the BoardExplorerController to keep score.

// allows memory comparisons between board explorer controllers public BoardExplorerControllerInstrumentation Instrumentation { get; private set; }

In the previous version each robot had a property offering current location information

private LinkedListNode<int> CurrentLocation;

It's replaced by the last node of the journey because they point to the same location and this is redundant!

internal int LocationId { get { return Journey.Last.Value; } }

No celebration just yet, losing one reference type only buys us 16 bytes on x64 so we press on. If the problem is that we have controllers that create new controllers, each with the possibility of creating four new robots (one for each direction) then fixing this might mean separating creation from the responsibility for maintaining each bot. The controller passes off responsibility for newing up bots to a factory and then can place them on a queue. No further need for a robot's controller once this has been done then!

So let's use a BoardExplorerController with static methods rather than an instance of the object; no reference is held to either Board or BoardExplorer. A controller stripped bare:

public static Queue<LinkedList<int>> CompleteJourneys = new Queue<LinkedList<int>>(); internal static IEnumerable<int> GetValidMoves(BoardExplorer explorer) { var availableMoves = Program.Board.GetAvailableMovesFromLocation(explorer.Location); while (availableMoves.Count > 0) { var location = availableMoves.Pop(); if (!explorer.HasVisited(location)) yield return location; } } internal static BoardExplorer Move(BoardExplorer explorer) { explorer.Move(); if (explorer.Location == Program.Board.FinalSquare) { var journey = explorer.Journey; explorer = null; CompleteJourneys.Enqueue(journey); } return explorer; }

Before being placed on the queue, each robot now needs to know where it's going to be moved next so a property is added to the BoardExplorer.

public int NextLocation { get; set; }

The result is somewhat expected. Memory use is improved requiring only ~10MB, down from ~65MB and performance is comparable. Go team! The secret source is in the Factory and the main Program refactored to perform more coordination:

class BoardExplorerCloneFactory { public static int Count = 0; private Stack explorers; public bool HasLocations { get; private set; } public BoardExplorerCloneFactory(Stack explorers) { this.HasLocations = true; this.explorers = explorers; } internal IEnumerable GetExplorers() { HasLocations = false; while (explorers.Count > 0) { var explorer = explorers.Pop(); foreach (var location in BoardExplorerController.GetValidMoves(explorer)) { HasLocations = true; Count++; yield return explorer.Clone(location); } } } }

The Main method now resembles the following:

static void Main(string[] args) { Board = new Board(int.Parse(args[0])); var explorer = new BoardExplorer(); var explorers = new Stack(); var factory = new BoardExplorerCloneFactory(explorers); var warmTimer = GetStartedStopWatch(); explorers.Push(explorer); while (factory.HasLocations) { var explorersCount = 0; UpdateInstrumentation(); foreach (var newExplorer in factory.GetExplorers()) { var movedExplorer = BoardExplorerController.Move(newExplorer); if (movedExplorer != null) explorers.Push(movedExplorer); movedExplorer = null; explorersCount++; } } }

So our project stakeholders are thrilled that we now use less memory! Actually no. They don't care particularly. They want a solution that is capable of doing 10x10 grids and up. The metrics for a smaller grid 6x6 suggest that this solution won't get us there:


    There are 1,262,816 unique paths from the top-left corner to the bottom-right corner of a 6x6 grid!
    There were 18,470,410 robots used to discover this!
    Total memory: 995,957,488
    Generation Zero Garbage Collections: 2,936
    Time elapsed: 157,840ms

It's impressive to be creating and managing over 100,000 robots a second but at the expense of ~100MB this approach doesn't inspire much confidence. There must be more we can do in terms of the data structure we're using and that's a segue way into the next post.

Write Code. Learn Fast. Have Fun.

by Kofi Sarfo 1. April 2011 17:31

Recently, I found an interesting programming exercise, which read:

"A robot is located at the bottom-left corner of a 5x5 grid. The robot can move either up, down, left, or right, but can not visit the same spot twice. The robot is trying to reach the top-right corner of the grid."

I thought this would be a useful problem for demonstrating the real value of unit testing and, in particular, TDD. Also, I thought that it would be something for which multithreading might come in handy as the initial solution is unlikely to scale with increased grid dimensions.

As often with a blank slate the question is where to begin. Let's start with the grid. A square grid should be able to calculate the number of squares it offers given the length of one side:

[TestFixture] public class TestBoard { [Test] public void Board_Should_Have_One_Square_For_Board_Length_Equals_One() { var boardLength = 1; var expectedSquareCount = Math.Pow(boardLength, 2); var board = new Board(boardLength); Assert.AreEqual(expectedSquareCount, board.SquareCount); } }

This fails to compile as there isn't yet a Board class!

public class Board { // a 3x3 board has Length = 3 private int Length; public Board(int length) { this.Length = length; } // number of squares on board public int SquareCount { get { return 1; } } }

Fantastic! Test passes but I suspect that this won't be correct for grids of all sizes...

[Test] public void Board_Should_Have_Four_Squares_For_Board_Length_Equals_Two() { var boardLength = 2; var expectedSquareCount = Math.Pow(boardLength, 2); var board = new Board(boardLength); Assert.AreEqual(expectedSquareCount, board.SquareCount); }

It's no tragedy that the test fails as intuition suggested. It's an easy fix, however, to correct that.

public class Board { // a 3x3 board has Length = 3 private int Length; public Board(int length) { this.Length = length; } // number of squares on board public int SquareCount { get { return (int) Math.Pow(Length, 2); } } }

Once again tests pass! So we've demonstrated mastery of numbers to the power of two. Awesome. How about being able to uniquely identify each square on the board? What's the test for that look like?

[Test] public void Board_Squares_Should_Have_Unique_LocationIds() { var boardLength = 3; var board = new Board(boardLength); var locationIds = new Dictionary<();int, int>(); foreach (var square in board.Squares) { if (!locationIds.ContainsKey(square)) locationIds[square] = 0; locationIds[square]++; } const int expectedCount = 1; foreach (var key in locationIds.Keys) { Assert.AreEqual(expectedCount, locationIds[key]); } }

And the code to make this test pass might even look something like this:

// a simple way to uniquely identify squares on the board private int GetLocationId(int x, int y) { return y * Length + x; } // enumerate all the squares on the board yielding a unique id for each public IEnumerable<int> Squares { get { for (int x = 0; x < Length; x++) { for (int y = 0; y < Length; y++) { yield return GetLocationId(x, y); } } } }

Wait! Did I just write some code without a unit test? Indeed. There is no test for GetLocationId. What was I thinking? That seems almost irresponsible. Actually, do I really care how the board numbers its squares? Not in the slightest! Provided each square can be uniquely identified I do not care in the slightest.

Now onto the matter of movement. From each corner of the grid our robot should be able to move in only two directions. And now we do care about the scheme used by the grid to number squares. It looks like we're going to get some unit testing of the square numbering logic here for free. Really, I'd want to write only a single test case and parameterise the values for robot location and to where the robot may move!

[TestCase(0, 1, 3)] // should be able to move up and move right from bottom left [TestCase(2, 1, 5)] // should be able to move up and move left from bottom right [TestCase(6, 3, 7)] // should be able to move down and move right from top left [TestCase(8, 5, 7)] // should be able to move down and move right from top left public void Board_Should_Allow_Moves_From_Corner( int fromLocationId, int firstMoveLocationId, int secondMoveLocationId ) { var expectedAvailableMoves = new List() { firstMoveLocationId, secondMoveLocationId }; var boardLength = 3; var board = new Board(boardLength); foreach (var expectedAvailableMove in expectedAvailableMoves) Assert.Contains(expectedAvailableMove, board.GetAvailableMovesFromLocation(fromLocationId)); Assert.AreEqual(expectedAvailableMoves.Count, board.GetAvailableMovesFromLocation(fromLocationId).Count); }

The test for locations to which the robot may move from the centre of the board is sufficiently different to get its own test. The others are literally edge cases.

[TestCase(4, 1, 3, 5, 7)] // should be able to move in any direction // from the middle of the board public void Board_Should_Allow_Moves_From_Centre( int fromLocationId, int firstMoveLocationId, int secondMoveLocationId, int thirdMoveLocationId, int fourthMoveLocationId ) { var expectedAvailableMoves = new List() { firstMoveLocationId, secondMoveLocationId, thirdMoveLocationId, fourthMoveLocationId }; var boardLength = 3; var board = new Board(boardLength); foreach (var expectedAvailableMove in expectedAvailableMoves) Assert.Contains(expectedAvailableMove, board.GetAvailableMovesFromLocation(fromLocationId)); Assert.AreEqual(expectedAvailableMoves.Count, board.GetAvailableMovesFromLocation(fromLocationId).Count); }

The code that satisfies both sets of tests is remarkably simple:

// what are the possible locations a piece can move to given current location internal Stack<int> GetAvailableMovesFromLocation(int locationId) { var availableMoves = new Stack<int>(); // if able to move left if (locationId % Length > 0) availableMoves.Push(locationId - 1); // if able to move right if (locationId % Length < Length - 1) availableMoves.Push(locationId + 1); // if able to move down if (locationId >= Length) availableMoves.Push(locationId - Length); // if able to move up if (locationId < Math.Pow(Length,2) - Length) availableMoves.Push(locationId + Length); return availableMoves; }

I had no idea I was going to use a stack until I had. I figured that if a stack was good enough for IL instructions and the Virtual Execution System then it would be good enough for our robot. Next, without showing the code here, there were several things to consider with regards to our robot. Let's give the robot a name, let's call it R2.

  • R2 needs to know from where to start
  • R2 shouldn't still be at the starting location after a single move
  • R2 needs to be able have some notion about where he is
  • R2 needs to be able to say where he's been
  • R2 needs to be able to say where he hasn't been
  • R2 needs to be able to say how far he's been

Thankfully, R2 in this case doesn't need an appreciation of what he knows and what he doesn't know. The difficult thing about R2 is that at any square he'd have up to four options about where to go next and we'd need some way to guide him to each of those options without his becoming unsettled. But there was no need to worry about that yet. I figured that being able to clone R2 and have it's clone take the same journey would later allow a solution in which R2 and each clone could go in different directions. The code for that looks like this:

// allows new robot to take same journey internal void TakeJourney(LinkedList<int> journey) { Start(); foreach (var step in journey) if(step > StartLocationId) MoveTo(step); } // clone of existing robot with identical journey history public BoardExplorer Clone() { var explorer = new BoardExplorer(); explorer.TakeJourney(Journey); return explorer; }

And so this passed on for several iterations until I had a grid (Board), a robot (BoardExplorer) and, finally, a BoardExplorerController so that the grid need not know about R2 and R2 could remain blissfully unaware of the board. Another implementation might have seen the BoardExplorerController made part of the robot. The question is whether we bless R2 with the ability to control its own action or not. Perhaps too much responsibility. The tests for our robot are fairly simple too as a result.

I'll include the code for the BoardExplorerController because there's not much to it. Also it's a chance to point out perhaps the only mildly tricky piece to this solution. The Controller is recursive in that it creates instances of itself within the Explore method in order to allow the robot to explore a number of different squares seemingly at the same time. Note that we're using a queue to record the path the robot takes across the grid and that the grid knows where the robot needs to end up!

public class BoardExplorerController { // maintains grid properties private Board board; // robot able to move, give current location, records journey // and indicates whether any given location has been visited private BoardExplorer explorer; // record of successful journeys private ConcurrentQueue<LinkedList(int)> completeJourneys { get; set; } public BoardExplorerController(Board board, BoardExplorer explorer) { this.board = board; this.explorer = explorer; } // lists all available moves for robot given position on grid public Stack<int> AvailableMoves { get { return board.GetAvailableMovesFromLocation(explorer.LocationId); } } // lists all available locations robot may move to without // revisiting any squares given position on grid public Stack<int> GetUnexploredLocationsAvailableMoves() { var allAvailableMoves = AvailableMoves; var unexploredAvailableMoves = new Stack<int>(); while (allAvailableMoves.Count > 0) { var location = allAvailableMoves.Pop(); if (!explorer.HasVisited(location)) unexploredAvailableMoves.Push(location); } return unexploredAvailableMoves; } // counts unique paths a robot may take from top-left to bottom-right of grid internal int GetUniquePathsCount() { explorer.Start(); completeJourneys = new ConcurrentQueue<LinkedList<int>>(); ExploreAllUnexploredLocations(completeJourneys); return completeJourneys.Count; } // send robot to each unexplored adjacent square to current location internal void ExploreAllUnexploredLocations(Queue<LinkedList<int>> completeJourneys) * { var unexplored = GetUnexploredLocationsAvailableMoves(); if (unexplored.Count == 0) return; this.completeJourneys = completeJourneys; foreach (var location in unexplored) Explore(location); } // create new robot to explore new location internal void Explore(int location) { var newExplorer = explorer.Clone(); var newController = new BoardExplorerController(board, newExplorer); newExplorer.MoveTo(location); if (location == board.FinalSquare) { this.completeJourneys.Enqueue(newExplorer.Journey); return; } newController.ExploreAllUnexploredLocations(completeJourneys); } }

I should confess here that it wasn't entirely a smooth journey. The beauty of this approach was accidentally demonstrated in that one method I wrote prior to any tests were written resulted in a bug that I didn't discover until pretty much all the code had been written... and the application should have been working. Here is the original method prior to the test. It should be fairly obvious to see what's wrong here:

public void TakeJourney(LinkedList journey, LinkedListNode currentLocation) { this.Journey = journey; this.CurrentLocation = currentLocation; }

The test that this method works making use of existing robot material...

[Test] public void BoardExplorer_Journey_Should_Not_Be_Affected_By_Clone_Move() { const int randomLocation = 8; var explorer = new BoardExplorer(); var clone = explorer.Clone(); clone.MoveTo(randomLocation); Assert.Less(explorer.LocationId, clone.LocationId); Assert.Less(explorer.Journey.Count, clone.Journey.Count); }

And there you have it - a working solution that will tell you how many unique paths there are from one corner of a grid to its opposite corner. However, running this for a 10x10 grid will result in an Out of Memory Exception and that may be the topic of the next entry.

*Compiler error included for the sake of formatting.

Testing Times

by Kofi Sarfo 31. March 2011 18:55

Any well functioning development team ought to be able to discuss software design and the various approaches to raising quality without the matter becoming an HR issue. Incredibly, the following disagreement took on a life of its own, eventually creating a monster! Consider the following:

class DAL_Untestable { public void Save(string currency1, string currency2, bool spotRate) { using(var conn = new SqlConnection("connectionString")) using (var cmd = new SqlCommand("SpotRate_Insert", conn)) { conn.Open(); cmd.Parameters.Add("@currency1", SqlDbType.VarChar).Value = currency1; cmd.Parameters.Add("@currency2", SqlDbType.VarChar).Value = currency2; cmd.ExecuteNonQuery(); } } }

It's pretty clear that the code above isn't written in such a way that it can be unit tested. The question here is whether this is indeed important. By no means am I saying that testing has no place here, rather, let's look at an example of something a colleague might do?

class BO_DTO { public string Currency1 { get; set; } public string Currency2 { get; set; } public decimal SpotRate { get; set; } public BO_DTO(string currency1, string currency2, decimal spotRate) { this.Currency1 = currency1; this.Currency2 = currency2; this.SpotRate = spotRate; } } class DAL_Testable { public IDbConnection Connection { get; set; } public IDbCommand Command { get; set; } public void Save(BO_DTO dto) { Command.Connection = Connection; Command.CommandText = "SpotRate_Insert"; Command.CommandType = CommandType.StoredProcedure; var currency1Param = Command.CreateParameter(); currency1Param.ParameterName = "@currency1"; currency1Param.DbType = DbType.String; currency1Param.Direction = ParameterDirection.Input; currency1Param.Value = dto.Currency1; var currency2Param = Command.CreateParameter(); currency2Param.ParameterName = "@currency2"; currency2Param.DbType = DbType.String; currency2Param.Direction = ParameterDirection.Input; currency2Param.Value = dto.Currency2; var spotRateParam = Command.CreateParameter(); spotRateParam.ParameterName = "@spotRate"; spotRateParam.DbType = DbType.String; spotRateParam.Direction = ParameterDirection.Input; spotRateParam.Value = dto.SpotRate; Connection.Open(); Command.ExecuteNonQuery(); } }

By exposing the Connection and Command objects we're able to supply mocked instances in order to isolate any tests from ever making a database call. Nothing clever here. The use of the DTO (data transfer object) serves only to better mimic the test I'm reproducing. And these are examples of the tests suggested:

[TestFixture] class DAL_Testable_Tests { public void Save_Should_Call_Correct_Stored_Procedure() { var mockConn = new DynamicMock(typeof(IDbConnection)); var mockCmd = new DynamicMock(typeof(IDbCommand)); var dal = new DAL_Testable(); dal.Connection = (IDbConnection)mockConn.MockInstance; dal.Command = (IDbCommand) mockCmd.MockInstance; var parameter1 = new SqlParameter(); mockCmd.ExpectAndReturn("CreateParameter", parameter1); var parameter2 = new SqlParameter(); mockCmd.ExpectAndReturn("CreateParameter", parameter2); var parameter3 = new SqlParameter(); mockCmd.ExpectAndReturn("CreateParameter", parameter3); mockCmd.Expect("set_CommandText", "SpotRate_Insert"); mockCmd.Expect("ExecuteNonQuery"); var dto = new BO_DTO("N/A", "N/A", 0.0m); dal.Save(dto); mockCmd.Verify(); } public void Save_Should_Use_Correct_Parameters() { var mockConn = new DynamicMock(typeof(IDbConnection)); var mockCmd = new DynamicMock(typeof(IDbCommand)); var dal = new DAL_Testable(); dal.Connection = (IDbConnection)mockConn.MockInstance; dal.Command = (IDbCommand)mockCmd.MockInstance; var parameter1 = new DynamicMock(typeof(DbParameter)); mockCmd.ExpectAndReturn("CreateParameter", parameter1.MockInstance); parameter1.Expect("set_Value", "EUR"); var parameter2 = new DynamicMock(typeof(DbParameter)); mockCmd.ExpectAndReturn("CreateParameter", parameter2.MockInstance); parameter2.Expect("set_Value", "USD"); var parameter3 = new DynamicMock(typeof(DbParameter)); mockCmd.ExpectAndReturn("CreateParameter", parameter3.MockInstance); parameter3.Expect("set_Value", 1.4m); var dto = new BO_DTO("EUR", "USD", 1.4m); dal.Save(dto); parameter1.Verify(); parameter2.Verify(); parameter3.Verify(); } }

Do we really need to test that a DAL (data access layer) is capable of calling the correct stored procedure with the correct parameters given a DTO? Is this the most value we can add assuming that automated tests need to be written? This isn't a million miles from the question about whether getters & setters require unit tests. In trivial cases such as this one above I don't understand why we'd choose unit tests over integration tests given our particular circumstance.

We have a database against which we can run integration test. The integration tests can be automated to run periodically and if they exist in their own project then the responsive feedback aspect of unit tests is preserved. Integration tests will reveal issues beyond the scope of this one method such as whether the correct data types are being used, whether the signature on the stored procedure is correct, whether the tables used by the stored procedure exist and, ultimately, whether the method call results in the correct spot rate value being saved for a given currency pair.

Once we've understood the rules about granularity and breaking up code (methods) into the smallest pieces that make sense I think we've reached a stage where we can do some chunking for more suitable testing. We care whether we're calling the correct stored procedure and the test for that should be determined by values in the database. Yes, it's possible to have written the stored procedure incorrectly so that it fails to persist any or all of the data passed through but this also happens to matter! If we decide to rename the stored procedure as part of a database refactoring then the unit test is flawed until we change the expectation of the first test. This suggests that we are in fact testing implementation rather than logic and, as far as I understand it, this is not the idea behind unit testing.

Google unit test behaviour not implementation and have a look at what various folks say.

Tags:

Unit Tests

Varying Takes on Database Naff

by Kofi Sarfo 1. June 2010 18:41

Migrating from a database we quite like to one we do not provides as much fun as surgery.

After the DBA has promoted the database into QA we find our application is spouting Primary Key violations because the sequence.nextval values are below those used within their respective tables. Identity columns. Yet another reason to prefer SQL Server. There's a clear pattern here.

Sure, there are some advantages but we'd take portability over the possibility of having more than one auto-generated sequence any day.

It's interesting catching the impressions of an Oracle expert going the other way. Oracle to SQL Server: Crossing the Great Divide, Part 2. The comments in Part 1 make for a fun read. Especially those that bemoan the developer tools.

Replace Development with Design

by Kofi Sarfo 27. May 2010 17:21

About seven years after everyone else got it I think I may have finally gotten it too. That final letter in TDD may be about Design as anything else!

I've never written anything in Java so the offer to deconstruct/reconstruct and deploy a Java EE website based on Struts sounded a death march. We brought in a dedicated contractor who struggled some with Tomcat instances and lock down. Difficulties correctly called.

We needed to work out which code (including SQL) was being called for each page on a public facing website. I didn't fancy clicking through the website and scribbling down the execution path, which someone suggested and I'd no idea whether there existed anything that allows bytecode instrumentation without modifying the Java source. A quick look at Spring and AspectJ indicated that neither are particularly lightweight. 

This looked promising though: Javassist is a load-time reflective system for Java. IBM developerworks has a handy guide to aspect-oriented changes with Javassist. Still not sure, however, how to tie method calls to database calls.

In the end I wrote some code generation utility to add logging for both method calls and database calls (via OJDBC), using Regex to identify each. The key thing here was that no single method in the C# tool was written without first writing a test. Not only is this the first thing I've ever written with 100% code coverage but I managed also to maintain the discipline needed to avoid squeezing in some code without first having a failing test. Each initial test was followed with a passing test using the very simplest solution, writing a second failing test by varying parameters, before completing the implementation.

Once applied to production code I then realised that I'd made no design decisions. Absolutely none! This is something I'd never previously considered. I expect this very same light bulb went on too for many folks almost a decade ago. If you only ever write the next method required the design just emerges. Also, the simplest possible solution in this case meant there were no redundant lines to delete at the end. Just very satisfying indeed.

Tags: , ,

Design

Have hammer, recognise screw, without screwdriver

by Kofi Sarfo 1. February 2010 06:14

For all it's usefulness XQuery never really took off the way it seems it should have done. We have a query language from the W3C for XML and since XML use is pervasive... it's very much like SQL which is wildly popular and since querying data is a fairly natural thing to want to do... and there are some excellent tools from Altova, Stylus Studio and Saxonica... However, there the story appears to end without the expected outcome. Going forward (sorry) we'll certainly be making greater use of XQuery because so many of the terabytes we process are wrapped in angle brackets and it's XML all the way down to even input & output parameters on stored procedures.

To demonstrate its value by example. Here's one way to generate XML using LINQ and hash table to validate against a given schema:

using System.Collections.Generic; using System.Linq; using System.Xml.Linq; namespace Wimiro.Data.Examples { public static class TransformXmlImperatively { public static string GenerateCdsSpreadOutput(this string xml, string requestXml) { var rootNode = xml.GetXmlDocument().FirstChild; var xmlWithNewDocumentRoot = rootNode.GetProcessingInstruction(); xmlWithNewDocumentRoot += "<" + requestXml.GetResponseRootElementName() + " xsi:noNamespaceSchemaLocation=\"curves.xsd\" " + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">"; xmlWithNewDocumentRoot += xml.GetCurvesPointsFromXml(); xmlWithNewDocumentRoot += "</" + requestXml.GetResponseRootElementName() + ">"; return xmlWithNewDocumentRoot; } public static string GetCurvesPointsFromXml(this string xml) { var curveDictionary = new Dictionary<string, XElement>(); var rawDoc = XDocument.Parse(xml); var rawElements = from elements in rawDoc.Elements("root").Elements("row") select elements; var curveRoot = new XDocument(new XElement("root")); foreach (var rawElement in rawElements) { if (!ContainsCurve(curveDictionary, rawElement)) { AddCurve(curveDictionary, rawElement, curveRoot); } FindCurve(curveDictionary, rawElement).Add(GetCurvePoint(rawElement)); } return curveRoot.ToXml().StripRoot(); } private static XElement FindCurve(IDictionary<string, XElement> curveDictionary, XElement rawElement) { return curveDictionary[rawElement.GetKey()]; } private static XElement GetCurvePoint(XElement rawElement) { XElement curvePoint; curvePoint = new XElement("Point"); curvePoint.SetAttributeValue("Spread", rawElement.Attribute("Spread").Value); curvePoint.SetAttributeValue("Maturity", rawElement.Attribute("Maturity").Value); return curvePoint; } private static void AddCurve(Dictionary<string, XElement> curveDictionary, XElement rawElement, XDocument curveRoot) { XElement curveElement; curveElement = new XElement("Curve"); curveRoot.Element("root").Add(curveElement); curveDictionary.Add(rawElement.GetKey(), curveElement); foreach (var curveAttribute in rawElement.Attributes()) curveElement.SetAttributeValue(curveAttribute.Name, curveAttribute.Value); curveElement.SetAttributeValue("Spread", null); curveElement.SetAttributeValue("Maturity", null); } private static bool ContainsCurve(IDictionary<string, XElement> curveDictionary, XElement rawElement) { return curveDictionary.ContainsKey(rawElement.GetKey()); } } }

And here's the way it should have been done using XQuery:

select @xml_out.query (' for $Ticker in distinct-values(/root/row/@Ticker) let $Tickers := /root/row[@Ticker=$Ticker] for $Currency in distinct-values($Tickers/@Currency) let $Currencies := $Tickers[@Currency=$Currency] for $Date in distinct-values($Currencies/@Date) let $Dates := $Currencies[@Date=$Date] return <Curve Ticker="{$Ticker}" Currency="{$Currency}" Date="{$Date}">{ for $node in $Dates return <Point Maturity="{data($node/@Maturity)}" Spread="{data($node/@Spread)}" /> } </Curve>')

StackOverflow! For all your hardware needs.

More Double-Ds. This time it's AMDD.

by Kofi Sarfo 13. November 2009 16:59

During our three day Agile Training course with too many examples contrived to maintain audience engagement through cute caveman cartoons and engineering attempts familiar to all (house-building), one colleague questioned how suitable agile might be in model driven development.

The Agile view was presented in one instance as making use of Zeno's Paradox in reverse. The paradox says, essentially, that motion is illusory because to travel any distance there is a point half the way between start and finish (let's call this half-way) and there is also a point half the way between start and half-way (let's call this a quarter of the way) and so on. Because there are an infinite number of these half-way points it's impossible ever to get anywhere. This being the case the Agile take is that perhaps we're able to make better progress by considering how to only get half-way as opposed to considering in too much detail the end-game (or the whole journey).

If Agile's Raison (Scrum in this example) primarily is to produce some complete functionality periodically (frequently) in tight iterations then the question in the case of model development is "how much value does half an algorithm provide, if any?" If it's not possible to go to market with half a model then shooting for half-way appears only to help as a strategy for maintaining motion rather than for more frequent delivery.

Stated another way: Because the Quant team who are building complex mathematical models are unsure what the finished product will look like they almost have no choice but to work iteratively. The question is then whether their iterations include the development team and so far it looks as if they've not done so sufficiently that Agile's value here probably isn't more frequent delivery of complete vertical slices but helping to ensure that the direction traveled is more likely to be correct by facilitating conversation.

If more frequent contact between the Quant and Development team then mean fewer wasted cycles and fewer trips down blind alleys which might have resulted from more isolated efforts then it's another tick in the Adds Value column - this scenario leverages the Wisdom of Crowds. However, design by committee might just as easily be a problem instead. We'll see.

Meanwhile I'll be discovering how well Continuous Integration works on a development team of one and whether the overhead can be justified.

Tags:

Talks

Kiva Loans

  • Amgnekeleye Group

    Amgnekeleye Group

    Retail

    Requested loan: $2000

    Amount raised: $0

    Diema, Mali

    to buy goods to stock their shop.

    Loan Now »

  • Afërdita

    Afërdita

    Personal Housing Expenses

    Requested loan: $1100

    Amount raised: $0

    , Kosovo

    to purchase new furniture for her house.

    Loan Now »

  • Alfred

    Alfred

    Motorcycle Transport

    Requested loan: $775

    Amount raised: $0

    Lira, Uganda

    to buy a new motorcycle for hire to earn more income from the business.

    Loan Now »

To see more entrepreneurs »

Make a loan to an entrepreneur across the globe for as little as $25. Kiva is the world's first online lending platform connecting online lenders to entrepreneurs across the globe.