Copying & Pasting Example Code for the Google Buzz API

by Kofi Sarfo 14. June 2010 08:56

So the Google Buzz API has been available for a month and I've added all kinds of value to this example code. Below is a link to all those folks I follow (who have supplied an image):

It's trivial and yields a social graph in essentially four lines of code. Nice! The only thing we needed to consider was the use of response.data.entry instead of response.data.item, which became obvious by looking at this JSON showing who I follow on Buzz.
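For anyone poking at a similar response outside the browser, here is a minimal Python sketch of the same idea. Only the data.entry nesting comes from the post; the payload and the displayName and thumbnailUrl field names are assumptions made up for illustration, not the documented Buzz response format.

```python
import json

# Hypothetical, heavily trimmed payload. Only the data.entry nesting is taken
# from the post; the fields inside each entry are assumptions.
SAMPLE_RESPONSE = """
{
  "data": {
    "entry": [
      {"displayName": "Ada", "thumbnailUrl": "http://example.com/ada.png"},
      {"displayName": "Grace"}
    ]
  }
}
"""


def following_with_images(raw_json):
    """Return (name, image URL) pairs for followed people who supplied an image."""
    response = json.loads(raw_json)
    # The list of people lives under data.entry, not data.item.
    entries = response.get("data", {}).get("entry", [])
    return [
        (person["displayName"], person["thumbnailUrl"])
        for person in entries
        if person.get("thumbnailUrl")
    ]
```

Calling following_with_images(SAMPLE_RESPONSE) keeps only the entries that carry an image, and a payload keyed on data.item instead yields an empty list rather than an error.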

Because most of our data-related time is spent in XML land, we immediately think XSD and find there exists something similar for JSON: JSON Schema! The Address Schema, for example, looks reasonable but we wonder how necessary all those double quotes might be. They seem to be redundant tokens that only reduce readability. Otherwise it's friendly enough.

Tags:

bookmark | REST | Toys

Varying Takes on Database Naff

by Kofi Sarfo 1. June 2010 18:41

Migrating from a database we quite like to one we do not provides as much fun as surgery.

After the DBA has promoted the database into QA we find our application is spouting Primary Key violations because the sequence.nextval values are below those used within their respective tables. Identity columns. Yet another reason to prefer SQL Server. There's a clear pattern here.

Sure, there are some advantages but we'd take portability over the possibility of having more than one auto-generated sequence any day.
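The primary key violations above come from sequences that lag behind the keys already in their tables. One common way to nudge an Oracle sequence forward without dropping it is to widen its increment temporarily. Below is a rough Python sketch that only builds the statements. It assumes the sequence normally increments by 1, and that the last value it issued and the table's MAX(id) have already been queried; the function and sequence names are hypothetical.

```python
def resync_statements(sequence, current_value, max_id):
    """Build Oracle statements that move `sequence` past `max_id`.

    Assumptions: `current_value` is the value most recently returned by
    sequence.NEXTVAL, `max_id` is MAX(id) of the table the sequence feeds,
    and the sequence normally increments by 1.
    """
    gap = max_id - current_value
    if gap <= 0:
        # The next NEXTVAL is already above every existing key.
        return []
    return [
        # Widen the step so that a single NEXTVAL lands on max_id ...
        f"ALTER SEQUENCE {sequence} INCREMENT BY {gap}",
        f"SELECT {sequence}.NEXTVAL FROM dual",
        # ... then restore the normal step, so the next value is max_id + 1.
        f"ALTER SEQUENCE {sequence} INCREMENT BY 1",
    ]
```

For example, resync_statements("orders_seq", 42, 500) produces three statements that leave the sequence handing out 501 next. Running them against a busy schema would still want a DBA's eye.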

It's interesting catching the impressions of an Oracle expert going the other way. Oracle to SQL Server: Crossing the Great Divide, Part 2. The comments in Part 1 make for a fun read. Especially those that bemoan the developer tools. Toad's okay.

Replace Development with Design

by Kofi Sarfo 27. May 2010 17:21

About seven years after everyone else got it I think I may have finally gotten it too. That final letter in TDD may be as much about Design as anything else!

I've never written anything in Java so the offer to deconstruct/reconstruct and deploy a Java EE website based on Struts sounded like a death march. We brought in a dedicated contractor who struggled some with Tomcat instances and lock down. Difficulties correctly called.

We needed to work out which code (including SQL) was being called for each page on a public-facing website. I didn't fancy clicking through the website and scribbling down the execution path, which someone suggested, and I'd no idea whether there existed anything that allows bytecode instrumentation without modifying the Java source. A quick look at Spring and AspectJ indicated that neither is particularly lightweight.

This looked promising though: Javassist is a load-time reflective system for Java. IBM developerworks has a handy guide to aspect-oriented changes with Javassist. Still not sure, however, how to tie method calls to database calls.

In the end I wrote a code generation utility to add logging for both method calls and database calls (via OJDBC), using Regex to identify each. The key thing here was that no single method in the C# tool was written without first writing a test. Not only is this the first thing I've ever written with 100% code coverage but I managed also to maintain the discipline needed to avoid squeezing in some code without first having a failing test. Each initial failing test was made to pass using the very simplest solution, followed by a second failing test written by varying the parameters, before completing the implementation.
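To give a flavour of the regex approach, here is a small Python sketch, not the original C# tool. It only handles method entry (no database calls), assumes the opening brace sits on the same line as the declaration, and uses System.out.println as a stand-in for whatever logger is actually in play. The pattern and function names are hypothetical.

```python
import re

# Simplified pattern for Java method declarations with the opening brace on
# the same line. Assumption: a real code base needs something more forgiving
# (annotations, braces on the next line) or a proper parser.
METHOD_DECLARATION = re.compile(
    r"^(?P<indent>\s*)(?:public|protected|private)\s+(?:static\s+)?"
    r"[\w<>\[\]]+\s+(?P<name>\w+)\s*\([^)]*\)\s*(?:throws\s+[\w.,\s]+)?\{\s*$"
)


def add_entry_logging(java_source):
    """Insert a log statement as the first line of each matched method body."""
    output = []
    for line in java_source.splitlines():
        output.append(line)
        match = METHOD_DECLARATION.match(line)
        if match:
            indent = match.group("indent") + "    "
            name = match.group("name")
            output.append(f'{indent}System.out.println("ENTER {name}");')
    return "\n".join(output)
```

Class declarations and constructors fall through untouched because the pattern insists on a return type followed by a name and a parameter list. Each of those cases is exactly the sort of thing that earns its own failing test first.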

Once applied to production code I then realised that I'd made no design decisions. Absolutely none! This is something I'd never previously considered. I expect this very same light bulb went on too for many folks almost a decade ago. If you only ever write the next method required the design just emerges. Also, the simplest possible solution in this case meant there were no redundant lines to delete at the end. Just very satisfying indeed.

Obviously

April is the cruelest month?

by Kofi Sarfo 19. April 2010 12:47

Actually, it's not especially so having spent ten days in Palo Alto this month not doing anything remotely tech-related. The next time I'm there though I'll be sure to pop by the William Gates Center in Stanford at the very least. Well, I finally got round to playing with NServiceBus but that's pretty much it.

Back in Blighty meanwhile there's been a steadily growing realisation that I much prefer SQL Server 2008 to Oracle 11g. What ought to be simple isn't necessarily so, productivity is near zero, and success is being defined in terms of not doing anything too damaging to production systems during migration. The plan is to port the Bloomberg ETL process from SQL Server to Oracle and replace the shambolic FTP interface with what seems to be a more robust web services equivalent over HTTPS. Authentication via X509 certificate. Nothing too radical.

We're talking to Oracle via ODP.NET - having known for a while that Microsoft was deprecating System.Data.OracleClient - and using the OracleXmlCommandType. For inserts and updates I lost a crazy amount of time by not using the System.Xml.XmlConvert.ToDateTime method. Oracle's quite particular about date format so "2010-04-01" is an Unparseable Date OracleException, which seems more than a little ridiculous.

Also surprised to find that there are no DATEDIFF and DATEADD functions out of the box. Luckily, they're here. Perhaps I'm used to having better developer tools. Or I need my hand to be held. Probably a bit of both. There's certainly no contest in deciding which database I'd rather work with given the partition enhancements in SQL Server 2008. Apparently, I'm not alone with regard to views about Oracle being an obstacle.

Swinging & Hitting Nothing But Tee!

by Kofi Sarfo 10. February 2010 04:12

How I nearly wet myself.

"We want to be 'Agile'"

"Really? You want to learn how to value and trust your people? Be courageous? Build the right software at the right time? That stuff? Cool..."

"Um, no. We want predictability and metrics. I want to know our velocity so that I know how lazy my developers are." - The Bovine Synchotron

It's almost as if an external consultant were providing measured commentary on our team's management-imposed attempt to go agile. We're using Mingle as a time sheet application, amongst other things, with the sweetener that the alternative is Microsoft Project and/or a dedicated time sheet application. So far I've not found where I'm able to record getting up between 3am and 4am to check the ETL process, which fails on account of a fragile FTP interface to Bloomberg. Still, it's early days and the good news is that we're hurtling towards automated builds.

Have hammer, recognise screw, can't find screwdriver

by Kofi Sarfo 1. February 2010 06:14

For all its usefulness XQuery never really took off the way it seems it should have done. We have a query language from the W3C for XML and since XML use is pervasive... it's very much like SQL which is wildly popular and since querying data is a fairly natural thing to want to do... and there are some excellent tools from Altova, Stylus Studio and Saxonica... However, there the story appears to end without the expected outcome. Going forward (sorry) we'll certainly be making greater use of XQuery because so many of the terabytes we process are wrapped in angle brackets and it's XML all the way down, even to input & output parameters on stored procedures.

To demonstrate its value by example, here's one way to generate XML using LINQ and a hash table so that the output validates against a given schema:

using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

namespace Wimiro.Data.Examples
{
    // Extension helpers such as GetXmlDocument, GetKey, ToXml and StripRoot
    // are defined elsewhere and are not shown here.
    public static class TransformXmlImperatively
    {
        public static string GenerateCdsSpreadOutput(this string xml, string requestXml)
        {
            var rootNode = xml.GetXmlDocument().FirstChild;
            var xmlWithNewDocumentRoot = rootNode.GetProcessingInstruction();
            xmlWithNewDocumentRoot += "<" + requestXml.GetResponseRootElementName()
                + " xsi:noNamespaceSchemaLocation=\"curves.xsd\" "
                + " xmlns:xsi=\"http://www.w3.org/2001/XMLSchema-instance\">";
            xmlWithNewDocumentRoot += xml.GetCurvesPointsFromXml();
            xmlWithNewDocumentRoot += "</" + requestXml.GetResponseRootElementName() + ">";
            return xmlWithNewDocumentRoot;
        }

        public static string GetCurvesPointsFromXml(this string xml)
        {
            var curveDictionary = new Dictionary<string, XElement>();
            var rawDoc = XDocument.Parse(xml);
            var rawElements = from elements in rawDoc.Elements("root").Elements("row")
                              select elements;
            var curveRoot = new XDocument(new XElement("root"));

            foreach (var rawElement in rawElements)
            {
                if (!ContainsCurve(curveDictionary, rawElement))
                {
                    AddCurve(curveDictionary, rawElement, curveRoot);
                }
                FindCurve(curveDictionary, rawElement).Add(GetCurvePoint(rawElement));
            }

            return curveRoot.ToXml().StripRoot();
        }

        private static XElement FindCurve(IDictionary<string, XElement> curveDictionary, XElement rawElement)
        {
            return curveDictionary[rawElement.GetKey()];
        }

        private static XElement GetCurvePoint(XElement rawElement)
        {
            XElement curvePoint;
            curvePoint = new XElement("Point");
            curvePoint.SetAttributeValue("Spread", rawElement.Attribute("Spread").Value);
            curvePoint.SetAttributeValue("Maturity", rawElement.Attribute("Maturity").Value);
            return curvePoint;
        }

        private static void AddCurve(Dictionary<string, XElement> curveDictionary, XElement rawElement, XDocument curveRoot)
        {
            XElement curveElement;
            curveElement = new XElement("Curve");
            curveRoot.Element("root").Add(curveElement);
            curveDictionary.Add(rawElement.GetKey(), curveElement);

            foreach (var curveAttribute in rawElement.Attributes())
                curveElement.SetAttributeValue(curveAttribute.Name, curveAttribute.Value);

            // Setting an attribute to null removes it: Spread and Maturity
            // belong on the Point elements, not on the Curve.
            curveElement.SetAttributeValue("Spread", null);
            curveElement.SetAttributeValue("Maturity", null);
        }

        private static bool ContainsCurve(IDictionary<string, XElement> curveDictionary, XElement rawElement)
        {
            return curveDictionary.ContainsKey(rawElement.GetKey());
        }
    }
}

And here's the way it should have been done using XQuery:

select @xml_out.query ('
    for $Ticker in distinct-values(/root/row/@Ticker)
    let $Tickers := /root/row[@Ticker=$Ticker]
    for $Currency in distinct-values($Tickers/@Currency)
    let $Currencies := $Tickers[@Currency=$Currency]
    for $Date in distinct-values($Currencies/@Date)
    let $Dates := $Currencies[@Date=$Date]
    return
        <Curve Ticker="{$Ticker}" Currency="{$Currency}" Date="{$Date}">{
            for $node in $Dates
            return <Point Maturity="{data($node/@Maturity)}" Spread="{data($node/@Spread)}" />
        }
        </Curve>
')
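For readers without an XQuery processor to hand, the grouping the query performs can be sketched in a few lines of Python using only the standard library. This is an illustration of the grouping rather than a replacement for the query. It assumes every row carries Ticker, Currency, Date, Maturity and Spread attributes, and it emits curves in first-seen order rather than in the nested order the XQuery produces. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def group_rows_into_curves(xml_text):
    """Group <row> elements by (Ticker, Currency, Date) into <Curve> elements."""
    curves = {}  # dicts keep insertion order, so curves come out first-seen
    for row in ET.fromstring(xml_text).findall("row"):
        key = (row.get("Ticker"), row.get("Currency"), row.get("Date"))
        if key not in curves:
            ticker, currency, date = key
            curves[key] = ET.Element(
                "Curve", {"Ticker": ticker, "Currency": currency, "Date": date}
            )
        # Each row contributes one Point to the curve it belongs to.
        ET.SubElement(
            curves[key],
            "Point",
            {"Maturity": row.get("Maturity"), "Spread": row.get("Spread")},
        )
    return list(curves.values())
```

Passing the same /root/row document in as a string gives back a list of Curve elements, each of which can be serialised with ET.tostring. It is shorter than the LINQ version, but the declarative query still says what is wanted more directly.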

StackOverflow! For all your hardware needs.

Typical 2356% APR On a Payday Loan Does Seem Quite An Incredible Offer!

by Kofi Sarfo 1. January 2010 14:37

The vanilla BlogEngine.NET allows comments to be posted via an HTTP POST, which is great in terms of enabling an AJAX implementation for blog post comments. However, it's great too for spam bots which, almost exclusively, offer payday loans throughout the comment sections of this site. One new year's resolution was to implement a solution using ReCaptcha. In this case the solution may require writing no code.

StackOverflow: How would one integrate ReCaptcha in to BlogEngine.net (ASP.net C#)?

I can't think why anybody would think a payday loan a good idea when there's Zopa, for example.

They Seek Him Here, They Seek Him There

by Kofi Sarfo 14. November 2009 08:05

Having tremendous fun playing find the missing ThoughtWorks.CruiseControl.MSBuild.dll.
At times it feels as if some twisted soul is curating the internet simply to thwart my efforts.

More Double-Ds. This time it's AMDD.

by Kofi Sarfo 13. November 2009 16:59

During our three-day Agile training course, with too many examples contrived to maintain audience engagement through cute caveman cartoons and engineering attempts familiar to all (house-building), one colleague questioned how suitable agile might be in model-driven development.

The Agile view was presented in one instance as making use of Zeno's Paradox in reverse. The paradox says, essentially, that motion is illusory because to travel any distance there is a point half the way between start and finish (let's call this half-way) and there is also a point half the way between start and half-way (let's call this a quarter of the way) and so on. Because there are an infinite number of these half-way points it's impossible ever to get anywhere. This being the case the Agile take is that perhaps we're able to make better progress by considering how to only get half-way as opposed to considering in too much detail the end-game (or the whole journey).

If Agile's raison d'être (Scrum in this example) is primarily to produce some complete functionality periodically (frequently) in tight iterations then the question in the case of model development is "how much value does half an algorithm provide, if any?" If it's not possible to go to market with half a model then shooting for half-way appears only to help as a strategy for maintaining motion rather than for more frequent delivery.

Stated another way: because the Quant team who are building complex mathematical models are unsure what the finished product will look like they almost have no choice but to work iteratively. The question is then whether their iterations include the development team and so far it looks as if they've not done so sufficiently. Agile's value here, then, probably isn't more frequent delivery of complete vertical slices but helping to ensure that the direction traveled is more likely to be correct by facilitating conversation.

If more frequent contact between the Quant and Development teams then means fewer wasted cycles and fewer trips down blind alleys which might have resulted from more isolated efforts then it's another tick in the Adds Value column - this scenario leverages the Wisdom of Crowds. However, design by committee might just as easily be a problem instead. We'll see.

Returning to the initial question of how well suited the Agile Methodology might be for Model Development, Scott Ambler provides one possible answer: Agile Model Driven Development (AMDD): The Key to Scaling Agile Software Development.

Meanwhile I'll be discovering how well Continuous Integration works on a development team of one and whether the overhead can be justified.

Tags:

Talks

Watching Others Do the Same TDD Calculator Kata

by Kofi Sarfo 23. October 2009 06:17
I've been doing this String Calculator Kata whenever I've had a spare half hour before 7am and wanted to see how others did it. See the Andrew Woodward Calculator Kata and the Bobby Johnson Calculator Kata.

Kofi Sarfo modified theme by Mads Kristensen



Content by WIMIRO Technology is licensed under a Creative Commons Attribution-Share Alike 2.0 UK: England & Wales License.

Powered by BlogEngine.NET 1.5.0.7

About Me

Director, Wimiro Technology
London, United Kingdom

Writes in third person and first person plural; currently commutes to Moorgate.
