There’s no such thing as a fixed price software development project

Fixed price software development contracts are a bad idea because they fail to limit the customer’s costs and they worsen the quality of what is delivered. In this article I will explain why.

What makes software development hard is that a group of people must clarify exactly what they want a computer to do in many thousands of situations. Between a senior executive having an idea in the shower one morning and a computer executing that idea, many people will be involved. Each person might have a different interpretation of that idea and every conversation about that idea might introduce mutations to that idea. Modern software development methodologies have evolved to help groups of people move towards agreement on how software should work by introducing techniques to specify how it will work. These include functional specification techniques (User Stories, UML), visual specification techniques (UI and UX design), iterative specification techniques (Agile processes), and formalised specification techniques (Test Driven Development).

Those techniques are all useful, but a piece of software is never fully specified until it’s been written. The developer takes these artefacts and makes the final set of decisions about what the computer should do. We can tell this is true because it is often necessary to ask a developer “What will the software do in this situation that we’ve just thought of?”. The developer will then check the code to find out.

Any software development project is therefore a collaborative exercise in deciding what should happen in each situation and writing that into code. This is what makes software development hard, not the ability to write code per se. Over the years many attempts have been made to make coding easier by making the syntax more like English. But while languages like Ada, COBOL, Python and Ruby made coding more accessible, they didn’t make software development projects significantly easier, because they didn’t change the fundamental nature of the problem.

During a fixed price software development contract, any individuals on the supply side of the commercial relationship have a strong incentive to influence the decision making in favour of simplifying the software, to make it easier to write and test. The people on the customer side, on the other hand, have an incentive to remove manual tasks for their employees or make their customers’ lives better. They will try and influence the behaviour of the software in that direction. Sooner or later there will be a disagreement about what the software should do. The supplier will say that the behaviour being discussed was not made clear at the start of the project and they want to re-negotiate the price. The customer will respond that any reasonable reading of what was specified at the outset should include the behaviour being discussed and that the price should not change.

Although the supplier on a fixed price software project may not succeed in renegotiating the price every time they want to, they will succeed some of the time. And when they do, they will take the opportunity to make up for failed attempts to renegotiate the price earlier on. Ultimately, the customer engaged the supplier because they need them to build their software for them, and they will not be able to resist every attempt to renegotiate the price. So they will be forced to pay what the supplier says they need in order to complete the project. In very rare circumstances the customer may cancel the contract, but this is costly for the customer and just starts the whole project again, albeit with a new supplier.

Does that mean that customers should request fixed price contracts anyway, on the basis that it might work, and that if it doesn’t they’re no worse off than if they had accepted a time and materials contract? The answer is no, because having two camps in an adversarial battle over how the software should work does not make for good quality software. Sometimes these battles are conducted in good natured discussions. Sometimes they are passive aggressive moves by the supplier to quietly implement their interpretation and see if a bug is raised, and sometimes there are full blown arguments with both sides threatening to call in lawyers. However it’s conducted, it’s not helpful. Good quality software comes from everyone involved being focused on the goal of making the user’s life easier.

In summary, no software development project is ever truly fixed price because the scope is not defined until the code already exists, and having some of the team trying to avoid making the users’ lives easier does not make for good software.

IT Is Too Big To Fail

At some point in the last 30 years, software transitioned from being a powerful productivity boost for individuals, to an indispensable tool for teams working together. I’ll bet that having Windows 3.1 and Microsoft Office on your desktop computer in the early 90s helped you get your job done, but you could probably still (just about) have done your job without it if required.

Those days are gone now. It took a while for the organization you worked for to restructure itself around having always-on, networked systems for collaboration and data storage. But look around any medium to large organization today and it will quickly become apparent: IT is too big to fail.

So how are we doing?

What’s going wrong? Allow me to propose a theory.

One of the biggest problems I see in IT is the culture that values novelty over robustness and continually re-invents things that weren’t broken, while forcing established professionals into non-technical roles or out of the industry altogether. The IT industry needs to get better at shipping reliable and secure systems, and I believe that retaining knowledge, ensuring professionals are properly trained, and reducing unnecessary churn are all part of the solution to that.

IT is not the first industry to become critical to the ongoing function of society. We already rely on:

  • Law
  • Architecture
  • Medicine
  • Accountancy
  • Engineering

If something goes wrong in one of those industries, people either die or lose their livelihoods. So what do they do that we in IT don’t?

  • They all have structures that allow the most experienced professionals to continue to practice in their field while managing and instructing more junior colleagues and advancing their own careers.
  • Practicing in these areas requires accreditation and ongoing accountability to an established industry body.

IT would do well to copy these industries. I propose we should:

  1. Establish a form of chartered status for IT engineers, and require engineers to be chartered before they can work on critical systems (and by that I mean anything a member of the public might use or depend on). That industry body will require its members to continually develop their skills and stay up to date on industry best practice.
  2. Develop a code of ethics and a mechanism for that industry body to ‘strike off’ its members for violating that code. It should have been impossible for managers at VW to pressure developers into cheating the Diesel emissions tests.
  3. Get more technical decision making into upper management discussions. It’s a dangerous and lazy stereotype to say that techies and management simply can’t understand each other. I’ve yet to come across a technical issue that can’t be explained to upper management in terms of a cost, time, benefits, risk trade-off, and I’ve yet to encounter a management issue that an intelligent and conscientious IT engineer can’t grasp the implications of if someone takes the time to properly explain it to them. Over time this will lead to technical experts having a clearer career path into upper management.

Doing these three things will ensure that companies as a whole deliver better, safer, more secure and more reliable IT systems. Ultimately, it will keep us all safer and healthier.

How Became the Internet’s #1 Database Import Tool

The number of files converted into SQL scripts each week, since the launch of

Back in August 2014, D4 launched an experimental product named The idea was simple: would be an easy to use tool that helps people to migrate data into databases, by converting data files into SQL scripts.

When we launched it, we had no idea whether it would be useful, but it’s been a huge success.

The traffic for has grown steadily since its launch, and today, it converts over 10,000 Excel, CSV, JSON and XML files into SQL every month.

Just over half (50.53%) of the files we convert are Excel spreadsheets. One third (33.63%) are CSV files, and around 11% are JSON files.

JSON is the new kid on the data file block, but its influence is growing. The number of JSON files we convert has been going up more rapidly than any other, since we added JSON support back in December 2014.

I think’s success comes down to the fact that it takes quite a complicated job, and makes it seem quite simple. We took the view early on that we mustn’t ask the user for any piece of information that we could work out from the file itself.

So, for example, works out what kind of database table it should create in the database, and what the column types should be; it doesn’t ask the user to provide that information. Instead, it analyses all the data in the file and chooses the right types of column.

Recognising all the different dates, integers, decimals and text entries can get surprisingly complicated, but that’s what it does.
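To give a flavour of what that kind of column analysis involves, here is a minimal Python sketch. It is not the tool’s actual code: the function names `infer_type` and `infer_column_type`, the short list of date formats and the type labels are all assumptions made for illustration, and a real implementation would have to cope with far more formats.

```python
from datetime import datetime
from decimal import Decimal, InvalidOperation

# Hypothetical list of date layouts to try; a real tool would need many more.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def infer_type(value):
    """Classify one raw string as 'INT', 'DECIMAL', 'DATE' or 'TEXT'."""
    text = value.strip()
    try:
        int(text)
        return "INT"
    except ValueError:
        pass
    try:
        # Reject special values such as NaN and Infinity
        if Decimal(text).is_finite():
            return "DECIMAL"
    except InvalidOperation:
        pass
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return "DATE"
        except ValueError:
            continue
    return "TEXT"


def infer_column_type(values):
    """Pick the narrowest type that fits every non-empty value in a column."""
    kinds = {infer_type(v) for v in values if v.strip()}
    if not kinds:
        return "TEXT"
    if kinds == {"INT"}:
        return "INT"
    if kinds <= {"INT", "DECIMAL"}:
        return "DECIMAL"
    if kinds == {"DATE"}:
        return "DATE"
    return "TEXT"
```

The design choice worth noting is that a single value that doesn’t fit (one word in a column of numbers, say) pushes the whole column back to text, which is why every value in the file has to be looked at before a type can be chosen.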

We’ve never placed a paid advert for, all we did was post answers to a few questions about importing data on internet forums. But its success has taught me that a product that solves a real problem for people, and solves it elegantly, will grow by word of mouth, no matter how specialised it may be.

Running for Portage

When you have children, you hope and pray that everything will be ok. That they will be born healthy and grow up just like everyone else.

When our second daughter, Imogen, was born that’s exactly how we felt. But when Immy was about 1 year old we realised that she was having significant difficulty sitting up, walking and talking. Nagging doubts turned into fully blown concerns and soon we were being referred to paediatricians and attending medical appointments to see what might be going on.

We were worried sick. Would she ever walk and talk? Would she always need extra support in life? Is this because of something we’ve failed to do as parents? We’re not the most laid back of parents anyway… this stressed us out more than we ever thought possible.

But nobody can give you an answer. There was no obvious cause. All we could do was wait and see.

Amidst this anxiety, we started to be visited by “Su” from a charity called Portage. Su was amazing. She came every week for an hour and worked with Imogen to develop her abilities. While the doctors, physios and speech therapists could only visit occasionally, Su was there every week, without fail, to improve Imogen’s skills, give us advice and help reassure us that it would be ok.

That helped a lot.

Imogen, big sister Hannah, and Su from Portage

Imogen is 3 now, she’s walking a lot more steadily and she’s starting to say things that almost sound like words you would recognise. She’s still very behind but she’s doing ok. She’s also one of the happiest and most affectionate children I’ve ever known. We have a laugh together, me and Imms.

Imogen Imogen & Dan

But I’d like to give something back to Portage for helping us out as a family over the past year or two. They’ve helped us so much it’s the least I could do.

So I’ve decided to run the Bupa Great Birmingham Run this October to raise money for the National Portage Association. I’ve never run a half marathon before. Actually, I’ve never run more than 6k before. It’s going to be bloody hard work! (You can follow my training here). This is what I looked like after that last run:

Dan after running

As you can see, I’ve a long way to go. But here’s the thing: if you sponsor me, you’ll not only be helping more families get the kind of support that we’ve been so fortunate to receive, you’ll also motivate me to keep going!

Please help me to support this charity by visiting my Just Giving page and making a donation.


Thank you.

Where Customer Service Goes to Die

“Sorry to bother you…”, the email from my client began. It was Monday morning, and I knew the company this person worked for always had conference calls with their customers on a Monday. No doubt she was prepping for a difficult one.

The email went on, “but could you help us with the query below?”

The email chain that followed contained a discussion between my client and one of their customers about an obscure feature on one of their enterprise software products. The customer didn’t like the way it worked and was clearly trying to paint this as a bug rather than a feature. I had recently done some unrelated work on that product at the request of that same customer, and I could tell they were gearing up to say, in effect, “We’re not going to pay for those changes because the product still has this ‘bug’ and we want you to ‘fix’ that too.”

This is a classic manoeuvre in enterprise software support. The customer identifies a broad or complex problem. They propose a small change to solve that problem. Then, once there are people working on it, they conveniently forget about the specific change they requested and insist more work be done because their broad problem is not solved yet.

This sort of thing happens frequently enough to convince me that it’s a deliberate ploy by many corporate managers. The only way to prevent doing work for free as a software vendor is to employ people to keep a very close eye on the scope of every change and argue back with the customer if they try this kind of thing.

A customer service expert might suggest spending more time talking to the customer about their broader problem, and seeking to solve it all from the beginning. Sometimes that works, sometimes that just looks like up-selling and the customer rejects it.

The real issue is that enterprise software exists mostly to join all sorts of disparate parts of an organisation together. It’s plumbing, and plumbing is messy. Plumbing needs continual patching up and tweaking. It’s an ever moving target. It’s also very political. It’s hard enough getting several departments to agree on how they’re going to work together; by the time they’ve agreed a process, getting that implemented in software is the easy bit!

So the companies that make enterprise software spend their days locked in conflict with their own customers, trying to keep a lid on all the complexity and arguing about who’s going to pay for all the changes they keep having to make. Enterprise software is where customer service goes to die.

And any potential way to improve the level of service you can give effectively amounts to not selling enterprise software any more. This includes approaches like:

  • Simplifying what you offer
  • Focusing on doing one thing really really well
  • Trying not to get drawn into customers’ complex inter-departmental problems
  • Trying to have more customers so you’re not so beholden to the few that you do have
  • Saying no when customers ask for complex things
  • Charging for your time rather than for the software

If your business is enterprise software then I feel bad for you. Fortunately there is hope. The oncoming tidal wave of SaaS applications is making enterprise IT more a matter of composing disparate services rather than building or buying large behemoth systems. Over time, most business functions are becoming productised and then owned by 3 or 4 dominant SaaS apps.

Need content management? Try WordPress, Joomla, Drupal or Umbraco. Need CRM? Try Salesforce, Sage or Highrise. Need to send lots of email? Try Mailchimp, Constant Contact, Sendgrid or Campaign Monitor. All of these products offer some customisation, but they differ from old style enterprise products in that they can be set up and configured by users in a browser. Somehow the vendor has hit that sweet spot of product positioning, sensible defaults and customisation capability and as a result they’ve carved out a market for themselves without armies of implementation experts and customisations.

If you’re selling enterprise software, my advice is to stop. Your industry will eventually be eaten by a few major SaaS vendors. So either carve out a niche and become one of those vendors, or switch to making a living gluing things together for a daily rate. Your customers will be happier. You’ll be happier, and you can stop wasting your Mondays arguing over edge cases and who’s funding the ongoing cost of tweaking things.

Perfection Considered Harmful

People are always making mistakes. People send emails to the wrong person, or forget an attachment. They give you the wrong change at the checkout or bump into you in the street. Every day, people make little mistakes. For the most part we’re quite forgiving of each other. We understand that the other human being we’re looking at is doing their best and as long as they try to fix things quickly we cut them a bit of slack. But our forgiving nature is based on our ability to relate to the person who’s made the mistake as being like us, a fellow human being. We are notably less forgiving when someone is driving and needs to switch lanes before their junction. On the road, it is not a fellow human being who’s made a mistake, it’s “some idiot in a BMW”.

When it comes to software systems we are even less likely to relate to the person who’s made them. All software is designed, written and tested by people, but we don’t see it like that. We think of software as some alien entity that arrived on planet Earth fully formed. And like the evil BMW who wants to get in front of us, any mistakes that it makes are utterly unreasonable.

Within the software development world, we prevent mistakes by having people check things. Designs are reviewed, code is tested, the content of those tests is itself reviewed, code is deployed to small groups of people first and monitored carefully, the list of things that went wrong is reviewed and fixes are made, tested, reviewed again, and then finally released to more people. If all goes well it will work perfectly and nobody who uses it will ever experience a problem.

All never goes well.

Each layer of testing and checking is done by people, and people make mistakes. Each layer of checking just reduces the probability of a mistake getting through, but no amount of checking ever guarantees that the software is perfect. Playing the lottery every week will increase your chances of winning, but there’s no magic number of weeks you need to play before you’re guaranteed a jackpot. What is more, like playing the lottery, each time you get someone to check something it costs you money.
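That argument can be put into numbers with a small Python sketch. The catch rates below are invented purely for illustration, and the calculation assumes each layer of checking catches mistakes independently of the others, which real reviews and tests rarely do.

```python
def escape_probability(catch_rates):
    """Chance that a mistake slips past every layer of checking.

    Each entry in catch_rates is the (assumed) probability that one layer
    spots the mistake; layers are treated as independent.
    """
    remaining = 1.0
    for rate in catch_rates:
        remaining *= 1.0 - rate
    return remaining


# Hypothetical layers: design review, automated tests, staged rollout.
layers = [0.6, 0.7, 0.5]
for count in range(1, len(layers) + 1):
    print(count, "layer(s):", round(escape_probability(layers[:count]), 4))
```

Each extra layer shrinks the figure, but it only reaches zero if some layer catches every mistake, and every layer added has a cost of its own.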

As a supplier of software to others, your job is to find the right balance when customers demand new features, developed quickly, with no bugs, on a budget. The cost of checking and re-checking each new bit of code that goes out the door must be traded off against the value of the new features you’re developing. If you had the new feature 3 weeks earlier, but had to refund 3 customers who found errors in it, would that be worth it? Or does the cost of having 3 angry customers outweigh the benefit of getting the feature sooner?

There is no easy answer here. Be assured: the site will not stay up 100% of the time, there will be bugs, and occasionally a sale will be lost or a refund will have to be made. This is a natural consequence of having human beings do work for you. But the important thing to remember is this: do not allow people to think it’s reasonable to expect perfection. Involve the customers/users in the trade-off decision making process. Try to put a human face on the software you’ve created. Be present as much as possible. Smile. Talk in terms of probabilities. Be humble. Say “sorry” early when things go wrong. And if you do that really well, you may actually go up in people’s minds when things go wrong, not down.

Behind the Scenes at QueryTree: The Anatomy of a Web Based Data Tool

QueryTree is a web based drag and drop tool for doing simple data analysis. Users drag and connect tools in a chain to first load in data, then manipulate that data (filter, group, sort etc.) and then visualise that data on a chart or map. I started building QueryTree by myself in between freelance projects about a year ago. More recently I’ve been able to devote more time to it and hire in another developer to help me. In this post I talk about some of the tools and technologies we used to build QueryTree and why.

The User Interface

QueryTree’s drag and drop interface was the first part of the tool to be developed. There was a time when I would do demos just off a single static HTML file with some JavaScript in – which involved a certain amount of smoke and mirrors! The tool is all about the interface and I wanted to see what people’s reaction to it was. Suffice to say that people like it and that encouraged me to continue working towards an MVP.

The UI is built using an HTML page and a few JavaScript frameworks. These are:

  • RequireJS – this handles the modularisation of all the JavaScript and makes it easy to load new scripts asynchronously. Having modules for our stuff has been great, but getting other JavaScript frameworks integrated into Require has been a real pain. I’ve used a variety of tweaks to the require shim to set up dependencies, exported symbols and in some cases just wrapping third party libraries in require modules to keep them happy.
  • jQuery – naturally. I’m also using some jQuery UI elements such as dialog, draggable and button.
  • Knockout – this lets us template the HTML markup that we want and then bind those templates to JavaScript object models. There are “purer” frameworks such as Backbone or Angular but Knockout had a great tutorial which is what first got me into it and the framework has a minimal feel which suited my incremental development process. I started with some very simple HTML, I knockout-ified it, I added some more HTML, and somewhere along the way the QueryTree UI got built. I really like it and have never found it was holding me back.
  • Fabric.js – this framework wraps the standard canvas API and gives you some extras like an object model for your shapes and SVG rendering. It does some extras like event handling but I’m not currently using those as the jQuery events were more familiar.
  • Flot – this is a JavaScript charting library. It uses the canvas element and I’m using it primarily because it’s fast. One of the things about QueryTree is that the user has complete flexibility to define whatever charts they like, which means they could do something stupid like try and plot 100,000 points. There are some charting libraries that I tried which have plenty of features, but which ground to a complete halt if the user did something bad.
  • Less – I’ve worked on so many projects where the CSS became a real mess and nobody dared refactor it for fear of never being able to get things looking right again. So for this project I took the decision to be a bit more organised from the start. Using Less gave us a couple of weapons with which to beat back the CSS spaghetti. The nesting was the most useful, though: it enabled us to be specific about which elements we wanted to style, without adding lots of code.
  • Handsontable – The table tool uses this plugin to display an editable table to the user. Crucially, it has support for copying and pasting in from spreadsheets and HTML tables, which is fantastic.
  • Some free SVG icons
  • This reset.css script

The Middle Tier

Once the UI was mostly built and demoing well to interested parties I started on the middle tier. My language of choice these days is Python and as I’d been taking advantage of Google App Engine’s free usage allowance for hosting my static HTML and JS it was totally natural (almost accidental) to start adding bits in Python. As is standard on App Engine, we’re using Jinja for templating and WebApp2 for HTTP handlers.

First off, I started working on a JSON API that the UI could use to save the current state of the worksheet and to request the data for a particular tool. From the UI side I built objects (subsets of the Knockout model objects) then just stringified them before using jQuery to POST them to a URL. At the server end I started out using json.loads and json.dumps and have never looked back. It just works and it’s simple so I never felt the need to use a framework for my JSON API.
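The server side of that exchange can be sketched in a few lines of Python. This is an illustration rather than QueryTree’s real handler: the `handle_save` function, the in-memory `WORKSHEETS` store and the field names in the payload are assumptions, and the HTTP request plumbing is left out so the example runs on its own.

```python
import json

# Hypothetical in-memory stand-in for wherever worksheets are persisted.
WORKSHEETS = {}


def handle_save(request_body):
    """Parse a POSTed JSON worksheet, store it, and return a JSON reply."""
    try:
        worksheet = json.loads(request_body)
    except ValueError:
        return json.dumps({"ok": False, "error": "invalid JSON"})
    if not isinstance(worksheet, dict) or "id" not in worksheet:
        return json.dumps({"ok": False, "error": "missing id"})
    WORKSHEETS[worksheet["id"]] = worksheet
    return json.dumps({"ok": True, "tools": len(worksheet.get("tools", []))})


# The browser would send the string produced by JSON.stringify on its model objects.
body = '{"id": "ws1", "tools": [{"type": "filter"}, {"type": "chart"}]}'
reply = json.loads(handle_save(body))
```

Because `json.loads` hands back plain dictionaries and lists, there is very little for a framework to add at this size, which fits the experience described above.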

For user settings and worksheet setup I just used the Google App Engine Data Store. I felt nervous about locking us into App Engine but again, it’s so easy to get hooked. If we had to, though, I’m sure we could replace it with something like MongoDB. It’s basically just a framework for persisting Python objects, so the GQL queries and the .put() methods are the only places where you really interact with it.

When it comes to data processing, well, we cheated. Each worksheet in QueryTree is allocated to a MySQL database and any data that you upload to a worksheet is stored in a special table for that tool. When you add chains of data manipulation tools to that worksheet and connect them to a data source the python code “compiles” that into SQL and runs it on the MySQL database. So, in effect, QueryTree is just an SQL development environment. When you click on a tool, the query is run and your data returned as JSON objects over HTTP, which the UI then renders in the results area at the bottom of the screen.
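One way to picture that “compile” step is as nested subqueries, with each tool wrapping the query produced by the tool before it. The sketch below is an assumption about the general approach, not QueryTree’s actual code: the tool dictionaries, the `compile_chain` function and the table name are invented for illustration. Values are passed as `%s` placeholders for the database driver to fill in, but column names are interpolated directly, so a real implementation would need to validate them first.

```python
def compile_chain(source_table, tools):
    """Return (sql, params) for a source table followed by a chain of tools.

    Each tool wraps the query built so far in a subquery, so the chain
    reads inside-out in the generated SQL.
    """
    sql = "SELECT * FROM `{0}`".format(source_table)
    params = []
    for index, tool in enumerate(tools):
        alias = "t{0}".format(index)
        kind = tool["type"]
        if kind == "filter":
            sql = "SELECT * FROM ({0}) AS {1} WHERE `{2}` = %s".format(
                sql, alias, tool["column"])
            params.append(tool["value"])
        elif kind == "group":
            sql = ("SELECT `{2}`, COUNT(*) AS `count` "
                   "FROM ({0}) AS {1} GROUP BY `{2}`").format(
                sql, alias, tool["column"])
        elif kind == "sort":
            sql = "SELECT * FROM ({0}) AS {1} ORDER BY `{2}`".format(
                sql, alias, tool["column"])
        else:
            raise ValueError("unknown tool type: {0}".format(kind))
    return sql, params


# Hypothetical worksheet: filter the uploaded rows, then count them per city.
chain = [
    {"type": "filter", "column": "country", "value": "UK"},
    {"type": "group", "column": "city"},
]
sql, params = compile_chain("upload_42", chain)
```

Clicking on a tool part-way along the chain would then simply mean compiling the tools up to that point and running the result.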

The Back End

As I said above, all the data ends up in a MySQL database somewhere. We have a system for allocating worksheets to a database and can grow the pool of databases as required. We use cloud hosted MySQL but could manage our own if we wanted, the web app just needs to know how to connect to the databases. We have workers which can clear down tables that are no longer being used to free up space too.

Keeping the data in this way does place an upper limit on the amount of data that QueryTree can handle on one worksheet to however much data a single MySQL database can hold. In practice though, that data has to come over HTTP to get into QueryTree, either as a file upload or web fetch, so the databases are not the limiting factor.

Other Bits

In no particular order, we’re using the following additional tools:

  • Joyride – Our product tour is built using Joyride, a JavaScript framework for scripting tour popups
  • Paymill – We take payments using Paymill.
  • Google Maps – The map tool renders a Google map in the results area.
  • Testing Framework – We do automated testing of the full stack using a testing framework that I wrote myself (along the lines of qUnit). So much of the functionality of QueryTree is spread out across the UI, middle tier and the SQL that is generated, or in how these layers interact, that simply testing one layer didn’t add enough value. I could have written 1000s of tests for the UI alone or for the python middle tier alone, but they would not have given me any comfort about deploying a new version to live. So I built a framework which drives the application from the UI and which exercises the whole stack. If those tests pass, I can have some confidence in the whole system being basically functional.
  • Bash – Whenever we want to deploy to live we type ./

The Future

Aside from adding more tools and more options on the existing tools, there are three areas of the overall architecture that I’d like to improve:

  • Performance
    There’s a lot we can do to improve the performance of queries. We can automatically index tables (especially tables from uploaded files because the data isn’t changing) based on the columns that users’ queries are scanning a lot. We can also tune the queries that we generate. So far we’ve just made it work; we haven’t really thought about performance at all.
  • Interfaces
    We have the web data tool for pulling data in from URLs but for non-technical users I’d like to add tools that fetch data from specific services such as Twitter, ScraperWiki or These will be built on the same underlying tools as the web data tool but will be hard coded around the particular structure of the API in question. This will make it easier for users to pull meaningful data into their worksheets from third party sources.
  • Tool SDK
    Longer term I’d like to give people the ability to build their own tools and load them into QueryTree. The system is quite modular so coding up a new tool really just involves some JavaScript to define the tool at the client side, a HTML options dialog and a python object to define how the back end turns the settings into a SQL query. The only real challenge is productising it all, locking down the code that outsiders can upload and building the necessary development tools.

Looking back over this list, it struck me just how much I’m standing on the shoulders of giants here. QueryTree is a thin layer on top of a lot of third party components and my job has been simply to glue them all together in a way that is hopefully useful to others.


Exporting all Sheets on a Spreadsheet to a Single CSV

If you have a spreadsheet with multiple sheets/tabs containing similar tables of data; and you want to export the whole lot to a single CSV, then this VBA macro should help:

Sub ExportAllSheetsToSingleCSV()
    'The file to write to
    Dim outputFile As String
    outputFile = "C:\Users\dan\output.csv"
    Dim f As Integer
    f = FreeFile()
    Dim headerLine As String
    Dim Sheet As Worksheet, Row As Range, cell As Range
    Open outputFile For Output As f
    For Each Sheet In Worksheets
        'UsedRange keeps the scan to the populated area of the sheet; looping over
        'Sheet.Rows would visit every row on the sheet, including all the empty ones
        For Each Row In Sheet.UsedRange.Rows
            Dim line As String
            line = ""
            Dim sep As String
            sep = ""
            Dim lineIsEmpty As Boolean
            lineIsEmpty = True
            'Work through all cells on this row
            For Each cell In Row.Cells
                If cell <> "" Then
                    line = line & sep & cell
                    sep = ","
                    lineIsEmpty = False
                End If
            Next cell
            'Did we find anything
            If lineIsEmpty = False Then
                'Dont write the header line out multiple times
                If headerLine <> line Then
                    Print #f, line
                End If
                'Set the header line to the first non empty line we find
                If headerLine = "" Then
                    headerLine = line
                End If
            End If
        Next Row
    Next Sheet
    Close #f
End Sub

Parsing Large CSV Blobs on Google App Engine

When parsing a blob on Google App Engine using the Python CSV library, the simplest approach is to pass the BlobReader straight into the CSV reader. However, unlike when opening a normal file, there is no option to handle universal newline characters. In order to handle all the different kinds of newline characters, the string’s splitlines method can be used. However, doing that without loading the entire file into memory can be tricky. Google recommends blobs should be read 1MB at a time, so ideally, you could load 1MB into a buffer, split the lines and then feed the CSV reader one line at a time. That’s what this class does:

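class BlobIterator:
    """Because the python csv module doesn't like strange newline chars and
    the google blob reader cannot be told to open in universal mode, then
    we need to read blocks of the blob and 'fix' the newlines as we go"""

    def __init__(self, blob_reader):
        self.blob_reader = blob_reader
        self.last_line = ""
        self.line_num = 0
        self.lines = []
        self.buffer = None

    def __iter__(self):
        return self

    def next(self):
        # Refill when nothing is loaded yet, or when only the last (possibly
        # partial) line of the current block is left
        while not self.buffer or len(self.lines) == self.line_num + 1:
            self.buffer = self.blob_reader.read(1048576)  # 1MB buffer
            if not self.buffer:
                # End of the blob: hand back any unterminated final line, then stop
                if self.last_line:
                    result, self.last_line = self.last_line + "\n", ""
                    return result
                raise StopIteration
            self.lines = self.buffer.splitlines()
            self.line_num = 0

            # Handle special case where our block just happens to end on a new line
            # (a "\r\n" pair split exactly across two blocks still gives one extra blank line)
            if self.buffer[-1:] == "\n" or self.buffer[-1:] == "\r":
                self.lines.append("")

            # Join the partial line left over from the previous block onto the
            # first line of this one, and remember this block's own last line
            self.lines[0] = self.last_line + self.lines[0]
            self.last_line = self.lines[-1]

        result = self.lines[self.line_num] + "\n"
        self.line_num += 1
        return result

    __next__ = next  # Python 3 name for the same method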

Having defined this class, you can call it like this:

    blob_reader = blobstore.BlobReader(blob_key)
    blob_iterator = BlobIterator(blob_reader)
    reader = csv.reader(blob_iterator)

The BlobIterator supports the __iter__ convention but behind the scenes loads 1MB of the blob into memory, splits the lines and then keeps track of the last partial line so it can combine it with the first partial line of the next 1MB block.
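The same buffering idea can be tried outside App Engine. The generator below is a separate sketch, not part of the class above: `iter_lines` is a hypothetical name, it normalises line endings with string replacement rather than splitlines, it works on any object with a `read(size)` method that returns text, and the tiny block size in the example is only there to force lines to straddle block boundaries.

```python
import csv
import io


def iter_lines(reader, block_size=1024 * 1024):
    """Yield newline-terminated lines from reader, one block at a time."""
    pending = ""  # partial last line carried over from the previous block
    while True:
        block = reader.read(block_size)
        if not block:
            break
        text = pending + block
        # A trailing "\r" may be the first half of a "\r\n" pair, so hold it back
        held = ""
        if text.endswith("\r"):
            text, held = text[:-1], "\r"
        pieces = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
        # The last piece has no line ending yet, so keep it for the next block
        pending = pieces.pop() + held
        for piece in pieces:
            yield piece + "\n"
    if pending:
        # Whatever is left at the end of the input is the final line
        yield pending.rstrip("\r") + "\n"


# Mixed "\r\n", "\r" and "\n" endings, read four characters at a time.
data = io.StringIO("a,b\r\nc,d\re,f\ng,h", newline="")
rows = list(csv.reader(iter_lines(data, block_size=4)))
```

Holding back a trailing carriage return until the next block arrives is what stops a Windows-style line ending that straddles two blocks from being counted as two separate line breaks.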

UPDATE: 2014-08-14: Many thanks to Javier Carrascal for his help in spotting an issue with the first version of the BlobIterator. The code above has been updated with a fix. His post explains the process he went through.