Wednesday 10 March 2010

Physician heal thyself

Wow, it's been a busy few weeks, hence the lack of a second blog, but at last here it is!

Starting a new business gives one the opportunity to begin at the ground up with process and procedures. A chance to start with a clean sheet and bring prior experience to bear…..not always easy when time is short, as this cautionary tale will illustrate.

One of our current projects is to tidy up our customer data. How easy that sounded when I started. So I started a salesforce account, exported outlook contacts lists into a csv files, grabbed a handful of old contact lists (MS Excel) and set about using Excel to standardise them. Since they were already in Excel I thought it would be easier to use Excel than anything else. It should be just a matter of moving columns around. How hard can that be?

Wrong!!!!

In short I ended up with endless printouts, lot's of half finished files, multiple backups, more duplication and an even bigger headache through staring at lists and lists of contact data for hours on end.

I also became very nervous about quality. What if copy and paste to the wrong field? What if I delete a column accidentally and can’t get it back. This could prove very embarrassing should I mix up customer information.

It was then that the title of this post came to mind. Here I am, a long term data integration professional with access to a data integration tool (expressor), messing about with MS Excel. Why, why, why?

The answer is simple. Like the titular physician, I ignored my own advice. All because of the allure of the "quick fix". I know from experience that hand cranking using Excel doesn't work, but was seduced by the seeming “simplicity of approach” and more fool me.........

So, after a day or so of Excel torture, I bit the bullet. First, I defined the target fields and created a master customer format in expressor . I then did the same for the source fields and lo and behold I suddenly was in control again. All I then needed was a simple transformation drawing to copy the source data to the target format.

We now have a master format for all our customer data, a master customer file and repeatable process for rebuilding that file. It also means that there is no need to alter the source files, deliberately or accidentally, so no more worries about losing data.

What is also great is that this is not a one-shot deal. Should we feel the need to further enrich the master we can do so easily simply by extending the master format and copying in additional source data. Not an easy thing to do with Excel!

What's more, each list now takes only 15-20 minutes to standardise into the master format, which means that should I receive new lists, we can easily apply them to our master customer list.

Voila!

I suppose the moral here is to take your own advice and use the right tool for the job. When time is tight, it is even more important to choose a reliable and proven approach and to resist the temptation to use one-off hacks.

Oh and I also learned beyond a shadow of a doubt that MS Excel is NOT a reliable data integration tool........I will not weaken again!