Datamartist gives you data profiling and data transformation in one easy to use visual tool.

Adding self serve data transformation to reduce shadow systems

Do you have lots of unofficial spreadsheets in your organization being used for data analysis? Is data warehouse usage low to non-existent, yet somehow lots of data is appearing in PowerPoint presentations and Excel spreadsheets all over the company?

I believe a key to understanding how information moves around your organization is to think of it as a mini economy. (I know, the economy is not our favorite subject right now, but bear with me).

There are information suppliers, and information consumers. The consumers are willing to pay more or less for different types of information, and different methods of supplying information have different costs. In the end, the market decides what gets done and what does not get done.

And like many markets, there is also an underground economy: places consumers go if the official prices don’t make sense, or the products they want are not available on the open market.

In many companies, the IT department in theory has a monopoly on information supply; however, the underground is active and constitutes a significant supply. The underground in this case is all the Excel spreadsheets, MS Access databases, etc. used to build the shadow systems and spreadmarts. Spreadmarts seem to exist in the majority of enterprises. I’ve previously mentioned an interesting study regarding these shadow systems and the attitudes people have toward them.

To help illustrate this I am going to make up some data and put it in colorful graphs.

[Figure: relative cost of data warehouse, data mart, and spreadmart approaches]

Looking at the first graph, in broad terms a data warehouse based approach will have higher costs than one based on data marts (because data warehouses provide more cross-enterprise integration, which requires more effort), and the spreadmarts will have the lowest perceived cost. It’s important to note that the actual cost of spreadmarts is higher, but perceived cost is what drives the consumer’s choice.

The trick is that because the perceived cost of spreadmarts is so low, and because there is no sanctioned enterprise solution to compete, a significant amount of effort is put into these systems for any type of analysis that is perceived to be possible. Of course, for certain data volumes or complexities there is no alternative to a full-fledged data warehouse or data mart project, but for almost everything else, business users and analysts will often try to go it alone, creating a chaos of spreadsheets and databases.

The problem is, even “experts” can’t accurately estimate how much effort a data analysis will take. So estimates by non-experts of how long it will take to “whip it up in Excel” are almost always low by orders of magnitude.

Don’t dictate. Engage with sanctioned tools that work the way people want to work.

The key to adjusting this market imbalance is to introduce a new sanctioned product line, in effect undercutting the “black market”.
[Figure: relative cost of data warehouse, data mart, spreadmart, and self-serve approaches]

This is exactly what self-serve data transformation is about. Rather than leaving users to do it themselves in Excel, IT can provide specific tools and thereby reduce the amount of completely opaque data transformation going on, while still giving users the ability to get what they need.

So why is that better?

  • It opens up the dialog – Talking is better than having an “Us” vs “Them” mentality. It lets you meet the people involved, lets you discuss their challenges with them, and provides an opening for discussion of important topics like data quality, master data management and data security.
  • You’ll know who the power users are – Right now, it is potentially anyone who has Excel, and chances are that’s everyone in your organisation.
  • It gives you visibility on what matters to the business – If you know what the hot topics are, it can help you keep the official systems relevant and prioritize your efforts where they will do the most good.

What has to be different in this new relationship, however, is that IT has to understand the “self” in self-serve. People will do things that no self-respecting ETL developer or data warehouse architect would ever sanction. If you clamp down and stop them, they will abandon the tools and return to the wild west. IT believes that it has the power in the relationship, but in fact the users are able to walk at any time. So add value, communicate, educate, but don’t dictate. If your relationship with the business users and the “Kings of the spreadmart” is poor to start with, you have to give it time to evolve.

“But we just can’t let them do that.”

Resist the urge to clamp down.

Keep your systems secure, guard your infrastructure, but don’t have any illusions that you can stop people from analyzing and transforming their data.

If they want to calculate net sales in a particular way, then they’ll do it in Excel, and it will be the number that the CEO sees. The business is made up of grownups, after all. IT has a responsibility to explain the issues and challenges that shadow systems and rampant spreadsheeting can cause, but I have yet to see or hear of a company where an authoritarian approach works. As Princess Leia said: “The more you tighten your grip, Tarkin, the more star systems will slip through your fingers.”

Arming the rebels

The business intelligence vendors are all realizing what the crowd pleasers are: really good integration into office applications, with Excel at the forefront. People want to get at their data.

Microsoft has, of course, long provided the main weapons for the shadow systems, MS Excel and MS Access, and they are going nuclear with the addition of “PowerPivot” to Excel 2010, although it is largely a presentation layer tool and probably won’t be used widely for data transformation itself.

Trying to fight all this with the standard tools of closing down the ability to export data, hiring an army of report writers, and constantly raving about the dangers and pitfalls of runaway spreadsheets is like pushing on a rope.

Provide a safe, legal alternative to the free-for-all.

Talk to your business users. Understand their needs. Provide them with tools. Work with them both to empower responsible analysts and to avoid the worst issues that existing shadow systems are creating.

1 Comment

  1. As a consultant for an organisation which specialises in Data Governance and the application of tools like MDM, I found this article fascinating. Keep up the good work!
    Graham (www.evaxyx.com)