<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Datamartist.com &#187; data culture</title>
	<atom:link href="http://www.datamartist.com/category/data-culture/feed" rel="self" type="application/rss+xml" />
	<link>http://www.datamartist.com</link>
	<description>Reduce cost with self serve data transformation</description>
	<lastBuildDate>Thu, 09 Feb 2012 20:00:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Data Quality Rules</title>
		<link>http://www.datamartist.com/data-quality-rules</link>
		<comments>http://www.datamartist.com/data-quality-rules#comments</comments>
		<pubDate>Thu, 16 Jun 2011 17:00:07 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Reality Check]]></category>
		<category><![CDATA[Data Quality rules]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5995</guid>
		<description><![CDATA[What's the difference between good data and bad data? It is much like the difference between good children and bad children- the bad data doesn't follow the rules. But what are the rules? Unlike the rules for kids, which have been fixed in stone for decades (or at least, parents wish it were so), the [...]]]></description>
			<content:encoded><![CDATA[<p>What's the difference between good data and bad data?  It is much like the difference between good children and bad children- the bad data doesn't follow the rules.<br />
<a href="http://www.datamartist.com/wp-content/uploads/2011/04/data-quality-rules-data-freedom-or-death.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/04/data-quality-rules-data-freedom-or-death-300x269.jpg" alt="" title="data-quality-rules-data-freedom-or-death" width="300" height="269" class="alignright size-medium wp-image-6011" /></a><br />
But what are the rules?  Unlike the rules for kids, which have been fixed in stone for decades  (or at least, parents wish it were so), the rules for data are slippery things that depend very much on the context and the database.</p>
<p>While it's a complex subject, some basic rules of thumb can avoid the deeper rabbit holes.</p>
<p>The first thing to understand about Data Quality rules is they aren't as easy as they may look.  Data is in theory something in the ordered world of computers, but in reality is in the "flexible" world of humans.  A huge amount of data is entered by members of the group "Homo sapiens" (or mutilated by software written by members of that group) and as a result is not as ordered as we would all like.</p>
<p>The challenge for data quality practitioners is to remove the chaos injected by those highly involved primates (us) and make the data the sterile, ordered, never any question about anything type that we all imagine in our fantasies.</p>
<p>But how?</p>
<p>In the end, it is amazing how powerful and complex the various solutions to this problem are.</p>
<p>But I suggest that there are some basic principles that can help guide us.</p>
<h2>First- do no harm.</h2>
<p>One of the risks of any data quality initiative is that it actually screws up the data more.  Don't define rules that are so complex, and so sure of themselves that they actually make the data worse.  Be humble. Don't change data unless you are pretty sure it's a good idea.  Err on the side of not screwing up the original.  And keep a copy of the original- so if things do go off the rails you can undo- or at least try to understand what went wrong.</p>
<h2>Go out and talk to the people</h2>
<p>Don't sit in your ivory tower and speculate as to what the data means.  Go out there and watch people enter it in.  See what real world type things are happening that never make it into bits and bytes.</p>
<h2>Attack the basics first</h2>
<p>Focus your first efforts on dealing with the basics- they will resolve the vast majority of the issues- don't chase after the outliers until you have the "easy" cases taken care of- the tough stuff is a case of diminishing returns- look first at how to fix processes and train your people to make the majority of typical data entry cases more accurate before you start looking into artificial intelligence based hyper-multi-semantic-algorithmic-learning-matching-holistic-flux-capacitor data quality systems.</p>
<h2>Less is more- the fewer rules the better.</h2>
<p>So what's the rule about making rules?  Try to make fewer rules, and test them in a pragmatic way.  It is possible to have so many rules that the rules themselves have data quality issues- don't go there.</p>
<p>Sometimes the simplest things will bring the greatest benefit.</p>
<p>In the coming weeks, I'll be posting about how to design, implement and monitor Data quality rules using the <a href="/">Datamartist tool</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-quality-rules/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Which myths are holding you back?</title>
		<link>http://www.datamartist.com/which-myths-are-holding-you-back</link>
		<comments>http://www.datamartist.com/which-myths-are-holding-you-back#comments</comments>
		<pubDate>Thu, 10 Feb 2011 15:06:40 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Reality Check]]></category>
		<category><![CDATA[assumptions]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5952</guid>
		<description><![CDATA[In your business you have "facts". Things that are considered to be true. Lots of folks have heard of them, or believe them, and propagate them. But are they true? You are making decisions every day based on these "facts". Obviously, we have to believe something. But today I'm asking you to be skeptical. Question [...]]]></description>
			<content:encoded><![CDATA[<p>In your business you have "facts". Things that are considered to be true. Lots of folks have heard of them, or believe them, and propagate them. But are they true?  You are making decisions every day based on these "facts".</p>
<p>Obviously, we have to believe something.  But today I'm asking you to be skeptical.  Question your facts.</p>
<p>Let me give you an example. I'm a Canadian, and looking out my window right now, I can see a pretty healthy snowfall accumulating. Lots of the white stuff.  Brings to mind the fact that some cultures in the far north have over 100 words for snow.<br />
<a href="http://www.datamartist.com/wp-content/uploads/2011/02/snowman-black-hat-and-scarf.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/02/snowman-black-hat-and-scarf.jpg" alt="" title="snowman-black-hat-and-scarf" width="328" height="366" class="alignright size-full wp-image-5970" /></a><br />
Hang on. Is that a fact?  Tell me- have you heard a variation of that?</p>
<p>I'm sure I read that somewhere.  I've heard others mention it.  It makes sense- I mean, people living in the far north would see lots of snow, and would know all about it, and so their language would evolve to encompass lots of different qualities of snow. </p>
<p>Sounds good.</p>
<p>Only, is it?  It's an idea that "just makes sense".  People seem to just accept it as soon as you say it.  People are likely to pass the idea along to others- because it makes a compelling story.</p>
<p>But in fact, it's wrong.  I'll let you google to your heart's content if you like to find more evidence than my say-so, but after reading a number of articles on the subject (here is an <a href="http://www.princeton.edu/~browning/snow.html">example</a>, and of course <a href="http://en.wikipedia.org/wiki/Eskimo_words_for_snow">the Wikipedia entry</a>), it seems clear that there are not 100 words for snow in any language.  In fact, English has about the same number of ways of talking about snow as languages from societies in the far north. </p>
<p>So the point of all this is-  what myths do you have in your organisation?  Things that "everyone" knows are true. Things that when they are explained to you make "perfect sense".  Things that you teach to every new hire so that they "know how things are".</p>
<p>The insidious thing about "facts" is that once they gain purchase, any contrary evidence tends to be called an "exception", or discounted.</p>
<p>Use data to find out what is true. Fight to improve the quality of your data to find more and more new truths. Question the status quo if the data contradicts it.  Don't assume that something is wrong with the data when "things don't make sense."  Maybe they don't make sense because your assumptions are just plain WRONG.</p>
<p>Be very aware that you might be making decisions based on myths that while sounding so plausible, so clear, so common sense, are pure fantasy.</p>
<p>The good news is that all your competitors might be doing the same thing.  If you look at your data, and see through it, you might show them all how wrong they are.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/which-myths-are-holding-you-back/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Good Data is a force for good.</title>
		<link>http://www.datamartist.com/good-data-is-a-force-for-good</link>
		<comments>http://www.datamartist.com/good-data-is-a-force-for-good#comments</comments>
		<pubDate>Wed, 20 Oct 2010 14:59:16 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Public Data]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5596</guid>
		<description><![CDATA[The United Nations has declared that today is the first world statistics day, "celebrating the many contributions and achievements of official statistics". It's the kind of holiday that those of us in the data wrangling profession can really get behind. Data about people in general, and their well being, their needs and challenges is a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datamartist.com/wp-content/uploads/2010/10/UN-World-Statisics-Day-Logo.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/10/UN-World-Statisics-Day-Logo.jpg" alt="" title="UN-World-Statisics-Day-Logo" width="290" height="228" class="alignright size-full wp-image-5599" /></a>The United Nations has declared that today is the first world statistics day, "celebrating the many contributions and achievements of official statistics".</p>
<p>It's the kind of holiday that those of us in the data wrangling profession can really get behind.</p>
<p>Data about people in general, and their well being, their needs and challenges is a critical component of any plan for progress- and the UN focusing on "official statistics" highlights the huge good that this data does in our world.</p>
<p>Governments, educators, charities, and communities can use official statistics to best direct aid, tailor programs to be as efficient as possible, and dramatically improve the lives of billions of people.  </p>
<p>Citizens can use data to demand change from their governments, and businesses.  They can use data to make informed decisions about which products to buy, understanding their health, environmental and economic impact.</p>
<h2>Don't take all that data for granted.</h2>
<p>I am fortunate to be living in Canada, a wealthy country that provides a broad range of services to its citizens, and I know that my family and I benefit every day from policies that have been put in place thanks to decisions informed by a broad range of statistical information.  One of the key sources is the census.</p>
<p>Unfortunately, this summer, the Canadian government decided to eliminate the mandatory long form census in Canada (there is still a shorter one), and there has been a strong outcry of disagreement. The chief statistician of Statistics Canada resigned in July, but the government seems determined to eliminate this important source of data.</p>
<p>Our little drama in Canada is of course a tiny issue compared to the tragic state of affairs in many countries. Obviously, in many countries the lack of data is a symptom of much more fundamental issues.  But collecting and acting on statistical data to help your populace is an indicator of good governance, and encouraging statistics collection is a positive way to support change.</p>
<p>So on this World Statistics Day, I encourage everyone that loves data, facts, and decisions made using them, to consider that the anti-data forces of evil are still alive and well.  Fight those who want to "go with their gut", or worse those who know that data will expose their actions as contrary to the common good.</p>
<p>Good decisions are made based on good data. Good data does good.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/good-data-is-a-force-for-good/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data quality challenges: behavioral inertia and its evil opposite</title>
		<link>http://www.datamartist.com/data-quality-challenges-behavioral-inertia-and-its-evil-opposite</link>
		<comments>http://www.datamartist.com/data-quality-challenges-behavioral-inertia-and-its-evil-opposite#comments</comments>
		<pubDate>Tue, 05 Oct 2010 16:39:04 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5468</guid>
		<description><![CDATA[Often, I hear someone say something like "this would be much easier if users would just..." or "If only we could convince the sales people that...". Technology folks often are frustrated by the people component of the complex systems they are trying to install. People are not a problem solved by technology Some try to [...]]]></description>
			<content:encoded><![CDATA[<p>Often, I hear someone say something like "this would be much easier if users would just..." or "If only we could convince the sales people that...".   Technology folks often are frustrated by the people component of the complex systems they are trying to install.</p>
<h2>People are not a problem solved by technology</h2>
<p>Some try to ignore the issue, or solve it with technology alone-  "If we write complex enough validation into the data entry form people HAVE to enter good data" or "Our matching algorithms will resolve the issues in real time."</p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2010/10/users-will-lose-chair-if-data-quality-suffers1.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/10/users-will-lose-chair-if-data-quality-suffers1.jpg" alt="" title="users-will-lose-chair-if-data-quality-suffers" width="311" height="229" class="alignright size-full wp-image-5473" /></a>Others try to use sophisticated training, documentation, bonus plans or punishment plans to get the behavior they want.</p>
<p>Obviously, components of both approaches are going to be used to some extent- but don't lose sight of the fact that people ARE the process- and the heart of your business.  It's the sales guys that drive revenue, and it's the sales order people, or help desk operators, or engineers in your manufacturing facilities that you are building the new system for that are creating all the value.   You are a person too- think about their motivations, and how to take advantage of their abilities and enthusiasm- not how to remove them from the equation.</p>
<p>I often think that there are two powerful forces at work in the minds of all of us- oddly, they are opposites, and yet can co-exist even in the same person at the same moment.  Some people are strongly to one side or the other.  </p>
<h2>Behavioral Inertia:  Change is bad</h2>
<p>We've all seen this resistance to change, and in many cases people have this tendency for good reasons (that last disastrous ERP implementation where the new processes were not properly checked, and everyone worked 15 hour days for weeks while customers were screaming into their phones about how screwed up everything was, for example.)</p>
<p>Remember, resistance to bad change that is going to screw everything up is a good thing.</p>
<p>In other cases, however, it is unfounded, and it is a real problem- things have to change to move forward.  Sometimes risks have to be taken, and there will be bumpy periods before a much better steady state is achieved.</p>
<p>People have a natural resistance to this because change is the unknown.</p>
<h2>Hyper Active change syndrome: We can't wait to do it right- we have to act NOW</h2>
<p>This is the evil opposite twin of behavioral inertia. (It's like that episode when Captain Kirk got split in a transporter incident- you know.) </p>
<p>You can identify people with this force at work by phrases like "We're a dynamic organisation, we're being proactive not reactive, our processes are fluid- it's the way it is with business in the fast lane" or my personal favorite- "We don't have time to get the data, we'll have to go with our gut."</p>
<p>Hyperactive changers will often try to get their way by always creating a sense of urgency: "The technology isn't moving fast enough for us, we can't wait for those changes to be approved, all the process is slowing us down, our customers are demanding speed"</p>
<p>Hyperactive changers are dangerous because they often ignore or circumvent processes in the name of expediency, generating risk and forcing others to waste effort compensating, and generally causing chaos.  They want to change things so often, that efficiencies of new processes are never realized- everyone is on a constant learning curve and never gets in the groove.</p>
<h2>Balance the forces, find your high-speed tortoise </h2>
<p>Think of the story of the Tortoise and the Hare.  The Hare, with all its speed, could not figure out that the process was start, run, finish, and completely wasted his speed advantage by having a nap.</p>
<p>On the other hand, while the Tortoise's complete dedication to his goal and process is admirable, you can't count on the incompetence of your competition.  (And now that all Hares are no doubt told this story throughout their childhood, it's unlikely many tortoises get away with the same trick.)</p>
<p>The key lies in between- we need to work with our organization to foster an environment where we value process, and consistency, but understand that a steady, relentless change to optimize is needed, and valuable.  When one or the other of our behavioral urges overcomes us, we'll find that people are the problem in our initiatives.  If we balance them, and communicate with everybody, we can find ways to make things work, even without perfect cooperation at all times from everyone. </p>
<p>Not too slow, not too fast, always value process without letting it be your slave master.  And for goodness sake, forget about going with your gut-  go out and get some DATA!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-quality-challenges-behavioral-inertia-and-its-evil-opposite/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Too much data storage hurts data quality- the toothpaste effect</title>
		<link>http://www.datamartist.com/too-much-data-storage-hurts-data-quality-the-toothpaste-effect</link>
		<comments>http://www.datamartist.com/too-much-data-storage-hurts-data-quality-the-toothpaste-effect#comments</comments>
		<pubDate>Thu, 09 Sep 2010 15:36:34 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Reality Check]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=4960</guid>
		<description><![CDATA[When I brush my teeth there is a wide range in terms of amount of toothpaste that is acceptable to me. This is not a profound statement- bear with me. Only as the tube of toothpaste starts getting near to its end do I start conserving toothpaste because I know I need to make it [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datamartist.com/wp-content/uploads/2010/09/data-quality-and-toothpaste-labour-issues.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/09/data-quality-and-toothpaste-labour-issues.jpg" alt="" title="data-quality-and-toothpaste-labour-issues" width="320" height="234" class="alignright size-full wp-image-4962" /></a><br />
When I brush my teeth there is a wide range in terms of amount of toothpaste that is acceptable to me.  This is not a profound statement- bear with me.</p>
<p>Only as the tube of toothpaste starts getting near to its end do I start conserving toothpaste because I know I need to make it last.</p>
<p>Another example is the all you can eat buffet- we eat because it's there and we can.  Unlike wasting toothpaste, this has  more immediate negative consequences.</p>
<p><strong>When there is lots of something, we tend to use more of it than we should.</strong></p>
<p>When the tube of enterprise storage capacity seems to be always full, and when massive databases make an all-you-can-store buffet the standard mode of operation, very often the tendency is to store everything.  </p>
<p>Rather than try to determine what information is of a useful level of quality, or focusing on the key information (and ensuring it IS of useful data quality), we stuff our systems full of every type of field and attribute, with massive bloated forms that are too long for anyone to really fill out properly.  </p>
<p>Sadly, this doesn't matter because there are too many fields to check anyway (who can define so many business and data quality rules?), so no one is checking.</p>
<p>If we were forced to make a choice between data A and data B, we might think a bit more about which is more useful for answering key business questions (and by connection, actually think about what the key business questions are).</p>
<p>Instead, how many times have I heard an overworked, rushed subject matter expert say - "Just collect it all, we might need it."</p>
<p>By collecting more, we end up with less.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/too-much-data-storage-hurts-data-quality-the-toothpaste-effect/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>How the general ledger can become a data warehouse</title>
		<link>http://www.datamartist.com/how-the-general-ledger-can-become-a-data-warehouse</link>
		<comments>http://www.datamartist.com/how-the-general-ledger-can-become-a-data-warehouse#comments</comments>
		<pubDate>Tue, 20 Jul 2010 14:54:15 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Management reporting]]></category>
		<category><![CDATA[General Ledger]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=4787</guid>
		<description><![CDATA[Many companies today rely on the general ledger as a key part of their management reporting, well beyond the obvious financial information. This has often been shaped by how companies first adopted information technology. In some firms, their management reporting systems reflect the fact that as information technology began to be used extensively by business, often [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datamartist.com/wp-content/uploads/2010/07/general-ledger-is-a-data-warehouse.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/07/general-ledger-is-a-data-warehouse.jpg" alt="" title="general-ledger-is-a-data-warehouse" width="356" height="275" class="alignright size-full wp-image-4794" /></a>Many companies today rely on the general ledger as key part of their management reporting, well beyond the obvious financial information.  This has often been shaped by how companies first adopted information technology.</p>
<p>In some firms, their management reporting systems reflect the fact that as information technology began to be used extensively by business, often the very first functional area to be automated was accounting, and the first database within an enterprise was often the general ledger.</p>
<p>In many companies, the general ledger became the clearing house for all information- not just financial, and in effect became a data warehouse before the concept of data warehousing had even evolved. </p>
<p>The problem is, in some organisations, the data warehouse didn't come. The general ledger kept its place as the central repository for not just financial, but also management reporting.  Finance argued successfully that the cost of all the business intelligence architecture was unnecessary- adding accounts and bolt on tables was cheaper.  ERP vendors supported this by creating ever more flexible ledger structures, allowing additional ledgers for parallel accounting and management reporting.</p>
<p>Huge amounts of non-financial information are still stored in many general ledgers. There are so many reasons this is a bad idea.  Here are just three:</p>
<p><strong>1) It forces you to compromise on level of detail and drill down, and history</strong></p>
<p>No general ledger can hold the level of detail available in many source systems. As a result, any interface from the sales system, manufacturing system etc. feeding into the GL will have to create journal entries that summarize a great deal of information. </p>
<p>While the detail of course will still exist in the source system, if your management reporting is all from a general ledger based system, upper management will tend to use this single source- and as a result important granularity may be lost to the decision making process.</p>
<p>This summarization also makes it more difficult to have drill down into the details, giving up some of the greatest benefits of modern business intelligence systems.</p>
<p>Finally, general ledger based data storage does not usually allow for the tracking of reference data changes over time. As sales regions are modified, and territories shift, comparing one period to another becomes increasingly difficult. Data warehouses, designed from the beginning to store this type of slowly changing reference information, can provide much more insight and historical analysis.</p>
<p>The bottom line is, the data model of the general ledger module is just not designed for analysis.</p>
<p><strong>2) It results in an overly complex chart of accounts and may even affect month end close</strong></p>
<p>As the source systems become more and more capable of collecting data, the tendency is to want to increase the amount of management reporting. If this is being done in the general ledger, it means that further charts of accounts must be added, and an increased number of journal entries need to be done. Depending on how the overall process is set up, it's even possible that the increased complexity might affect the speed at which month end closing can be completed, if for no other reason than that the same finance resources must tend to both the financial and the management reporting needs.</p>
<p><strong>3) It discourages cross functional definitions and collaboration on analysis</strong></p>
<p>By making one of the functional areas (finance) the center and owner of management reporting, a general ledger based reporting architecture can actually increase the severity of the information silos it is most likely trying to eliminate.<br />
<a href="http://www.datamartist.com/wp-content/uploads/2010/07/dont-use-these-numbers-ourselves-for-the-finance-reports-only.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/07/dont-use-these-numbers-ourselves-for-the-finance-reports-only.jpg" alt="" title="dont-use-these-numbers-ourselves-for-the-finance-reports-only" width="364" height="208" class="alignright size-full wp-image-4799" /></a><br />
Because the general ledger reporting does not require all the detail available, each department only needs to provide the summarized information required by finance. While every department has to coordinate with finance, there is no requirement for departments to work with each other to coordinate data and definitions.  While at a high level data is integrated, any benefit from more tightly integrating information across silos that a data warehouse can bring is lost.</p>
<p>In a very real way, a successful general ledger based management reporting system is in fact an impediment to progress for an enterprise's business intelligence and data analysis evolution.</p>
<p>Because management reporting is available, the justification or need for a data warehouse is not felt as strongly. However, as needs continue to evolve, the effort expended on the constantly growing general ledger, and its impact on the financial processes and on the company's overall information management culture, will become increasingly damaging.</p>
<p>Ironically, companies who failed to ever establish a general ledger based management reporting system could leapfrog their more financially focused competitors, as they embrace the modern data warehouse and the tools available for data analysis. </p>
<p>A true data warehouse is not an easy road, and is only one component of a broader data analysis strategy. </p>
<p>Readers of this blog know that we advocate an approach that balances "Big Business Intelligence" with nimble, user focused data exploration and transformation.</p>
<p>In the short term, using the general ledger for management reporting can seem easier, but in the long term, it probably makes the task of creating an enterprise wide architecture harder- while your general ledger has been growing in the center, individual departments have probably been pursuing uncoordinated, fragmented business intelligence architectures of their own. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/how-the-general-ledger-can-become-a-data-warehouse/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Spreadsheet errors- Fear, uncertainty and doubt</title>
		<link>http://www.datamartist.com/spreadsheet-risk-and-errors-fear-uncertainty-and-doubt</link>
		<comments>http://www.datamartist.com/spreadsheet-risk-and-errors-fear-uncertainty-and-doubt#comments</comments>
		<pubDate>Mon, 11 Jan 2010 18:54:46 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Business Intelligence Architecture]]></category>
		<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[MS Excel]]></category>
		<category><![CDATA[Reality Check]]></category>
		<category><![CDATA[Business Intelligence trends]]></category>
		<category><![CDATA[Excel]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=3831</guid>
		<description><![CDATA[I love the acronym FUD which stands for "Fear, uncertainty and doubt". What I don't love is the underhanded use of FUD to manipulate people's behavior. Spreading FUD is not about creating something new, but destroying- destroying someone's confidence in something, clouding the real issue, stopping a new or creative direction from being taken. FUD [...]]]></description>
			<content:encoded><![CDATA[<p>I love the acronym FUD which stands for "Fear, uncertainty and doubt".  What I don't love is the underhanded use of FUD to manipulate people's behavior.  Spreading FUD is not about creating something new, but destroying- destroying someone's confidence in something, clouding the real issue, stopping a new or creative direction from being taken.  FUD is often used to block reform and change because FUD can cause people to do nothing- and doing nothing is good for the incumbent.</p>
<p>In the data analysis realm, spreadsheet errors are often used to try to dissuade companies from letting their people "work with the data directly".  Software vendors of all sizes, but particularly the really big ones (those incumbants) spread FUD because if they can stop people from getting at the data themselves, it increases the chance of companies buying some more business intelligence suites.</p>
<p>The argument goes something like this:</p>
<blockquote><p>Spreadsheets have been shown to be plagued with errors, many studies showing error rates above 90%.  You need to reduce the risk that spreadsheets are creating in your organization by establishing formal, documented processes that are created and managed by professionals using sophisticated tools.</p></blockquote>
<p>Then the usual nightmare scenarios are brought out, all involving rabid auditors, Sarbanes-Oxley, governance failures etc.</p>
<p><img src="http://www.datamartist.com/wp-content/uploads/2010/01/accidently-put-last-years-spreadsheet-number-into-annual-report1.jpg" alt="accidently-put-last-years-spreadsheet-number-into-annual-report" title="accidently-put-last-years-spreadsheet-number-into-annual-report" width="341" height="226" class="alignright size-full wp-image-3839" />Now, don't get me wrong, spreadsheet errors are a very real and serious problem, and there are all sorts of data applications that should never be done in Excel or other ad-hoc, user driven tools. Ever.  Formal documented processes are critically important, and there are lots of places where you better be using the right tools and professionals.  </p>
<p>I have seen the culture of the spreadsheet completely undermine initiatives that would have driven better data quality, data analysis and business processes.  The spreadsheet certainly has its dark side.</p>
<p>But the problem is that FUD paints with a broad brush.  People take it as "Spreadsheets with data in them? Bad news. Don't do it.  Individuals able to get at the data, and quickly transform it, analyze it?  Who knows what they'll do- shut them down!"</p>
<p>Sadly, from a data quality point of view, sometimes the spreadsheets have the BEST data quality- because people have fixed the issues they can't fix in the transactional system due to constraints or IT department delays.</p>
<h2>Encourage positive change with reasonable controls.</h2>
<p>Intelligent, responsible people should be encouraged to use "informal" methods and tools to do data analysis.  </p>
<p>These people will find things, learn things, and drive positive change (including change in those big formal professional systems).  </p>
<p>They should do it with a reasonable understanding that doing things in an informal way, with spreadsheets or other tools, does introduce errors, and they should consider this when they recommend taking action based on the results. </p>
<h2>Balance between two extremes </h2>
<p><strong>The totalitarian state:</strong> I don't think there is an  IT department in the world that is capable of stopping all unofficial data analysis.  In fact, I would suggest that the moment such an IT department comes into existence, it would kill the host company, a harsh sort of self-regulation.  People interested in data and thinking for themselves would just pack up and leave. So who would be left making the decisions and based on what?</p>
<p><strong>The twisted web of spreadsheets:</strong> Companies that allow an anything-goes environment of Visual Basic code, macros, and manual cut-and-paste straight into the annual report are not going to be long for this world either.  They populate the horror story pages on <a href="http://www.eusprig.org/horror-stories.htm" target="_blank">the spreadsheet risk websites.</a></p>
<h2>The zone of win.</h2>
<p>You want to be somewhere between insane spreadsheet addiction and strict formal big tool paralysis.  </p>
<p>I submit that companies that balance risk while still encouraging their smart people to "play" with the data and do analysis in new and interesting ways with new tools are going to win.</p>
<p>Again, don't let this process generate your profit and loss statement- understand where and what the informal discovery process is for- but do let it discover things.  If it discovers something interesting you'll have the chance to check for the errors.  Make sure it's part of the process to do so.</p>
<p>By letting the FUD get you down, you'll never get that far and who knows what insights you might be giving up?</p>
<p>Of course,  we believe you should go even further and give those intelligent, responsible people new tools that are less error prone than spreadsheets but still provide as much or even greater flexibility.  That's why we're building Datamartist after all.</p>
<p>Openness, balance, and clear minded pragmatism will get you further than FUD every time.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/spreadsheet-risk-and-errors-fear-uncertainty-and-doubt/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The tragedy of anti-data leadership and dataphobia</title>
		<link>http://www.datamartist.com/anti-data-leadership-the-lies-of-non-fact-based-management</link>
		<comments>http://www.datamartist.com/anti-data-leadership-the-lies-of-non-fact-based-management#comments</comments>
		<pubDate>Thu, 07 Jan 2010 17:44:34 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Reality Check]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=3769</guid>
		<description><![CDATA[There has been a lot of discussion in the last year or so about how important data analysis is becoming. IBM made a major move into data analytics by establishing a new organisation "Business Analytics &#038; Optimization Services" with 4000 people in it. There was the much quoted Hal Varian of Google who predicted that [...]]]></description>
			<content:encoded><![CDATA[<p>There has been a lot of discussion in the last year or so about how important data analysis is becoming.  </p>
<p>IBM made a major move into data analytics by establishing a new organisation <a href="http://www.businessweek.com/technology/content/apr2009/tc20090414_322525.htm?chan=top+news_top+news+index+-+temp_news+%2B+analysis" target="_blank">"Business Analytics &#038; Optimization Services"</a> with 4000 people in it.</p>
<p>There was the <a href="http://www.wired.com/culture/culturereviews/magazine/17-06/nep_googlenomics?currentPage=1" target="_blank">much quoted Hal Varian</a> of Google who predicted that the sexy new job this century will be some sort of data analyst/statistician.</p>
<p>But I believe there is a powerful force in many businesses that will slow down our headlong rush towards a fact based, analytical thinking, data quality focused future.</p>
<p>As a group they are generally referred to as "Upper management" or "Leadership".<img src="http://www.datamartist.com/wp-content/uploads/2010/01/the-data-days-no-the-ceo-says-yes-300x222.jpg" alt="the-data-days-no-the-ceo-says-yes" title="the-data-days-no-the-ceo-says-yes" width="300" height="222" class="alignright size-medium wp-image-3815" /></p>
<p>Now to be fair, there are obviously great leaders and executives that understand that data is important.  </p>
<p>But the fact that making decisions based on facts and data is actually defined as a school of thought- "Fact based management" or "Evidence based management", or in the medical field, "evidence based medicine"- illustrates that too many alternatives still exist.</p>
<h2>The lies and dirty tricks of anti-data leadership</h2>
<p>They make comments that equate analysis with "delay".<br />
They confuse considering options with "indecisiveness".<br />
They don't invite people who actually have seen or understand the data to their meetings.</p>
<p>They come up with all sorts of alternate ways to make decisions- and defend their position even when the data clearly does not support them:</p>
<h3>Call it strategic</h3>
<blockquote><p>I know the numbers don't add up right now, but this is strategic.  </p></blockquote>
<p>What does that mean- our strategy is to do things without ROI?</p>
<h3>Go with consensus perception</h3>
<blockquote><p>We don't have time to get the actual data- we're going to have to make a decision based on what the people on the ground are seeing.</p></blockquote>
<p>If you ignore data, create a hypothesis and then go looking for supporting "evidence" in the form of people "on the ground" thinking it's a good idea, you'll find it.  </p>
<p><a href="http://agora.stanford.edu/sjls/Issue%20One/fisher&#038;tversky.htm" target="_blank">People take suggestions from your questions</a> and generate a matching memory/perception of what they think is happening in the real world.  This effect is well understood, and the accuracy of eyewitness testimony is known to be poor.</p>
<h3>Blame the data quality</h3>
<blockquote><p>You know we have issues with that data.  I don't think we can risk relying on it.</p></blockquote>
<p>So what's the alternative? Tea leaves?  Might be some risk in that too.</p>
<p>And why is the data quality an issue? Probably because leadership didn't approve the budget and support the process changes that would have improved it.  If the top executives don't take responsibility for data quality in their organisation and have decided not to use the data, then the company is in a sad, dysfunctional state.</p>
<h1>Moving forward- fight the anti-data forces of evil</h1>
<p>Now, no-one can analyse forever- eventually a decision needs to be made.<br />
Often, not all the analysis we want to do can be done.  The number one reason anti-data leadership will likely reject doing detailed analysis is that it takes too long. They want to "pull the trigger" and get going, even if the decision is clueless (literally).</p>
<h2>Always be working on fixing the structural issues that slow analysis down</h2>
<p>These kinds of issues can slow you down:</p>
<ul>
<li>If you have bad quality data in your systems, any analysis must first fix it- causing delays.  </li>
<li>If you don't have the people on staff to do the analysis, you have to hire consultants, adding delay and cost.</li>
<li>If your data definitions are inconsistent across the company and with industry standards, combining data from different operating units and other data sources takes forever.</li>
</ul>
<h2>Create a culture of data</h2>
<p>Some examples of beliefs that need to be openly stated and shared:</p>
<ul>
<li>the best way to make decisions, if possible, is by looking at actual data.</li>
<li>firing off decisions made on the basis of hunches isn't being "aggressive and decisive".  It's sloppy and incompetent.</li>
<li>data management and analysis is a key competency for ALL employees in ALL departments not just information technology.</li>
</ul>
<h2>Create data analysis SWAT teams</h2>
<p>On top of this, new techniques are needed to make data analysis fast enough to support timely decisions.  It is just not possible to launch a waterfall project and try to find a date three weeks from now when everyone can get together for a functional requirements meeting.</p>
<p>Companies need to create teams (perhaps virtual, coming together when needed) that are able to use fast, flexible tools to do analysis quickly.  I am hoping that the <a href="/product/datamartist-for-developers">Datamartist tool</a> is one of the new tools that such SWAT teams would have in their toolkit.</p>
<p>The bottom line is that companies whose leaders "get" data are going to be running circles around companies with executive dinosaurs whose eyes glaze over if anyone starts actually talking about facts and figures that can't fit on a single three dimensional pie chart in PowerPoint.</p>
<p>The future is data, but can we overcome the anti-data forces and their dataphobia?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/anti-data-leadership-the-lies-of-non-fact-based-management/feed</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

