<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Datamartist.com &#187; James Standen</title>
	<atom:link href="http://www.datamartist.com/author/james-standen/feed" rel="self" type="application/rss+xml" />
	<link>http://www.datamartist.com</link>
	<description>Reduce cost with self serve data transformation</description>
	<lastBuildDate>Thu, 09 Feb 2012 20:00:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>A New Year's resolution to data profile</title>
		<link>http://www.datamartist.com/a-new-years-resolution-to-data-profile</link>
		<comments>http://www.datamartist.com/a-new-years-resolution-to-data-profile#comments</comments>
		<pubDate>Tue, 10 Jan 2012 15:54:05 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Data profiling]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Reality Check]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=6165</guid>
		<description><![CDATA[Well, it's the time of making and breaking resolutions, a time when setting realistic goals is sometimes hard to do with all the optimism of the new year. Sometimes, we make decisions NOT to set a goal, because we don't want to break it. You might be thinking you really should step up your data [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datamartist.com/wp-content/uploads/2012/01/data-profiling-some-data.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2012/01/data-profiling-some-data-300x225.jpg" alt="" title="data-profiling-some-data" width="300" height="225" class="alignright size-medium wp-image-6171" /></a>Well, it's the time of making and breaking resolutions, a time when setting realistic goals is sometimes hard to do with all the optimism of the new year.  </p>
<p>Sometimes, we make decisions NOT to set a goal, because we don't want to break it.  </p>
<p>You might be thinking you really should step up your data quality monitoring- get some data profiling underway to help identify the data domains and areas you most want to tackle in 2012.  But you might also be thinking that, with all the pressures and cutbacks that many companies are facing, you don't have the resources to implement a full-scale profiling and monitoring effort, and so might decide to delay. </p>
<p>Don't wait. Just do it.  The perfect is the enemy of the good.</p>
<p>Rather than worrying about how much of your data you are going to be able to cover, or that you can't devote enough resources to tackle all of your reference areas at once, work at the problem from another direction.  </p>
<h1>First, start with master data.</h1>
<p>Master data is the data that all your other data is made from.  It's the data everyone uses to view the massive piles of transactional data, so the impact of one bad row in a master data table is felt across perhaps hundreds of reports, and multiple time periods.  If you have a product in the wrong category, then every transaction for it, across perhaps hundreds of customers, and all time, will be miscategorized, and every total, sub-total and calculated metric using it will suffer.</p>
<p>While bad transactions are bad, bad reference data is deadly.  Bad reference data takes a good transaction and messes it up.</p>
<h1>Worst first!</h1>
<p>Make a list of your reference tables/areas.  Customer, Product, Chart of accounts, etc.  Which are the most important for your business?  This isn't something I can tell you- you have to think about what is most critical.</p>
<p>If you are a company that purchases large amounts of materials from many vendors, and purchasing decisions are fast paced and critical, then maybe it's your vendor master, and your accounts payable.</p>
<p>On the other hand, if you have lots of interaction with your customers, and errors in the customer master cost you business, then start with that.</p>
<p>The key is to first make the list, and then think to yourself "if I have bad quality data, where am I most afraid it will be?"  Start profiling there.  You want to find the worst first, and fixing that will have the greatest positive impact.</p>
<h1>Get to know your data</h1>
<p>Don't worry about setting complex or work intensive goals right away.  Data profiling is about data discovery sometimes.  You need to wade into your reference data, play with it, tease out patterns and relationships.  As you get to know your data, you will be able to better identify where there are issues to tackle, and where root causes might lie for data quality issues.</p>
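<p>As a rough illustration of what "getting to know" a reference column can look like, here is a minimal Python sketch that counts blanks, distinct values and the most common values in one column.  The <code>profile_column</code> function and the sample product rows are invented for this example- they are assumptions, not part of any particular tool.</p>

```python
from collections import Counter


def profile_column(rows, column):
    """Summarise one column: row count, blanks, distinct values, top values."""
    # Treat missing keys, None and whitespace-only strings as blank.
    values = [(row.get(column) or "").strip() for row in rows]
    non_blank = [v for v in values if v]
    counts = Counter(non_blank)
    return {
        "rows": len(values),
        "blank": len(values) - len(non_blank),
        "distinct": len(counts),
        "top": counts.most_common(3),
    }


# Hypothetical product master rows, invented for illustration.
products = [
    {"code": "P1", "category": "Tools"},
    {"code": "P2", "category": "tools"},
    {"code": "P3", "category": "Tools"},
    {"code": "P4", "category": ""},
    {"code": "P5", "category": "Garden"},
]

summary = profile_column(products, "category")
print(summary)
```

<p>Even on five rows, a summary like this surfaces the kind of thing worth asking about: a blank category, and "Tools" and "tools" counted as two different values.</p>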
<p>One approach might be to simply resolve to spend an hour a week, every week, profiling some data.  If you aren't doing that now, you will find that even just a bit of time set aside will give huge insight- sometimes we get too busy to do the basics, and we miss opportunities to make significant improvements with relatively little effort in our data.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/a-new-years-resolution-to-data-profile/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Datamartist V1.5 Released</title>
		<link>http://www.datamartist.com/datamartist-v1-5-released</link>
		<comments>http://www.datamartist.com/datamartist-v1-5-released#comments</comments>
		<pubDate>Wed, 31 Aug 2011 18:47:53 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Datamartist Tool]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=6114</guid>
		<description><![CDATA[We are pleased to announce that Datamartist V1.5 is now available. This version of Datamartist brings with it some useful new functionality, including new functions that can be used in expressions, new capabilities in terms of exporting to databases, and a new data block. In this post, we'll look at two new features, the Pivot [...]]]></description>
			<content:encoded><![CDATA[<p>We are pleased to announce that Datamartist V1.5 is now available.</p>
<p>This version of Datamartist brings with it some useful new functionality, including new functions that can be used in expressions, new capabilities in terms of exporting to databases, and a new data block.</p>
<p>In this post, we'll look at two new features, the Pivot block, and the enhanced database export capabilities.</p>
<h2>Pivot Block</h2>
<p>Our beta testers loved this new block.  The pivot block lets you do the equivalent of a cross-tab query, rolling up  a measure, and distributing the value in a new set of columns, where the column names are provided by the input data set.</p>
<p>Here is a simple example showing how it works:</p>
<p>Say we start with a set of data that has multiple rows for each date, and different values in the color field, and a quantity measure:</p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-input-data.png"><img src="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-input-data.png" alt="" title="pivot-block-input-data" width="356" height="296" class="aligncenter size-full wp-image-6121" /></a></p>
<p>Then we can connect one of the new pivot blocks to this data set like so:</p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-connected-to-internal-dataset.png"><img src="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-connected-to-internal-dataset.png" alt="" title="pivot-block-connected-to-internal-dataset" width="666" height="338" class="aligncenter size-full wp-image-6122" /></a></p>
<p>The pivot block lets us select which columns to include (this defines the level of detail to roll up to), which string column to use to generate the new column names, and which measure to use, as well as the rollup method (sum, average, min, max).</p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-configuration.png"><img src="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-configuration.png" alt="" title="pivot-block-configuration" width="578" height="238" class="aligncenter size-full wp-image-6125" /></a></p>
<p>The result?  The output of the pivot block looks like this: now we have a summary by color for each date, with a column for each color value.<br />
<a href="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-resulting-dataset.png"><img src="http://www.datamartist.com/wp-content/uploads/2011/08/pivot-block-resulting-dataset.png" alt="" title="pivot-block-resulting-dataset" width="439" height="210" class="aligncenter size-full wp-image-6126" /></a></p>
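<p>For readers who think in code, the same cross-tab idea can be sketched in a few lines of Python.  This is only an illustration of the concept, not how the pivot block is implemented: the <code>pivot</code> function, its parameter names and the sample rows are all assumptions made up for this example, and it keeps a single column where the block lets you keep several.</p>

```python
from collections import defaultdict


def pivot(rows, keep, names_from, measure, rollup=sum):
    """Cross-tab: one output row per `keep` value, one column per `names_from` value."""
    # groups[date][color] collects every quantity seen for that pair.
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        groups[row[keep]][row[names_from]].append(row[measure])

    # The new column names come from the data itself.
    columns = sorted({row[names_from] for row in rows})

    result = []
    for key in sorted(groups):
        out = {keep: key}
        for col in columns:
            cell = groups[key].get(col)
            # Leave the cell empty (None) when no input rows fed it.
            out[col] = rollup(cell) if cell else None
        result.append(out)
    return result


# Hypothetical input, invented for illustration.
rows = [
    {"date": "2011-08-01", "color": "Red", "quantity": 5},
    {"date": "2011-08-01", "color": "Red", "quantity": 3},
    {"date": "2011-08-01", "color": "Blue", "quantity": 2},
    {"date": "2011-08-02", "color": "Blue", "quantity": 7},
]

table = pivot(rows, keep="date", names_from="color", measure="quantity")
for line in table:
    print(line)
```

<p>Passing <code>rollup=min</code>, <code>rollup=max</code> or <code>rollup=statistics.mean</code> (after <code>import statistics</code>) would correspond to the other rollup methods mentioned above.</p>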
<h2>Database export enhancements</h2>
<p>Now, when exporting to a database, there are a number of new options.</p>
<p>One of the most interesting is the capability to execute SQL commands in the database before and/or after the data is exported into the table.</p>
<p>This provides the capability of running stored procedures, or launching follow-on database-side processing after Datamartist writes the data into the DB.</p>
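<p>To make the before/after pattern concrete, here is a small Python sketch using the standard <code>sqlite3</code> module as a stand-in database.  It is not Datamartist's export code- the <code>export_rows</code> function, the table names and the SQL statements are assumptions invented for this example, and since SQLite has no stored procedures the "after" step is a plain logging insert.</p>

```python
import sqlite3


def export_rows(conn, table, rows, pre_sql=None, post_sql=None):
    """Run optional SQL, insert rows into `table`, then run optional SQL."""
    # `table` is assumed to be a trusted name, never user input.
    with conn:  # commit all three steps together, roll back on error
        if pre_sql:
            conn.execute(pre_sql)
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
        if post_sql:
            conn.execute(post_sql)


# Hypothetical target database, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (color TEXT, quantity INTEGER)")
conn.execute("CREATE TABLE load_log (note TEXT)")
conn.execute("INSERT INTO sales VALUES ('Old', 1)")

export_rows(
    conn,
    "sales",
    [("Red", 8), ("Blue", 2)],
    pre_sql="DELETE FROM sales",  # clear out the previous load first
    post_sql="INSERT INTO load_log VALUES ('sales loaded')",  # follow-on step
)

loaded = conn.execute("SELECT color, quantity FROM sales ORDER BY color").fetchall()
log = conn.execute("SELECT note FROM load_log").fetchall()
print(loaded, log)
```

<p>On a database that does support stored procedures, the "after" statement is where a call to one would go.</p>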
<p>This is a powerful new capability, and makes it even easier to integrate Datamartist into various systems, and get your data quality and profiling data where you need it.</p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/08/sql-command-capability-example.png"><img src="http://www.datamartist.com/wp-content/uploads/2011/08/sql-command-capability-example.png" alt="" title="sql-command-capability-example" width="724" height="293" class="aligncenter size-full wp-image-6128" /></a></p>
<p>If you haven't checked out Datamartist yet, we're not sure what you are waiting for-  <a href="/downloads">download the free trial,</a> and give it a go.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/datamartist-v1-5-released/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Datamartist data quality cartoons</title>
		<link>http://www.datamartist.com/datamartist-data-quality-cartoons</link>
		<comments>http://www.datamartist.com/datamartist-data-quality-cartoons#comments</comments>
		<pubDate>Tue, 21 Jun 2011 13:22:06 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Datamartist Cartoons]]></category>
		<category><![CDATA[Just for fun]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=4463</guid>
		<description><![CDATA[I've had lots of fun over the years building the little cartoons that have become a regular feature. Here are a few, reposted together just for fun. Data quality super powers. Fighting the anti-data forces of evil. Data silo fun. The joys of a moving target Data migration tools.]]></description>
			<content:encoded><![CDATA[<p>I've had lots of fun over the years building the little cartoons that have become a regular feature.  Here are a few, reposted together just for fun.</p>
<h2>Data quality super powers.</h2>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/06/data-quality-sense-tingling-april-birthdays.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/06/data-quality-sense-tingling-april-birthdays.jpg" alt="" title="data-quality-sense-tingling-april-birthdays" width="338" height="244" class="aligncenter size-full wp-image-6050" /></a></p>
<p></p>
<h2>Fighting the anti-data forces of evil.</h2>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/06/the-data-days-no-the-ceo-says-yes.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/06/the-data-days-no-the-ceo-says-yes.jpg" alt="" title="the-data-days-no-the-ceo-says-yes" width="446" height="331" class="aligncenter size-full wp-image-6049" /></a></p>
<h2>Data silo fun.</h2>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/06/data-silos-what-do-you-mean-data-silos.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/06/data-silos-what-do-you-mean-data-silos.jpg" alt="" title="data-silos-what-do-you-mean-data-silos" width="373" height="276" class="aligncenter size-full wp-image-6052" /></a></p>
<p></p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/06/datamigration-as-long-as-the-new-system-is-the-same.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/06/datamigration-as-long-as-the-new-system-is-the-same.jpg" alt="" title="datamigration-as-long-as-the-new-system-is-the-same" width="463" height="343" class="aligncenter size-full wp-image-6051" /></a></p>
<p></p>
<h2>The joys of a moving target</h2>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/06/we-are-changing-all-the-product-codes-again-problem.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/06/we-are-changing-all-the-product-codes-again-problem.jpg" alt="" title="we-are-changing-all-the-product-codes-again-problem" width="373" height="212" class="aligncenter size-full wp-image-6057" /></a></p>
<h2>Data migration tools.</h2>
<p>
<a href="http://www.datamartist.com/wp-content/uploads/2011/06/data-migration-get-the-hammer.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/06/data-migration-get-the-hammer.jpg" alt="" title="data-migration-get-the-hammer" width="374" height="225" class="aligncenter size-full wp-image-6060" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/datamartist-data-quality-cartoons/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Quality Rules</title>
		<link>http://www.datamartist.com/data-quality-rules</link>
		<comments>http://www.datamartist.com/data-quality-rules#comments</comments>
		<pubDate>Thu, 16 Jun 2011 17:00:07 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Reality Check]]></category>
		<category><![CDATA[Data Quality rules]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5995</guid>
		<description><![CDATA[What's the difference between good data and bad data? It is much like the difference between good children and bad children- the bad data doesn't follow the rules. But what are the rules? Unlike the rules for kids, which have been fixed in stone for decades (or at least, parents wish it were so), the [...]]]></description>
			<content:encoded><![CDATA[<p>What's the difference between good data and bad data?  It is much like the difference between good children and bad children- the bad data doesn't follow the rules.<br />
<a href="http://www.datamartist.com/wp-content/uploads/2011/04/data-quality-rules-data-freedom-or-death.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/04/data-quality-rules-data-freedom-or-death-300x269.jpg" alt="" title="data-quality-rules-data-freedom-or-death" width="300" height="269" class="alignright size-medium wp-image-6011" /></a><br />
But what are the rules?  Unlike the rules for kids, which have been fixed in stone for decades  (or at least, parents wish it were so), the rules for data are slippery things that depend very much on the context and the database.</p>
<p>While it's a complex subject, some basic rules of thumb can avoid the deeper rabbit holes.</p>
<p>The first thing to understand about Data Quality rules is they aren't as easy as they may look.  Data is in theory something in the ordered world of computers, but in reality is in the "flexible" world of humans.  A huge amount of data is entered by members of the group "Homo sapiens" (or mutilated by software written by members of that group) and as a result is not as ordered as we would all like.</p>
<p>The challenge for data quality practitioners is to remove the chaos injected by those highly involved primates (us) and make the data the sterile, ordered, never any question about anything type that we all imagine in our fantasies.</p>
<p>But how?</p>
<p>In the end, it is amazing how powerful and complex the various solutions to this problem are.</p>
<p>But I suggest that there are some basic principles that can help guide us.</p>
<h2>First- do no harm.</h2>
<p>One of the risks of any data quality initiative is that it actually screws up the data more.  Don't define rules that are so complex, and so sure of themselves that they actually make the data worse.  Be humble. Don't change data unless you are pretty sure it's a good idea.  Err on the side of not screwing up the original.  And keep a copy of the original- so if things do go off the rails you can undo- or at least try to understand what went wrong.</p>
<h2>Go out and talk to the people</h2>
<p>Don't sit in your ivory tower and speculate as to what the data means.  Go out there and watch people enter it in.  See what real world type things are happening that never make it into bits and bytes.</p>
<h2>Attack the basics first</h2>
<p>Focus your first efforts on dealing with the basics- they will resolve the vast majority of the issues- don't chase after the outliers until you have the "easy" cases taken care of- the tough stuff is a case of diminishing returns- look first at how to fix processes and train your people to make the majority of typical data entry cases more accurate before you start looking into artificial intelligence based hyper-multi-semantic-algorithmic-learning-matching-holistic-flux-capacitor data quality systems.</p>
<h2>Less is more- the fewer rules the better.</h2>
<p>So what's the rule about making rules?  Try to make fewer rules, and test them in a pragmatic way.  It is possible to have so many rules that the rules themselves have data quality issues- don't go there.</p>
<p>Sometimes the simplest things will bring the greatest benefit.</p>
<p>In the coming weeks, I'll be posting about how to design, implement and monitor Data quality rules using the <a href="/">Datamartist tool</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-quality-rules/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Profiler tool Datamartist V1.4 Released</title>
		<link>http://www.datamartist.com/data-profiler-tool-datamartist-v1-4-released</link>
		<comments>http://www.datamartist.com/data-profiler-tool-datamartist-v1-4-released#comments</comments>
		<pubDate>Mon, 16 May 2011 16:05:31 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Datamartist Tool]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=6017</guid>
		<description><![CDATA[We are pleased to announce the release of Datamartist V1.4 We've had lots of great feedback from our customers and are thrilled with how people are using Datamartist, not just for powerful and flexible data profiling, but for data migration, data quality work and ad-hoc datamart creation. As always we're committed to continually improve our [...]]]></description>
			<content:encoded><![CDATA[<p>We are pleased to announce the release of Datamartist V1.4</p>
<p>We've had lots of great feedback from our customers and are thrilled with how people are using Datamartist, not just for powerful and flexible data profiling, but for data migration, data quality work and ad-hoc datamart creation.  </p>
<p>As always, we're committed to continually improving our products- here are just a few of the features added in this latest version:</p>
<h2>Block definition import and export</h2>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/05/block-export-datamartist-data-profiling-tool2.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/05/block-export-datamartist-data-profiling-tool2-300x195.jpg" alt="" title="block-export-datamartist-data-profiling-tool" width="300" height="195" class="alignright size-medium wp-image-6023" /></a>Datamartist adds a new level of reusability and collaboration capabilities with the addition of block export/import. </p>
<p>This lets you export a block configuration (or a number of blocks, with all their connectors) to a file that can then be imported into any other Canvas, either by yourself or by your colleagues that are also using Datamartist.  Just select the blocks you want, and right click to export- just right click anywhere on any canvas to import an existing block file.</p>
<p>We've found this particularly useful in saving filters, segmentations, and even full data profiling blocks to give us a library of useful blocks and block groups that we use again and again.</p>
<h2>Improved database connectivity and connection management</h2>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/05/Database-connectivity-with-value-distribution-profiling-datamartist.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/05/Database-connectivity-with-value-distribution-profiling-datamartist-300x222.jpg" alt="" title="Database-connectivity-with-value-distribution-profiling-datamartist" width="300" height="222" class="alignright size-medium wp-image-6018" /></a>We've also added some features and improved how we connect to databases in Datamartist.</p>
<p>Datamartist can connect to SQL Server, Oracle, MySQL, MS Access, Text files and Excel files, as well as having an ODBC driver that lets you connect to many other databases.  Now it's even easier to manage a large number of database connections for multiple database types- keep all those servers and connections at your fingertips when you are combining all that data!</p>
<h2>Data profiling in an affordable, graphical, visual environment</h2>
<p>Find out why people are loving the combination of an ETL and a Data profiling tool in one- using Datamartist not just for data profiling, but for data migration, data quality audits, and ad hoc datamart creation.  Try the <a href="http://www.datamartist.com/downloads">free trial</a> today.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-profiler-tool-datamartist-v1-4-released/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data quality sizzle</title>
		<link>http://www.datamartist.com/data-quality-sizzle</link>
		<comments>http://www.datamartist.com/data-quality-sizzle#comments</comments>
		<pubDate>Tue, 22 Mar 2011 18:08:56 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Project Management]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5985</guid>
		<description><![CDATA[I'm an engineer. Being an engineer, I'm pretty product focused, pretty technology focused, and pretty "does it work or not" focused. Having technical things like tools work is useful, and good. But just because you build it, does not mean they will come. The challenge often in Data Quality is that often what has to [...]]]></description>
			<content:encoded><![CDATA[<p>I'm an engineer. Being an engineer, I'm pretty product focused, pretty technology focused, and pretty "does it work or not" focused.  </p>
<p>Having technical things like tools work is useful, and good.  But just because you build it, does not mean they will come.</p>
<p>The challenge in Data Quality is that often what has to change even more than the technology or tools are the behaviours and perspectives of the people in the organisation with data quality issues.  At the very least, the users have to use the tools.  Very few data quality solutions are of the "full autopilot" bad-data-goes-in-here-good-comes-out-here type.</p>
<p>As much as we engineers would like to solve everything with software, people are involved in Data Quality.  </p>
<p>While a fantastic bit of data profiling analysis or an elegant and powerful data transform would seem to be enough, the truth is sometimes how and when you present these things is key to getting the non-engineer people to buy in.  </p>
<p>Sometimes preparing people over time, and introducing things in a step by step way helps them understand, and makes the technology and the change required less daunting.</p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2011/03/red-bbq.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/03/red-bbq-300x199.jpg" alt="" title="red-bbq" width="300" height="199" class="alignright size-medium wp-image-5987" /></a>Because I'm looking out my window at a tentative (very tentative, it's only March after all) spring day here in Toronto, I'm going to use a summer barbecue analogy.</p>
<p>The tools and technology are the steak.  The steak is key to the party.   In the end (at least for me in this analogy) the steak delivers most of the value in your summer BBQ party value proposition, but you'll have more guests and be more successful overall if you package the whole. </p>
<p>Sometimes, part of selling the steak is the sizzle, the preparation, the things around the steak.</p>
<p>It's the smell of the BBQ getting ready, it's the sound of the steak hitting the grill- it's the cold drink, the conversation, the games on the lawn for the kids.</p>
<p>In the end, even if you know that 90% of the deal was that steak, if you just put a steak on a plate and give it to each guest the moment they arrive, it's just not going to get the same response.</p>
<p>In my usual roundabout way, the point I'm trying to get to is that you can't solve technical problems, then drop them on people's desks and say "do it".  You need to invite them to the party.  Prepare them for the menu, ask preferences, give them some time to hear the sizzle, smell the charcoal, enjoy the sunshine in expectation of that steak.</p>
<p>Steak is good.  Remember to plan some sizzle too.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-quality-sizzle/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Which myths are holding you back?</title>
		<link>http://www.datamartist.com/which-myths-are-holding-you-back</link>
		<comments>http://www.datamartist.com/which-myths-are-holding-you-back#comments</comments>
		<pubDate>Thu, 10 Feb 2011 15:06:40 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Reality Check]]></category>
		<category><![CDATA[assumptions]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5952</guid>
		<description><![CDATA[In your business you have "facts". Things that are considered to be true. Lots of folks have heard of them, or believe them, and propagate them. But are they true? You are making decisions every day based on these "facts". Obviously, we have to believe something. But today I'm asking you to be skeptical. Question [...]]]></description>
			<content:encoded><![CDATA[<p>In your business you have "facts". Things that are considered to be true. Lots of folks have heard of them, or believe them, and propagate them. But are they true?  You are making decisions every day based on these "facts".</p>
<p>Obviously, we have to believe something.  But today I'm asking you to be skeptical.  Question your facts.</p>
<p>Let me give you an example. I'm a Canadian, and looking out my window right now, I can see a pretty healthy snowfall accumulating. Lots of the white stuff.  It brings to mind the fact that some cultures in the far north have over 100 words for snow.<br />
<a href="http://www.datamartist.com/wp-content/uploads/2011/02/snowman-black-hat-and-scarf.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2011/02/snowman-black-hat-and-scarf.jpg" alt="" title="snowman-black-hat-and-scarf" width="328" height="366" class="alignright size-full wp-image-5970" /></a><br />
Hang on. Is that a fact?  Tell me- have you heard a variation of that?</p>
<p>I'm sure I read that somewhere.  I've heard others mention it.  It makes sense- I mean, people living in the far north would see lots of snow, and would know all about it, and so their language would evolve to encompass lots of different qualities of snow. </p>
<p>Sounds good.</p>
<p>Only, is it?  It's an idea that "just makes sense".  People seem to just accept it as soon as you say it.  People are likely to pass the idea along to others- because it makes a compelling story.</p>
<p>But in fact, it's wrong.  I'll let you google to your heart's content if you like to find more evidence than my say so, but after reading a number of articles on the subject (here is an <a href="http://www.princeton.edu/~browning/snow.html">example</a>, and of course <a href="http://en.wikipedia.org/wiki/Eskimo_words_for_snow">the Wikipedia entry</a>), it seems clear that there are not 100 words for snow in any language.  In fact, English has about the same number of ways of talking about snow as languages from societies in the far north. </p>
<p>So the point of all this is-  what myths do you have in your organisation?  Things that "everyone" knows are true. Things that when they are explained to you make "perfect sense".  Things that you teach to every new hire so that they "know how things are".</p>
<p>The insidious thing about "facts" is that once they gain purchase, any contrary evidence tends to be called an "exception", or discounted.</p>
<p>Use data to find out what is true. Fight to improve the quality of your data to find more and more new truths. Question the status quo if the data contradicts it.  Don't assume that something is wrong with the data when "things don't make sense."  Maybe they don't make sense because your assumptions are just plain WRONG.</p>
<p>Be very aware that you might be making decisions based on myths that while sounding so plausible, so clear, so common sense, are pure fantasy.</p>
<p>The good news is that all your competitors might be doing the same thing.  If you look at your data, and see through it, you might show them all how wrong they are.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/which-myths-are-holding-you-back/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Data profiling- a search or a code to crack?</title>
		<link>http://www.datamartist.com/data-profiling-a-search-or-a-code-to-crac</link>
		<comments>http://www.datamartist.com/data-profiling-a-search-or-a-code-to-crac#comments</comments>
		<pubDate>Wed, 03 Nov 2010 17:50:08 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Data profiling]]></category>
		<category><![CDATA[Data Quality]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5848</guid>
		<description><![CDATA[Often, tracking down data quality issues is presented as a search for bad data- but sometimes the data isn't so much bad, as not understood. In legacy systems, you might be more trying to first find the meaning of data- in effect, decoding it as if it had been encrypted (which in a way, time [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datamartist.com/wp-content/uploads/2010/11/300px-Enigma-rotor-stack.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/11/300px-Enigma-rotor-stack.jpg" alt="" title="Photo by Bob Lord" width="300" height="225" class="alignright size-full wp-image-5850" /></a>Often, tracking down data quality issues is presented as a search for bad data- but sometimes the data isn't so much bad, as not understood.  In legacy systems, you might be more trying to first find the meaning of data- in effect, decoding it as if it had been encrypted (which in a way, time and lack of documentation might very well have done).</p>
<p>You know that all that data means something- but what?</p>
<p>One of my favorite code-busting stories is the epic victory over the Enigma code during the Second World War.  One of the reasons it's of interest is that it was one of the early applications of computing- but the key lesson, I think, comes not from the brute force computation done, but from the strategies used to crack the code.</p>
<p>When you are trying to crack a code, one of the key things you need are "Cribs"- some way to have samples of both a coded message and its clear text.  These cribs can radically reduce the number of possible ways a code can be decoded.</p>
<p>In the case of Enigma, the Allies would listen for German U-boat radio transmissions, while also using direction finding equipment to estimate their location.  Standard procedure was for a U-Boat to first radio a weather report.</p>
<p>By painstakingly backtracking known weather conditions and locations of U-Boats when they transmitted, it was possible to take advantage of that first weather report- there were only so many ways to say "Sunny and calm".  Having this crib gave them a way to break into the code.</p>
<p>What is the point in terms of Data profiling?  While it's critical to have the right tools to analyse the data (a data profiler like <a href="/">Datamartist</a>, for example), its also important to get out there and talk to people, understand whats going on- collect some Cribs that will help it all make sense. </p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-profiling-a-search-or-a-code-to-crac/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Good Data is a force for good.</title>
		<link>http://www.datamartist.com/good-data-is-a-force-for-good</link>
		<comments>http://www.datamartist.com/good-data-is-a-force-for-good#comments</comments>
		<pubDate>Wed, 20 Oct 2010 14:59:16 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Public Data]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5596</guid>
		<description><![CDATA[The United Nations has declared that today is the first world statistics day, "celebrating the many contributions and achievements of official statistics". It's the kind of holiday that those of us in the data wrangling profession can really get behind. Data about people in general, and their well being, their needs and challenges is a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.datamartist.com/wp-content/uploads/2010/10/UN-World-Statisics-Day-Logo.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/10/UN-World-Statisics-Day-Logo.jpg" alt="" title="UN-World-Statisics-Day-Logo" width="290" height="228" class="alignright size-full wp-image-5599" /></a>The United Nations has declared that today is the first World Statistics Day, "celebrating the many contributions and achievements of official statistics".</p>
<p>It's the kind of holiday that those of us in the data wrangling profession can really get behind.</p>
<p>Data about people in general, their well-being, and their needs and challenges is a critical component of any plan for progress, and the UN's focus on "official statistics" highlights the huge good that this data does in our world.</p>
<p>Governments, educators, charities, and communities can use official statistics to best direct aid, tailor programs to be as efficient as possible, and dramatically improve the lives of billions of people.</p>
<p>Citizens can use data to demand change from their governments and from businesses.  They can use data to make informed decisions about which products to buy, understanding their health, environmental and economic impact.</p>
<h2>Don't take all that data for granted.</h2>
<p>I am fortunate to be living in Canada, a wealthy country that provides a broad range of services to its citizens, and I know that my family and I benefit every day from policies that have been put in place thanks to decisions informed by a broad range of statistical information.  One of the key sources is the census.</p>
<p>Unfortunately, this summer, the Canadian government decided to eliminate the mandatory long form census in Canada (there is still a shorter one), and there has been a strong outcry of disagreement. The chief statistician of Statistics Canada resigned in August, but the government seems determined to eliminate this important source of data.</p>
<p>Our little drama in Canada is of course a tiny issue compared to the tragic state of affairs in many countries, where the lack of data is obviously a symptom of much more fundamental issues.  But collecting and acting on statistical data to help your populace is an indicator of good governance, and encouraging statistics collection is a positive way to support change.</p>
<p>So on this World Statistics Day, I encourage everyone who loves data, facts, and decisions made using them to consider that the anti-data forces of evil are still alive and well.  Fight those who want to "go with their gut", or worse, those who know that data will expose their actions as contrary to the common good.</p>
<p>Good decisions are made based on good data. Good data does good.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/good-data-is-a-force-for-good/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Data quality challenges: behavioral inertia and its evil opposite</title>
		<link>http://www.datamartist.com/data-quality-challenges-behavioral-inertia-and-its-evil-opposite</link>
		<comments>http://www.datamartist.com/data-quality-challenges-behavioral-inertia-and-its-evil-opposite#comments</comments>
		<pubDate>Tue, 05 Oct 2010 16:39:04 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[data culture]]></category>
		<category><![CDATA[Data Quality]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=5468</guid>
		<description><![CDATA[Often, I hear someone say something like "this would be much easier if users would just..." or "If only we could convince the sales people that...". Technology folks often are frustrated by the people component of the complex systems they are trying to install. People are not a problem solved by technology Some try to [...]]]></description>
			<content:encoded><![CDATA[<p>Often, I hear someone say something like "this would be much easier if users would just..." or "If only we could convince the sales people that...".   Technology folks are often frustrated by the people component of the complex systems they are trying to install.</p>
<h2>People are not a problem solved by technology</h2>
<p>Some try to ignore the issue, or solve it with technology alone: "If we write complex enough validation into the data entry form, people HAVE to enter good data" or "Our matching algorithms will resolve the issues in real time."</p>
<p><a href="http://www.datamartist.com/wp-content/uploads/2010/10/users-will-lose-chair-if-data-quality-suffers1.jpg"><img src="http://www.datamartist.com/wp-content/uploads/2010/10/users-will-lose-chair-if-data-quality-suffers1.jpg" alt="" title="users-will-lose-chair-if-data-quality-suffers" width="311" height="229" class="alignright size-full wp-image-5473" /></a>Others try to use sophisticated training, documentation, bonus plans or punishment plans to get the behavior they want.</p>
<p>Obviously, components of both approaches are going to be used to some extent, but don't lose sight of the fact that people ARE the process, and the heart of your business.  It's the sales guys who drive revenue, and it's the sales order people, help desk operators, or engineers in your manufacturing facilities, the very people you are building the new system for, who are creating all the value.   You are a person too: think about their motivations, and how to take advantage of their abilities and enthusiasm, not how to remove them from the equation.</p>
<p>I often think that there are two powerful forces at work in the minds of all of us- oddly, they are opposites, and yet can co-exist even in the same person at the same moment.  Some people are strongly to one side or the other.  </p>
<h2>Behavioral Inertia:  Change is bad</h2>
<p>We've all seen this resistance to change, and in many cases people have this tendency for good reasons (that last disastrous ERP implementation where the new processes were not properly checked, and everyone worked 15-hour days for weeks while customers were screaming into their phones about how screwed up everything was, for example).</p>
<p>Remember, resistance to bad change that is going to screw everything up is a good thing.</p>
<p>In other cases, however, the resistance is unfounded, and it is a real problem: things have to change to move forward.  Sometimes risks have to be taken, and there will be bumpy periods before a much better steady state is achieved.</p>
<p>People have a natural resistance to this because change is the unknown.</p>
<h2>Hyperactive change syndrome: We can't wait to do it right- we have to act NOW</h2>
<p>This is the evil opposite twin of behavioral inertia. (It's like that episode when Captain Kirk got split in a transporter incident- you know.) </p>
<p>You can identify people with this force at work by phrases like "We're a dynamic organisation, we're being proactive not reactive, our processes are fluid- it's the way it is with business in the fast lane" or my personal favorite: "We don't have time to get the data, we'll have to go with our gut."</p>
<p>Hyperactive changers will often try to get their way by always creating a sense of urgency: "The technology isn't moving fast enough for us, we can't wait for those changes to be approved, all the process is slowing us down, our customers are demanding speed."</p>
<p>Hyperactive changers are dangerous because they often ignore or circumvent processes in the name of expediency, generating risk, forcing others to waste effort compensating, and generally causing chaos.  They want to change things so often that the efficiencies of new processes are never realized: everyone is on a constant learning curve and never gets in the groove.</p>
<h2>Balance the forces, find your high-speed tortoise </h2>
<p>Think of the story of the Tortoise and the Hare.  The Hare, with all its speed, could not figure out that the process was start, run, finish, and completely wasted his speed advantage by having a nap.</p>
<p>On the other hand, while the Tortoise's complete dedication to his goal and process is admirable, you can't count on the incompetence of your competition.  (And now that all Hares are no doubt told this story throughout their childhood, it's unlikely many tortoises get away with the same trick.)</p>
<p>The key lies in between: we need to work with our organization to foster an environment where we value process and consistency, but understand that steady, relentless change to optimize is needed and valuable.  When one or the other of our behavioral urges overcomes us, we'll find that people are the problem in our initiatives.  If we balance them, and communicate with everybody, we can find ways to make things work, even without perfect cooperation at all times from everyone.</p>
<p>Not too slow, not too fast, always value process without letting it be your slave master.  And for goodness' sake, forget about going with your gut: go out and get some DATA!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-quality-challenges-behavioral-inertia-and-its-evil-opposite/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

