<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Datamartist.com &#187; Hierarchies and Tree Structures</title>
	<atom:link href="http://www.datamartist.com/tag/hierarchies-and-tree-structures/feed" rel="self" type="application/rss+xml" />
	<link>http://www.datamartist.com</link>
	<description>Reduce cost with self serve data transformation</description>
	<lastBuildDate>Thu, 09 Feb 2012 20:00:31 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>Hierarchies and Tree Structures in Dimensions- an Example Item Dimension (Part 4)</title>
		<link>http://www.datamartist.com/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4</link>
		<comments>http://www.datamartist.com/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4#comments</comments>
		<pubDate>Wed, 11 Feb 2009 16:09:09 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Datamartist Tool]]></category>
		<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Purchasing Analysis]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Hierarchies and Tree Structures]]></category>
		<category><![CDATA[Purchasing Data Warehouse]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=903</guid>
		<description><![CDATA[Having a way to create and manage tree structures (Hierarchies) with your dimension and fact tables is a key part of making a dimensional model in any data warehouse or data mart. Hierarchical structures lend themselves to managing a very large number of categories and we use them to create drill down paths. Check out [...]]]></description>
			<content:encoded><![CDATA[<p><object width="450" height="412"><param name="movie" value="/resources/video/DemoClips/beta2_tree_edit_clip_un_prod.swf"><embed src="/resources/video/DemoClips/beta2_tree_edit_clip_un_prod.swf" width="450" height="412"></embed></object></p>
<p>Having a way to create and manage tree structures (hierarchies) with your dimension and fact tables is a key part of building a dimensional model in any data warehouse or data mart. Hierarchical structures lend themselves to managing a very large number of categories, and we use them to create drill-down paths.</p>
<p>Check out the first three parts of this series (<a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>,<a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a> and <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a>) to see what we've done so far.</p>
<p>In this installment, we will make another dimension, the Item dimension.  This will illustrate how the Datamartist tool allows you to quickly and easily generate hierarchies, and even edit and manage them in a graphical user interface.</p>
<p>The head of purchasing for Acme has asked us to analyze the company's spend on computer equipment: "I have a feeling some offices are spending more than others, but I don't have the numbers to back it up.  But I don't want you to use the categories in the source system; I just want it broken down by Desktops, Laptops, Printers, PDAs and Other.  Can you do that with the data mart?"</p>
<p>In their source system, Acme is using the <a href="http://unstats.un.org/unsd/cr/registry/cpc-2.asp">United Nations Central Product Classification</a> (UNCPC), so we know that all the computer spending we're interested in falls under division "C45 Office, accounting and computing machinery".  Item codes look like "C45222", so we want every code whose first three characters are "C45".  We can do this easily with a filter block. After the filter block we connect a define reference block (to make a dimension), just as we did before. Finally, since we're looking at hierarchies, we add a recategorise block too. That last block in the chain is what we use to change the drill-down structure:</p>
<p><img src="/wp-content/uploads/2009/02/items-modify-computer-categories.jpg" alt="items-modify-computer-categories" title="items-modify-computer-categories" width="500" height="141" class="alignnone size-full wp-image-932" /></p>
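<p>Outside the tool, the prefix test that the filter block applies can be sketched in a few lines of Python. This is only an illustration of the idea, not how Datamartist works internally; the purchase rows and the function name <code>filter_by_prefix</code> are hypothetical, and apart from "C45222" the codes are invented rather than real UNCPC assignments.</p>

```python
# Hypothetical purchase lines as (item_code, amount) pairs.
# Apart from "C45222", the codes are invented for illustration.
purchases = [
    ("C45222", 1200.00),
    ("C45160", 310.50),
    ("C38911", 89.99),
    ("C45222", 950.00),
]


def filter_by_prefix(rows, prefix="C45"):
    """Keep only the rows whose item code starts with the given prefix."""
    return [row for row in rows if row[0].startswith(prefix)]


# Only the division "C45" rows survive the filter.
computer_purchases = filter_by_prefix(purchases)
```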
<h2>Tree structures simplify alternate categorisation</h2>
<p>The advantage of using a tree structure is that we only have to rearrange the level of the hierarchy that matches the level of detail we need: we don't have to map each individual product, just the higher levels.  So it's much less work to start, and when new products are added in the source system, they automatically map up into the new categorisation.  Recategorising in Excel often means search and replace at the bottom level, which can cause errors and has to be redone manually every time the data is updated.</p>
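<p>To see why mapping only the higher levels is enough, here is a minimal Python sketch. It assumes, purely for illustration, that the first five characters of an item code identify its parent level; the mapping table, the category assignments and the function name <code>recategorise</code> are all hypothetical.</p>

```python
# Hypothetical mapping from a parent-level code (assumed here to be the
# first five characters of the item code) to the new category.
# The assignments are invented for illustration.
CATEGORY_BY_PARENT = {
    "C4521": "Desktops",
    "C4522": "Laptops",
    "C4516": "Printers",
}


def recategorise(item_code, default="Other"):
    """Look up the new category from the item's parent-level code."""
    return CATEGORY_BY_PARENT.get(item_code[:5], default)


# A product added to the source system later needs no new mapping row:
# it inherits the category of its parent level.
existing_product = recategorise("C45222")
new_product = recategorise("C45229")
unmapped_product = recategorise("C45999")
```

<p>Only three mapping rows are maintained here, however many individual products sit underneath them.</p>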
<p>When we open the recategorise block, we simply pick the levels we want to see, and are then presented with a tree view that shows us the hierarchy, automatically generated from the underlying data.<br />
<img src="/wp-content/uploads/2009/02/acme-computer-categories-edit.jpg" alt="acme-computer-categories-edit" title="acme-computer-categories-edit" width="500" height="245" class="alignnone size-full wp-image-936" /></p>
<p>Now, directly within the hierarchy we can edit categories, add new categories, and drag and drop categories around to build the new drill-down that we want.  <img src="/wp-content/uploads/2009/02/acme-computer-updated-categories1.jpg" alt="acme-computer-updated-categories1" title="acme-computer-updated-categories1" width="250" height="331" class="alignleft size-full wp-image-945" /> The interface is a lot like Windows File Explorer: it works just like renaming and moving folders, except that you are building dimensional data. The underlying input data is not changed, so there is no need to modify the source system in any way, but the Datamartist tool records all the mapping and is able to reproduce it when new data arrives. </p>
<p>You only have to edit the hierarchy once, and from that point on your analysis can use both the existing and the edited tree structure.  It's possible to create as many different hierarchies as required. It's a fast way to do "what if" analysis, trying out different drill-down paths and categorisations.</p>
<p>This is part of a five-part series. Here are the links to the various parts: <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>, <a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a>, <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a>, <a href="/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4">4</a> and <a href="/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5">5</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data modelling Hierarchies- how to make a dimension</title>
		<link>http://www.datamartist.com/data-modelling-how-to-make-a-dimension</link>
		<comments>http://www.datamartist.com/data-modelling-how-to-make-a-dimension#comments</comments>
		<pubDate>Thu, 23 Oct 2008 02:45:52 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Datamartist Tool]]></category>
		<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Dimension Tables]]></category>
		<category><![CDATA[Hierarchies and Tree Structures]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=204</guid>
		<description><![CDATA[One of the most useful data model structures in a data mart is a Hierarchy (also called a Tree structure).  Tree structures let us take a large number of things and organise them in a way that makes sense.  More importantly, a tree structure lets us “drill down” into information.   Hierarchy Rules In a simple tree [...]]]></description>
			<content:encoded><![CDATA[<p><a href="/wp-content/uploads/2008/10/treegraphic.jpg"><img class="alignleft size-medium wp-image-226" style="border: 0px;" title="treegraphic" src="/wp-content/uploads/2008/10/treegraphic.jpg" alt="" width="195" height="262" /></a>One of the most useful data model structures in a data mart is a Hierarchy (also called a Tree structure).  Tree structures let us take a large number of things and organise them in a way that makes sense.  More importantly, a tree structure lets us “drill down” into information.  </p>
<h2>Hierarchy Rules</h2>
<p>	In a simple tree structure, every object has one and only one parent, or it is at the top level of the tree.<br />
	For each level of the tree, all the objects are the same type.</p>
<p>All fine in theory, but what do the actual table structures look like? </p>
<h2>Parent Child Relationships</h2>
<p>The most efficient way to store a tree structure of objects is in a parent-child structure. </p>
<div id="attachment_206" class="wp-caption alignright" style="width: 206px"><a href="/wp-content/uploads/2008/10/parentchild1.jpg"><img class="size-medium wp-image-206 " title="Parent Child Structure" src="/wp-content/uploads/2008/10/parentchild1.jpg" alt="Parent Child Structure" width="196" height="237" /></a><p class="wp-caption-text">Parent Child Structure</p></div>
<p>For every object you store one row recording the parent of the object.  This means that every relationship in the tree is stored only once. </p>
<p>This is the best form in which to store the “master copy” of the tree, because there is no ambiguity: one row, one object, one parent.  Rule number one is enforced strictly by the physical model in this case, and that’s a good thing. </p>
<p>The downside of this structure is that it requires looking at multiple rows to summarise data.  And it's just not easy to read.<br />
To find out which country a city is in, we first have to look up its parent (the state or province), then look up the parent of that to find the country.  If a hierarchy has ten levels, we have to look at ten rows for every row that we want to summarise to the top level.  Not so good.</p>
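<p>The row-by-row lookup described above can be sketched as follows. The table contents are a hypothetical example (only Michigan, the USA and Canada appear elsewhere in this article), and <code>top_ancestor</code> is an illustrative name.</p>

```python
# Hypothetical parent-child table: one entry per object, pointing at its
# parent. Top-level objects have None as their parent.
PARENT = {
    "Detroit": "Michigan",
    "Toronto": "Ontario",
    "Michigan": "USA",
    "Ontario": "Canada",
    "USA": None,
    "Canada": None,
}


def top_ancestor(node, parent=PARENT):
    """Follow parent links upward, one lookup per level, until the top."""
    while parent[node] is not None:
        node = parent[node]
    return node


# Two lookups are needed for a city: city to state, then state to country.
country_of_detroit = top_ancestor("Detroit")
```

<p>Every extra level in the hierarchy adds one more lookup to each call, which is the cost the paragraph above describes.</p>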
<h2>Dimensional Tables</h2>
<div class="mceTemp mceIEcenter" style="text-align: left;">In a dimensional table, we store one row for each object at the bottom of the hierarchy.  In that row, we store its parent, its grandparent, its great-grandparent, and so on up to the top level.   Here’s what that table looks like for our example:</div>
<p style="text-align: left;"><a href="/wp-content/uploads/2008/10/dimensiontable1.jpg"><img class="aligncenter size-full wp-image-208" title="dimensiontable1" src="/wp-content/uploads/2008/10/dimensiontable1.jpg" alt="" width="338" height="138" /></a></p>
<p style="text-align: left;">This way, we have everything right there, which makes it easy to summarise.  To find the totals for a country, just add up every row with a given value in the country field.  The advantages of this form are clear when it comes time to do the analysis, but what are the disadvantages?  Well, if you want to change a parent-child relationship between levels 1 and 2, you have to change lots of rows, because the relationship between a country and a state/province is repeated many times.</p>
<p style="text-align: left;">Depending on where the data is, and which applications can read and write it, it's also possible to have inconsistencies: you could have some rows that say Michigan is in the USA, and others that put it in Canada.</p>
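<p style="text-align: left;">As a rough illustration of how simple summarising becomes, the sketch below totals an amount by country in a single pass. The rows are hypothetical and are assumed to have already been joined to a fact amount; <code>total_by_country</code> is an illustrative name.</p>

```python
from collections import defaultdict

# Hypothetical flattened rows: (city, state_province, country, amount).
# The amount is assumed to come from a fact table already joined on.
rows = [
    ("Detroit", "Michigan", "USA", 100.0),
    ("Lansing", "Michigan", "USA", 50.0),
    ("Toronto", "Ontario", "Canada", 70.0),
]


def total_by_country(dimension_rows):
    """Add up the amount of every row sharing the same country value."""
    totals = defaultdict(float)
    for _city, _state, country, amount in dimension_rows:
        totals[country] += amount
    return dict(totals)


totals = total_by_country(rows)
```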
<p style="text-align: left;"><a href="/wp-content/uploads/2008/10/inconsistantdimension.jpg"><img class="aligncenter size-full wp-image-209" title="inconsistantdimension" src="/wp-content/uploads/2008/10/inconsistantdimension.jpg" alt="" width="345" height="142" /></a></p>
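<p style="text-align: left;">An inconsistency of this kind can be detected by checking whether any child appears with more than one parent. The following is a minimal sketch with made-up rows; <code>inconsistent_parents</code> is a hypothetical helper, not a Datamartist feature.</p>

```python
from collections import defaultdict


def inconsistent_parents(pairs):
    """Return each child that appears with more than one distinct parent."""
    seen = defaultdict(set)
    for child, parent in pairs:
        seen[child].add(parent)
    return {child: parents for child, parents in seen.items() if len(parents) > 1}


# Hypothetical (state_province, country) pairs read off a dimension table.
state_country = [
    ("Michigan", "USA"),
    ("Michigan", "Canada"),
    ("Ontario", "Canada"),
]

problems = inconsistent_parents(state_country)
```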
<p>The ideal solution is to store the master copy of the tree as a parent-child relationship, and to generate the dimensional table from it automatically. That way the analysis is fast and easy to run, and users can view the table in spreadsheet tools in an easy-to-read format, knowing that it is guaranteed to be consistent.<br />
This is what the <a href="/product">Datamartist tool</a> does, but rather than worrying about data models and table structures, you manage tree structures with drag and drop.</p>
<p><a href="/wp-content/uploads/2008/10/datamartistrecategorise2.jpg"><img class="aligncenter size-full wp-image-236" title="datamartistrecategorise2" src="/wp-content/uploads/2008/10/datamartistrecategorise2.jpg" alt="" width="499" height="336" /></a></p>
<p>Dimensional tables are then generated in the "everything in one row" format that is so easy to use in Excel, either through an auto-filter or with pivot tables. </p>
<p><a href="/product">Find out more about Datamartist</a> and <a href="/download">download</a> a free trial version.</p>
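<p>For readers curious how an "everything in one row" table can be derived from a parent-child master copy, here is one possible sketch. It is not a description of how Datamartist does it; the table contents and the function name <code>flatten</code> are hypothetical.</p>

```python
# Hypothetical parent-child master copy; None marks a top-level object.
parent = {
    "Detroit": "Michigan",
    "Toronto": "Ontario",
    "Michigan": "USA",
    "Ontario": "Canada",
    "USA": None,
    "Canada": None,
}


def flatten(parent_of):
    """Build one row per bottom-level object: the object, then each ancestor."""
    has_children = {p for p in parent_of.values() if p is not None}
    rows = []
    for node in parent_of:
        if node in has_children:
            continue  # only objects at the bottom of the tree get a row
        row = [node]
        while parent_of[row[-1]] is not None:
            row.append(parent_of[row[-1]])
        rows.append(tuple(row))
    return rows


dimension_rows = flatten(parent)
```

<p>Because every row is rebuilt from the single parent-child table, two rows cannot disagree about which country a state belongs to.</p>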
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-modelling-how-to-make-a-dimension/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

