<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Datamartist.com &#187; Data Mart Example</title>
	<atom:link href="http://www.datamartist.com/tag/data-mart-example/feed" rel="self" type="application/rss+xml" />
	<link>http://www.datamartist.com</link>
	<description>Reduce cost with self serve data transformation</description>
	<lastBuildDate>Mon, 26 Jul 2010 18:33:50 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>MS Access query example and comparison to Datamartist</title>
		<link>http://www.datamartist.com/microsoft-access-query-example-and-comparision-to-datamartist</link>
		<comments>http://www.datamartist.com/microsoft-access-query-example-and-comparision-to-datamartist#comments</comments>
		<pubDate>Tue, 31 Mar 2009 22:59:55 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Business Intelligence Architecture]]></category>
		<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Data Transformation]]></category>
		<category><![CDATA[MS Access]]></category>
		<category><![CDATA[Microsoft Excel]]></category>
		<category><![CDATA[Access]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Excel]]></category>
		<category><![CDATA[Personal data mart]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=1321</guid>
		<description><![CDATA[Microsoft Access allows users to create complex queries and analyze large data sets. However, it can be complicated to use compared to Excel. In this post, I'll talk about MS Access queries and the equivalent way to perform the same data transformation in the Datamartist tool, visually and simply. Microsoft Access has a clear role [...]]]></description>
			<content:encoded><![CDATA[<p>Microsoft Access allows users to create complex queries and analyze large data sets.  However, it can be complicated to use compared to Excel.  In this post, I'll talk about <a href="/help-support/tutorials/microsoft-access-examples-and-tutorials">MS Access queries</a> and the equivalent way to perform the same data transformation in the <a href="/product">Datamartist tool</a>, visually and simply.</p>
<p>Microsoft Access has a clear role to play when a small, light database application is required.  However, it has a learning curve, and is not necessarily the best tool for data analysis.</p>
<h2>Product Segmentation Query Example</h2>
<p>Let's look at an example MS Access query or two and see how we can do the same thing in Datamartist, without the queries and without any SQL. For this example, let's say that we have two sets of sales data from different time periods, and a product list, and we want to define some product segments based on color and price.  We want a summary of the sales quantity and average selling price by month, broken out by the new segments, which are as follows:</p>
<ul>
<li> "Red and High Priced" if the product is red and its minimum price is more than $1000</li>
<li> "Red Low Price wide price range" if the product is red, has a minimum price of $1000 or less, and the difference between its minimum and maximum price is more than $200</li>
<li> "Red Low Price small price range" if it's red and not in the first two segments</li>
<li> "Yellow" if the product is yellow</li>
<li> "Other" for all the rest</li>
</ul>
<p>The three data tables we have are as follows:</p>
<ol>
<li> Sales 03-06, with about 120,000 rows, which contains sales data from 2003 to 2006</li>
<li> Sales 2007, with about 30,000 rows, which contains sales data for 2007</li>
<li> Products, which contains the colors for all the products and their minimum and maximum prices</li>
</ol>
<p>So the first step is to combine the two sales tables. In Access, this is done with a UNION query, using the following SQL code:</p>
<blockquote><p>select * from [Sales Data 03-06] UNION select * from [Sales Data 2007];</p></blockquote>
<p>In Datamartist, we simply connect the two tables up to a combine block.<br />
<img src="/wp-content/uploads/2009/03/segmentation-example-datamartist-combine1.jpg" alt="segmentation-example-datamartist-combine1" title="segmentation-example-datamartist-combine1" width="264" height="234" class="alignnone size-full wp-image-1394" /></p>
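<p>One detail worth checking when combining tables in SQL: a plain UNION also removes rows that are exact duplicates of each other, while UNION ALL keeps every row. The sketch below is only an illustration of that difference. It uses Python's built-in sqlite3 module rather than Access, and the table names, column names and rows are made up for the example.</p>

```python
import sqlite3

# In-memory database with two hypothetical sales tables (not the post's real data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_03_06 (product_id INTEGER, qty INTEGER)")
conn.execute("CREATE TABLE sales_2007 (product_id INTEGER, qty INTEGER)")
conn.executemany("INSERT INTO sales_03_06 VALUES (?, ?)", [(1, 5), (1, 5), (2, 3)])
conn.executemany("INSERT INTO sales_2007 VALUES (?, ?)", [(1, 5), (3, 7)])

# UNION drops exact duplicate rows; UNION ALL appends the tables as they are.
union_rows = conn.execute(
    "SELECT * FROM sales_03_06 UNION SELECT * FROM sales_2007"
).fetchall()
union_all_rows = conn.execute(
    "SELECT * FROM sales_03_06 UNION ALL SELECT * FROM sales_2007"
).fetchall()

print(len(union_rows))      # distinct rows only
print(len(union_all_rows))  # every input row
```

<p>If two sales lines can legitimately be identical, comparing the combined row count against the sum of the two source tables is a quick way to confirm nothing was dropped.</p>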
<p>Next, we need to define the segmentation. In Access this is again done with a query, this time by nesting IIf statements to add a new column called "Product_Segment" to the result.</p>
<blockquote><p>SELECT Products.Product_ID, Products.Product_Name, Products.Product_Group, Products.Product_Category, Products.Product_SubCategory, Products.Shipping_Weight, Products.Color, Products.Price_Min, Products.Price_Max, IIf([Color]="Red" And [Price_Min]>1000,"Red and High Priced",IIf([Color]="Red" And ([Price_max]-[Price_min])>200,"Red Low Price wide price range",IIf([Color]="Red","Red Low Price small price range",IIf([Color]="Yellow","Yellow","Other")))) AS Product_Segment<br />
FROM Products;</p></blockquote>
<p>In Datamartist, we use a segmentation block to do the same thing.  The interface is graphical, and the syntax is the same as you would use in Excel.  There is no need to nest any IF statements, because the block is designed to handle that.  Here's what the blocks look like: the MS Access import block on the left, and the segmentation rule block on the right.<br />
<img src="/wp-content/uploads/2009/03/segmentation-example-datamartist-segment-block.jpg" alt="segmentation-example-datamartist-segment-block" title="segmentation-example-datamartist-segment-block" width="418" height="211" class="alignnone size-full wp-image-1428" /><br />
Each segment has a statement that defines whether a row is in the segment or not.   The block tests each segment rule in order, starting at the top; the first statement that evaluates to "TRUE" sets the value of the Product_Segment column for that row. Dragging the segments up and down changes the order in which the rules are checked.</p>
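<p>The same "first rule that matches wins" idea can be sketched in a few lines of Python. This is not how Datamartist is implemented, just an illustration of why an ordered rule list removes the need for nesting. The column names come from the Products query above, and the sample product is invented.</p>

```python
# Each rule is (segment name, test); rules are checked top to bottom.
# Column names follow the Products table in the query above.
RULES = [
    ("Red and High Priced",
     lambda p: p["Color"] == "Red" and p["Price_Min"] > 1000),
    ("Red Low Price wide price range",
     lambda p: p["Color"] == "Red" and p["Price_Max"] - p["Price_Min"] > 200),
    ("Red Low Price small price range", lambda p: p["Color"] == "Red"),
    ("Yellow", lambda p: p["Color"] == "Yellow"),
]


def product_segment(product, rules=RULES, default="Other"):
    """Return the name of the first rule whose test is true."""
    for name, test in rules:
        if test(product):
            return name
    return default


# Hypothetical sample row for illustration.
bike = {"Color": "Red", "Price_Min": 500, "Price_Max": 900}
print(product_segment(bike))  # Red Low Price wide price range
```

<p>Reordering the entries in the list changes which segment a row lands in, which is the same effect as dragging the segments up and down in the block.</p>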
<p><a href="/resources/images/Segmentation-Example-Product.jpg" target="_blank" onClick="javascript: pageTracker._trackPageview('/screenshots/Segmentation-Example-Product'); "><img src="/resources/images/Segmentation-Example-Product-Thumb.jpg"><br />
(Click to Enlarge)</a></p>
<p>Then we have to Join this new product dimension (with the segmentation column) to the sales data, and summarize.</p>
<p>In MS Access, this is done with more queries. Here's what Access looks like when we're done.<br />
<img src="/wp-content/uploads/2009/03/segmentation-example-access-gui1.jpg" alt="segmentation-example-access-gui1" title="segmentation-example-access-gui1" width="450" height="485" class="alignnone size-full wp-image-1405" /><br />
Compare that list of tables and queries to the visual, left-to-right layout of the Datamartist data canvas that does the same thing, without ever having to write any SQL code:</p>
<h2>The VISUAL way to do it</h2>
<p><img src="/wp-content/uploads/2009/03/segmentation-example-solved-canvas.jpg" alt="segmentation-example-solved-canvas" title="segmentation-example-solved-canvas" width="406" height="314" class="alignnone size-full wp-image-1403" /></p>
<p><a href="/resources/images/Segmentation-Example-Datamartist-full-app-shot.jpg" target="_blank" onClick="javascript: pageTracker._trackPageview('/screenshots/Segmentation-Example-Datamartist-full-app-shot'); "><img src="/resources/images/Segmentation-Example-Datamartist-full-app-shot-Thumb.jpg" class="alignright size-full wp-image-1430" ></a><br />
In Datamartist you can see the flow of the data, the row counts are clearly displayed, and clicking on the connectors brings up the underlying data set in the data viewer.  It's clear which block feeds which, and new analysis can be created by adding more blocks and connecting them at the desired point in the data flow.</p>
<p>Take Datamartist for a trial run: <a href="/downloads">download it now</a>, because maybe you don't have to learn Microsoft Access queries after all.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/microsoft-access-query-example-and-comparision-to-datamartist/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Joining the Dimension Table to the Fact Table- Purchasing Data mart (Part 5)</title>
		<link>http://www.datamartist.com/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5</link>
		<comments>http://www.datamartist.com/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5#comments</comments>
		<pubDate>Tue, 17 Feb 2009 16:31:48 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Cost Reduction]]></category>
		<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Datamartist Tool]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Purchasing Analysis]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Dimension Tables]]></category>
		<category><![CDATA[Purchasing Data Warehouse]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=991</guid>
		<description><![CDATA[After we have created the dimension tables and the fact table and populated them with data the final step to getting a star schema is of course to actually join the dimension tables to the fact table. In the datamartist tool we do this with a Join block. Check out the first four parts of [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/wp-content/uploads/2009/02/join1.jpg" alt="join1" title="join1" width="200" height="200" class="alignright size-full wp-image-995" />After we have created the dimension tables and the fact table and populated them with data, the final step in building a star schema is to join the dimension tables to the fact table.  In the Datamartist tool we do this with a Join block.</p>
<p>Check out the first four parts of this series (<a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>,<a href="http://www.datamartist.com/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a> , <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a> and <a href="/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4">4</a>) where we created an example data mart, with some fictitious purchasing data.</p>
<p>The final step is to join the dimensions we have created to the fact table. To do this, we connect the two dimensions (Vendor and Item) to the Join block and connect an export block to the output.  In effect, this creates a complete Extract, Transform, Load (ETL) process ending in the star schema join.<br />
<a href="/wp-content/uploads/2009/02/po-data-mart-screen-shot2.png"><img src="/wp-content/uploads/2009/02/po-datamart-blocks1.jpg" alt="po-datamart-blocks1" title="po-datamart-blocks1" width="400" height="208" class="alignnone size-full wp-image-1002" /></a></p>
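<p>For readers who think in code, the join step amounts to looking up each fact row's keys in the dimension tables and attaching the descriptive columns. Here is a minimal Python sketch of that idea. All table contents and column names are invented for illustration, and keeping fact rows with no matching dimension member (a left join) is a choice made for this sketch, not a statement about how the Join block behaves.</p>

```python
# Hypothetical miniature dimension and fact tables; all names and values are illustrative.
vendors = {101: {"Vendor_Name": "Mega Brothers Rail"}}
items = {"C45222": {"Item_Name": "Laptop"}}
facts = [
    {"Vendor_ID": 101, "Item_ID": "C45222", "Year": 2008, "Total": 1500.0},
    {"Vendor_ID": 999, "Item_ID": "C45222", "Year": 2008, "Total": 200.0},
]


def join_star(facts, vendors, items):
    """Attach dimension attributes to each fact row by key lookup."""
    joined = []
    for row in facts:
        out = dict(row)
        # Keys missing from a dimension are kept with empty attributes (a left join).
        out.update(vendors.get(row["Vendor_ID"], {"Vendor_Name": None}))
        out.update(items.get(row["Item_ID"], {"Item_Name": None}))
        joined.append(out)
    return joined


star = join_star(facts, vendors, items)
print(star[0]["Vendor_Name"])
```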
<p>(If that's a bit hard to read, click on the image to see the full-size screenshot.)</p>
<p>With the generated data set I used for this example, summarizing the data to yearly totals but keeping all the detail on Vendor and Item reduces the roughly 4 million row raw data file to around 800 thousand rows.  (This summarizing was done on another canvas, although it could have been done on this canvas just as easily.)</p>
<p><img src="/wp-content/uploads/2009/02/join-column-selection.jpg" alt="join-column-selection" title="join-column-selection" width="249" height="361" class="alignleft size-full wp-image-1007" />This data mart, with 800 thousand rows and two dimensions of about three thousand members each, took my laptop about a minute and 45 seconds to solve and save to a 360 MB text file.</p>
<p>Of course, by summarizing or filtering (just add blocks), analysis subsets could easily be exported directly to Excel, keeping the data volumes manageable and letting you create the graphs, dashboards and reports that you need.</p>
<p>This is part of a 5 part series- here are the links to the various parts: <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>,<a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a> , <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a> , <a href="/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4">4</a> and <a href="/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5">5</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Hierarchies and Tree Structures in Dimensions- an Example Item Dimension (Part 4)</title>
		<link>http://www.datamartist.com/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4</link>
		<comments>http://www.datamartist.com/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4#comments</comments>
		<pubDate>Wed, 11 Feb 2009 16:09:09 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Datamartist Tool]]></category>
		<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Purchasing Analysis]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Hierarchies and Tree Structures]]></category>
		<category><![CDATA[Purchasing Data Warehouse]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=903</guid>
		<description><![CDATA[Having a way to create and manage tree structures (Hierarchies) with your dimension and fact tables is a key part of making a dimensional model in any data warehouse or data mart. Hierarchical structures lend themselves to managing a very large number of categories and we use them to create drill down paths. Check out [...]]]></description>
			<content:encoded><![CDATA[<p><object width="450" height="412"><param name="movie" value="/resources/video/DemoClips/beta2_tree_edit_clip_un_prod.swf"><embed src="/resources/video/DemoClips/beta2_tree_edit_clip_un_prod.swf" width="450" height="412"></embed></object></p>
<p>Having a way to create and manage tree structures (Hierarchies) with your dimension and fact tables is a key part of making a dimensional model in any data warehouse or data mart. Hierarchical structures lend themselves to managing a very large number of categories and we use them to create drill down paths.</p>
<p>Check out the first three parts of this series (<a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>,<a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a> and <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a>) to see what we've done so far.</p>
<p>In this installment, we will make another dimension, the Item dimension.  This will illustrate how the Datamartist tool allows you to quickly and easily generate hierarchies, and even edit and manage them in a graphical user interface.</p>
<p>The head of purchasing for Acme has asked us to analyze the company's spend on computer equipment- "I have a feeling some offices are spending more than others- but I don't have the numbers to back it up.  But I don't want you to use the categories in the source system- I just want it broken down by Desktops, Laptops, Printers, PDAs and other.  Can you do that with the data mart?"</p>
<p> In their source system, Acme is using the <a href="http://unstats.un.org/unsd/cr/registry/cpc-2.asp">United Nations Central Product Classification</a> (UNCPC), so we know that all the computer spending we're interested in is in division "C45 Office, accounting and computing machinery".   The codes look like "C45222", so we want to take all codes whose first three characters are "C45".  We can do this easily with a filter block. After the filter block we connect a define reference block (to make a dimension), just as we did before. Finally, since we're looking at hierarchies, we'll add a recategorise block too; that last block in the chain is what we use to change the drill-down structure:</p>
<p><img src="/wp-content/uploads/2009/02/items-modify-computer-categories.jpg" alt="items-modify-computer-categories" title="items-modify-computer-categories" width="500" height="141" class="alignnone size-full wp-image-932" /></p>
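<p>The filter rule itself is just a prefix test on the code. As a rough Python equivalent (the item rows and column names here are invented; only the "C45222" code format comes from the example above):</p>

```python
def in_division(code, division="C45"):
    """True when the classification code starts with the division prefix."""
    return code.startswith(division)


# Hypothetical item rows; only the code format "C45222" comes from the text.
items = [
    {"Item_Code": "C45222", "Item_Name": "Laptop"},
    {"Item_Code": "C31100", "Item_Name": "Office chair"},
    {"Item_Code": "C45160", "Item_Name": "Printer"},
]

computer_items = [row for row in items if in_division(row["Item_Code"])]
print([row["Item_Code"] for row in computer_items])
```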
<h2> Tree structures simplify alternate categorisation</h2>
<p>The advantage of using a tree structure is that we only have to rearrange the level of the hierarchy that matches the level of detail we need: we don't have to map each individual product, just the higher levels.  So it's much less work to start, and when new products are added in the source system, they will automatically map up into the new categorisation.  Recategorising in Excel often means search and replace at the bottom level, which can cause errors and has to be redone manually every time the data is updated.</p>
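<p>To see why mapping at a higher level is less work, here is a small Python sketch. It assumes, purely for illustration, that the first five characters of a code identify its class and that the classes shown belong to the categories listed; the real classification codes and assignments will differ.</p>

```python
# Map a higher level of the source hierarchy to the new categories.
# The class codes and their assignments below are invented for illustration.
CLASS_TO_CATEGORY = {
    "C4521": "Desktops",
    "C4522": "Laptops",
    "C4516": "Printers",
}


def new_category(item_code, mapping=CLASS_TO_CATEGORY, default="Other"):
    """Roll an item up to its new category via its five-character class prefix."""
    return mapping.get(item_code[:5], default)


print(new_category("C45222"))  # Laptops
# A product added later under the same class needs no new mapping entry.
print(new_category("C45229"))  # Laptops
print(new_category("C45900"))  # Other
```

<p>Three mapping entries cover every product under those classes, including ones that do not exist yet, which is the point of recategorising on the tree rather than product by product.</p>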
<p>When we open the recategorise block, we simply pick the levels we want to see, and then are presented with a tree view that shows us the hierarchy, automatically generated from the underlying data.<br />
<img src="/wp-content/uploads/2009/02/acme-computer-categories-edit.jpg" alt="acme-computer-categories-edit" title="acme-computer-categories-edit" width="500" height="245" class="alignnone size-full wp-image-936" /></p>
<p>Now, directly within the hierarchy we can edit categories, add new categories, and drag and drop categories around to build the new drill down that we want.  <img src="/wp-content/uploads/2009/02/acme-computer-updated-categories1.jpg" alt="acme-computer-updated-categories1" title="acme-computer-updated-categories1" width="250" height="331" class="alignleft size-full wp-image-945" /> The interface is a lot like the Windows file explorer, just like renaming and moving folders, except that you are building dimensional data. Of course, the underlying input data is not changed, so there is no need to modify the source system in any way, but the Datamartist tool records all the mapping and is able to reproduce it when new data arrives. </p>
<p>You only have to edit the hierarchy once, and from that point on your analysis can use both the existing and the edited tree structure.  It's possible to create as many different hierarchies as required, which makes it a fast way to do "what if" analysis, trying out different drill-down paths and categorisations.</p>
<p>This is part of a 5 part series- here are the links to the various parts: <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>,<a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a> , <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a> , <a href="/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4">4</a> and <a href="/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5">5</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Connecting the dimension table to the fact table- Vendor Example (Part 3)</title>
		<link>http://www.datamartist.com/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3</link>
		<comments>http://www.datamartist.com/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3#comments</comments>
		<pubDate>Mon, 09 Feb 2009 20:47:55 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Cost Reduction]]></category>
		<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Data Quality]]></category>
		<category><![CDATA[Datamartist Tool]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Dimension Tables]]></category>
		<category><![CDATA[Duplicate Data]]></category>
		<category><![CDATA[Purchasing Data Warehouse]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=858</guid>
		<description><![CDATA[In parts one and two of this series we introduced our challenge (to make a data mart to analyze the Acme Company's spending) and showed how the Datamartist tool could import millions of rows of data and then turn it into a fact table we can use in Excel. Now we need to create a [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/wp-content/uploads/2009/02/makingdimseasyway.jpg" alt="makingdimseasyway" title="makingdimseasyway" width="250" height="97" class="alignright size-full wp-image-883" />In parts <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">one</a> and <a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">two</a> of this series we introduced our challenge (to make a data mart to analyze the Acme Company's spending) and showed how the <a href="/product">Datamartist tool</a> could import millions of rows of data and then turn it into a fact table we can use in Excel.</p>
<p>Now we need to create a Vendor dimension table and join it to this fact table to determine who our big vendors are.</p>
<p>In Datamartist it is a simple task to create this vendor dimension. As always we use blocks and connect them together.  We define a dimension by using a reference definition block. All we have to do to configure the reference block is to specify which columns uniquely define the dimension (or almost uniquely: Datamartist will resolve duplicate keys using a majority/first rule set for you if you have some data glitches).</p>
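<p>This post doesn't spell out exactly what the majority/first rule does inside the tool. One plausible reading, sketched below in Python as an assumption rather than a description of Datamartist's actual logic, is: for each key, keep the attribute value that appears most often, and when there is a tie keep the one seen first. The sample rows are invented.</p>

```python
from collections import Counter


def resolve_duplicates(rows, key, attribute):
    """Keep one attribute value per key: the most frequent, first seen on ties."""
    seen = {}
    for row in rows:
        seen.setdefault(row[key], []).append(row[attribute])
    resolved = {}
    for k, values in seen.items():
        counts = Counter(values)
        best = max(counts.values())
        # The first value in input order that reaches the top count wins.
        resolved[k] = next(v for v in values if counts[v] == best)
    return resolved


# Hypothetical vendor master rows with conflicting names for the same ID.
rows = [
    {"Vendor_ID": 7, "Vendor_Name": "Mega Bros"},
    {"Vendor_ID": 7, "Vendor_Name": "Mega Brothers Rail"},
    {"Vendor_ID": 7, "Vendor_Name": "Mega Brothers Rail"},
    {"Vendor_ID": 8, "Vendor_Name": "Acme Freight"},
    {"Vendor_ID": 8, "Vendor_Name": "ACME Freight"},
]
print(resolve_duplicates(rows, "Vendor_ID", "Vendor_Name"))
```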
<p>We start with an import block that brings in the Vendor master text file, then we define the reference by specifying "Vendor_ID" as the key.  These first two blocks look like this:<br />
<img src="/wp-content/uploads/2009/02/vendor-master-in-and-reference-block.jpg" alt="vendor-master-in-and-reference-block" title="vendor-master-in-and-reference-block" width="302" height="148" class="alignnone size-full wp-image-878" /></p>
<p>Then we join it to the fact table we created in part two of this series with a join block.  This means that now instead of just the vendor ID number that was in the fact table, we have the name, and address for the vendor in our mini star schema.</p>
<p><img src="/wp-content/uploads/2009/02/vendor-dimension-and-join.jpg" alt="vendor-dimension-and-join" title="vendor-dimension-and-join" width="436" height="283" class="alignnone size-full wp-image-879" /></p>
<p>And finally we put a summarize block after that to total up all the monthly values for each vendor, and we export to Excel. This is what the canvas looks like:<br />
<img src="/wp-content/uploads/2009/02/vendor-dimension-without-dedup1.jpg" alt="vendor-dimension-without-dedup1" title="vendor-dimension-without-dedup1" width="501" height="198" class="alignnone size-full wp-image-865" /><br />
After we do this, we grab the Excel file Datamartist just created for us, do a quick sort, and come up with a list of Acme's top ten suppliers.  Feeling pretty good about ourselves, we do a review with the head of purchasing.</p>
<p>"Where's Mega brothers?" she says with a frown "I think your data is screwy- no way that Mega brothers didn't make the top ten- we spend a fortune on railways, and a lot of our freight goes with the Mega Brothers Rail company. Of course it is probably entered under different vendors, each location works with the office local to them... But we've got to view them as a single vendor in the data mart- you <em><strong>can</strong></em> do that right?"</p>
<p><img src="/wp-content/uploads/2009/02/vendor-dimension-with-dedupe1.jpg" alt="vendor-dimension-with-dedupe1" title="vendor-dimension-with-dedupe1" width="300" height="205" class="alignright size-full wp-image-870" /></p>
<h2>Fixing Duplicate Rows</h2>
<p>  Having to deal with duplicate data is a very common issue in any type of data analysis.  So, back to the canvas.  By simply adding a de-duplicate block to our Vendor dimension table (after the Reference block, and before the join) we can find and resolve the Mega Brothers duplicates.<br />
We just use the filter to find the records (easy to do, looking for "Mega", "rail", "brothers" etc.) and map them to a single instance.  This is the filter control that lets us find and tag the duplicates:<br />
<img src="/wp-content/uploads/2009/02/mega-bros-duplicates-in-picker1.jpg" alt="mega-bros-duplicates-in-picker1" title="mega-bros-duplicates-in-picker1" width="400" height="280" class="alignnone size-full wp-image-871" /></p>
<p><img src="/wp-content/uploads/2009/02/mega-bros-duplicates-in-mapper.jpg" alt="mega-bros-duplicates-in-mapper" title="mega-bros-duplicates-in-mapper" width="312" height="247" class="alignright size-full wp-image-872" />As we tag them, they show up in the mapper, which lets us see which duplicate records we have eliminated for the dimension. We run the canvas again, and this time, sure enough, Mega Brothers Rail is in our top ten.  But even though the head of purchasing knew it was a lot, this is actually the first time she's seen the number.  "Wow. I've got to give them a call- can you give me that in an Excel spreadsheet?"</p>
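<p>Conceptually, the de-duplicate step builds a mapping from each duplicate vendor ID to one master ID, and the totals are then summed under the master. A minimal Python sketch of that idea follows; the vendor names, IDs and amounts are invented, and in practice the keyword matches are candidates that a person reviews and tags, not something to apply blindly.</p>

```python
def build_mapping(vendors, keywords, master_id):
    """Map every vendor whose name contains any keyword to one master vendor ID."""
    mapping = {}
    for vendor_id, name in vendors.items():
        lowered = name.lower()
        if any(word in lowered for word in keywords):
            mapping[vendor_id] = master_id
    return mapping


# Hypothetical vendor master rows.
vendors = {
    10: "Mega Brothers Rail Co.",
    11: "MEGA BROS RAIL - East",
    12: "Northern Paper Ltd",
}
mapping = build_mapping(vendors, ["mega"], master_id=10)

# Fact rows are re-keyed through the mapping before totals are summed.
spend = [(10, 100.0), (11, 250.0), (12, 40.0)]
totals = {}
for vendor_id, amount in spend:
    target = mapping.get(vendor_id, vendor_id)
    totals[target] = totals.get(target, 0.0) + amount
print(totals)
```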
<p>Stay tuned, more to come as we go further into Datamartist's ability to segment, filter and organize large data sets.</p>
<p>If you want to see the interface in action watch our first <a href="/product/video-and-screenshots/introductory-tutorial-video">Tutorial Video</a>.  Or just get right to it with your own data- <a href="/downloads">download the free 30 day trial now</a>- there is no registration required, and it installs in minutes.</p>
<p>This is part of a 5 part series- here are the links to the various parts: <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>,<a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a> , <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a> , <a href="/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4">4</a> and <a href="/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5">5</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Creating a Fact Table with the Vendor dimension Purchasing DM (Part 2)</title>
		<link>http://www.datamartist.com/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2</link>
		<comments>http://www.datamartist.com/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2#comments</comments>
		<pubDate>Fri, 06 Feb 2009 00:23:50 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Data Transformation]]></category>
		<category><![CDATA[Datamartist Tool]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Excel Data Import]]></category>
		<category><![CDATA[Excel Performance]]></category>
		<category><![CDATA[Purchasing Data Warehouse]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=781</guid>
		<description><![CDATA[In creating a data warehouse or data mart data model there are two key types of tables- fact tables and dimension tables. Fact tables hold the data to be analyzed, dimensional tables provide categories and analysis values that organize the data. So we have our mission from Part 1: to analyze the "Acme does everything" [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/wp-content/uploads/2009/02/four_million_rows_no_worries1.jpg" alt="four_million_rows_no_worries1" title="four_million_rows_no_worries1" width="300" height="136" class="alignright size-full wp-image-812" />In creating a data warehouse or data mart data model there are two key types of tables: fact tables and dimension tables.  Fact tables hold the data to be analyzed; dimension tables provide categories and analysis values that organize the data.<br />
So we have our <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">mission from Part 1</a>: to analyze the "Acme does everything" company's purchasing data and find ways to save money.  The first step, however, is getting a handle on the data.  The IT department has given us the files, and with a smug smile told us to "have fun".  We've been given three files that are a snapshot of the purchasing data:</p>
<ul>
<li><strong>Item_Master.txt</strong>  - this holds all the items that Acme buys</li>
<li><strong>Vendor_Master.txt</strong> - this holds a list of all the vendors, with information such as their address</li>
<li><strong>PO_Detail.txt</strong> - this is the huge data set, all the purchase order data for the last four years</li>
</ul>
<p>The Item and Vendor files aren't very big, but the PO_Detail file is over 340 MB, and it holds almost four million purchase order lines.  Don't try to import it into Excel. You need Excel 2007 to even attempt 4 million rows; in Excel 2003 it would take over sixty sheets and probably some VBA code.  I tried the import in Excel 2007: it takes 20 seconds just to tell me I'll have to go back to the text file import several times and load the data onto separate sheets. It took almost two minutes to do the first million rows.  Even once we have the data spread across four sheets, it's not clear how to summarize millions of rows in Excel easily.<img src="/wp-content/uploads/2009/02/po_detail_columns.jpg" alt="po_detail_columns" title="po_detail_columns" width="247" height="398" class="alignright size-full wp-image-785" /></p>
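<p>The sheet counts come straight from the worksheet row limits: 65,536 rows per sheet in Excel 2003 and 1,048,576 in Excel 2007. A quick Python check of the arithmetic, using a round 4,000,000 as a stand-in for the exact line count:</p>

```python
# Worksheet row limits: 65,536 rows in Excel 2003 and 1,048,576 in Excel 2007.
EXCEL_2003_ROWS = 65_536
EXCEL_2007_ROWS = 1_048_576


def sheets_needed(total_rows, rows_per_sheet):
    """Ceiling division: how many sheets are needed to hold total_rows."""
    return -(-total_rows // rows_per_sheet)


# 4,000,000 is a round stand-in for the "almost four million" lines in the file.
print(sheets_needed(4_000_000, EXCEL_2003_ROWS))  # 62
print(sheets_needed(4_000_000, EXCEL_2007_ROWS))  # 4
```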
<p>Instead, let's use the <a href="/product">Datamartist tool</a> to manage this data set and generate one that's more useful.</p>
<p>The first analysis we will do will be on the Vendor dimension, to determine who Acme's big vendors are, and if we can negotiate some price reductions where we have leverage.</p>
<p>In Datamartist, very large files are not an issue because the tool can load in only preview data- this means that it's possible to look at a sampling of a few hundred thousand rows, and design the transformation before running it on the whole data set.</p>
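<p>The preview idea is easy to picture in code: read only the first N rows and stop, instead of loading the whole file. The Python sketch below shows the general technique, not Datamartist's internals. The tab-delimited layout and column names are assumptions, and a small in-memory sample stands in for the real PO_Detail.txt.</p>

```python
import csv
import io
from itertools import islice


def preview_rows(handle, limit):
    """Read only the first `limit` data rows instead of the whole file."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(islice(reader, limit))


# A small in-memory stand-in for PO_Detail.txt; tab-delimited layout is assumed.
sample = io.StringIO("Vendor_ID\tQTY\tPRICE\n1\t2\t10.0\n2\t1\t5.0\n1\t4\t2.5\n")
rows = preview_rows(sample, 2)
print(len(rows))
```

<p>Because the reader is consumed lazily, the cost of a preview depends on the preview size rather than on the size of the file.</p>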
<p>The PO Detail file has the columns shown. Let's answer the question "Who are our biggest suppliers?"<br />
 So which columns do we need?  We probably want some sense of trends over time, so we'll keep the <strong>order date</strong> but summarize to <strong>Month</strong>; we'll keep the <strong>Vendor ID</strong>, of course; and then we need the <strong>Quantity and Price</strong> fields to calculate the total amount spent.  Then we want to write this summarized data into Excel to check it out.</p>
<p>To do this in Datamartist, all it takes is four simple blocks: a Text import block to load in the PO_Detail.txt file, a calculate block to multiply QTY by PRICE, a Summarize block to do all the summarizing, and an Excel export block to generate the Excel file:</p>
<p><img src="/wp-content/uploads/2009/02/po_detail_summarize_blocks.jpg" alt="po_detail_summarize_blocks" title="po_detail_summarize_blocks" width="463" height="92" class="alignnone size-full wp-image-806" /></p>
<p>Each block passes its result to the next block via the connectors, and the last block saves it to an Excel file we've specified.</p>
<p>Defining the calculation uses standard spreadsheet functions. Here's what the config area looks like:<br />
<img src="/wp-content/uploads/2009/02/calculate_total_closeup.jpg" alt="calculate_total_closeup" title="calculate_total_closeup" width="400" height="91" class="alignnone size-full wp-image-801" /></p>
<p>And defining the summary is as simple as it looks: pick the columns you want, and select what kind of summary you want done.<br />
<img src="/wp-content/uploads/2009/02/summary_block_closeup1.jpg" alt="summary_block_closeup1" title="summary_block_closeup1" width="417" height="111" class="alignnone size-full wp-image-797" /></p>
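<p>For readers who think in code, the Calculate and Summarize steps can be sketched roughly as below. This is not how Datamartist works internally, just a minimal illustration of the same logic. The column names (<code>ORDER_DATE</code>, <code>VENDOR_ID</code>, <code>QTY</code>, <code>PRICE</code>), the tab delimiter and the ISO date format are all assumptions; the real PO_Detail.txt layout may differ.</p>

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def summarize_spend(lines):
    """Total QTY * PRICE per (vendor, month) from an iterable of text lines."""
    totals = defaultdict(Decimal)  # a missing key starts at Decimal("0")
    # Assumed layout: tab-delimited text with a header row.
    reader = csv.DictReader(lines, delimiter="\t")
    for row in reader:
        # Assumed date format: "YYYY-MM-DD", so the first 7 characters are the month.
        month = row["ORDER_DATE"][:7]
        # The "Calculate" step: line total = quantity * unit price.
        line_total = Decimal(row["QTY"]) * Decimal(row["PRICE"])
        # The "Summarize" step: sum the line totals by vendor and month.
        totals[(row["VENDOR_ID"], month)] += line_total
    return dict(totals)


# Made-up sample rows, standing in for the real file.
sample = io.StringIO(
    "ORDER_DATE\tVENDOR_ID\tQTY\tPRICE\n"
    "2008-03-14\tV001\t10\t2.50\n"
    "2008-03-20\tV001\t4\t1.25\n"
    "2008-04-02\tV002\t3\t100.00\n"
)
for (vendor, month), total in sorted(summarize_spend(sample).items()):
    print(vendor, month, total)
```

<p>Because the rows are read one at a time and only the running totals are kept, memory use depends on the number of vendor and month combinations rather than on the number of purchase order lines.</p>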
<p>We run it on a preview set of 100 thousand rows (which takes about twelve seconds) and check the output.</p>
<p>It looks good, so we run it on the whole four million rows:</p>
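<p>The idea of testing on a preview set first carries over to any tool. As a rough sketch (the helper name <code>preview_lines</code> is hypothetical, and this says nothing about how Datamartist picks its preview rows), you can cap the number of lines read from a large text file before running the full job:</p>

```python
import io
from itertools import islice


def preview_lines(lines, max_rows):
    """Return the header line plus at most max_rows data lines."""
    # islice stops reading early, so the rest of the file is never touched.
    return islice(lines, max_rows + 1)


# Made-up stand-in for a large file: one header line and 1,000 data lines.
sample = io.StringIO(
    "ORDER_DATE\tQTY\n" + "".join(f"2008-01-01\t{n}\n" for n in range(1000))
)
preview = list(preview_lines(sample, 100))
print(len(preview))  # 101: the header plus 100 data rows
```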
<p><img src="/wp-content/uploads/2009/02/summarize_progress_po_detail.jpg" alt="summarize_progress_po_detail" title="summarize_progress_po_detail" width="466" height="128" class="alignnone size-full wp-image-804" /></p>
<p>About seven minutes later we have our result: an Excel sheet with a manageable 130 thousand rows, showing total spend by vendor, by month, for four years:<br />
<img src="/wp-content/uploads/2009/02/completed_po_detail_summary.jpg" alt="completed_po_detail_summary" title="completed_po_detail_summary" width="461" height="95" class="alignnone size-full wp-image-807" /></p>
<p>Next up we need to create our vendor dimension, and join it to this mini fact table we have created.  Stay tuned.</p>
<p>This is part of a five-part series. Here are the links to the parts: <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>, <a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a>, <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a>, <a href="/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4">4</a> and <a href="/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5">5</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Purchasing Data Mart &#8211; cutting costs with analysis (Part 1)</title>
		<link>http://www.datamartist.com/purchasing-data-mart-cutting-costs-with-analysis-part-1</link>
		<comments>http://www.datamartist.com/purchasing-data-mart-cutting-costs-with-analysis-part-1#comments</comments>
		<pubDate>Tue, 27 Jan 2009 20:08:35 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Purchasing Analysis]]></category>
		<category><![CDATA[Spreadsheet Tips]]></category>
		<category><![CDATA[Accounts payable]]></category>
		<category><![CDATA[analysis]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Data Warehouse Example]]></category>
		<category><![CDATA[Example Data mart]]></category>
		<category><![CDATA[Purchasing]]></category>
		<category><![CDATA[Purchasing Data Warehouse]]></category>

		<guid isPermaLink="false">http://www.datamartist.com.php5-2.dfw1-1.websitetestlink.com/?p=774</guid>
		<description><![CDATA[In these difficult economic times, cutting costs isn't just optimization, it's survival. You can't reduce what you can't quantify so it's critical to analyze the accounts payable (AP), or purchasing data to identify the areas where cost savings are possible. This is one of the most useful financial data marts because spending is often something [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/wp-content/uploads/2009/02/purchasingdatamartgraphic-300x224.jpg" alt="purchasingdatamartgraphic" title="purchasingdatamartgraphic" width="300" height="224" class="alignright size-medium wp-image-775" />In these difficult economic times, cutting costs isn't just optimization, it's survival. You can't reduce what you can't quantify so it's critical to analyze the accounts payable (AP), or purchasing data to identify the areas where cost savings are possible.  This is one of the most useful financial data marts because spending is often something that can be controlled quickly once understood.</p>
<p>In the next series of posts I am going to walk through the design and implementation of a purchasing data mart, including its fact tables and dimensions, so that we can analyze some typical purchasing data.  I’ll build this data mart model using the <a href="/product">Datamartist tool</a>.  </p>
<p>This will create a “snapshot” analysis of purchasing data with a desktop data analysis tool that can be built quickly yet will access millions of rows of data, and deal with data quality issues such as duplicate rows.</p>
<p>For the purchasing data mart model that we’ll be defining, I’ll use a fictitious company that manufactures and sells a broad range of things- the "Acme does everything company".  </p>
<p>Acme is a long-standing enterprise, with a number of offices and factories in the US. But they’ve never done an in-depth analysis of their costs because they didn’t have to until now: profits were good, and the business was growing well.  But then the economy took a turn for the worse, and Acme’s customers are cutting back on pretty much everything.  Acme’s CFO has announced that if costs aren't reduced quickly, Acme is going to simply run out of cash.  He wants you to head up the analysis of the company’s purchases: where can Acme save?</p>
<p>I look forward to showcasing the functionality in the Datamartist tool that makes it possible to do this without programming, and without requiring database software, developers or servers.  This kind of snapshot, immediate data transformation is what we think will make Datamartist such a cost effective and efficient addition to any serious analyst's toolkit.</p>
<p>This is part of a five-part series. Here are the links to the parts: <a href="/purchasing-data-mart-cutting-costs-with-analysis-part-1">1</a>, <a href="/creating-a-fact-table-with-the-vendor-dimension-purchasing-dm-part-2">2</a>, <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">3</a>, <a href="/hierarchies-and-tree-structures-in-dimensions-an-example-item-dimension-part-4">4</a> and <a href="/joining-the-dimension-table-to-the-fact-table-purchasing-data-mart-part-5">5</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/purchasing-data-mart-cutting-costs-with-analysis-part-1/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
