<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Datamartist.com &#187; Access</title>
	<atom:link href="http://www.datamartist.com/tag/access/feed" rel="self" type="application/rss+xml" />
	<link>http://www.datamartist.com</link>
	<description>Reduce cost with self serve data transformation</description>
	<lastBuildDate>Wed, 25 Jan 2012 15:47:34 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.1.3</generator>
		<item>
		<title>MS Access query example and comparison to Datamartist</title>
		<link>http://www.datamartist.com/microsoft-access-query-example-and-comparision-to-datamartist</link>
		<comments>http://www.datamartist.com/microsoft-access-query-example-and-comparision-to-datamartist#comments</comments>
		<pubDate>Tue, 31 Mar 2009 22:59:55 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Business Intelligence Architecture]]></category>
		<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[Data Transformation]]></category>
		<category><![CDATA[Microsoft Excel]]></category>
		<category><![CDATA[MS Access]]></category>
		<category><![CDATA[Access]]></category>
		<category><![CDATA[Data Mart Example]]></category>
		<category><![CDATA[Excel]]></category>
		<category><![CDATA[Personal data mart]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=1321</guid>
		<description><![CDATA[Microsoft Access allows users to create complex queries and analyze large data sets. However, it can be complicated to use compared to Excel. In this post, I'll talk about MS Access queries and the equivalent way to perform the same data transformation in the Datamartist tool, visually and simply. Microsoft Access has a clear role [...]]]></description>
			<content:encoded><![CDATA[<p>Microsoft Access allows users to create complex queries and analyze large data sets.  However, it can be complicated to use compared to Excel.  In this post, I'll talk about <a href="/help-support/tutorials/microsoft-access-examples-and-tutorials">MS Access queries</a> and the equivalent way to perform the same data transformation in the <a href="/product">Datamartist tool</a>, visually and simply.</p>
<p>Microsoft Access has a clear role to play when a small, light database application is required.  However, it has a learning curve, and is not necessarily the best tool for data analysis.</p>
<h2>Product Segmentation Query Example</h2>
<p>Let's look at an example MS Access query or two and see how we can do the same thing in Datamartist, only without the queries and without any SQL. For this example, let's say that we have two sets of sales data from different time periods, and a product list, and we want to define some product segments based on color and price.  We want to get a summary of the sales Qty and average price sold by month, broken out by the new categories, which are as follows:</p>
<ul>
<li> "Red and High Priced" If the product is Red and its minimum price is more than $1000</li>
<li> "Red Low Price wide price range" If the product is Red, has a minimum price less than $1000 but has a min to max price of more than $200</li>
<li> "Red Low Price small price range" If it's Red and not in the first two segments</li>
<li> "Yellow" if the product is yellow. </li>
<li> "Other" for all the rest</li>
</ul>
<p>The three data tables we have are as follows:</p>
<ol>
<li> Sales 03-06 with about 120 000 rows, which contains sales data from 2003 - 2006</li>
<li> Sales 2007  with about 30 000 rows, which contains sales data for 2007</li>
<li> Products  which contains the colors for all the products and their minimum and maximum prices</li>
</ol>
<p>So the first step is to combine the two data tables. In Access, this is done with a UNION query with the following SQL code:</p>
<blockquote><p>select * from [Sales Data 03-06] UNION select * from [Sales Data 2007];</p></blockquote>
<p>In Datamartist, we simply connect the two tables up to a combine block.<br />
<img src="/wp-content/uploads/2009/03/segmentation-example-datamartist-combine1.jpg" alt="segmentation-example-datamartist-combine1" title="segmentation-example-datamartist-combine1" width="264" height="234" class="alignnone size-full wp-image-1394" /></p>
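<p>One caveat on the SQL side may be worth a quick sketch: in Access, as in standard SQL, UNION discards duplicate rows, while UNION ALL keeps every row. If two sales happen to match in every column, a plain UNION would merge them into one. The snippet below is a minimal illustration using Python's built-in sqlite3 module rather than Access itself; the table and column names are hypothetical stand-ins for the sales tables above, not the real data.</p>

```python
import sqlite3

# Hypothetical miniature versions of the two sales tables (not the real data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_03_06 (product_id INTEGER, sale_date TEXT, qty INTEGER);
    CREATE TABLE sales_2007  (product_id INTEGER, sale_date TEXT, qty INTEGER);
    -- Two identical rows: the same product sold twice on the same day.
    INSERT INTO sales_03_06 VALUES (1, '2006-12-31', 5), (1, '2006-12-31', 5);
    INSERT INTO sales_2007  VALUES (2, '2007-01-02', 3);
""")

def count_rows(operator):
    """Count the rows produced by combining both tables with the given operator."""
    sql = (
        "SELECT COUNT(*) FROM ("
        f"SELECT * FROM sales_03_06 {operator} SELECT * FROM sales_2007)"
    )
    return conn.execute(sql).fetchone()[0]

union_count = count_rows("UNION")          # identical rows are collapsed
union_all_count = count_rows("UNION ALL")  # every source row is kept
print(union_count, union_all_count)
```

<p>If the row counts before and after the combine step need to add up, UNION ALL is probably the safer choice for this kind of append.</p>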
<p>Next, we need to define the segmentation. Again, in Access this is done with a query, this time by nesting IIF statements to add a new column called "Product_Segment" to the resulting query.</p>
<blockquote><p>SELECT Products.Product_ID, Products.Product_Name, Products.Product_Group, Products.Product_Category, Products.Product_SubCategory, Products.Shipping_Weight, Products.Color, Products.Price_Min, Products.Price_Max, IIf([Color]="Red" And [Price_Min]>1000,"Red and High Priced",IIf([Color]="Red" And ([Price_max]-[Price_min])>200,"Red Low Price wide price range",IIf([Color]="Red","Red Low Price small price range",IIf([Color]="Yellow","Yellow","Other")))) AS Product_Segment<br />
FROM Products;</p></blockquote>
<p>In Datamartist, we use a segmentation block to do the same thing.  The interface is graphical, and the syntax is the same as you would use in Excel.  There is no need to nest any IF statements, because the overall block is designed to do that.  Here's what the blocks look like: the MS Access import block on the left, and the segmentation rule block on the right.<br />
<img src="/wp-content/uploads/2009/03/segmentation-example-datamartist-segment-block.jpg" alt="segmentation-example-datamartist-segment-block" title="segmentation-example-datamartist-segment-block" width="418" height="211" class="alignnone size-full wp-image-1428" /><br />
Each segment has a statement that defines whether a row is in the segment or not.   The block tests each segment rule in order, starting at the top; the first statement that evaluates to "TRUE" defines the value for the Product_Segment column for that row. Dragging the segments up and down changes the order in which the rules are checked.</p>
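<p>The "first matching rule wins" idea can be sketched in a few lines of Python. This is only an illustration of the logic, not how Datamartist is implemented; the rule list mirrors the segments defined above, and the dictionary keys (color, price_min, price_max) are hypothetical stand-ins for the Products columns.</p>

```python
# Each rule is (segment name, test). Rules are checked top to bottom,
# and the first test that returns True decides the segment.
SEGMENT_RULES = [
    ("Red and High Priced",
     lambda p: p["color"] == "Red" and p["price_min"] > 1000),
    ("Red Low Price wide price range",
     lambda p: p["color"] == "Red" and p["price_max"] - p["price_min"] > 200),
    ("Red Low Price small price range",
     lambda p: p["color"] == "Red"),
    ("Yellow",
     lambda p: p["color"] == "Yellow"),
]

def product_segment(product, rules=SEGMENT_RULES, default="Other"):
    """Return the name of the first rule the product satisfies."""
    for name, test in rules:
        if test(product):
            return name
    return default

# Hypothetical product rows, for illustration only.
expensive_red = {"color": "Red", "price_min": 1200, "price_max": 1500}
cheap_red = {"color": "Red", "price_min": 100, "price_max": 150}
print(product_segment(expensive_red))
print(product_segment(cheap_red))
```

<p>Reordering the list changes the result in the same way that dragging segments up and down does: if the catch-all "Red" rule came first, every red product would land in it.</p>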
<p><a href="/resources/images/Segmentation-Example-Product.jpg" target="_blank" onClick="javascript: pageTracker._trackPageview('/screenshots/Segmentation-Example-Product'); "><img src="/resources/images/Segmentation-Example-Product-Thumb.jpg" /><br />
<span style="padding:8px;">(Click to Enlarge)</span></a></p>
<p>Then we have to Join this new product dimension (with the segmentation column) to the sales data, and summarize.</p>
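<p>For readers who do want to see roughly what that join-and-summarize step looks like in SQL, here is a minimal sketch run through Python's built-in sqlite3 module. It is not the Access query from this example: the table names, column names and rows are hypothetical, and the average is a simple per-row average of price, which is an assumption about what "average price sold" means.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the combined sales data and the
    -- product table with its new Product_Segment column.
    CREATE TABLE sales (product_id INTEGER, sale_date TEXT, qty INTEGER, price REAL);
    CREATE TABLE product_segments (product_id INTEGER PRIMARY KEY, product_segment TEXT);
    INSERT INTO product_segments VALUES (1, 'Yellow'), (2, 'Other');
    INSERT INTO sales VALUES
        (1, '2007-01-05', 2, 10.0),
        (1, '2007-01-20', 4, 20.0),
        (2, '2007-02-01', 1, 50.0);
""")

# Join each sale to its segment, then summarize by month and segment.
summary = conn.execute("""
    SELECT strftime('%Y-%m', s.sale_date) AS sale_month,
           p.product_segment,
           SUM(s.qty)   AS total_qty,
           AVG(s.price) AS avg_price
    FROM sales AS s
    INNER JOIN product_segments AS p ON p.product_id = s.product_id
    GROUP BY sale_month, p.product_segment
    ORDER BY sale_month, p.product_segment
""").fetchall()

for row in summary:
    print(row)
```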
<p>In MS Access, this is done with more queries. Here's what Access looks like when we're done.<br />
<img src="/wp-content/uploads/2009/03/segmentation-example-access-gui1.jpg" alt="segmentation-example-access-gui1" title="segmentation-example-access-gui1" width="450" height="485" class="alignnone size-full wp-image-1405" /><br />
Compare that list of Tables and Queries to the visual, left-to-right layout of the Datamartist data canvas, which does the same thing without ever having to write any SQL code:</p>
<h2>The VISUAL way to do it</h2>
<p><img src="/wp-content/uploads/2009/03/segmentation-example-solved-canvas.jpg" alt="segmentation-example-solved-canvas" title="segmentation-example-solved-canvas" width="406" height="314" class="alignnone size-full wp-image-1403" /></p>
<p><a href="/resources/images/Segmentation-Example-Datamartist-full-app-shot.jpg" target="_blank" onClick="javascript: pageTracker._trackPageview('/screenshots/Segmentation-Example-Datamartist-full-app-shot'); "><img src="/resources/images/Segmentation-Example-Datamartist-full-app-shot-Thumb.jpg" class="alignright size-full wp-image-1430" ></a><br />
In Datamartist you can see the flow of the data, the row counts are clearly displayed, and clicking on the connectors will bring up the underlying data set in the data viewer.  It's clear which block feeds which, and by adding more blocks and connecting them at the desired point in the data flow, new analysis can be created.</p>
<p>Take Datamartist for a trial run:  <a href="/downloads">download it now</a>, because maybe you don't have to learn Microsoft Access queries after all.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/microsoft-access-query-example-and-comparision-to-datamartist/feed</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Spreadmarts and Data Shadow Systems- The Debate</title>
		<link>http://www.datamartist.com/spreadmarts-and-data-shadow-systems-the-debate</link>
		<comments>http://www.datamartist.com/spreadmarts-and-data-shadow-systems-the-debate#comments</comments>
		<pubDate>Wed, 18 Feb 2009 01:13:28 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Business Intelligence Architecture]]></category>
		<category><![CDATA[Cost Reduction]]></category>
		<category><![CDATA[MS Access]]></category>
		<category><![CDATA[Spreadmarts]]></category>
		<category><![CDATA[Access]]></category>
		<category><![CDATA[Excel]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=1017</guid>
		<description><![CDATA[When business users are not getting what they want out of the enterprise business intelligence system they very rarely just give up. Successful business people didn't get where they are by giving up when someone doesn't deliver something; they take things into their own hands and get it done. Knowing this, it's not surprising that [...]]]></description>
			<content:encoded><![CDATA[<p><img src="/wp-content/uploads/2009/02/spreadmarts-another-100-spreadsheets1.jpg" alt="spreadmarts-another-100-spreadsheets1" title="spreadmarts-another-100-spreadsheets1" width="300" height="316" class="alignright size-full wp-image-1043" />When business users are not getting what they want out of the enterprise business intelligence system they very rarely just give up.  Successful business people didn't get where they are by giving up when someone doesn't deliver something; they take things into their own hands and get it done.</p>
<p>Knowing this, it's not surprising that a huge amount of data collection, extraction, and transformation happens in Excel spreadsheets or Access databases that are made without the involvement of (and often under the direct scorn of) the IT department in large companies.  In my previous life I was in the IT department, and I saw some amazing systems generated with hundreds of spreadsheets and databases.  This mix of spreadsheets and databases, created without the involvement of the IT department by power users or external consultants (financed out of departmental budgets), is often referred to as <a href="http://www.doubletongued.org/index.php/citations/spreadmart_1/" target="_blank">Spreadmarts</a> or <a href="http://en.wikipedia.org/wiki/Shadow_system" target="_blank">Shadow Systems</a>.</p>
<p>For an interesting survey on the subject, take a look at <a href="https://www.tdwi.org/research/display.aspx?ID=8874" target="_blank">TDWI's report "Strategies for Managing Spreadmarts: Migrating to a Managed BI Environment".</a>  This report is now a year old, but I'm certain it is as valid as ever.</p>
<p>The title suggests that the solution is managed BI-  I won't get into that right now, but you'll notice the study was sponsored by the likes of Microsoft, Cognos, Microstrategy and SAP- so of course the solution is Big Business Intelligence solutions.</p>
<p>But what's really interesting from the survey is how the different groups within the respondent companies feel about spreadmarts and shadow data systems.  The analysts love them,  the executives are unsure, and IT hates them with a passion.  This makes for an interesting mix.<br />
<img src="/wp-content/uploads/2009/02/position-on-spreadsheets.jpg" alt="position-on-spreadsheets" title="position-on-spreadsheets" width="450" height="301" class="alignnone size-full wp-image-1029" /></p>
<p>This is very much what I've seen in my experience.  IT and the Business are at odds with each other, and senior management is either uninterested or forced to take sides.</p>
<p>Where do I stand?  I'm in the "avoid them if you can" camp when we're talking about a tangle of spreadsheets and undocumented MS Access databases that can be error-prone and time-consuming.  I understand why it's often unavoidable, but I've seen firsthand how painful these systems are to maintain.  </p>
<p>On the other hand, I don't subscribe to the school of thought that says "Excel needs to be eliminated- analysts should use the Business Intelligence systems only, otherwise there will be chaos."  Let's not go overboard.  Excel and spreadsheets are useful tools, and have their place.  Additionally, I really feel for business users who simply can't get what they want from the IT departments.  I used to be the IT department, and it was frustrating to not have the resources available to build what people needed.</p>
<p>As one of the authors of the above report, <a href="http://www.athena-solutions.com/index.shtml" target="_blank">Rick Sherman</a>, said in <a href="http://searchcio.techtarget.com/generic/0,295582,sid182_gci1344289,00.html?asrc=SS_CLA_308990&#038;psrc=CLT_182" target="_blank">a recent podcast</a>:</p>
<blockquote><p>"reality is no matter how many IT folks that you have in your company you're not likely to have enough resources or time to meet every business users reporting or analytical requirements..."</p></blockquote>
<p>He presents what is a refreshingly balanced approach to Excel.  In his <a href="http://datadoghouse.typepad.com/data_doghouse/2009/02/business-intelligencedata-warehousing-emerging-trends-but-not-breakouts-9-for-09.html" target="_blank">predictions for trends in 2009</a>, number 5 is "Excel becomes an accepted tool in a BI portfolio". He points out that this may not be mainstream in 2009, but I hope he's right about the trend.  A pragmatic, inclusive strategy with more power to the people while avoiding the chaotic side of spreadmarts is where the solution is.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/spreadmarts-and-data-shadow-systems-the-debate/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Data mart Data Modelling 101</title>
		<link>http://www.datamartist.com/data-mart-data-modelling-101</link>
		<comments>http://www.datamartist.com/data-mart-data-modelling-101#comments</comments>
		<pubDate>Wed, 24 Sep 2008 03:30:51 +0000</pubDate>
		<dc:creator>James Standen</dc:creator>
				<category><![CDATA[Business Intelligence Architecture]]></category>
		<category><![CDATA[Data Modelling]]></category>
		<category><![CDATA[MS Access]]></category>
		<category><![CDATA[Personal Data Marts]]></category>
		<category><![CDATA[Access]]></category>
		<category><![CDATA[Linking to Excel]]></category>

		<guid isPermaLink="false">http://www.datamartist.com/?p=83</guid>
		<description><![CDATA[Last time we talked about how much data can comfortably be put into an Excel spreadsheet, and I've found that more than a few hundred thousand rows can get awkward.  Plus, certain types of operations are more difficult to automate in Excel (often requiring programming skills with macros or Visual Basic for Applications (VBA)).  So this means [...]]]></description>
			<content:encoded><![CDATA[<p><img class="size-full wp-image-120 alignright" title="datamartjoin" src="/wp-content/uploads/2008/09/datamartjoin.jpg" alt="" width="345" height="343" /><a href="/importing-data-into-excel">Last time </a>we talked about how much data can comfortably be put into an Excel spreadsheet, and I've found that more than a few hundred thousand rows can get awkward.  Plus, certain types of operations are more difficult to automate in Excel (often requiring programming skills with macros or Visual Basic for Applications (VBA)).  So this means we need an alternative to Excel, something to let us clean up and summarise or split our data so it can be exported into Excel to generate reports and analysis.  For this example, I'll use one of the better alternatives currently available: Microsoft Access. Very soon, I'll show you an upcoming alternative: nModal's Datamartist.</p>
<p>What is our goal? To build a data mart.  What is a data mart?  A data mart is data that has been formatted for ease of analysis, and that contains the information that an analyst needs, even if that information was not in the original system or at least not in a format that is easy to use.  The best way to model a data mart is to build it using two types of tables.</p>
<ol>
<li>Data - the FACTS - which define the who, what, where, and when of the data.</li>
<li>Definitions - the DIMENSIONS - which describe the various things that are found in the Facts.</li>
</ol>
<p>An example would be in order.  Let's say we have some sales data that we are analysing.  Ideally, we want to get this information in a format where the FACTS are that on date A, we sold product Y to customer Z with a quantity and a price.  So in MS Access the columns might be:<a href="/wp-content/uploads/2008/09/salesfactrows.jpg"><img class="alignright size-full wp-image-96" title="salesfactrows" src="/wp-content/uploads/2008/09/salesfactrows.jpg" alt="" width="528" height="176" /></a></p>
<p>Note that, where we can, we use unique identifiers to define the who, what, where, and when. For example, we don't use the customer's name (how many John Smiths are there?) but a unique customer number or ID.  Then we need three definition sets (date, product, and customer) that match these IDs.  By the way, creating these IDs when they don't exist, and cleaning up the duplicates, is one of the challenges; if you already have a clean set of customer numbers and data then you are way ahead.</p>
<p>This is the key in terms of the data model: Sales_FACT holds JUST THE FACTS, and there is a table for each definition or DIMENSION that holds the definition of each unique product or customer just once.  This way we have consistent information throughout the analysis: if something about a product changes we don't have to change it in every row of the FACTS, we only have to change the single entry in, say, the product DIMENSION. The Product and Customer dimension tables might look like this (again in Access):</p>
<p style="text-align: center;"><a href="/wp-content/uploads/2008/09/productdimensionrows3.jpg"><img class="aligncenter size-full wp-image-118" title="Product Dimension Rows" src="/wp-content/uploads/2008/09/productdimensionrows3.jpg" alt="" width="387" height="129" /></a> </p>
<p style="text-align: center;"> <a href="/wp-content/uploads/2008/09/customerdimensionrows1.jpg"><img class="aligncenter size-full wp-image-119" title="Customer Dimension Rows" src="/wp-content/uploads/2008/09/customerdimensionrows1.jpg" alt="" width="401" height="129" /></a></p>
<p>The final step in analysis is to combine the FACT and the DIMENSION tables to form a table or view that has all the information you need to slice and dice. Using a Pivot table in Excel is often a good way to do this, or simply autofilters etc.  But first, in Access, it is necessary to create a query.  This can be done by selecting "create" on the ribbon control, then design view, and showing the three tables we've created: the Sales_FACT, Product_DIMENSION and Customer_DIMENSION.  Now it's necessary to JOIN these tables together using the ID fields. This is done by dragging the Product_ID field from the Product_DIMENSION table and dropping it on the Product_ID field on the Sales_FACT table, and then dragging the Customer_ID field from the Customer_DIMENSION table and dropping it on the Customer_ID field on the Sales_FACT table.</p>
<p> This results in a query that returns all the rows in the table Sales_FACT with the product and customer information looked up from the two dimension tables and added in.</p>
<p>If anyone cares- the Structured Query Language (SQL) that is generated by Access was as follows:</p>
<blockquote><p>SELECT Sales_FACT.[Sales Date], Sales_FACT.Quantity, Sales_FACT.Price, Product_DIMENSION.Product_Name, Product_DIMENSION.Product_Category, Customer_DIMENSION.Cust_Name, Customer_DIMENSION.Cust_Email, Sales_FACT.Product_ID, Sales_FACT.Customer_ID<br />
FROM Customer_DIMENSION INNER JOIN (Product_DIMENSION INNER JOIN Sales_FACT ON Product_DIMENSION.Product_ID = Sales_FACT.Product_ID) ON Customer_DIMENSION.Customer_ID = Sales_FACT.Customer_ID;</p></blockquote>
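<p>To try the same kind of join outside Access, here is a minimal sketch using Python's built-in sqlite3 module. The table names follow the ones above, but the rows are made up, only a few columns are included, and Sales_Date stands in for the [Sales Date] column to avoid quoting. It also shows the point made earlier about dimensions: renaming a product means changing one dimension row, and every joined fact row picks up the change.</p>

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table names follow the post; the rows and the reduced column set are made up.
    CREATE TABLE Product_DIMENSION (Product_ID INTEGER PRIMARY KEY, Product_Name TEXT);
    CREATE TABLE Customer_DIMENSION (Customer_ID INTEGER PRIMARY KEY, Cust_Name TEXT);
    CREATE TABLE Sales_FACT (
        Sales_Date TEXT, Product_ID INTEGER, Customer_ID INTEGER,
        Quantity INTEGER, Price REAL
    );
    INSERT INTO Product_DIMENSION VALUES (1, 'Widget'), (2, 'Gadget');
    -- Two different customers who share a name, hence the unique IDs.
    INSERT INTO Customer_DIMENSION VALUES (10, 'John Smith'), (11, 'John Smith');
    INSERT INTO Sales_FACT VALUES
        ('2008-09-01', 1, 10, 3, 9.99),
        ('2008-09-02', 1, 11, 1, 9.99),
        ('2008-09-03', 2, 10, 2, 24.50);
""")

STAR_JOIN = """
    SELECT f.Sales_Date, p.Product_Name, c.Cust_Name, f.Quantity, f.Price
    FROM Sales_FACT AS f
    INNER JOIN Product_DIMENSION AS p ON p.Product_ID = f.Product_ID
    INNER JOIN Customer_DIMENSION AS c ON c.Customer_ID = f.Customer_ID
    ORDER BY f.Sales_Date
"""

before = conn.execute(STAR_JOIN).fetchall()

# Renaming a product touches one dimension row, not every fact row.
conn.execute("UPDATE Product_DIMENSION SET Product_Name = 'Widget Pro' WHERE Product_ID = 1")
after = conn.execute(STAR_JOIN).fetchall()

print([row[1] for row in before])
print([row[1] for row in after])
```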
<p>This table structure is a very simple example of a <a href="http://en.wikipedia.org/wiki/Star_schema" target="_blank">STAR SCHEMA</a>- the basic data model used by respectable data marts and data warehouses everywhere.</p>
<p>This is all well and good if you happen to have the data in the right format, but usually the data is stored (or created) in the source system using very different data models than this simple star schema.  This is done to allow the source system to do transactions (both reading and writing data) quickly.  The star schema is excellent for analysing data, but not effective for modifying it. </p>
<p>In the end this is the key as to why we need data transformation tools.  Data in the source systems is often in the wrong format for analysis, or has quality issues that, while perhaps not ideal, do not actually cause the transactional system to fail, so may not get addressed, but will radically affect your analysis.  The transformation tool resolves these issues, and separates the data into the Fact, Dimension structure.  This can be done with Access, although if there are multiple transformations, fairly complex queries or scripts must be created.  Stay tuned to this channel: Datamartist will show you some <a href="/connecting-the-dimension-table-to-the-fact-table-vendor-example-part-3">tricks</a> without having to work through all the INNER JOIN SQL statements that tools like Microsoft Access need to use...</p>
]]></content:encoded>
			<wfw:commentRss>http://www.datamartist.com/data-mart-data-modelling-101/feed</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

