Data, Data Everywhere, but not a report to read

It is truly astounding how much data storage there is in the world. IDC claims in one of its reports that almost 300 exabytes of storage capacity exist. I don’t know if they included my USB key in there or not, but that’s still a lot.

Now of course, the majority of that is photos and video, but there are a few bytes of good old data in there too.

In my opinion, data is like commuter traffic. If you make a bigger road, the space is available, so more subdivisions are built. The result: the road is full and traffic is worse than ever.

Cheap, plentiful storage is the equivalent of a 300-lane highway: we’ve built it, and our applications have consumed it just as fast. BECAUSE the storage is available (or less expensive), the designers of the systems choose to store more (at higher resolutions, with more detail). “You never know when we might need it.”

But just as the road ends up full, so does the storage. Has anyone out there been told that even though storage is “cheap”, it’s going to cost another million dollars to avoid archiving your history?

But analysis is ABOUT history, isn’t it? It’s about finding patterns, comparing different aspects of our company’s performance and finding those magic relationships that let us dump the unprofitable, focus on the money makers, and marshal our resources in the most effective way.

So what’s between us and these oceans of data? A bewildering number of things.

  • System structure and incompatibility of technologies/applications
  • Data structure and definition differences
  • Data Quality issues
  • Lack of resources to do what’s needed to make it useful (technological, technical and analytical)
  • etc. etc.

In my experience it’s often the “etc. etc.” that is the tricky part.

There are, however, ways to deal with all of these obstacles. I’m going to go through some examples and techniques that make it possible to turn raw data, with all its faults, into useful information ready for analysis, preferably without needing a degree in computer science to do it.

I would also be pleased to take real-life examples (names and data changed to protect the innocent) to illustrate the points, should any readers care to submit them. Free analysis assistance is on offer here, people: just send me a note at datasos@datamartist.com and, if your problem is something I can help with and write about, I’ll take a crack at it.

Meanwhile I’m enjoying my new RAM. Since I’m running a 32-bit Windows operating system I can only use about 3.2 GB of it, but it’s great. In-memory analysis is the future if you ask me.
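
For the curious, here is a rough sketch of where a ceiling like that comes from. A 32-bit address can name at most 2^32 bytes, which is 4 GiB, and on a typical 32-bit desktop system part of that range is reserved for hardware such as the graphics card rather than RAM. The 3.2 figure below is simply the number quoted above, and the function name is made up for illustration.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def addressable_bytes(address_bits: int) -> int:
    """Return how many distinct byte addresses fit in the given number of bits."""
    return 2 ** address_bits


# Upper bound on what a 32-bit address can reach.
total_gib = addressable_bytes(32) / GIB

# Assumption: the "about 3.2" usable figure quoted in the post.
usable_gib = 3.2

# Whatever is left over is address space not available as RAM.
reserved_gib = total_gib - usable_gib

print(f"Addressable: {total_gib:.1f} GiB")
print(f"Usable:      {usable_gib:.1f} GiB")
print(f"Reserved:    {reserved_gib:.1f} GiB")
```

The exact amount reserved varies from machine to machine, so treat the split as an approximation rather than a fixed rule.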
