Tag archive for ‘Data Quality’

  • Small data in the age of Big Data

    Just a quick thought on the subject of small data. You don't hear much about that in the press these days: everything is BIG BIG BIG. But small (and remember, small is still hundreds of thousands or millions of rows) is still critically important. I hope I'm stating the obvious here, but with some of the […]

  • Exact isn’t everything- Surf your data!

    Sometimes an analyst needs to take off the accountant's hat, forget the urge to chase down every last penny, and instead put on their surfing gear, grab the data surfboard (i.e. their set of preferred data tools), and just surf some data. There are some cases where "Exact" is the only acceptable level of […]

  • Data quality monitoring and reporting

    In the vast majority of cases, useful data sets are not static, but are being updated, added to and purged constantly. Data quality monitoring aims to provide data quality information that is also being constantly updated, and can be used to detect issues quickly, before the bad data piles up. Don't let those bad records […]

  • A new years resolution to data profile

    Well, it's the time of making and breaking resolutions, a time when setting realistic goals is sometimes hard to do with all the optimism of the new year. Sometimes, we make decisions NOT to set a goal, because we don't want to break it. You might be thinking you really should step up your data […]

  • Data Quality Rules

    What's the difference between good data and bad data? It is much like the difference between good children and bad children: the bad data doesn't follow the rules. But what are the rules? Unlike the rules for kids, which have been fixed in stone for decades (or at least, parents wish it were so), the […]

  • Data quality sizzle

    I'm an engineer. Being an engineer, I'm pretty product focused, pretty technology focused, and pretty "does it work or not" focused. Having technical things like tools work is useful, and good. But just because you build it does not mean they will come. The challenge in Data Quality is that often what has to […]

  • Data quality challenges: behavioral inertia and its evil opposite

    Often, I hear someone say something like "this would be much easier if users would just..." or "If only we could convince the sales people that...". Technology folks are often frustrated by the people component of the complex systems they are trying to install. People are not a problem solved by technology. Some try to […]

  • An introduction to using regular expressions for data quality validation

    Regular expressions (sometimes referred to as regex or regexp) are a powerful formal language that can be used to match text strings to patterns. The way regular expressions work is like this: A pattern is defined. This is a string of symbols that act as a set of rules. A text string to test, and […]

  • When should you data profile? Morning, Noon and Night!

    Data profiling is an important part of any data-related project. The question often arises of when the best time to data profile is. As you would expect from a software company that sells a really cool visual data profiling tool, our view is "all the time". Using data profiling tools before the project: Data profiling […]

  • Too much data storage hurts data quality- the toothpaste effect

    When I brush my teeth there is a wide range in the amount of toothpaste that is acceptable to me. This is not a profound statement, but bear with me. Only as the tube of toothpaste starts getting near to its end do I start conserving toothpaste because I know I need to make it […]