Category archive for ‘Data profiling’

  • Datamartist V1.3.0 Value Distribution data profiling

    This video gives a quick (under two minutes) look at the Datamartist data profiler's ability to explore the distribution of numeric values in a data set by counting the number of values that fall into a series of equal-size buckets. It highlights Datamartist's calculation, visualization, selection and drill-down features using a simple [...]

  • Why you should data profile.

    Imagine that you have bought a new home, and you've decided to do some landscaping. So you pick three landscapers, draw a rough sketch of what you want, and ask them to bid on the job. But you don't allow them to come see your property, and your sketch doesn't specify anything about the existing [...]

  • Automated data profiling and reporting- Data quality behavioral modification?

    Recently, Jim Harris of Obsessive-Compulsive Data Quality speculated as to whether the concept of the "swear jar" could be used to improve data quality. It was an interesting post, and the discussion in the comments underlined the reality of data quality: much of the time, the problem is not about changing bits in a [...]

  • Data migration- Part 2 – Determining data quality is the first key step

    Any discussion of data migration needs to include data quality as a core topic. Migrating data from one set of applications to another, particularly when the applications were never designed to interact and share little or no common structure or definitions, is a complex task. This task is made even more complex by the data [...]

  • Data quality at the burger joint

    I have noticed that when I go to a fast food outlet, no matter what I get to drink with my meal, it is almost always listed as "Cola" on the receipt. But I didn't order Cola. Ever. Usually I get juice, or milk. So every time I order a burger, I'm clearly a source [...]

  • Data Profiling and Data Completeness

    There are various steps in data analysis; for me the very first one is always "what have we got?". You have a data set, and some broad requests or ideas about what you want to get out of it, but the first question is: how good is the data? In the end, the first thing [...]
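
The Value Distribution entry above describes profiling numeric data by counting how many values fall into a series of equal-size buckets. As a rough illustration of that idea only, here is a minimal Python sketch. It is not Datamartist's implementation; the function name `bucket_counts` and its parameters are hypothetical, and the choice to place the maximum value in the last bucket is an assumption made for this example.

```python
from typing import List, Sequence


def bucket_counts(values: Sequence[float], n_buckets: int) -> List[int]:
    """Count how many values fall into each of n_buckets equal-width buckets.

    Hypothetical helper for illustration; buckets span min(values)..max(values).
    """
    if n_buckets <= 0:
        raise ValueError("n_buckets must be positive")
    counts = [0] * n_buckets
    if not values:
        return counts

    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are identical, so there is no range to split:
        # report everything in the first bucket (an assumption of this sketch).
        counts[0] = len(values)
        return counts

    width = (hi - lo) / n_buckets
    for v in values:
        # The maximum value would land one past the last bucket, so clamp it.
        idx = min(int((v - lo) / width), n_buckets - 1)
        counts[idx] += 1
    return counts


if __name__ == "__main__":
    sample = [0, 1, 2, 3, 4, 5, 6, 7]
    print(bucket_counts(sample, 4))
```

A bucket whose count is far higher or lower than its neighbours is the kind of thing a profiler would let you select and drill into; the sketch stops at the counting step.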