August 04, 2014

The Marketing of Data Blending: Is It as Simple as All That?

The automatic data blending features in a number of new or updated BI and visualization tools offer to bring complex analytics to the ubiquitous “Non-technical Business User” without the need for data scientists. Is it as simple as all that?

One telling sign is that the vendors' examples and demos use fairly simple data, clearly already cleaned up to join easily. With those examples, one often could have reached the same conclusions simply by looking at the data in a table.

Still, some people simply work better with visuals than with tables of numbers. So there is potential for big value in end user tools.

But there is also a big “Only” in vendor statements like “only requires user intervention to resolve conflicts.” Pulling together data from disparate systems, even when they are modeling the same entities, is rife with conflict.

Having “manually blended” data myself many times over the years, I know the drill: export from a variety of sources; upload into Access or SQL Server; change data types to make them consistent; use SQL queries to join, sort, filter and explore; finally, generate charts. I can tell you it is easy to get the wrong results when you are not careful with your joins. I believe that if the source data is clean and obvious enough, automated blending tools can do a good job with this. But there’s the rub: the data isn't always clean and obvious. And the non-technical business user isn’t in a position to know it.
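To make the join risk concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table names, columns, and figures are all hypothetical, invented purely for illustration; the only assumption is that one source lists a customer twice, which is the sort of thing that happens when exports from two systems get combined.

```python
import sqlite3

# Hypothetical data: two orders, and a customer list in which C1
# appears twice (say, once from each of two source systems).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (customer_id TEXT, amount REAL);
    CREATE TABLE customers (customer_id TEXT, region TEXT);
    INSERT INTO orders VALUES ('C1', 100.0), ('C2', 50.0);
    INSERT INTO customers VALUES ('C1', 'East'), ('C1', 'East'), ('C2', 'West');
""")

# The real revenue total, straight from the orders table.
true_total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# A careless join: the duplicated C1 row matches the C1 order twice,
# so that order is counted twice in the sum.
naive_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id
""").fetchone()[0]

# One possible fix: de-duplicate the lookup side before joining.
deduped_total = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN (SELECT DISTINCT customer_id, region FROM customers) c
      ON o.customer_id = c.customer_id
""").fetchone()[0]

print(true_total, naive_total, deduped_total)
```

The careless join runs without any error or warning and simply reports a larger total than the orders table holds. Someone who knows the data will question the number; someone who doesn't will likely put it on a chart.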

To be clear, I’m not saying there isn't a place for these tools. I just believe that they still take a more “data aware” user to do the blending. In other words, they are great for analysts to do ad-hoc research projects and to figure out ETL jobs.

In fact, that’s the way we've ended up deploying them. Analysts use Tableau and Datameer to build dashboards for business users without having to make a data mart first. This is a great benefit because users get the visualizations in hand and can make change requests. Over time, this iterative period winds down and we have a very good model encoded in the blending setups, should we choose to implement a structured data warehouse or mart.

I really like these tools in the right hands. I simply believe the technology hasn’t caught up with the marketing yet.
