It may sound crazy to some of you, but I think it is time for our IT platforms to become smarter and to start doing some data management, data preparation, and analytics by themselves, or at least to make more suggestions.
On most of our BI/analytics projects we spend around 70% of the time on data preparation. We humans need time to understand the data, the way it is produced, how to ingest and transform it into a structure, and how to enrich it with the semantics and standards our business users and reporting need.
What if I dedicated 20% of the space/CPU of my [Hadoop] platform to this? What if my platform knew some heuristics and made some assumptions? What if I had an embedded portal [Ambari view] which showed metrics about the data and asked questions like:
- These files seem to be received as a daily full dump, do you agree?
- This dataset can be mapped to this schema: <CREATE TABLE .... schema> (a schema-suggestion sketch follows this list)
- This column contains 20% NULLs, especially when <column name> = "October"; do you want to create a rule? (see the profiling sketch below)
- This file contains 45,000 lines on average, +/- 5%, except for 3 days (also covered in the profiling sketch)
- <column name> can be used to join these two tables; the match rate will be 74% (see the join-discovery sketch below)
- This column can be predicted using <analytics cube name> with 70% accuracy; the best model is <model name> and the top variables are <list of variable names>. Can you think of a new variable? (see the prediction sketch below)
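To make the schema suggestion concrete, here is a minimal sketch in Python with pandas, assuming a hypothetical CSV landing file and Hive as the target. The file path, table name, and type mapping are all my own illustrative assumptions:

```python
# A minimal sketch of schema suggestion: profile a sample of a landing file
# and propose a Hive CREATE TABLE. Path and table name are hypothetical.
import pandas as pd

# Deliberately small mapping from pandas dtypes to Hive column types.
HIVE_TYPES = {"int64": "BIGINT", "float64": "DOUBLE",
              "bool": "BOOLEAN", "datetime64[ns]": "TIMESTAMP"}

def suggest_create_table(path: str, table: str) -> str:
    sample = pd.read_csv(path, nrows=10_000)  # profile a sample, not the whole file
    cols = [f"  `{name}` {HIVE_TYPES.get(str(dtype), 'STRING')}"
            for name, dtype in sample.dtypes.items()]
    return f"CREATE TABLE {table} (\n" + ",\n".join(cols) + "\n);"

print(suggest_create_table("/landing/sales/2016-10-01.csv", "staging.sales"))
```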
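The NULL-rate and line-count questions boil down to simple profiling statistics. A sketch of both checks, again with pandas; the column names, the 20% NULL threshold, and the 5% tolerance are illustrative assumptions, not anything the platform prescribes:

```python
# Two profiling heuristics from the list above: the conditional NULL-rate
# check and the daily line-count check.
import pandas as pd

def null_rate_by_group(df: pd.DataFrame, target: str, by: str) -> pd.Series:
    """NULL rate of `target` for each value of `by`, highest first."""
    return (df[target].isna()
              .groupby(df[by])
              .mean()
              .sort_values(ascending=False))

def flag_unusual_days(daily_counts: pd.Series, tolerance: float = 0.05) -> pd.Series:
    """Days whose line count deviates more than `tolerance` from the mean."""
    mean = daily_counts.mean()
    return daily_counts[(daily_counts - mean).abs() > tolerance * mean]

df = pd.read_csv("/landing/sales/2016-10.csv")       # hypothetical dataset
rates = null_rate_by_group(df, target="amount", by="month")
if rates.iloc[0] > 0.20:
    print(f"'amount' contains {rates.iloc[0]:.0%} NULLs when month = {rates.index[0]!r},"
          " do you want to create a rule?")

daily = df.groupby("load_date").size()               # lines received per day
print(flag_unusual_days(daily))                      # e.g. "except for 3 days"
```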
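Join discovery can start from a brute-force scan of column pairs: for every pair with the same inferred type, measure how many values of one table find a match in the other. A sketch, with hypothetical table and column names:

```python
# Join-suggestion heuristic: report column pairs whose match rate clears
# a threshold. Inputs and the 0.5 threshold are illustrative.
import pandas as pd

def match_rate(left: pd.Series, right: pd.Series) -> float:
    """Fraction of non-null values in `left` that also appear in `right`."""
    left = left.dropna()
    return left.isin(set(right.dropna())).mean() if len(left) else 0.0

def suggest_joins(a: pd.DataFrame, b: pd.DataFrame, threshold: float = 0.5):
    """Yield (col_a, col_b, rate) for same-typed pairs above `threshold`."""
    for ca in a.columns:
        for cb in b.columns:
            if a[ca].dtype == b[cb].dtype:
                rate = match_rate(a[ca], b[cb])
                if rate >= threshold:
                    yield ca, cb, rate

orders = pd.read_csv("/landing/orders.csv")          # hypothetical inputs
clients = pd.read_csv("/landing/clients.csv")
for ca, cb, rate in suggest_joins(orders, clients):
    print(f"{ca} <-> {cb}: the match rate will be {rate:.0%}")
```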
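And the last suggestion is a small auto-modelling loop: try to predict one column from the others, then report the accuracy and the top variables. A sketch using scikit-learn (my choice of library; the target column and encoding are also assumptions):

```python
# Predict one column from the others and surface accuracy + top variables.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

df = pd.read_csv("/landing/sales/2016-10.csv")       # hypothetical dataset
target = "payment_type"                              # column to predict

# Naive one-hot encoding; good enough for a suggestion, not for production.
X = pd.get_dummies(df.drop(columns=[target])).fillna(0)
y = df[target]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test))
top = (pd.Series(model.feature_importances_, index=X.columns)
         .nlargest(3).index.tolist())
print(f"'{target}' can be predicted with {accuracy:.0%} accuracy; "
      f"top variables are {top}. Can you think of a new variable?")
```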