Showing posts with label Datawarehouse. Show all posts

Monday, November 16, 2020

Data Mesh on Google Cloud Platform (and this is excellent !)

Hello from Home 🏡,

Quick update: I am leading the creation of a new deck explaining the Data Mesh architecture and why Google Cloud Platform is an amazing fit for this approach:
  • GCP Project per Data Domain / Team (<-> direct Data Mesh mapping 👌)
  • Serverless
  • Pay as you go
  • Google APIs
  • Looker (esp. Looker API and the semantic layer)
  • Scalability
  • BigQuery / Cloud Storage / Cloud Pub/Sub
  • Ephemeral Hadoop cluster (Dataproc)
  • IAM
  • Cloud Source Repositories / Cloud Build
This [new] architecture is not a huge revolution (and this is great): it builds on 40+ years of data platform innovation and follows the same approach as microservices and Kubernetes.
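To make the "one GCP project per data domain" idea a bit more concrete, here is a minimal Python sketch. It is only an illustration: the domain names, team names, and project IDs are hypothetical, and the only GCP fact it relies on is that BigQuery addresses a table as `project.dataset.table`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataDomain:
    """One Data Mesh domain, owned by one team and mapped to one GCP project."""

    name: str        # hypothetical domain name
    owner_team: str  # hypothetical team identifier
    project_id: str  # hypothetical GCP project ID dedicated to this domain

    def table_ref(self, dataset: str, table: str) -> str:
        # BigQuery addresses tables as project.dataset.table, so the
        # project part alone tells a consumer which domain owns the data.
        return f"{self.project_id}.{dataset}.{table}"


# Hypothetical domains; none of these names come from a real deployment.
DOMAINS = {
    d.name: d
    for d in (
        DataDomain("sales", "sales-analytics", "acme-sales-data"),
        DataDomain("logistics", "supply-chain", "acme-logistics-data"),
    )
}

print(DOMAINS["sales"].table_ref("orders", "daily_orders"))
```

Because each domain has its own project, access (IAM) and billing can be scoped per domain team, which is what makes the mapping to Data Mesh so direct.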



Stay Data Mesh tuned !

Friday, October 17, 2014

Hive development !

Hive 0.14 now supports ACID (atomicity, consistency, isolation and durability) transactions, which brings:
  • UPDATE, DELETE
  • BEGIN, COMMIT, ROLLBACK (planned; in 0.14 itself every statement still auto-commits)
  • INSERT ... VALUES
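As a rough sketch of what these statements look like, the small Python helper below just builds the HiveQL strings. The table name and columns are hypothetical. It assumes the Hive 0.14 requirements as I understand them: an ACID table has to be bucketed, stored as ORC, and flagged `transactional`, and the transaction manager has to be enabled in the Hive configuration. It leaves out BEGIN / COMMIT / ROLLBACK because, as far as I know, each statement auto-commits in 0.14.

```python
def hive_acid_statements(table: str = "customers") -> list[str]:
    """Return HiveQL statements exercising the Hive 0.14 ACID features.

    The table name and its columns are hypothetical examples.
    """
    return [
        # ACID tables must be bucketed, stored as ORC and marked transactional.
        f"CREATE TABLE {table} (id INT, name STRING, city STRING) "
        "CLUSTERED BY (id) INTO 4 BUCKETS STORED AS ORC "
        "TBLPROPERTIES ('transactional'='true')",
        # INSERT ... VALUES: row literals instead of INSERT ... SELECT.
        f"INSERT INTO TABLE {table} VALUES (1, 'Alice', 'Paris'), (2, 'Bob', 'Lyon')",
        # UPDATE touches a non-bucketing column (the bucketing column cannot be updated).
        f"UPDATE {table} SET city = 'Nantes' WHERE id = 2",
        # DELETE removes rows matching the predicate.
        f"DELETE FROM {table} WHERE id = 1",
    ]


if __name__ == "__main__":
    # Print a script that could be pasted into a Hive session.
    print(";\n".join(hive_acid_statements()) + ";")
```

The strings can be sent through whatever Hive client you already use; the helper itself does not connect to anything.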
Stinger.next will bring more SQL compliance (non-equi joins, more sub-queries, materialized views and others), and Apache Optiq (since renamed Apache Calcite) is bringing cost-based optimization to improve query performance.

This is really impressive !

Monday, March 31, 2014

Teradata & Hadoop !

Teradata and Hadoop interact well together, especially inside the Teradata Unified Data Architecture (UDA) with its InfiniBand interconnect. To decide which platform to use and when, look at your needs, where the largest data volume sits, and each platform's capabilities.

If you want to transfer data, you can consider: