Lanciaux Maxime | BI | DWH | Hadoop | DevOps | Google Cloud | DataOps | PostgreSQL

PostgreSQL, BI, DWH, Hadoop, DevOps, DataOps, Machine Learning, Cloud and other topics!

Friday, June 10, 2016

My Hadoop is not efficient enough, what can I do?

1. Review your memory configuration to maximize CPU utilisation
2. Review your YARN settings, especially the Capacity Scheduler
3. Review your application design: parameters used, join strategy, file format

And of course, keep checking your Ganglia / Ambari Metrics while you do it, voilà!
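
To make the first point concrete, here is a rough sketch of the arithmetic, not a tuning tool. It assumes a worker node whose YARN memory and vcores correspond to `yarn.nodemanager.resource.memory-mb` and `yarn.nodemanager.resource.cpu-vcores`, and it checks whether a given container size leaves cores idle. The node figures (96 GB, 32 vcores) and the names `NodeResources` and `containers_per_node` are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeResources:
    memory_mb: int  # memory handed to YARN on one worker (hypothetical figure)
    vcores: int     # vcores handed to YARN on one worker (hypothetical figure)


def containers_per_node(node: NodeResources, container_mb: int,
                        container_vcores: int = 1) -> dict:
    """Estimate how many containers fit on a node and which resource runs out first."""
    by_memory = node.memory_mb // container_mb
    by_cpu = node.vcores // container_vcores
    containers = min(by_memory, by_cpu)
    if by_memory < by_cpu:
        limited_by = "memory"
    elif by_cpu < by_memory:
        limited_by = "cpu"
    else:
        limited_by = "balanced"
    return {
        "containers": containers,
        "limited_by": limited_by,
        # Share of the node's vcores that running containers can actually use.
        "cpu_utilisation": containers * container_vcores / node.vcores,
    }


node = NodeResources(memory_mb=98304, vcores=32)

# 8 GB containers: memory runs out after 12 containers, 20 vcores stay idle.
print(containers_per_node(node, container_mb=8192))

# 3 GB containers: 32 containers fit, one per vcore.
print(containers_per_node(node, container_mb=3072))
```

If memory is the limit and the cores sit idle, smaller containers (when the jobs can live with less memory) or more memory given to YARN are the usual levers. If CPU is the limit, the memory left over is simply unused.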

PS: For those who don't trust multi-tenant Hadoop clusters, please call me ;-)
Labels: Administration, Architecture, Best practice, Hadoop, Hadoop 2.0, Hive, MapReduce, Optimisation, performance, Pig, Quality, Spark, SQL, Standards, Useful
Location: Manor Park, Banbury OX16 3TB, United Kingdom
