Hadoop is evolving very fast, and you will sometimes run into bugs. Be sure to check the known bugs for your version and component.
PostgreSQL, BI, DWH, Hadoop, DevOps, DataOps, Machine Learning, Cloud and other topics !
Showing posts with label SolR. Show all posts
Monday, March 16, 2015
There are bugs, but that is normal life !
Labels: Architecture, Bug, Flume, Hadoop, Hadoop 2.0, Hive, Kafka, Mahout, Pig, SolR, Spark, Task, Useful
Location: Antony, France
Tuesday, December 30, 2014
Collaborative datalake !
It is the holidays now, so let's relax a little and imagine some fun things !
Why not a collaborative datalake, based on Hadoop and web technologies, that lets users share both their datasets and the history of the code used to create them ? I would add a voting system too !
Let's see ;-)
Labels: Architecture, BI, Business Intelligence, DataFlow, development, ETL, Hadoop, Hadoop 2.0, Hbase, Hive, IoT, Java, Pig, SolR, Spark, Standards
Location: Tainan, Taiwan
Wednesday, July 23, 2014
My Hadoop is not working, what can I do ?
Keep calm and ;-)
- First, check your logs
- Is the service running ? (netstat -nat | grep ...)
- Is it reachable ? (telnet ip port)
- Is there a problem with paths, Java libraries, environment variables or exec ?
- Am I using the correct user ?
- What security system is in place ?
- Are the nodes well synchronized ?
- What about memory issues ? (swap should also be disabled)
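Two of the checks above (reachability and swap) are easy to script. Here is a minimal sketch in Python, assuming a Linux node; the function names `is_port_open` and `swap_total_kb` are made up for this illustration and are not part of any Hadoop tool.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (same idea as `telnet ip port`)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timeout, unknown host, ... all mean "not reachable".
        return False


def swap_total_kb(meminfo_text: str) -> int:
    """Read SwapTotal (in kB) from the text of /proc/meminfo; 0 means swap is disabled."""
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            return int(line.split()[1])
    raise ValueError("SwapTotal not found")
```

On a node you could call `is_port_open(ip, port)` with the address and port of the service you are checking, and `swap_total_kb(open("/proc/meminfo").read())` to see whether swap is still enabled. This only covers the network and memory items; logs, users and security still need a human eye.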
Monday, February 10, 2014
HDP [>] 2.1 natively available applications !
Stack components :
- MapReduce (API v1 & v2) : software framework for processing vast amounts of data
- Tez : more powerful framework for executing DAGs (directed acyclic graphs) of tasks
- HOYA (HBase on YARN) : distributed, column-oriented database
- Accumulo : (Linux only) sorted, distributed key / value store
- Hue : web application interface for the Hadoop ecosystem (Hive, Pig, HDFS, ...)
- HDFS : Hadoop Distributed File System
- WebHDFS : interact with HDFS over HTTP (no client library needed)
- WebHCat : interact with HCatalog over HTTP (no client library needed)
- YARN : Yet Another Resource Negotiator, allows more applications to run on Hadoop
- Oozie : workflow / coordination system
- Mahout : machine learning libraries which use MapReduce for computing
- Zookeeper : centralized service for maintaining configuration information, naming, providing distributed synchronization, and providing group services
- Flume : data ingestion and streaming tool
- Sqoop : import data from and export data to databases
- Pig : scripting platform for analyzing large data sets
- Hive : tool to query the data using a SQL-like language
- SolR : platform for indexing and search
- HCatalog : metadata management service
- Ambari : set up, monitor and configure your Hadoop cluster
- Phoenix : SQL layer over HBase
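To illustrate the "no library needed" point for WebHDFS: a request is just a URL of the form `http://<host>:<port>/webhdfs/v1/<path>?op=<OPERATION>`. Here is a minimal sketch in Python that builds such a URL; the helper name `webhdfs_url` and the host used in the example are made up, and the port depends on your cluster (50070 is a common NameNode HTTP port on Hadoop 2).

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS REST URL for an absolute HDFS path and an operation name."""
    if not path.startswith("/"):
        raise ValueError("path must be an absolute HDFS path")
    # The operation goes in the `op` query parameter, followed by any extra parameters.
    query = urlencode({"op": op, **params})
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Hypothetical host, for illustration only.
LIST_URL = webhdfs_url("namenode.example.com", 50070, "/tmp/data", "LISTSTATUS")
```

The resulting URL can then be opened with any HTTP client (`curl`, a browser, `urllib.request.urlopen`), and for `LISTSTATUS` the answer comes back as JSON. Security settings (Kerberos, Knox) may require more than a plain GET, so take this as a starting point only.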
Components being developed / integrated :
- Spark : in-memory engine for large-scale data processing
- Falcon : data management framework
- Knox : single point of secure access for Apache Hadoop clusters (uses WebHDFS)
- Storm : distributed realtime computation system
- Kafka : publish-subscribe messaging system
- Giraph : iterative graph processing system
- OpenMPI : high performance message passing library
- S4 : stream computing platform
- Samza : distributed stream processing framework
- R : software programming language for statistical computing and graphics
Labels: Architecture, BI, Business Intelligence, development, Hadoop, Hadoop 2.0, Hbase, Hive, Java, Kafka, linux, Mahout, Pig, Real Time, SolR, Spark, Stinger, Useful
Location: Istanbul, Turkey
Friday, October 18, 2013
Hadoop 2.0 !
Apache Hadoop 2.0 was released a few days ago ! Hadoop is no longer only a MapReduce container but a container for multiple data frameworks, and it provides High Availability, HDFS Federation, NFS access and snapshots !
Wednesday, May 8, 2013
SolR & ElasticSearch !
SolR and ElasticSearch are both great ways to add search capabilities (and more) to your projects. And behind both of them, there is Lucene !