Before, I was not convinced that Open Data could bring more value to my project. Recently, using only Open Data, I was able to build an efficient model to predict the dengue rate in Brazil with the Least Angle Regression algorithm. To do so, we used weather data (wind, temperature, precipitation, thunder / rain rates, ...), altitude, location, urbanization, Twitter / Wikipedia frequency and custom variables (mostly lags).
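The "custom variables (mostly lags)" mentioned above can be illustrated with a minimal sketch. It is not the code used in the project: the function name `add_lag_features`, the column names and the toy values are all made up for illustration. The idea is simply that each row receives the value a variable had k periods earlier, so a regression (Least Angle Regression or any other) can use past weather to explain the current dengue rate.

```python
from typing import Dict, List, Optional, Sequence

Row = Dict[str, Optional[float]]


def add_lag_features(rows: Sequence[Row], column: str, lags: Sequence[int]) -> List[Row]:
    """Return copies of `rows` with extra `<column>_lag<k>` fields.

    `rows` must be sorted by time, oldest first. When there is not enough
    history for a given lag, the new field is set to None.
    """
    result: List[Row] = []
    for i, row in enumerate(rows):
        new_row = dict(row)  # copy, so the input rows are left untouched
        for k in lags:
            past = i - k
            new_row[f"{column}_lag{k}"] = rows[past][column] if past >= 0 else None
        result.append(new_row)
    return result


# Toy, made-up weekly values (not real data), oldest week first.
weekly = [
    {"week": 1, "precipitation_mm": 12.0},
    {"week": 2, "precipitation_mm": 40.0},
    {"week": 3, "precipitation_mm": 5.0},
    {"week": 4, "precipitation_mm": 22.0},
]

for row in add_lag_features(weekly, "precipitation_mm", lags=[1, 2]):
    print(row)
```

Rows whose lag fields are None (the first weeks) would typically be dropped before fitting a model.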
PostgreSQL, BI, DWH, Hadoop, DevOps, DataOps, Machine Learning, Cloud and other topics!
Showing posts with label Aster. Show all posts
Saturday, December 6, 2014
Using Open Data & Machine Learning !
Labels:
Analytics,
Aster,
BI,
Business Intelligence,
Hadoop,
IoT,
Machine Learning,
Useful
Location:
Singapore
Tuesday, August 5, 2014
Scale Open Source R with AsterR or Teradata 15 !
I recently contributed to a great project that deals with using R in a distributed way within Aster and Teradata. I rediscovered that R is really permissive, flexible and powerful.
Labels:
Analytics,
Architecture,
Aster,
BI,
Business Intelligence,
Data visualization,
development,
ETL,
Machine Learning,
Python,
R,
SQL,
Standards,
Statistics,
Teradata,
Useful
Thursday, April 24, 2014
Python !
Python is already almost everywhere and is used in production at Google. It is a very powerful programming language for turning your ideas (from Web to GUI) into a script!
Labels:
Analytics,
Aster,
development,
ETL,
Hadoop,
Machine Learning,
Python,
Useful
Location:
London, United Kingdom
Friday, March 28, 2014
Machine Learning with Aster !
I am now working with Aster to do Machine Learning and statistics. Here are the functions you can use :
- Approximate Distinct Count : to quickly estimate the number of distinct values
- Approximate Percentile : to compute approximate percentiles
- Correlation : to determine whether one variable is useful for predicting another
- Generalized Linear Regression & Prediction : to perform linear regression analysis
- Principal Component Analysis : for dimensionality reduction
- Simple | Weighted | Exponential Moving Average : to compute a moving average with the chosen weighting scheme
- K-Nearest Neighbor : classification algorithm based on proximity
- Support Vector Machines : to build an SVM model and make predictions
- Confusion Matrix [Plot] : to visualize the performance of an ML algorithm
- Kmeans : famous clustering algorithm
- Minhash : another clustering technique, which depends on the set of products bought by users
- Naïve Bayes : useful classification method, especially for documents
- Random Forest Functions : predictive modelling approach broadly used for supervised classification
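To make the three moving-average variants in the list above concrete, here is a small plain-Python sketch. It does not reproduce the Aster functions or their arguments; the function names and parameters (`window`, `weights`, `alpha`) are assumptions chosen for illustration, using the textbook definitions.

```python
from typing import List, Sequence


def simple_moving_average(values: Sequence[float], window: int) -> List[float]:
    """Unweighted mean of the last `window` values, one result per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def weighted_moving_average(values: Sequence[float], weights: Sequence[float]) -> List[float]:
    """Weighted mean over a window of len(weights); weights[-1] applies to the newest value."""
    n = len(weights)
    total = sum(weights)
    if n == 0 or total == 0:
        raise ValueError("weights must be non-empty and must not sum to zero")
    return [
        sum(w * v for w, v in zip(weights, values[i - n + 1 : i + 1])) / total
        for i in range(n - 1, len(values))
    ]


def exponential_moving_average(values: Sequence[float], alpha: float) -> List[float]:
    """EMA seeded with the first value: ema_t = alpha * x_t + (1 - alpha) * ema_{t-1}."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    result: List[float] = []
    for x in values:
        if not result:
            result.append(float(x))
        else:
            result.append(alpha * x + (1 - alpha) * result[-1])
    return result


print(simple_moving_average([1, 2, 3, 4, 5], window=3))   # [2.0, 3.0, 4.0]
print(weighted_moving_average([2, 4, 6], weights=[1, 1]))  # [3.0, 5.0]
print(exponential_moving_average([1, 2, 3], alpha=0.5))    # [1.0, 1.5, 2.25]
```

The simple and weighted versions only emit a value once a full window is available, while the exponential version emits one value per input and gives older observations geometrically decreasing weight.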
Labels:
Analytics,
Aster,
Business Intelligence,
Machine Learning,
Statistics
Location:
Antony, France
Tuesday, March 11, 2014
Teradata’s SNAP Framework !
Teradata’s Seamless Network Analytic Processing (SNAP) Framework is one of the great ideas inside the Aster 6 database. It allows users to query different analytical engines and multiple types of storage through a SQL-like programming interface. It is composed of a query optimizer, a layer that integrates and manages resources, an execution engine and the unified SQL interface. These are the main components and their goals :
- SQL-GR & Graph Engine : provides functions to work with edges, vertices and [un|bi|]directed or cyclic graphs
- SQL-MR : library (Machine Learning, Statistics, Search behaviour, Pattern matching, Time series, Text analysis, Geo-spatial, Parsing) to process data using the MapReduce framework
- SQL-H : easy-to-use connection to HDFS for loading data from Hadoop
- SQL : join, filter, aggregation, OLAP, insert, update, delete, CASE WHEN, table
- AFS connector : SQL-MR function to map an AFS file to a table
- Teradata connector : SQL-MR function to load data from / to Teradata RDBMS
- Stream API : plug in your Python, Ruby, Perl, C[|++|#] scripts and use the CPUs of the Aster worker nodes to run them
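As an illustration of the Stream API item above, here is a rough sketch of the kind of Python script that could be plugged in. It assumes, without confirmation from this post, that rows reach the script as tab-separated lines on standard input and that rows written to standard output are returned as the result. The two input columns (`user_id`, `text`) and the function names are hypothetical, and the way the script is registered and called from SQL is not shown.

```python
import io
from typing import Iterable, Iterator, TextIO


def process(lines: Iterable[str]) -> Iterator[str]:
    """Turn each 'user_id<TAB>text' row into 'user_id<TAB>word_count'."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        user_id, text = line.split("\t", 1)
        yield f"{user_id}\t{len(text.split())}"


def run(source: TextIO, sink: TextIO) -> None:
    """Read rows from `source` and write the transformed rows to `sink`."""
    for out_row in process(source):
        sink.write(out_row + "\n")


# Local demo with in-memory streams standing in for the worker's stdin / stdout.
demo_in = io.StringIO("u1\thello open data\nu2\taster\n")
demo_out = io.StringIO()
run(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

In a real script, the entry point would call `run(sys.stdin, sys.stdout)` instead of the in-memory demo. Keeping the row logic in `process` makes it testable without a cluster.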
Labels:
Analytics,
Architecture,
Aster,
BI,
Data visualization,
Databases,
development,
Graph Database,
SQL,
Standards,
Teradata,
Useful
Location:
Antony, France