Andres Chaves blog: Pentaho for Big Data Analytics

Although I'm not currently an active community user of Pentaho, mainly because right now I'm focused on networking and IT management, I still follow the evolution of this great platform. In recent years there have been some great improvements:

- The acquisition of Webdetails, with all the useful CTools.
- The integration of connectors to several NoSQL technologies, allowing the use of big data in all the components of the platform (BI Server, Kettle, Mondrian, etc.).

Recently I got the opportunity to review the Pentaho for Big Data Analytics book published by Packt. My expectations for the book were quite high. I hoped it would help me stay up to date with the latest improvements in Pentaho and clarify a lot of the marketing buzzwords around Big Data.

What I found was the following:

- The book aims to give a broad overview of the Pentaho components and spends many chapters on setting up the Pentaho platform: one would expect that someone who buys this book already has a background in Pentaho and wants to dig into how it relates to Big Data.
- The Big Data theory and examples are oriented toward Apache Hive: the book explains how to handle files in HDFS and how to run Big Data analysis through Apache Hive (which in the end is a SQL layer over Hadoop). While the examples and theory are a good introduction to the topic, a lot of issues are left unaddressed: How do you run a map/reduce job directly on Hadoop? What about other NoSQL technologies like MongoDB, Redis, etc.?
- There are very good examples of how to use the CTools: working with CDE, CDF and CDA is not easy at the beginning, so the "Visualization of Big Data" chapter is very helpful here.

I think the book is worth reading if you are a new user of Pentaho and Hadoop, want an introduction to installing and running them, and need the first steps for handling data through Hadoop.

2 comments:

Unknown said...: Good evening, Andres. I'm glad to know there are Latin Americans writing about Pentaho in our region. Keep it up. Greetings from Venezuela; March 21, 2014 at 20:51
Anugraha Jain said...: Hello Andres,

We have developed a new gen developer friendly BI framework with some extremely unique features. Would like to give you early access & love to hear your opinion. Please do let me know of how to reach out to you. Would be launching product in 1 week from now.

Also could you please share your email details for further communication.

Regards,
Anugraha; July 6, 2015 at 2:58

Wednesday, March 19, 2014
