Wednesday, March 19, 2014

Pentaho for Big Data Analytics

Although I'm not currently an active user in the Pentaho community, mainly because right now I'm focused on network and IT management, I still follow the evolution of this great platform. In recent years there have been some great improvements:

 - The acquisition of Webdetails, with all its useful Ctools.
 - The integration of connectors to several NoSQL technologies, allowing the use of big data in all the components of the platform (BI Server, Kettle, Mondrian, etc.).

Recently I got the opportunity to review the book Pentaho for Big Data Analytics, published by Packt. My expectations were quite high: I hoped the book would help me stay up to date with the latest improvements in Pentaho and clarify many of the marketing buzzwords around Big Data.

What I found was the following:

 - The book's intention was to give a broad overview of the Pentaho components, and it spends many chapters on setting up the Pentaho platform. One would expect that someone who buys this book already has a background in Pentaho and wants to look in detail at its relationship with Big Data.
 - The Big Data theory and examples are oriented toward Apache Hive: the book explains how to handle files in HDFS and how to do Big Data analysis through Apache Hive (which in the end is a SQL layer over Hadoop). While the examples and theory are a good introduction to the topic, a lot of issues are not covered: how do you run a map/reduce job directly on Hadoop? What about other NoSQL technologies such as MongoDB, Redis, etc.?
 - There are very good examples of how to use the Ctools: handling the CDE, CDF and CDA tools is not easy at the beginning, so the "Visualization of Big Data" chapter is very helpful here.

I think the book is worth reading if you are a new user of Pentaho and Hadoop, want an introduction to installing and running them, and need the first steps for handling data through Hadoop.

2 comments:

Unknown said...

Good evening, Andres. I'm glad to know there are Latin Americans writing about Pentaho in our region. Keep it up. Greetings from Venezuela.

Anugraha Jain said...

Hello Andres,

We have developed a new-generation, developer-friendly BI framework with some unique features. We would like to give you early access and would love to hear your opinion. Please let me know how to reach you. We will be launching the product one week from now.

Also, could you please share your email address for further communication?

Regards,
Anugraha