viernes, 28 de noviembre de 2014

Pentaho Business Analytics Cookbook

Hi,
I had the opportunity to review the book Pentaho Business Analytics Cookbook published by Packt. I have been delayed with this review because most of my time was consumed by the master's degree I have been studying.

But now that I have read it, these are my comments:

Pentaho Business Analytics Cookbook is a book that gives you a good overview of the Pentaho Platform, for both the Enterprise and Community editions. For each major part of the platform, except Pentaho Data Integration and Weka, it gives you a set of recipes to accomplish the most common tasks. These are the topics covered in each chapter:

  • Chapter 1: Getting Familiar with Pentaho User Console
  • Chapter 2: Configuring Your BA Server Instance
  • Chapter 3: Defining BA Server Data Sources
  • Chapter 4: Defining Business Models with the Pentaho Metadata Editor
  • Chapter 5: Creating Reports Using Pentaho Interactive Reporting
  • Chapter 6: Creating Analysis Reports
  • Chapter 7: Creating Reports Using Pentaho Report Designer
  • Chapter 8: Creating Dashboards
  • Chapter 9: Scheduling Content
  • Chapter 10: Working with Pentaho Mobile BI
  • Chapter 11: Customizing Pentaho BA to Meet Your Business Needs

For example, in Chapter 2, Configuring Your BA Server Instance, the author gives you an overview of the BA Platform and recipes for managing Pentaho solution files, users, and roles. These are tasks that every Pentaho operator has to carry out day to day.

All the recipes are well detailed and written, with proper images and steps. This is a great book if you plan to work as a Pentaho devops engineer.

From my perspective, the book allowed me to:

  1. Understand what's new in Pentaho 5, how the UI is organized, and how Pentaho solution files are managed in it.
  2. Gain a very good understanding of how to use Pentaho Metadata. Pentaho Metadata is a powerful business abstraction of the physical database model; sadly, I had never seen a detailed guide on how to use it. The recipes given in the book are quite accurate in explaining it.
  3. Learn what Pentaho Interactive Reporting is. As this feature is only available in the Enterprise edition, I had never used it. The book also gives you a good understanding of how WAQR can be used.
  4. See what the iPad mobile version of Pentaho looks like.

The book also covers a few recipes about using Saiku for analysis reports and CDE for dashboard editing.

I hope this review is useful for Pentaho newbies,

All the best,

Andres

domingo, 10 de agosto de 2014

Quick update

Hey, a quick update,

I decided to resign from my job in Colombia and go study overseas. I am currently studying a Master of Information Technology - Distributed Systems at The University of Melbourne, Australia.

I have only been in Melbourne for 3 weeks, and everything so far is very interesting and challenging.

Regards,


Andres

miércoles, 19 de marzo de 2014

Pentaho for Big Data Analytics

Although I'm currently not an active community user of Pentaho, mainly because right now I'm focused on network and IT management, I still follow the evolution of this great platform. In recent years there have been great improvements:

 - The acquisition of Webdetails with all the useful Ctools
 - The integration of connectors to several NoSQL technologies, allowing the use of big data in all the components of the platform (BI Server, Kettle, Mondrian, etc.)

Recently I got the opportunity to review the Pentaho for Big Data Analytics book published by Packt. My expectations for the book were quite high. I hoped the book would help me stay up to date with the latest improvements in Pentaho and clarify a lot of the marketing buzzwords around Big Data.

What I found was the following:

 - The book's intention is to give a broad overview of the Pentaho components, and it spends a lot of chapters on setting up the Pentaho platform. One would expect that whoever buys this book already has a Pentaho background and wants to dig into its relationship with Big Data.
 - The Big Data theory and examples are oriented to using Apache Hive: the book explains how to handle files in HDFS and how to do Big Data analysis through Apache Hive (which, in the end, is a SQL layer over Hadoop). While the examples and theory are a good introduction to the topic, there are a lot of issues not handled: how do you run a MapReduce job directly against Hadoop (see the sketch after this list)? What about other NoSQL technologies like MongoDB, Redis, etc.?
 - There are very good examples of how to use the Ctools: handling the CDE, CDF, and CDA tools is not easy at the beginning, so the "Visualization of Big Data" chapter is very helpful for this.
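
To give an idea of what the book leaves out, here is a rough sketch of a MapReduce job submitted directly to Hadoop using the classic Java API: the usual word count. The class name, comments, and input/output paths are my own illustration, not taken from the book.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split
  public static class TokenizerMapper
       extends Mapper<Object, Text, Text, IntWritable> {

    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer (also used as combiner): sums the counts per word
  public static class IntSumReducer
       extends Reducer<Text, IntWritable, Text, IntWritable> {

    private IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  // Driver: configures the job and submits it to the cluster
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

You would package this into a jar and launch it with something like hadoop jar wordcount.jar WordCount /input /output, where /input and /output are HDFS paths. This is exactly the kind of low-level job that Hive generates for you behind the scenes.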

I think the book is worth reading if you are a new user of Pentaho and Hadoop, want an introduction on how to install and run them, and need the first steps to handle data through Hadoop.

miércoles, 11 de abril de 2012

Software Engineering for Software as a Service - Statement

If you want to get an idea of what the statement looks like, here is a picture of it:

Software Engineering for Software as a Service

Hi,
I had the chance to take the Software Engineering for Software as a Service course offered by professors Armando Fox and David Patterson from the University of California, Berkeley, through the Coursera startup.

The course used the same material, quizzes, and videos as the official Berkeley class, and at the end it delivered a statement of accomplishment. I'll try to upload it later.

In general terms I think the experience was great, and I am very grateful to Professors Fox and Patterson. There will be a new offering of the course if you are interested. The URL of the course is http://saas-class.org

The topics that the course covers are architectural patterns, software design, code coverage, unit and integration testing, agile development, and Ruby. If you like those topics you'll have a great time.

I have to say I liked the Ruby language and the Rails framework. From a non-expert, non-scientific perspective, I think it is more suitable than Java for most projects in terms of development speed and the value added to enterprise software.

jueves, 21 de julio de 2011

Tutorial CDE

Inteligencia de Negocio y Pentaho: Cómo hacer cuadros de mando: V: a recommended tutorial for working with the excellent tools that Pedro Alves and Webdetails have developed, especially CDE.

lunes, 18 de julio de 2011

Your faithful employee

Hello,

In the area I work in, we have one duty among several others: maximize the availability of the network management systems. The NMSs must be up almost all the time because the NOCs (network operations centers) rely on them 24x7.

But that's not an easy task: sometimes the planning area delivers the NMS implementation with many flaws, other times the machine you get is not what you expected, or the application server freezes continuously and researching the root cause can take weeks.

As we're not big fans of attending service disruption calls at 3am, we deployed a nice and useful service on all our Linux machines: Monit. This nice program monitors a service's existence, availability, and performance, and takes automated actions when the rules/thresholds are exceeded.

For example, we had a Tomcat container that was freezing several times a week... the rule for Monit was something like this:

check process tomcat5 with pidfile /var/run/tomcat5.pid
  group tomcat5
  start program = "/etc/init.d/tomcat5 start" with timeout 120 seconds
  stop program  = "/etc/init.d/tomcat5 stop" with timeout 120 seconds
  if failed host 127.0.0.1 port 8080
     protocol http request /archivos/gestion.jpg
     timeout 3 seconds then restart
  if cpu usage > 95% for 10 cycles then restart
  if 5 restarts within 5 cycles then timeout


So, this loyal automated employee checks port 8080, checks that the HTTP request for gestion.jpg completes in less than 3 seconds, and checks that the CPU usage of the process stays under 95%. If Monit sees any of these rules broken, it restarts the service. And like the good employee he is, Monit sends an email notification for every step he takes.
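
For those notifications to actually go out, Monit also needs the global mail settings in its control file (monitrc). A minimal sketch, assuming a local SMTP relay and a placeholder mailbox (both are examples, adjust them to your environment):

set daemon 60                 # poll the monitored services every 60 seconds
set mailserver localhost      # SMTP relay Monit uses to send alerts
set alert ops@example.com     # placeholder mailbox that receives all events

You can check the syntax of the control file with monit -t before reloading the daemon.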

I hope this application is useful for you,


Quick update: even though Monit is doing a great job, it is important to find the root cause. Regarding the Tomcat issue, I found this useful site for tuning the JVM parameters: http://wiki.alfresco.com/wiki/JVM_Tuning