
Sunday, November 21, 2010

Adjust detail rows height in WAQR templates dynamically

Today I'm back writing something about WAQR. Yesterday one of my customers wrote me about a problem they were having while executing WAQR reports that have long text in their columns: if a column contains very long text and you export the WAQR report to PDF, the column text gets truncated. A trivial-looking problem, but apparently not so trivial to solve within WAQR templates.

About WAQR

WAQR (Web AdHoc Query Reporting) is an interesting module in the Pentaho suite. It sits on top of the report engine and the metadata layer and lets users easily build tabular reports for their daily activities, or just to export some complex data in an easy way from the Pentaho system. The report definition is based on a wizard that takes the user through these easy steps:

  1. select a template, from the set of available templates, and a business model. 
  2. decide where to put which fields in the report layout
  3. adjust some visualization attributes and/or apply filter conditions and define sort fields and orders
  4. manage some page layout attributes
  5. .... and here we go! We get the report.

WAQR templates are basically the old JFreeReport report designer's template files, so they are simple XML files we can view and modify. THEY ARE NOT compatible with the newer Report Designer template files. The wizard uses text manipulation routines to create a report definition out of the template. We all know that sooner or later WAQR will be replaced by something more interactive and more attractive that will use the latest version of the report engine. But for the moment we have this, and with this we have to battle.

How to expand the height of the detail rows dynamically when export type is PDF

Try to build a new report with WAQR, putting in the report definition a field with a very long text in it. If you try to export the report using the PDF format you will get the text truncated. Fixing this is only a matter of minutes: you need to modify the report template by

  • adding a new configuration property for the report engine, and 
  • adding a new attribute to the details band to make the row height dynamic. 

Below I summarize all these steps, modifying the Basic template shipped with the Pentaho demo solution.
  1. Go to <biserver_home>/pentaho-solutions/system/waqr/templates/Basic and open jfreereport-template.xml
  2. Locate the configuration xml element near the end of the file. Add the following line as a child of the configuration element
     <property name="org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.AssumeOverflowY">true</property>  
    
  3. Add the attribute dynamic = "true" to the items element
  4. Save the template and if the BI Server is running refresh the cache using the Tools -> Refresh -> Repository Cache menu entry
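
Putting the two changes together, the touched fragments of jfreereport-template.xml should look roughly like the sketch below. The attributes shown on the items element other than dynamic are illustrative; keep whatever your template already has:

```xml
<!-- details band: let the row height grow with the content -->
<items fontname="Arial" fontsize="10" dynamic="true">
  <!-- field elements generated by the wizard -->
</items>

<!-- near the end of the file, as a child of the configuration element -->
<configuration>
  <property name="org.pentaho.reporting.engine.classic.core.modules.output.pageable.pdf.AssumeOverflowY">true</property>
</configuration>
```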

Saturday, August 14, 2010

Blogging from my summer holidays

Today I was reading the latest news through my Twitter client and my attention was caught by all the tweets around the latest Doug post on the Pentaho forum. So I immediately connected through my phone browser to read that post and think about it.


Doug's question was about how the Pentaho community is doing. Following what my Pentaho friends said in their answers to this thread, I also agree that comparing downloads isn't a good metric to judge how a community is doing. The spirit of a community is made by all the people who every day give precious help in coding, help people understand the product or come out of problems (that they think are hard but aren't), and talk about the product. 

I personally come from another experience with another important open source community (not a BI product). From the outside, I remember, everything seemed wonderful. The product stayed for a very long time on top as the best project in one of the biggest open source forges. But from the inside everything was totally different, and the atmosphere was really cold. 

Here my experience is totally different: I'm breathing completely different air. I found real people who cooperate every day, helping each other solve their everyday problems with the product or other related technologies. They support people through IRC, the forums and the wiki. I'm a software architect, so I decided to help mainly with code contributions, but also in the forums and by writing articles about some Pentaho topics on my blog. I studied and am continuously studying the code, and I'm working with wonderful guys: Pedro, Tom, Paul and many others.

In our Pentaho community, the IRC channel is the thing I really enjoy, though unfortunately I can participate only occasionally because very often the various proxies break my ability to connect. It's a sort of noisy room where anyone can expose a problem and quickly get solutions. You always find someone available to support you. But it is also a place to talk with friends about everything. A sort of meeting place. That is wonderful. 

I said friends, not colleagues, and this is a very important distinction. These are the things that make me think that IT IS really a community that works, not the number of downloads or any other silly indicator. 


Thursday, February 18, 2010

BI Server & MS Active Directory in 10 minutes

Recently I needed to connect Pentaho to MS Active Directory for user authentication/authorization. Immediately I asked myself how to connect the Pentaho BI Server to Microsoft Active Directory, and I answered "Oh don't worry... it will take no more than 10 minutes!". Then the "look for a how-to document..." discovery process started.

I found a lot of documentation about this issue (wiki articles, forum threads) but there isn't a well-done survival guide on this problem (that's my opinion).

So I'll try to summarize in a few lines the steps I followed and the problems I encountered, to build a sort of survival guide for anyone with the same issue to solve.

Have a look at spring configuration files

The BI Server security architecture is based on Spring Security, so the first guide to read is the Spring documentation where they talk about LDAP configuration. Better yet, in case you don't know anything about that, take a step backward and read about the general architecture of Spring Security first.

Spring Security beans are wired together through the Spring application context, and in Pentaho all the needed Spring application context files are located in <biserver_home>/pentaho-solutions/system. You'll find a lot of them there, but the important things to know are:

pentaho-spring-beans.xml contains the list of imported Spring bean files that will be loaded when the BI Server starts.
We have two important file groups there, named applicationContext-spring* and applicationContext-pentaho*. In each group you have one file for every available authentication method defined in Pentaho. The beans located in files belonging to the applicationContext-spring* group contain definitions for the Spring-related beans needed to configure the given authentication/authorization method. The beans located in files belonging to the applicationContext-pentaho* group contain definitions of the Pentaho beans involved in authorization/authentication for the specific method (LDAP, Hibernate, JDBC).

So how to configure Pentaho to work with MS Active Directory?

The setup to have Pentaho working with MS Active Directory is really simple if you know exactly what to do and how. I'll try to summarize everything in the following paragraphs. As mentioned above, all the files we will touch are located in <biserver_home>/pentaho-solutions/system.

1. Set up an MS Active Directory user for the BI Server to connect with. You need to define a user in MS Active Directory so that the BI Server can bind to the directory and check whether the Pentaho user that is going to authenticate exists and is valid. The user you're going to define in MS Active Directory doesn't need any special rights, so a normal user is the most appropriate. Remember to check that the "password never expires" flag is not set for this user.

2. Set up the Spring Security files needed to enable LDAP server authentication/authorization. This is the key point, I think the major one. First of all, read the guidelines provided here about the rules to follow when editing Spring configuration files, particularly regarding white space and special characters. Then follow the points detailed here.

2.a) Open applicationContext-security-ldap.properties and change the properties according to your needs. The useful thing about this file is that it contains all the properties needed to configure the Spring beans, so we don't need to look for them in each XML file. They're all in one single place. Below you'll find an example:

 contextSource.providerUrl=ldap\://ldaphost\:389  
 contextSource.userDn=cn=ldapuser,OU=my_org_unit,dc=my_dc  
 contextSource.password=password  
 userSearch.searchBase=OU=my_org_unit,dc=my_dc  
 userSearch.searchFilter=(sAMAccountName=\{0\})  
 populator.convertToUpperCase=false  
 populator.groupRoleAttribute=cn  
 populator.groupSearchBase=OU=my_org_unit,dc=my_dc  
 populator.groupSearchFilter=(member=\{0\})  
 populator.rolePrefix=  
 populator.searchSubtree=true  
 allAuthoritiesSearch.roleAttribute=cn  
 allAuthoritiesSearch.searchBase=OU=my_org_unit,dc=my_dc  
 allAuthoritiesSearch.searchFilter=(objectClass=group)  

Important things to note here are:
  • contextSource.providerUrl - the LDAP server URL
  • contextSource.userDn - the LDAP username. This is the user we talked about in 1) above
  • contextSource.password - the LDAP user's password
  • the populator.* properties are needed by Spring's DefaultLdapAuthoritiesPopulator. That object loads the set of authorities the user has been granted.
  • the userSearch.* properties configure the attributes needed to fill the "users" box when assigning permissions to reports etc.
  • the allAuthoritiesSearch.* properties configure the attributes needed to fill the "roles" portion of the permissions box when setting permissions for a report etc.
The excerpt above is fully working, so you can copy and paste it into your properties file, changing only the definitions specific to your installation. At this point you've completed about 70% of the configuration required to have everything working on your system. Be careful to declare the full DN (distinguished name) when you work with MS Active Directory because, if you don't, it's almost certain you'll get an error like this (the first time I tried, I got exactly this error for that reason):

 Microsoft Active Directory Error:  
   javax.naming.AuthenticationException:  
   [LDAP: error code 49 - 80090308: LdapErr: DSID-0C09030B, comment:  
                        AcceptSecurityContext error, data 525, v893 ] 
     at com.sun.jndi.ldap.LdapCtx.mapErrorCode(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtx.processReturnCode(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtx.processReturnCode(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtx.connect(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtx.<init>(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtxFactory.getUsingURL(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtxFactory.getUsingURLs(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtxFactory.getLdapCtxInstance(Unknown Source) 
     at com.sun.jndi.ldap.LdapCtxFactory.getInitialContext(Unknown Source) 
     at javax.naming.spi.NamingManager.getInitialContext(Unknown Source) 
     at javax.naming.InitialContext.getDefaultInitCtx(Unknown Source) 
     at javax.naming.InitialContext.init(Unknown Source) 
     at javax.naming.InitialContext.<init>(Unknown Source) 
     at javax.naming.directory.InitialDirContext.<init>(Unknown Source) 

You can find short but useful information about this issue here.

2.b) A few words about the default role. Any time a user logs into Pentaho, the system assigns the user a default role of "Authenticated". This is a really important point in getting everything to work: if that role isn't assigned, you're able to connect but Mantle will never show anything to you. My problem was that in my Spring LDAP context file the definitions needed to have that role assigned by default were missing, so I had to add them manually. So be absolutely sure to check that in applicationContext-spring-security-ldap.xml you have the defaultRole property defined with a value of Authenticated in the populator bean, and if it's missing, add it. Below is an excerpt of that bean definition with the defaultRole property added, so you can copy and paste it if it's missing from your file.

 <bean id="populator" class="org.springframework.security.ldap.populator.DefaultLdapAuthoritiesPopulator">  
  <!-- omitted -->  
  <property name="defaultRole" value="Authenticated" />  
  <!-- omitted -->  
 </bean>  

3) Rework the imports in pentaho-spring-beans.xml to load the LDAP security beans at startup. Below, at lines 8-9 we disabled the DAO/Hibernate security, and at lines 10-11 we imported the new definitions enabling LDAP security.

1:  <beans>  
2:   <import resource="pentahoSystemConfig.xml" />  
3:   <import resource="adminPlugins.xml" />  
4:   <import resource="systemListeners.xml" />  
5:   <import resource="sessionStartupActions.xml" />  
6:   <import resource="applicationContext-spring-security.xml" />  
7:   <import resource="applicationContext-common-authorization.xml" />  
8:   <!-- import resource="applicationContext-spring-security-hibernate.xml" />  
9:   <import resource="applicationContext-pentaho-security-hibernate.xml" / -->  
10:   <import resource="applicationContext-spring-security-ldap.xml" />  
11:   <import resource="applicationContext-pentaho-security-ldap.xml" />  
12:   <import resource="pentahoObjects.spring.xml" />  
13:  </beans>  

4) Now it's time to define a set of groups in your MS Active Directory for your Pentaho users (if you don't already have such groups), depending on your authorization needs. For sure you need at least one group to contain the Pentaho admins. In my system I called that group PentahoAdmin.

5) Declare the new admin group in the Pentaho configuration to give that group the admin grant. To do that, rework the acl-voter element in pentaho.xml as shown below.

      <acl-voter>  
           <!-- What role must someone be in to be an ADMIN of Pentaho -->  
           <admin-role>PentahoAdmin</admin-role>  
      </acl-voter>  

6) Rework the acl-publisher definitions in pentaho.xml for all the Pentaho groups defined in the LDAP server. In my system I defined two roles, PentahoAdmin and PentahoUser, so my configuration looks like this:

  <acl-publisher>  
           <!--   
                These acls are used when publishing from the file system. Every folder  
                gets these ACLS. Authenticated is a "default" role that everyone  
                gets when they're authenticated (be sure to setup your bean xml properly  
                for this to work).  
           -->  
           <default-acls>  
                <acl-entry role="PentahoAdmin" acl="FULL_CONTROL" />                    <!-- Admin users get all authorities -->  
                <!-- acl-entry role="cto" acl="FULL_CONTROL" / -->                    <!-- CTO gets everything -->  
                <acl-entry role="PentahoUser" acl="EXECUTE_SUBSCRIBE" />          <!-- PentahoUser gets execute/subscribe -->  
                <acl-entry role="Authenticated" acl="EXECUTE" />          <!-- Authenticated users get execute only -->  
           </default-acls>  
           <!--  
                These acls are overrides to specific file/folders. The above default-acls will  
                be applied and then these overrides. This allows for specific access controls to  
                be loaded when the repository is first populated. Further changes to acls can be  
                made in the platform GUI tool. Uncomment these and change add or delete to your hearts desire -->                           
           <overrides>  
                <file path="/pentaho-solutions/admin">  
                     <acl-entry role="PentahoAdmin" acl="FULL_CONTROL" />  
                </file>  
           </overrides>  
      </acl-publisher>  

7) Stop and restart your Pentaho server and everything is ready for a try.

I hope I haven't missed anything and that everything is clear enough for everyone who reads these few instructions. Let me know if you have any problems so that I can keep this very brief guide updated.

Tuesday, February 2, 2010

Pentaho BI Server language files (1/2)

The last month and a half was a long stretch of hard work, but now I can see a little light at the end of the tunnel and I've had time to come back. This time I have an interesting thing for all the Italian friends. I completed:

  1. the Italian language files for the Mantle GWT client. The files have already been submitted to Pentaho's JIRA, hoping they will be incorporated in the next release of the product.
  2. the Italian language file for the JPivot toolbar. This was a minor effort because it is a very small file, so it took only a few minutes to complete. I have already committed this language file myself to JPivot's CVS trunk.

Below I detail the steps to follow to install the new language on existing BI Server installations. The files are compatible with Pentaho 3.x. A special thanks to my colleague and dear friend Andrea Pasotti, who helped (and is helping) me in this work.

How to activate Italian language support for the Mantle GWT client
  • Download the files messages_it.properties and  MantleLoginMessages_it.properties from the following link to JIRA
  • Stop BI Server
  • Copy the file messages_it.properties to <biserver_home>/webapps/pentaho/mantle/messages
  • Open the file <biserver_home>/webapps/pentaho/mantle/messages/supported_languages.properties
  • Add the following line
  • it=Italiano
    
  • Save and close the file
  • Copy the file MantleLoginMessages_it.properties to <biserver_home>/webapps/pentaho/mantleLogin/messages
  • Open the file <biserver_home>/webapps/pentaho/mantleLogin/messages/supported_languages.properties
  • Add the following line
  • it=Italiano
    
  • Save and close the file 
  • Start BI Server
How to activate Italian language support for JPivot toolbar
  • Download the resources_it.properties language file from the following link
  • Go to <biserver_home>/webapps/pentaho/WEB-INF/classes and create the following directory path com/tonbeller/jpivot/toolbar
  • Copy resources_it.properties to <biserver_home>/webapps/pentaho/WEB-INF/classes/com/tonbeller/jpivot/toolbar
  • Start BI Server
I'm going to finalize the complete translation of the GUI, so in the next few weeks the Italian language files for PAC (Pentaho Administration Console) and Ad Hoc Query Reporting will follow. So what else.... stay tuned!

Thursday, December 10, 2009

Patching JPivot to filter by multiple members belonging to the same dimension

A very annoying problem with the release of JPivot shipped with the current release of Pentaho is that you can select only one member of a given cube dimension at a time. But frequently we would like to select more than one member at a time. So what a good opportunity to build a patched release of JPivot and make that happen!

The first thing I did was test this by writing a sample MDX query on the SteelWheels cube that had, as a filter, a tuple made up of a set of members belonging to the same dimension. It worked. In fact, starting from Mondrian 3.1.2.13008, the OLAP engine supports compound slicers, which makes this possible. After this very brief verification I was convinced to continue in my adventure, and the next step was to find out where the latest JPivot sources were. As stated by Will Gorman in post #7 of this thread, the latest and good sources are in the obvious place... the CVS repository of the JPivot project on SourceForge. So I got them and started my work.
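
For reference, a compound slicer is simply a set (rather than a single member) in the WHERE clause. A minimal sketch against the SteelWheels sample might look like the query below; the exact cube, measure and member names are assumptions to adapt to your schema:

```mdx
SELECT
  {[Measures].[Sales]} ON COLUMNS,
  {[Product].[Line].Members} ON ROWS
FROM [SteelWheelsSales]
WHERE {[Markets].[EMEA], [Markets].[NA]}
```

With a set in the slicer, Mondrian aggregates over its members, so each cell combines the values for EMEA and NA.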

The patch wasn't so difficult to make, and I easily reached my goal. It was a good opportunity to dig into the internals of JPivot and learn about it. The new PAT is not around the corner, so my opinion is that JPivot will remain the production OLAP viewer of choice for Pentaho Community Edition for a few more months. So, I think, this exercise was not a waste of time.

If you want to try it, you can find the patched JPivot jar file here. The setup takes just one minute. Move your original JPivot library from <biserver_home>/tomcat/webapps/pentaho/WEB-INF/lib to a different directory (just to be safe). Unzip the archive you just downloaded and copy the extracted jar file to the location specified before. Restart the Pentaho BI Server and then... here we go!!

Friday, November 27, 2009

Accessing Mondrian cubes through Pentaho Report Designer

Today I had the opportunity to design some complex reports using the cubes published in my customer's Pentaho BI Server as datasources. Using a cube as a datasource to produce reports is good, in my opinion, because it gives you a perfect way to do easy reporting whenever you have, for example, to produce reports that compare data across different periods.

A good sample for what we're going to discuss here is the Top N Analysis report you can find in Pentaho's Steel Wheels samples. For brevity I'll refer to this report simply as "the sample". If you open it and have a look at the defined datasources, you can clearly see that it takes the data it needs from the steelwheels.mondrian.xml cube schema. So that is good for us.

Publish the report to Pentaho BI Server

Before thinking about publishing your report to your running BI Server instance, you have to think about the way Pentaho will access the schema your report needs. The strategy used by the reporting plugin goes through two possible paths:

  • First it tries to access the Analysis Schema file using the path you specified in the datasource definition in Pentaho Report Designer. Within the Pentaho BI Server execution environment, every path is resolved relative to <BISERVER_HOME>/tomcat/bin. That means that if you set a reference to your Analysis Schema cube as a relative path in your report's datasource definition (as is the case for the sample I mentioned in my opening), Pentaho will look for your schema file by calculating the absolute file path relative to <BISERVER_HOME>/tomcat/bin. So you need to be sure that your file is in the right place before the system tries to access it. I think this way is not as good because it depends on your BI Server filesystem layout.
  • Second, it tries to access the schema as an XMLA datasource. That is, in my opinion, the more elegant way to make the schema available to the reporting engine.

How to add a new XMLA datasource to Pentaho BI Server

To define a new XMLA datasource in our Pentaho BI Server environment, we have to update the datasources.xml file in <BISERVER_HOME>/pentaho-solutions/system/olap.

This file contains the definitions of all the XMLA datasources available in the system. We can add a new datasource definition using one of these two ways:
  1. Manually add a new Catalog element to configure a new Mondrian catalog
  2. While publishing the Analysis Schema from the Schema Workbench flag the Enable JNDI datasource and set the JNDI Data Source field appropriately. You can find the procedure to publish the schema clearly explained in Pentaho's wiki.
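
For the manual option (1), the structure of datasources.xml looks like the sketch below. The catalog name, the solution path in the Definition element, and the SampleData JNDI name are illustrative values from the default Steel Wheels setup; replace them with your own:

```xml
<DataSources>
  <DataSource>
    <DataSourceName>Provider=Mondrian;DataSource=Pentaho</DataSourceName>
    <DataSourceDescription>Pentaho BI Platform Datasources</DataSourceDescription>
    <URL>http://localhost:8080/pentaho/Xmla?userid=joe&amp;password=password</URL>
    <DataSourceInfo>Provider=mondrian</DataSourceInfo>
    <ProviderName>PentahoXMLA</ProviderName>
    <ProviderType>MDP</ProviderType>
    <AuthenticationMode>Unauthenticated</AuthenticationMode>
    <Catalogs>
      <!-- one Catalog element per published Mondrian schema -->
      <Catalog name="SteelWheels">
        <DataSourceInfo>Provider=mondrian;DataSource=SampleData</DataSourceInfo>
        <Definition>solution:steel-wheels/analysis/steelwheels.mondrian.xml</Definition>
      </Catalog>
    </Catalogs>
  </DataSource>
</DataSources>
```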

Thursday, October 29, 2009

Mondrian cubes debugging: how to display SQL queries

These days I had the interesting need to look at the queries that Mondrian generates while the user is navigating the OLAP cube. The idea came to me when I decided to check whether the indexes applied to my tables give me the best performance possible. To decide which indexes are eligible to be applied to my tables, my strategy is to:
  • collect some queries and then 
  • get the query plan of each query and check if the indexes are properly applied.

It's really useful to look at the Mondrian log files because they give us a lot of useful information about how our system is behaving. We can
  • look at SQL statements and MDX queries,
  • get some profiling information on the queries that are executed,
  • get other useful debugging information.
The following paragraphs illustrate how to enable Mondrian debug logging by adding some properties to the Mondrian configuration file.
After that, we'll configure two new log4j appenders to have the desired log files properly written on our filesystem.

Enable Mondrian debug log
Mondrian has a big set of configuration settings that can be modified. In our case, to enable Mondrian debug information, follow the steps detailed below:

Open the mondrian.properties file located in <bi-server_home>/pentaho-solutions/system/mondrian and add the following line.
mondrian.rolap.generate.formatted.sql=true
You can find the complete set of configuration settings here

Update log4j configuration
At this point we're going to modify the log4j configuration file, adding the required appenders and categories to have our logging information displayed properly.

Open the log4j.xml file located in <bi-server_home>/tomcat/webapps/pentaho/WEB-INF/classes

Based on what you want to log, add one or both of the following appenders to the file. They will create two new RollingFileAppenders; you're free to use whatever kind of appender you prefer. In case you need further information about log4j and its configuration parameters, have a look here.
IMPORTANT: The location of the produced files is relative to the <bi-server_home>/tomcat/bin directory. You can put the generated log files wherever you want in the filesystem, but always remember this important consideration.


   <!-- Add the following appender only if you're interested in logging SQL statements -->

   <appender name="SQLLOG" class="org.apache.log4j.RollingFileAppender">
     <param name="File" value="sql.log"/>
     <param name="Append" value="false"/>
     <param name="MaxFileSize" value="500KB"/>
     <param name="MaxBackupIndex" value="1"/>

     <layout class="org.apache.log4j.PatternLayout">
       <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
     </layout>
   </appender>

   <!-- Add the following appender only if you're interested in logging MDX statements -->

   <appender name="MONDRIAN" class="org.apache.log4j.RollingFileAppender">
     <param name="File" value="mondrian.log"/>
     <param name="Append" value="false"/>
     <param name="MaxFileSize" value="500KB"/>
     <param name="MaxBackupIndex" value="1"/>

     <layout class="org.apache.log4j.PatternLayout">
       <param name="ConversionPattern" value="%d %-5p [%c] %m%n"/>
     </layout>
   </appender>


Add the following new categories to the log4j.xml file according to your logging needs.


   <!-- and logs only to the SQLLOG -->
   <category name="mondrian.sql">
      <priority value="DEBUG"/>
      <appender-ref ref="SQLLOG"/>
   </category>

   <!-- and logs only to the MONDRIAN -->
   <category name="mondrian">
      <priority value="DEBUG"/>
      <appender-ref ref="MONDRIAN"/>
   </category>

Enable the new log settings
To enable the new log settings, restart the Pentaho BI Server instance. Remember to disable the tracing logs as soon as you have satisfied your debugging needs, because they have a severe impact on system performance.