The data analytics, insight and reporting system is mostly for business intelligence purpose, support business decision making and experimentation. The general requirements in my mind: slice and dice, cross-dimensional analysis, correlation, trending, ad-hoc queries/reports, data visualization, nice UI/dashboard, large data size, low latency. I usually work at the data tier. Mondrian fits in the middle-tier and presentation layer nicely for me.
Mondrian is an in-memory OLAP cube (cache) built on top of relational database. Cube is a multi-dimensional data structure. A cell holds numeric measurements of a business; aggregation can be executed efficiently along dimensions. MDX is the query language of Mondrian and a query can be broken down to lookup at the cube (cache hit) and direct SQL query against the underneath relational database.
I play with the Mondrian and SQLite.
#1: Create a sqlite database (I borrow the data model from the Mondrian tutorial by Slawomir, architect of Pentaho).
> sqlite3 issue.db sqlite> create table fact_issue(id smallint, issue_type varchar(255), assignee varchar(255), priority varchar(255); sqlite> insert into fact_issue values(1, 'bug', 'Grover', 'blocker'); sqlite> insert into fact_issue values(2, 'bug', 'Oscar', 'open');
#2: Design the Mondrian schema (IssueMart.xml). Define 3 dimensions (type, assignee, priority) and 1 measures (basic count).
<?xml version="1.0"?> <Schema name="IssueMart"> <Cube name="Issue"> <Table name="fact_issue"/> <Dimension name="Type"> <Hierarchy hasAll="true" allMemberName="All Types"> <Level name="Type" column="issue_type" uniqueMembers="true"/> </Hierarchy> </Dimension> <Dimension name="Assignee"> <Hierarchy hasAll="true" allMemberName="All Assignees"> <Level name="Assignee" column="assignee" uniqueMembers="true"/> </Hierarchy> </Dimension> <Dimension name="Priority"> <Hierarchy hasAll="true" allMemberName="All Priorities"> <Level name="Priority" column="priority" uniqueMembers="true"/> </Hierarchy> </Dimension> <Measure name="Issue Count" column="id" aggregator="count" formatString="Standard"/> </Cube> </Schema>
#3: Install the Tomcat; verify the http://localhost:8080 up and running.
#4: Download Mondrian; unzip mondrian-version.war to tomcat_home/webapps/ folder.
#5: Get the sqlite jdbc driver: sqlite-jdbc-version.jar; deploy to tomcat_home/webapps/mondrian/WEB-INF/lib
#6: Change the connection string in Mondrian web.xml (tomcat_home/webapps/mondrian/WEB-INF) to point to the sqlite database.
Provider=mondrian;Jdbc=jdbc:sqlite:/C:/sqlite/issue.db;JdbcDrivers=org.sqlite.JDBC;Catalog=/WEB-INF/queries/IssueMart.xml;
#7: Try out the ad hoc query UI: http://localhost:8080/mondrian/adhoc.jsp. It displays the results in tabular form.
SELECT {[Measures].[Issue Count]} ON COLUMNS, CROSSJOIN ({[Assignee].members}, {[Type].members}) ON ROWS FROM [Issue]
One Comment
It is a nice and helpful little bit of facts. Now i am happy you distributed this convenient info around. Be sure to keep us educated this way. Many thanks for giving.