12 Jul 2012

Quick Introduction to Exalytics

Well, with this being dubbed the next-generation product, let's quickly see what it is...


What is Exalytics?
The Oracle Exalytics In-Memory Machine is an “engineered in-memory analytics machine” that delivers no-limit, extreme performance for Business Intelligence and Enterprise Performance Management applications.

What is an Engineered System?
Hardware and software designed to work together.
  • Best-in-class enterprise BI platform, in-memory analytics software, and hardware optimized to work together
  • Advanced data visualization and exploration to quickly provide actionable insight from large amounts of data
  • Fastest solution for business intelligence, modeling, forecasting, and planning applications
  • Access to all Oracle and non-Oracle enterprise data sources
  • No limits on users, data, or applications
What is an In-Memory Database?
It is a memory-optimized relational database that gives applications responsiveness and high throughput (speed and performance).

Oracle “TimesTen” In-Memory Database and “IMDB Cache” deliver real-time performance by changing the assumptions about where data resides at run time. By managing data in memory and optimizing data structures and access algorithms accordingly, database operations execute with maximum efficiency, achieving dramatic gains in responsiveness and throughput, even compared with a fully cached, disk-based relational database management system (RDBMS). Following the standard relational data model, you can use SQL, JDBC, ODBC, PL/SQL and Oracle Call Interface (OCI) to access TimesTen and IMDB Cache databases.

In a conventional disk-based RDBMS, client applications communicate with a database server process over some type of IPC connection, which adds performance overhead to all SQL operations. An application can link TimesTen directly into its address space to eliminate the IPC overhead and streamline query processing. This is accomplished through a direct connection to TimesTen. Traditional client/server access is also supported for functions such as reporting, or when a large number of application-tier platforms must share access to a common in-memory database.
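The direct-connect idea — linking the database engine into the application's address space instead of crossing an IPC boundary — can be illustrated with SQLite, which (like TimesTen in direct mode) runs in-process. This is only an analogy to show the mechanism, not TimesTen code; the table and data are made up:

```python
import sqlite3

# sqlite3 runs the database engine inside this process: every query is a
# library call, not a round trip over IPC to a separate server process.
conn = sqlite3.connect(":memory:")  # in-memory database, no server, no disk
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("East", 100.0), ("West", 250.0), ("East", 50.0)])

# The SQL below executes entirely within the application's address space.
total_by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
print(total_by_region)  # {'East': 150.0, 'West': 250.0}
```

Traditional client/server databases pay the IPC cost on every statement; an in-process engine turns each statement into a function call, which is exactly the overhead TimesTen's direct connection eliminates.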

 



Exalytics Hardware Architecture
Exalytics hardware is delivered in units of a single 3RU rack-mountable server that is optimally configured for in-memory analytics for business intelligence workloads. Multiple Oracle Exalytics machines can be clustered together to expand available memory capacity and to provide high availability. Oracle Exalytics includes powerful compute capacity, abundant memory, fast networking options, and direct attached storage options.

All network interfaces support failover and can be used to set up a cluster without a single point of failure. Oracle Exalytics also includes redundant hot-swappable power supplies and fans.

It includes a high-performance direct attached storage system with a high-performance RAID HBA and 3.6 TB of raw disk capacity. Optionally, clusters of these machines can leverage network attached storage for storing shared metadata and configuration data.

It includes a complete server management infrastructure with Oracle Integrated Lights Out Management (ILOM). Oracle ILOM provides complete management and administration of the server hardware via remote (SSL, HTTPS) and serial connectivity.

Exalytics Software Overview
Exalytics runs the Oracle Business Intelligence Foundation along with Oracle TimesTen In-Memory Database. Both have been specifically enhanced to work together and have been optimized to provide exclusive features on Exalytics hardware.

a. Oracle Business Intelligence Foundation

The Oracle BI Foundation provides comprehensive and complete capabilities for business intelligence, including enterprise reporting, dashboards, ad hoc analysis, multi-dimensional OLAP, scorecards, and predictive analytics on an integrated platform.
b. Oracle TimesTen In-Memory Database for Exalytics

TimesTen stores all its data in memory optimized data structures and supports query algorithms specifically designed for in-memory processing. Using the familiar SQL programming interfaces, TimesTen provides real-time data management that delivers blazing-fast response times, and very high throughput for a variety of workloads.
c. Columnar Compression

TimesTen In-Memory Database for Exalytics supports columnar compression that reduces the memory footprint for in-memory data. Compression ratios of 5X are practical and help expand in-memory capacity. Analytic algorithms are designed to operate directly on compressed data, thus further speeding up the in-memory analytics queries.
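TimesTen's actual compression scheme isn't detailed here, but dictionary encoding — the classic columnar technique — gives a feel for why repetitive analytic columns compress so well and why queries can run on the compressed form. This is a generic sketch, not TimesTen internals:

```python
# Dictionary-encode a column: store each distinct value once, and replace
# the column with small integer codes pointing into that dictionary.
def dictionary_encode(column):
    dictionary = []  # distinct values, in first-seen order
    codes = []       # one small integer code per row
    index = {}       # value -> code lookup
    for value in column:
        if value not in index:
            index[value] = len(dictionary)
            dictionary.append(value)
        codes.append(index[value])
    return dictionary, codes

def dictionary_decode(dictionary, codes):
    return [dictionary[c] for c in codes]

# A low-cardinality column typical of analytics (regions, statuses, ...).
region = ["East", "West", "East", "East", "West", "North", "East"]
dictionary, codes = dictionary_encode(region)
print(dictionary)  # ['East', 'West', 'North']
print(codes)       # [0, 1, 0, 0, 1, 2, 0]

# A filter like "region = 'East'" can be evaluated directly on the codes,
# comparing one small integer per row instead of a string.
east_count = codes.count(dictionary.index("East"))
print(east_count)  # 4
```

The wider the table and the more repetitive the column values, the larger the savings — which is how 5X ratios become practical on analytic data.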
d. Oracle Essbase

Essbase on Exalytics has a number of optimizations for in-memory operation including improvements to overall storage layer performance, enhancements to parallel operations, enhanced MDX syntax and a high performance MDX query engine. Essbase on Exalytics provides up to 16X faster query execution as well as up to 6X reduction in write-back and calculation operations, including batch processes.

e. Clustering

The Exalytics In-Memory Machine also supports clustering to provide scalability and high availability. It supports both active-active and active-passive configurations. A cluster can also be configured to pool the available memory resources to accommodate larger data sets in memory.

f. In-Memory Analytics

Exalytics includes two in-memory analytics engines that provide the analytics capability: Oracle TimesTen In-Memory Database for Exalytics, and Oracle Essbase with in-memory optimizations for Exalytics.

4 Apr 2012

ETL vs ELT


ETL and ELT are two different approaches to data warehouse design and implementation. There are three stages in the process of building a data warehouse that are commonly accepted by the industry. However, the order in which these stages are performed can have a fundamental effect on the meaning of these processes and also on the overall outcome of the BI solution.
  • Extract – the process by which data is extracted from the data source
  • Transform – the transformation of the source data into a format relevant to the solution
  • Load – the loading of data into the warehouse

What is the difference between ETL and ELT?
In ETL the transformations are processed by the ETL tool, while in ELT the transformations are processed by the target data source (RDBMS). So the important thing to remember is on which side the transformations are processed.

ETL – Extract, Transform and Load
This is the traditional approach. Data is “Extracted” from the data sources using a data extraction tool. It is then “Transformed” using a series of transformation routines. This transformation process is largely dictated by the data format of the output. Data quality and integrity checking is performed as part of the transformation process, and corrective actions are built into the process. Transformations and integrity checking are performed in the data staging area. Finally, once the data is in the target format, it is then “Loaded” into the data warehouse ready for presentation.  

It is made up of software that transforms data (with its own in-built routines and functions) and migrates it on most platforms, with or without source and target databases.
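As a toy illustration of the ETL order of operations (the table, names, and data are all made up, with SQLite standing in for the warehouse), the transformation here happens in the integration tool's own runtime — plain Python — before anything touches the target database:

```python
import sqlite3

# Extract: pull raw rows from the source system (a hard-coded list stands in
# for a source query here).
raw_rows = [("  alice ", "2012-01-15", "100"), ("BOB", "2012-01-16", "250")]

# Transform: cleanse and reshape in the ETL tool's own engine, before the
# data ever reaches the warehouse.
transformed = [(name.strip().title(), sale_date, int(amount))
               for name, sale_date, amount in raw_rows]

# Load: write the already-conformed rows into the target warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE fact_sales (customer TEXT, sale_date TEXT, amount INTEGER)")
warehouse.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)", transformed)

result = warehouse.execute(
    "SELECT customer, amount FROM fact_sales ORDER BY customer").fetchall()
print(result)  # [('Alice', 100), ('Bob', 250)]
```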

ELT – Extract, Load and Transform
Data is “extracted” from the data sources into the “Staging Database” using a data extraction tool. Whilst in the staging area, integrity and business rules checks can be applied, and relevant corrections can be made. The source data is then loaded into the warehouse. In effect this provides a validated and cleaned offline copy of the source data in the data warehouse. Once in the warehouse, transformations are performed to re-shape the data into its target output format.  It consists of software that transforms and migrates data in a database engine, often by generating SQL statements and procedures and moving data between tables.
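The same toy data in ELT order: the raw rows are landed first, and the transformation is pushed down to the target database engine as generated SQL. Again, SQLite and the staging/target table names are invented stand-ins, not any vendor's implementation:

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Extract + Load: land the raw, untransformed rows in a staging table first.
db.execute("CREATE TABLE stg_sales (customer TEXT, sale_date TEXT, amount TEXT)")
db.executemany("INSERT INTO stg_sales VALUES (?, ?, ?)",
               [("  alice ", "2012-01-15", "100"),
                ("BOB", "2012-01-16", "250")])

# Transform: the target database engine does the work, via generated SQL
# that reshapes staging rows into the warehouse table.
db.execute("""
    CREATE TABLE fact_sales AS
    SELECT UPPER(TRIM(customer))   AS customer,
           sale_date,
           CAST(amount AS INTEGER) AS amount
    FROM stg_sales
""")

result = db.execute(
    "SELECT customer, amount FROM fact_sales ORDER BY customer").fetchall()
print(result)  # [('ALICE', 100), ('BOB', 250)]
```

Note that the cleansing logic lives entirely in SQL here, which is why ELT performance scales with the database engine rather than with the integration server.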

A key appeal of ELT is loading performance, an advantage it holds because it operates at the infrastructure level, using SQL and related procedural features of the relational database. In contrast, ETL works at the integration-server level, using dedicated graphical application development facilities, including transformation rules.



Some big names in the domain:

Oracle has been firmly on the side of ELT and is upping the ante with appliances like Oracle/Sun Exadata. Oracle Data Integrator Enterprise Edition (ODI-EE) is a unified solution comprising Oracle Warehouse Builder and Oracle Data Integrator (ODI). 

Informatica, the pure-play data integrator and a traditional ETL player, now offers the option of deploying the transformation logic to either the Informatica engine (ETL) or the DBMS (ELT), or a hybrid (ETLT).

IBM’s InfoSphere DataStage is an ETL tool that can also be deployed in an ETLT style.

3 Apr 2012

Multiple RPD on single BI Server


Now let's assume that the two RPDs, i.e. deba.rpd and test.rpd, and their respective catalogs both work properly when loaded individually and run independently.
Configuring NQSConfig.INI:
Let's configure the NQSConfig.INI file as below:
[ REPOSITORY ]
RPD1 = deba.rpd, DEFAULT;
RPD2 = test.rpd;
Configuring the DSN:
Create two Oracle BI Server system DSNs:
DSN1, pointing to the default repository deba.rpd, and
DSN2, pointing to test.rpd.
Since we are configuring two RPDs in parallel on a single BI Server instance, both DSNs point to server Local with the default port 9703. Port 9703 is the ODBC communication port between the BI Server and Presentation Services.
N.B.: whatever repository is put under the “Change the default repository” section will be overridden by the DEFAULT setting in NQSConfig.INI. You can tick “Connect to Oracle BI Server to obtain default settings”, so that the RPD connection is established as per the definition in the .ini file after a BI Server restart.
Configuring the Catalog and instanceconfig.xml:
For both deba.rpd and test.rpd we have catalog folders under ../$OracleBIDataHome/web/catalog/ named “deba” and “test”, each holding a different set of reports so that they can be distinguished easily. So let's bring catalog “deba” online after making the below changes in instanceconfig.xml.
<WebConfig>
  <ServerInstance>
    <DSN>DSN1</DSN>
    <CatalogPath>D:/OracleBIData/web/catalog/deba</CatalogPath>
  </ServerInstance>
</WebConfig>
Restart the services and log in to the “deba” catalog; you can verify that queries run against deba.rpd and that the dashboards display the reports relevant to catalog “deba”.
Edit the XML to make the same kind of change pointing to DSN2 and catalog “test”. It should work as well, now pointing to the “test” repository.

22 Feb 2012

OBIEE - How to Set a Temporary Query Logging Level?

You might want to diagnose performance or data issues by setting a temporary log level for a query. You can enable query logging for a select statement by adding a prefix clause in the Advanced SQL Clauses section of the Advanced tab in Oracle BI Presentation Services.
For example, for the select statement:
SELECT a, b, c FROM dim1, fact1;
You can specify the logging level of 5 in the Prefix field as follows:
SET VARIABLE LOGLEVEL=5;
For this query, the logging level of 5 is used regardless of the value of the underlying LOGLEVEL variable.

OBIEE - What is Log File Rotation?

Log file rotation is the creation of new log files when an existing file exceeds a specified size or age threshold.
Take the MaximumFileSizeKb setting for the component log configuration file for the Oracle BI Scheduler as an example. Whenever a log file exceeds the size that is specified by this setting, then the existing Scheduler log file is renamed, and a new log file is created.
  • nqscheduler.log — the latest log file
  • nqscheduler-<n>.log — the renamed previous log file, where <n> is a date and timestamp, for example nqscheduler-20100909-2135.log
Additionally, any log file older than the MaximumLogAgeDay setting is deleted.
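OBIEE's rotation is configured in the component log configuration file, but the same size-threshold-plus-renamed-backups pattern exists in many runtimes. Here is the Python standard-library equivalent, shown purely to illustrate the mechanism (file names and sizes are arbitrary):

```python
import logging
import logging.handlers
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "scheduler.log")

# Rotate whenever a write would push the file past maxBytes (analogous to
# MaximumFileSizeKb); keep at most 3 renamed backups, dropping the oldest
# (analogous to the age-based cleanup).
handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=1024, backupCount=3)
logger = logging.getLogger("scheduler-demo")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

for i in range(100):
    logger.info("job %d finished with a reasonably long status message", i)

handler.close()
# The directory now holds scheduler.log plus renamed backups such as
# scheduler.log.1, scheduler.log.2, ...
print(sorted(os.listdir(os.path.dirname(log_path))))
```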

13 Feb 2012

OBIEE - Function Shipping

Each database has its own set of built-in features and limits. The BI Server makes use of these built-in database functions wherever it can, since it is usually more efficient to process data in the database. This process is aptly called function shipping. If a required function is not supported by the data source, the BI Server performs the operation itself.

In the BI Admin Tool, double-click the physical data source to view its properties; on the Features tab you can see the built-in functions available in the data source. The BI Server decides at run time which functions to ship, based on the settings here.




The default settings (check boxes) are loaded from an INI file, based on the data source and connection pool settings.

The INI file, which contains the default settings for database features, is at <Oracle_Instance>\config\OracleBIServerComponent\coreapplication\DBFeatures.INI

If you must modify the DBFeatures.ini file, use extreme caution because setting features that the underlying databases do not support can cause errors and unexpected results to be returned.
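The BI Server's internal decision logic isn't exposed, but the idea — ship the function to the database when the Features tab says it's supported, otherwise compute it in the middle tier — can be sketched generically. SQLite stands in for the data source here, and the `db_features` flags are invented stand-ins for the DBFeatures.INI check boxes:

```python
import sqlite3
import statistics

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact (amount REAL)")
conn.executemany("INSERT INTO fact VALUES (?)", [(10.0,), (20.0,), (60.0,)])

# Invented stand-in for the feature check boxes: which aggregate functions
# the data source is declared to support.
db_features = {"AVG": True, "STDEV": False}

def aggregate(func_name):
    if db_features.get(func_name):
        # Function-ship: let the database compute the aggregate.
        return conn.execute(
            "SELECT {}(amount) FROM fact".format(func_name)).fetchone()[0]
    # Fallback: pull the raw rows back and compute in the middle tier,
    # the way the BI Server does when a function is not supported.
    rows = [r[0] for r in conn.execute("SELECT amount FROM fact")]
    if func_name == "STDEV":
        return statistics.stdev(rows)
    raise NotImplementedError(func_name)

print(aggregate("AVG"))    # 30.0, computed by the database
print(aggregate("STDEV"))  # computed locally, since its flag is off
```

The fallback path works, but it drags every row across the wire first — which is exactly why mis-setting these flags (in either direction) hurts performance or correctness.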
Regards, 
- Akash

9 Feb 2012

Outer Joins - Left, Right and Full - Practical Example

Outer Join:
An outer join does not require each record in the two joined tables to have a matching record. The joined table retains each record—even if no other matching record exists. Outer joins subdivide further into left outer joins, right outer joins, and full outer joins, depending on which table(s) one retains the rows from (left, right, or both). In this case left and right refer to the two sides of the JOIN keyword.
SQL> Select * from A;
emp_id  emp_name    dpt_id
------  ----------  ------
1001    Saran       20
1002    Jason       20
1003    Brian       20
1004    Paul        10
1012    Mary        40
1007    Steve       30
1008    Jim         20
1009    Mark        20
1010    Yihong      20
1013    Melanie     30
1011    Greathouse  50
1014    Jie         50
1015    Raj         50



SQL> Select * from B;
dpt_id  dpt_name
------  -----------
10      Management
20      Development
30      Usability
40      HR
60      Sales

Left Outer Join:
A left outer join returns all the rows from the left table, plus the matched values from the right table (or NULL where there is no matching join predicate).
SQL> 
select a.EMP_ID, a.EMP_NAME, a.DPT_ID, b.Dpt_name from a 
LEFT OUTER JOIN b on a.DPT_ID = B.DPT_ID 
order by a.EMP_ID;


emp_id  emp_name    dpt_id  dpt_name
------  ----------  ------  -----------
1001    Saran       20      Development
1002    Jason       20      Development
1003    Brian       20      Development
1004    Paul        10      Management
1007    Steve       30      Usability
1008    Jim         20      Development
1009    Mark        20      Development
1010    Yihong      20      Development
1011    Greathouse  50      NULL
1012    Mary        40      HR
1013    Melanie     30      Usability
1014    Jie         50      NULL
1015    Raj         50      NULL

Right Outer Join:
A right outer join returns all the rows from the right table, plus the matched values from the left table (or NULL where there is no matching join predicate).
SQL> 
select a.EMP_ID, a.EMP_NAME, a.DPT_ID, b.Dpt_name from a 
RIGHT OUTER JOIN b on a.DPT_ID = B.DPT_ID 
order by a.EMP_ID;


emp_id  emp_name    dpt_id  dpt_name
------  ----------  ------  -----------
1001    Saran       20      Development
1002    Jason       20      Development
1003    Brian       20      Development
1004    Paul        10      Management
1007    Steve       30      Usability
1008    Jim         20      Development
1009    Mark        20      Development
1010    Yihong      20      Development
1012    Mary        40      HR
1013    Melanie     30      Usability
NULL    NULL        NULL    Sales


Full Outer Join:
A full outer join combines the effects of applying both the left and right outer joins. Where rows in the joined tables do not match, the result set has NULL values for every column of the table that lacks a matching row. For rows that do match, a single row is produced in the result set.
SQL> 
select a.EMP_ID, a.EMP_NAME, a.DPT_ID, b.Dpt_name from a 
FULL OUTER JOIN b on a.DPT_ID = B.DPT_ID 
order by a.EMP_ID;

emp_id  emp_name    dpt_id  dpt_name
------  ----------  ------  -----------
1001    Saran       20      Development
1002    Jason       20      Development
1003    Brian       20      Development
1004    Paul        10      Management
1007    Steve       30      Usability
1008    Jim         20      Development
1009    Mark        20      Development
1010    Yihong      20      Development
1011    Greathouse  50      NULL
1012    Mary        40      HR
1013    Melanie     30      Usability
1014    Jie         50      NULL
1015    Raj         50      NULL
NULL    NULL        NULL    Sales
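To see the NULL-filling behaviour outside of SQL, the outer joins above can be emulated in a few lines of Python over a subset of the A and B rows (NULLs become None):

```python
# A small subset of the rows from tables A and B above.
A = [(1001, "Saran", 20), (1004, "Paul", 10), (1011, "Greathouse", 50)]
B = [(10, "Management"), (20, "Development"), (60, "Sales")]

def left_outer_join(emps, depts):
    # Every employee row appears; an unmatched dpt_name becomes None (NULL).
    lookup = dict(depts)
    return [(eid, name, did, lookup.get(did)) for eid, name, did in emps]

def full_outer_join(emps, depts):
    rows = left_outer_join(emps, depts)
    matched = {did for _, _, did in emps}
    # Add department rows that matched no employee, None-filling the left side.
    rows += [(None, None, None, dname)
             for did, dname in depts if did not in matched]
    return rows

print(left_outer_join(A, B))
# Greathouse (dept 50) appears with dpt_name None, as in the SQL result.
print(full_outer_join(A, B))
# Additionally, Sales (dept 60) appears with the employee columns None.
```

A right outer join is simply the left outer join with the table roles swapped, which is why the SQL results above mirror each other.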

Regards,
- Akash