ETL and ELT are two different approaches in Data warehouse design and implementations. There are three stages in the process of building a data warehouse commonly accepted by the industry. However, the order in which these stages are performed can have a fundamental effect on the meaning of these processes and also the overall outcome of the BI solution. Extract – The process by which data is extracted from the data source Transform – The transformation of the source data into a format relevant to the solution Load – The loading of data into the warehouse.
what is the difference between ETL and ELT ?
In ETL the transformations are processed by the ETL tools while in ELT the transformations are processed by the target data sources (RDBMS). So the important thing to remember is the side where the transformations are processed.
ETL – Extract, Transform and Load
This is the traditional approach. Data is “Extracted” from the data sources using a data extraction tool. It is then “Transformed” using a series of transformation routines. This transformation process is largely dictated by the data format of the output. Data quality and integrity checking is performed as part of the transformation process, and corrective actions are built into the process. Transformations and integrity checking are performed in the data staging area. Finally, once the data is in the target format, it is then “Loaded” into the data warehouse ready for presentation.
It is made up of software that transforms (with its own inbuilt routines and functions) and migrates data on most platforms with or without source and target databases
ELT – Extract, Load and Transform
Data is “extracted” from the data sources into the “Staging Database” using a data extraction tool. Whilst in the staging area, integrity and business rules checks can be applied, and relevant corrections can be made. The source data is then loaded into the warehouse. In effect this provides a validated and cleaned offline copy of the source data in the data warehouse. Once in the warehouse, transformations are performed to re-shape the data into its target output format. It consists of software that transforms and migrates data in a database engine, often by generating SQL statements and procedures and moving data between tables.
A key appeal of ELT is loading performance, an advantage it holds because it operates at the infrastructure level, using SQL and related procedural features of the relational database). In contrast, ETL works at the integration server level, using dedicated graphical application development facilities including transformation rules.
Some Big names in the domain:
Oracle has been firmly on the side of ELT and is upping the ante with appliances like Oracle/Sun Exadata. Oracle Data Integrator Enterprise Edition (ODI-EE) is a unified solution comprising Oracle Warehouse Builder and Oracle Data Integrator (ODI).
Informatica, the pure-play data integrator and a traditional ETL player, now offers the option of deploying the transformation logic to either the Informatica engine (ETL) or the DBMS (ELT), or a hybrid (ETLT).
IBM’s InfoSphere DataStage is an ETL tool that can also be deployed in an ETLT style