Extract transformation and load testing (ETL)
HAPPY TESTING...!
What is ETL Testing?
ETL testing is done to ensure that the data that has been loaded from a source to the destination after business transformation is accurate. It also involves the verification of data at various middle stages that are being used between source and destination. ETL stands for Extract-Transform-Load.ETL Testing Process
Similar to other Testing Process, ETL also go through different phases. The different phases of ETL testing process is as followsETL testing is performed in five stages
- Identifying data sources and requirements
- Data acquisition
- Implement business logics and dimensional Modelling
- Build and populate data
- Build Reports
Types of ETL Testing
How to create ETL Test Case
ETL testing is a concept which can be applied to different tools and databases in information management industry. The objective of ETL testing is to assure that the data that has been loaded from a source to destination after business transformation is accurate. It also involves the verification of data at various middle stages that are being used between source and destination.While performing ETL testing, two documents that will always be used by an ETL tester are
- ETL mapping sheets :An ETL mapping sheets contain all the information of source and destination tables including each and every column and their look-up in reference tables. An ETL testers need to be comfortable with queries as ETL testing may involve writing big queries with multiple joins to validate data at any stage of ETL. ETL mapping sheets provide a significant help while writing queries for data verification.
- DB Schema of Source, Target: It should be kept handy to verify any detail in mapping sheets.
ETL Test Scenarios and Test Cases
| Test Scenario | Test Cases |
|---|---|
| Mapping doc validation | Verify mapping doc whether corresponding ETL information is provided or not. Change log should maintain in every mapping doc. |
| Validation |
|
| Constraint Validation | Ensure the constraints are defined for specific table as expected |
| Data consistency issues |
|
| Completeness Issues |
|
| Correctness Issues |
|
| Transformation | Transformation |
| Data Quality |
|
| Null Validate | Verify the null values, where “Not Null” specified for a specific column. |
| Duplicate Check |
|
| Date Validation | Date values are using many areas in ETL development for
|
| Complete Data Validation |
|
| Data Cleanness | Unnecessary columns should be deleted before loading into the staging area. |
Types of ETL Bugs
| Type of Bugs | Description |
|---|---|
| User interface bugs/cosmetic bugs |
|
| Boundary Value Analysis (BVA) related bug |
|
| Equivalence Class Partitioning (ECP) related bug |
|
| Input/Output bugs |
|
| Calculation bugs |
|
| Load Condition bugs |
|
| Race Condition bugs |
|
| Version control bugs |
|
| H/W bugs |
|
| Help Source bugs |
|
Difference between Database testing and ETL testing
| ETL Testing | Data Base Testing |
|---|---|
| Verifies whether data is moved as expected | The primary goal is to check if the data is following the rules/ standards defined in the Data Model |
| Verifies whether counts in the source and target are matching Verifies whether the data transformed is as per expectation | Verify that there are no orphan records and foreign-primary key relations are maintained |
| Verifies that the foreign primary key relations are preserved during the ETL | Verifies that there are no redundant tables and database is optimally normalized |
| Verifies for duplication in loaded data | Verify if data is missing in columns where required |
Responsibilities of an ETL tester
Key responsibilities of an ETL tester are segregated into three categories- Stage table/ SFS or MFS
- Business transformation logic applied
- Target table loading from stage file or table after applying a transformation.
- Test ETL software
- Test components of ETL datawarehouse
- Execute backend data-driven test
- Create, design and execute test cases, test plans and test harness
- Identify the problem and provide solutions for potential issues
- Approve requirements and design specifications
- Data transfers and Test flat file
- Writing queries3 for various scenarios like count test
ETL Performance Testing and Tuning
ETL performance testing is a confirmation test to ensure that an ETL system can handle the load of multiple users and transactions. The goal of performance tuning is to optimize session performance by eliminating performance bottlenecks. To tune or improve the performance of the session, you have to identify performance bottlenecks and eliminate it. Performance bottlenecks can be found in source and target databases, the mapping, the session and the system. One of the best tools used for performance testing is Informatica.Automation of ETL Testing
The general methodology of ETL testing is to use scripting or do “eyeballing” of data.. These approaches to ETL testing are time-consuming, error-prone and seldom provide complete test coverage. To accelerate, improve coverage, reduce costs, improve defect detection ration of ETL testing in production and development environments, automation is the need of the hour. One such tool is Informatica.Best Practices for ETL Testing
- Make sure data is transformed correctly
- Without any data loss and truncation projected data should be loaded into the data warehouse
- Ensure that ETL application appropriately rejects and replaces with default values and reports invalid data
- Need to ensure that the data loaded in data warehouse within prescribed and expected time frames to confirm scalability and performance
- All methods should have appropriate unit tests regardless of visibility
- To measure their effectiveness all unit tests should use appropriate coverage techniques
- Strive for one assertion per test case
- Create unit tests that target exceptions
HAPPY TESTING ...!
Comments
Post a Comment