
Monday, May 19, 2008

Meta-Data data driven testing!

If you are involved in test automation, you know the importance, and the challenges, of managing data. In my experience, developers make the worst testers, mainly because they fail to understand the concept of data driven testing. Even good developers, the ones who care to test, hard-code test data in their unit test cases. To me, coming up with a good Data Strategy is more than half of the test automation challenge.

As part of my profession, I teach and train students in SOA lifecycle quality. Over time, I have realized that my basic theory of iteration (refer to my Data Driven Testing (DDT) post below) doesn't really explain all the different types of data that one needs to take into consideration in order to define a good test strategy. It only explains, at a high level, how one can extract randomness and variability from a test and move them into an external data asset.

Modular testing is the key to re-usability and lower maintenance. Apart from creating test libraries and using building blocks, it is also important to understand how to organize the different types of data outside of a test script.

The following diagram illustrates how different types of data can be organized around a test procedure/logic. The different types must not be cobbled together, because that almost always leads to increased maintenance costs and chaos.


  • Removing meta-data from the test logic not only makes the logic more reusable, but also lowers its maintenance cost. A change in meta-data can be handled simply by updating the appropriate properties file. Examples: SQL scripts, tag names, component names. Meta-data has a 1-1 relationship with the test case (see the sketch after this list).
  • Input Test Data is core to the business functionality/scenario. Example: the username and password used to log into a website. A single test case can run against multiple copies of this data, generally referred to as rows inside a dataset.
  • Output Test Data is used for validation. Responses can be validated either with regular expressions or against actual expected data. When generic validations (like a regular expression checking the size of a confirmation id) are not used, output data is mapped 1-1 with the input data.
  • Server Environment Data refers to endpoint data, which varies between environments. The ability to change this data at execution time allows the same set of test cases to be run against multiple environments.
  • Client Environment Data refers to the local environment specific to the test framework and project.
  • Staging Data allows the same test asset to be staged in different ways.
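To make the separation concrete, here is a minimal Java sketch; all file, property, and class names are hypothetical. The meta-data (a UI field name, for instance) lives in a properties file tied 1-1 to the test case, while the input rows form a separate dataset:

    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Properties;

    public class LoginTest {

        public static void main(String[] args) throws IOException {
            // Meta-data: 1-1 with this test case, e.g. a line such as
            //   username.field=txtUserName
            // in a hypothetical login-test.properties file.
            Properties meta = new Properties();
            try (FileInputStream in = new FileInputStream("login-test.properties")) {
                meta.load(in);
            }

            // Input test data: rows of the dataset. Inlined here only for
            // the sketch; in practice this would come from an external file.
            String[][] inputRows = {
                {"alice", "secret1"},
                {"bob",   "secret2"},
            };

            // The test logic stays generic; only the data varies.
            for (String[] row : inputRows) {
                System.out.printf("Typing %s into field %s%n",
                        row[0], meta.getProperty("username.field"));
            }
        }
    }

A change in a tag or component name now touches only the properties file, and new scenarios touch only the dataset; the test logic itself stays untouched.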

Sunday, August 27, 2006

Data Strategy (DS)

A lot of test automation engineers get confused when it comes to Data Driven Testing (DDT):

  • How should tests be structured?
  • What seed data should be used?
  • When and how should data be loaded?
  • When should data be flushed?
  • How should test scripts be bound to test data?
These are some of the very basic questions that must be answered in conjunction with the tooling and test automation strategies. Note: DDT is more than just passing arguments to a test library function.


When and how to load data?
Well, it is important to understand data scope in conjunction with the test case and the automation strategy to address the "WHEN" question. There are several ways to do this.
  • DS1: Consider most of your data to be "seed data" and load it upfront, before actually kicking off your test automation. In this case, data cleanup is done after the test scripts exit. This is the preferred strategy for Load & Performance testing. Data can be loaded directly into the database using SQL scripts, or by running a data suite: a set of test cases that takes care of loading and populating the seed data.
  • DS2: Load data for every test suite and clean up at the end of the suite, making sure the environment is clean for the next suite to run.
  • DS3: Load data for every test case. Every test is independent and takes care of its own data. This strategy is very cumbersome, as data is not leveraged across test cases.
  • DS4: Combo strategy. Load high-level data (which is going to be used most often) upfront, and leave the rest of the data to each test case.
In all of the above data strategies, test cases can use factory classes to get pre-populated data objects. These factory classes can encapsulate the whole data strategy, including the creation and deletion of data.
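As an illustration, here is a minimal Java sketch of such a factory; the class and method names are hypothetical, and the prints stand in for real SQL or data-suite calls:

    import java.util.ArrayList;
    import java.util.List;

    public class TestDataFactory {

        private final List<String> createdUsers = new ArrayList<>();

        // Test cases ask for pre-populated data objects and never
        // touch the load/cleanup mechanics themselves.
        public String createUser(String prefix) {
            String user = prefix + "-" + System.nanoTime();
            // A real factory would INSERT via SQL or run a data suite here.
            createdUsers.add(user);
            return user;
        }

        // Cleanup hook, invoked at test-case or suite boundaries
        // depending on which of DS1-DS4 is in force.
        public void cleanup() {
            for (String user : createdUsers) {
                System.out.println("Deleting " + user);  // stand-in for real deletion
            }
            createdUsers.clear();
        }

        public static void main(String[] args) {
            TestDataFactory factory = new TestDataFactory();
            String user = factory.createUser("load-test");
            System.out.println("Running test with " + user);
            factory.cleanup();  // e.g. DS2: flush at the end of the suite
        }
    }

Because the strategy lives inside the factory, switching from DS3 to DS2 (or DS4) changes the factory, not the hundreds of test cases that call it.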

Note: Remember that in DDT you only need to worry about end-user data. Auto-generated information, like session ids, does not need to be part of the DS, unless you plan to skip some steps. Your automation framework must, however, provide mechanisms to pass auto-generated data across test cases and test suites.
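A minimal sketch of one such mechanism, a shared context object; the class and key names are hypothetical:

    import java.util.HashMap;
    import java.util.Map;

    public class TestContext {

        // Auto-generated values captured by one test case and
        // reused by later ones, instead of living in datasets.
        private static final Map<String, Object> values = new HashMap<>();

        public static void put(String key, Object value) { values.put(key, value); }
        public static Object get(String key)             { return values.get(key); }

        public static void main(String[] args) {
            // Test case 1 logs in and captures a server-generated session id.
            TestContext.put("session.id", "a1b2c3");  // would come from the response
            // Test case 2 reuses it without re-authenticating.
            System.out.println("Reusing session " + TestContext.get("session.id"));
        }
    }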

Final Thought:
Data strategies can get really complex, sometimes even more complex than your actual test cases. Try not to drift too deep into fancy data strategies and lose sight of the real thing!

Data Driven Testing (DDT)

We have all heard about data driven testing, how it can dramatically improve our ROI while reducing maintenance costs. We all believe in it, but we still don't practice it to the extent we should. This post describes the concepts using a pictorial notation, in an effort to bring forth the structural thinking that one needs to implement DDT.

A picture is worth a thousand words:



Step 1:

You start your automation activity with some test procedures in hand (refer to column 1). A test step is represented by a donut, hexagon, triangle, etc. These building blocks combine to form a structure that depicts a test case, and the different shapes and combinations reflect the various test cases in the test suite. As you can see, a test case can have many different steps as part of its procedure.

Step 2:

In the next step, you start automating one test case at a time. When you encounter a second test case that contains a step similar to one you automated before, you move that particular step into what we call a "Test Library". Over time, you must filter all common functionality out into test libraries. Refer to column 2, which shows how you can take the common shapes and make a library out of them.
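A minimal Java sketch of what such a library step might look like; the names are illustrative, and the print stands in for driving a real UI or API:

    public class TestLibrary {

        // A step that appears in several test cases, such as logging in,
        // moves into the library so each test calls it instead of
        // re-implementing it.
        public static void login(String user, String password) {
            System.out.printf("Logging in as %s%n", user);
            // ... drive the UI or API here ...
        }

        public static void main(String[] args) {
            // Two different test cases reusing the same library step.
            TestLibrary.login("alice", "secret1");  // test case A
            TestLibrary.login("bob",   "secret2");  // test case B
        }
    }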

Step 3:

As part of Step 3, you separate variability from static functionality. This is the most important step in DDT. Most of the time the variability is the data input, but your system may contain other variability that you will also want to identify and manage as data. The more static functionality you can identify, the more you will be able to leverage your automated scripts across datasets. Refer to column 3, where the colors depict data. In this step you also make your test libraries data driven.
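As an illustration, here is a minimal Java sketch of this step: the test logic is static, and every run is driven by a row from the dataset. The inlined rows stand in for an external CSV/XLS file, and all names are hypothetical:

    import java.util.List;

    public class CheckoutTest {

        // One row of the dataset: all the variability of this test case.
        record Row(String item, int quantity, String expectedTotal) {}

        // Static test logic, driven entirely by the row's data.
        static void runCheckout(Row row) {
            System.out.printf("Buying %d x %s, expecting total %s%n",
                    row.quantity(), row.item(), row.expectedTotal());
        }

        public static void main(String[] args) {
            List<Row> dataset = List.of(
                new Row("book",   1, "$10.00"),
                new Row("laptop", 2, "$1998.00"));
            dataset.forEach(CheckoutTest::runCheckout);
        }
    }

Adding a new scenario now means adding a row, not writing a new script.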

Step 4:

Once you separate the variability from the core (static) functionality, you will be able to identify opportunities for more test libraries, or to leverage existing ones. Go through this iteration to make sure you are using your test libraries to the fullest, and convert more common functionality into libraries if required. Refer to column 4.

The more you think in terms of DDT and leveraging libraries, the easier it gets to automate more in less time. Apart from achieving higher productivity, you will find that your automated test cases are a lot easier to maintain. Whenever some common functionality changes, you just update the corresponding test library; imagine having to update hundreds of test cases if you were not using libraries. Similarly, when data requirements change, you will not have to change your automated scripts: just update your datasets and you are done.

Final Thought:
Tools only provide the mechanisms to accomplish DDT and create test libraries; if we choose not to practice these concepts, we end up creating a large body of automation code that is hard to both maintain and sustain.