Thursday, October 20, 2016

Syncing QA data with Production

Applies to: xSQL Data Compare v7.0.0 and higher

WARNING: this article focuses on the data transfer / synchronization task and does not address operation security issues and sensitive data scrubbing / obfuscation which are critical whenever dealing with live, production data.

Nowadays, Quality Assurance (QA) is a very important part of the development process for any company that strives to offer a reliable product, to satisfy its clients, and be competitive in the market. And let’s not forget the “side benefit” of reduction of life cycle costs.

One of the main aspects of QA are the data and databases it uses. These data are usually kept in a separate environment from the production and, to have the best possible quality assurance process, data from the production environment are copied to the QA environment. For this task there is a choice between two main options:
  • Backing up and restoring the live database. Although this might work with small databases, if we are dealing with large databases with many tables and millions of rows, it becomes a very expensive operation because the database will have to be recreated each time the synchronization is performed. Also, if the databases are in a cloud environment like Azure, which does not support restoring a backup, this option is automatically eliminated. As a final point, in those cases where backup – restore is a viable option, there is always the issue of automating the synchronization process, which in the case of backup - restore operations is problematic at best. 
  • Using comparison tools like our xSQL Data Compare. For any large databases, whose data change rapidly, this is probably the best option. And that’s because the comparison and synchronization process is highly customizable and easy to automate.
     There are two cases that show in detail how xSQL Data Compare’s features can be used in the QA synchronization process.

  1. Suppose you have a production database with a table which currently has 5 million rows and needs to be synchronized with the QA Database. One way to go about this is to back up the live database and restore it in QA. The problem here is very easy to identify. Unless this is the first synchronization, it’s very improbable that all 5 million rows of the table in the Production database will have differences from the table in QA.
    For argument’s sake, let’s say that there are 100,000 records out of sync. If one was to use the backup – restore option, 5 million new rows would be inserted in the table in the QA database. So there’s 4,900,000 unnecessary INSERT operations and the server will be doing 50 times the amount of work it actually needs to do. By any standards, this is unacceptable.
    The process can be made much more efficient by using xSQL Data Compare, because after the comparison, xSQL Data Compare generates a synchronization script
    only for the rows that are out of sync. This is a big improvement already, but it can be made even better. Since the synchronization process is, in most cases, performed periodically, every week for example, than you already know that the only rows that are out of sync are the ones added or modified in the week prior to the synchronization. So there is no need to compare all 5 million rows. Just the ones that are out of sync.
    You can do this by using the
    where clause of xSQL Data Compare in which you can enter conditions in the same way you would enter them in a SQL Query. Below is an example in which the where condition is specified as ModifiedDate >= DATEADD(DAY, -7,GETDATE()). This will compare only the records which were modified in the last week. To open the dialog shown in the picture click the button selected in red.

  2. As I said in the first case, synchronization of the QA environment is usually a periodic and very repetitive process, so automating it would save a lot of time for DBAs or people responsible for this task. Data Compare addresses this issue with its command line version. The comparison from the first case can also be done from xSQL Data Compare command line and scheduled to be run periodically with Windows Task Scheduler. The same options that were specified in the UI can be specified in an XML file that will be given as a parameter to xSQLDataCmd.exe. Below is the example XML configuration to perform the same comparison as in the first case.
<?xml version="1.0" encoding="utf-8"?>
<SqlDataCompareCmdConfig xmlns="">
  <!-- **** auto-generated data compare cmd line config file, created on 10/20/2016 4:14:43 PM **** -->
  <TablesMappingRules />
    <!-- ******* custom pairs ******* -->
    <!-- Sales.SalesOrderHeader -->
    <Pair Action="Include">
      <LeftObject Name="Sales.SalesOrderHeader" />
      <RightObject Name="Sales.SalesOrderHeader" />
      <WhereClause>ModifiedDate >= DATEADD(DAY, -7,GETDATE())</WhereClause>
    <!-- **************************** -->

Now, all that needs to be done is for the task to be scheduled in windows task manager and Data Compare will synchronize all the changes made in the last week in the production database. And just so you don’t have to manually write the XML file, xSQL Data Compare can generate it for you from the UI by clicking this button.
If, for any reason, the schema in your production database has changed and is out of sync with the QA database, use Schema Compare for SQL Server to sync the schemas first and then sync the data.

In conclusion, apart from the scenario described in this article there are a ton of other synchronization scenarios which, by using xSQL Data Compare, can be customized to be very efficient, and have the QA environment at your disposal in a very short time. For a full reference of the available customizations check out xSQL Data Compare’s onlinedocumentation.


Post a Comment