This is a rethink of my previous data retention strategies with some application deployment considerations thrown in.
This approach should be ok with most DBAs as using DMXes to retain data may be too foreign a concept for most managed operations. The deployment will not be automated using tools and procedures that are not developed by the deployment team so they can reuse what they have been doing in other applications and not have something different for your application.
The approach should work on any application, but Curam applications would benefit from the approach anyway.
Objectives
The following are the objectives I am trying to achieve in my approach.
- Data is retained from one build to the next.
- Support changes in "Reference Data" loads.
- Application Deployment should be performed using standard methods and not some specialized batch file file not coming from the deployment team.
- Infinitely repeatable in terms of number of environment supported.
Requirements
- You require at least three deployment environments to test on.
- You have a tool that would compare two different databases for both structure and data.
Environments
Environment 0 - This environment is where your developers perform their Integration Test. This is the first shared environment that the application would get deployed to. Data does not get retained in this environment. Automatic application deployment and database resets are performed only to this level. In Curam speak this is the only place where DMX files are used.
Environment 1 - This environment is where builds that are defined as stable and you have your test team performing their tests on. Data is retained in this environment.
Environment 2 - This can be your final environment or your user acceptance environment. Obviously data better be retained in this environment. Applications and database changes fully tested on Environment 1 are the only ones moved over to this environment.
Environment n - As this approach is infinitely repeatable, we can take as many environments each taking application and database changes from Environment n - 1.
Approach
The first thing you need to perform is set a base line. This tasks can be performed when at least two environments are put in place. Before deploying the code base from environment 0 to environment 1 for the first time, the environment 0 data is reset to only contain the reference data. The DBA then makes a backup or snapshot of the database state at that point - we call the backup Database Snapshot Build 0. The deployer also takes the EAR of the current application to prepare it for Environment 1 - we call the EAR file EAR Build 0.
Environment 1 then gets it's database restored from Database Snapshot Build 0. It then has EAR Build 0 deployed to it.
Now we have our base line ready for the next build.
So let's say the next build, Build z (where z > 0), is ready to be tested and deployed by the testers in Environment 1. The following things happen:
- Developers smoke test Build z on environment 0.
- The database is reset to only contain reference data in environment 0.
- DBA takes a Database Snapshot Build z.
- DBA uses his database difference tool to determine what has changed between Database Snapshot Build z and Database Snapshot Build z-1. The tool would generate the proper SQL change scripts.
- DBA restores the database in environment 0 to Database Snapshot Build z-1.
- DBA applies the SQL change scripts against the database in environment 0.
- The DBA verifies that the table changes relatively match Database Snapshot z. I say relatively as there may be slight differences especially since the SQL change scripts may only alter the table and not be able to drop columns.
- The DBA then makes a backup of the database of environment 1.
- The DBA applies their SQL change script against the database. If it works out, great, if not the SQL change script may need to be hand extended to support further data changes.
- EAR Build z is deployed on Environment 1.
- A smoke test on environment 1 is performed. If there is data corruption, the database is reverted to its backup and the SQL change script may need to be hand extended to support further data changes.
- At this point Environment 1 has been updated and its data has been retained.
Now let's say we want to deploy Build z (where z > 0)the to the final environment, let's call it environment n (where n > 1) as it can be repeated as many times as possible.
- In Environment n we perform a database backup.
- In Environment n we apply the database change script from Environment n-1. If it works out, great, if not the SQL change script may need to be hand extended to support further data changes. Most likely reason would be Primary Key IDs for reference data may be clashing in which case we need to remap the clashing IDs to other IDs while maintaining referential integrity.
- EAR Build z is deployed on Environment n.
- A smoke test on environment n is performed. If there is data corruption, the database is reverted to its backup and the SQL change script may need to be hand extended to support further data changes. Most likely the scripts need to be modified to perform data cleanup tasks.
- At this point Environment n has been updated to build z and its data has been retained.
Conclusion
I think I have covered the objectives I have set out earlier. With the exception of Environment 0 which is the environment were we bootstrap all the changes.
- Any data entered is retained from one build to the next due to the fact we only apply patch changes to the database rather than resetting the whole thing.
- The "Reference Data" is loaded up properly as the patch changes contain the reference data changes as well.
- Application Deployment is performed using standard deployment procedures, there are no specialized scripts required.
- There is no technical limit as to the number of environment levels that you can deploy as the deployment logic is recursive.
The toughest part may be locating a tool that would perform the database diffs that your DBA team would be comfortable with. It is best to let the DBA choose the tool and they can get pretty pricey, but it will all be worth it come your second release to production.