Wednesday, September 05, 2007

What do you mean "Duplicate record on insert"?

One of the things you may eventually end up doing is running a build dataextract to capture some of your test or reference data, such as locations and organization structures. So what happens when you run your tests again?

infrastructure:RUN_ID_DUPLICATE_RECORD: Duplicate record on insert.

Odd problem, eh? I think the reason for this is the algorithm Curam uses when generating unique IDs. I am not sure exactly what they use, but it can still produce an ID that is already taken.

Currently, the way I fix this is to do a search and replace on the IDs in the DMX files. The approach I take is:

  1. Load the DMX file whose IDs I want to change in Excel.
  2. Copy the ID column to column A of another worksheet.
  3. In column B, add the formula =1000 + ROW()
  4. Fill the column down.
  5. In column C, add the formula ="perl -pi.bak -e ""s/"&A1&"/"&B1&"/g"" filename1.dmx filename2.dmx"
  6. Fill the column down.
  7. Copy and paste column C into a Command Prompt window and let it do the search and replace, one command per ID.
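
The steps above can also be sketched as a single script. This is only a rough sketch in Python rather than the Excel and perl combination I actually use, and the names build_id_map and remap_ids are made up for illustration. It assumes the IDs are plain runs of digits in the file. Unlike the one-substitution-per-ID approach, it replaces all IDs in one pass and only matches whole numbers, so an ID such as 2051 does not also rewrite part of 20510, and a freshly assigned ID is never replaced a second time.

```python
import re


def build_id_map(old_ids, start=1001):
    """Map each distinct old ID to a new sequential ID, in first-seen order.

    start=1001 mirrors the =1000 + ROW() formula for a list beginning on row 1.
    """
    mapping = {}
    for old in old_ids:
        if old not in mapping:
            mapping[old] = str(start + len(mapping))
    return mapping


def remap_ids(text, mapping):
    """Replace every whole number that appears in mapping, in a single pass."""
    # (?<!\d) and (?!\d) make sure we match a complete run of digits,
    # never a fragment of a longer number.
    whole_number = re.compile(r"(?<!\d)\d+(?!\d)")
    return whole_number.sub(lambda m: mapping.get(m.group(0), m.group(0)), text)


if __name__ == "__main__":
    # Illustrative snippet only; not meant to show the real DMX layout.
    sample = "<value>2051</value><value>20510</value><value>77</value>"
    id_map = build_id_map(["2051", "20510"])
    print(remap_ids(sample, id_map))
```

Note that any other column that happens to contain the same number as an old ID gets rewritten too, so it is still worth keeping a backup of the DMX files, just like the .bak copies perl makes.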

I guess Curam should improve their algorithm for the unique ID generator. I do hope they are using SecureRandom at least.

4 comments:

Peter said...

You could also change the keyserver dmx file to start from a different block.

Craig said...

Just for reference, unlike codetables, you cannot merge DMX files.

I know Archie recommends copying all the datamanager files to a folder in custom (or your project folder), calling it Reference_Data, and updating datamanager_config.xml appropriately.

This means all the system data will reside in your folder, so make sure you change datamanager_config.xml to ignore the original location.

It's a pain if you just want to change the starting values for one of the KeyServer entries, but it keeps everything nice and clean.

Archimedes Trajano said...

@Peter

Yup, you can do it if you are generating the DMX files yourself. It is more difficult if you are trying to merge existing data extracts from multiple sources.

Archimedes Trajano said...

@Craig

Some other folders you may want to create are:

env_Data, for your environment-specific data such as USERS, ORGANIZATIONUSERLINK, etc.

Ideally you should transition to the point where you do not use any of the "Initial Data" from core and only use your own.

However, still keep a DEMO_Data folder containing some person and case data so you don't have to recreate them manually every time.