Make sure your data conversion SQL is easily identifiable/reversible

Here’s an example of what I mean:

If this was your data conversion script (to flag users as deleted):

update users
set deleted = 1 where id = (
select id from users
where to_be_deleted = 1
and deleted = 0);

Assuming that all deleted users have the flag to_be_deleted set to 1, this script is not reversible. For example, once executed, how would you distinguish the data set you changed (deleted = 1) from all the other users that already have a flag of deleted = 1?

A few options to consider here:

  • If there is a date/time stamp field use that to identify your change date/time (this is probably the most common way).
  • If there is an “updated by” field use that to identify your change by the user who made it.
  • If there is any other column you can use to tag the changes without adversely impacting the data use that (e.g. a comments or notes column).
  • Generate individual update statements for each user id you are updating (use SQL to do this). This is more time consuming to reverse of course but it is still possible to do so.

The key is to think about the scenario, “after running this script, can I identify the data set that was changed easily?”. If not see if you can come up with a reasonable way to do this, it will save you later!

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store