Weekly Tech Lesson: Using SELECT * INTO With SQL

in Home Edders3 years ago

Recently, a development team reached out because they faced issues when testing their data. The team had created a quick copy of the data with the SQL Server code of SELECT * INTO, but their testing failed when they used the newly created table in A-B testing. As they described their issue, I highlighted to the team that they didn't only need a copy of the data, but they needed a copy of the functionality as well. In other words, the data were only a part of their needs in their testing. While the execution of the code created a copy of the data, it did not create the same functionality. This makes a useful lesson to cover for this week involving creating copies of data in SQL (any method) where only the data are copied, but the structure underlying the data may not be copied.

Some points worth considering before we proceed to using the SELECT * INTO syntax:

  • Do we only need to test the validity of specific data?
  • Are we creating a quick backup copy of the data without needing the defaults?
  • Are we A-B testing specific data only or are we A-B testing data with functionality?
  • What CRUD operations, such as INSERT, UPDATE, DELETE and SELECT will we be performing on the copied table?
There are other considerations as well, but these make good starting points before we run out with a design that may not be appropriate in our situation. Like the development team I assisted, you may discover after the fact that the testing was insufficient because the wrong solution was used. There is a big difference between testing for data purposes only and testing for data and functionality - that functionality can be impacted by the schema underlying the data.


In any SQL language, before using any quick copy tool, we want to consider the reason for the copy. If we're only copying data because we need a quick backup, with SQL Server (in this case), the SELECT * INTO syntax will function as intended. But this means that we get an exact copy of data with nothing else from the table as far as defaults or possible schema needed in testing. For instance, if we need a copy of configuration data and we ONLY need the configuration data for possible reversion, then this technique will suffice as a backup. However, if we wanted to copy data for an A-B test, but our A-B test included data creation that had (1) primary keys, (2) default constraints, and (3) other restrictions, these would not be copied in the new table outside of the specific data values. The data values would be copied, but the specific schema would not (without the defaults and constraints applied, we only get a partial schema, not the full schema). Notice that our need for the copied data (and possibly structure) will determine whether this method of copying data functions correctly.

In the case of A-B testing where we want the functionality AND the data, a full script and copy will be required. This means that the full defaults and constraints are copied, as well as any dependencies, such as foreign keys. We cannot thoroughly test our code if we do not have everything created that is required in our testing. Keep in mind that testing data and functionality significantly differ from testing data only, or using data for a backup. Some of the testing discussion assumes we're testing in the same environment. In some contexts, we may have an A and B testing environment (ie: Sandbox A and Sandbox B). The advantage of these types of environments is that we can easily do A-B testing and avoid situations like the above discussion where we're trying to test side-by-side in the same environment on the same server within the same database. There are other techniques that are similar, which may save us time if testing is involved.


Source: SQL Basics: SELECT * INTO (Table Copy and Testing). Many SQL languages features ways to copying data that may differ some, but these techniques may also only copy the data without the matching schema.