Introduction
Data type consistency across databases is crucial, especially for tasks like data migrations. Based on my experience in managing data migrations, I have developed a few SQL scripts that streamline the process of comparing and converting data types between database tables. These scripts are particularly valuable when working with large tables, often with more than 100 columns, where manual data type adjustments are not practical and are prone to errors.
Code
GitHub Repository: | https://github.com/masaki9/compare_table_column_data_types |
Script Overview
This repository includes two main scripts to ensure data type consistency and facilitate data migrations.
Compare Table Column Data Types (compare_table_column_data_types.sql)
This script compares the data types of columns in two tables within a database. By fetching column metadata from the INFORMATION_SCHEMA.COLUMNS table, it checks if the data types in the source table match those in the target table. It is invaluable for identifying discrepancies in the data types to ensure consistency before initiating data migrations.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
select
src.column_name as source_column,
src.data_type as source_type,
dst.column_name as target_column,
dst.data_type as target_type,
case
when dst.data_type = src.data_type then true
else false
end as is_data_type_same
from
(
select
ordinal_position,
column_name,
data_type
from
information_schema.columns
where
table_schema = 'test'
and table_name = 'source_tbl'
) src
join (
select
ordinal_position,
column_name,
data_type
from
information_schema.columns
where
table_schema = 'test'
and table_name = 'target_tbl'
) dst on src.column_name = dst.column_name
order by
src.ordinal_position asc;
Compare Table Column Data Types Example Output
Generate Data Type Casting SQL (generate_data_type_casting.sql)
When discrepancies are found, casting data correctly is essential. This script generates the necessary SQL expressions to cast data types from one table to match those in another. This functionality ensures that data fits into the target table and facilitates migrations.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
select
src.column_name as src_column,
src.data_type as src_type,
dst.column_name as dst_column,
dst.data_type as dst_type,
case
when dst.data_type = 'int' and src.data_type <> 'int' then concat('cast(nullif(', src.column_name, ',\'\') as unsigned) as ', src.column_name, ',')
when dst.data_type = 'decimal' and src.data_type <> 'decimal' then concat('cast(nullif(', src.column_name, ',\'\') as decimal) as ', src.column_name, ',')
when dst.data_type = 'varchar' and src.data_type <> 'varchar' then concat('cast(', src.column_name, ' as char) as ', src.column_name, ',')
when dst.data_type = 'date' and src.data_type <> 'date' then concat('str_to_date(nullif(', src.column_name, ',\'\'), \'%Y-%m-%d\') as ', src.column_name, ',')
else concat(src.column_name, ',')
end as `sql`
from
(select column_name, ordinal_position, data_type
from INFORMATION_SCHEMA.COLUMNS
where table_schema = 'test'and
table_name = 'source_tbl'
) src
join
(select column_name, ordinal_position, data_type
from INFORMATION_SCHEMA.COLUMNS
where table_schema = 'test' and
table_name = 'target_tbl'
) dst
on src.column_name = dst.column_name
order by dst.ordinal_position asc;
Generate Data Type Casting SQL Example Output
You can take the generated expressions and create a SQL statement to copy data from the source table into the target table, as shown in an example below:
1
2
3
4
5
6
7
8
9
10
11
12
13
insert into test.target_tbl
select
id,
name,
age,
email,
cast(nullif(salary,'') as decimal) as salary,
cast(nullif(quantity,'') as unsigned) as quantity,
cast(is_active as char) as is_active,
cast(nullif(total_hours,'') as unsigned) as total_hours,
str_to_date(nullif(last_login,''), '%Y-%m-%d') as last_login,
created_at
from test.source_tbl;
Discussion
I needed these scripts to handle data migration issues that I came across at work. Errors often occurred when data types between the source and target tables were not aligned. By automating the data comparison and type conversion processes, these scripts reduce manual efforts, which is especially beneficial for tables with extensive columns.
While the scripts shown here are used with small test tables, their strength is more apparent in larger applications. Anyone facing similar challenges can adapt the scripts to their environments to ensure data type consistency and support data migrations.