一个字符校对规则引发的血案

举报
DRS技术快客 发表于 2021/05/31 15:38:05 2021/05/31
【摘要】 问题现场我们先看一个建表语句CREATE TABLE collate_test (val1 char(32) COLLATE utf8mb4_general_ci,val2 char(32)) CHARACTER SET utf8mb4;当我们在MySQL5.7和MySQL8.0上建表,都能建成功,但是当建成功之后,我们都执行SQL:SELECT * FROM collate_test W...

问题现场

我们先看一个建表语句

CREATE TABLE collate_test (
val1 char(32) COLLATE utf8mb4_general_ci,
val2 char(32)
) CHARACTER SET utf8mb4;

当我们在MySQL5.7和MySQL8.0上建表,都能建成功,但是当建成功之后,我们都执行SQL:SELECT * FROM collate_test WHERE val1=val2的时候:
在5.7上执行

mysql> SELECT * FROM collate_test WHERE val1=val2;
Empty set (0.00 sec)

在8.0上执行

mysql> SELECT * FROM collate_test WHERE val1=val2;
ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_general_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='

很奇怪,为什么会出现utf8mb4_0900_ai_ci呢
我们查看MySQL的资料https://dev.mysql.com/doc/refman/8.0/en/charset-mysql.html 发现,原来MySQL8.0在UTF8mb4字符集下面的默认排序规则为utf8mb4_0900_ai_ci

现场分析

然后我们再分别来看一下建表语句:SHOW CREATE TABLE collate_test
在5.7上执行

 show create table collate_test;
+--------------+--------------------------------------------------------------------------------------------------------------------------------------+
| Table        | Create Table                                                                                                                         |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------+
| collate_test | CREATE TABLE `collate_test` (
  `val1` char(32) DEFAULT NULL,
  `val2` char(32) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------+

在8.0上执行

show create table `collate_test`;
+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table        | Create Table                                                                                                                                                                                                     |
+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| collate_test | CREATE TABLE `collate_test` (
  `val1` char(32) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci DEFAULT NULL,
  `val2` char(32) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci |
+--------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+

看出来差别了,8.0上建表的时候,被加上了collate属性
在8.0执行

mysql> SHOW CHARACTER SET WHERE Charset="utf8mb4";
+---------+---------------+--------------------+--------+
| Charset | Description   | Default collation  | Maxlen |
+---------+---------------+--------------------+--------+
| utf8mb4 | UTF-8 Unicode | utf8mb4_0900_ai_ci |      4 |
+---------+---------------+--------------------+--------+
1 row in set (0.01 sec)

原来8.0中建表的时候,当指定字符集为utf8mb4的时候,它的默认collation就是utf8mb4_0900_ai_ci,而mysql不允许两个互斥的校验规则的数据做对比,而utf8mb4_0900_ai_ci与utf8mb4_general_ci是互斥的

扩展问题

这里面问题比较简单,因为一般我们不会对同一个表的不同字段设置相同字符集不同校对规则,但是在不同的表结构之前,我们有可能不经意之间就犯了这个错误,例如,联表,触发器等

联表查询

比如下面两个表

CREATE TABLE collate_general(
val1 char(32)
) COLLATE utf8mb4_general_ci;

CREATE TABLE collate_0900 (
val2 char(32)
) COLLATE utf8mb4_0900_ai_ci;

当我们联表查询的时候

mysql> select * from collate_general,collate_0900 where val1=val2;
ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_general_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='
触发器

比如我们先建一个表和触发器(为举例需要,触发器并无实际意义)

CREATE TABLE collate_trigger(
val1 char(32)
) COLLATE utf8mb4_general_ci;

DELIMITER ||
CREATE TRIGGER trigger_0900 AFTER INSERT ON collate_trigger FOR EACH ROW
BEGIN
    DECLARE val2 VARCHAR(32);
    SET val2=new.val1;
    SELECT val1 into val2 from collate_trigger WHERE val1=val2;
END||
DELIMITER ;

当我们向表中插入数据的时候

mysql> insert into collate_trigger values ('abc');
ERROR 1267 (HY000): Illegal mix of collations (utf8mb4_general_ci,IMPLICIT) and (utf8mb4_0900_ai_ci,IMPLICIT) for operation '='

然后我们看一下建表语句

mysql> show create table collate_trigger;
+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table           | Create Table                                                                                                                                                  |
+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+
| collate_trigger | CREATE TABLE `collate_trigger` (
  `val1` char(32) COLLATE utf8mb4_general_ci DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci |
+-----------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------+

我们再看一下触发器

mysql> show create trigger trigger_0900\G
*************************** 1. row ***************************
               Trigger: trigger_0900
              sql_mode: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
SQL Original Statement: CREATE DEFINER=`root`@`localhost` TRIGGER `trigger_0900` AFTER INSERT ON `collate_trigger` FOR EACH ROW BEGIN
    DECLARE val2 VARCHAR(32);
    SET val2=new.val1;
    SELECT val1 into val2 from collate_trigger WHERE val1=val2;
END
  character_set_client: gbk
  collation_connection: gbk_chinese_ci
    Database Collation: utf8mb4_0900_ai_ci
               Created: 2021-05-31 15:24:44.40

发现没有,触发器的Database collation为utf8mb4_0900_ai_ci,
在触发器的比较语句中
val1为collate_trigger的字段,collation为utf8mb4_general_ci,
val2为触发器trigger_0900的自有字段,collation为utf8mb4_0900_ai_ci

本文中举例都比较简单直接,客户真实业务场景可能都比较复杂,但是所遇问题的原因都是一样的
由此可见,在处理MySQL之前的版本升级到8.0版本的时候,字符集校验规则一定要注意了

【版权声明】本文为华为云社区用户原创内容,转载时必须标注文章的来源(华为云社区)、文章链接、文章作者等基本信息, 否则作者和本社区有权追究责任。如果您发现本社区中有涉嫌抄袭的内容,欢迎发送邮件进行举报,并提供相关证据,一经查实,本社区将立刻删除涉嫌侵权内容,举报邮箱: cloudbbs@huaweicloud.com
  • 点赞
  • 收藏
  • 关注作者

评论(0

0/1000
抱歉,系统识别当前为高风险访问,暂不支持该操作

全部回复

上滑加载中

设置昵称

在此一键设置昵称,即可参与社区互动!

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。

*长度不超过10个汉字或20个英文字符,设置后3个月内不可修改。