Loading... # MSSQL Server ## char(n) vs varchar(n) * `n`为字节单位而非字符单位,这是有区别的,在单字节编码字符集存储时,`n`的大小即为字符长度,当所存储的字符无法使用单字节编码字符集表示时则会采用双字节字符集存储,`n`的大小则是字符的2倍 * `char`会存储固定字节长度的字符,如果定义`char(4)`则插入`'ab'`会被存储为`'ab '`,超出字节长度的字符串会被截断。实际存储大小总是`4 bytes`,另外插入`'ab '`后查询出的结果是会去除末尾空格为`'ab'` * `varchar`存储可变字节长度的字符,如果定义`varchar(4)`则插入`'ab'`即存储为`'ab'`而不会向`char`一样使用空格填充,在检索时也不会去除末尾空格。然而使用`varchar(n)`存储的字符总是会比真实大小大`2 bytes`,因为`varchar`类型总是会使用`2 bytes`来存储字节长度,`2 bytes`最大可表示`65535`个字节长度,而`varchar(n)`最大为`1~8000 bytes`,所以`2 bytes`足够存储最大字节长度 | 插入值 | char(4) | 实际存储大小 | varchar(4) | 实际存储大小 | | :---: | :-----: | :---------: | :--------: | :---------: | | `''` | `' '` | 4 bytes | `''` | 0+2 bytes | | `'ab'` | `'ab '` | 4 bytes | `'ab'` | 2+2 bytes | | `'abcd'` | `'abcd'` | 4 bytes | `'abcd'` | 4+2 bytes | | `'abcdefg'` | `'abcd'` | 4 bytes | `'abcd'` | 4+2 bytes | ## [var]char(n) vs n[var]char(n) * 很多人或许会对`char/nchar`和`varchar/nvarchar`之间的区别感到困惑,这不过是不同的编码方式带来的差异 * 带有`n`前缀的表示使用`Unicode`标准以及[UTF-16](https://en.wikipedia.org/wiki/UTF-16)编码方式(适用于开启了补充字符(SC)的排序规的情况下,如果未开启,则使用[UCS-2](https://en.wikipedia.org/wiki/Universal_Coded_Character_Set)编码方式),前者规定了每个字符对应的代码点`(Code Point)`是多少,后者表示如何对`Unicode`定义的代码点进行编码,由于`UTF-16`在[BMP](https://en.wikipedia.org/wiki/Plane_(Unicode)#Basic_Multilingual_Plane)区域使用`2 bytes`进行编码,超出`BMP`区域的则使用`4 bytes`编码,所以`nvarchar`实际存储大小需要根据具体的值来确定,例如存储`'a'`则占用大小为`2 bytes`,存储一个`emoji`表情则可能占用`4 bytes` * 对于`[var]char`,则使用对应排序规则的代码页进行编码,这意味着不同区域无法通用存储的编码字符。不过从SQL Server 2019(15.x)开始,如果开启了`UTF-8`排序规则,则使用`UTF-8`进行编码,与`UTF-16`一样能够表示`Unicode`标准所定义的字符集。`UTF-8`对于字符的编码长度也是可变的,例如`ASCII`码表所定义的字符则只需要`1 byte`,对于一个非`BMP`区域的字符则可能需要最大`4 bytes`进行编码 #### 下表所示的`varchar`使用`UTF-8`排序规则 | 插入值 | varchar(50) | 实际存储大小 | nvarchar(50) | 实际存储大小 | | :---: | :-----: | :---------: | :--------: | :---------: | | `''` | `''` | 0+2 bytes | `''` | 0+2 bytes | | `'a'` | `'a'` | 1+2 bytes | `'a'` | 2+2 bytes | | `'abcd'` | `'abcd'` | 4+2 bytes | `'abcd'` | 8+2 bytes | | `'😀'` | `'😀'` | 4+2 bytes | `'😀'` | 4+2 bytes | # MySQL ## char(n) vs varchar(n) 相比较`MSSQL Server`,`MySQL`中的`n`则很好理解:无论采用哪种编码方式,它始终表示字符串的长度。在`MySQL`中,`varchar`类型也有一个长度值,然而与`MSSQL Server`所不同的是,它的存储长度值并不总是`2 bytes`,而是存储字节超过`255 bytes`时才使用两个字节存储 | 插入值 | char(4) | 实际存储大小 | varchar(4) | 实际存储大小 | | :---: | :-----: | :---------: | :--------: | :---------: | | `''` | `' '` | 4 bytes | `''` | 0+1 bytes | | `'ab'` | `'ab '` | 4 bytes | `'ab'` | 2+1 bytes | | `'abcd'` | `'abcd'` | 4 bytes | `'abcd'` | 4+1 bytes | | `'abcdefg'` | `'abcd'` | 4 bytes | `'abcd'` | 4+1 bytes | # Reference * [nchar and nvarchar (Transact-SQL)](https://docs.microsoft.com/en-us/sql/t-sql/data-types/nchar-and-nvarchar-transact-sql?view=sql-server-ver15) * [char and varchar (Transact-SQL)](https://docs.microsoft.com/en-us/sql/t-sql/data-types/char-and-varchar-transact-sql?view=sql-server-ver15) * [Single-Byte and Multibyte Character Sets](https://docs.microsoft.com/en-us/cpp/c-runtime-library/single-byte-and-multibyte-character-sets?view=vs-2019) * [What is the difference between varchar and nvarchar?](https://stackoverflow.com/questions/144283/what-is-the-difference-between-varchar-and-nvarchar) * [The CHAR and VARCHAR Types](https://dev.mysql.com/doc/refman/5.7/en/char.html) * [Do you say “2 Byte” or “2 Bytes”?](https://ell.stackexchange.com/questions/93379/do-you-say-2-byte-or-2-bytes) * [How does the varchar(limit) work?](https://stackoverflow.com/questions/35566648/how-does-the-varcharlimit-work) * [UTF-16](https://en.wikipedia.org/wiki/UTF-16) * [Collation and Unicode support](https://docs.microsoft.com/en-us/sql/relational-databases/collations/collation-and-unicode-support?view=sql-server-ver15#utf8) * [Universal Coded Character Set](https://en.wikipedia.org/wiki/Universal_Coded_Character_Set#Encoding_forms) * [What is the difference between char, nchar, varchar, and nvarchar in SQL Server?](https://stackoverflow.com/questions/176514/what-is-the-difference-between-char-nchar-varchar-and-nvarchar-in-sql-server) * [International Features in Microsoft SQL Server 2005](https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2005/administrator/bb330962(v=sql.90)?redirectedfrom=MSDN#data-types-in-sql-server-2005) * [Unicode(UTF-8, UTF-16)令人混淆的概念](https://www.cnblogs.com/kingcat/archive/2012/10/16/2726334.html) * [Unicode 编码及 UTF-32, UTF-16 和 UTF-8](https://zhuanlan.zhihu.com/p/51202412) * [The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)](https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/) * [Unicode](https://en.wikipedia.org/wiki/Unicode) * [Code page](https://en.wikipedia.org/wiki/Code_page#Relationship_to_Unicode) * [Plane (Unicode)][1] * [UNICODE (Transact-SQL)](https://docs.microsoft.com/en-us/sql/t-sql/functions/unicode-transact-sql?view=sql-server-ver15) * [UTF-8](https://en.wikipedia.org/wiki/UTF-8) * [Universal Coded Character Set](https://en.wikipedia.org/wiki/Universal_Coded_Character_Set) 最后修改:2020 年 09 月 22 日 © 允许规范转载 赞 0 如果觉得我的文章对你有用,请随意赞赏