Varchar Vs Nvarchar Và Varchar Dan Nvarchar, Sự Khác Biệt Giữa Varchar Và Nvarchar 2021

as a general rule, should you use varchar instead of nvarchar, if you won”t ever be storing foreign language characters?

Attachments: Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Cathyji-msft · Jan 25 at 02:08 AM
SQLGuy10-3060,

We have not received a response from you. Did the replies could help you? If the response helped, do “Accept Answer”. By doing so, it will benefit all community members who are having this similar issue. Your contribution is highly appreciated.

Đang xem: Nvarchar và varchar

Best regards,Cathy

*

If you never need to make use of the extended characters available in UNICODE then yes, it does make sense to only use VARCHAR.

However, if you are only using characters in the UNICODE range 0-65535 then they use the same storage space as VARCHAR.

See the discussion about NCHAR and NVARCHAR

Remember also that if you use UNICODE then you MUST always prefix strings with “N” otherwise you will introduce an implicit conversion and also affect the SARGability of your predicates.

Attachments: Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

*

TiborKaraszi · Jan 19 at 08:05 AM

Martin,

I”m confused by your below statement:

“However, if you are only using characters in the UNICODE range 0-65535 then they use the same storage space as VARCHAR.”

Above is, AFAIK, true if you use row compression and the data isn”t off-row. But without row compression, nvarchar uses double the length compared to varchar:

USE tempdb /* DROP TABLE IF EXISTS t4 DROP TABLE IF EXISTS t5 */ CREATE TABLE t4(c1 int IDENTITY, c2 varchar(40)) CREATE TABLE t5(c1 int IDENTITY, c2 nvarchar(40)) INSERT INTO t4 SELECT TOP(50000) REPLICATE(“a”, 40) FROM sys.columns AS a CROSS JOIN sys.columns AS b INSERT INTO t5 SELECT TOP(50000) REPLICATE(“a”, 40) FROM sys.columns AS a CROSS JOIN sys.columns AS b EXEC sp_spaceused “t4” EXEC sp_spaceused “t5” Did I misunderstand you?

0 Votes 0 ·

*

MartinCairney-6481 TiborKaraszi · Jan 19 at 09:55 PM

Hi
TiborKaraszi – see the link to the sentory.vn page in my post in the Remarks section

“when using characters defined in the Unicode range 0-65,535, one character can be stored per each byte-pair. However, in higher Unicode ranges (65,536-1,114,111) one character may use two byte-pairs.”

There isn”t mention of compression there, but after reading it again, it may also depend on the collation that you have set. I haven”t experimented to see what differences there are.

On the test however, I would use a variables defined as VARCHAR(x) and NVARCHAR(x) and compare the length of that for the same string. I thik for the page storage you would need to dig into DBCC PAGE and see exactly what has been written to validate that test.

0 Votes 0 ·
ErlandSommarskog MartinCairney-6481 · Jan 19 at 11:15 PM

“when using characters defined in the Unicode range 0-65,535, one character can be stored per each byte-pair.

A byte pair is two bytes per character. In varchar, each character in the ASCII range takes up one byte each. For characters beyond the ASCII range, it depends on the code page for the collation. But for collations for languages using Latin, Cyrillic or Greek script but based on UTF-8, that is one byte for every character.

It is different if you use UTF-8, or a collation for languages like Thai or Chinese.

0 Votes 0 ·

*

TiborKaraszi MartinCairney-6481 · Jan 20 at 10:35 AM

Sorry if I state the obvious (after Erland”s post), but the key word you probably overlooked is pair. I.e. a byte pair is indeed two bytes. 🙂

As for a T-SQL example not using a table, here goes:

SELECT DATALENGTH(“sentory.vn SQL Server “), DATALENGTH(N”sentory.vn SQL Server “); Result

———– ———– 25 50
0 Votes 0 ·
SolomonRutzky · Jan 20 at 06:21 AM
“characters in the UNICODE range 0-65535…use the same storage space as VARCHAR” is incorrect. NVARCHAR uses 2 bytes per BMP character (0-65535) and 4 bytes per Supplementary Character (65536-1114111). For details please see my post: How Many Bytes Per Character in SQL Server.

Using the N prefix is great advice, but it”s mainly to avoid data loss (when Unicode chars not available in the code page associated with the current database”s default collation get converted to either ? or a “best fit” mapping of something similar). The performance issue is specific to a VARCHAR column using a SQL Server collation compared to an NVARCHAR value. For details please see my post: Impact on Indexes When Mixing VARCHAR and NVARCHAR Types.

Xem thêm: Ngân Hàng Đại Lí (Correspondent Banks) Là Gì? Tài Khoản Nostro Là Gì ?

0 Votes 0 ·

*

TiborKaraszi SolomonRutzky · Jan 20 at 10:43 AM

Hi again Solomon. Did you also miss the word “pair” in that MS statement? I.e., “Unicode range 0-65,535, one character can be stored per each byte-pair. So unless I am out-of-the-ordinary confused, you seem to say the same thing as the documentation: that range uses two bytes.

The fact that both you and Martin (seem to have) missed that “pair” does indeed reflect that there”s room for improvement in how MS phrases things, though!

0 Votes 0 ·
SolomonRutzky TiborKaraszi · Jan 20 at 11:39 PM

Hello. No, I definitely did not miss that word ;-). I was actually responding to Martin, quoting part of this answer. However, my comment was long and I had to cut parts out to fit within the limit (which is not the 1000 that it claims to be, but closer to 985). I updated my comment earlier today to include more of the quote so it would be clearer what I was responding to. Yes, I”m saying the same thing as the documentation, except I prefer to use number of bytes for everything as it”s not only clearer, but is consistent when speaking of the other options.

And yes, there are several areas of the documentation that need improvement, but only so much time in a day, and even less these days 🙁

0 Votes 0 ·

Click to vote 0 Votes” 0 Click to down vote
Cathyji-msft answered · Jan 19 at 03:02 AM

Hi
SQLGuy10-3060,

If we know that data to be stored in the column or variable doesn’t have any Unicode characters, we can use varchar. But some experts recommends nvarchar always because: since all modern operating systems and development platforms use Unicode internally, using nvarchar rather than varchar, will avoid encoding conversions every time you read from or write to the database.

The major difference between varchar vs nvarchar

Nvarchar stores UNICODE data. If you have requirements to store UNICODE or multilingual data, nvarchar is the choice. Varchar stores ASCII data and should be your data type of choice for normal use.

For nvarchar, when using characters defined in the Unicode range 0-65,535, one character can be stored per each byte-pair, however, in higher Unicode ranges (65,536-1,114,111) one character may use two byte-pairs. Whereas varchar only uses 1 byte.

Advantages and Disadvantages of varchar and nvarchar in SQL Server.

By the way, starting with SQL Server 2019 (15.x), consider using a UTF-8 enabled collation to support Unicode and minimize character conversion issues.

Please refer to below links to get more information.

Xem thêm: Nghĩa Của Từ Insomnia Là Gì, Insomnia Trong Tiếng Tiếng Việt

Difference Between Sql Server VARCHAR and NVARCHAR Data TypeWhat is the difference between varchar and nvarchar?SQL Server differences of char, nchar, varchar and nvarchar data types

Best regards,Cathy

If the response is helpful, please click “Accept Answer” and upvote it, thank you.Note: Please follow the steps in our documentation to enable e-mail notifications if you want to receive the related email notification for this thread.

Xem thêm bài viết thuộc chuyên mục: Hỏi Đáp

Related Articles

Trả lời

Check Also
Close
Back to top button