Why and when to use the SQL ‘N’ prefix in SQL Server (and in general)?

Hi, I haven’t blogged in a while, so this one is going to be a short one.

So, what is the N prefix and why and when should you use it? The prefix basically tells SQL Server to treat the characters after the N prefix as Unicode.

For example, let’s say we have this super basic database table, called Users:

All of the columns are self-explanatory. Note that all columns, except for Username, are of the nvarchar datatype. Simply put, nvarchar datatype supports pretty much all of the languages out there.

For example, if John Doe from USA wants to create an account, he could use his first and last names, written in his own alphabet (latin), and in the database the FirstName will be John and the LastName will be Doe.

This is also true if there is a japanese guy who also wants to register using the japanese alphabet without any problems.

So, in short, if you want your app to support multiple languages make sure to use the NVARCHAR and NCHAR datatypes for the appropriate columns.

Note that we used VARCHAR for the Username column, because I assume that the usernames would always be in latin and since VARCHAR also happens to takes less space than NVARCHAR (1 byte versus between 2 and 4 bytes per character), it might be better.

So, let’s get back to the N prefix and see why we need it and what kind of problems it solves with a few short examples.

Let’s say that we are writing a simplified parameterized INSERT query that adds a new user to the Users table.

  INSERT INTO [dbo].[Users]
  VALUES ('@Username', '@FirstName', '@LastName', '@About')

For the sake of the example we have two people – John Doe from USA and Aiko from Japan, who are going to create accounts. Here is what the queries for both of them would look like and the results in the table.

  1. John Doe, USA
 INSERT INTO [dbo].[Users]
 VALUES ('johndoe', 'John', 'Doe', 'Hi, I am John!');

2. Aiko (愛子), Japan

 INSERT INTO [FacebookDB].[dbo].[Users]
 VALUES ('aiko', '愛子', '佐藤', '井の中の蛙大海を知らず');

So, for John Doe everything worked as expected, but Aiko would be pretty mad because his first and last name as well as about section were replaced with ‘?’. What caused this buggy behavior?

Let’s first see, in a very simple terms, what SQL Server does when it starts processing the query with the japanese characters in it.

When SQL Server sees the query it uses its default code page, which in our case is Cyrillic_General_CI_AS, to convert the values for the given columns.

Simply put, Cyrillic_General_CI_AS contains all latin and cyrillic characters. So for the Username, since it is written in latin, there is no problem and SQL Server knows how to treat it. For the last three columns, however (FirstName, LastName and About), there is a problem. Since the default code page SQL Server uses doesn’t know/support these characters SQL Server simply converts every single character in there to a ‘?and inserts them in the database. That’s why you have two question marks for FirstName and LastName and 11 question marks for About – so it is not random.

So how could we fix? Let’s bring the N prefix to the rescue!

In order for our application / database to have a working multi language support we must rewrite our query a little and add N as a prefix before every NVARCHAR (and NCHAR, if we had one) column.

INSERT INTO [dbo].[Users]
VALUES ('aiko', N'愛子', N'佐藤', N'井の中の蛙大海を知らず');

And here is the result of the query with the N prefix above:

So, what exactly this prefix does?

When SQL Server sees the N prefix it knows that the characters inside the single quotes are Unicode characters (i.e. there could be latin, chinese, japanese or a combination of these or any other characters out there). Now the SQL Server won’t use its default code page (Cyrillic_General_CI_AS in our case) but instead it would convert the characters to Unicode and then INSERT them into the database resulting in a valid data inside our database.

TL;DR

If you want to support multiple languages, make sure your columns are either NVARCHAR or NCHAR, depending on the case, and always prefix your queries with N before those columns, as in the example above (and below).

INSERT INTO [dbo].[Users]
VALUES ('aiko', N'愛子', N'佐藤', N'井の中の蛙大海を知らず');

Thank you and happy coding! 🙂

Leave a Reply

Your email address will not be published. Required fields are marked *