Debug School

Suyash Sambhare

NLS Sorting

Information

The National Language Support (NLS) features help applications meet the diverse language and locale-specific needs of users around the world. New Windows versions almost always include NLS updates. These updates affect collation and sorting, and therefore any application that relies on persisted indexes.

A collation table has two numbers that indicate its version: the defined version and the NLS version. Both versions are DWORD values, each consisting of a major and a minor version. The first byte of a value is reserved, the following two bytes represent the major version, and the final byte represents the minor version. In hexadecimal, the pattern is 0xRRMMMMmm, where R stands for Reserved, M for Major, and m for Minor. For example, a major version of 3 with a minor version of 4 is represented as 0x304.
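
The 0xRRMMMMmm layout can be unpacked with a few shifts and masks. The following is a minimal sketch; the helper name split_version is made up for illustration and is not part of any Windows API.

```python
def split_version(value):
    """Split a 32-bit version DWORD laid out as 0xRRMMMMmm."""
    reserved = (value >> 24) & 0xFF   # RR: top byte, reserved
    major = (value >> 8) & 0xFFFF     # MMMM: middle two bytes
    minor = value & 0xFF              # mm: low byte
    return reserved, major, minor


# Major version 3 with minor version 4, as in the example above.
print(split_version(0x304))  # (0, 3, 4)
```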

A major version change means that one or more existing code points changed, so the program must re-index all data for comparisons to stay meaningful. A minor version change means that no existing code points changed, but new code points were added. In that case, the program only has to re-index strings that contain previously unsortable values. To summarise, here is what the version numbers indicate about the data changes in the locale-specific exception tables and the default table:

  • NLSVersion Major – Changed code points in the 'exception,' or locale-specific tables.
  • NLSVersion Minor – Added new code points in the 'exception,' or locale-specific tables.
  • DefinedVersion Major – Changed code points in the default table.
  • DefinedVersion Minor – Added new code points in the default table.

Operating System                 Release Version (0xRRMMMMmm)
Windows XP RTM/SP1/SP2/SP3/…     N/A - no GetNLSVersion() API
Windows Server 2003 RTM/SP1      0x00 0000 01
Windows Vista RTM/SP1            0x00 0405 00
Windows Server 2008 RTM          0x00 0501 00 / 0x00 5001 00
Windows 7 RTM                    0x00060100
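
To make the major/minor rule concrete, here is a small sketch that compares two version DWORDs and reports which kind of change occurred. The function names are hypothetical, and the sample values are simply the Windows Vista and Windows 7 entries from the table with the spaces removed.

```python
def parse_table_value(text):
    """Turn a table entry such as '0x00 0405 00' into an integer."""
    return int(text.replace(" ", ""), 16)


def classify_change(old, new):
    """Return 'major', 'minor' or 'none' for two 0xRRMMMMmm values."""
    old_major, new_major = (old >> 8) & 0xFFFF, (new >> 8) & 0xFFFF
    old_minor, new_minor = old & 0xFF, new & 0xFF
    if old_major != new_major:
        return "major"   # existing code points changed: re-index everything
    if old_minor != new_minor:
        return "minor"   # only new code points were added
    return "none"


vista = parse_table_value("0x00 0405 00")
win7 = parse_table_value("0x00060100")
print(classify_change(vista, win7))  # major
```

The same check would be applied to both the NLS version and the defined version, since either one can change independently.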

Manifestation

Applications with persisted indexes, such as databases, that do not check the NLS version and re-index when it changes will fail to sort properly or to return the expected results. In user interfaces, sorted lists (alphabetical, numeric, alphanumeric, symbols, and so on) may appear in the wrong order.
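
The failure mode can be reproduced with a toy model. In the sketch below, two ordinary Python key functions stand in for an "old" and a "new" collation; they are assumptions chosen for illustration and have nothing to do with the real Windows sorting tables. The index is built under the old order, then searched with a binary search that assumes the new order.

```python
import bisect


def old_key(s):
    # Stand-in for the old collation: plain code point order.
    return s


def new_key(s):
    # Stand-in for the new collation: case-insensitive order.
    return s.casefold()


def lookup(index, word, key):
    """Binary search that assumes the index is sorted by `key`."""
    keys = [key(w) for w in index]
    pos = bisect.bisect_left(keys, key(word))
    return pos < len(keys) and keys[pos] == key(word)


# Persisted index, built once under the old order.
index = sorted(["banana", "Apple", "Zebra"], key=old_key)

found_old = lookup(index, "banana", old_key)  # search matches the index order
found_new = lookup(index, "banana", new_key)  # search assumes a different order
print(index, found_old, found_new)
```

The record is still in the index, but the second lookup misses it because the binary search walks the data in an order the index was never sorted in.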

Solution

To retrieve both the defined version and the NLS version for a collation table, your program can call either GetNLSVersionEx (Windows Vista, that is Windows version 6.0, or later) or GetNLSVersion (earlier versions of Windows).

GetNLSVersionEx

Returns information about the current version of a particular NLS capability for a locale specified by name. This function enables an application, such as Active Directory, to determine whether an NLS change affects the locale used by a certain index table. If it does not, there is no reason to re-index the table. For further information, see Managing Locale and Language Data.
This function supports custom locales. If lpLocaleName specifies a supplemental locale, the data returned is appropriate for the collation order associated with that supplemental locale. Versions of Windows prior to Windows Vista do not support GetNLSVersionEx.
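
As a rough illustration, the call can be made from Python through ctypes. This is a sketch under stated assumptions, not a reference implementation: the structure layout and the COMPARE_STRING value (0x0001) are written out by hand from the Windows SDK definitions, the helper name query_nls_version is hypothetical, and on non-Windows platforms the helper simply returns None.

```python
import ctypes
import sys

COMPARE_STRING = 0x0001  # NLS_FUNCTION value for string comparison


class GUID(ctypes.Structure):
    _fields_ = [("Data1", ctypes.c_uint32), ("Data2", ctypes.c_uint16),
                ("Data3", ctypes.c_uint16), ("Data4", ctypes.c_ubyte * 8)]


class NLSVERSIONINFOEX(ctypes.Structure):
    _fields_ = [("dwNLSVersionInfoSize", ctypes.c_uint32),
                ("dwNLSVersion", ctypes.c_uint32),
                ("dwDefinedVersion", ctypes.c_uint32),
                ("dwEffectiveId", ctypes.c_uint32),
                ("guidCustomVersion", GUID)]


def query_nls_version(locale_name):
    """Return (dwNLSVersion, dwDefinedVersion, dwEffectiveId), or None off Windows."""
    if sys.platform != "win32":
        return None  # GetNLSVersionEx only exists in the Windows kernel32.dll
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    fn = kernel32.GetNLSVersionEx
    fn.argtypes = [ctypes.c_uint32, ctypes.c_wchar_p,
                   ctypes.POINTER(NLSVERSIONINFOEX)]
    fn.restype = ctypes.c_int  # BOOL
    info = NLSVERSIONINFOEX()
    info.dwNLSVersionInfoSize = ctypes.sizeof(info)  # must be set before the call
    if not fn(COMPARE_STRING, locale_name, ctypes.byref(info)):
        raise ctypes.WinError(ctypes.get_last_error())
    return info.dwNLSVersion, info.dwDefinedVersion, info.dwEffectiveId


print(query_nls_version("en-US"))
```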

GetNLSVersion

Use this function for applications that run on versions of Windows prior to Windows Vista (Windows version 6.0). It retrieves information about the current version of a given NLS capability for a locale specified by identifier. With this function, an application such as Active Directory can determine whether an NLS change affects the locale identifier used for a certain index table. If it does not, there is no need to re-index the table.
This function retrieves only data for the locale that the identifier specifies. GetNLSVersionEx, by contrast, supports RFC 4646 names as well as additional locales and features. However, GetNLSVersionEx is not supported by Windows versions earlier than Windows Vista.

Programs designed to run only on Windows Vista and later should use GetNLSVersionEx instead of this function, because GetNLSVersionEx provides good support for supplemental locales.

Test

How to determine whether a collation version has changed and a reindex is necessary:

1. When indexing your data initially, use GetNLSVersionEx() to retrieve an NLSVERSIONINFOEX structure.

2. Store the following properties with your index to identify the version:
  • NLSVERSIONINFOEX.dwNLSVersion and NLSVERSIONINFOEX.dwDefinedVersion - The combination of these two properties specifies the version of the sorting table you are using.
  • NLSVERSIONINFOEX.dwEffectiveId - This specifies the effective locale of your sort. A custom locale points to the sort of an in-box locale.

3. When using the index, call GetNLSVersionEx() again to find the current version and compare it with the stored properties.

4. If any of the three properties has changed, the sorting data you are using may yield different results, and lookups in your index may fail to find records.

You can consider the versions to be the same if only the low byte of dwNLSVersion and dwDefinedVersion changed (the minor versions described above), provided that you KNOW that your data does not contain undefined Unicode code points. That is, every one of your strings returned TRUE from a call to IsNLSDefinedString().
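
The comparison described in this section can be sketched as follows. The SortVersion record and the index_is_usable helper are hypothetical names; the all_strings_defined flag stands for whatever bookkeeping your application uses to remember that every indexed string passed IsNLSDefinedString().

```python
from typing import NamedTuple


class SortVersion(NamedTuple):
    """The three properties stored alongside the index."""
    nls_version: int      # NLSVERSIONINFOEX.dwNLSVersion
    defined_version: int  # NLSVERSIONINFOEX.dwDefinedVersion
    effective_id: int     # NLSVERSIONINFOEX.dwEffectiveId


def same_except_minor(old, new):
    # Ignore the low byte (minor version); everything above it must match.
    return (old >> 8) == (new >> 8)


def index_is_usable(stored, current, all_strings_defined=False):
    """Return True if the persisted index can still be trusted."""
    if stored == current:
        return True
    if stored.effective_id != current.effective_id:
        return False  # a different effective locale: re-index
    only_minor = (same_except_minor(stored.nls_version, current.nls_version)
                  and same_except_minor(stored.defined_version,
                                        current.defined_version))
    # A minor-only change is safe only when no indexed string contained
    # code points that were undefined at indexing time.
    return only_minor and all_strings_defined


stored = SortVersion(0x00060100, 0x00060100, 0x0409)
minor_bump = SortVersion(0x00060101, 0x00060100, 0x0409)
print(index_is_usable(stored, minor_bump, all_strings_defined=True))
```

Defaulting all_strings_defined to False is a deliberately conservative choice: when in doubt, the sketch asks for a re-index rather than trusting a possibly stale index.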

https://learn.microsoft.com/en-us/windows/win32/win7appqual/nls-sorting-changes
