« All Software Development Articles

SQL Server Trailing Spaces: A Cautionary Tale

7/25/2018

Introduction

Traveling through the wilds of stored procedures an adventurous developer discovered a gem. The author of this particular script had placed a clever line, something like the following.


SELECT descriptive_attribute
FROM source_of_knowledge
WHERE ISNULL(descriptive_attribute, '') NOT LIKE '';

The inquisitive fellow thought to himself, "What is the purpose of this?" As he poked and prodded he found that using NOT LIKE was akin to trimming the descriptive_attribute. What a fascinating discovery!

But as he showed me this, I wondered why? Is this trick safe to use? Let's explore.

The following tests were ran with the SQL_Latin1_General_CP1_CI_AS collation. A good article describing SQL Server collation can be found at https://www.red-gate.com/simple-talk/sql/sql-development/questions-sql-server-collations-shy-ask/

Experimentation Begins

Let's start by pulling together our test subjects.


DECLARE @char_lower_a CHAR(10) = 'a';
DECLARE @char_upper_a CHAR(10) = 'A';
DECLARE @varchar_lower_a VARCHAR(10) = 'a';
DECLARE @varchar_upper_a VARCHAR(10) = 'A';
DECLARE @varchar_lower_a_padded VARCHAR(10) = 'a         ';
DECLARE @varchar_upper_a_padded VARCHAR(10) = 'A         ';
DECLARE @char_space CHAR(1) = ' ';
DECLARE @varchar_space VARCHAR(1) = ' ';
DECLARE @char_empty CHAR(1) = '';
DECLARE @varchar_empty VARCHAR(1) = '';

PRINT 'Server Properties:';
PRINT 'Collation: ' + CONVERT(VARCHAR(64), SERVERPROPERTY('collation'), 1);
PRINT '';
PRINT 'Declarations:';
PRINT '@char_lower_a: "' + @char_lower_a + '"';
PRINT '@varchar_lower_a: "' + @varchar_lower_a + '"';
PRINT '@varchar_lower_a_padded: "' + @varchar_lower_a_padded + '"';
PRINT '@char_space: "' + @char_space + '"';
PRINT '@varchar_space: "' + @varchar_space + '"';
PRINT '@char_empty: "' + @char_empty + '"';
PRINT '@varchar_empty: "' + @varchar_empty + '"';

Here is the PRINT output. We can see that CHARs are padded and VARCHARs respect what we've defined.


Server Properties:
Collation: SQL_Latin1_General_CP1_CI_AS
 
Declarations:
@char_lower_a: "a         "
@varchar_lower_a: "a"
@varchar_lower_a_padded: "a         "
@char_space: " "
@varchar_space: " "
@char_empty: " "
@varchar_empty: ""

The first tests will be for equality. Given the collation we've chosen we know what to expect. Case insensitivity, trailing whitespace is not considered. Here's the setup.


PRINT 'Tests:'
IF (@char_lower_a = @varchar_lower_a) PRINT '@char_lower_a = @varchar_lower_a'
ELSE PRINT '@char_lower_a <> @varchar_lower_a';

IF (@char_lower_a = @varchar_lower_a_padded) PRINT '@char_lower_a = @varchar_lower_a_padded'
ELSE PRINT '@char_lower_a <> @varchar_lower_a_padded';

IF (@varchar_lower_a = @varchar_lower_a_padded) PRINT '@varchar_lower_a = @varchar_lower_a_padded'
ELSE PRINT '@varchar_lower_a <> @varchar_lower_a_padded';

IF (@char_space = @varchar_space) PRINT '@char_space = @varchar_space'
ELSE PRINT '@char_space <> @varchar_space';

IF (@char_space = @char_empty) PRINT '@char_space = @char_empty'
ELSE PRINT '@char_space <> @char_empty';

IF (@char_space = @varchar_empty) PRINT '@char_space = @varchar_empty'
ELSE PRINT '@char_space <> @varchar_empty';

IF (@varchar_space = @char_empty) PRINT '@varchar_space = @char_empty'
ELSE PRINT '@varchar_space <> @char_empty';

IF (@varchar_space = @varchar_empty) PRINT '@varchar_space = @varchar_empty'
ELSE PRINT '@varchar_space <> @varchar_empty';

IF (@char_empty = @varchar_empty) PRINT '@char_empty = @varchar_empty'
ELSE PRINT '@char_empty <> @varchar_empty';

SQL Server faithfully follows the collation rules we expect, everything is equal.


@char_lower_a = @varchar_lower_a
@char_lower_a = @varchar_lower_a_padded
@varchar_lower_a = @varchar_lower_a_padded

And comparing the space to the empty string is consistent. Again, equality for all.


@char_space = @varchar_space
@char_space = @char_empty
@char_space = @varchar_empty
@varchar_space = @char_empty
@varchar_space = @varchar_empty
@char_empty = @varchar_empty

The Interesting Bits

But with LIKE things turn out a little differently. Setup the LIKE comparisons.


IF (@char_lower_a LIKE @varchar_lower_a) PRINT '@char_lower_a LIKE @varchar_lower_a'
ELSE PRINT '@char_lower_a NOT LIKE @varchar_lower_a';

IF (@varchar_lower_a LIKE @char_lower_a) PRINT '@varchar_lower_a LIKE @char_lower_a'
ELSE PRINT '@varchar_lower_a NOT LIKE @char_lower_a';

IF (@char_lower_a LIKE @varchar_lower_a_padded) PRINT '@char_lower_a LIKE @varchar_lower_a_padded'
ELSE PRINT '@char_lower_a NOT LIKE @varchar_lower_a_padded';

IF (@varchar_lower_a_padded LIKE @char_lower_a) PRINT '@varchar_lower_a_padded LIKE @char_lower_a'
ELSE PRINT '@varchar_lower_a_padded NOT LIKE @char_lower_a';

IF (@varchar_lower_a LIKE @varchar_lower_a_padded) PRINT '@varchar_lower_a LIKE @varchar_lower_a_padded'
ELSE PRINT '@varchar_lower_a NOT LIKE @varchar_lower_a_padded';

IF (@varchar_lower_a_padded LIKE @varchar_lower_a) PRINT '@varchar_lower_a_padded LIKE @varchar_lower_a'
ELSE PRINT '@varchar_lower_a_padded NOT LIKE @varchar_lower_a';

Here is the results. Some of these things are NOT LIKE the others.


@char_lower_a LIKE @varchar_lower_a
@varchar_lower_a NOT LIKE @char_lower_a
@char_lower_a LIKE @varchar_lower_a_padded
@varchar_lower_a_padded LIKE @char_lower_a
@varchar_lower_a NOT LIKE @varchar_lower_a_padded
@varchar_lower_a_padded LIKE @varchar_lower_a

When the arguments are in a different order we get different results. The right hand operand is treated differently. If the right hand operator has trailing spaces, the left hand operator will not be padded prior to comparison. Otherwise, both arguments will be padded to be the same length before comparison. Below we see that comparing just the space is the same.


IF (@char_space LIKE @varchar_space) PRINT '@char_space LIKE @varchar_space'
ELSE PRINT '@char_space NOT LIKE @varchar_space';

IF (@varchar_space LIKE @char_space) PRINT '@varchar_space LIKE @char_space'
ELSE PRINT '@varchar_space NOT LIKE @char_space';

IF (@char_space LIKE @char_empty) PRINT '@char_space LIKE @char_empty'
ELSE PRINT '@char_space NOT LIKE @char_empty';

IF (@char_empty LIKE @char_space) PRINT '@char_empty LIKE @char_space'
ELSE PRINT '@char_empty NOT LIKE @char_space';

IF (@char_space LIKE @varchar_empty) PRINT '@char_space LIKE @varchar_empty'
ELSE PRINT '@char_space NOT LIKE @varchar_empty';

IF (@varchar_empty LIKE @char_space) PRINT '@varchar_empty LIKE @char_space'
ELSE PRINT '@varchar_empty NOT LIKE @char_space';

IF (@char_empty LIKE @varchar_space) PRINT '@char_empty LIKE @varchar_space'
ELSE PRINT '@char_empty NOT LIKE @varchar_space';

IF (@varchar_space LIKE @char_empty) PRINT '@varchar_space LIKE @char_empty'
ELSE PRINT '@varchar_space NOT LIKE @char_empty';

IF (@varchar_space LIKE @varchar_empty) PRINT '@varchar_space LIKE @varchar_empty'
ELSE PRINT '@varchar_space NOT LIKE @varchar_empty';

IF (@varchar_empty LIKE @varchar_space) PRINT '@varchar_empty LIKE @varchar_space'
ELSE PRINT '@varchar_empty NOT LIKE @varchar_space';

IF (@char_empty LIKE @varchar_empty) PRINT '@char_empty LIKE @varchar_empty'
ELSE PRINT '@char_empty NOT LIKE @varchar_empty';

IF (@varchar_empty LIKE @char_empty) PRINT '@varchar_empty LIKE @char_empty'
ELSE PRINT '@varchar_empty NOT LIKE @char_empty';

The above test yields these results.


@char_space LIKE @varchar_space
@varchar_space LIKE @char_space
@char_space LIKE @char_empty -- note CHAR datatype is automatically padded
@char_empty LIKE @char_space
@char_space LIKE @varchar_empty
@varchar_empty NOT LIKE @char_space
@char_empty LIKE @varchar_space
@varchar_space LIKE @char_empty
@varchar_space LIKE @varchar_empty
@varchar_empty NOT LIKE @varchar_space
@char_empty LIKE @varchar_empty
@varchar_empty NOT LIKE @char_empty

Funny that things that are equal may not be like each other. A cautionary tale indeed. Be wary of using LIKE without pattern matching in mind.

Read the Friendly Manual

The reasoning is provided by Microsoft at this KB article https://support.microsoft.com/en-us/help/316626/inf-how-sql-server-compares-strings-with-trailing-spaces

"The ANSI standard requires padding for the character strings used in comparisons so that their lengths match before comparing them."

"When the right side of a LIKE predicate expression features a value with a trailing space, SQL Server does not pad the two values to the same length before the comparison occurs."

Done

Adventure forth! Friendly travelers feel free to contact me with thoughts or corrections.

Have a nice day.