CASE WHENs Are Not All Alike

In SQL, CASE WHEN statements can be written two ways

Method 1

CASE 
  WHEN field_name = value THEN result
  WHEN field_name = value THEN result
  ELSE result 
END

Or

Method 2

CASE field_name 
  WHEN value THEN result
  WHEN value THEN result 
  ELSE result 
END

I usually use the first method. It's what I'm used to and it's more flexible to more situations - I can use it with multiple conditions, multiple fields, etc. But I've always felt that the second method looks more efficient and so when I find myself with simpler CASE WHENs, lately I've been trying to use the second method. I've always assumed that they operate similarly, which is why I was surprised when I ran into something like the following. 

The below is test SQL replicating the type of issue I ran into. 

WITH student_grades (school_id, student_id, grade, date) AS 
(
    VALUES
    (6, 1, 94, '2023-01-31'),
    (6, 1, 82, '2023-02-10'),
    (6, 1, 88, '2023-02-25'),	
    (6, 1, 80, '2023-02-25'),	
    (12, 2, 98, '2023-01-31'),
    (12, 2, 97, '2023-02-10'),
    (12, 2, 95, '2023-02-25'),	
    (12, 2, 98, '2023-02-25'),	
    (12, 3, 78, '2023-01-31'),
    (12, 3, 82, '2023-02-10'),
    (12, 3, 85, '2023-02-25'),	
    (12, 3, 80, '2023-02-25'),
    (12, 4, NULL, '2023-02-10'),
    (12, 4, NULL, '2023-02-25'),	
    (12, 4, NULL, '2023-02-25'),
    (12, 4, NULL, '2023-02-25')
),

public_school_data (school_id, school_name, public_school) AS 
(
    VALUES 
    (6, 'Campbell High School', TRUE),
    (7, 'Spencer Middle School', TRUE)
)

-- Method 1
SELECT 
    CASE 
        WHEN public_school IS NOT NULL THEN 'Public'
        ELSE 'Private'
    END AS school_type, 
    student_grades.*
FROM student_grades
LEFT JOIN public_school_data 	
    ON public_school_data.school_id = student_grades.school_id;


-- Method 2
SELECT 
    CASE public_school_data.public_school
        WHEN NOT NULL THEN 'Public'
        ELSE 'Private'
    END AS school_type, 
    student_grades.*
FROM student_grades
LEFT JOIN public_school_data 	
          ON public_school_data.school_id = student_grades.school_id;

I wrote SQL similar to Method 2 and expected output similar to Method 1.

Here’s my output from Method 1, pretty much as expected. Records with school_id = 6 are ‘Public’ schools and records with school_id = 12 are ‘Private.’

But now, here’s my output from Method 2. 

Now, all of a sudden, all the schools are ‘Private’! I reread my CASE statement multiple times. Did I have my NULLs and NOT NULLs reversed? No. It looks correct, and it wasn’t apparent what was happening.

Which led me to investigate. A trip to postgresl docs (https://www.postgresql.org/docs/current/functions-conditional.html)

reveals that the two methods operate differently.

In the first method, ' Each condition is an expression that returns a boolean result.’

In the second method, ‘The first expression is computed, then compared to each of the value expressions in the WHEN clauses until one is found that is equal to it. If no match is found, the result of the ELSE clause (or a null value) is returned.’

In the second method, it’s executing an exact value comparison, and ’NOT NULL’ is not a valid value comparison. Therefore, the CASE statement will always return NULL.

Personally, learning about these differences prompts me to use method 1 even more. It’s just more explicit and therefore easier to see what is happening, easier to debug, etc.

Previous
Previous

Running or rolling back a specific migration in Rails

Next
Next

Generating Base Tables in dbt