For a project I’m working on, I ‘inherited’ an existing database. Most of the columns where nullable so almost every field in our Entity Framework model was of type Nullable<T>. As the data itself is used to do calculations, all these extra null checks and ‘.Value’ everywhere made our code a lot harder to read and understand.
However while looking through the data in the database, I had the feeling that most columns should actually be not nullable. But I had to be 100% sure before I could make the switch.
Based on the following blog post(http://blogs.msdn.com/b/samlester/archive/2012/09/22/tsql-solve-it-your-way-finding-the-percentage-of-null-values-for-each-column-in-a-table.aspx), I created a small SQL script that returns the percentage of NULL values for all columns in a table:
DECLARE @TotalCount DECIMAL(10, 2) | |
,@SQL NVARCHAR(MAX) | |
SELECT @TotalCount = COUNT(*) | |
FROM dbo.Orders | |
SELECT @SQL = STUFF(( | |
SELECT ', CAST(SUM(CASE WHEN ' + Quotename(C.COLUMN_NAME) + ' IS NULL THEN 1 ELSE 0 END) * 100.00 | |
/@TotalCount AS decimal(10,2)) AS [' + C.COLUMN_NAME + ' NULL %] | |
' | |
FROM INFORMATION_SCHEMA.COLUMNS C | |
WHERE TABLE_NAME = 'Orders' | |
AND TABLE_SCHEMA = 'dbo' | |
ORDER BY C.ORDINAL_POSITION | |
FOR XML PATH('') | |
,type | |
).value('.', 'nvarchar(max)'), 1, 2, '') | |
SET @SQL = 'SET @TotalCount = NULLIF(@TotalCount,0) | |
SELECT ' + @SQL + ' | |
FROM dbo.Orders' | |
PRINT @SQL | |
EXECUTE SP_EXECUTESQL @SQL | |
,N'@TotalCount decimal(10,2)' | |
,@TotalCount |
This returns the following result:
By doing this, I could easily identify columns that possibly should be not nullable.