
Checks

There are 17 checks currently supported in 3 categories.

Data Types

File Growth

Architecture


Time based formats

Checks that commonly named time columns use one of the recommended date/time data types. Storing date/time data in other data types can use more storage and cause performance issues.
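For example (table and column names here are hypothetical), a timestamp stored as text loses the validation, sorting, and storage benefits that a date/time type provides:

    -- Storing a point in time as text: no validation, larger storage, awkward sorting
    CREATE TABLE dbo.OrderHistoryText (OrderDate VARCHAR(30));
    -- Preferred: a date/time type (DATETIME2(0) uses 6 bytes and is validated and sortable)
    CREATE TABLE dbo.OrderHistory (OrderDate DATETIME2(0));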

Arbitrary VARCHAR length

A variable length column should be sized based on business requirements and only be as large as the maximum amount of data that needs to be stored. Using classic round numbers like 256/255 to estimate the length of a column usually indicates that the exact length required has not been properly assessed.
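As a hypothetical illustration, a column created with a round-number length can be resized once the real requirement is known:

    -- Arbitrary "round" length chosen without a documented requirement
    CREATE TABLE dbo.Customer (PostalCode VARCHAR(255));
    -- Resized to the business requirement (assumed here to be 10 characters)
    ALTER TABLE dbo.Customer ALTER COLUMN PostalCode VARCHAR(10);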

Unspecified VARCHAR length

If a VARCHAR column is created without specifying a length, it defaults to a length of one. If this is done by mistake, data may be truncated as it is inserted into the table. If only one character is needed, CHAR(1) is preferable as it uses two fewer bytes than VARCHAR(1).
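A minimal sketch (hypothetical table) of the default-length behavior:

    -- VARCHAR with no length in a column definition is VARCHAR(1)
    CREATE TABLE dbo.Flag (IsActive VARCHAR);
    -- Values longer than one character will be rejected or truncated on insert.
    -- If a single character is truly intended, CHAR(1) avoids VARCHAR's 2-byte length overhead:
    -- IsActive CHAR(1)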

Mad VARCHAR Max

While using VARCHAR(MAX)/NVARCHAR(MAX) can be tempting as a one-size-fits-all solution, it is generally bad design practice unless absolutely required. It limits the column’s ability to be used as an index key, makes online index rebuilding impossible, can lead to storing data pages out of row, and causes row size estimates to balloon. Generally, performance will be degraded.
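For instance (hypothetical table), a MAX column cannot be used as an index key:

    CREATE TABLE dbo.Note (NoteText NVARCHAR(MAX));
    -- Fails: (N)VARCHAR(MAX) columns are invalid as index key columns
    CREATE INDEX IX_Note_NoteText ON dbo.Note (NoteText);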

NVARCHAR data type

With the database size limit of 10GB for user databases in SQL Server Express, choosing the smallest suitable data types is essential. Avoid NVARCHAR unless the column requires Unicode data; VARCHAR uses approximately half the space to store the same non-Unicode data.
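The storage difference is easy to see with DATALENGTH:

    SELECT DATALENGTH('SQL Server')  AS VarcharBytes,   -- 10 bytes
           DATALENGTH(N'SQL Server') AS NvarcharBytes;  -- 20 bytes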

FLOAT and REAL data types

While FLOAT and REAL can store a precision up to 53, they inherently store approximate data. If stored values are required to be exact or are compared with equality predicates (WHERE =), inexact results may be returned. DECIMAL or NUMERIC should be chosen if exact values are required and they fit in a precision of up to 38. For more information, see Microsoft’s Using decimal, float, and real Data.
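A quick sketch of the equality pitfall:

    DECLARE @f FLOAT = 0.1, @d DECIMAL(10, 2) = 0.1;
    SELECT CASE WHEN @f * 3 = 0.3 THEN 'equal' ELSE 'not equal' END AS FloatCompare,   -- 'not equal'
           CASE WHEN @d * 3 = 0.3 THEN 'equal' ELSE 'not equal' END AS DecimalCompare; -- 'equal'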

Deprecated data types

The use of NTEXT, TEXT, and IMAGE data types should be avoided. These are deprecated data types and will be removed in future versions of SQL Server. They can be replaced by the NVARCHAR(MAX), VARCHAR(MAX), and VARBINARY(MAX) data types respectively.
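A hypothetical migration of deprecated columns to their modern equivalents:

    ALTER TABLE dbo.Document ALTER COLUMN Title NVARCHAR(MAX);   -- was NTEXT
    ALTER TABLE dbo.Document ALTER COLUMN Body  VARCHAR(MAX);    -- was TEXT
    ALTER TABLE dbo.Document ALTER COLUMN Scan  VARBINARY(MAX);  -- was IMAGE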

BIGINT as IDENTITY

Each BIGINT value requires 8 bytes of space and supports numbers up to 2^63-1, but INT only requires 4 bytes with numbers up to 2^31-1. Since each user database in SQL Server Express has a limit of 10GB, it is unlikely that enough rows can exist to necessitate using BIGINT as an IDENTITY data type. Defaulting to INT can save space and reflects a more reasonable upper limit given the restrictions.
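For example (hypothetical table), an INT identity comfortably covers any row count achievable under the 10GB cap:

    CREATE TABLE dbo.OrderLine (
        OrderLineID INT IDENTITY(1, 1) PRIMARY KEY  -- 4 bytes per row instead of 8
    );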

NUMERIC or DECIMAL with 0 scale

DECIMAL or NUMERIC data types with a scale of 0 and precision of 18 or less are more effectively stored as an INT or BIGINT due to their smaller sizes on disk and ability to hold equivalent values.
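A hypothetical example of switching to the equivalent, smaller types:

    -- DECIMAL(9, 0) uses 5 bytes and DECIMAL(18, 0) uses 9; INT (4 bytes) and BIGINT (8 bytes) hold the same values
    ALTER TABLE dbo.Invoice ALTER COLUMN LineCount INT;        -- instead of DECIMAL(9, 0)
    ALTER TABLE dbo.Invoice ALTER COLUMN ExternalRef BIGINT;   -- instead of DECIMAL(18, 0)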

Enum column not implemented as foreign key

Enumeration-based columns should be implemented as a separate table to adhere to third normal form, which brings benefits of data integrity, space savings, and better performance. If these values are instead stored as character data directly in the table, rather than as a foreign key to the enumeration table, table size increases, data integrity becomes much harder to maintain, and performance can suffer.
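A minimal sketch (hypothetical tables) of storing the enumeration in its own table and referencing it by key:

    CREATE TABLE dbo.OrderStatus (
        OrderStatusID TINYINT PRIMARY KEY,
        StatusName VARCHAR(20) NOT NULL UNIQUE
    );
    CREATE TABLE dbo.[Order] (
        OrderID INT IDENTITY(1, 1) PRIMARY KEY,
        OrderStatusID TINYINT NOT NULL
            REFERENCES dbo.OrderStatus (OrderStatusID)  -- instead of repeating the status text per row
    );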

Database growth past 10GB

In most versions of SQL Server Express, user databases and the model database are subject to a limit of 10GB. Setting the file growth limit to stay under that amount helps prevent the database from auto-growing too large and becoming unable to function properly.
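One way to cap growth (database and logical file names here are hypothetical):

    ALTER DATABASE [MyDatabase]
        MODIFY FILE (NAME = N'MyDatabase', MAXSIZE = 9GB);  -- keep the data file under the 10GB limit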

Database growth type

Setting data file growth to a fixed value, rather than a percentage, helps avoid exponential growth of the files. With percentage-based growth, each auto-growth event allocates a progressively larger amount as the file grows.
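For example (names hypothetical), switching a file from percentage growth to a fixed increment:

    ALTER DATABASE [MyDatabase]
        MODIFY FILE (NAME = N'MyDatabase', FILEGROWTH = 64MB);  -- fixed increment instead of, say, 10%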

Default fill factor

SQL Server defaults all table fill factors to 100%, but modifying it to leave room for future data can greatly reduce the rate of fragmentation for a table in certain situations. Due to size limitations in SQL Server Express, however, changing this from the default value can artificially increase file size and cause the limit to be reached faster than desired. Fragmentation on small data sets rarely results in performance issues, so keeping the default fill factor is preferred for Express instances.
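A simple way to see whether any indexes deviate from the default (this is only a sketch, not the tool's implementation; a fill_factor of 0 means the default is in use):

    SELECT OBJECT_NAME(object_id) AS TableName, name AS IndexName, fill_factor
    FROM sys.indexes
    WHERE fill_factor NOT IN (0, 100);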

Number of indexes

While indexes should be added to complement key queries, too many on a single table can hamper performance. There is no magic number of how many indexes are too many, so this value is provided as a parameter. The default value is 7.
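A query along these lines (not necessarily the tool's exact implementation) lists tables that exceed the threshold:

    SELECT OBJECT_NAME(i.object_id) AS TableName, COUNT(*) AS IndexCount
    FROM sys.indexes AS i
    WHERE i.type > 0                                     -- exclude heaps
      AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
    GROUP BY i.object_id
    HAVING COUNT(*) > 7                                  -- 7 is the check's default parameter value
    ORDER BY IndexCount DESC;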

Inefficient indexes

Indexes that are exact duplicates of, or overlap with, others unnecessarily increase the storage footprint of a database and add write and maintenance overhead with little or no benefit to query performance. Each index should be assessed against the required workload and only removed once it is determined to be unnecessary.
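A hypothetical set of redundant indexes on the same table:

    CREATE INDEX IX_Customer_LastName      ON dbo.Customer (LastName);
    CREATE INDEX IX_Customer_LastName_Dupe ON dbo.Customer (LastName);             -- exact duplicate
    CREATE INDEX IX_Customer_Name          ON dbo.Customer (LastName, FirstName);  -- overlaps the first index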

Sparse columns

If the sparse columns feature is available and a column reaches the threshold percentage of NULL values, significant space can be saved by converting it to a sparse column. Statistics are used to estimate the number of NULL values, so any suggestion should be validated before implementing this advice.
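A sketch of the conversion for a hypothetical, mostly-NULL column:

    ALTER TABLE dbo.Customer ALTER COLUMN MiddleName ADD SPARSE;  -- the column must be nullable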

Heap Tables

Heap tables can accumulate an excessive amount of fragmentation as rows are inserted, updated, or deleted. This results in underutilized data pages, adding extra I/O and causing table size to balloon. Most tables should have a clustered index, unless they are used for staging data or are temporary in nature.
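A minimal example (hypothetical table and key) of converting a heap into a clustered table by adding a clustered primary key:

    ALTER TABLE dbo.OrderArchive
        ADD CONSTRAINT PK_OrderArchive PRIMARY KEY CLUSTERED (OrderArchiveID);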

See also: