First I will check the structure of this table. It is a heap with no indexes at all, and it contains only 3 columns, each a varchar, of 1, 15 and 150 characters in length. This means the maximum possible size of a row is 166 bytes (plus a few bytes of row overhead), which is not very big at all.
CREATE TABLE Codes(
    Code VARCHAR(15) NOT NULL,
    Category VARCHAR(1) NOT NULL,
    Condition VARCHAR(150) NOT NULL);
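The 166-byte figure is easy to confirm from the catalog views; a quick sketch (assuming the table lives in the dbo schema):

```sql
-- Sanity-check the maximum data bytes per row.
-- For varchar columns, max_length in sys.columns is the declared byte length.
SELECT SUM(c.max_length) AS max_data_bytes   -- 15 + 1 + 150 = 166
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID('dbo.Codes');
```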
OK, so how much data is stored in this table? A simple SELECT COUNT(*) reveals 42,997 rows in this table. Whoa, hang on a second. If we had 42,997 rows all using the maximum possible row size we would have the following:
42,997 * 166 = 7,137,502 bytes
7,137,502 / 1024 = 6,970 kilobytes
6,970 / 1024 = 6.8 megabytes
6.8 MB is somewhat different from the 430 MB the table is currently using. What is going on?
Next I want to have a look at sp_spaceused to see what the
system will report:
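For reference, the numbers below come straight out of the stored procedure:

```sql
-- Report row count, reserved, data, index and unused space for the table
EXEC sp_spaceused N'Codes';
```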
rows – 42,997
reserved – 517 MB
data – 430 MB
index_size – 32 KB
unused – 86 MB
Obviously something is terribly wrong. I want to have a look at the index physical stats.
SELECT index_id, index_type_desc,
       avg_fragmentation_in_percent,
       avg_page_space_used_in_percent,
       page_count, record_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('Codes'), NULL, NULL, 'DETAILED');
avg_fragmentation_in_percent – 30.9%
avg_page_space_used_in_percent – 4.02%
page_count – 55,143
record_count – 43,647
Well, well, well. The table storage is not normal at all. Fragmentation, at 31%, is not too bad, but this table has 55,143 pages to store 43,647 records (slightly more than our row count, because for a heap record_count also includes forwarded records), and on average only 4% of each page contains data. In other words, we are using 430 MB worth of data pages to store about 6 MB worth of data. Not only do we waste 424 MB of disk space, we also waste 424 MB of memory, and we have to read that 430 MB from disk to get to our 6 MB of data. Any reduction in IO will usually benefit the entire database server.
The cause of this problem is documented in Books Online:
When rows are deleted from a heap the Database Engine may use row or page locking for the operation. As a result, the pages made empty by the delete operation remain allocated to the heap. When empty pages are not deallocated, the associated space cannot be reused by other objects in the database. To delete rows in a heap and deallocate pages, use one of the following methods.
· Specify the TABLOCK hint in the DELETE statement. Using the TABLOCK hint causes the delete operation to take a shared lock on the table instead of a row or page lock. This allows the pages to be deallocated.
· Use TRUNCATE TABLE if all rows are to be deleted from the table.
· Create a clustered index on the heap before deleting the rows. You can drop the clustered index after the rows are deleted. This method is more time consuming than the previous methods and uses more temporary resources.
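The first option is a one-line change to the DELETE; a sketch:

```sql
-- The table-level lock lets the Database Engine deallocate the emptied pages
DELETE FROM Codes WITH (TABLOCK);
```

When every row is going anyway, TRUNCATE TABLE Codes; achieves the same result and is cheaper, since it deallocates pages rather than logging individual row deletes.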
Now I know this table gets loaded from another system on a
daily basis. I had a look at the procedure that loads the table and saw the
following:
DELETE FROM Codes;
INSERT INTO Codes
SELECT ......
This table has had pages allocated to it on a daily basis, and the delete statement ensured that those pages would never be deallocated. In my particular case I am going to create a clustered index on this table to remedy the situation and to prevent a recurrence. Let's see what that has done for our storage.
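A minimal sketch of that fix; the index name and the choice of Code as the key are my own, not prescribed by anything in the table definition:

```sql
-- Building a clustered index rewrites the heap into a compact B-tree,
-- and from then on deletes will deallocate empty pages properly.
CREATE CLUSTERED INDEX CX_Codes_Code ON dbo.Codes (Code);
```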
rows – 42,997
reserved – 3,624 KB
data – 3,408 KB
index_size – 56 KB
unused – 160 KB
avg_fragmentation_in_percent – 0.2%
avg_page_space_used_in_percent – 99%
page_count – 426
Bang, I’ve just scored 430MB of RAM for free.
Sean,
In your above row size math you use X * 166 to get bytes. (And I understand this may be a bonehead question...) How did you derive that number?
Thanks,
-Jeff
That is the maximum possible row size for the data.
varchar(150)
varchar(15)
varchar(1)
So 166 single byte characters if each column was full, i.e. a worst case scenario.
Thanks! Here I was thinking it was some clever way of establishing row size. I should have looked at the obvious. :)