[Story] Terrible performance problem

Hello again and welcome to new series categorized informally as “DBA stories”. This is what it means – stories from real life depicting strange, bizarre, sometimes funny (sometimes not) but somehow interesting to a DBA. The choice of them is arbitrary – as long as I have my stories, I will try not to include stories posted by someone else. And remember, these things happened, so they might as well happen to you!

OK, so for the beginning a situation that made me stay up whole night once. I get a phone call from help desk agent around 11pm saying that one of mission-critical systems suffers from a poor performance. OK, I say, let me see – I log on and there it is, CPU load 100% with SQL Server barely responding due to load. The system uses transactional replication to maintain a warm-standby and I started thinking of invoking business continuity procedures. Then I saw that a 10 GB database has 40 GB of log which is growing by 1 GB every half an hour. So, it’s a database problem and I start digging.

Service restart for SQL Server didn’t help, of course – it didn’t remove the source of the problem. So I started looking for the source with Profiler and then I found it.

The clue was – one of the components received a patch with a new functionality, which – from my (DBA) view – took all records in one of the tables and in a loop updated all records in each iteration (for each record there was UPDATE issued without WHERE). There were like 5000 records in this table, so each execution of this procedure updated ~25 million records, and that’s not the end of it. This procedure was run once a minute! That’s why the log was growing so fast and replication could not simply keep it up as well.

Once the problem was found, one quick call to application support team with request to disable this functionality solved it immediately. How did it pass QA? you might ask. Well, there are two morals:

  1. Keep your QA environment as similar to production as possible.
  2. Don’t trust your developers – have it checked two or three times.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s