Do You Know What Your Database is Doing?

July 25, 2007

There are so many different web frameworks these days that it is hard to keep count. Do you ever wonder exactly how they are accessing your data? Even if you write your own SQL, do you know how often it executes? In order to investigate performance issues at Chesspark, I wrote a query analyzer to answer these questions. The analyzer looks at all queries and their execution times. It generates a report of which queries are executed most often, which queries have the longest average duration, and which queries consume the most database time. The results were quite surprising for us. For example, we found that one of our simplest apps was generating too many queries by several orders of magnitude (a good candidate for caching!) and that Django and Twisted, two frameworks whose database layers we make heavy use of, had so much overhead that it was frightening.

:EXTENDED:

The analyzer is quite simple. It groups queries by their generic type (by removing the literals) and keeps track of how many times a given query type was executed and how long it took. It then uses this data to generate reports. You can find the analyzer source code below along with some brief documentation. It is written in Python for PostgreSQL log files, but the ideas are universally applicable to any SQL database which can generate the log data.

Here is an exerpt from a report:

TOTAL QUERIES: 348 generics from 4719056 total
TOTAL DB PIDS: 32018
TOTAL DB TIME: 747.62m
TOP 25 QUERIES BY FREQUENCY:
...
80281: 0.45ms: 1.70%: 0.08%: SET TIME ZONE ?
80307: 0.24ms: 1.70%: 0.04%: SHOW client_encoding
80307: 0.07ms: 1.70%: 0.01%: SHOW default_transaction_isolation
80307: 0.25ms: 1.70%: 0.04%: SET DATESTYLE TO ?
204854: 0.36ms: 4.34%: 0.17%: SELECT member_title.title_id AS title_id ...
...

The first line says that we had 348 types of queries out of over 4 million actual queries. These came from 32,018 process ids and the database used 747 minutes of cputime to execute these (unfortunately the script doesn’t help you out by telling you the wall time elapsed from the log file yet). The report lines give data in this format: “number of queries: avg. duration: % by frequency: % by total duration: generic query”.

So we can see from the above data that approximately 6% overhead is coming from somewhere. We shouldn't really need 4 times 80,000 database configuration statements, even if it isn't using much of database time. We can also see that we're doing a simple SELECT over 200,000 times, which probably means we should be caching something in the application.

The next report section is generic queries by average query duration. I find that this section isn’t very useful as there are many singleton queries (usually ones that have been run manually) that consume a lot of database time. I need to do some work to strip out those kinds of things before this section is worthwhile, but your experience may be better.

The last section is generic queries sorted by total database time used. This is a list of the slowest queries. Before we started analyzing queries and optimizing them, this section was filled with queries that were just missing a simple index. Now that we’ve got the indexes all set up, this section is mostly full of INSERT and DELETE statements, which are expected to be slow.

We have been using this analyzer for months now and have made many discoveries about how our application is using the database. We’ve easily made a few orders of magnitude improvement. Generally improvements have fallen into a couple categories:

All of these things are common sense, of course. The issue is knowing exactly which queries are causing the problem. Once we find a problem query or group of queries, it is usually pretty easy to fix, and then it’s wash, rinse, and repeat.

The code is below for anyone who is interested. I’d love to hear feedback on it as well as what you discover about other web and database frameworks with it. Enjoy!

Download analyzer.zip here.

Update 05-15-2008: Fixed broken link to analyzer.zip.

Update 03-29-2013: Fixed broken link to analyzer.zip again.

Do You Know What Your Database is Doing? - July 25, 2007 - Jack Moffitt