Efficiently Loading Paginated Results From MySQL with Just One Query Per Page

There are many sit­u­a­tions in which pag­i­na­tion of query results is very use­ful, espe­cial­ly for per­for­mance opti­miza­tion.  In most of the­se kinds of sit­u­a­tions, the pag­i­nat­ing of results requires you to deter­mine the total num­ber of results, so the appli­ca­tion knows the num­ber of pages avail­able.

The most com­mon way to do this, is to use two queries; one which obtains the count of results and one which obtains a par­tic­u­lar page of results.  This method has the poten­tial to be even less opti­mal than load­ing the entire set of results, how­ev­er; due to the fact that two queries are now nec­es­sary, where­as before there was only one query.

In most data­base sys­tems, it is pos­si­ble to over­come this lim­i­ta­tion; though the tech­nique is speci­fic to the par­tic­u­lar data­base you are using.  This exam­ple explains how to do this in MySQL.

Here is the sub-optimal exam­ple, using two queries; it is load­ing the first page of results. The LIMIT 0, 40 means it will start at posi­tion 0 (the begin­ning) and obtain a set of 40 results.

SELECT count(*) FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01';
 
SELECT id, timestamp, field1, field2, field3 FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01' ORDER BY id LIMIT 0, 40;

Here is a more opti­mal exam­ple, which uses two state­ments, only one of which is a real query.  Every­thing is done dur­ing the first state­ment, the sec­ond state­ment mere­ly loads the count, which was cal­cu­lat­ed dur­ing the first state­ment.

SELECT SQL_CALC_FOUND_ROWS id, timestamp, field1, field2, field3 FROM my_table WHERE timestamp >= '2010-03-15' and timestamp <= '2010-08-01' ORDER BY id LIMIT 0, 40;
 
SELECT found_rows() AS cnt;

One of the draw­backs of SQL_CALC_FOUND_ROWS, or count(*) in gen­er­al; is the fact that, by run­ning the­se cal­cu­la­tions, you lose some of the ben­e­fit of pag­i­na­tion. This is because your data­base is required to exam­ine all of the effect­ed data in order to gen­er­ate an accu­rate count.

Depend­ing on the specifics of your my.cnf con­fig­u­ra­tion file, the first state­ment will cache part of the infor­ma­tion, caus­ing it to exe­cute faster when sub­se­quent pages are load­ed. In some of my own test­ing I have seen a sig­nif­i­cant speedup after the first page is load­ed.

If you want to get the most out of your appli­ca­tion, you will like­ly need to do a com­bi­na­tion of appli­ca­tion tun­ing, query tun­ing and data­base tun­ing. Gen­er­al­ly you will want to start by tun­ing the appli­ca­tion itself; this way you’re elim­i­nat­ing any bot­tle­necks inher­ent in the appli­ca­tion. Then you’ll need to do some query opti­miza­tions if the appli­ca­tion still isn’t fast enough. Final­ly, you’ll want to look into what con­fig­u­ra­tion changes you can make on the data­base in order to speed things up from the source.