Optimizing my PHP

Introduction

This is a narrative of the process I'm going through, beginning on January 29, 2003, of looking at the performance of my various PHP code, determining what was causing slowdowns, and then trying to speed up the page loads. The reason for this is that according to apachebench, the front page of Dave's Picks can manage only about 35 page-loads per second. That's it. That's not good enough, especially considering that the same server can serve up about 2500 static pages per second.

At the moment, this writeup just contains the analysis of what needs optimizing. Before you can make things faster, you have to figure out what's slow. I've got that part of the process done, and now have to go do some research to figure out how to speed those things up. Then I'll do the analysis again, see what's slow, speed that up, lather, rinse, repeat until I'm happy. I found an article at phpLens, A HOWTO on Optimizing PHP with tips and methodologies, that offered some suggestions. Not all of them apply to my case, but it got me started.

How do I time things?

So the first step is to find a way to get fairly accurate timestamps from PHP. Here's the function I use, taken right from the microtime documentation in PHP:

function getmicrotime()
{
    // microtime() returns a "msec sec" string; summing the two parts
    // gives the current time in seconds as a float, good to the microsecond
    list($usec, $sec) = explode(" ", microtime());
    return ((float)$usec + (float)$sec);
}

I can just put one of these before the code I want to time and one after, and then have an idea of how long the code took:

$start = getmicrotime();
function_call_i_want_to_time();
$elapsed = getmicrotime() - $start; // elapsed time, in seconds

Once I've got the raw numbers, I spit them into the page output in a meta tag (it's near the top of the source, and easy for me to find). It looks like <meta name="benchinfo" content="Total - template = 0.028558969497681 (sec), requires = 0.010807037353516, include_days = 0.012656927108765" /> if you "view source" on one of the pages here. And yeah, they'll all have that extra bit of stuff for a while, so you're welcome to follow along as I tweak things and watch the times change.
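
For what it's worth, emitting the tag is nothing fancy. The sketch below shows roughly how the numbers get formatted (the variable names are made up for illustration; the real code hands the string to the template rather than echoing it):

$benchinfo = sprintf("Total - template = %s (sec), requires = %s, include_days = %s",
    $total, $requires, $include_days);
echo '<meta name="benchinfo" content="' . $benchinfo . '" />';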

Is it the template code?

One of the first things I wondered about was how long my PHP code was spending versus the PHPLib template code I use. This was easy to check, since apachebench gave me a pretty good idea of the total time being used (1 sec / 35 ≈ 0.028 seconds, or 28 milliseconds). So I dropped a getmicrotime before the first line of code, and one just before I pass all the variables off to the template object. That timing is the first number in the benchinfo, and at 28.496 msec it matched pretty well with what I was seeing from apachebench. It looks like the template code isn't really an issue, and sure enough, my own code is.
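
In practice, the bracketing looks something like the sketch below. The template-object calls are PHPLib-style placeholders rather than my exact code:

$page_start = getmicrotime();

// ... requires, referrer handling, including the days, etc. ...

$total = getmicrotime() - $page_start; // everything up to here is my code
$t->set_var("BENCHINFO", $total);      // hand the timing to the template
$t->pparse("out", "page");             // from here on, it's template time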

Since I'm doing templates, output buffering won't help. The output is pretty much spit out all in one call, so the only reason to do output buffering would be to offer compression, and I'll worry about that when bandwidth starts being an issue (it's not yet).
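
If bandwidth ever does become the issue, compression via output buffering is nearly a one-liner. Something like this at the top of the page would do it:

// buffer all output and gzip it for browsers that accept compressed pages
ob_start("ob_gzhandler");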

How about the last 10 referrers?

So how about the last ten referrers code I implemented a while back? Well, I bracketed that code with getmicrotimes, and saw that it takes about 4 msec. Noticeable, but nothing to panic over just yet. Writing the referrers out took only 0.2 msec, so that's definitely not the problem.

Hmm. Included files? Yes!

Okay, let's look at a few other things. I include a lot of PHP code, which is broken up into nice, neat modules (files). It ends up being nine require_onces (plus I suspect the template includes do more including of their own). 10.8 msec elapsed. Wow! That's definitely something I need to look at.
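
Measuring that was the same bracketing trick around the block of require_onces. The file names below are placeholders, not my actual module names:

$start = getmicrotime();
require_once "template.inc";
require_once "referrers.inc";
// ... seven more modules ...
$requires = getmicrotime() - $start;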

Processing days isn't speedy, either

The other big time-sink I suspect is the process of pulling in the individual days (the include_days bit in the benchinfo). Good thing it's near the end of a month, so I can look at that easily. Just including three days, plus the titles from the rest of the month, makes for about 12 msec. Just out of curiosity, let's see what a full month from the archives costs. It's not something that gets hit a lot (I know that from having looked at the logs), but when it does, it might be expensive. Yep. 200 msec for a full month. Ouch. I'll have to deal with that, too.
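
The day-inclusion code boils down to a loop over files, so timing it is the same trick yet again. A rough sketch (the archive layout here is invented, not my real directory structure):

$start = getmicrotime();
foreach ($days as $day) {
    include "archive/2003/01/" . $day . ".php"; // one file per day
}
$include_days = getmicrotime() - $start;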

The one bright note here is that all the code that handles the day-inclusion doesn't take time if I'm not actually including any days. Looking at this page, the include_days bit is 0.3 msec, so I don't need to worry (much) about the case where the code's doing nothing.

Conclusions

Loading any page on Dave's Picks is going to cost at least 11 msec or so, just requiring all the PHP code I need to process the pages. Including the text for each day comes out to something like 1-10 msec (depending on how much is there). Including just the title for a day looks like about 1 msec per day.

I'm starting to think that opening and reading in a file costs me about a millisecond. That's roughly the number I get for the requires, for the short days in a month, and for including a title. If it's a big day and I include the whole thing, it gets closer to 10 msec total. There are a couple of things there I can look at that won't require too much coding, so those will be the first things I try. More details to follow once I've had a chance to hook things up, test them out, and figure out whether or not each individual optimization helped.

But there's definitely some room for improvement. The two slow things I've identified so far take up 20 msec out of the 28 msec needed to assemble the front page here. If I can manage to cut the time they take in half, I'll go from 35 pages/second to over 50 pages/second. It still won't be anything like the speed of spitting out static pages, but it'll be better. And maybe I can get smart with caching pages and knock those times down more. I guess we'll find out.
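
One way to get smart about caching, sketched below, would be to keep a rendered copy of each page on disk and serve that while it's fresh. This is just an idea, not code I'm running yet; the cache path and the five-minute lifetime are arbitrary:

$cachefile = "/tmp/cache-" . md5($_SERVER["REQUEST_URI"]);
if (file_exists($cachefile) && time() - filemtime($cachefile) < 300) {
    readfile($cachefile); // serve the saved copy and skip all the PHP work
    exit;
}
ob_start();               // capture the page as it's built

// ... assemble the page as usual ...

$fp = fopen($cachefile, "w");
if ($fp) {
    fwrite($fp, ob_get_contents()); // save a copy for next time
    fclose($fp);
}
ob_end_flush();           // and send the page to the browser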

Copyright 2009, Dave Polaschek. Last updated on Mon, 15 Feb 2010 14:09:32.