URL rewriting with Apache and PHP – a simple example

This article is intended to provide a rundown on URL rewriting, including what it’s used for, why it’s good and/or bad, and how to do it with Apache and PHP. As this article is only an introduction to rewriting URLs, the examples are very generic and may not represent the best or most efficient way of achieving your goal, particularly on the PHP side, where you will have specific requirements for how your application handles the data.

So with that said, it’s on with the article.

What is it and why would you use it?
URL rewriting is actually a very broad term. There are so many things you can do with it, and so many reasons to use it, that I could go on for hours. For the purposes of this article, however, I will keep it simple: URL rewriting is a way of changing (or rewriting) the URLs of your website or web application so that they appear to the user in a much simpler format than they otherwise would.

There are several reasons why you would want to do this but the two primary reasons behind most URL rewriting these days are:

  1. Some search engines index pages using the URL. More readable URLs may allow for better indexing.
  2. Simpler URLs are much easier for users to use/remember and they look nicer.

This technique is frequently used in websites and applications where content is stored in a database, as their URLs are typically more complex than those of static sites.

How do you do it?
In Apache there are a couple of ways (that I know of) to implement URL rewriting. One way, using a 404 error handler, is only useful if you don’t need access to your request data, because that data is discarded when Apache invokes the error handler. The second way uses mod_rewrite, an Apache module created for just this purpose. The example given below uses the mod_rewrite method.

So, you will need to create a .htaccess file in your document root. This file will include something along the lines of:

RewriteEngine On

RewriteCond %{REQUEST_URI} !\.(php|css|js|gif|png|jpe?g)$
RewriteRule (.*)$ /index.php [L]

What this does is tell Apache to internally rewrite every request whose path does not end in .php, .css, .js, .gif, .png, .jpg or .jpeg to the /index.php file. This is a rewrite rather than a redirect, so the URL in the browser stays unchanged. For example, a request for /blog/2009/01/my-test-article will be handled by /index.php, which can then load the appropriate content or return an error message as required.

So, let’s break the .htaccess file down line by line.

The first line “RewriteEngine On” simply tells Apache to enable the runtime rewriting engine so that the following lines are processed.

The second line “RewriteCond %{REQUEST_URI} !\.(php|css|js|gif|png|jpe?g)$” tells Apache when the rule should be applied to the request. In this case a regular expression is applied to the REQUEST_URI to match anything except the specified file extensions. The exclamation mark (!) at the beginning negates the expression, so the rule runs only when the expression does NOT match. The expression itself is a literal dot followed by a list of file extensions, anchored to the end of the URI with $.

The third line “RewriteRule (.*)$ /index.php [L]” describes the rewrite rule to be applied to the request. In this case there are three arguments. The first is the pattern to match; (.*)$ matches any path. The second is what to rewrite the matched path to (in this case /index.php). The third argument is a list of “flags”. Here the only flag used is L (last), which tells Apache to stop processing further rewrite rules once this one has been applied. In a .htaccess file the rewritten URL is run through the rules again, which is one reason the condition excludes .php: on the second pass /index.php no longer matches.
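To get a feel for which paths the condition lets through, the same regular expression can be tried out in PHP, since mod_rewrite and preg_match both use Perl-compatible regular expressions. The following is only a rough sketch: the function name wouldBeRewritten is made up for illustration and is not part of Apache or PHP.

```php
<?php

// Hypothetical helper: mirrors the RewriteCond from the .htaccess file.
// Returns true when the path does NOT end in one of the excluded
// extensions, i.e. when the request would be rewritten to /index.php.
function wouldBeRewritten(string $path): bool
{
    return preg_match('/\.(php|css|js|gif|png|jpe?g)$/', $path) !== 1;
}

var_dump(wouldBeRewritten('/blog/2009/01/my-test-article')); // bool(true)
var_dump(wouldBeRewritten('/images/logo.png'));              // bool(false)
var_dump(wouldBeRewritten('/index.php'));                    // bool(false)
```

Note that the match is case-sensitive, so a path such as /images/LOGO.PNG would still be rewritten.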

Now, in PHP you will notice that your request data (if there is any) remains intact. You may also notice that the PHP_SELF server variable ($_SERVER['PHP_SELF']) will be /index.php. So, how do you know what the “virtual URL” was? You use the REQUEST_URI server variable ($_SERVER['REQUEST_URI']), which holds the URL that was originally requested (in this example, /blog/2009/01/my-test-article), including the query string if there was one. Depending on your application you may want to run a regular expression of some kind on the variable, pass it through a switch statement or use any other means to determine what content to show based on the URL.

For example:

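<?php

// Route on the originally requested ("virtual") URL, not on PHP_SELF.
switch ($_SERVER['REQUEST_URI']) {
    case '/blog/2009/01/my-test-article':
        // load blog article and output here
        break;
    default:
        // nothing matched: return an error page
        http_response_code(404);
        break;
}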

OR

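<?php

// Anchored with ^ so only URLs that start with /blog/ match;
// # is used as the delimiter so the slashes need no escaping.
if (preg_match('#^/blog/[0-9]{4}/[0-9]{2}/.+#', $_SERVER['REQUEST_URI'])) {
    // load blog article here
}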
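Two things tend to come up as soon as these examples are used for real: REQUEST_URI includes the query string, and you usually want the individual parts of the URL rather than just a yes/no match. The sketch below shows one possible way to handle both. The function names virtualPath and matchBlogArticle, and the year/month/slug layout, are assumptions made for illustration; it also assumes PHP 7.1 or later for the type declarations.

```php
<?php

// Hypothetical helper: reduce REQUEST_URI to its path component,
// dropping any query string (e.g. "?page=2").
function virtualPath(string $requestUri): string
{
    $path = parse_url($requestUri, PHP_URL_PATH);
    return is_string($path) ? $path : '/';
}

// Hypothetical helper: pull year, month and slug out of a blog URL,
// or return null when the path is not a blog article URL.
function matchBlogArticle(string $path): ?array
{
    if (preg_match('#^/blog/([0-9]{4})/([0-9]{2})/([^/]+)$#', $path, $m) !== 1) {
        return null;
    }
    return ['year' => (int) $m[1], 'month' => (int) $m[2], 'slug' => $m[3]];
}

// Falls back to a sample URL when run outside a web server.
$path = virtualPath($_SERVER['REQUEST_URI'] ?? '/blog/2009/01/my-test-article?page=2');
$article = matchBlogArticle($path);

if ($article !== null) {
    // look up the article using $article['year'], $article['month']
    // and $article['slug'], then output it here
}
```

How the article is actually loaded from those parts depends entirely on your application, so that step is left as a comment.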

Conclusions
There are numerous different ways URL rewriting could be implemented depending on what you are trying to achieve. This article shows one very simple case. If you are trying to implement URL rewriting and get stuck just drop me a line and I’ll do what I can to help you.

Additionally, if you have any other comments please feel free to share them.

  • Hitendra,

    Most of the work required for what you want to do is going to be in PHP detecting that the username exists and displaying the right page. As I'm not familiar with your site code I can't really offer you much in that regard.

    One suggestion I will make though is that you use some kind of prefix to the username so you can detect it is a username in your rewrite rules. Otherwise you will end up rewriting every URL just to check if it is a username.

    For example I would use something like:

    http://www.mysite.com/profile/{username}

    instead of:

    http://www.mysite.com/{username}
  • Hitendra
    Hi Michael,
    i m trying to mod rewrite when user enter his username after my site name
    then this request rewrite on user profile page .
    if u have any idea then plz reply on my id.
  • Hi Michael,

    thanks. mod-rewrite has been on my list of to-do's for my home-grown CMS for a while now. most tut's i've read complicate the issue more than need be. your's is clean and straightforward. now to automate url creation from the db when the nav data is loaded... guess what i'm doing this weekend while everybody else is sleeping.
  • Bruno
    Hi Michael,

    Excellent example, thanks a lot, it really helps me out!

    Have a nice day!
  • Hi Cal,

    That is a very good point and whenever possible the rules should be moved into the httpd.conf. This is unfortunately not always possible however. For example in certain hosted environments...
  • Michael,

    Good primer, thank you. one thing that needs to be noted though is that .htaccess is read by Apache on EVERY hit. For development servers it's fine to leave your rewrite rules in .htaccess because they are easy to change and you don't have to bounce Apache for the changes to show. however, in a production environment, you would want to move your rewrite rules into your httpd.conf or site specific conf file. Apache only reads this when it's started up so it will offer better performance.

    http://httpd.apache.org/docs/2.0/howto/htaccess...

    Thanks for the article!

    =C=