Making sure you only have one instance of a script running

One of my tasks in my current job is to build large indexes for our Sphinx-powered search engine. This sounds pretty simple, but when you have 2,000,000 records that each have one-to-many relationships with other records, you can be processing a huge amount of data at any given time.

The script that I wrote to gather the data in preparation for indexing was a little heavy on memory, and it became a huge problem whenever a run had not completed in time for the next cron job. That would inevitably start a giant snowball: every additional instance made all of them run slower, so there was no end in sight and we’d have to kill all the processes manually. That’s not a great solution for many reasons, and when data integrity is of the utmost importance, you can’t just go killing your scripts willy-nilly.

I found a great little class written by Chris Hope at electrictoolbox.com that writes the script’s process ID (PID) to a file, so any other instance can check that file when it starts. If the file holds the PID of a process that is still running, the new instance dies; otherwise it can continue. That also covers the case where the file exists but the PID inside it is no longer valid (i.e. the script was killed and had no opportunity to remove the file): the new instance can still run. It’s a two-stage check.

I have made a few small adaptations to the original class, and my version is here:

<?php
    class pid
    {
        protected $filename;
        public $already_running = false;

        function __construct($directory)
        {
            // One PID file per script, named after the script itself.
            $this->filename = $directory . '/' . basename($_SERVER['PHP_SELF']) . '.pid';
            if(is_writable($this->filename) || is_writable($directory)) {
                // Stage one: is there a PID file at all?
                if(file_exists($this->filename)) {
                    // Stage two: is the process it names still alive?
                    $pid = (int)trim(file_get_contents($this->filename));
                    if(file_exists('/proc/' . $pid)) {
                        $this->already_running = true;
                    }
                }
            } else {
                die("Cannot write to pid file '$this->filename'. Program execution halted.\n");
            }

            if(!$this->already_running) {
                // No live owner, so claim the file (this overwrites a stale PID).
                $pid = getmypid();
                file_put_contents($this->filename, $pid);
            }
        }

        public function kill()
        {
            // Nothing to clean up if the file has already gone.
            if(!file_exists($this->filename)) {
                return;
            }
            // Make sure this script owns the file before we delete it...
            $pid = (int)trim(file_get_contents($this->filename));
            if(file_exists('/proc/' . $pid) && $pid == getmypid()) {
                unlink($this->filename);
            }
        }
    }

It can be used like so:

<?php
    class newclass
    {
        // Holds the pid instance created in checkPid().
        private $pid;

        public function __construct()
        {
            if ($this->checkPid()) {
                // Continue...
                $this->killPid();
            }
        }

        private function killPid()
        {
            $this->pid->kill();
        }

        private function checkPid()
        {
            $this->pid = new pid('/tmp');
            if ($this->pid->already_running) {
                print 'Already running. Exiting.' . PHP_EOL;
                exit;
            } else {
                return true;
            }
        }
    }
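
The pattern above only releases the PID file if execution reaches killPid(). As a minimal sketch of one way to tighten that, the same two-stage check can be wrapped around a callable with try/finally, so the file is released even when the work throws. The function name run_exclusively() and its parameters are hypothetical and not part of the original class, and like the class it assumes a Linux /proc filesystem.

```php
<?php
// Hypothetical helper, not part of the original class: run $work only if no
// live process already owns $pidFile. Assumes a Linux /proc filesystem.
function run_exclusively(string $pidFile, callable $work): bool
{
    if (is_file($pidFile)) {
        $pid = (int) trim((string) file_get_contents($pidFile));
        // A live /proc entry means another instance is still running.
        if ($pid > 0 && file_exists('/proc/' . $pid)) {
            return false;
        }
    }

    // No live owner: claim the file, overwriting any stale PID.
    file_put_contents($pidFile, (string) getmypid());

    try {
        $work();
    } finally {
        // Release the file only if it still holds our own PID.
        if (is_file($pidFile)
            && (int) trim((string) file_get_contents($pidFile)) === getmypid()) {
            unlink($pidFile);
        }
    }

    return true;
}
```

A caller would pass a PID file path of its own choosing together with a closure that does the indexing work, and treat a false return value as "a previous run is still going".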

The main difference is that I don’t use __destruct(), as I wanted to be able to call a kill() method explicitly. The other notable difference is that my version does not use posix_kill() and instead checks that the process’s directory in /proc exists. Keep in mind that this will not work on Windows or macOS, as neither provides the Linux /proc filesystem. The reason I made this change is that posix_kill() requires PHP’s POSIX extension.
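
For anyone who does have the POSIX extension available, the two liveness checks can be combined. The sketch below is an illustration rather than part of the original class: the function name process_is_running() and the EPERM_ERRNO constant are hypothetical. It uses posix_kill() with signal 0, which tests for the process without actually signalling it, when the extension is loaded, and falls back to the /proc check otherwise.

```php
<?php
// Assumption: EPERM has errno value 1, as it does on Linux, macOS and the BSDs.
const EPERM_ERRNO = 1;

// Hypothetical helper: report whether a process with the given PID exists.
function process_is_running(int $pid): bool
{
    // Zero and negative values address process groups, so reject them.
    if ($pid <= 0) {
        return false;
    }

    if (function_exists('posix_kill')) {
        // Signal 0 performs the existence and permission checks only.
        if (posix_kill($pid, 0)) {
            return true;
        }
        // EPERM means the process exists but belongs to another user.
        return posix_get_last_error() === EPERM_ERRNO;
    }

    // Fallback used in the class above: Linux only.
    return file_exists('/proc/' . $pid);
}
```

The file_exists('/proc/' . $pid) calls in the class could then be swapped for process_is_running($pid), which should let the same code work on systems without /proc as long as the extension is installed.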

Thanks to Chris Hope for the original class.
