CakePHP HTML Cache Helper

Posted by Matt on Wed, Sep 03 2008

The latest version of this code can be found here: http://github.com/mcurry/html_cache

Cake's core cache helper is great, but the files it outputs are PHP files, so it will never be as fast as straight HTML files. This HTML Cache Helper writes out pure HTML, meaning the web server doesn't have to touch PHP when a request is made. Yeah, I know there are some huge limitations with this. First of all, you can't have any user- or session-specific code on the page. Also, there is no way to automatically check whether the cache has expired and needs to be rebuilt.

Uses

I use this helper on RSStalker.com. It handles the custom RSS feeds (currently around 13k), which is perfect since there is nothing user specific in the XML. Each feed gets hit multiple times a day, by multiple aggregators. This really adds up to a ton of requests.

The Code

You can download it here. Or just copy and paste this into /app/views/helpers/html_cache.php:

http://github.com/mcurry/cakephp/raw/f76839a885da27a7c95efe77bc4ad42197bd128f/helpers/html_cache/html_cache.php

There really isn't much to it. Just add it to any controller that you want to cache the output of.

In addition, you need to add two lines to your webroot/.htaccess, so that the rewrite section looks like this:

http://github.com/mcurry/cakephp/raw/f76839a885da27a7c95efe77bc4ad42197bd128f/helpers/html_cache/webroot.htaccess
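
If you don't want to follow the link, the decision those rewrite lines make can be modelled roughly in shell. This is only a sketch, assuming the cache layout that shows up in the comments below (cache/<request path>/index.html under the webroot). The function name resolve_request and its arguments are made up for illustration; they are not part of the helper or of Apache.

```shell
#!/bin/sh
# Hypothetical model of the rewrite decision; not Apache's real implementation.
# usage: resolve_request <webroot> <request-path>
resolve_request() {
    webroot=$1
    # strip leading and trailing slashes so "/feeds/amazon/" and "feeds/amazon" match
    path=$(printf '%s' "$2" | sed 's#^/*##; s#/*$##')
    cached="$webroot/cache/$path/index.html"
    if [ -f "$cached" ]; then
        # a static copy exists, so the web server can send it without PHP
        printf 'static %s\n' "$cached"
    else
        # no cached copy: fall through to Cake's front controller
        printf 'php %s\n' "$webroot/index.php"
    fi
}
```

Calling resolve_request /full/path/to/app/webroot /feeds/amazon prints a line starting with "static" once the helper has written that page, and a line starting with "php" before that.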

Issues

To expire the cache I use a cron job which deletes old files from the directory.

find /full/path/to/app/webroot/cache -type f -mmin +360 | xargs rm -f
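
If you'd rather keep the cron logic in a small script, something like the following should do. It's a sketch, assuming a find that supports -mmin, -mindepth, -empty and -delete (GNU and BSD find both do). The function name expire_cache and its arguments are mine, not part of the helper. It deletes only regular files older than the given number of minutes, then prunes any directories left empty.

```shell
#!/bin/sh
# usage: expire_cache <cache-dir> <max-age-in-minutes>
expire_cache() {
    dir=$1
    minutes=$2
    # refuse to run against a missing directory
    [ -d "$dir" ] || return 1
    # delete cached files last modified more than $minutes minutes ago
    find "$dir" -type f -mmin +"$minutes" -delete
    # remove directories that are now empty, but keep the cache root itself
    find "$dir" -mindepth 1 -type d -empty -delete
}
```

From cron you would call it as expire_cache /full/path/to/app/webroot/cache 360 to match the six-hour window above.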

The cached files are written right into your webroot. The default Cake .htaccess checks whether a requested file actually exists; this is what allows images, JS, CSS, and other files to be handled directly by the web server.

This won't work for the root URL of a controller. For example, www.rsstalker.com/feeds won't work, but www.rsstalker.com/feeds/amazon does.

Posted in CakePHP, Code

7 Comments

Davy Van Den Bremt said on Mar 19, 2009
Cool! I'm using this on my site now.

I've added the following, though...

RewriteCond %{REQUEST_METHOD} ^GET$

So I get

RewriteEngine On

RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond %{DOCUMENT_ROOT}/cache/$1/index.html -f
RewriteRule ^(.*)$ /cache/$1/index.html [L]

this way form posts (etc) won't be cached.

I'm also doing

[...] $this->data)) {
$this->helpers[] = 'HtmlCache';
}
}
}
?>
Ollie Treend said on Feb 26, 2010
This is excellent!

If only I could figure out how to expire/delete the cache files automatically with a beforeSave() function in my Model...

I'm sure there's a way to do it, it shouldn't be too hard. But I'll have to think about it.

Cheers though!!
Matt said on Feb 27, 2010
Although it's not ideal, you could do this in an afterSave and clear the whole cache:
App::import('core', 'Folder');
$Folder = new Folder();
$Folder->delete(WWW_ROOT . 'cache');
Ollie Treend said on Mar 01, 2010
Hey Matt

Thanks for your suggestion about how to remove the cache upon updating the database.

I've been trying really hard to implement this cache helper, but unfortunately I'm stumbling on the last hurdle.

The helper is successfully creating the cache files and everything.

But I am getting 500 Internal Server Errors from my .htaccess configuration.

Because of the Apache2 setup I have on my development server, I have to add a line to Cake's .htaccess files. Following "RewriteEngine On" I have to use "RewriteBase /", because otherwise I get 500 Internal Server errors.

I'm not exactly sure why I have to add the RewriteBase line, but I know it works.

I don't understand why I'm getting 500 Internal Server Errors :-( I've tried fiddling & tweaking it to no avail.

Do you have any suggestions?

For the record, I'm using a vanilla setup of Ubuntu's Apache2, with dynamic folder-based virtual hosting enabled.

I'd really appreciate any help... Cheers
Matt said on Mar 03, 2010
Not sure how much help I can be, but I'll try. I actually don't use Apache much anymore.

From the docs:
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#RewriteBase

It looks like the RewriteBase could be messing up the DOCUMENT_ROOT. Maybe try replacing that w/ the hardcoded path just to see if that works?

Anything of interest in the Apache error logs?
Ollie Treend said on Mar 03, 2010
Thanks for the pointers Matt. I'm so chuffed... I've got it working in the end!! :-D

I was putting the mod_rewrite lines in the /app/webroot/.htaccess file. But my final solution came from putting them in the /.htaccess file... i.e. in the parent folder of app/

Here's the contents of my root .htaccess file now:

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond /var/www/hosts/cms.dev/public_html/app/webroot/cache/$1/index.html -f
RewriteRule ^(.*)$ app/webroot/cache/$1/index.html [L]
RewriteRule ^$ app/webroot/ [L]
RewriteRule (.*) app/webroot/$1 [L]
</IfModule>

Unfortunately I'm having to use an absolute path, which is a tad annoying. I'm not really sure how to make this relative - I never have been too competent with mod_rewrite rules. So it does mean that I will have to change it per website/server setup, however I can live with that since your helper will REALLY help speed things up.

Again, it's such an ingenious idea to keep a plain HTML cache and not even touch PHP, let alone CakePHP, for the majority of requests!

Thanks Matt!
Spout said on Apr 14, 2010
Hello,

I tweaked the html_cache helper from GitHub (I can't make the current one with HtmlCacheBaseHelper work):
[code]
<?php
/*
 * HtmlCache Plugin
 * Copyright (c) 2009 Matt Curry
 * http://pseudocoder.com
 * http://github.com/mcurry/html_cache
 *
 * @author mattc <matt@pseudocoder.com>
 * @license MIT
 *
 */

class HtmlCacheHelper extends Helper {
    var $options = array('test_mode' => false, 'www_root' => WWW_ROOT);
    var $helpers = array('Session', 'Auth');
    var $isFlash = false;

    function beforeRender() {
        if($this->Session->read('Message')) {
            $this->isFlash = true;
        }
    }

    function afterLayout() {
        if(!$this->__isCachable()) {
            return;
        }

        $view =& ClassRegistry::getObject('view');

        //handle 404s
        if ($view->name == 'CakeError') {
            $path = $this->params['url']['url'];
        } else {
            $path = $this->here;
        }

        $path = implode(DS, array_filter(explode('/', $path)));
        if($path !== '') {
            $path = DS . ltrim($path, DS);
        }

        $path = $this->options['www_root'] . 'cache' . $path . DS . 'index.html';
        $file = new File($path, true);
        $file->write($view->output);
    }

    function __isCachable() {
        if (/*!$this->options['test_mode'] && */ Configure::read('debug') > 0) {
            return false;
        }

        if($this->Auth->sessionValid()){
            return false;
        }

        if($this->isFlash) {
            return false;
        }

        if(!empty($this->data)) {
            return false;
        }

        return true;
    }
}
?>
[/code]
I added a check with the Auth helper to see whether the user is logged in.
http://bakery.cakephp.org/articles/view/authhelper

This was working fine like that, not generating cache files for logged-in users.

But if the cache file was created earlier, Apache (via .htaccess) will serve the cached version even if the user is logged in.

So I made a little workaround:

I created app/webroot/html_cache.php:
[code]
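<?php
session_name('CAKEPHP');
session_start();

// Normalise the requested URL so it works with or without a trailing slash
$url = isset($_GET['url']) ? trim($_GET['url'], '/') : '';
$cacheFile = './cache/' . $url . '/index.html';
// Refuse anything that tries to climb out of the cache directory
$isSafe = (strpos($url, '..') === false && strpos($url, "\0") === false);

if (!empty($_SESSION['Auth']['User']['id'])) {
    // user logged in
    include('index.php');
} elseif ($isSafe && is_readable($cacheFile)) {
    // not logged in, cache exists: send the static file without executing it
    readfile($cacheFile);
} else {
    // not logged in, no cache
    include('index.php');
}
?>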
[/code]

In app/webroot/.htaccess, I replaced one line:
[code]
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond %{DOCUMENT_ROOT}/cache/$1/index.html -f
#RewriteRule ^(.*)$ /cache/$1/index.html [L]
RewriteRule ^(.*)$ /html_cache.php?url=$1 [L]
[/code]

And this works fine!
Requests now pass through a little PHP code, which serves either the cached version or CakePHP, but performance is still good.

Do you have any comments about this workaround?