CakePHP HTML Cache Helper
The latest version of this code can be found here: http://github.com/mcurry/html_cache
Cake's core cache helper is great, but the files it outputs are PHP files, so it will never be as fast as straight HTML files. This HTML Cache Helper writes out pure HTML, meaning the web server doesn't have to touch PHP when a request is made. Yes, I know there are some huge limitations with this. First of all, you can't have any user- or session-specific code on the page. Also, there is no way to automatically check whether the cache has expired and needs to be rebuilt.
Uses
I use this helper on RSStalker.com. It handles the custom RSS feeds (currently around 13k), which is perfect since there is nothing user specific in the XML. Each feed gets hit multiple times a day, by multiple aggregators. This really adds up to a ton of requests.
The Code
You can download it here. Or just copy and paste this into /app/views/helpers/html_cache.php:
http://github.com/mcurry/cakephp/raw/f76839a885da27a7c95efe77bc4ad42197bd128f/helpers/html_cache/html_cache.php
There really isn't much to it. Just add it to any controller that you want to cache the output of.
In addition, you need to add two lines to your webroot/.htaccess, so that the rewrite section looks like this:
http://github.com/mcurry/cakephp/raw/f76839a885da27a7c95efe77bc4ad42197bd128f/helpers/html_cache/webroot.htaccess
Issues
To expire the cache I use a cron job which deletes old files from the directory.
find /full/path/to/app/webroot/cache -mmin +360 | xargs rm -f
The cached files are written directly to your webroot. The default Cake .htaccess checks whether a requested file actually exists; this is what allows images, JS, CSS, and other files to be handled directly by the web server.
This won't work with the root URL of a controller. For example, www.rsstalker.com/feeds won't work, but www.rsstalker.com/feeds/amazon does.
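The cron-based expiry mentioned above can be made a little more defensive. The following is only a sketch, not part of the helper: the function name `expire_html_cache` is made up for illustration, and it assumes a `find` that supports `-mmin`, `-mindepth` and `-empty` (GNU and BSD `find` do). It deletes only stale `index.html` files and then removes any directories left empty.

```shell
#!/bin/sh
# Sketch: expire cached pages older than a given number of minutes.
# expire_html_cache is a hypothetical helper name, not part of the plugin.
expire_html_cache() {
    cache_dir=$1
    max_age_min=${2:-360}

    # Bail out if the directory is missing, so a typo never scans elsewhere.
    [ -d "$cache_dir" ] || return 1

    # Remove only stale cache files, never directories or other assets.
    find "$cache_dir" -type f -name 'index.html' -mmin "+$max_age_min" -exec rm -f {} +

    # Clean up directories left empty; -depth visits children before parents.
    find "$cache_dir" -mindepth 1 -depth -type d -empty -exec rmdir {} \;
}
```

Saved in a script, it could be called from cron with something like `expire_html_cache /full/path/to/app/webroot/cache 360`.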

7 Comments
I've added the following, though...
RewriteCond %{REQUEST_METHOD} ^GET$
So I get
RewriteEngine On
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond %{DOCUMENT_ROOT}/cache/$1/index.html -f
RewriteRule ^(.*)$ /cache/$1/index.html [L]
this way form posts (etc) won't be cached.
I'm also doing
... if (empty($this->data)) {
$this->helpers[] = 'HtmlCache';
}
}
}
?>
If only I could figure out how to expire/delete the cache files automatically with a beforeSave() function in my Model...
I'm sure there's a way to do it, it shouldn't be too hard. But I'll have to think about it.
Cheers though!!
App::import('core', 'Folder');
$Folder = new Folder();
$Folder->delete(WWW_ROOT . 'cache');
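The snippet above clears the whole cache directory. If that is too coarse, a single page can be dropped instead. Here is a rough shell sketch, assuming the `cache/<url path>/index.html` layout implied by the rewrite rules earlier in this thread; `purge_cached_page` is a hypothetical name, and the same idea could be ported to a model callback.

```shell
#!/bin/sh
# Sketch: delete the cached copy of one URL path.
# purge_cached_page is a hypothetical helper; the cache layout is assumed
# to be <webroot>/cache/<url path>/index.html.
purge_cached_page() {
    webroot=$1
    url_path=$2

    # Strip leading and trailing slashes so "/feeds/amazon/" and
    # "feeds/amazon" resolve to the same file.
    clean=$(printf '%s' "$url_path" | sed -e 's#^/*##' -e 's#/*$##')

    # Refuse any path containing a ".." segment.
    case "/$clean/" in
        */../*) return 1 ;;
    esac

    rm -f "$webroot/cache/$clean/index.html"
}
```

For example, `purge_cached_page /full/path/to/app/webroot /feeds/amazon` would remove only that feed's cached file, and the next request would rebuild it.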
Thanks for your suggestion about how to remove the cache upon updating the database.
I've been trying really hard to implement this cache helper, but unfortunately I'm stumbling on the last hurdle.
The helper is successfully creating the cache files and everything.
But I am getting 500 Internal Server Errors from my .htaccess configuration.
Because of the Apache2 setup I have on my development server, I have to add a line to Cake's .htaccess files. Following "RewriteEngine On" I have to use "RewriteBase /", because otherwise I get 500 Internal Server errors.
I'm not exactly sure why I have to add the RewriteBase line, but I know it works.
I don't understand why I'm getting 500 Internal Server Errors :-( I've tried fiddling & tweaking it to no avail.
Do you have any suggestions?
For the record, I'm using a vanilla setup of Ubuntu's Apache2, with dynamic folder-based virtual hosting enabled.
I'd really appreciate any help... Cheers
From the docs:
http://httpd.apache.org/docs/2.2/mod/mod_rewrite.html#RewriteBase
It looks like the RewriteBase could be messing up the DOCUMENT_ROOT. Maybe try replacing that w/ the hardcoded path just to see if that works?
Anything of interest in the Apache error logs?
I was putting the mod_rewrite lines in the /app/webroot/.htaccess file. But my final solution came from putting them in the /.htaccess file... i.e. in the parent folder of app/
Here's the contents of my root .htaccess file now:
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond /var/www/hosts/cms.dev/public_html/app/webroot/cache/$1/index.html -f
RewriteRule ^(.*)$ app/webroot/cache/$1/index.html [L]
RewriteRule ^$ app/webroot/ [L]
RewriteRule (.*) app/webroot/$1 [L]
</IfModule>
Unfortunately I'm having to use an absolute path, which is a tad annoying. I'm not really sure how to make this relative - I never have been too competent with mod_rewrite rules. So it does mean that I will have to change it per website/server setup, however I can live with that since your helper will REALLY help speed things up.
Again, it's such an ingenious idea to keep a plain HTML cache and not even touch PHP, let alone CakePHP, for the majority of requests!
Thanks Matt!
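One way to live with the absolute path is to generate the root .htaccess per server rather than editing it by hand. This is only a sketch: `render_root_htaccess` is a hypothetical function, it reproduces the rules quoted above unchanged, and it assumes the webroot path contains no spaces.

```shell
#!/bin/sh
# Sketch: print the root .htaccess shown above with the absolute webroot
# path filled in. render_root_htaccess is a hypothetical helper name.
render_root_htaccess() {
    webroot=${1%/}   # drop one trailing slash, if present
    # In an unquoted heredoc, \$ emits a literal dollar sign.
    cat <<EOF
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /
RewriteCond %{REQUEST_METHOD} ^GET\$
RewriteCond $webroot/cache/\$1/index.html -f
RewriteRule ^(.*)\$ app/webroot/cache/\$1/index.html [L]
RewriteRule ^\$ app/webroot/ [L]
RewriteRule (.*) app/webroot/\$1 [L]
</IfModule>
EOF
}
```

Run from the directory that contains app/, something like `render_root_htaccess "$(pwd)/app/webroot" > .htaccess` would write the file for that particular server.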
I tweaked the html_cache helper from GitHub (I can't get the current one with HtmlCacheBaseHelper to work):
[code]
<?php
/*
 * HtmlCache Plugin
 * Copyright (c) 2009 Matt Curry
 * http://pseudocoder.com
 * http://github.com/mcurry/html_cache
 *
 * @author mattc <matt@pseudocoder.com>
 * @license MIT
 *
 */
class HtmlCacheHelper extends Helper {
    var $options = array('test_mode' => false, 'www_root' => WWW_ROOT);
    var $helpers = array('Session', 'Auth');
    var $isFlash = false;

    function beforeRender() {
        if($this->Session->read('Message')) {
            $this->isFlash = true;
        }
    }

    function afterLayout() {
        if(!$this->__isCachable()) {
            return;
        }

        $view =& ClassRegistry::getObject('view');

        //handle 404s
        if ($view->name == 'CakeError') {
            $path = $this->params['url']['url'];
        } else {
            $path = $this->here;
        }

        $path = implode(DS, array_filter(explode('/', $path)));
        if($path !== '') {
            $path = DS . ltrim($path, DS);
        }

        $path = $this->options['www_root'] . 'cache' . $path . DS . 'index.html';
        $file = new File($path, true);
        $file->write($view->output);
    }

    function __isCachable() {
        if (/*!$this->options['test_mode'] && */ Configure::read('debug') > 0) {
            return false;
        }

        if($this->Auth->sessionValid()){
            return false;
        }

        if($this->isFlash) {
            return false;
        }

        if(!empty($this->data)) {
            return false;
        }

        return true;
    }
}
?>
[/code]
I added a check with the Auth helper to see whether the user is logged in.
http://bakery.cakephp.org/articles/view/authhelper
This worked fine as far as it went: no cache is generated for a logged-in user.
But if the cache was created earlier, Apache/.htaccess will serve the cached version even if the user is logged in.
So I made a little workaround.
I created app/webroot/html_cache.php:
[code]
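<?php
session_name('CAKEPHP');
session_start();
//normalise the requested URL; the rewrite rule passes it without a trailing slash
$url = isset($_GET['url']) ? trim($_GET['url'], '/') : '';
$cacheFile = './cache/' . ($url === '' ? '' : $url . '/') . 'index.html';
if (strpos($url, '..') !== false || strpos($url, "\0") !== false) {
    //suspicious path, never read it from disk; let CakePHP handle the request
    include('index.php');
}
elseif (!empty($_SESSION['Auth']['User']['id'])) {
    //user logged in
    include('index.php');
}
elseif (is_readable($cacheFile)) {
    //not logged in, cache exists: output the static file without executing it
    readfile($cacheFile);
}
else {
    //not logged in, no cache
    include('index.php');
}
?>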
[/code]
In app/webroot/.htaccess, I replaced one line:
[code]
RewriteCond %{REQUEST_METHOD} ^GET$
RewriteCond %{DOCUMENT_ROOT}/cache/$1/index.html -f
#RewriteRule ^(.*)$ /cache/$1/index.html [L]
RewriteRule ^(.*)$ /html_cache.php?url=$1 [L]
[/code]
And this works fine!
Every request now passes through a little PHP code, which serves either the cached version or CakePHP, but performance is still good.
Do you have any comments about this workaround?