Force update of AppCache with PHP

AppCache can be hard to deal with: browsers seem unpredictable in deciding when to update it. This example PHP script (for PHP 5) makes sure the AppCache gets updated whenever pages have been changed.

Important

This script relies heavily on HTTP headers sent dynamically by PHP. Headers that are defined in .htaccess can never be overruled by PHP, so don’t put these headers (especially Last-Modified and Expires) in .htaccess; if you do, the script won’t work.

The Process: two modes

The script is executed on the server in one of two modes: from the manifest or from a regular webpage on your site.

To clarify this, let’s name the first mode “manifest-mode” and the second mode “page-mode”. The two modes use different parts of the PHP script in manifest-loader.php.

  • In manifest-mode the script is activated directly by .htaccess. In this mode the script echoes the contents of manifest.appcache to the browser, but not before it has searched all the folders listed in a special file named folders.txt for pages that are more recent than the manifest. Since almost every page of your site may link to the manifest, the script starts its search for modified pages every time one of these pages is visited.
  • In page-mode, when the script is included in a page, it only checks whether the current webpage is more recent than the manifest. In page-mode the script also sends cache-control headers (Cache-Control: max-age, Last-Modified, Expires) for the current webpage to the browser.
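
The choice between the two modes comes down to a single query parameter that .htaccess adds to the request. As a minimal sketch (the function name detect_mode is hypothetical and is not part of manifest-loader.php), the decision could be expressed like this:

```php
<?php
// Hypothetical helper, for illustration only: decides which mode applies
// based on the query parameters of the request.
function detect_mode(array $query) {
    // .htaccess rewrites /manifest.appcache to manifest-loader.php?sendmanifest=true,
    // so the presence of "sendmanifest" signals manifest-mode.
    return isset($query['sendmanifest']) ? 'manifest-mode' : 'page-mode';
}

echo detect_mode(array('sendmanifest' => 'true')), "\n"; // manifest-mode
echo detect_mode(array()), "\n";                         // page-mode
```

In the real script the same test is done against $_GET in the constructor.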

More about the manifest-mode

  1. The script checks whether the script itself or one of your regular webpages is more recent than the manifest.
  2. If so, the manifest on the server is updated with a new version number and modification time. Then it is sent to the browser, along with some headers that tell the browser that the manifest has expired and has to be refreshed from the server during the next request of a webpage.
  3. The update of the online manifest forces the visitor’s browser to update its local AppCache.
  4. If no webpages are more recent than the manifest, the script checks whether your stylesheet is more recent than the manifest. In that case the manifest is updated as well.
  5. There may be a delay before you see updates of the manifest. This delay is determined by:
    1. The number of files in the AppCache that have to be checked for updates (and the time it takes to download modified files).
    2. The max-age sent by the script for each file while in page-mode.
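
The core of steps 1 to 4 is a comparison of modification times. The following is only a sketch of that comparison, assuming the modification times have already been collected; manifest_is_stale is a hypothetical name, and the timestamps are made-up values:

```php
<?php
// Hypothetical helper: returns true when any checked file is newer than the manifest.
function manifest_is_stale($manifest_mtime, array $file_mtimes) {
    foreach ($file_mtimes as $mtime) {
        if ($mtime > $manifest_mtime) {
            return true; // one newer file is enough; stop looking
        }
    }
    return false;
}

$manifest_mtime = 1000; // made-up Unix timestamps, for illustration only
var_dump(manifest_is_stale($manifest_mtime, array(900, 950)));  // bool(false)
var_dump(manifest_is_stale($manifest_mtime, array(900, 1200))); // bool(true)
```

The real script obtains these times with filemtime() and stops checking as soon as the manifest has been updated once.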

More about the page mode

  1. If the requested webpage on the server is more recent than the manifest, the script will update the manifest on the server with a new version-number and modification-time.
  2. The script also checks whether the requested page is in a folder that is not yet listed in folders.txt. If so, the new folder is added to this file. The list of folders enables the script to check the new folder for modified pages when it’s in manifest-mode.
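
Step 2 can be sketched in isolation as follows. The helper names folder_of and register_folder are hypothetical; the regular expression is the one the script applies to the request URI:

```php
<?php
// Hypothetical helper: strips the file name from a request path,
// leaving the folder with a trailing slash.
function folder_of($request_uri) {
    return preg_replace('/[^\/]+$/', '', $request_uri);
}

// Hypothetical helper: adds the folder to the list if it is not there yet.
function register_folder(array $folders, $folder) {
    if (!in_array($folder, $folders)) {
        $folders[] = $folder;
    }
    return $folders;
}

$folders = array('/');
$folders = register_folder($folders, folder_of('/subdir/page3.php'));
$folders = register_folder($folders, folder_of('/page2.php')); // "/" is already listed
print_r($folders); // contains "/" and "/subdir/"
```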

The Files

You need four files, each of which you can find below:

  1. The manifest, called manifest.appcache
  2. The .htaccess file
  3. The script, called manifest-loader.php
  4. A register of the folders on your site, called folders.txt

Also you need to include some extra PHP-code in your regular webpages, as you can see in the samples below.

All files mentioned above are available in an archive: zip-archive.

Don’t forget to make manifest.appcache and folders.txt writable! Without this permission the script will not be able to write/change/add anything in these files.
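
A quick way to verify the permissions is a small check script. This is only a sketch: unwritable_files is a hypothetical helper, and the paths assume that both files live in the folder /appcache below the document root, as in the rest of this example.

```php
<?php
// Hypothetical helper: returns the files from the list that PHP cannot write to.
function unwritable_files(array $paths) {
    $problems = array();
    foreach ($paths as $path) {
        if (!is_writable($path)) { // also false if the file does not exist
            $problems[] = $path;
        }
    }
    return $problems;
}

// Assumed layout: both files live in /appcache below the document root.
$root = empty($_SERVER['DOCUMENT_ROOT']) ? '.' : $_SERVER['DOCUMENT_ROOT'];
$dir = $root . '/appcache';
foreach (unwritable_files(array("$dir/manifest.appcache", "$dir/folders.txt")) as $path) {
    echo "Not writable: $path\n";
}
```

Run it through the web server rather than from the command line, because the web server usually runs as a different user than you do.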

The Manifest

This is an example of the manifest that is needed to enable AppCache. Modify it according to your requirements, but don’t change the “# Version:” and “# Updated:” lines; the script rewrites them.

CACHE MANIFEST

# Don't touch the following two lines; they will be updated dynamically by manifest-loader.php:
# Version: 58
# Updated: 2011-09-05, 16:26:57

CACHE:

/page.php
/css/css.css

# If you want to use fallback pages for pages that haven't been downloaded yet, or if you want to define pages that always have to be fetched online, remove the corresponding hash signs below:

#FALLBACK:
#/ /offline.php

#NETWORK:
#*
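
A browser only refreshes its AppCache when the text of the manifest itself has changed, which is why the script rewrites the “# Version:” and “# Updated:” comment lines. The sketch below shows that rewrite on a string instead of on the file; bump_manifest is a hypothetical name, and the closure requires PHP 5.3 or later:

```php
<?php
// Hypothetical helper: bumps the version number and refreshes the timestamp
// in the text of a manifest. $now is a Unix timestamp.
function bump_manifest($contents, $now) {
    // Increase the number after "Version:" by one.
    $contents = preg_replace_callback('/Version:\s*(\d+)/', function ($m) {
        return 'Version: ' . ((int) $m[1] + 1);
    }, $contents);
    // Replace the rest of the "# Updated:" line with the new date and time.
    return preg_replace('/# Updated: .+/', '# Updated: ' . date('Y-m-d, H:i:s', $now), $contents);
}

$manifest = "CACHE MANIFEST\n# Version: 58\n# Updated: 2011-09-05, 16:26:57\n";
echo bump_manifest($manifest, time()); // the version line now reads "# Version: 59"
```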

.htaccess

Only the bare minimum of rules needed for this setup are demonstrated here. This file must reside in the root directory of your site.

RewriteEngine On

# send the manifest:
RewriteRule ^manifest\.appcache$ appcache/manifest-loader.php?sendmanifest=true [L]

<IfModule mod_headers.c>
 # make sure (with ETag and max-age) that modified static content will be reloaded after the manifest has been updated:
 <FilesMatch "\.(html|htm|xml|txt|css)$">
  FileETag MTime Size

  Header set Cache-Control "max-age=120"
 </FilesMatch>

</IfModule>

folders.txt

In this file manifest-loader.php maintains a list of folders on your site. This list enables manifest-loader.php while in manifest-mode to crawl the folders of your site in search of files that are more recent than the manifest.

a:1:{i:0;s:1:"/";}
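
The contents of folders.txt are simply a PHP array run through serialize(). As a small illustration (the folder /subdir/ is just an example value), this is how the list grows when a second folder is registered:

```php
<?php
// The initial contents of folders.txt, as shown above.
$stored = 'a:1:{i:0;s:1:"/";}';

$folders = unserialize($stored);   // array("/")
$folders[] = '/subdir/';           // register a second folder (example value)
echo serialize($folders), "\n";    // a:2:{i:0;s:1:"/";i:1;s:8:"/subdir/";}
```

Because of this format you should not edit folders.txt by hand unless you keep the string lengths and the element count in sync.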

manifest-loader.php

This is the script that does all the hard work. It has to be included at the very top of your pages, above the doctype (see the samples below).

<?php
class AppCache {

 //first we define some properties of the class:

 /** Root folder of your site; dynamically filled in the constructor
  * @var string */
 private $_root_folder = "";

 /** The folder that contains this script and the manifest; dynamically filled in the constructor
  * @var string */
 private $_appcache_folder = "";

 /** Path to the manifest; dynamically filled in the constructor
  * @var string */
 private $_manifest = "";

 /** Contents of the manifest, fetched from manifest.appcache and sent to the browser
  * @var string */
 private $_manifest_contents = "";

 /** Will be true if and when the manifest source-file manifest.appcache has been updated. This update is executed by AppCache::_manifest_update()
  * @var boolean */
 private $_manifest_updated = false;

 /** The delay in seconds it will take before you will see updated content in your browser. 2 minutes (120 seconds) is a good value for this: you don't have to wait overly long for updated content and your pages will still load very fast.
  * @var numeric */
 private $_appcache_refresh_delay = 120;

 /** Folders that will be checked by the dynamic manifest. The folder list is stored in the file folders.txt in the folder appcache and is updated automatically. Initially only the root folder of the site - "/" - is stored
  * @var array */
 private $_folders = Array("/");

 /** txt-file that stores the serialized list of folders of your site
  * @var string */
 private $_folder_index = "";

 /** Folders that contain the stylesheet(s) of your site. If a stylesheet has been updated, the manifest will also be updated, to force an update of the AppCache in the browser
  * @var array */
 private $_css_folders = Array("/css/");

 /** File modification time of the manifest, needed for computations to determine if the manifest needs an update
  * @var numeric */
 private $_manifest_filemtime = 0;

 /** True if this file has been referenced from .htaccess (when the manifest has to be echoed). If the current class has been instantiated from an include on a page, this property will stay false.
  * @var boolean */
 private $_is_manifest = false;

 /**
  * Constructor of the class. It will echo the manifest or update it, if required
  *
  * @return    void
  */
 public function __construct () {

  //determine if this file has been loaded as stand-alone (to echo the manifest), or as an include on a page:
  if (isset($_GET, $_GET['sendmanifest']))$this->_is_manifest = true;

  //initialise the locations of the folders:
  $this->_root_folder = $_SERVER['DOCUMENT_ROOT'];
  $this->_appcache_folder = $this->_root_folder . "/appcache";

  $this->_folder_index = $this->_appcache_folder . "/folders.txt";

  //initialize some properties:
  $this->_folders = unserialize(file_get_contents($this->_folder_index));

  //fall back to the root folder if folders.txt is missing or unreadable:
  if (!is_array($this->_folders))$this->_folders = Array("/");

  $this->_manifest = $this->_appcache_folder . "/manifest.appcache";

  $this->_manifest_filemtime = filemtime($this->_manifest);

  //if in manifest mode, echo the contents of manifest.appcache:
  if ($this->_is_manifest)$this->_echo_manifest();

  //otherwise: if this file has been loaded from a regular php-page:
  else {
   //determine the file modification time of the page that has called this file/class:
   $pagetime = filemtime(CALLER);

   //if the page is newer than the manifest, update the manifest:
   if ($pagetime > $this->_manifest_filemtime)$this->_manifest_update();

   //send some headers for cache-control of the page (don't use the second argument true for header(), or you'll get internal server errors):
   header("Cache-Control: max-age=" . $this->_appcache_refresh_delay);

   //the modification time of the page belongs in gmdate(), not in header():
   header("Last-Modified: " . gmdate("D, d M Y H:i:s", $pagetime) . " GMT");

   $expires = time() + 60 * 60 * 24 * 14;
   header("Expires: ".gmdate("D, d M Y H:i:s", $expires)." GMT"); // Far future expiration header set to 14 days

   //optionally update the list of folders that have to be checked for changes in the php-pages (list will be stored in appcache/folders.txt):
   $this->_update_folders();
  }

 } //end of __construct()

 /**
  * Echoes the manifest and before that conditionally updates the manifest if pages or resources of the site have been updated
  *
  * @return    void
  */
 private function _echo_manifest () {

  // if this class / file has been updated, update the manifest:
  if (filemtime(__FILE__) > $this->_manifest_filemtime)$this->_manifest_update();

  //if the manifest hasn't been updated by the previous line, check the folders of the site for changes in the php-pages.
  //if any are found, the manifest will then also be updated, in order to force a refresh of the local AppCache of the browser
  if (!$this->_manifest_updated) {
   //will contain folders that are listed in folders.txt but don't exist anymore:
   $delete_folders = Array();

   foreach ($this->_folders as $folder) {
    $folder_to_check = $this->_root_folder . substr($folder, 0, -1);

    //if a folder doesn't exist anymore, add it to a list. The folders in this list will be purged from folders.txt later on:
    if (!file_exists($folder_to_check) || !is_dir($folder_to_check)) {
     $delete_folders[] = $folder;
     continue;
    }

    //determine the most recent file modification time in a folder:
    $most_recent_page = $this->_get_most_recent_file_in_folder($folder_to_check);

    //if the most recent page in a folder is more recent than the manifest, update the manifest:
    if ($most_recent_page > $this->_manifest_filemtime) {
     $this->_manifest_update();

     break; //because the manifest has been updated, the folders don't have to be checked anymore for files with more recent modification times
    }
   }

   //if there are any folders listed in folders.txt that don't exist anymore, delete them from this file:
   if (count($delete_folders)) {
    $this->_folders = array_values(array_diff($this->_folders, $delete_folders));
    file_put_contents($this->_folder_index, serialize($this->_folders));
   }
  }


  // only if an update of the manifest has not been necessary until now, check the stylesheets for modification. If a stylesheet exists that is more recent than the manifest, update the manifest:
  if (!$this->_manifest_updated) {
   foreach ($this->_css_folders as $folder) {
    $folder_to_check = $this->_root_folder . substr($folder, 0, -1);

    $most_recent_page = $this->_get_most_recent_file_in_folder($folder_to_check, Array("extension" => "css"));

    if ($most_recent_page > $this->_manifest_filemtime) {
     $this->_manifest_update();

     break; //because the manifest has been updated, the other folders don't have to be checked anymore for files with more recent modification times
    }
   }
  }

  //send some headers for the manifest. Specifically needed for Firefox, which has a tendency to cache the manifest and therefore not to update the AppCache. With these headers the manifest will be marked as always expired, even for Firefox:

  header("Content-Type: text/cache-manifest", true);
  header("Cache-Control: must-revalidate, proxy-revalidate, max-age=0", true);
  header("Expires: ".gmdate("D, d M Y H:i:s")." GMT", true); // Always expired
  header("Last-Modified: " . gmdate("D, d M Y H:i:s") . " GMT", true);// always modified


  //echo the contents of the sourcefile manifest.appcache:
  readfile($this->_manifest);

 } //end of _echo_manifest()


 /**
  * Update the manifest, to force an update of the AppCache in the browser
  *
  * @return    void
  */
 private function _manifest_update () {

  $this->_manifest_contents = file_get_contents($this->_manifest);

  //increase the version number by one (the closure requires PHP 5.3 or later; create_function() is no longer available in PHP 8):
  $this->_manifest_contents = preg_replace_callback("/Version:\s*(\d+)/", function ($matches) {
   return "Version: " . ((int) $matches[1] + 1);
  }, $this->_manifest_contents);

  $this->_manifest_contents = preg_replace("/# Updated: .+/", "# Updated: " . date("Y-m-d, H:i:s"), $this->_manifest_contents);

  file_put_contents($this->_manifest, $this->_manifest_contents);

  //log the modified-status of the manifest; this way further checks for the need to update the manifest will be skipped
  $this->_manifest_updated = true;

 } //end of _manifest_update()

 /**
  * If an update thereof is required, update the list of folders with pages in folders.txt
  *
  * @return   void
  */
 private function _update_folders () {

  //determine the path to files relative to the document root of the site:
  $folder = preg_replace("/[^\/]+$/", "", $_SERVER['REQUEST_URI']);

  //if "http" has been used in the request_uri, a hacker probably is trying things. So we won't continue with the method in that case.
  if (stristr($folder, "http"))return;

  //you have to make sure the folder index file folders.txt exists AND it has to be writable.
  // if a new folder has been detected that isn't in this list, it has to be added to the index file:
  if (!in_array($folder, $this->_folders)) {
   $this->_folders[] = $folder;
   file_put_contents($this->_folder_index, serialize($this->_folders));
  }
 }

 /**
  * Returns the modification time of the most recent file in a folder
  *
  * @param string $folder The folder that is being searched
  * @param array $options Options for the method; "extension" selects the file type (default "php")
  *
  * @return    numeric file modification time, or 0 if no matching file was found
  */
 private function _get_most_recent_file_in_folder ($folder, $options = Array()) {

  if(!file_exists($folder) || !is_dir($folder))return 0;

  $extension = isset($options["extension"]) ? $options["extension"] : "php";

  $list = scandir ($folder); //scandir() returns a list of files (only the filenames) in the folder
  if (count($list) == 2)return 0; // then only .. and . found, which are no regular files

  $most_recent_filemtime = 0;

  foreach ($list as $file) {

   $full_path = $folder . "/" . $file;

   if (in_array($file, Array(".", "..")) || is_dir($full_path) || ($extension != "" && !preg_match("/\." . $extension . "$/", $file)))continue;

   $time = filemtime($full_path);
   if ($time > $most_recent_filemtime) {
    $most_recent_filemtime = $time;
   }
  } //end of the loop through all files

  return $most_recent_filemtime;

 } //end of _get_most_recent_file_in_folder()

} //end of class AppCache

//instantiate the class, so that the manifest will be echoed or updated:
new AppCache();

Sample files

Below you find examples of three regular pages (two in the root folder of your site and one in a subfolder “subdir”) and of the stylesheet for these pages.

Notice how the manifest loader is being loaded in the start section of your pages. You only have to add some extra PHP-code above the doctype and in the HTML element of your pages. Of course your webpages must have the extension .php for this to work.

In the root folder: page.php (in the manifest)

<?php
 define("CALLER", __FILE__);
 require_once($_SERVER['DOCUMENT_ROOT'] . "/appcache/manifest-loader.php");
?><!DOCTYPE HTML>

<!--the manifest seemingly resides in the root folder of your site, when actually it is located in the subfolder /appcache -->
<html manifest="/manifest.appcache">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
   <title>AppCache demo page 1</title>

   <link href="/css/css.css" rel="stylesheet" />
</head>

<body>
<h1>Page 1</h1>
<p>This is the only page that is explicitly listed in the manifest. The other pages will be added to the AppCache when the visitor loads them in his browser.</p>
<p>To <a href="page2.php">page 2</a></p>
<p>To <a href="subdir/page3.php">page 3</a></p>
</body>
</html>

In the root folder: page2.php (not in the manifest)

<?php
 define("CALLER", __FILE__);
 require_once($_SERVER['DOCUMENT_ROOT'] . "/appcache/manifest-loader.php");
?><!DOCTYPE HTML>

<!--the manifest seemingly resides in the root folder of your site, when actually it is located in the subfolder /appcache -->
<html manifest="/manifest.appcache">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
   <title>AppCache demo page 2</title>

   <link href="/css/css.css" rel="stylesheet" />
</head>

<body>
<h1>Page 2</h1>
<p>This page is not listed in the manifest; so it will only be added to the AppCache if and when the visitor has visited this page in his browser.</p>
<p>To <a href="page.php">page 1</a></p>
<p>To <a href="subdir/page3.php">page 3</a></p>
</body>
</html>

In a subfolder: page3.php (not in the manifest)

<?php
 define("CALLER", __FILE__);
 require_once($_SERVER['DOCUMENT_ROOT'] . "/appcache/manifest-loader.php");
?><!DOCTYPE HTML>

<!--the manifest seemingly resides in the root folder of your site, when actually it is located in the subfolder /appcache -->
<html manifest="/manifest.appcache">
<head>
   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
   <title>AppCache demo page 3</title>

   <link href="/css/css.css" rel="stylesheet" />
</head>

<body>
<h1>Page 3 (in subfolder)</h1>
<p>This page is not listed in the manifest; so it will only be added to the AppCache if and when the visitor has visited this page in his browser.</p>
<p>To <a href="../page.php">page 1</a></p>
<p>To <a href="../page2.php">page 2</a></p>
</body>
</html>

The stylesheet: css/css.css (in the manifest)

The contents of this file are not very relevant. I’ve only added it here because this stylesheet is also checked for modifications by manifest-loader.php.

body {
 background: red;
 color: white;
}

a {
 color: white;
 text-decoration: underline;
}

a:hover {
 text-decoration: none;
}
