by Travis Foster
Overview
Making a website available without an internet connection is a fairly recent development, but also an extremely important one. With smartphones in common use, more and more people who depend on webapps in their daily lives go long stretches without a consistent internet connection. Although it’s usually impossible to make an entire website work offline, it’s often worthwhile to make parts of it available for use without a connection, so (for instance) people can use the app to collect data that they can upload to a server later. In this case, it’s important to categorize your pages as being either online-only or offline-enabled, and to develop with both online and offline use in mind.
This document was originally written for a particular university class on human-computer interactions using the Grails backend framework, and is primarily aimed at students who do not necessarily have much experience with web development.
Managing the Page
Generating a webpage using something like Grails or PHP is pretty easy. You just write the HTML file, and then you add special tags with code in them wherever you have dynamic content. When the file is requested on the server, all the code in the special tags is run first and replaced with the resulting generated HTML, before the page is sent back to the web browser. This new markup is different depending on which user is logged in, what’s in the database at the time, etc. So, you end up with a dynamic page. This is fine in most cases, but it doesn’t work if the page needs to be accessible offline. The server is an essential part of this sort of page generation, and if you’re offline, the server is unavailable. In order to have dynamic pages offline, then, all the code needed to generate the page must be run on the client side, in the browser. This means JavaScript must be used.
The best way to do this is to create a ‘skeleton’ of the page, which contains all of the page’s static markup. All elements that need to change dynamically, by changing their styles or attributes, or by adding new elements inside of them, should be given ids or classes, so they can be easily located and changed by the JavaScript. The static page should include a reference to a separate JavaScript file, which contains all the code for managing the page and adding the correct dynamic content. This JavaScript must be able to manipulate the DOM, and can do so in multiple ways. jQuery is probably the most straightforward way, though other libraries, or the browser’s built-in DOM API, also work.
jQuery is primarily a DOM manipulation library. It works by selecting elements on the page and wrapping them in JavaScript objects, which have a large set of methods for manipulating the selected elements. To select an element or group of elements, the jQuery (or $) function is used. This global function takes a DOM object, an array of DOM objects, or a CSS selector string, and returns a jQuery object containing the matching DOM elements. For example, $(document) returns a jQuery object representing the root document object, and $("div.foo > .bar") returns a jQuery object representing every element of class ‘bar’ that is a direct child of a div element of class ‘foo’. Once you have a jQuery object, you can call its methods to manipulate the elements it contains. You can get or set an attribute using the ‘attr’ method, edit CSS with the ‘css’ method, write HTML inside the element with the ‘html’ method, append existing elements inside with the ‘append’ method, bind event handlers with the ‘on’ method (which supersedes the older ‘bind’ method), and a lot of other things. For example, $('.blue').css('color', 'blue'); changes the text color of every element of class ‘blue’ to blue, and $('#header').html('<h1>Header</h1>'); replaces the contents of the element with an id of ‘header’ with an h1 element containing the text ‘Header’. jQuery can also be used to extract content from the page; for example, var foo = $('#foo').val(); sets a JavaScript variable ‘foo’ to the value of the element with id ‘foo’, which is usually some kind of form field, such as a text input box.
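To make the skeleton idea concrete, here is a minimal sketch of a script that fills in a static page. The ids ‘topics’ and ‘count’, the function names, and the sample data are all hypothetical; they stand in for whatever your own skeleton contains. The jQuery function is passed in as an argument so that the markup-building logic can also be exercised outside a browser.

```javascript
// Escape text so that user-entered names cannot inject markup into the page.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the dynamic markup for a list of topics (a pure function, no DOM needed).
function buildTopicList(topics) {
  return topics
    .map(function (topic) { return '<li>' + escapeHtml(topic.name) + '</li>'; })
    .join('');
}

// Fill the skeleton. '#topics' and '#count' are hypothetical ids that the
// static HTML page is assumed to contain.
function renderTopics($, topics) {
  $('#topics').html(buildTopicList(topics));
  $('#count').text('Topics: ' + topics.length);
}

// In a browser with jQuery loaded, run once the skeleton is ready.
if (typeof window !== 'undefined' && window.jQuery) {
  window.jQuery(function ($) {
    renderTopics($, [{ name: 'Vacation' }, { name: 'Family' }]);
  });
}
```

Keeping the string-building step separate from the DOM calls is a design choice, not a requirement: it just makes the dynamic part of the page easy to check without loading the page.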
Using Device Storage
jQuery is a powerful system that can be used to easily add dynamic content to an otherwise static web page, and extract user-generated content from said page, all on the user’s device. Now the question is, where does the dynamic content come from? Where should the user-generated content go? The obvious answer to both questions is ‘the database on the server’, but in offline mode there’s no access to the server. Of course, the data needs to get to the server eventually. But until then, it needs to be stored somewhere else, temporarily: the device’s internal storage. This is where the browser’s persistent storage systems come in. There are a few of these, the most common of which is called localStorage. localStorage is easy to use, and supported by every modern browser, so it’s very good for most things. Basically, localStorage behaves like a simple JavaScript object whose keys and values are strings, but the contents are preserved across pages within a domain, and even across browser sessions. Anything other than a string must be converted to a string before it is stored.
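As a rough sketch of how this might look in practice, the helpers below save and load a structured value through a storage object that has the same getItem and setItem methods as localStorage. The helper names and the ‘draftTopic’ key are made up for illustration. Because localStorage only holds strings, the value is converted with JSON.stringify() on the way in and JSON.parse() on the way out.

```javascript
// Save any JSON-compatible value under a key. The storage only holds
// strings, so the value is serialized first.
function saveValue(storage, key, value) {
  storage.setItem(key, JSON.stringify(value));
}

// Load a value back, or return a fallback if the key has never been set.
function loadValue(storage, key, fallback) {
  var raw = storage.getItem(key); // getItem returns null for a missing key
  return raw === null ? fallback : JSON.parse(raw);
}

// Browser usage: pass window.localStorage as the storage argument.
if (typeof window !== 'undefined' && window.localStorage) {
  saveValue(window.localStorage, 'draftTopic', { name: 'Vacation', photos: [] });
  var draft = loadValue(window.localStorage, 'draftTopic', null);
  console.log('Restored draft:', draft);
}
```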
The main problem with localStorage is that it has a hard upper limit on how much data it can hold for a single domain, and that limit is usually fairly small (about 5MB in most browsers). If you want to store more than that, there are other systems available, particularly WebSQL and IndexedDB. These are both database systems with much higher storage limits (the exact quota depends on the browser), but at the time of writing they aren’t supported as widely as localStorage. Some browsers support both WebSQL and IndexedDB, others only support WebSQL, and still others only support IndexedDB. Writing code to support both would be a lot of work. Enter localForage.
localForage is a third-party JavaScript library, developed by Mozilla. It acts as a common interface for either IndexedDB or WebSQL (falling back to localStorage if neither is available), and it has a similar API to localStorage. The main differences are that its methods are asynchronous, returning their results through a callback or a promise rather than as a return value, and that it can store objects and arrays directly instead of only strings. localForage basically combines the ease of use and widespread support of localStorage with the much larger storage capacity of IndexedDB or WebSQL. You can use localforage.getItem(), localforage.setItem(), and localforage.removeItem() to retrieve, create/modify, or delete an entry in localForage, respectively. You can also use localforage.clear() to remove everything from localForage, localforage.length() to see how many items are in localForage, and localforage.keys() to retrieve a list of all the key strings currently in use.
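The sketch below shows one way these calls might be combined to keep a queue of records that are waiting to be uploaded. It uses the promise form of getItem(), setItem(), and removeItem(); the ‘pendingRecords’ key, the function names, and the sample record are hypothetical. The store is passed in as an argument, so in a browser you would hand it the global localforage object.

```javascript
// Key under which the queue of not-yet-uploaded records is kept (hypothetical).
var PENDING_KEY = 'pendingRecords';

// Append one record to the queue. `store` is a localForage-style object
// whose getItem and setItem methods each return a promise.
function queueRecord(store, record) {
  return store.getItem(PENDING_KEY).then(function (pending) {
    var list = pending || []; // getItem resolves to null for a missing key
    list.push(record);
    return store.setItem(PENDING_KEY, list); // resolves to the stored value
  });
}

// Empty the queue, for example after a successful upload.
// Note that the removal method is called removeItem.
function clearQueue(store) {
  return store.removeItem(PENDING_KEY);
}

// Browser usage, assuming the localForage script has been loaded on the page.
if (typeof window !== 'undefined' && window.localforage) {
  queueRecord(window.localforage, { topic: 'Vacation', caption: 'Beach' })
    .then(function (list) { console.log(list.length + ' record(s) waiting'); });
}
```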
Using AJAX
So, now we can extract user data from the page, store it on the device indefinitely, load it back, and display it again, all on the client side and without an internet connection. But when the user does get back on the web, the data will need to be uploaded to the server. How can we accomplish this? The obvious answer is to fill the data into a form and programmatically press the submit button, but that has its fair share of drawbacks: the data has to fit into form fields, and submitting the form replaces the current page with the server’s response. A better option is to use a technique called AJAX. AJAX is a method of sending a request to the server via JavaScript, in such a way that the response becomes available to the page as a JavaScript string, rather than loading in place of the page like a normal page load would.
Because of this, AJAX requests are not restricted to containing form-style key-value pairs, and AJAX responses are not restricted to containing HTML. In fact, AJAX stands for Asynchronous JavaScript And XML, because XML was traditionally used to represent the data transferred in both the request and response bodies. More recently, however, JSON (JavaScript Object Notation) has largely replaced XML in that regard, due to its simplicity and ease of use. JSON is a restricted subset of JavaScript’s literal syntax, and can represent a subset of the data structures that can be expressed in JavaScript (namely: strings, numbers, booleans, arrays, objects (dictionaries), and null).
JSON is nice for a couple of reasons. First, because it is a subset of JavaScript, even a browser with no built-in JSON support can read a JSON string by feeding it through JavaScript’s eval() function. (This is absolutely not recommended, as it is unsafe: eval() will run any code the string contains. Use JavaScript’s built-in JSON.parse() function instead.) More importantly, though, JSON is very easy to use because JSON values directly correspond to JavaScript values. An XML document would need to be built in a special way, using special functions to inject or extract the data properly, but a JSON string can be converted directly to and from a JavaScript value with a single function call. For this and other reasons, JSON is generally recommended over XML for use with AJAX.
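A short illustration of that single-call conversion, using only the built-in JSON.stringify() and JSON.parse() functions. The sample record and the safeParse helper are invented for the example.

```javascript
// A JavaScript value made only of JSON-compatible types.
var record = { topic: 'Vacation', photos: 3, tags: ['beach', 'sun'], uploaded: false, note: null };

// Serialize to a JSON string, e.g. to store it or send it in a request body.
var text = JSON.stringify(record);

// Parse the string back into an equivalent JavaScript value.
var copy = JSON.parse(text);

// Values JSON cannot represent are dropped: functions and undefined
// properties do not survive the round trip.
var lossy = JSON.parse(JSON.stringify({ kept: 1, dropped: undefined, fn: function () {} }));

// JSON.parse throws a SyntaxError on malformed input, so guard untrusted text.
function safeParse(jsonText, fallback) {
  try {
    return JSON.parse(jsonText);
  } catch (e) {
    return fallback;
  }
}
```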
AJAX can be performed in a variety of ways, one of the most straightforward of which is jQuery. You can send GET and POST AJAX requests using jQuery.get() and jQuery.post(), and you can convert between JavaScript data and JSON strings using JSON.parse() and JSON.stringify().
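One detail worth knowing: by default, jQuery.post() sends its data as form-encoded key-value pairs. If you want the request body itself to be JSON, one option is the more general jQuery.ajax() function with the contentType option set, as in the sketch below. The ‘/photo/upload’ URL, the function names, and the sample data are hypothetical.

```javascript
// Build the jQuery.ajax settings for a JSON POST request.
function jsonPostSettings(url, data) {
  return {
    url: url,
    type: 'POST',
    data: JSON.stringify(data),      // request body as a JSON string
    contentType: 'application/json', // tell the server the body is JSON
    dataType: 'json'                 // ask jQuery to parse the response as JSON
  };
}

// Send the records and pass the parsed response to a callback.
function uploadRecords($, url, records, onDone, onFail) {
  return $.ajax(jsonPostSettings(url, records)).done(onDone).fail(onFail);
}

// Browser usage, assuming jQuery is loaded. The URL is a placeholder.
if (typeof window !== 'undefined' && window.jQuery) {
  uploadRecords(window.jQuery, '/photo/upload', [{ topic: 'Vacation' }],
    function (response) { console.log('Server replied:', response); },
    function () { console.log('Upload failed; keeping the records for later.'); });
}
```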
Responding to AJAX Requests
AJAX can be used to make server requests directly from JavaScript, at any time, in any format. AJAX calls are pretty simple to make: just call an AJAX function with whatever data you want to send, and provide a callback that takes the response data as an argument. But how does AJAX work on the server side? Well, as far as the server is concerned, AJAX requests are no different from any other requests. In Grails, just write a controller action, send the AJAX call to the corresponding address, and the action will run. Use the render command to build the response text, which will be sent back to the client. When the browser receives the response text, it will pass it to the AJAX callback, rather than rendering it to the page as HTML like it would with a non-AJAX response.
Because the request and response are in JSON format, the server code needs to be able to correctly parse the request data, and correctly serialize the response data. Luckily, Grails can do both of these things automatically. If the request is in JSON format, it will be automatically parsed to a Groovy value, and stored in the request.JSON variable. The response can be built as a Groovy value as well. When you have finished building the response and are ready to send it, use the command render my_response_data as JSON. This will automatically serialize the my_response_data variable to a JSON string, and send it as the response text.
Writing the Manifest
The final step to making a web app work offline is to make it available offline in the first place. Normally the pages and resources are stored on the server, but if the browser doesn’t have access to the internet, they need to be stored somewhere the browser does have access to: the device’s internal storage. This is possible with the help of the HTML5 Application Cache, a special browser cache that will indefinitely store all the files that are necessary to run the app.
To use the appcache, the first thing you need is a manifest file. This is a text file that describes all the files in your app and how the appcache should handle them. Once a manifest file is written, every HTML file that needs to be available offline should include a reference to it in the manifest attribute of the html tag, like so: <html manifest="/path/to/manifest">. The name of the manifest file doesn’t matter, but it must be served with the MIME type text/cache-manifest. In Grails, the easiest way to serve the manifest file is to save it as a gsp, and in the associated controller action, include the command response.setContentType 'text/cache-manifest'. Whenever a page with a manifest attribute is loaded by the browser, it gets stored in the appcache, along with the files listed in the manifest itself. This can cause problems with dynamic online-only pages, which would then be served from the cache with stale content. Because of this, the best approach (at least, for these kinds of apps) is to only include the manifest attribute on pages that are explicitly listed in the manifest already. That way, there is a clear distinction between the offline-accessible pages and the online-only pages.
The manifest file itself begins with the line CACHE MANIFEST. After that, it consists of a set of paths, one on each line, with up to three optional section headers denoting what the paths mean. The CACHE: header tells the browser that all the pages under it should be cached. This should include every page that needs to be accessible in offline mode, and all the resources those pages need to access when they’re being used offline. The NETWORK: header tells the browser that all pages under it can be loaded over the network. If an uncached page is not listed under this header, the browser will not be able to load it, even when it’s online. Unless there’s a particular need to do otherwise, the only line under this header should be *, which denotes that every page should be accessible through the network. Finally, the FALLBACK: header gives a list of fallback pages to use if uncached pages can’t be loaded. This section actually has two paths per line, separated by a space. The first path is to a file that might be requested, and the second path is to a cached file that should be used in place of the first file, in case it can’t be loaded. This is useful when you want to load a different page or resource depending on whether the app is being used online or offline. If no headers exist in the manifest file, all paths are implicitly under the CACHE: header. Comments are denoted by a hash (#) symbol.
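As an illustration, a manifest following this layout might look something like the one below. Every path in it is hypothetical and would need to be replaced with the files your own app uses; the version comment on the second line is just one possible convention.

```text
CACHE MANIFEST
# version 1

CACHE:
/offline/index.html
/css/app.css
/js/jquery.js
/js/app.js

NETWORK:
*

FALLBACK:
/js/online.js /js/offline.js
```

Here the four files under CACHE: are stored for offline use, the * under NETWORK: lets every other path load normally while online, and the FALLBACK: line serves the cached /js/offline.js whenever /js/online.js can’t be loaded.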
The first time a browser visits a page that includes a manifest attribute in its html tag, it will download the manifest file specified by that attribute, and then it will attempt to download every file the manifest explicitly lists for caching. If all files download properly, they are all stored in the appcache. If the browser visits that page later on, it will attempt to download the manifest file again, and it will compare the new manifest to the old one. If there have been any changes, the browser will attempt to recache the app. Otherwise, the browser will always use the cached files, even if it has access to hosted versions of the same files. Therefore, the contents of the manifest should always be changed whenever a cached page is updated, so that the browser will use the new version. This often takes the form of a comment with a version number, either for the entire manifest file or a separate one for each file being cached.
The manifest file may be difficult to properly include in a project that uses a Grails layout to provide things like headers, footers, and common CSS and JavaScript files for every page. In this situation, every page that uses the layout shares the layout’s html tag, so if you put the manifest attribute there, it ends up on every page, not just the pages that need to be accessible offline. This becomes a problem when online-only pages contain dynamic content, because they get cached along with the offline pages. The best solution for this, as far as I can tell, is to use a content tag to tell Grails which kind of page is being generated, so that the layout can dynamically decide whether to include the manifest. In the layout, use this code to generate the html tag:
<g:if test="${pageProperty(name: 'page.manifest')}"> <html manifest="/path/to/manifest"> </g:if><g:else><html></g:else>
Then, in each of the pages that you want to be available in offline mode, include the line <content tag="manifest">manifest</content> after the html tag.
Online or Offline?
Offline-enabled pages should display and respond differently depending on whether they’re being viewed online or offline. Some page elements might require an internet connection, and should be removed or replaced in offline mode, and some page elements (such as warning messages about the app being in offline mode, etc.) should only be displayed when offline. This, of course, can all be done using simple JavaScript logic. But how can you tell whether you’re online or offline in the first place? There are a few different ways of doing this, each with its own advantages and disadvantages.
The most obvious method is to send an AJAX request to the server whenever you load an offline-enabled page. If the server responds, you know you have online access, and otherwise, you are offline. This works pretty well, but in some circumstances (for example, on a weak connection where the request hangs until it times out) it can cause pages to load extremely slowly. There is also a property called navigator.onLine, which can sometimes be useful, but it will only tell you if the device is connected to a network, so even if that property is true, you still may not be able to connect to your server.
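A minimal sketch of the AJAX approach is shown below, with a short timeout so that a hanging request doesn’t stall the page for long. The ‘/ping’ URL, the function name, and the three-second timeout are assumptions made for the example; any small resource on your server would do.

```javascript
// Resolve to true if the server answers within timeoutMs, false otherwise.
function checkServer($, url, timeoutMs) {
  return new Promise(function (resolve) {
    $.ajax({
      url: url,
      type: 'HEAD',       // only the headers are needed, not a response body
      cache: false,       // add a cache-busting parameter to the request URL
      timeout: timeoutMs, // give up quickly instead of stalling the page
      success: function () { resolve(true); },
      error: function () { resolve(false); }
    });
  });
}

// Browser usage, assuming jQuery is loaded. '/ping' is a placeholder URL.
if (typeof window !== 'undefined' && window.jQuery) {
  checkServer(window.jQuery, '/ping', 3000).then(function (online) {
    console.log(online ? 'Server reachable' : 'Working offline');
  });
}
```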
The best method, as far as I can tell, is to use the appcache system. Every time you load an offline page, the browser makes a special request to your server for the cache manifest, so it can decide whether to recache. The result of this request can tell you whether you are online. There are two ways to determine this result. The first is to listen for the window.applicationCache “error” event, which fires if the manifest file couldn’t be downloaded. An easier method is to take advantage of the FALLBACK: section in the manifest file. For example, you might create two files, online.js (which sets a global online variable to true), and offline.js (which sets the global online variable to false). In each of your offline-enabled pages, you might include a reference to online.js. Then, you can add a line to your cache manifest under the FALLBACK: header that sets offline.js as a fallback for online.js. Now, one of those two scripts will run on every page load, and set the online variable, according to whether the manifest downloaded successfully. From here, you can just check the value of the online variable any time you need, and it will correctly reflect the state of your internet connection.
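For the event-based variant, a sketch along the following lines may help. It treats the ‘noupdate’, ‘cached’, and ‘updateready’ events as signs that the manifest was fetched, and the ‘error’ event as a sign that it wasn’t. The function name and the use of a global online variable are illustrative choices, and the cache object is passed in as an argument so the logic isn’t tied to a particular page.

```javascript
// Report connectivity based on the outcome of the manifest check.
// `appCache` is expected to be window.applicationCache.
function watchManifestCheck(appCache, onResult) {
  // Each of these events means the manifest was fetched from the server.
  ['noupdate', 'cached', 'updateready'].forEach(function (name) {
    appCache.addEventListener(name, function () { onResult(true); });
  });
  // 'error' fires when the manifest (or a file it lists) could not be downloaded.
  appCache.addEventListener('error', function () { onResult(false); });
}

// Browser usage: keep a global flag up to date for the rest of the page.
if (typeof window !== 'undefined' && window.applicationCache) {
  watchManifestCheck(window.applicationCache, function (isOnline) {
    window.online = isOnline;
  });
}
```

Unlike the fallback-script approach, the flag here is only set once the browser has finished its manifest check, so code that depends on it should run from the callback rather than immediately on page load.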
Offline Implementation Examples
Adam Weidner and Stephen Radachy have separately developed example offline implementations of a photo album web app. Users of the app can make a photo album that includes “topics” and the photos in each topic. Both the topics and the photos can be created offline and later uploaded to the database server.
Both apps assume that the app is originally implemented using Grails to work online only, and that later development adds offline capabilities. The code for the online version of the web app, myPhotos, is on Bitbucket:
https://bitbucket.org/grailsdirectedstudy/myphotos-online-only
myPhotos Offline
Adam Weidner added offline support using only jQuery and IndexedDB. It is a tour de force because of the many layers of callbacks required to handle the code. The code is on Bitbucket:
https://bitbucket.org/grailsdirectedstudy/myphotos-complete
The code documentation is in the repository (Documentation.docx).
ngPhotos Offline
Stephen Radachy added offline support using AngularJS and an IndexedDB plugin for AngularJS. It is an elegant solution because AngularJS provides an MVC paradigm and two-way data binding. The code is on Bitbucket:
https://bitbucket.org/grailsdirectedstudy/ngphotos
The code documentation is in the repository (Documentation.docx).