The DNAnexus platform’s front-end is driven off a technology stack called “Membrane”. Membrane is a front-end views framework which allows internal developers to build isolated components that can be used throughout the website to build complex interactions. Below are two screenshots from two different components, the Data Tree and the Data List, respectively.
Data Tree – Used to show the folder hierarchy in a project, allowing users to expand/collapse folders, select a folder, etc.
Data List – Used to show a list of objects/folders, with support for sorting, etc. Typically used to show the items in folder, or search results.
The problem / more background
Utilizing the web cache can greatly speed up the loading of your product, so we know we want to use the cache as much as possible. Simply updating your server to tell the clients that these assets never expire is a simple solution to ensure the cache is used, but at the expense of clients seeing a stale product as long as these entries exist in their cache. What we want is a solution which will use a browser cache entry as long as possible, while quickly invaliding a cache entry when there is a newer version of the asset.
We will now define a new release asset called the manifest. The manifest contains a mapping from an asset path, such as /data/list/view.js to the MD5 checksum for that asset (c2a92513c4604b92255f620950ecb93c). The manifest is loaded during bootstrapping and is always consulted when fetching an asset to provide the latest MD5 checksums for that asset. To prevent caching of the manifest itself, we use the HTTP header “Cache-Control: no-cache”. Every time a user loads the page we will fetch the manifest from the server. We also periodically ping the server during the user session to detect manifest changes and notify users that there are some updates.
The next part of the solution involves how the manifest information is used to manage the browser cache. Earlier I mentioned that the browser may use the cache to fulfil requests for previously accessed content. The browser determines whether or not it has already seen an asset by comparing the url of the asset with entries in the cache. Therefore if the page is requesting http://foo.com/image.png the browser will first check if it has a cache entry with that url. Knowing how the browser keys cache entries gives us the ability to force a client update by simply changing the URL of a particular asset.
The final piece of our solution is to make deployments simple. We don’t want to keep these MD5 checksums on our server’s file system, so we use an nginx rewrite rule to strip the MD5 checksum from the path when resolving the asset on disk. So in reality we don’t have assets on disk with md5 checksums in their path, but rather the MD5 checksum is used only in the URL to fetch the asset as a mechanism for managing the user’s browser cache.
Cache management is a tricky problem and if you search the web you will find a variety of solutions, each having its own set of pros and cons. Cache validation is also a complex topic which involves more mechanisms that I’ve gone into here, such as ETags.
After evaluating existing solutions both internally and externally, we’ve come up with a fairly simple solution that allows us to update quickly and often, while leveraging the user’s cache as much as possible to provide fast load times and remove the potential for users seeing a stale version of a component.
For additional information or if you have any questions, please feel free to email firstname.lastname@example.org.