This document describes the configuration format for CacheFly Advanced Cache Control.
CacheFly Advanced Cache Control (ACC) is a service which improves the efficiency of delivering dynamic websites, where dynamic means that the website generates content on demand. The response can be any type of media in any format (including HTML, images, video, etc).
This is primarily achieved by allowing configuration of Caching Policy for specific paths and file extensions. Through this config you are able to significantly increase your cache-hit-ratio and provide a better experience to end users.
Advanced Cache Control (ACC) is a script which needs to be configured to your specific use case.
CacheFly is responsible for the script.
You are responsible for the config (with our help, if you need).
Once the script is enabled for your account, you will have the option of creating a configuration file. This config is then associated with your services. This gives you full control of your service.
You may upload the configuration file to the CDN via the API or the Portal.
If you do not see any reference to this script in your account and would like to use it, please contact us.
The configuration file may be provided to us in either YAML or JSON format. We support both as a convenience.
We are aware that some automations find it easier to work with JSON, while those who write the config by hand may find it easier to work with YAML. Please pay attention to the MIME type while uploading the configuration.
All the configuration examples here are shown in YAML format. YAML is intended to be more human friendly than JSON. It is reasonably easy to follow (without any experience), but sometimes writing it takes a little getting used to. The main advantages of YAML are the lack of curly braces, the use of indentation to define the structure, and the ability to add comments.
If you are not familiar with YAML, there are many great resources online to help you get started.
Additionally, a YAML-aware IDE or editor will help you with authoring and modifying the configuration file.
The editor within our Portal is based on the very popular Visual Studio Code editor. It will show various warnings and errors as you type. You should ensure that all of these are addressed before saving the configuration.
However, our Portal is intended to be useful and convenient (we're a CDN company, not an IDE company). As such, we also wish to provide you with a list of other tools that our team has found useful:
If you require any support with authoring or modifying your configuration files, please contact our support team who will be very happy to assist.
After uploading your configuration to us, our systems will deploy it to the CDN (almost) immediately.
When applying changes there may, or may not, be a propagation delay. This is to ensure that a high frequency of changes by a small number of users does not negatively impact the performance of the network as a whole. As such, although we endeavour to make this as fast as possible, and immediate in most cases, the only guarantee is that it will occur eventually.
Some limited validation checks are performed during upload. If they fail, your new configuration will not be applied and the existing configuration will continue to be used. However, there may be many small details that cannot be checked automatically, and it is easy to break a production system by introducing an error into the configuration.
We recommend that you:
It is possible for us to revert to a previous configuration if you have a problem. However this is a manual process and will take some time (several hours) to complete.
The Advanced Cache Control (ACC) configuration file contains two keys. Both need to be present for the configuration to be considered valid.
The `default` key allows you to specify the default behaviour of the CDN.
Under this key the only valid key to specify is:
- `caching`
The `exceptions` key allows you to specify exceptions to the default behaviour.
Under this key you must specify a list (aka. sequence, or array).
Each item in the list may contain the following keys:
- `path`
- `extensions`
- `caching`
The `caching` key configures the Caching Policy that should be used.
This is described in the Caching Policy section below.
The `path` and `extensions` keys are conditions which configure when that exception should be used. These are described in the Exception Matching section below.
default: # configure the CDN default behaviour.
  caching:
    mode: respect-origin-assume-cache
exceptions: # configure exceptions to the default.
  - path: "/images/" # first exception
    caching:
      mode: ignore-origin-and-cache
See below for a more complete example.
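The structural rules above can be checked locally before uploading. The following is a minimal sketch in Python, assuming the configuration has already been parsed into a dictionary (for example from the JSON form). The function name `check_structure` and its messages are hypothetical, and these checks are not the validation that CacheFly performs during upload.

```python
import json

# Keys permitted by the structure described above.
ALLOWED_DEFAULT_KEYS = {"caching"}
ALLOWED_EXCEPTION_KEYS = {"path", "extensions", "caching"}


def check_structure(config):
    """Return a list of structural problems found in a parsed ACC config."""
    problems = []

    # Both top-level keys must be present.
    for key in ("default", "exceptions"):
        if key not in config:
            problems.append(f"missing top-level key: {key}")

    # Under "default" the only valid key is "caching".
    default = config.get("default")
    if isinstance(default, dict):
        for key in sorted(default.keys() - ALLOWED_DEFAULT_KEYS):
            problems.append(f"unexpected key under default: {key}")

    # "exceptions" must be a list of mappings with known keys.
    exceptions = config.get("exceptions")
    if "exceptions" in config and not isinstance(exceptions, list):
        problems.append("exceptions must be a list")
    for index, item in enumerate(exceptions if isinstance(exceptions, list) else []):
        if not isinstance(item, dict):
            problems.append(f"exception {index} must be a mapping")
            continue
        for key in sorted(item.keys() - ALLOWED_EXCEPTION_KEYS):
            problems.append(f"unexpected key in exception {index}: {key}")

    return problems


example = json.loads(
    '{"default": {"caching": {"mode": "never-cache"}}, "exceptions": []}'
)
print(check_structure(example))  # prints []
```

An empty list means that no structural problem was found; it says nothing about whether the individual caching values are sensible.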
The caching policy is configured by selecting a mode and then tweaking the other available parameters. Where a value is not specified a default value will be used instead.
When defining the default caching policy, the default value is hard coded into the CDN logic. These defaults are documented within this document.
When defining the caching policy for an exception, the default is taken from the default policy specified in your configuration.
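One way to picture this inheritance is as an overlay of the exception's keys on top of the default policy. The sketch below is only an illustration of that reading, assuming the fallback happens key by key; the function name `effective_policy` is hypothetical and the CDN's internal logic is not documented here.

```python
def effective_policy(default_caching, exception_caching):
    """Overlay an exception's caching keys on the configured default policy."""
    merged = dict(default_caching)    # start from the default policy
    merged.update(exception_caching)  # keys set on the exception win
    return merged


default_caching = {"mode": "respect-origin-assume-cache", "ttl": "1w"}
exception_caching = {"mode": "ignore-origin-and-cache"}
print(effective_policy(default_caching, exception_caching))
```

Under this reading, the exception above would use the mode `ignore-origin-and-cache` while keeping the `ttl` of `1w` from the default policy.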
The following keys may be specified:
- `mode`
- `ttl`
- `maxTtl`
- `varyByQuery`
- `ignoredQueryParameters`
- `varyByCookie`
- `ignoredCookies`
There are four caching modes.
never-cache
This is intended for dynamically generated content which is different for every request.
This mode will never cache the content. All requests are always forwarded to the origin. Request coalescing is disabled.
Any cache control header sent by the origin is ignored. The CDN does not modify or add cache control headers when in this mode.
If the origin is unavailable, then the content is unavailable.
ignore-origin-and-cache
This is intended for content which is the same for every request and almost never changes (aka static content).
This mode will always cache the content. Request coalescing is enabled.
The cache control header sent by the origin is ignored. A CDN generated cache control header is placed on every response from the CDN.
Requests are served from the cache whenever possible. If the origin is unavailable, then already cached content is still served by the CDN.
You are expected to send a purge request if the content ever changes.
For generated static content (such as the output of running webpack), it is recommended that you use file names which contain a content hash.
respect-origin-assume-cache
This is intended for an origin which is well behaved and can generally be trusted to send sensible cache control headers.
This mode will honor the cache control header sent by the origin.
Request coalescing is enabled.
If the origin does not send a cache control header, the content will be cached using the default ttl. The cached content will be served without a cache control header.
When the origin indicates that the content is not cacheable, the CDN switches to the `never-cache` mode for this exact cache key for a period of a few minutes.
NB. The cache key is affected by options described below.
respect-origin-assume-nocache
This is intended for an origin which is well behaved and can generally be trusted to send sensible cache control headers.
This mode will honor the cache control header sent by the origin.
Request coalescing is enabled.
If the origin does not send a cache control header, no caching will be performed.
When it is determined that the content cannot be cached, the CDN switches to the `never-cache` mode for this exact cache key for a period of a few minutes.
NB. The cache key is affected by options described below.
The columns in the table below refer to the four modes:
A = never-cache
B = ignore-origin-and-cache
C = respect-origin-assume-cache
D = respect-origin-assume-nocache
Functionality | A | B | C | D |
---|---|---|---|---|
Allows caching | No | Yes | Yes | Yes |
Origin cache control header | Ignored | Ignored | Respected | Respected |
New cache control header | No | Yes | No | No |
Request coalescing | Never | Always | Usually | Usually |
Default behaviour ^ | No cache | Cache | Cache | No cache |
^ When the origin does not send a cache control header.
default: # configure default behaviour for the CDN
  caching:
    mode: respect-origin-assume-cache
The Time To Live (TTL) specifies how long the content should be kept in the CDN cache before checking with the origin for an updated version.
Note that the TTLs configured here only apply to successful responses. When an error is cached (such as a 404) a different TTL is applied.
The TTL may be specified as a number of seconds. Alternatively, it can be written as a number followed by a unit of time. See the examples below:
ttl | duration | duration in seconds |
---|---|---|
1 | 1 second | 1 |
1s | 1 second | 1 |
1m | 1 minute | 60 |
1h | 1 hour | 3600 |
1d | 1 day | 86400 |
1w | 1 week | 604800 |
1y | 1 year | 31536000 |
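The conversions in the table can be reproduced with a few lines of Python. This is a minimal sketch that assumes a TTL is either a whole number or a whole number followed by exactly one of the lowercase units shown above; the function name `ttl_to_seconds` is hypothetical and the CDN's own parser may accept or reject other forms.

```python
import re

# Seconds per unit, taken from the table above (a year is 365 days).
UNIT_SECONDS = {
    "": 1,
    "s": 1,
    "m": 60,
    "h": 3600,
    "d": 86400,
    "w": 604800,
    "y": 31536000,
}


def ttl_to_seconds(ttl):
    """Convert a TTL such as 90, "90", "1h" or "1w" into seconds."""
    match = re.fullmatch(r"(\d+)([smhdwy]?)", str(ttl).strip())
    if match is None:
        raise ValueError(f"unrecognised TTL: {ttl!r}")
    number, unit = match.groups()
    return int(number) * UNIT_SECONDS[unit]


print(ttl_to_seconds("1w"))  # prints 604800
print(ttl_to_seconds("3h"))  # prints 10800
```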
When content is being cached and the origin does not specify a TTL (no cache control header), or where the origin is being ignored (see modes), the TTL specified here will be used.
The default TTL is specified with the key `ttl`.
The default which is used when this is not specified is configured in Service Options under `reverseProxy.ttl`. By default this is set to 31 days (2678400 seconds).
default: # configure default behaviour for the CDN
  caching: # configure caching policy
    ttl: 1w # configure the default TTL for cached content
** THIS FEATURE IS STILL IN DEVELOPMENT AND MAY CHANGE BEHAVIOUR **
The max TTL feature places a limit on how long the CDN caches the content but without modifying the cache control headers which are sent to the browser.
This is intended for large content which should be cached by the browser for a long period of time, but is unlikely to be requested frequently. This is ideal when access to the content is likely to be in large bursts of requests separated by long periods of no requests (e.g. scheduled events).
When specified with a value greater than zero, the CDN will ensure that the content is not stored for longer than specified here. With a value of zero this feature is disabled.
The max TTL is specified with the key `maxTtl`.
default: # configure default behaviour for the CDN
  caching:
    maxTtl: 3h # This is a bad idea. Don't do this.
With these settings you are able to include query parameters in the cache key.
All of these values default to "off". Additionally this completely overrides the boolean value in the Service Options under `reverseProxy.cacheByQueryParam`.
The key `varyByQuery` may be specified as either a boolean or a list.
When set to `false`, all query parameters are completely stripped from the request and not included in the cache key. The origin receives all requests without query parameters. This is the default behaviour.
When set to `true`, all query parameters are included in the cache key.
When set to a list, only the query parameters mentioned in the list are included in the cache key.
In both cases all of the query parameters are allowed to pass through to the origin (when the request is not served from cache).
The origin must only use query parameters which have been added to the cache key to vary the response. Failure to follow this rule will lead to incorrect responses being served by the CDN.
If an empty list is defined, or if the list contains a wildcard (`*`), then it behaves the same as if it was set to `true`.
default: # configure default behaviour for the CDN
  caching:
    varyByQuery: # Let the origin see the query parameters
      - page # Add the parameter "page" to the cache key
When `varyByQuery` is not set to `false`, the key `ignoredQueryParameters` can be used to specifically ignore certain query parameters.
This is intended to be used in the scenario where you want to specify "everything except bob" (for example). This is ideal for keeping UTM parameters out of the cache key (which will increase your cache hit ratio).
The key `ignoredQueryParameters` must be specified as a list (aka. sequence or array). An empty list has no meaning, and is the default.
The query parameters listed here are still visible to the origin (when the request is not served from cache).
# Add all query parameters to the cache key, except for "bob".
default:
  caching:
    varyByQuery: true
    ignoredQueryParameters:
      - bob

# Add all query parameters to the cache key, except for the UTM parameters.
default:
  caching:
    varyByQuery: true
    ignoredQueryParameters:
      - utm_source
      - utm_medium
      - utm_campaign
      - utm_term
      - utm_content
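To see how these two settings interact, the sketch below builds an illustrative cache key from a request URL in Python. It is only a model of the rules described in this section: the function name `query_cache_key` is hypothetical, and the way the CDN actually encodes or orders parameters inside its cache key is not documented here.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def query_cache_key(url, vary_by_query=False, ignored=()):
    """Build an illustrative cache key from the path and selected query parameters."""
    parts = urlsplit(url)

    # false: query parameters never reach the cache key.
    if vary_by_query is False:
        return parts.path

    # true, an empty list, or a list containing "*" all mean "every parameter".
    include_all = (
        vary_by_query is True or len(vary_by_query) == 0 or "*" in vary_by_query
    )

    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [
        (name, value)
        for name, value in pairs
        if (include_all or name in vary_by_query) and name not in ignored
    ]
    if not kept:
        return parts.path
    return parts.path + "?" + urlencode(kept)


url = "/list?page=2&utm_source=newsletter"
print(query_cache_key(url))                                          # prints /list
print(query_cache_key(url, vary_by_query=["page"]))                  # prints /list?page=2
print(query_cache_key(url, vary_by_query=True, ignored=["utm_source"]))  # prints /list?page=2
```

In the last two calls, requests which differ only in `utm_source` share one cache entry, which is the effect that raises the cache hit ratio.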
The key `varyByCookie` may be specified as either a boolean or a list.
When set to `false`, all cookies are completely stripped from the request and not included in the cache key. The origin receives all requests without cookies. This is the default behaviour.
When set to `true`, all cookies are included in the cache key.
When set to a list, only the cookies mentioned in the list are included in the cache key.
In both cases all of the cookies are allowed to pass through to the origin (when the request is not served from cache).
The origin must only use cookies which have been added to the cache key to vary the response. Failure to follow this rule will lead to incorrect responses being served by the CDN.
If an empty list is defined, or if the list contains a wildcard (`*`), then it behaves the same as if it was set to `true`.
default: # configure default behaviour for the CDN
  caching:
    varyByCookie: # Let the origin see the cookies
      - page # Add the cookie "page" to the cache key
When `varyByCookie` is not set to `false`, the key `ignoredCookies` can be used to specifically ignore certain cookies.
This is intended to be used in the scenario where you want to specify "everything except bob" (for example). This is ideal for keeping cookies which do not affect the response out of the cache key (which will increase your cache hit ratio).
The key `ignoredCookies` must be specified as a list (aka. sequence or array). An empty list has no meaning, and is the default.
The cookies listed here are still visible to the origin (when the request is not served from cache).
# Add all cookies to the cache key, except for "bob".
default:
  caching:
    varyByCookie: true
    ignoredCookies:
      - bob

default:
  caching:
    varyByCookie: true
    ignoredCookies:
      - utm_source
      - utm_medium
      - utm_campaign
      - utm_term
      - utm_content
Exceptions have two conditions. Both need to be true for the exception to match. When multiple exceptions match a given request, the more specific one should apply.
You should avoid writing a configuration where two exceptions overlap (i.e. both would match a request and neither is more specific).
However if this does occur then one of the two will be selected at random. This will produce inconsistent results when exposed to a large volume of traffic.
The `path` key is used to specify the path which must match for the exception to be used.
Paths always start with a `/`; this is also the default value.
Longer paths are considered to be more specific.
As standard, paths match on a prefix basis: `/images/` matches `/images/something.jpg`.
To match on an exact basis only, add a `$` to the end of the path (this is called the end of path anchor): `/images/$` matches only the exact path `/images/`, meaning that `/images/something.jpg` will not match.
The intention is to allow you to specify a different Caching Policy for the HTML index of the directory vs the contents within.
As noted above, there is the possibility of random selection when two exceptions match a request. To break the tie and prefer one over the other, you can use `#` characters on the end of the `path` to make that exception appear to be more specific than it really is: `/abc###` is more specific than `/abc` but is otherwise functionally identical.
When combining the anchor (`$`) with padding (`#`), the padding must come last: `/abc$###` is valid, but `/abc###$` is not valid.
Note that using padding is tricky to get correct as it may cause other exceptions to be overlooked. We advise that you create a second service to allow you to test your configuration before applying it to the service which is receiving production traffic.
Other characters which are not permitted within URLs may have special functions attached to them in the future.
Ensure that you only specify valid characters in order to avoid unexpected behaviours. This includes (but is not limited to) the following characters: `!`, `"`, `'`, `&`, `(`, `)`, `*`, `+`, `,`, `;`, `<`, `>`, `=` and `?`.
If you need to include any of these within the `path` key in the exception, please contact support who will be able to assist by configuring an override for your account.
The `extensions` key is used to match against the file extension if it is present in the request path.
This is a list (aka. sequence, or array) of file extensions, without the preceding dot.
The wildcard value `*` always matches. The wildcard is always considered to be the least specific match.
No value, or an empty list is treated the same as a list containing a single wildcard. Hence you may consider the wildcard to be the default value.
When the request does not contain a file extension it can only match the wildcard.
exceptions:
  # First exception = matches various audio files
  - path: /
    extensions:
      - mp3
      - ogg
      - aac
      - wav
  # Second exception = matches everything else
  - path: /
    extensions:
      - "*"
NB. In YAML the `*` character is special and must be placed inside quotes when used literally; i.e. `"*"` is necessary. Also worth noting is that because the sequence has only one item, the alternative single line syntax may be preferable (to those familiar with it); i.e. `extensions: ["*"]`. The functionality is identical.
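The extension rules can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the extension is whatever follows the last dot in the final path segment and that the comparison is case sensitive; neither detail is stated in this document.

```python
import posixpath


def request_extension(request_path):
    """Return the file extension of a request path without the dot, or ""."""
    return posixpath.splitext(request_path)[1].lstrip(".")


def extensions_match(extensions, request_path):
    """Apply the wildcard and list rules for the extensions key."""
    # No value, an empty list, or a list containing "*" matches everything.
    if not extensions or "*" in extensions:
        return True
    # A request without an extension can only match the wildcard.
    return request_extension(request_path) in extensions


audio = ["mp3", "ogg", "aac", "wav"]
print(extensions_match(audio, "/audio/song.mp3"))  # prints True
print(extensions_match(audio, "/audio/"))          # prints False
print(extensions_match(["*"], "/audio/"))          # prints True
```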
In this example you can see the default behaviour of the CDN is to be `respect-origin-assume-cache`. Note that this is actually the default you get without Advanced Cache Control.
For all paths starting with `/images/` the CDN has been instructed to switch to the mode `ignore-origin-and-cache`. This demonstrates how it is possible to ignore incorrect caching instructions being emitted from the origin.
For all paths starting with `/invoices/` the CDN has been instructed to switch to the mode `never-cache`. Again this will ignore any incorrect caching instructions being emitted from the origin.
For all paths starting with `/complex/` all of the various options have been given values. Explaining this exception has been left as an exercise for the reader.
If you're new to YAML, learn the basics here.
---
# Everything on a line after a # is a comment. You can use
# comments to leave little notes and reminders. This is
# especially useful when you're working with other people.

default: # Configure the CDN default behaviour
  caching:
    mode: respect-origin-assume-cache

exceptions: # Configure exceptions to the default
  - path: /images/ # First exception
    caching:
      mode: ignore-origin-and-cache
      ttl: 2592000

  - path: /invoices/ # Second exception
    caching:
      mode: never-cache

  - path: /complex/ # Third exception
    extensions:
      - "*"
    caching:
      mode: ignore-origin-and-cache
      ttl: 2592000
      maxTtl: 0 # Zero means disabled
      varyByQuery: true # Boolean or a list
      ignoredQueryParameters: # List of parameters to ignore
        - something
      varyByCookie: true # Boolean or a list
      ignoredCookies: # List of cookies to ignore
        - csrftoken
Although comments are great, sometimes they make things look more complex than they really are. Here is exactly the same config, still in YAML, just with all the comments and unnecessary whitespace removed.
---
default:
  caching:
    mode: respect-origin-assume-cache
exceptions:
  - path: "/images/"
    caching:
      mode: ignore-origin-and-cache
      ttl: 2592000
  - path: "/invoices/"
    caching:
      mode: never-cache
  - path: "/complex/"
    extensions:
      - "*"
    caching:
      mode: ignore-origin-and-cache
      ttl: 2592000
      maxTtl: 0
      varyByQuery: true
      ignoredQueryParameters:
        - something
      varyByCookie: true
      ignoredCookies:
        - csrftoken
If you're new to JSON, learn the basics here.
NB. The unnecessary whitespace in this JSON example is here to aid readability. JSON is intended to be processed by machines. The machines don't need the unnecessary whitespace, and JSON is handled more efficiently without it (in several ways). If you're making an integration with our API please do not include unnecessary whitespace in your JSON.
{
  "default": {
    "caching": {
      "mode": "respect-origin-assume-cache"
    }
  },
  "exceptions": [
    {
      "path": "/images/",
      "caching": {
        "mode": "ignore-origin-and-cache",
        "ttl": 2592000
      }
    },
    {
      "path": "/invoices/",
      "caching": {
        "mode": "never-cache"
      }
    },
    {
      "path": "/complex/",
      "extensions": [
        "*"
      ],
      "caching": {
        "mode": "ignore-origin-and-cache",
        "ttl": 2592000,
        "maxTtl": 0,
        "varyByQuery": true,
        "ignoredQueryParameters": [
          "something"
        ],
        "varyByCookie": true,
        "ignoredCookies": [
          "csrftoken"
        ]
      }
    }
  ]
}
In the context of a Content Distribution Network (CDN), an "origin" refers to the original server or source from which the CDN retrieves the original web content, files, or data. This origin server typically hosts the original, master copies of web pages, images, videos, scripts, or any other content that needs to be distributed to users.
When a user requests content, the CDN's edge servers act as intermediaries. If the requested content is not already cached in the edge server, the CDN will fetch it from the origin server. This process is known as an "origin fetch."
The HTTP Cache-Control header is a fundamental mechanism used in HTTP (Hypertext Transfer Protocol) to control caching behavior, instructing how a response should be cached, stored, and used by both clients and intermediary caches (such as proxies and CDNs).
The primary purpose of the Cache-Control header is to improve web performance by managing how caches store and serve web content. It enables fine-grained control over caching, specifying directives that define caching policies and influence how long a response can be cached, whether it can be stored, and if it can be served from a cache without revalidating with the origin server.
Please see the MDN documentation for further details.
A cache key is a unique identifier or string used within a caching system to associate a specific piece of content with its corresponding cached version. It allows the caching system to quickly locate, retrieve, and serve cached content without having to reprocess the original request or access the origin server.
Each element of the request which is used to vary the response from the origin must be incorporated into the cache key. This way when the CDN computes the cache key for a request it can be confident that if it finds content stored under that key, it is the correct content.
By default the cache key is the request path.
You can use the options above to incorporate Query Parameters and Cookies into the cache key.
When your origin outputs a Vary header then an additional layer of indirection may be added dynamically creating what are essentially "sub cache keys". This usually works but is not ideal. For best performance each header listed in the Vary header from the origin should be incorporated into the base cache key.
Moreover, there are specific options for each service, like considering if a browser supports gzip encoding. It's important to tailor the cache key thoughtfully; for instance, you wouldn't want to respond with content in gzip format for a browser that doesn't support it. Including this in the cache key needs careful consideration, especially if the original source can't produce content in that format, as it could lead to inefficiencies in storage, cache usage, and reliance on the original source.
Request coalescing is a technique used to optimize the delivery of content by consolidating multiple requests for resources into a single request.
When there are multiple requests for the same resource at roughly the same time (within the same storage region), only a single request is sent to the origin to retrieve the resource. When the resource has been retrieved, all of the waiting requests are fulfilled simultaneously.
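The idea can be illustrated with a small Python sketch using `asyncio`. It is a toy model, not CacheFly's implementation: the class name `Coalescer` and the `origin_fetch` stand-in are hypothetical, and storage regions, error handling and caching of the result are all left out.

```python
import asyncio


class Coalescer:
    """Share one in-flight origin fetch between concurrent requests for the same key."""

    def __init__(self, fetch):
        self._fetch = fetch    # coroutine function: key -> response
        self._in_flight = {}   # cache key -> asyncio.Task

    async def get(self, key):
        task = self._in_flight.get(key)
        if task is None:
            # First request for this key: start the single origin fetch.
            task = asyncio.create_task(self._fetch(key))
            self._in_flight[key] = task
            # Forget the fetch once it has finished.
            task.add_done_callback(lambda _: self._in_flight.pop(key, None))
        # Every waiting request receives the same result.
        return await task


async def demo():
    calls = []

    async def origin_fetch(key):
        calls.append(key)
        await asyncio.sleep(0.01)  # stand-in for a slow origin
        return f"body of {key}"

    coalescer = Coalescer(origin_fetch)
    bodies = await asyncio.gather(
        *(coalescer.get("/index.html") for _ in range(5))
    )
    return len(calls), bodies


origin_calls, bodies = asyncio.run(demo())
print(origin_calls)  # prints 1
```

Five concurrent requests for the same key result in a single call to the origin, and all five receive the same body.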
The above video provides a quick explanation of how request coalescing works behind the scenes, and when it should (or should not) be used.