Create and manage S3 buckets
If you’re running in AWS, odds are you’re using S3. Cumulus makes configuring your buckets much simpler through its easy-to-use JSON configuration. You can also share CORS and policy configuration between buckets. Read the following sections to learn about configuring your S3 environment with Cumulus. Example configuration can be found in the Cumulus repo.
Each S3 bucket is defined in its own file, and the folder that bucket definitions are located in is configurable. The file name of an S3 bucket configuration corresponds to the name of that bucket; it is also the name that Cumulus uses to refer to your bucket in its output and as input to its command line interface (minus the .json extension, of course). This section describes the base bucket configuration, and the following sections describe how to configure the various features available to S3 buckets. Buckets are JSON objects with the following attributes:
region
- the region the bucket should be located in
permissions
- a JSON object containing permission configuration. Can have any of the following attributes:
  cors
  - a JSON object configuring the CORS template. See CORS Templates to learn how to create these templates. Has the following properties:
    template
    - the name of the template to include
    vars
    - a JSON object of variables to use in the template
  policy
  - a JSON object configuring the policy template. See Policy Templates to learn how to create templates. Has the following properties:
    template
    - the name of the template to include
    vars
    - a JSON object of variables to use in the template
  grants
  - a JSON object that configures users that have access to the bucket. See Grants
website
- configuration to serve a static website out of this bucket. See Static Website Configuration for more info.
logging
- configuration to log access to this bucket. See Bucket Logging for more info.
notifications
- configuration to provide notifications on bucket events. See Bucket Notifications for more info.
lifecycle
- configuration for lifecycle rules. See Lifecycle Configuration for more info.
versioning
- a true/false value indicating if versioning is enabled on items in the bucket
replication
- configuration for S3 Cross Region Replication. See Bucket Replication for more info.
tags
- an optional JSON object that specifies the tags to include on the bucket. Each tag is in the form of "key": "value"
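As a minimal sketch of the base attributes above, a bucket definition might look like the following. The template names, the variables passed to them, the region, and the tag values are assumptions chosen for illustration; the features covered in later sections are left out.

```json
{
  "region": "us-east-1",
  "permissions": {
    "cors": {
      "template": "example-cors",
      "vars": {
        "origin": "www.example.com"
      }
    },
    "policy": {
      "template": "example-public-read",
      "vars": {
        "bucket": "example-bucket"
      }
    }
  },
  "versioning": false,
  "tags": {
    "team": "example-team"
  }
}
```

Saved as example-bucket.json in the bucket definitions folder, this file would describe a bucket named example-bucket.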
CORS controls which origins have access to the content in your bucket. A CORS template has the following properties:
origins
- an array of origins that are allowed to access content on the bucket, e.g. ["www.example.com"]
methods
- an array of methods that the origins are allowed to execute, e.g. ["GET", "HEAD"]
headers
- an array of headers that are allowed in a pre-flight OPTIONS request
exposed-headers
- an array of headers that a client will be able to access from their application, e.g. ["Origin"]
max-age-seconds
- the number of seconds that a browser is allowed to cache a preflight response from the specified origins
Here is an example CORS configuration:
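As a hedged sketch, a CORS template that uses the properties above might look like this. The values are illustrative assumptions, and the sketch assumes a template is a single JSON object; if a template file is expected to hold several rules, the object may need to be wrapped in an array.

```json
{
  "origins": ["www.example.com"],
  "methods": ["GET", "HEAD"],
  "headers": ["Authorization"],
  "exposed-headers": ["Origin"],
  "max-age-seconds": 3000
}
```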
Policies control access to the S3 bucket and its resources. Cumulus mirrors the format used by AWS for configuring policies. AWS provides a detailed explanation of bucket policies in its S3 documentation.
Here is an example policy configuration which allows public access to objects in whatever bucket you pass in as the value of the bucket variable:
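A sketch of such a policy template, written in the standard AWS bucket policy format, could look like the following. The `{{bucket}}` placeholder syntax for template variables and the Sid value are assumptions; check the example configuration in the Cumulus repo for the exact variable syntax.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ExamplePublicRead",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::{{bucket}}/*"
    }
  ]
}
```

A bucket would then reference this template by name and supply the bucket variable through the vars object of its policy configuration.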
Grants control which accounts have permissions to perform actions on your bucket. A grants configuration has the following properties:
name
- the name of the account
email
- the email address of the account. Not required if id is defined
id
- the canonical AWS id of the account. Not required if email is defined
permissions
- an array of the permissions given to the account. Valid values include "all", "list", "update", "view-permissions", and "edit-permissions"
Additionally, the following names match the ones in the AWS console, and, when used, do not need to be specified with an email or an id:
Here is an example grant:
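A single grant built from the properties above might look like the sketch below. The account name and email address are made up for illustration, and how multiple grants are collected under the grants attribute is not shown here.

```json
{
  "name": "ExampleAccount",
  "email": "ops@example.com",
  "permissions": ["list", "view-permissions"]
}
```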
S3 allows you to serve a static website from your bucket. To do so, add a JSON object called "website" to your bucket with the following attributes:
redirect
- the protocol and hostname to redirect all traffic to when a request is received for an object in the bucket, e.g. "https://www.example.com". Should not be defined if index is defined
index
- the object to serve when requesting the index on the bucket. Should not be defined if redirect is defined
error
- the object to serve when a 4XX error occurs. Should only be defined if redirect is not defined
Here are two website configuration examples (the first configures the bucket as a website, the second just redirects all traffic sent to the bucket):
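As a sketch of the two cases, assuming object names index.html and error.html, a bucket served as a website might carry:

```json
{
  "website": {
    "index": "index.html",
    "error": "error.html"
  }
}
```

A bucket that only redirects would define redirect and leave out index and error:

```json
{
  "website": {
    "redirect": "https://www.example.com"
  }
}
```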
S3 allows you to log access information about your bucket. To enable logging, provide a JSON object named "logging" to your bucket configuration with the following attributes:
target-bucket
- the name of the bucket in which to store logs
prefix
- the prefix to give to all generated log files
Here’s an example:
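A logging block along these lines would do it; the target bucket name and prefix are assumptions for illustration.

```json
{
  "logging": {
    "target-bucket": "example-logs-bucket",
    "prefix": "example-bucket/"
  }
}
```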
When certain events happen in a bucket, S3 allows you to post an event to an SNS topic, SQS queue, or Lambda function. To do so, add an array named "notifications" to your bucket configuration, and fill it with JSON objects containing the following attributes:
name
- the name of the notification (something human readable)
triggers
- a string array containing the names of the events to notify on (for example ["ObjectCreated:*", "ObjectRemoved:Delete"])
prefix
- the prefix that objects must have in order to generate events
suffix
- the suffix that objects must have in order to generate events
type
- the type of notification to generate (accepts "sns", "sqs", and "lambda")
target
- the name of the target asset
Here’s an example of a notification configuration object:
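The sketch below shows a notifications array with one entry that publishes to an SNS topic whenever a .jpg object is created under uploads/. The notification name, prefix, suffix, and topic name are hypothetical.

```json
{
  "notifications": [
    {
      "name": "example-upload-notification",
      "triggers": ["ObjectCreated:*"],
      "prefix": "uploads/",
      "suffix": ".jpg",
      "type": "sns",
      "target": "example-topic"
    }
  ]
}
```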
S3 lets you configure lifecycle rules for the objects in a bucket. To configure lifecycle rules, add an array named "lifecycle" that contains JSON objects with the following attributes:
name
- the name of the rule (something human readable)
prefix
- the prefix for objects to apply the rule to
days-until-glacier
- the number of days before objects are transitioned to Glacier storage
days-until-delete
- the number of days before objects are deleted
past-versions
- for versioned buckets, an optional JSON object that provides rules for past versions of objects in the bucket
  days-until-glacier
  - the number of days before objects are transitioned to Glacier storage
  days-until-delete
  - the number of days before objects are deleted
Here’s an example lifecycle rule:
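As a sketch, a single rule that archives objects under logs/ after 30 days and deletes them after a year, with shorter limits for past versions, might look like this. The rule name, prefix, and day counts are illustrative assumptions.

```json
{
  "lifecycle": [
    {
      "name": "archive-then-expire-logs",
      "prefix": "logs/",
      "days-until-glacier": 30,
      "days-until-delete": 365,
      "past-versions": {
        "days-until-glacier": 7,
        "days-until-delete": 90
      }
    }
  ]
}
```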
To enable replication of a bucket to another bucket in another region, supply a JSON object named "replication" to your bucket configuration and give it the following attributes:
iam-role
- the name of the IAM role to use for replicating the bucket
prefixes
- an array of prefixes of the objects that replication applies to, e.g. ["images/", "js/"]. Omit to replicate the entire bucket
destination
- the name of the destination bucket for replicated items
Here’s an example replication rule:
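A replication block might look like the following sketch, where the role name and destination bucket name are hypothetical.

```json
{
  "replication": {
    "iam-role": "example-replication-role",
    "prefixes": ["images/", "js/"],
    "destination": "example-bucket-replica"
  }
}
```

Note that S3 itself requires versioning to be enabled on both the source and destination buckets before replication can be configured.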
Here’s an example of a full bucket configuration with all features specified.
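The sketch below combines the attributes described in the previous sections into one bucket definition. All names and values are illustrative assumptions rather than configuration taken from the Cumulus repo, and grants are left out because the shape of that value is not spelled out above.

```json
{
  "region": "us-east-1",
  "permissions": {
    "cors": {
      "template": "example-cors",
      "vars": {
        "origin": "www.example.com"
      }
    },
    "policy": {
      "template": "example-public-read",
      "vars": {
        "bucket": "example-bucket"
      }
    }
  },
  "website": {
    "index": "index.html",
    "error": "error.html"
  },
  "logging": {
    "target-bucket": "example-logs-bucket",
    "prefix": "example-bucket/"
  },
  "notifications": [
    {
      "name": "example-upload-notification",
      "triggers": ["ObjectCreated:*"],
      "prefix": "uploads/",
      "suffix": ".jpg",
      "type": "sns",
      "target": "example-topic"
    }
  ],
  "lifecycle": [
    {
      "name": "archive-then-expire-logs",
      "prefix": "logs/",
      "days-until-glacier": 30,
      "days-until-delete": 365,
      "past-versions": {
        "days-until-glacier": 7,
        "days-until-delete": 90
      }
    }
  ],
  "versioning": true,
  "replication": {
    "iam-role": "example-replication-role",
    "prefixes": ["images/", "js/"],
    "destination": "example-bucket-replica"
  },
  "tags": {
    "team": "example-team"
  }
}
```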
There are some configuration options for S3 buckets that Cumulus does not handle because we do not use them at Lucid or do not want them managed by Cumulus at this time. These include:
If you would like any of these limitations changed, please submit a pull request.
Cumulus’s S3 module has the following usage:
S3 buckets can be diffed, listed, and synced (migration is covered in the following section). The three actions do the following:
diff
- Shows the differences between the local definition and the AWS S3 configuration. If <asset> is specified, Cumulus will diff only the bucket defined in that file.
list
- Lists the names of the files that contain bucket definitions
sync
- Syncs local configuration with AWS. If <asset> is specified, Cumulus will sync only the bucket defined in the file with that name.
If your environment is anything like ours, you have dozens of S3 buckets, and would rather not write Cumulus configuration for them by hand. Luckily, Cumulus provides a migrate task that will pull down your buckets and produce configuration for them. It will also pull down your CORS rules and bucket policies and, where buckets use the same CORS rules or policies, reference the same file.
Cumulus reads configuration from a configuration file, conf/configuration.json (the location of this file can also be specified by running cumulus with the --config option). The values in configuration.json can be changed to customize some of Cumulus’s behavior. The following is a list of the values in configuration.json that pertain to the S3 module:
$.s3.buckets.directory
- the directory where Cumulus expects to find S3 bucket definitions.
$.s3.buckets.cors.directory
- the directory where Cumulus expects to find CORS definitions.
$.s3.buckets.policies.directory
- the directory where Cumulus expects to find policy definitions.
$.s3.print-progress
- whether to print progress of the diff or sync operation as the operation moves on to a new bucket. This is helpful because diffing buckets can take a while, and if this value is set to false, there is no feedback as to what Cumulus is actually doing.
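Read as JSON paths, the values above correspond to a fragment of configuration.json shaped like the sketch below. The nesting follows directly from the paths; the directory values are assumptions for illustration and may differ from the defaults shipped with Cumulus.

```json
{
  "s3": {
    "buckets": {
      "directory": "./conf/s3/buckets",
      "cors": {
        "directory": "./conf/s3/cors"
      },
      "policies": {
        "directory": "./conf/s3/policies"
      }
    },
    "print-progress": true
  }
}
```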