AWS S3 Distribution Configuration

This page documents the configuration for Opencast module distribution-service-aws-s3. This configuration is only required on the presentation node, and only if you are using Amazon S3 and/or Cloudfront for distributing media to end users.

Amazon User Configuration

Configuration of Amazon users is beyond the scope of this documentation, instead we suggest referring to Amazon's documentation. You will, however, require to set up proper credentials by either:

AmazonS3FullAccess permission is required, which can be granted using these instructions.

A free Amazon account will work for small scale testing, but be aware that S3 distribution can cost you a lot of money very quickly. Be aware of how much data and how many requests you are making, and be sure to set alarms to notify you of cost overruns.

Amazon Service Configuration

The development and testing it is generally safe to allow the Opencast AWS S3 Distribution service to create the S3 bucket for you. It will create the bucket per its configuration, with public read-only access to the files, and no versioning. For production use we suggest using Amazon CloudFront, which requires additional configuration.

Amazon CloudFront

Amazon CloudFront provides an optional way to better handle distributing your media to end users. While fully configuring CloudFront is outside the scope of this documentation, we wish to note that this does affect one of the keys described below. Please ensure you use the correct distribution base format depending on which service you are using!

Presigned URL

S3 and Cloudfront work together to speed delivery of your content, but if your media URLs leak then anyone can download your recordings. S3 allows you to create Presigned URLs, which are only valid for a limited time. This means that even if your media URLs leak, they will only be valid for a configurable duration.

Set org.opencastproject.distribution.aws.s3.presigned.url to true to enable this feature.

Note: CloudFront and Presigned URL can be used together.

Note: Opencast's distribution files can be quite large depending on your settings, and some of your users may not be able to complete the download within the time limit. While AWS should not stop a download currently in progress, some players may not completely download the media if playback is stopped. If you are experiencing complaints about playback breaking and have presigned URLs enabled, try lengthening the timeout.

Service Default Security Note

On startup, Opencast checks to see if the S3 bucket exists, and if it does not it creates it. This bucket has default permissions allowing anyone to read the full contents of the bucket. This may not be what you want, depending on your institutional priorites. If you wish to protect the files with presigned URLs, then please create the bucket in advance, with the appropriate security settings.

S3 Compatible Service

The S3 API has become the de facto standard interface for almost all storage providers. This module also supports S3 compatible service. In this case, org.opencastproject.distribution.aws.s3.endpoint should be set to the endpoint of the S3 service. Meanwhile, org.opencastproject.distribution.aws.s3.region should not be set. Note: only one of these two configuration keys may be set.

There are two access style for bucket, virtual hosted style (default) and path style. - Virtual hosted style sample: https://bucketname.s3.service.com/ - Path style sample: https://s3.service.com/bucketname

AWS use virtual hosted style by default, and will deprecate path style. Yet, for self hosted s3 compatible service, path style URL is useful. Set org.opencastproject.distribution.aws.s3.path.style to true to enable this feature.

Opencast Service Configuration

The Opencast AWS S3 Distribution service has five configuration keys, which can be found in the org.opencastproject.distribution.aws.s3.AwsS3DistributionServiceImpl.cfg configuration file.

Key Description Default Example
org.opencastproject.distribution.aws.s3.distribution.enable Whether to enable distribution to S3 false
org.opencastproject.distribution.aws.s3.region The AWS region to set us-east-1
org.opencastproject.distribution.aws.s3.bucket The S3 bucket name example-org-dist
org.opencastproject.distribution.aws.s3.access.id Your access ID 20 alphanumeric characters
org.opencastproject.distribution.aws.s3.secret.key Your secret key 40 characters
org.opencastproject.distribution.aws.s3.endpoint The endpoint to use Default AWS S3 endpoint https://s3.service.com
org.opencastproject.distribution.aws.s3.path.style Whether to use path style access URL false / Default AWS S3 style
org.opencastproject.distribution.aws.s3.distribution.base Where the S3 files are available
(derived from bucket & region or set by CloudFront)
http://s3-us-west-2.amazonaws.com/example-org-dist
or DOMAIN_NAME.cloudfront.net
org.opencastproject.distribution.aws.s3.presigned.url Whether to enable presigned URL false
org.opencastproject.distribution.aws.s3.presigned.url.valid.duration Valid duration for presigned URL in ms 21600000 (6 hours)
org.opencastproject.distribution.aws.s3.max.connections Number of max connections 50
org.opencastproject.distribution.aws.s3.connection.timeout Connection timeout in ms 10000
org.opencastproject.distribution.aws.s3.max.retries Number of max retries 100
job.load.aws.s3.distribute Distribute job load 0.1
job.load.aws.s3.retract Retract job load 0.1
job.load.aws.s3.restore Restore job load 0.1

If org.opencastproject.distribution.aws.s3.access.id and org.opencastproject.distribution.aws.s3.secret.key are not explicitly provided, search for credentials will be performed in the order specified by the Default Credentials Provider Chain.

Using S3 Distribution

Amazon S3 distribution is already included in the default Opencast workflows, however it must first be enabled. The schedule-and-upload.xml and publish.xml workflow configuration files both contain lines containing the string "Remove this line if you wish to publish to AWS S3". Both of these lines must be removed before publishing to AWS S3 will function correctly.

If you wish to use AWS S3 publishing with your own custom workflow, you must add the publish-engage-aws workflow operation to your workflow. The operation documentation can be found here.

Publishing to multiple distribution services

Currently we do not support publication to multiple distribution services simultaneously. This means that whichever workflow operation is last in the workflow will be the final publication.

Using this handler in custom workflows

If your workflow contains both publish-engage and publish-engage-aws, in that order, and without a conditional you would have publication files stored both locally and in AWS. This is likely not what you want, so protect your workflow operations appropriately. If you really do need these files stored in both places (for example, in cases where you need to make the files available immediately, and only push to AWS in some cases) then remember to add a retract-engage in between the publication operations. Note that if this step is omitted the files will remain available locally, but will not be used. Of further note, if you retract after publication to AWS then your workflow will not be available to users. To summarize, this table presents a subset of the various situations that are possible

Workflow Operations Files present in the Media Module Files present in AWS Files served from
publish-engage Yes No Opencast Media Module
publish-engage-aws No Yes AWS
publish-engage, publish-engage-aws Yes Yes AWS
publish-engage-aws, publish-engage Yes Yes Opencast Media Module
publish-engage, retract-engage, publish-engage-aws Temporary Yes AWS
publish-engage, publish-engage-aws, retract-engage No Yes Not available

Migrating to S3 Distribution with Pre-Existing Data

If you already have data published to your local Opencast install, you should be able to republish the media selecting AWS S3 as the distribution service to use.