AmberScript Transcription Service

This service requires an Opencast plugin to be enabled. See the plugin management documentation for how to enable it.

Overview

The AmberScriptTranscriptionService uses the AmberScript transcription API to transcribe audio. Audio is extracted from the video file of an Opencast recording and sent to the AmberScript server for processing. The service then periodically checks for a transcription result. Depending on the audio length and the chosen job type, transcription can take some time. When the result is ready, the service converts it to VTT format and attaches it to the recording. The recording itself becomes available as soon as its own workflow finishes, which may be before the transcription is ready. Once the transcription is attached, the video can be played back with captions.

Configuration

Step 1: Get AmberScript API key

Step 2: Configure AmberscriptTranscriptionService

Edit opencast/etc/org.opencastproject.transcription.amberscript.AmberscriptTranscriptionService.cfg:

Step 3: Add the AmberScript workflows to Opencast

In your Opencast workflow directory (usually /etc/workflows or /etc/opencast/workflows), add the following two workflow files.

amberscript-attach-transcription.yaml
---
id: amberscript-attach-transcription
title: Attach caption/transcripts generated by AmberScript
description: Attach transcription generated by the AmberScript service.
             This is an internal workflow, started by the Transcription Service.

operations:

  - id: amberscript-attach-transcription
    fail-on-error: true
    exception-handler-workflow: partial-error
    description: Attach captions/transcription
    configurations:
      - transcription-job-id: ${transcriptionJobId}
      - target-caption-format: vtt
      - target-flavor: captions/delivery
      - target-tags: engage-download

  - id: publish-engage
    fail-on-error: true
    exception-handler-workflow: partial-error
    description: Distribute and publish to engage server
    configurations:
      - download-source-flavors: "dublincore/*,security/*"
      - download-source-tags: engage-download
      - strategy: merge
      - check-availability: false

  - id: snapshot
    fail-on-error: true
    exception-handler-workflow: partial-error
    description: Archive media package
    configurations:
      - source-flavors: "*/*"

  - id: cleanup
    fail-on-error: false
    description: Remove temporary processing artifacts
    configurations:
      - delete-external: false
      - preserve-flavors: "security/*"

amberscript-start-transcription.yaml
---
id: amberscript-start-transcription
title: Start AmberScript Transcription
tags:
  - archive

description: Start AmberScript transcription

operations:
  - id: encode
    fail-on-error: false
    exception-handler-workflow: partial-error
    description: Encoding audio for transcription
    configurations:
      - source-flavor: "*/source"
      - target-flavor: audio/mp3
      - target-tags: transcript
      - encoding-profile: audio-mp3

  - id: amberscript-start-transcription
    max-attempts: 3
    retry-strategy: hold
    fail-on-error: false
    exception-handler-workflow: partial-error
    description: Start AmberScript transcription job
    configurations:
      - source-tag: transcript
      - language: de
      - jobtype: direct
      - skip-if-flavor-exists: captions/vtt
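
The language and jobtype values in the workflow above are examples and should match your recordings. As a minimal sketch, assuming your recordings are in English and that your AmberScript account accepts the language code en, the operation could be adapted as follows. Only the language line differs from the workflow above.

```yaml
  - id: amberscript-start-transcription
    max-attempts: 3
    retry-strategy: hold
    fail-on-error: false
    exception-handler-workflow: partial-error
    description: Start AmberScript transcription job
    configurations:
      - source-tag: transcript
      # Assumption: "en" is used for illustration; set the code that matches your recordings
      - language: en
      - jobtype: direct
      - skip-if-flavor-exists: captions/vtt
```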

Step 4: Include the workflow operations in your workflow

To start a transcription from one of your existing workflows, include the provided workflow amberscript-start-transcription.yaml in it by referencing its workflow id:

<operation
  id="include"
  description="Start AmberScript Transcription">
  <configurations>
    <configuration key="workflow-id">amberscript-start-transcription</configuration>
  </configurations>
</operation>
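
If the parent workflow is written in YAML, like the two workflow files in Step 3, the same include step might look like the following. This is a sketch that assumes the YAML syntax mirrors the XML operation one-to-one (same id, description and workflow-id key); verify it against your Opencast version before relying on it.

```yaml
operations:
  # ... your existing operations go here ...

  # Assumed YAML equivalent of the XML include operation shown above
  - id: include
    description: Start AmberScript Transcription
    configurations:
      - workflow-id: amberscript-start-transcription
```

Placing the include after the operations that produce the source track means the encode step in amberscript-start-transcription can find a matching */source flavor.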

Workflow Operations