Transcripts (Automated by Microsoft Azure)

Overview

The Microsoft Azure Transcription Service invokes the Azure Speech-to-Text service via its Java SDK to transcribe audio to text.

During the execution of an Opencast workflow, a file containing the audio to be transcribed is streamed to the Microsoft Azure Speech-to-Text service. Results are converted to WebVTT or SRT format and attached to the media package.

Note that because Speech-to-Text services require a significant amount of time to process a recording, we do not wait for it to finish before proceeding with the rest of Opencast's normal processing. This means that the transcription process is run asynchronously.

Speech-to-Text job finishes, workflow 2 is started.

Microsoft Azure Speech-to-Text service documentation, including which languages are currently supported, can be found here.

Configuration

Step 1: Get Azure subscription credentials

Step 2: Configure the Microsoft Azure Transcription Service

Edit etc/org.opencastproject.transcription.microsoftazure.MicrosoftAzureTranscriptionServiec.cfg:

Step 3: Add a workflow operations or create new workflow to start transcription

The workflow below is a minimal working example for an event that has a video file in the presenter/source flavor. You do not necessarily need an extra workflow, and instead you can integrate the parts you need into your own. The relevant operations are microsoft-azure-start-transcription to start the transcription and snapshot, so a second workflow can retrieve the mediapackage to attach captions/transcriptions

<?xml version="1.0" encoding="UTF-8"?>
<definition xmlns="http://workflow.opencastproject.org">

  <id>microsoft-azure-transcription</id>
  <title>Microsoft Azure Transcription Testing Workflow</title>
  <tags>
    <tag>upload</tag>
    <tag>schedule</tag>
    <tag>archive</tag>
  </tags>
  <displayOrder>100</displayOrder>
  <description>
    Microsoft Azure Transcription Testing Workflow
  </description>

  <operations>

    <operation
        id="defaults"
        description="Applying default configuration values">
      <configurations>
        <!-- The language code for the language spoken in the source file -->
        <configuration key="language-code">de-DE</configuration>
      </configurations>
    </operation>

    <!-- Start Microsoft Azure transcription job -->
    <operation
        id="microsoft-azure-start-transcription"
        fail-on-error="true"
        exception-handler-workflow="partial-error"
        description="Start Microsoft Azure transcription job">
      <configurations>
        <!--  Skip this operation if flavor already exists. Used for cases when mediapackage already has captions. -->
        <configuration key="skip-if-flavor-exists">captions/vtt+${language-code}</configuration>
        <configuration key="language-code">${language-code}</configuration>
        <configuration key="source-flavor">presenter/source</configuration>
      </configurations>
    </operation>

    <!-- Publish preview internal -->
    <operation
        id="publish-configure"
        exception-handler-workflow="partial-error"
        description="Publish to preview publication channel">
      <configurations>
        <configuration key="download-source-flavors">*/source</configuration>
        <configuration key="channel-id">internal</configuration>
        <configuration key="url-pattern">http://localhost:8080/admin-ng/index.html#/events/events/${event_id}/tools/playback</configuration>
        <configuration key="check-availability">false</configuration>
      </configurations>
    </operation>

    <!-- Publish to engage player -->
    <operation
        id="publish-engage"
        max-attempts="2"
        fail-on-error="true"
        exception-handler-workflow="partial-error"
        description="Publishing to Engage">
      <configurations>
        <configuration key="download-source-flavors">dublincore/*,security/*,*/source</configuration>
        <configuration key="download-source-tags">engage-download</configuration>
        <configuration key="streaming-source-tags">engage-streaming</configuration>
        <configuration key="check-availability">false</configuration>
      </configurations>
    </operation>


    <!-- Save the language-code to the mediapackage, so the attach workflow knows how to flavor the transcription file -->
    <operation
        id="export-wf-properties"
        fail-on-error="false"
        description="Export workflow settings to Java properties file">
      <configurations>
        <configuration key="target-flavor">wf-properties/language</configuration>
        <configuration key="target-tags">archive</configuration>
        <configuration key="keys">language-code</configuration>
      </configurations>
    </operation>

    <!-- Archive the current state of the media package -->
    <operation
        id="snapshot"
        fail-on-error="true"
        exception-handler-workflow="partial-error"
        description="Archiving">
      <configurations>
        <configuration key="source-flavors">*/source,dublincore/*,security/*</configuration>
        <configuration key="source-tags">archive</configuration>
      </configurations>
    </operation>

    <!-- Clean up the working file repository -->
    <operation
        id="cleanup"
        fail-on-error="false"
        description="Cleaning up">
      <configurations>
        <configuration key="delete-external">true</configuration>
        <!-- FixMe Don't clean up ACLs until workflow service no longer looks for them in the WFR. -->
        <configuration key="preserve-flavors">security/*</configuration>
      </configurations>
    </operation>

  </operations>

</definition>

Step 4: Add a workflow to attach transcriptions

A sample attach workflow that works together with the workflow from Step 3. Attaches the generated transcription to the mediapackages and republishes it. Copy it into a new file under etc/workflows/microsoft-azure-attach-transcripts.xml in your Opencast installation.

<?xml version="1.0" encoding="UTF-8"?>
<definition xmlns="http://workflow.opencastproject.org">
  <id>microsoft-azure-attach-transcripts</id>
  <title>Attach caption/transcripts generated by Microsoft Azure</title>
  <description>Attach automated transcription generated by the Microsoft Azure service. This is an internal workflow, started by the Transcription Service.
  </description>
  <configuration_panel />

  <operations>

    <!-- Import the language code used in the start workflow -->
    <operation
        id="import-wf-properties"
        fail-on-error="true"
        exception-handler-workflow="partial-error"
        description="Load language setting">
      <configurations>
        <configuration key="source-flavor">wf-properties/language</configuration>
      </configurations>
    </operation>

    <!-- Attach caption/transcript -->
    <operation id="microsoft-azure-attach-transcription"
               fail-on-error="true"
               exception-handler-workflow="partial-error"
               description="Attach captions/transcription">
      <configurations>
        <!-- This is filled out by the transcription service when starting this workflow -->
        <configuration key="transcription-job-id">${transcriptionJobId}</configuration>
        <!-- Set the flavor to something the Paella player will parse -->
        <!-- Using the language code here is not necessary, but nice to have -->
        <configuration key="target-flavor">captions/vtt+${language-code}</configuration>
        <configuration key="target-tag">archive</configuration>
      </configurations>
    </operation>

    <!-- Publish to engage player -->
    <operation id="publish-engage"
               fail-on-error="true"
               exception-handler-workflow="partial-error"
               description="Distribute and publish to engage server">
      <configurations>
        <configuration key="download-source-flavors">dublincore/*,security/*,captions/*</configuration>
        <configuration key="strategy">merge</configuration>
        <configuration key="check-availability">false</configuration>
      </configurations>
    </operation>

    <!-- Archive media package -->
    <operation id="snapshot"
               fail-on-error="true"
               exception-handler-workflow="partial-error"
               description="Archive media package">
      <configurations>
        <configuration key="source-flavors">*/*</configuration>
      </configurations>
    </operation>

    <!-- Clean up work artifacts -->
    <operation
        id="cleanup"
        fail-on-error="false"
        description="Remove temporary processing artifacts">
      <configurations>
        <configuration key="delete-external">true</configuration>
        <!-- FixMe Don't clean up ACLs until workflow service no longer looks for them in the WFR. -->
        <configuration key="preserve-flavors">security/*</configuration>
      </configurations>
    </operation>

  </operations>

</definition>

Workflow Operations