7 changes: 7 additions & 0 deletions .env.example
Original file line number Diff line number Diff line change
Expand Up @@ -104,3 +104,10 @@ OPEN_CATALOG_WEBHOOK_KEY=changeme
WAYBACK_MACHINE_ACCESS_KEY=changeme
WAYBACK_MACHINE_SECRET_KEY=changeme
ENABLE_WAYBACK_TASKS=false

# Video transcoding settings
VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY=30
TRANSCODE_RESULT_TEMPLATE=./test_videos_webhook/cloudwatch_sns_complete.json
TRANSCODE_ERROR_TEMPLATE=./test_videos_webhook/cloudwatch_sns_error.json
VIDEO_S3_TRANSCODE_BUCKET=changeme
POST_TRANSCODE_ACTIONS=videos.api.update_video_job
4 changes: 2 additions & 2 deletions .secrets.baseline
Expand Up @@ -113,9 +113,9 @@
"filename": "README.md",
"hashed_secret": "be4fc4886bd949b369d5e092eb87494f12e57e5b",
"is_verified": false,
"line_number": 247
"line_number": 262
}
]
},
"generated_at": "2024-09-04T01:40:31Z"
"generated_at": "2025-06-23T07:34:17Z"
}
30 changes: 30 additions & 0 deletions README.md
Expand Up @@ -2,6 +2,21 @@

OCW Studio manages deployments for OCW courses.

## Recent Updates

### Video Transcoding Enhancements (December 2024)

Recent improvements to the video transcoding system include:

- **Enhanced error handling** in video processing and transcoding workflows
- **Local testing support** for video transcoding with mock AWS MediaConvert callbacks
- **Improved video job status tracking** with comprehensive unit test coverage
- **New API functions** for MediaConvert job management (`get_media_convert_job`, `prepare_job_results`)
- **Automated transcoding status updates** via periodic Celery tasks for development environments
- **Template-based result mocking** for testing transcoding workflows without AWS dependencies

For detailed configuration, see the [Enabling AWS MediaConvert transcoding](#enabling-aws-mediaconvert-transcoding) section.

**SECTIONS**

- [ocw_studio](#ocw_studio)
Expand Down Expand Up @@ -480,14 +495,29 @@ AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY
AWS_STORAGE_BUCKET_NAME
VIDEO_S3_TRANSCODE_ENDPOINT
VIDEO_S3_TRANSCODE_PREFIX
VIDEO_S3_TRANSCODE_BUCKET
AWS_ROLE_NAME
DRIVE_SHARED_ID
DRIVE_SERVICE_ACCOUNT_CREDS
API_BEARER_TOKEN
POST_TRANSCODE_ACTIONS
```

This allows videos to be submitted to the AWS MediaConvert service for transcoding. Submission happens automatically once a video has been synced to Studio from Google Drive.

## Local Development and Testing

For local development and testing, additional environment variables can be configured to mock transcoding behavior:

```
VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY=30
TRANSCODE_RESULT_TEMPLATE=./test_videos_webhook/cloudwatch_sns_complete.json
TRANSCODE_ERROR_TEMPLATE=./test_videos_webhook/cloudwatch_sns_error.json
```

These settings enable a periodic task that simulates AWS MediaConvert callbacks for testing video transcoding workflows locally without requiring actual AWS MediaConvert services.
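
As a rough illustration of how such a periodic task might be registered, the sketch below builds a Celery beat schedule dictionary. `build_beat_schedule` is a hypothetical helper and not a function in this repository; the task path and the environment check mirror the `main/settings.py` change, which gates the task on the `staging` environment.

```python
def build_beat_schedule(environment: str, frequency_seconds: int = 30) -> dict:
    """Return a beat schedule dict, adding the status-update task only when enabled.

    Hypothetical helper for illustration; the real schedule lives in main/settings.py.
    """
    schedule: dict = {}
    # The settings change gates the task on the "staging" environment name.
    if environment.lower() == "staging":
        schedule["update-video-transcoding-statuses"] = {
            "task": "videos.tasks.update_video_transcoding_statuses",
            # Celery beat accepts a plain number of seconds as the interval.
            "schedule": frequency_seconds,
        }
    return schedule


if __name__ == "__main__":
    print(build_beat_schedule("staging", 30))
```

Keeping the interval in `VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY` means the polling rate can be changed from `.env` without touching the schedule definition.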

# Enabling 3Play integration

The following environment variables need to be defined in your .env file (for a pre-configured 3Play account):
Expand Down
20 changes: 20 additions & 0 deletions app.json
Expand Up @@ -751,6 +751,26 @@
"YT_UPLOAD_LIMIT": {
"description": "Max Youtube uploads allowed per day",
"required": false
},
"VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY": {
"description": "Frequency in seconds for checking transcoding video statuses (dev only)",
"required": false
},
"TRANSCODE_RESULT_TEMPLATE": {
"description": "Template file for mock transcoding results (dev only)",
"required": false
},
"TRANSCODE_ERROR_TEMPLATE": {
"description": "Template file for mock transcoding error results (dev only)",
"required": false
},
"VIDEO_S3_TRANSCODE_BUCKET": {
"description": "S3 bucket name for MediaConvert transcoded videos",
"required": false
},
"POST_TRANSCODE_ACTIONS": {
"description": "Python function path for post-transcoding actions",
"required": false
}
},
"keywords": ["Django", "Python", "MIT", "Office of Digital Learning"],
Expand Down
31 changes: 30 additions & 1 deletion main/settings.py
Expand Up @@ -553,6 +553,29 @@
    default=50,
    description="Max Youtube uploads allowed per day",
)

# Transcoding settings for local testing
VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY = get_int(
    name="VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY",
    default=30,
    dev_only=True,
    description="Frequency in seconds for checking transcoding video statuses",
)

TRANSCODE_RESULT_TEMPLATE = get_string(
    name="TRANSCODE_RESULT_TEMPLATE",
    default="./test_videos_webhook/cloudwatch_sns_complete.json",
    dev_only=True,
    description="Template file for mock transcoding results",
)

TRANSCODE_ERROR_TEMPLATE = get_string(
    name="TRANSCODE_ERROR_TEMPLATE",
    default="./test_videos_webhook/cloudwatch_sns_error.json",
    dev_only=True,
    description="Template file for mock transcoding error results",
)

# OCW metadata fields
FIELD_RESOURCETYPE = get_string(
name="FIELD_RESOURCETYPE",
Expand Down Expand Up @@ -764,6 +787,12 @@
"schedule": CHECK_EXTERNAL_RESOURCE_STATUS_FREQUENCY,
}

if ENVIRONMENT.lower() == "staging":
    CELERY_BEAT_SCHEDULE["update-video-transcoding-statuses"] = {
        "task": "videos.tasks.update_video_transcoding_statuses",
        "schedule": VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY,
    }

# django cache back-ends
CACHES = {
"default": {
Expand Down Expand Up @@ -1307,7 +1336,7 @@
name="PUBLISH_POSTHOG_FEATURE_FLAG_REQUEST_TIMEOUT_MS",
default=3000,
description=(
"Timeout (ms) for PostHog feature flag requests, " "published to pipelines"
"Timeout (ms) for PostHog feature flag requests, published to pipelines"
),
required=False,
)
Expand Down
18 changes: 9 additions & 9 deletions test_videos_webhook/cloudwatch_sns_complete.json
Expand Up @@ -3,25 +3,25 @@
"id": "c120fe11-87db-c292-b3e5-1cc90740f6e1",
"detail-type": "MediaConvert Job State Change",
"source": "aws.mediaconvert",
"account": "AWS_ACCOUNT_ID",
"account": "<AWS_ACCOUNT_ID>",
"time": "2021-08-05T16:52:33Z",
"region": "{aws_region}",
"region": "<AWS_REGION>",
"resources": [
"arn:aws:mediaconvert:AWS_REGION:AWS_ACCOUNT_ID:jobs/26235173873033-qav1eq"
"arn:aws:mediaconvert:<AWS_REGION>:<AWS_ACCOUNT_ID>:jobs/<VIDEO_JOB_ID>"
],
"detail": {
"timestamp": 1628172900136,
"accountId": "AWS_ACCOUNT_ID",
"queue": "arn:aws:mediaconvert:AWS_REGION:AWS_ACCOUNT_ID:queues/Default",
"jobId": "VIDEO_JOB_ID",
"accountId": "<AWS_ACCOUNT_ID>",
"queue": "arn:aws:mediaconvert:<AWS_REGION>:<AWS_ACCOUNT_ID>:queues/<VIDEO_TRANSCODE_QUEUE>",
"jobId": "<VIDEO_JOB_ID>",
"status": "COMPLETE",
"userMetadata": {},
"outputGroupDetails": [
{
"outputDetails": [
{
"outputFilePaths": [
"s3://AWS_BUCKET/TRANSCODE_PREFIX/SHORT_ID/DRIVE_FILE_ID/testvid_youtube.mp4"
"s3://<AWS_STORAGE_BUCKET_NAME>/<VIDEO_S3_TRANSCODE_PREFIX>/<SHORT_ID>/<DRIVE_FILE_ID>/<VIDEO_NAME>_youtube.mp4"
],
"durationInMs": 132033,
"videoDetails": {
Expand All @@ -31,7 +31,7 @@
},
{
"outputFilePaths": [
"s3://AWS_BUCKET/TRANSCODE_PREFIX/SHORT_ID/DRIVE_FILE_ID/testvid_360p_16_9.mp4"
"s3://<AWS_STORAGE_BUCKET_NAME>/<VIDEO_S3_TRANSCODE_PREFIX>/<SHORT_ID>/<DRIVE_FILE_ID>/<VIDEO_NAME>_360p_16_9.mp4"
],
"durationInMs": 132033,
"videoDetails": {
Expand All @@ -41,7 +41,7 @@
},
{
"outputFilePaths": [
"s3://AWS_BUCKET/TRANSCODE_PREFIX/SHORT_ID/DRIVE_FILE_ID/testvid_360p_4_3.mp4"
"s3://<AWS_STORAGE_BUCKET_NAME>/<VIDEO_S3_TRANSCODE_PREFIX>/<SHORT_ID>/<DRIVE_FILE_ID>/<VIDEO_NAME>_360p_4_3.mp4"
],
"durationInMs": 132033,
"videoDetails": {
Expand Down
12 changes: 6 additions & 6 deletions test_videos_webhook/cloudwatch_sns_error.json
Expand Up @@ -3,17 +3,17 @@
"id": "c8879bt5-730e-6a80-3340-80712099846e",
"detail-type": "MediaConvert Job State Change",
"source": "aws.mediaconvert",
"account": "AWS_ACCOUNT_ID",
"account": "<AWS_ACCOUNT_ID>",
"time": "2021-08-05T19:15:37Z",
"region": "AWS_REGION",
"region": "<AWS_REGION>",
"resources": [
"arn:aws:mediaconvert:AWS_REGION:AWS_ACCOUNT_ID:jobs/VIDEO_JOB_ID"
"arn:aws:mediaconvert:<AWS_REGION>:<AWS_ACCOUNT_ID>:jobs/<VIDEO_JOB_ID>"
],
"detail": {
"timestamp": 1628190937233,
"accountId": "919801701561",
"queue": "arn:aws:mediaconvert:AWS_REGION:AWS_ACCOUNT_ID:queues/Default",
"jobId": "VIDEO_JOB_ID",
"accountId": "<AWS_ACCOUNT_ID>",
"queue": "arn:aws:mediaconvert:<AWS_REGION>:<AWS_ACCOUNT_ID>:queues/<VIDEO_TRANSCODE_QUEUE>",
"jobId": "<VIDEO_JOB_ID>",
"status": "ERROR",
"errorCode": 1030,
"errorMessage": "Video codec [indeo4] is not a supported input video codec",
Expand Down
56 changes: 51 additions & 5 deletions videos/README.md
Expand Up @@ -14,15 +14,16 @@ This document describes the components of the video workflow for OCW.

# Overview

This assumes that [Google Drive sync](/README.md#enabling-google-drive-integration), [YouTube integration](/README.md#enabling-youtube-integration), [AWS MediaConvert](/README.md#enabling-aws-transcoding), and [3Play submission](/README.md#enabling-3play-integration) are all enabled, which is required for the video workflow.
This assumes that [Google Drive sync](/README.md#enabling-google-drive-integration), [YouTube integration](/README.md#enabling-youtube-integration), [AWS MediaConvert](/README.md#enabling-aws-mediaconvert-transcoding), and [3Play submission](/README.md#enabling-3play-integration) are all enabled, which is required for the video workflow.

The high-level description of the process is below, and each subsequent section contains additional details, including links to the relevant code.

- Browse to a course site in the Studio UI, go to the Resources page and click the icon to the right of the `Sync w/ Google Drive` button to open the site's Google Drive folder in the Google Drive UI.
- Upload a video with the name `<video_name>.<video_extension>` to the `videos_final` folder on Google Drive, where `<video_extension>` is a valid video extension, such as `mp4`. If there are pre-existing captions that should be uploaded with the video (as opposed to requesting captions/transcript from 3Play), then these should be named _exactly_ `<video_name>_captions.vtt` and `<video_name>_transcript.pdf`, and uploaded into the `files_final` folder on Google Drive.
- Sync using the Studio UI. This uploads the video to S3.
- As soon as the upload to S3 is complete, Studio initiates a Celery task to submit the video to the AWS MediaConvert service.
- Once trancoding is complete, the video is uploaded to YouTube (set as unlisted prior to the course being published).
- The enhanced transcoding system monitors job progress with automatic status updates and comprehensive error handling.
- Once transcoding is complete, the video is uploaded to YouTube (set as unlisted prior to the course being published).
- After the video has been successfully uploaded to YouTube, and if there are no pre-existing captions, Studio sends a transcript request to 3Play.
- Once 3Play completes the transcript job, the captions (`.vtt` format) and transcript (`.pdf` format) are fetched and associated with the video.
- On any publish action, the video metadata and YouTube metadata are updated, assuming the information has been received from the external services.
Expand All @@ -34,7 +35,17 @@ Users upload videos in a valid video format to the `videos_final` folder. Whethe

The parameters of the AWS transcode request are defined through the AWS interface, and the role is defined [here](https://github.com/mitodl/ol-infrastructure/blob/main/src/ol_infrastructure/applications/ocw_studio/__main__.py). Some example JSON files used for triggering a MediaConvert job are in [this folder](/test_videos_webhook/).

The [`TranscodeJobView` endpoint](/videos/views.py) listens for the webhook that is sent when the transcoding job is complete.
## Enhanced Transcoding Features

The transcoding system has been enhanced with the following features:

- **Enhanced error handling** in video processing workflows with comprehensive logging
- **Local testing support** for transcoding workflows using mock AWS MediaConvert callbacks
- **Automated status updates** via the [`update_video_transcoding_statuses`](/videos/tasks.py) Celery task
- **Template-based result processing** using [`prepare_job_results`](/videos/api.py) for flexible response handling
- **MediaConvert job management** via [`get_media_convert_job`](/videos/api.py) for real-time job status checking

The [`TranscodeJobView` endpoint](/videos/views.py) listens for the webhook that is sent when the transcoding job is complete. For local development, the system can simulate these webhooks using template files and periodic status updates.

# YouTube Submission

Expand All @@ -59,6 +70,8 @@ In cases where something may have gone wrong with the data, often due to legacy

# Testing PRs with Transcoding

## Production-like Testing

Before working on, testing, or reviewing any PR that requires a video to be uploaded to YouTube, make sure that AWS buckets (instead of local MinIO storage) are being used for testing. To do that, set `OCW_STUDIO_ENVIRONMENT` to any value other than `dev`.

Set the following variables to the same values as for RC:
Expand All @@ -74,13 +87,44 @@ DRIVE_SERVICE_ACCOUNT_CREDS
DRIVE_SHARED_ID
VIDEO_S3_TRANSCODE_ENDPOINT
VIDEO_S3_TRANSCODE_PREFIX
VIDEO_S3_TRANSCODE_BUCKET
```

Upload the video to the course's Google Drive folder, as described in the [Google Drive Sync and AWS Transcoding](#google-drive-sync-and-aws-transcoding) section above. Wait for the video transcoding job to complete, which requires an amount of time proportional to the length of the video; for a very short video, this should only take a few minutes.

Next, the response to the transcode request needs to be simulated. This is because the AWS MediaConvert service will not send a webhook notification to the local OCW Studio instance, but rather to the RC URL.
## Local Development Testing

For local development and testing without AWS dependencies, you can use the enhanced mock transcoding system:

### Configuration

Add these environment variables to your `.env` file:

```
VIDEO_TRANSCODING_STATUS_UPDATE_FREQUENCY=30
TRANSCODE_RESULT_TEMPLATE=./test_videos_webhook/cloudwatch_sns_complete.json
TRANSCODE_ERROR_TEMPLATE=./test_videos_webhook/cloudwatch_sns_error.json
POST_TRANSCODE_ACTIONS=videos.api.update_video_job
```

### Testing Workflow

1. **Upload Video**: Upload a video to the course's Google Drive folder and sync it through the Studio UI
2. **Automatic Processing**: The system will automatically:

To simulate the response, use cURL, Postman, or an equivalent tool to POST a message to `https://localhost:8043/api/transcode-jobs/`, with the body as in the example below, updated to match the relevant environment variables, course name, and video name.
   - Create a `VideoJob` with a mock job ID
   - Start periodic status checking via the `update_video_transcoding_statuses` task
   - Use template files to simulate AWS MediaConvert responses
   - Process results using the enhanced `prepare_job_results` function

3. **Monitor Progress**: Check the Django admin interface to see:
   - `VideoJob` status updates
   - `VideoFile` objects created from mock transcoding results
   - Comprehensive error logging if issues occur

### Manual Testing (Legacy Method)

If you need to manually simulate transcoding responses, use cURL, Postman, or an equivalent tool to POST a message to `https://localhost:8043/api/transcode-jobs/`, with the body as in the example below, updated to match the relevant environment variables, course name, and video name.

```json
{
Expand Down Expand Up @@ -148,3 +192,5 @@ making sure to set the values in `<>`. In particular, set
The `DriveFile` will be the one associated with the video: http://localhost:8043/admin/gdrive_sync/drivefile/.

If this completes successfully, the `VideoJob` status in Django admin should be `COMPLETE`, and there should now be three new `VideoFile` objects populated with `status`, `destination`, and `s3_key` fields.
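
For those who prefer scripting the manual POST over cURL or Postman, here is a minimal sketch using only the Python standard library. `build_transcode_callback` is a hypothetical helper, and the `Content-Type` header is an assumption about what the endpoint accepts. The payload shown is a trimmed placeholder, not the full body required by the endpoint.

```python
import json
import urllib.request


def build_transcode_callback(
    payload: dict,
    url: str = "https://localhost:8043/api/transcode-jobs/",
) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the simulated callback body.

    Hypothetical helper; the JSON content type is an assumption.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Trimmed placeholder payload; use the full example body in practice.
    request = build_transcode_callback({"detail": {"status": "COMPLETE"}})
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request). A local instance with a
    # self-signed certificate may also need an explicit SSL context.
```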

**Note**: The enhanced transcoding system now uses the `prepare_job_results` function to process these responses, which supports template variables like `<DRIVE_FILE_ID>`, `<VIDEO_JOB_ID>`, `<SHORT_ID>`, and various AWS settings, making manual testing more flexible and realistic.
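
To make the placeholder idea concrete, the following sketch shows one way `<NAME>`-style template variables could be filled in before the JSON is parsed. It is not the repository's `prepare_job_results` implementation: `fill_template` and the sample values are hypothetical, and the sketch assumes the substituted values contain no characters that need JSON escaping.

```python
import json
import re

# Matches placeholders such as <VIDEO_JOB_ID> or <AWS_REGION>.
PLACEHOLDER = re.compile(r"<([A-Z0-9_]+)>")


def fill_template(template_text: str, values: dict) -> dict:
    """Replace <NAME> placeholders with values, then parse the result as JSON.

    Hypothetical helper for illustration only.
    """

    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            # Fail loudly rather than leave an unresolved placeholder behind.
            raise KeyError(f"No value provided for template variable <{name}>")
        return str(values[name])

    return json.loads(PLACEHOLDER.sub(substitute, template_text))


if __name__ == "__main__":
    template = '{"region": "<AWS_REGION>", "detail": {"jobId": "<VIDEO_JOB_ID>"}}'
    print(fill_template(template, {"AWS_REGION": "us-east-1", "VIDEO_JOB_ID": "job-123"}))
```

Raising on a missing variable keeps a half-filled template from being posted as if it were a real callback.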