Add comprehensive edge case tests for external resources and mass publish pipeline by Copilot · Pull Request #2800 · mitodl/ocw-studio

Copilot · 2025-11-19T11:55:52Z

Recent production issues stemmed from a data migration that converted Hugo shortcode subscripts/superscripts in external resource links (e.g., H{{< sub "2" >}}O) to nested markdown, which broke the mass publish pipeline. Existing test coverage was insufficient to catch these edge cases.

Changes

External Resource API Edge Cases (external_resources/api_edge_cases_test.py, 174 lines)

Network errors: timeouts, SSL failures, connection errors, redirect chains
URL edge cases: empty, malformed, internationalized domains, very long paths
Server errors: 500/502/503/504 status codes
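
The failure categories listed above lend themselves to a table of cases. The following is a minimal sketch, not the code added in this pull request: `check_url`, its injected `fetch` callable, and the `(is_valid, reason)` return shape are all hypothetical stand-ins for the real external resource checker.

```python
import ssl
from typing import Callable, Optional, Tuple

SERVER_ERROR_CODES = (500, 502, 503, 504)


def check_url(url: str, fetch: Callable[[str], int]) -> Tuple[bool, Optional[str]]:
    """Hypothetical checker: returns (is_valid, failure_reason)."""
    if not url or not url.strip():
        return False, "empty"
    try:
        status = fetch(url)  # assumed to return an HTTP status code
    except TimeoutError:
        return False, "timeout"
    except ssl.SSLError:
        return False, "ssl"
    except ConnectionError:
        return False, "connection"
    if status in SERVER_ERROR_CODES:
        return False, f"server_error_{status}"
    if status >= 400:
        return False, f"http_{status}"
    return True, None


def raising(exc: Exception) -> Callable[[str], int]:
    """Build a fake fetch that always raises the given exception."""
    def _fetch(_url: str) -> int:
        raise exc
    return _fetch


def returning(status: int) -> Callable[[str], int]:
    """Build a fake fetch that always returns the given status code."""
    return lambda _url: status


# Each case: (url, fake fetch, expected result)
CASES = [
    ("", returning(200), (False, "empty")),
    ("http://example.com", raising(TimeoutError()), (False, "timeout")),
    ("https://example.com", raising(ssl.SSLError("bad cert")), (False, "ssl")),
    ("http://example.com", raising(ConnectionError()), (False, "connection")),
    ("http://example.com", returning(200), (True, None)),
] + [
    ("http://example.com", returning(code), (False, f"server_error_{code}"))
    for code in SERVER_ERROR_CODES
]


def run_cases() -> list:
    """Return one boolean per case: True when the result matches."""
    return [check_url(url, fetch) == expected for url, fetch, expected in CASES]
```

Injecting `fetch` keeps the sketch free of real network calls; in a pytest suite the same table would typically feed `pytest.mark.parametrize`.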

Mass Build/Publish Edge Cases (content_sync/mass_build_edge_cases_test.py, 276 lines)

Empty websites, mixed publish statuses, large-scale scenarios (50+ sites)
Content with null metadata, special characters, nested markdown structures
All version/offline parameter combinations
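
Covering every version/offline combination is easiest when the combinations are generated rather than written out by hand. This is a sketch under stated assumptions: the version names `"draft"` and `"live"` and the `pipeline_label` helper are illustrative and are not taken from this pull request.

```python
from itertools import product
from typing import List, Tuple

VERSIONS = ("draft", "live")   # assumed version names
OFFLINE_FLAGS = (False, True)


def pipeline_label(version: str, offline: bool) -> str:
    """Hypothetical helper: build a label for one mass build configuration."""
    if version not in VERSIONS:
        raise ValueError(f"unknown version: {version!r}")
    suffix = "-offline" if offline else ""
    return f"mass-build-{version}{suffix}"


def all_combinations() -> List[Tuple[str, bool]]:
    """Enumerate every (version, offline) pair exactly once."""
    return list(product(VERSIONS, OFFLINE_FLAGS))


# Every combination should map to a distinct label.
LABELS = [pipeline_label(version, offline) for version, offline in all_combinations()]
```

Generating the pairs with `itertools.product` means a newly added version or flag value is picked up by the tests without further edits.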

External Resource Tasks Edge Cases (external_resources/tasks_edge_cases_test.py, 450 lines)

Concurrent updates and race conditions
Wayback Machine API failures, missing jobs, partial responses
Submission interval boundary conditions
Deleted content handling
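
Submission interval boundaries are a common source of off-by-one errors, so it helps to test exactly at, just before, and just after the threshold. The sketch below is illustrative only: `is_due_for_submission` is a hypothetical helper, and treating the boundary as inclusive is an assumption rather than the documented behavior of the Wayback Machine tasks.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_due_for_submission(
    last_submitted: Optional[datetime],
    now: datetime,
    interval: timedelta,
) -> bool:
    """Hypothetical rule: resubmit once at least `interval` has elapsed."""
    if last_submitted is None:
        return True  # never submitted before
    return now - last_submitted >= interval  # boundary assumed inclusive


NOW = datetime(2025, 11, 19, 12, 0, tzinfo=timezone.utc)
INTERVAL = timedelta(days=30)  # illustrative value

BOUNDARY_CASES = {
    "never_submitted": is_due_for_submission(None, NOW, INTERVAL),
    "exactly_at_interval": is_due_for_submission(NOW - INTERVAL, NOW, INTERVAL),
    "one_second_early": is_due_for_submission(
        NOW - INTERVAL + timedelta(seconds=1), NOW, INTERVAL
    ),
    "one_second_late": is_due_for_submission(
        NOW - INTERVAL - timedelta(seconds=1), NOW, INTERVAL
    ),
}
```

Passing `now` in explicitly, instead of reading the clock inside the function, keeps the boundary cases deterministic.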

Subscript/Superscript Integration Tests (websites/management/commands/markdown_cleaning/external_resource_subsup_integration_test.py, 484 lines)

Regression tests for Hugo shortcode conversion issue
Chemical formulas: H{{}}O, CO{{}}, Fe{{}}O{{}}
Trademark symbols: Product{{}}
Validates converted markdown doesn't create nested structures
Mass publish compatibility, idempotency verification
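
The conversion and idempotency checks described above can be illustrated with a small standalone sketch. It is not the cleaner rule from this repository: `convert_subsup` and its regular expression are hypothetical, and they only handle the `{{< sub "..." >}}` and `{{< sup "..." >}}` forms.

```python
import re

# Matches Hugo shortcodes of the form {{< sub "2" >}} or {{< sup "TM" >}}.
SHORTCODE_RE = re.compile(r'\{\{<\s*(sub|sup)\s+"([^"]*)"\s*>\}\}')


def convert_subsup(markdown: str) -> str:
    """Hypothetical converter: replace sub/sup shortcodes with HTML tags."""
    return SHORTCODE_RE.sub(r"<\1>\2</\1>", markdown)


ORIGINAL = '[H{{< sub "2" >}}O](http://example.com/water)'
CONVERTED = convert_subsup(ORIGINAL)

# Running the conversion a second time should change nothing.
IS_IDEMPOTENT = convert_subsup(CONVERTED) == CONVERTED
```

Because the converted text contains no shortcode delimiters, a second pass finds nothing to replace, which is the idempotency property the integration tests verify.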

Example

def test_external_resource_conversion_with_subscript_in_link_text(settings):
    """Validates H{{< sub "2" >}}O converts without breaking publish pipeline"""
    # `website` and `cleaner` are not defined in this excerpt; presumably they
    # are set up elsewhere in the test module.
    markdown = '[H{{< sub "2" >}}O](http://example.com/water)'
    content = WebsiteContentFactory.create(markdown=markdown, website=website)

    cleaner.update_website_content(content)

    # Verifies: no nested brackets, Hugo shortcodes converted to HTML
    assert "resource_link" in content.markdown
    assert "<sub>2</sub>" in content.markdown
    assert "[[" not in content.markdown  # No invalid nesting

Impact: 1,384 lines of test code, 50+ test functions preventing regressions in external resource handling and mass publish operations.

Original prompt

This section details the original issue you should resolve

<issue_title>Improve Tests in OCW Studio</issue_title>
<issue_description>### Description/Context

Recently, we have been experiencing issues on QA and Production alike related to mass publish and mass build. The issue stemmed from the data migration of legacy external links to External Resources. This converted the subscripts and superscripts in markdown (in Hugo partials syntax) to a nested markdown syntax for an external resource, which broke the mass publish pipeline with invalid data errors.

Apart from this, it was observed that a few tests became flaky after a Posthog version upgrade. The tests, however, were failing locally only.

We need to come up with a plan to improve testing in OCW locally or in QA.

### Plan/Design

TBA</issue_description>

<agent_instructions>work on improving tests in general and increasing coverage overall. Review our test suite to identify any improper tests that need to be fixed.</agent_instructions>

Comments on the Issue (you are @copilot in this section)
<comment_new>@pdpinch
@umar8hassan please share a plan for this before you start working on it.</comment_new>
<comment_new>@umar8hassan
> [@umar8hassan](https://github.com/umar8hassan) please share a plan for this before you start working on it.

Since both the mentioned issues have been resolved in separate pull requests, we can work on improving tests in general and increasing coverage overall.

Off the top of my head, I can work on this in parallel and ask Copilot to review our test suite and increase coverage. I'd then review/update the generated tests.</comment_new>
<comment_new>@zawan-ila
@pdpinch @umar8hassan One important improvement we can make here is to add e2e tests for the offline theme. Over time, I have fixed a number of issues with the offline theme and feel that many of the problems that arose were regressions that would have been caught earlier if we had tests for the offline theme (similar to the ones that we have for the online theme)

While working on these issues, I have wanted to add tests but have been held back by the absence of a test infrastructure for offline theme. If we do set up the test infra for the offline theme, it will be easier to test any offline theme changes going forward and give us more confidence in the correctness of the modifications that we make. </comment_new>
<comment_new>@pdpinch
Ok. Please open an issue for e2e tests for the offline theme and make a plan for how you would approach it. I presume it would be similar to how we handle e2e tests for the courses-v2 theme. </comment_new>
<comment_new>@zawan-ila
@pdpinch I have created https://github.com/mitodl/hq/issues/9186 for e2e tests for the offline theme.</comment_new>

Fixes mitodl/hq#8286

…subscripts/superscripts These tests specifically address the regression mentioned in the issue where data migration converted Hugo shortcode subscripts/superscripts in external resource links to nested markdown, breaking the mass publish pipeline. Co-authored-by: umar8hassan <71461724+umar8hassan@users.noreply.github.com>

Initial plan

ffe9ef3

Copilot AI assigned Copilot and umar8hassan Nov 19, 2025

Copilot started work on behalf of umar8hassan November 19, 2025 11:56 View session

Copilot AI and others added 3 commits November 19, 2025 12:04

Add comprehensive edge case tests for external resources and mass build functionality

4dcfd2e

Co-authored-by: umar8hassan <71461724+umar8hassan@users.noreply.github.com>

Add comprehensive documentation of test improvements

1166fe0

Co-authored-by: umar8hassan <71461724+umar8hassan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Improve tests in OCW Studio for better coverage~~ Add comprehensive edge case tests for external resources and mass publish pipeline Nov 19, 2025

Copilot AI requested a review from umar8hassan November 19, 2025 12:10

Copilot finished work on behalf of umar8hassan November 19, 2025 12:11

Copilot wants to merge 4 commits into master from copilot/improve-tests-in-ocw-studio
