diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md index 591fea83ac..3fd238e411 100644 --- a/.github/copilot-instructions.md +++ b/.github/copilot-instructions.md @@ -67,8 +67,13 @@ Copilot must still follow these steps: ## Prompt Feedback (Compact) - Always assess prompt quality for every prompt and emit scores unless the scoring thresholds for suppression are met. - When feedback is not suppressed, use the following format: -- Format: `Scores: Completeness X/10, Assumptions X/10, Clarity X/10 | Critique: | Improve: `. -- Scoring: Completeness and Clarity are higher-is-better; Assumptions is lower-is-better. +- Format: `Scores: Completeness X/10, Assumptions X/10, Clarity X/10, CostRisk X/10 | Critique: | Improve: `. +- Scoring: Completeness and Clarity are higher-is-better; Assumptions and CostRisk are lower-is-better. +- **CostRisk** estimates likely token usage and repository traversal behaviour. + - Score higher when the prompt involves: large logs, many files, broad repository + requests, unrestricted agent instructions, repeated context, or exploratory prompting. + - Score lower when the prompt is narrowly scoped, attaches minimal context, + and targets a specific outcome. - Strict rubric for underspecified prompts: - If the prompt is extremely vague (for example: "build something"), score it harshly. - For these prompts, use: Completeness 0-3/10, Clarity 0-3/10, Assumptions 7-10/10. @@ -78,7 +83,7 @@ Copilot must still follow these steps: - Never suppress scored feedback for underspecified prompts (including prompts that fall under the strict rubric above). - Suppress displayed feedback when Completeness >= 8, Assumptions <= 2, - and Clarity >= 8, + Clarity >= 8, and CostRisk <= 3, unless the user explicitly asks to apply feedback to the current prompt. - Determine suppression from the current user prompt only; retrospective analysis of earlier prompts should be provided only when explicitly requested. 
- Prefer high-compliance guidance: suggest exact wording that reduces diff --git a/docs/GithubCopilot/CopilotGeneralUseGuidelines.md b/docs/GithubCopilot/CopilotGeneralUseGuidelines.md new file mode 100644 index 0000000000..7e94778787 --- /dev/null +++ b/docs/GithubCopilot/CopilotGeneralUseGuidelines.md @@ -0,0 +1,548 @@ +# GitHub Copilot Usage-Based Billing Changes and Cost Management Guidance + +## Overview + +GitHub announced significant changes to Copilot pricing and billing in April 2026. The previous model, which largely treated Copilot usage as a fixed-cost subscription with soft usage limits, is being replaced with a usage-based billing model driven by token consumption. + +This document explains: + +- What changed +- Why the changes matter +- How engineering behaviour must adapt +- Best practices for controlling cost exposure +- Anti-patterns that will become expensive under the new model + +--- + +# What Changed + +## Previous Model + +Historically, Copilot pricing was structured as a fixed monthly subscription. Different models already carried different effective costs through premium request multipliers and quota weighting, but these distinctions were rarely visible in day-to-day use. The result was: + +- Broadly predictable cost at the subscription level +- "Unlimited" or near-unlimited usage perception +- Premium request limits that felt abstract and disconnected from behaviour +- No clear relationship between prompt size, context volume, and cost + +This encouraged widespread behaviours such as: + +- Large conversational debugging sessions +- Whole-repository analysis +- Repeated exploratory prompting +- Long-lived chat sessions +- "Infinite prompt" workflows + +In practice, many engineers treated Copilot usage as essentially free once licensed. + +--- + +## New Model (2026) + +GitHub has introduced a usage-based billing model that makes the cost differences between models, workflows, and context sizes explicit. 
The underlying economics were always present through premium request multipliers and quota weighting, but the new model ties cost directly to token usage, context size, and workflow behaviour. + +Key billing factors: + +- Prompt size (input tokens) +- Context size (attached files, editor context, conversation history) +- Response size (output tokens) +- Model selected (different models carry different per-token rates) +- Agentic workflow activity (tool calls, repository traversal, multi-step reasoning) + +This changes Copilot from a flat-cost development tool into a metered AI compute platform. + +--- + +# Why This Matters + +Under the new model, engineering behaviour directly affects cost. + +The following activities now materially increase spend: + +| Activity | Cost Risk | +|---|---| +| Large prompts | High | +| Long logs | High | +| Whole-repo context | High | +| Agent/refactor mode | Very High | +| Long-running chat sessions | High | +| Repeated iterative prompting | High | +| Premium models | High | + +Previously inefficient prompting habits were mostly harmless. + +They are now financially visible. + +--- + +# The End of "Infinite Prompt" Culture + +A number of prompting patterns became popular under the old pricing model because usage appeared effectively unlimited. + +Examples included: + +- Treating Copilot as a conversational debugger +- Keeping one chat open for days or weeks +- Uploading entire repositories +- Pasting massive logs +- Iteratively refining prompts indefinitely +- Using agent mode for broad exploratory work + +These approaches are now expensive. + +## Why "Infinite Prompt" Workflows Became Popular + +The old pricing model unintentionally encouraged behaviour such as: + +```text +Explain this. +Now explain this. +What about this case? +Rewrite it. +Now optimise it. +Now modernise it. +Now compare approaches. +``` + +Each additional prompt was perceived as free. 
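That perception no longer holds. A rough back-of-envelope sketch (hypothetical token counts, not real billing rates) shows how resending conversation history makes each follow-up more expensive than the last:

```python
# Illustrative only: if the full conversation history is resent with every
# turn, cumulative input tokens grow roughly quadratically with the number
# of follow-ups. The per-prompt and per-reply sizes below are invented.

def cumulative_input_tokens(turns, tokens_per_prompt=200, tokens_per_reply=400):
    """Total input tokens sent over a session where each new prompt
    re-includes all prior prompts and replies as context."""
    total = 0
    history = 0
    for _ in range(turns):
        total += history + tokens_per_prompt   # resend history + new prompt
        history += tokens_per_prompt + tokens_per_reply
    return total

# Five quick follow-ups in one session vs five independent prompts:
session = cumulative_input_tokens(5)
independent = 5 * 200
print(session, independent)  # 7000 vs 1000 under these assumptions
```

The exact numbers are made up; the shape of the curve is the point. Each "free" follow-up carries the whole conversation with it.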
+ +Under token billing, every interaction carries incremental cost because: + +- Context history is resent +- Prior responses remain in memory +- Larger histories increase token consumption +- Repository context may be reloaded repeatedly + +Long conversational sessions can become surprisingly expensive. + +--- + +# Before vs After: Engineering Behaviour + +## Before + +Typical workflow: + +```text +Paste entire log +Paste multiple files +Ask broad question +Iterate repeatedly +Keep chat open indefinitely +``` + +Cost impact: +- Mostly invisible + +## After + +Recommended workflow: + +```text +Isolate problem first +Reduce logs +Limit scope +Ask targeted questions +Start fresh chats frequently +``` + +Cost impact: +- Controlled and predictable + +--- + +# Cost-Control Best Practices + +## 1. Minimise Context Size + +The single biggest driver of token cost is context volume. + +### Recommended + +- Limit prompts to relevant files only +- Include only necessary log excerpts +- Remove unrelated stack traces +- Summarise findings before prompting + +### Avoid + +- Whole repository analysis +- Multi-thousand-line logs +- Entire manifests +- Large copy/paste sessions + +--- + +## 2. Pre-Filter Data Before Using Copilot + +Use traditional tooling first. + +Examples: + +- grep +- rg +- klogg +- log filters +- custom scripts + +Goal: +- Reduce AI input size before submission + +Example: + +Instead of: + +```text +Analyse this 20,000-line playback log +``` + +Prefer: + +```text +The issue occurs during DRM renewal. +Here are the 40 relevant lines around the failure. +``` + +This improves: + +- Response quality +- Cost efficiency +- Signal-to-noise ratio + +--- + +## 3. Scope Questions Tightly + +### Good + +```text +Review this ABR selection logic. +``` + +### Bad + +```text +Explain how playback works. +``` + +Large architectural questions often trigger expensive repository traversal. 
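As an illustration only, the good/bad distinction above can even be checked mechanically before a prompt is sent. The phrase list in this sketch is invented for the example, not an official heuristic:

```python
# Hypothetical helper: flag prompt wording that tends to invite broad,
# expensive repository exploration. The phrase list is illustrative.

BROAD_PHRASES = (
    "how playback works",
    "entire repository",
    "all files",
    "everything about",
    "whole repo",
)

def looks_broad(prompt: str) -> bool:
    """Return True when the prompt contains wording associated with
    open-ended repository traversal rather than a scoped question."""
    lowered = prompt.lower()
    return any(phrase in lowered for phrase in BROAD_PHRASES)

print(looks_broad("Explain how playback works."))       # True
print(looks_broad("Review this ABR selection logic."))  # False
```

A check like this could sit in a pre-commit hook or prompt template, but its real use here is as a mental model: if a phrase matcher would flag your prompt, Copilot's traversal machinery probably will too.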
+ +### Repository Traversal Behaviour + +Copilot prioritises explicitly provided files and editor context when answering questions. However, semantic search, dependency tracing, symbol resolution, and agent planning may still cause Copilot to inspect additional files beyond what was explicitly attached. + +To limit unnecessary traversal: + +- Attach only the files relevant to the question. +- State the scope explicitly ("only consider `AampAbrManager.cpp`"). +- Add instructions such as: "Do not inspect unrelated files unless the current evidence is insufficient." +- Avoid broad prompts that require Copilot to search the repository for context you could provide directly. + +--- + +## 4. Avoid Long Conversational Sessions + +Long-lived chats accumulate hidden context. + +This means: + +- Higher token reuse +- Increasing prompt size +- Rising cost over time + +### Recommendation + +Start a new chat when: + +- Changing topic +- Moving subsystem +- Starting new investigations + +--- + +## 5. Use Bounded Prompts With Bounded Scope + +Both extremes are expensive: + +- A single massive prompt that includes everything (large logs, many files, broad instructions) consumes heavy input tokens in one request. +- Endless iterative prompting accumulates cost through repeated context reloads and growing conversation history. + +The goal is bounded prompts with bounded scope. 
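A minimal sketch of what "bounded" can mean in practice: one self-contained prompt per file, each carrying only its own context. File names, fields, and snippet contents here are hypothetical:

```python
# Illustrative only: build one bounded, self-contained prompt per file
# instead of a single mega-prompt that attaches every file at once.

def build_focused_prompts(files, objective):
    """Return a list of prompts, one per file, each limited to a single
    named scope and its own snippet."""
    return [
        f"Goal: {objective}\nScope: only {name}\n---\n{snippet}"
        for name, snippet in files.items()
    ]

prompts = build_focused_prompts(
    {"AampAbrManager.cpp": "<relevant snippet>"},
    "Review the ABR fallback logic for an off-by-one issue",
)
print(len(prompts))  # one bounded, verifiable prompt per file
```

Each generated prompt states its goal and scope explicitly and repeats no context from its siblings, which is the property the sections below describe.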
+ +### Patterns to Avoid + +- One giant prompt containing entire logs, many files, and open-ended questions +- Long iterative chains where each follow-up resends the full conversation history +- Broad exploratory prompting ("tell me everything about this module") +- Repeated context reloads from starting over without narrowing scope + +### Recommended Approach + +Multiple focused prompts are reasonable when each prompt: + +- Has a clear, narrow objective +- Includes only the context needed for that objective +- Does not repeat large attachments unnecessarily +- Avoids open-ended exploration + +### Example + +Instead of: + +```text +Here are 15 files and a 2000-line log. +Explain, fix, optimise, and modernise everything. +``` + +Prefer: + +```text +Prompt 1: Review the ABR fallback logic in AampAbrManager.cpp + for the off-by-one issue described in VPAAMP-99. +Prompt 2: Suggest a unit test for the corrected branch. +``` + +Each prompt is self-contained, scoped, and verifiable. + +--- + +## 6. Avoid Whole-Repository Agent Operations + +Agent and refactor modes can generate substantial token usage through multi-step reasoning, tool calls, and broad repository traversal. + +Examples of high-risk operations: + +- Repository-wide refactors +- Broad code modernisation +- Autonomous exploratory debugging +- Large-scale architectural analysis +- Unrestricted agent instructions without scope constraints + +### Recommended + +Restrict operations to: + +- Specific modules +- Small file sets +- Clearly bounded scopes + +When using agent mode, include explicit scope boundaries in the prompt: + +```text +Refactor AampTsbReader.cpp to use RAII for file handles. +Do not modify other files. +Do not inspect unrelated files unless the current evidence is insufficient. +``` + +--- + +## 7. Use Copilot for High-Value Tasks + +Copilot remains extremely valuable when used efficiently. 
+ +Good return-on-cost activities include: + +| Use Case | Efficiency | +|---|---| +| Inline completion | Excellent | +| Boilerplate generation | Excellent | +| Unit test creation | Excellent | +| Small refactors | Excellent | +| API documentation | Good | +| Targeted debugging | Good | + +Less efficient activities include: + +| Use Case | Efficiency | +|---|---| +| Exploratory conversations | Poor | +| Broad architecture tutoring | Poor | +| Massive log analysis | Poor | +| Open-ended investigation | Poor | + +--- + +## 8. Understand Model and Workflow Cost Differences + +Different models and workflows carry significantly different costs. + +### Autocomplete + +Inline autocomplete (code completions as you type) is generally expected to remain low-cost under the new model. It uses small, fast models with minimal context and generates short completions. For most developers, autocomplete will not be a significant cost driver. + +### Chat and Agent Workflows + +Chat and agent workflows are substantially more expensive than autocomplete. They involve larger context windows, longer responses, tool calls, and potentially multi-step reasoning with repository traversal. Cost scales with the complexity and breadth of the request. + +### Model Selection + +Premium models cost more per token than standard models. + +| Activity | Model Strategy | +|---|---| +| Routine coding | Standard model | +| Complex debugging | Premium model | +| Architecture reviews | Premium model | +| Autocomplete | Default model (generally low-cost) | + +Choose the model appropriate to the task. Do not default all users to the most expensive models without justification. + +--- + +## 9. Establish Team-Level Governance + +Organisations should confirm their specific billing configuration, as arrangements vary: + +- **Hard quotas:** Some organisations enforce a fixed monthly spending cap. Once reached, access may be restricted. 
+- **Metered overages:** Others allow usage beyond the included allowance at a per-unit overage rate. +- **Pooled enterprise billing:** Some enterprise agreements pool credits across teams or business units. + +Regardless of arrangement, the following controls are recommended: + +- Usage dashboards with per-team and per-user visibility +- Budget monitoring and cost alerts +- Confirmed spending limits and overage settings +- Team-level reporting on consumption trends +- Prompt hygiene guidance and periodic review +- Controlled rollout of agent mode features + +Without governance, costs may grow unpredictably. Teams should understand their organisation's specific limits and alert thresholds before relying on high-volume workflows. + +--- + +## 10. Create Prompt Templates + +Structured prompts reduce: + +- Ambiguity +- Repetition +- Token waste + +Recommended format: + +```text +Problem: +Observed behaviour: +Expected behaviour: +Relevant module: +Relevant logs: +Specific question: +``` + +This improves both: + +- Response quality +- Cost efficiency + +--- + +# Prompt Feedback and Cost Awareness + +## Automatic Prompt Feedback + +The AAMP repository includes custom instructions that cause Copilot to assess prompt quality automatically. When you submit a prompt, Copilot evaluates it for completeness, clarity, and the level of assumptions required, then provides a brief feedback line. + +Developers should pay attention to this feedback. It is designed to surface prompting weaknesses that may result in poor outputs or unnecessary cost. + +When feedback is suppressed (no feedback line is displayed), this typically means the prompt is already sufficiently clear and bounded. Suppression is a positive signal. + +## CostRisk Metric + +An optional scoring metric, `CostRisk X/10`, can be included in prompt feedback to estimate the likely token usage and repository traversal cost of a prompt. + +- Lower is better. 
+- The score estimates how expensive a prompt is likely to be based on its structure and scope. + +### Heuristics + +The following factors increase CostRisk: + +| Factor | Why It Increases Cost | +|---|---| +| Large logs or attachments | High input token count | +| Many attached files | Broad context window | +| Broad repository requests | Triggers traversal and semantic search | +| Unrestricted agent instructions | Multi-step reasoning with tool calls | +| Repeated context across prompts | Redundant token consumption | +| Exploratory or open-ended prompting | Unpredictable expansion | + +### Example Feedback Line + +```text +Scores: Completeness 7/10, Assumptions 3/10, Clarity 7/10, CostRisk 6/10 +| Critique: Prompt includes a 1500-line log and asks for broad analysis + without isolating the failure. +| Improve: Extract the 30-50 lines around the DRM renewal failure and + ask specifically about the licence acquisition timeout. +``` + +A CostRisk of 1-3 indicates a well-scoped prompt. A CostRisk of 7-10 suggests the prompt will likely trigger heavy token consumption or broad repository traversal and should be narrowed. + +--- + +# Recommended Log Handling Guidelines + +## Maximum Recommended Sizes + +| Content Type | Suggested Limit | +|---|---| +| Logs | < 300 lines | +| Source files | 2-3 files | +| Manifest excerpts | Relevant sections only | +| Chat session duration | Short-lived | + +These are guidelines rather than hard limits, but exceeding them significantly increases cost risk. + +--- + +# Key Cultural Change + +The old mindset: + +```text +Copilot usage is effectively free. +``` + +The new mindset: + +```text +Copilot usage is metered compute consumption. +Every prompt has a cost proportional to its scope and context. 
+``` + +Engineering teams that adapt successfully will: + +- Scope narrowly +- Minimise context +- Use traditional tooling first +- Avoid conversational drift +- Treat AI usage as an engineering resource +- Provide bounded, evidence-based prompts +- Reduce unnecessary repository traversal +- Limit agent expansion to well-defined tasks + +--- + +# Final Recommendations + +The most effective single improvement is: + +## Reduce input size before invoking AI + +For most debugging workflows: + +1. Investigate manually first +2. Isolate the issue +3. Extract minimal evidence +4. Ask focused questions + +This typically: + +- Reduces cost substantially +- Improves answer quality +- Produces faster responses +- Reduces hallucination risk + +The goal is not to reduce Copilot usage. + +The goal is deliberate usage, predictable cost, better outputs, easier validation, and reusable prompting patterns. \ No newline at end of file diff --git a/docs/GithubCopilot/CopilotPromptGuidelines.md b/docs/GithubCopilot/CopilotPromptGuidelines.md new file mode 100644 index 0000000000..a086704e2e --- /dev/null +++ b/docs/GithubCopilot/CopilotPromptGuidelines.md @@ -0,0 +1,753 @@ +# Copilot Prompting Guidelines for AAMP Development + +This checklist helps ensure Copilot produces accurate, safe, useful, and cost-conscious outputs when working on AAMP, including DASH, HLS, buffering, ABR, DRM, manifest handling, playback state, and log analysis. + +These guidelines assume Copilot usage may be metered by token consumption, model choice, context size, and agent activity. The goal is not to avoid Copilot, but to use it deliberately and efficiently. + +--- + +## Core Principle + +Treat Copilot prompts as metered engineering work. 
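One way to make "metered" concrete is a quick size check before pasting context. The ~4 characters per token figure is a common rule of thumb rather than an exact tokenizer, and the budget below is purely illustrative:

```python
# Rough pre-flight check before pasting context into a prompt.
# ~4 characters per token is an approximation, not a real tokenizer;
# the 4000-token budget is an invented, self-imposed limit.

def approx_tokens(text: str) -> int:
    return len(text) // 4

def within_budget(context: str, budget_tokens: int = 4000) -> bool:
    """Return True when the pasted context fits the self-imposed budget."""
    return approx_tokens(context) <= budget_tokens

log_excerpt = "line\n" * 300          # ~300 short log lines
print(approx_tokens(log_excerpt))     # rough size estimate: 375
print(within_budget(log_excerpt))     # True for a trimmed excerpt
```

If a paste fails a check like this, trim it before prompting rather than sending it and hoping.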
+ +Before asking Copilot, reduce the input to the smallest useful context: + +- define the goal +- define the scope +- include only relevant files, logs, manifests, or snippets +- ask for structured output +- avoid long exploratory prompt chains +- prefer a short, focused prompt over a broad open-ended request + +--- + +# Must-Haves + +## 1. Start with a clear goal + +State the task in one sentence first. + +Good: + +```text +Analyze why playback stalls after a bitrate switch. +``` + +Poor: + +```text +Something is wrong with playback. +``` + +## Prompt Feedback Mechanism + +The repository instructions include an automatic prompt feedback mechanism. + +Developers should expect prompts to be assessed for quality. When the prompt is already clear, complete, and low-assumption, no feedback may be shown. When feedback is shown, it should be treated as useful guidance rather than noise. + +The feedback is intended to help developers: + +- reduce ambiguity +- avoid excessive context usage +- reduce unnecessary Copilot cost +- improve evidence quality +- avoid broad or speculative requests +- get better answers with fewer follow-up prompts + +Feedback may appear in this compact format: + +```text +Scores: Completeness X/10, Assumptions X/10, Clarity X/10 | Critique: | Improve: +``` + +Scoring is interpreted as follows: + +| Score | Meaning | +|---|---| +| Completeness | Higher is better. Measures whether the prompt includes the goal, context, scope, constraints, evidence, and success criteria. | +| Assumptions | Lower is better. Measures how much Copilot must infer or guess. | +| Clarity | Higher is better. Measures whether the request is specific, understandable, and actionable. 
| + +Developers should pay particular attention to feedback when: + +- Completeness is below 8 +- Assumptions is above 2 +- Clarity is below 8 +- Copilot suggests a more specific prompt wording +- the critique says the prompt is too broad, underspecified, or likely to require unnecessary context + +A prompt may be scored harshly if it omits: + +- target +- scope +- constraints +- success criteria +- relevant evidence +- expected output format + +Very vague prompts such as: + +```text +Build something. +``` + +or: + +```text +Fix playback. +``` + +should be expected to receive low Completeness and Clarity scores, and a high Assumptions score. + +Do not ignore this feedback. It is there to prevent common failure modes: + +- Copilot guessing the intent +- unnecessary repository traversal +- excessive log or file context being loaded +- expensive iterative prompt chains +- broad refactors before root cause is understood +- answers that are difficult to validate + +When feedback is shown, revise the prompt before continuing where practical. + +For example, instead of continuing with: + +```text +Fix playback. +``` + +use the suggested improvement to produce something like: + +```text +Analyze why DASH playback stalls after an ABR upswitch. Focus on buffer state and ABR decision logic only. Use the provided 80-line log excerpt and the named source files. Do not modify code yet. Return root cause, evidence, minimal fix, risks, and validation steps. +``` + +The feedback mechanism is not intended to slow development down. It is intended to make prompts: + +- cheaper +- clearer +- safer +- more deterministic +- easier to validate +- less dependent on follow-up questions + +If feedback is suppressed, that normally means the current prompt is already considered sufficiently complete, clear, and low-assumption. + +--- + +## 2. Define scope explicitly + +Mention only the relevant scope. 
+ +Include: + +- files, classes, or components +- stream type: DASH or HLS +- functional area: ABR, buffering, DRM, manifest parsing, subtitle handling, playback state, etc. +- whether the request is analysis-only or should include code changes + +Good: + +```text +Focus only on DASH ABR switching in StreamAbstractionAAMP and related buffer management. Do not inspect unrelated DRM code unless the evidence points there. +``` + +--- + +## 3. Minimise context before prompting + +Do not paste large inputs by default. + +Before using Copilot: + +- filter logs +- isolate timestamps +- identify the relevant playback event +- include only the smallest set of files needed +- summarise known findings instead of pasting full investigations + +Avoid: + +- entire logs +- whole manifests +- unrelated stack traces +- multiple large source files +- repository-wide questions without a clear reason + +Recommended limits: + +| Input Type | Suggested Limit | +|---|---:| +| Logs | Prefer under 100 lines; avoid exceeding 300 lines | +| Source files | Prefer 1-3 files | +| Manifest content | Relevant periods, variants, renditions, or tags only | +| Chat history | Start a new chat when changing topic | +| Agent tasks | Use only with bounded scope | + +--- + +## 4. Describe observable behaviour + +Describe what happens, when it happens, and under what conditions. + +Good: + +```text +Playback stalls about 10 seconds after an ABR upswitch on a DASH live stream. Audio continues for several seconds, then video freezes. The issue occurs only with Widevine-enabled streams. +``` + +Include: + +- tune, seek, trickplay, ABR switch, license renewal, discontinuity, or stall point +- live or VOD +- DASH or HLS +- DRM type if relevant +- device/platform if relevant +- whether the issue is reproducible + +--- + +## 5. State constraints + +Examples: + +- Do not change public APIs. +- Preserve existing playback behaviour. +- Avoid performance regressions. +- Maintain DRM flow. 
+- Avoid additional network requests. +- Do not change manifest parsing outside the affected stream type. +- Prefer changes local to the identified module. +- Do not perform a broad refactor unless explicitly requested. + +--- + +## 6. Require evidence-based reasoning + +Always ask Copilot to ground its answer in evidence. + +Ask it to separate: + +- facts +- hypotheses +- conclusions +- unknowns + +Good: + +```text +Base the analysis only on the provided code and logs. Separate facts, hypotheses, and conclusions. Point to the code paths or log lines that support each conclusion. +``` + +--- + +## 7. Ask for structured output + +Request clear sections. + +Recommended sections: + +- Summary +- Root cause +- Evidence +- Minimal fix +- Risks +- Validation steps +- Unknowns +- Follow-up improvements + +--- + +## 8. Prefer minimal safe changes + +Explicitly say: + +```text +Prefer the smallest safe fix before suggesting refactors. +``` + +This matters for both engineering risk and Copilot cost. Broad refactors often require more context, more iterations, and more agent activity. + +--- + +## 9. Require uncertainty to be called out + +Use: + +```text +State assumptions and unknowns explicitly. Do not present guesses as facts. +``` + +--- + +## 10. Avoid unnecessary prompt chains + +Under usage-based pricing, repeated small follow-up prompts can become expensive because context history may be reused. + +Poor pattern: + +```text +Explain this. +Now explain that. +Now compare it with another approach. +Now rewrite it. +Now make it cleaner. +``` + +Better pattern: + +```text +Analyze the issue, identify the root cause, propose the smallest safe fix, list risks, and provide validation steps in one response. +``` + +--- + +# Cost-Aware Prompting Rules + +## 1. Do not use Copilot as a log search tool + +Use local tools first: + +- grep +- rg +- klogg +- scripts +- existing test output +- targeted log filters + +Then provide Copilot with the reduced evidence. 
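The same reduction can be scripted when grep alone is not enough. A minimal sketch (the pattern, line counts, and log contents are arbitrary examples) that keeps only a window of lines around the first match:

```python
# Illustrative pre-filter: extract a small window of log lines around
# the first occurrence of a pattern, before pasting anything into a prompt.

def extract_window(lines, pattern, before=10, after=20):
    """Return the lines surrounding the first line containing `pattern`,
    or an empty list if the pattern never appears."""
    for i, line in enumerate(lines):
        if pattern in line:
            return lines[max(0, i - before): i + after + 1]
    return []

log = ["boot", "tune start", "ABR upswitch", "buffer underrun", "stall"]
print(extract_window(log, "buffer underrun", before=1, after=1))
# → ['ABR upswitch', 'buffer underrun', 'stall']
```

The output of a filter like this is what belongs in the prompt: a few dozen lines of evidence, not the raw log.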
+ +Poor: + +```text +Analyze this 20,000-line AAMP log. +``` + +Better: + +```text +The failure occurs after an ABR upswitch at 12:04:31. Here are the 60 relevant log lines from 10 seconds before to 20 seconds after the switch. +``` + +--- + +## 2. Do not use Copilot as a repository crawler by default + +Poor: + +```text +Explain how playback works in AAMP. +``` + +Better: + +```text +Explain how this specific code path handles buffer updates after a DASH fragment download. Limit the analysis to these files. +``` + +--- + +## 3. Use agent or plan mode selectively + +Agent mode can be expensive because it may inspect files, generate plans, make changes, run tools, and iterate. + +Use agent or plan mode when: + +- changes span multiple files +- sequencing matters +- there are multiple subsystems involved +- manual coordination would be error-prone + +Avoid agent mode for: + +- small local fixes +- simple explanations +- single-function review +- trivial refactoring +- broad exploratory investigation + +Always bound the task. + +Good: + +```text +Create a plan first. Do not implement yet. Limit the plan to DASH ABR handling in these files only. +``` + +--- + +## 4. Use cheaper interaction patterns first + +Prefer this order: + +1. inline completion +2. targeted chat question +3. focused patch request +4. plan mode +5. agent mode + +Do not start with the most expensive interaction mode unless the task justifies it. + +--- + +## 5. Start new chats for new topics + +Long chats accumulate context. 
+ +Start a new chat when: + +- moving to a different defect +- switching from DASH to HLS +- switching from buffering to DRM +- moving from investigation to implementation +- changing files or subsystems significantly + +--- + +# Task-Specific Guidance + +## Code Creation + +Must include: + +- where the code should live +- what must not change +- expected behaviour +- tests or validation steps +- performance constraints +- whether implementation is requested now or only a plan + +Good structure: + +```text +Design first. +List impacted files. +Propose the patch. +Describe validation. +List risks. +``` + +Cost-aware addition: + +```text +Do not inspect unrelated files unless required. Ask before expanding scope. +``` + +--- + +## Bug Analysis and Fixing + +Must include: + +- symptom +- expected behaviour +- reproduction conditions +- relevant logs or code +- recent changes if known +- scope boundaries + +Always ask for: + +- root cause, not just a fix +- minimal fix first +- risk assessment +- validation steps + +Good prompt: + +```text +Analyze this playback stall. Focus on DASH ABR switching and buffer state only. Use the provided logs and code snippets. Identify the first abnormal event, root cause, smallest safe fix, risks, and validation steps. Do not suggest broad refactoring unless needed for correctness. +``` + +--- + +## Architecture or New Features + +Must include: + +- problem statement +- current behaviour +- desired behaviour +- performance expectations +- reliability expectations +- integration points +- constraints +- whether the request is for options only or implementation + +Always ask for: + +- multiple design options +- tradeoffs +- integration points +- migration risks +- validation strategy + +Cost-aware addition: + +```text +Provide a design proposal only. Do not inspect or rewrite code unless explicitly asked. 
+``` + +--- + +## Plan Agent for Complex Work + +Use plan mode or agent mode when: + +- changes span multiple files +- multiple subsystems are involved +- sequencing is non-trivial +- validation requires multiple steps +- rollback planning is needed + +Prompt: + +```text +Create a plan first. Do not implement yet. +``` + +The plan should include: + +- impacted files +- steps +- assumptions +- risks +- validation +- rollback +- points where human confirmation is needed + +Cost-aware addition: + +```text +Keep the plan bounded to the named files and subsystems. Do not expand the search unless the current evidence is insufficient. +``` + +--- + +## Log Debugging for AAMP Runs + +Must include: + +- time window +- anchor event: tune, seek, ABR switch, license renewal, discontinuity, stall, error +- stream type: DASH or HLS +- DRM type if relevant +- environment or device +- expected behaviour +- observed behaviour + +Always ask for: + +- timeline reconstruction +- first abnormal event +- ranked root causes +- evidence for each root cause +- unknowns +- next log lines or instrumentation needed + +Cost-aware log prompt: + +```text +Analyze only the following log excerpt. Do not ask for or infer from the full log unless necessary. Reconstruct the timeline, identify the first abnormal event, rank possible root causes, and cite the evidence for each. 
+``` + +--- + +# Nice-to-Haves + +When useful, ask for: + +- multiple hypotheses +- timeline of events +- current versus proposed behaviour +- instrumentation suggestions +- two solution levels: + - minimal fix + - cleaner follow-up +- test cases +- rollback considerations +- performance impact +- risk to DASH, HLS, DRM, subtitles, downloads, or buffering + +--- + +# Common Mistakes + +Avoid: + +- asking "fix this" with no context +- pasting excessive logs +- pasting unrelated files +- asking broad architecture questions when a focused question would work +- jumping to refactors before root cause +- ignoring constraints such as performance, DRM, timing, and public APIs +- accepting answers without evidence +- keeping one long-running chat for unrelated issues +- using agent mode for small tasks +- repeatedly refining prompts when a single structured prompt would work +- asking Copilot to rediscover information already known + +--- + +# Before and After Prompting Patterns + +## Before: Expensive and vague + +```text +This playback is broken. Look through the code and fix it. +``` + +Problems: + +- no scope +- no stream type +- no evidence +- likely to trigger broad code search +- likely to produce speculative answers +- high cost risk + +## After: Focused and cost-aware + +```text +Goal: +Analyze why playback stalls after a DASH ABR upswitch. + +Context: +The stall occurs on a live DASH stream about 10 seconds after an upswitch. The issue is reproducible with DRM enabled. + +Scope: +Focus on DASH ABR selection and buffer state handling. Use only the provided files and log excerpt unless there is clear evidence that another subsystem is involved. + +Constraints: +Do not change public APIs. Preserve existing playback behaviour. Prefer the smallest safe fix. + +Evidence: +Relevant log excerpt and code snippets are below. 
+ +What I want back: +- root cause +- evidence +- minimal fix +- risks +- validation steps +- assumptions and unknowns +``` + +--- + +# Default Prompt Structure + +Use this template: + +```text +Goal: + +Context: + +Scope: + +Constraints: + +Evidence: + +Cost-control instruction: +Use only the provided context unless there is a clear reason to expand scope. Prefer a concise, evidence-based answer. + +What you want back: +- Summary +- Root cause +- Evidence +- Minimal fix +- Risks +- Validation +- Assumptions and unknowns +``` + +--- + +# Default Plan-Only Prompt Structure + +Use this when you want a report or plan without changes: + +```text +Goal: + +Context: + +Scope: + +Constraints: + +Evidence: + +Instruction: +Create a plan only. Do not modify files. Do not generate a patch yet. + +What you want back: +- Findings +- Recommended approach +- Impacted files +- Risks +- Validation plan +- Open questions +``` + +--- + +# Default Log Analysis Prompt Structure + +```text +Goal: +Analyze this AAMP playback issue. + +Context: +Stream type: +Playback mode: +DRM: +Device/environment: +Anchor event: +Time window: + +Observed behaviour: + +Expected behaviour: + +Evidence: +Relevant log excerpt only. + +Instructions: +Reconstruct the timeline. +Identify the first abnormal event. +Rank possible root causes. +Cite evidence for each conclusion. +State assumptions and unknowns. +Do not request or analyze the full log unless this excerpt is insufficient. 
+ +What you want back: +- Timeline +- First abnormal event +- Ranked root causes +- Evidence +- Next checks +- Validation steps +``` + +--- + +# AAMP-Specific Reminder + +Most AAMP issues are multi-layer problems involving some combination of: + +- manifest handling: DASH or HLS +- network timing +- fragment download behaviour +- buffer state +- ABR selection +- DRM/license flow +- player state transitions +- platform integration +- eventing and telemetry + +Prompt as if the issue may span layers, but do not load every layer into context by default. + +Start narrow, follow the evidence, and expand scope only when justified. \ No newline at end of file