
docs: add troubleshooting guide, tensor shapes reference, and FAQ#147

Open
yurekami wants to merge 1 commit into deepseek-ai:main from yurekami:docs/add-troubleshooting-guides
Open

docs: add troubleshooting guide, tensor shapes reference, and FAQ#147
yurekami wants to merge 1 commit into
deepseek-ai:mainfrom
yurekami:docs/add-troubleshooting-guides

Conversation


@yurekami yurekami commented Jan 2, 2026

Summary

This PR adds three comprehensive documentation files to help users troubleshoot issues, understand tensor shapes, and find answers to common questions. These documents are based on analysis of 53+ open GitHub issues and address the most frequent pain points reported by the community.

Changes

  • docs/TROUBLESHOOTING.md - GPU compatibility matrix, build error solutions, common runtime errors
  • docs/TENSOR_SHAPES.md - Complete tensor shape reference for all major functions
  • docs/FAQ.md - Answers to frequently asked questions from GitHub issues

Motivation

The FlashMLA repository has 53+ open issues, many of which are:

  • Questions about GPU architecture support (SM90 vs SM100 vs SM120)
  • Confusion about sparse vs dense attention modes
  • Build errors related to CUDA versions
  • Tensor shape mismatches
  • Integration questions with vLLM/SGLang

This documentation directly addresses these recurring questions, which should:

  1. Reduce maintainer support burden
  2. Improve developer onboarding experience
  3. Make the project more accessible to new users

Documentation Overview

TROUBLESHOOTING.md

  • GPU compatibility matrix (SM90/SM100/SM120)
  • Build error solutions by CUDA version
  • Runtime error fixes (e.g., "Sparse BF16 MLA is not supported on SM90")
  • Windows/ARM64 workarounds
  • Links to relevant GitHub issues
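A support check in the spirit of the compatibility matrix above can be sketched in plain Python. The architecture-to-feature table here is illustrative only (the names `SUPPORT_MATRIX` and `check_support` are hypothetical, and the exact per-architecture feature sets should be taken from TROUBLESHOOTING.md, not from this sketch); it shows how a runtime error such as "Sparse BF16 MLA is not supported on SM90" maps back to the matrix:

```python
# Hypothetical sketch of an SM-architecture support matrix. The entries are
# illustrative assumptions, not the authoritative FlashMLA feature table.
SUPPORT_MATRIX = {
    "sm90":  {"dense_bf16"},
    "sm100": {"dense_bf16", "sparse_bf16", "fp8"},
    "sm120": {"dense_bf16", "sparse_bf16"},
}

def check_support(arch: str, mode: str) -> None:
    """Raise a descriptive error when a mode is unavailable on an arch."""
    supported = SUPPORT_MATRIX.get(arch)
    if supported is None:
        raise ValueError(f"Unknown architecture: {arch!r}")
    if mode not in supported:
        raise RuntimeError(f"{mode} is not supported on {arch.upper()}")

check_support("sm100", "sparse_bf16")   # passes silently
try:
    check_support("sm90", "sparse_bf16")
except RuntimeError as e:
    print(e)  # prints: sparse_bf16 is not supported on SM90
```

Checking the matrix up front turns a late kernel-launch failure into an immediate, readable error.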

TENSOR_SHAPES.md

  • Shape notation legend
  • Complete function signatures with shape annotations:
    • flash_mla_with_kvcache
    • get_mla_metadata
    • flash_mla_prefill
    • mha_fwd_kvcache
  • ASCII diagrams for paged KV cache layout
  • Common shape mismatch errors and fixes
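The paged KV cache layout that TENSOR_SHAPES.md diagrams can be summarized in a few lines of plain Python. This is a sketch under assumed conventions (a per-sequence `block_table` row of physical block IDs and a fixed `page_size`; the helper name `gather_kv_position` is hypothetical), not the actual FlashMLA indexing code:

```python
# Sketch: map a logical token index into a paged KV cache.
# Assumes block_table_row lists the physical block IDs for one sequence,
# in logical order, with page_size tokens per block. Names are illustrative.
def gather_kv_position(block_table_row, page_size, token_idx):
    """Return (physical_block_id, offset_within_block) for a logical token."""
    logical_block = token_idx // page_size
    if logical_block >= len(block_table_row):
        raise IndexError(
            f"token {token_idx} needs logical block {logical_block}, "
            f"but block_table has only {len(block_table_row)} entries"
        )
    return block_table_row[logical_block], token_idx % page_size

# Sequence stored in physical blocks 7, 3, 9 with 64 tokens per block:
print(gather_kv_position([7, 3, 9], 64, 130))  # prints: (9, 2)
```

Many of the shape-mismatch errors the reference covers reduce to this mapping: the block table's batch dimension and its blocks-per-sequence capacity must agree with the sequence lengths being attended over.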

FAQ.md

  • 10 frequently asked questions with detailed answers
  • Topics: paged mode, prefill stage, GPU support, sparse attention, framework integration, performance expectations, variable-length batching, FP8 quantization
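For the variable-length batching topic, the common pattern (used by varlen attention APIs generally; the exact FlashMLA interface may differ) is to pack sequences back to back and describe them with cumulative sequence lengths. A minimal sketch:

```python
# Sketch of cumulative sequence lengths ("cu_seqlens") for variable-length
# batching: sequence i occupies tokens [cu[i], cu[i+1]) in the packed buffer.
# This is the generic varlen convention, not a specific FlashMLA signature.
from itertools import accumulate

def cu_seqlens(seq_lens):
    """Prefix sums with a leading zero, one boundary per packed sequence."""
    return [0, *accumulate(seq_lens)]

print(cu_seqlens([3, 5, 2]))  # prints: [0, 3, 8, 10]
```

Packing this way avoids padding every sequence to the batch maximum, which is why varlen kernels ask for boundaries instead of a rectangular batch.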

Related Issues

This documentation addresses questions raised in issues deepseek-ai#101, #108, #110, #113, #116, #119, #121, #124, and #126.

Test Plan

  • Verified all markdown renders correctly on GitHub
  • Cross-referenced with existing README to ensure consistency
  • Validated technical accuracy against FlashMLA source code and issues

🤖 Generated with Claude Code

Add comprehensive documentation to help users:

- TROUBLESHOOTING.md: GPU compatibility matrix (SM90/SM100/SM120),
  build error solutions, common runtime errors, and workarounds
- TENSOR_SHAPES.md: Complete tensor shape reference for all major
  functions with ASCII diagrams and common error fixes
- FAQ.md: Answers to frequently asked questions based on GitHub issues

These documents address recurring questions from issues deepseek-ai#101, deepseek-ai#108,
deepseek-ai#110, deepseek-ai#113, deepseek-ai#116, deepseek-ai#119, deepseek-ai#121, deepseek-ai#124, deepseek-ai#126 and should reduce support
burden while improving developer onboarding.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
