Releases: HKUDS/RAG-Anything

v1.3.1

21 May 07:28

What's Changed

  • Fix duplicate detection for text-bearing insert_content_list calls by deferring doc_status creation until after LightRAG ainsert runs.
  • Apply the same deferral to process_document_complete for parsed documents with text.
  • Keep early doc_status creation for multimodal-only content that does not call ainsert.
  • Add regression coverage for content-list and parsed-document duplicate handling.
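
One plausible reading of the ordering change above can be sketched as follows. Every name here (`FakeRAG`, `insert_early_status`, `insert_deferred_status`, the in-memory `doc_status` dict) is hypothetical and is not RAG-Anything or LightRAG code; the sketch only shows how a status record written before the insert can make the insert's own duplicate check reject a first-time document.

```python
class FakeRAG:
    """Hypothetical stand-in that skips any doc_id already present in doc_status."""

    def __init__(self):
        self.doc_status = {}
        self.inserted = []

    def insert(self, doc_id, text):
        if doc_id in self.doc_status:
            return False  # treated as a duplicate; nothing is indexed
        self.inserted.append((doc_id, text))
        self.doc_status[doc_id] = "processed"
        return True


def insert_early_status(rag, doc_id, text):
    # Status is written first, so the insert sees its own record as a duplicate.
    rag.doc_status[doc_id] = "pending"
    return rag.insert(doc_id, text)


def insert_deferred_status(rag, doc_id, text):
    # The insert runs first; a status record is only ensured afterwards.
    inserted = rag.insert(doc_id, text)
    rag.doc_status.setdefault(doc_id, "processed")
    return inserted
```

With the deferred variant, the first call for a document is indexed and only a genuine second call is reported as a duplicate.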

Validation

  • PYTHONPATH=. .venv/bin/python -m pytest tests/test_insert_content_list.py tests/testparser_wiring.py tests/testparser_kwargs.py -q
  • uvx ruff check raganything/processor.py tests/test_insert_content_list.py raganything/__init__.py
  • .venv/bin/python -m compileall raganything/processor.py tests/test_insert_content_list.py raganything/__init__.py

v1.3.0

06 May 07:38

What's Changed

⚠️ Behavior changes worth noting

  • DoclingParser now uses the Docling Python API instead of shelling out to the docling CLI. This means:
    • You now need pip install docling to use it (the docling executable on PATH alone is no longer sufficient).
    • The env={...} kwarg on DoclingParser parse methods is still accepted for compatibility but is now ignored — set the relevant environment variables in the parent process or pass _get_converter kwargs (artifacts_path, table_mode, …).
    • <file_stem>.json and <file_stem>.md artifacts written under <output_dir>/<file_stem>/docling/ are still produced, but via export_to_dict() / export_to_markdown() rather than the CLI serializer — the logical content is the same but the files are not byte-identical.
    • check_installation() now tests Python importability rather than probing the CLI on PATH.
  • MineruParser subprocess calls now run with a default timeout (configurable) and raise TimeoutError instead of hanging indefinitely.
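
Two of the behavior changes above, checking Python importability instead of probing PATH and bounding a subprocess with a timeout, can be illustrated with the standard library alone. This is a minimal sketch, not the parsers' actual code: `is_importable` and `run_with_timeout` are hypothetical helper names, and the conversion from `subprocess.TimeoutExpired` to the built-in `TimeoutError` is an assumption about how such a wrapper might surface the error.

```python
import importlib.util
import subprocess
import sys


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def run_with_timeout(cmd, timeout_s):
    """Run cmd and raise the built-in TimeoutError if it exceeds timeout_s seconds."""
    try:
        return subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before raising TimeoutExpired.
        raise TimeoutError(f"command exceeded {timeout_s}s: {cmd}") from exc


if __name__ == "__main__":
    print("docling importable:", is_importable("docling"))
    result = run_with_timeout([sys.executable, "-c", "print('done')"], timeout_s=30)
    print(result.stdout.strip())
```

An importability check of this kind passes only when the package is installed in the running interpreter's environment, which is why an executable on PATH alone no longer counts.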

✨ New features

  • feat(parser): add remote URL support for DoclingParser by @bueno12223 in #195
  • feat(omml): add OMML equation extraction utility for DOCX documents (closes #259) by @Abdeltoto in #262
  • feat: add MiniMax provider support by @octo-patch in #264
  • feat(examples): make LLM and vision model names configurable via env vars by @zhangzhenfei in #231
  • feat: add Ollama integration example (closes #118) by @jwchmodx in #238

🛠 Refactor / performance

  • refactor(parser): replace Docling CLI subprocess with Python API (closes #222) by @Abdeltoto in #261

🐛 Bug fixes

  • fix: create doc_status even when LightRAG lacks multimodal insert args (closes #244) by @DeepaliPaspule in #255
  • fix: prevent crashes from uninitialized LightRAG, env-var stripping, and parser cleanup by @jwchmodx in #240
  • fix: add timeout parameter to MinerU subprocess to prevent indefinite hang (#172) by @peterCheng123321 in #254
  • fix: pass entity_chunks_storage and relation_chunks_storage to all merge_nodes_and_edges calls (#241) by @peterCheng123321 in #250
  • fix: handle messages= kwarg in vision_model_func (insert_content_list_example) (#28) by @peterCheng123321 in #252
  • fix: forward system_prompt parameter in aquery_with_multimodal (#257) by @kuishou68 in #258
  • fix(examples): preserve embedding kwargs with partial by @txhno in #263
  • fix: demote misleading LibreOffice 'not found' warning to debug (closes #230) by @jwchmodx in #237
  • fix: strip <think> tags from modal processor fallback responses (closes #159) by @jwchmodx in #236
  • fix: create example log directory correctly by @haosenwang1018 in #242
  • fix(init): remove duplicate __all__ assignment (#267) by @kuishou68 in #268
  • fix: improve PDF parser handling by @davidangularme in #243
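
As an illustration of the `<think>` tag fix listed above, a reasoning block can be removed from a model response with a regular expression before the remaining text is used. This is a sketch under assumptions: `strip_think_tags` is a hypothetical name, and the project's own implementation may differ.

```python
import re

# Non-greedy match so several separate <think>...</think> blocks are removed one by one.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
_ORPHAN_TAG = re.compile(r"</?think>", re.IGNORECASE)


def strip_think_tags(text):
    """Remove <think>...</think> blocks and any unpaired think tags from text."""
    cleaned = _THINK_BLOCK.sub("", text)
    cleaned = _ORPHAN_TAG.sub("", cleaned)
    return cleaned.strip()


if __name__ == "__main__":
    raw = '<think>plan the answer\nstep by step</think>\n{"caption": "a bar chart"}'
    print(strip_think_tags(raw))
```

The `re.DOTALL` flag matters here because reasoning blocks usually span several lines.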

Full Changelog: v1.2.10...v1.3.0

v1.2.10

24 Mar 07:49

What's Changed

  • fix: use a single docling command for json and md formats by @wkpark in #198
  • fix: normalize MinerU 2.0 field names for backward compatibility (#89) by @teamauresta in #202
  • feat: add vLLM backend integration by @teamauresta in #201
  • Fix potential path traversal and local file read vulnerabilities by @RinZ27 in #197
  • feat(parser): add optional PaddleOCR backend by @SaqlainXoas in #199
  • fix: prevent same-name file collision in parser output directories (#51) by @teamauresta in #203
  • feat: add get_version() helper by @haosenwang1018 in #214
  • feat: support environment variables in parsers by @wkpark in #210
  • chore: export new public APIs in __init__.py by @Jah-yee in #219
  • test: expand coverage for core config, utils, and batch parser by @Jah-yee in #218
  • feat: add custom parser plugin system (closes #151) by @Jah-yee in #215
  • feat: add processing events and callbacks system by @Jah-yee in #217
  • fix(examples): use openai_embed.func to prevent double EmbeddingFunc wrapping by @syshin0116 in #223
  • fix: use valid CID font names for Chinese text rendering (fixes #24) by @Exploreunive in #226
  • fix: preserve full_entities metadata when adding multimodal entities by @Exploreunive in #228
  • fix: handle closed event loop in close() to eliminate atexit warning (fixes #135) by @Exploreunive in #225
  • feat: add retry and circuit breaker utilities for LLM calls (mitigates #172) by @Jah-yee in #216
  • feat: add multilingual prompt template support (closes #85) by @Jah-yee in #220
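
The retry and circuit breaker entry above names two general patterns. The following is a generic sketch of both, not the utilities shipped in this release: `retry`, `CircuitBreaker`, `CircuitOpenError`, and all parameter names are hypothetical.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the breaker is open."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_after_s=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after_s = reset_after_s
        self.clock = clock
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after_s:
                raise CircuitOpenError("circuit is open; call skipped")
            # Cool-down elapsed: allow one trial call, reopening on a single failure.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0
        return result


def retry(func, attempts=3, base_delay_s=0.5, sleep=time.sleep):
    """Call func up to `attempts` times with exponential backoff between tries."""
    for attempt in range(attempts):
        try:
            return func()
        except CircuitOpenError:
            raise  # retrying cannot help while the breaker is open
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay_s * 2 ** attempt)
```

The two compose naturally: wrapping `breaker.call(llm_func)` inside `retry` retries transient failures, while the breaker stops further attempts once a backend fails repeatedly.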

Full Changelog: v1.2.9...v1.2.10

v1.2.9

13 Jan 11:55
dc97594

What's Changed

Full Changelog: v1.2.8...v1.2.9

v1.2.8

22 Sep 03:12

What's Changed

  • Add RAGAnything processing to LightRAG's webui by @hzywhite in #97
  • Add RAGAnything processing to LightRAG's webui by @hzywhite in #113
  • fix: replace __del__ with atexit to fix RAGAnything cleanup warning by @liz-in-tech in #106
  • feat: Add support for Chinese characters in PDF generation by @hongdongjian in #103
  • Feat: LM Studio integration example and uv implementation by @LaansDole in #99
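
The cleanup fix above moves teardown to `atexit`. As a general illustration of that pattern, and not the project's code, a class can register an idempotent `close` method with `atexit` when it is constructed. `ManagedResource` is a hypothetical name.

```python
import atexit


class ManagedResource:
    """Hypothetical resource whose cleanup runs at interpreter exit via atexit."""

    def __init__(self):
        self.closed = False
        # atexit callbacks run during normal interpreter shutdown.
        # Note: registration keeps a reference to this object until exit.
        atexit.register(self.close)

    def close(self):
        # Idempotent, so an explicit close() followed by the atexit call is harmless.
        if self.closed:
            return
        self.closed = True


if __name__ == "__main__":
    resource = ManagedResource()
    resource.close()  # explicit cleanup; the atexit call later becomes a no-op
    print(resource.closed)
```

Finalizers run at an unpredictable point, possibly while the interpreter is already tearing down modules, which is a common source of shutdown warnings that an `atexit` hook avoids.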

Full Changelog: v1.2.7...v1.2.8

v1.2.7

15 Aug 12:19
79078b2

What's Changed

Full Changelog: v1.2.6...v1.2.7

v1.2.6

31 Jul 10:36

What's Changed

  • Update .gitignore to include AI-related files and directories by @BenjaminX in #68
  • Add Batch Processing and Enhanced Markdown Features by @ShorthillsAI in #64

Full Changelog: v1.2.5...v1.2.6

v1.2.5

24 Jul 07:01

What's Changed

  • Direct Content List Insertion by @LarFii in #62
  • Comprehensive Optimization of Multimodal Chunk Processing in RAGAnything by @LarFii in #65

Full Changelog: v1.2.4...v1.2.5

v1.2.4

21 Jul 18:25
9f9fb68

What's Changed

Full Changelog: v1.2.3...v1.2.4

v1.2.3

15 Jul 10:37

What's Changed

  • fix: _read_output_files could not reliably locate the path of the md file by @nssai001 in #47
  • Update some MinerU 2.0 parameter settings by @liseri in #53

Hot Fix

Full Changelog: v1.2.2...v1.2.3