Feature Request: Support for Multiple LLM Models (Text + Vision) #365

@diroverflow

Issue Description

Currently, the memu-bot configuration system supports only a single LLM, set via the customModel field. This is a significant limitation for users who want to:

  1. Use one model for text reasoning (e.g., GLM-4.7, Claude Opus)
  2. Use a separate model for vision/multimodal tasks (e.g., GLM-4.6V, GPT-4V)

With the current single-model architecture, users must trade off:

  • Text model quality against vision capabilities
  • Cost against capability (a cheaper text model vs. a more capable vision model)

Proposed Solution

Option A: Separate Configuration Objects (Recommended)

{
  "llmProvider": "custom",
  "customApiKey": "...",
  "customBaseUrl": "...",
  "customModel": "glm-4.7",
  
  "visionProvider": "custom",
  "visionApiKey": "your-vision-api-key",
  "visionBaseUrl": "...",
  "visionModel": "glm-4v",
  
  "autoSelectModel": true
}

Behavior:

  • Requests that contain images are routed to visionModel automatically
  • Text-only requests use the primary customModel
  • Backward compatible: configs without vision fields keep sending every request to customModel
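
The behavior above could be sketched as a small pure function. This is only an illustration: pickModelName is a hypothetical helper, and treating autoSelectModel: false as "always use the primary model" is an assumption, since the proposal does not define the flag's semantics.

```javascript
// Hypothetical helper: returns the model name a request would use.
// Assumption: autoSelectModel === false disables routing entirely.
function pickModelName(config, hasImage) {
  const routingEnabled = config.autoSelectModel !== false;
  if (hasImage && routingEnabled && config.visionModel) {
    return config.visionModel;
  }
  // Text-only tasks, and configs without vision fields, use the primary model.
  return config.customModel;
}

const legacy = { customModel: 'glm-4.7' };
const dual = { customModel: 'glm-4.7', visionModel: 'glm-4v', autoSelectModel: true };

console.log(pickModelName(legacy, true)); // glm-4.7 (existing config, unchanged behavior)
console.log(pickModelName(dual, true));   // glm-4v
console.log(pickModelName(dual, false));  // glm-4.7
```

Keeping the decision in a function that takes the config as an argument makes the backward-compatibility claim easy to check in a unit test.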

Option C: Minimal Change - Single Vision Field

{
  "llmProvider": "custom",
  "customApiKey": "...",
  "customBaseUrl": "...",
  "customModel": "glm-4.7",
  
  "visionModel": "glm-4v"
}
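
Under Option C the vision model would presumably reuse the primary provider, API key, and base URL. A minimal sketch of that inheritance follows; expandVisionConfig is a hypothetical helper (not part of memu-bot), and the key and URL values are placeholders.

```javascript
// Hypothetical helper: fills missing vision* fields from the primary settings.
// Returns null when no visionModel is configured.
function expandVisionConfig(config) {
  if (!config.visionModel) return null;
  return {
    provider: config.visionProvider || config.llmProvider,
    model: config.visionModel,
    apiKey: config.visionApiKey || config.customApiKey,
    baseUrl: config.visionBaseUrl || config.customBaseUrl,
  };
}

const optionC = {
  llmProvider: 'custom',
  customApiKey: 'primary-key',                 // placeholder value
  customBaseUrl: 'https://example.invalid/v1', // placeholder URL
  customModel: 'glm-4.7',
  visionModel: 'glm-4v',
};

const vision = expandVisionConfig(optionC);
console.log(vision.model);   // glm-4v
console.log(vision.baseUrl); // https://example.invalid/v1 (inherited)
```

With this shape, an Option A config passes through unchanged and an Option C config is expanded to the same four fields, so the routing code only has to handle one form.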

Implementation Details

Changes Required

  1. Configuration Schema Update (config/settings.json)

    • Add visionProvider, visionModel, visionApiKey, visionBaseUrl fields

  2. LLM Service Layer

    • Detect task type (text-only vs. multimodal)
    • Route requests to the appropriate model based on configuration
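
As a rough illustration of the schema change in step 1, the new fields could be checked when the configuration is loaded. The function name checkVisionFields and the specific rules are assumptions made for this example; the proposal itself only lists the field names.

```javascript
const VISION_FIELDS = ['visionProvider', 'visionModel', 'visionApiKey', 'visionBaseUrl'];

// Hypothetical validator: returns a list of human-readable problems.
// An empty list means the vision-related part of the config looks usable.
function checkVisionFields(config) {
  const problems = [];
  for (const field of VISION_FIELDS) {
    const value = config[field];
    if (value !== undefined && (typeof value !== 'string' || value.trim() === '')) {
      problems.push(`${field} must be a non-empty string when present`);
    }
  }
  // Assumed rule: the other vision fields are meaningless without a model.
  const anySet = VISION_FIELDS.some(f => config[f] !== undefined);
  if (anySet && config.visionModel === undefined) {
    problems.push('visionModel is required when any vision field is set');
  }
  return problems;
}

console.log(checkVisionFields({ customModel: 'glm-4.7' })); // [] (no vision fields to check)
console.log(checkVisionFields({ visionApiKey: 'k' }));      // reports the missing visionModel
```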

Task Detection Logic

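// Pick the LLM settings for a request: the vision model when any message
// carries an image part, otherwise the primary text model.
// `config` is the loaded settings object.
function selectLLM(messages) {
  // content may be a plain string (text-only), so only scan array content.
  // OpenAI-style content parts use 'image_url' rather than 'image'.
  const hasImage = messages.some(msg =>
    Array.isArray(msg.content) &&
    msg.content.some(c => c.type === 'image' || c.type === 'image_url')
  );

  // Keying on visionModel also covers Option C, where only that field is set.
  if (hasImage && config.visionModel) {
    return {
      provider: config.visionProvider || config.llmProvider,
      model: config.visionModel,
      apiKey: config.visionApiKey || config.customApiKey,
      baseUrl: config.visionBaseUrl || config.customBaseUrl
    };
  }

  return {
    provider: config.llmProvider,
    model: config.customModel,
    apiKey: config.customApiKey,
    baseUrl: config.customBaseUrl
  };
}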
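
To show where the selection would be consumed, here is a sketch that turns an object shaped like selectLLM's return value into an HTTP request. It assumes the configured endpoint is OpenAI-compatible (a /chat/completions route with bearer-token auth), which the issue does not state; buildChatRequest is a hypothetical name, and the key and URL are placeholders.

```javascript
// Hypothetical helper: builds fetch() arguments from a selection object
// of the form { provider, model, apiKey, baseUrl }.
// Assumption: baseUrl points at an OpenAI-compatible API.
function buildChatRequest(selection, messages) {
  const base = selection.baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  return {
    url: `${base}/chat/completions`,
    options: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        Authorization: `Bearer ${selection.apiKey}`,
      },
      body: JSON.stringify({ model: selection.model, messages }),
    },
  };
}

const request = buildChatRequest(
  { provider: 'custom', model: 'glm-4v', apiKey: 'placeholder-key', baseUrl: 'https://example.invalid/v1/' },
  [{ role: 'user', content: 'Describe the attached image.' }]
);
console.log(request.url); // https://example.invalid/v1/chat/completions
// The request could then be sent with: fetch(request.url, request.options)
```

Building the request separately from sending it keeps the routing decision testable without network access.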

Priority

Medium-High (Enhances usability for power users)...
