Streaming Inference Example (No GPU)

You can stream audio without a GPU by using orpheus-cpp, a llama.cpp-based backend for the Orpheus TTS model that runs entirely on the CPU.

  1. Install orpheus-cpp

    pip install orpheus-cpp
  2. Install llama-cpp-python

    Linux/Windows

    pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/cpu

    macOS with Apple Silicon

    pip install llama-cpp-python --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/metal
  3. Run the example below:

    from scipy.io.wavfile import write
    from orpheus_cpp import OrpheusCpp
    import numpy as np
    
    orpheus = OrpheusCpp(verbose=False, lang="en")
    
    text = "I really hope the project deadline doesn't get moved up again."
    buffer = []
    for i, (sr, chunk) in enumerate(orpheus.stream_tts_sync(text, options={"voice_id": "tara"})):
        buffer.append(chunk)
        print(f"Generated chunk {i}")
    
    # Each chunk is a 2-D array; join them along the sample axis,
    # then drop the channel axis before writing the 24 kHz WAV file.
    audio = np.concatenate(buffer, axis=1)
    write("output.wav", 24_000, audio.squeeze())
  4. WebRTC Streaming Example:

    python -m orpheus_cpp

    2025-03-26_10-37-56.mp4