Clarify usage of InferenceOptions for advanced audio parameters in model.generate()

First, thank you for open-sourcing this project.

**Issue**
I'd like to suggest an improvement to the README. It provides a great quickstart for text-to-speech generation, but it doesn't mention how to adjust the advanced acoustic parameters (such as `noise_temperature`, `acoustic_cfg_scale`, or `num_flow_matching_steps`) via the Python API.

Users who see these settings in the Gradio demo (or in blog posts) might try passing them directly as keyword arguments to `model.generate()`, which raises a `TypeError` (unexpected keyword argument).
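
To illustrate the failure mode without the library installed, here is a minimal sketch. `InferenceOptionsSketch` and `generate_sketch` are hypothetical stand-ins, not the project's code, and the default values are invented. It only shows why a function that takes a single options object rejects the same settings when they are passed as loose keyword arguments.

```python
from dataclasses import dataclass


@dataclass
class InferenceOptionsSketch:
    """Hypothetical stand-in for the real InferenceOptions; defaults are invented."""
    noise_temperature: float = 0.9
    num_flow_matching_steps: int = 10


def generate_sketch(text, inference_options=None):
    """Hypothetical stand-in for model.generate(): settings arrive only via the options object."""
    options = inference_options or InferenceOptionsSketch()
    return {"text": text, "noise_temperature": options.noise_temperature}


# Passing a setting directly as a keyword argument fails, because the
# function signature has no parameter with that name.
try:
    generate_sketch("Hello", noise_temperature=0.95)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Wrapping the same setting in the options object works.
result = generate_sketch(
    "Hello",
    inference_options=InferenceOptionsSketch(noise_temperature=0.95),
)
```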

**Suggested Solution**
It would be incredibly helpful to add a short section to the README (perhaps under "Run Inference") demonstrating how to import and use the InferenceOptions dataclass to customize the voice generation.

Example snippet that could be added:
```python
import torch
import torchaudio
from tada.modules.encoder import Encoder
from tada.modules.tada import TadaForCausalLM, InferenceOptions

device = "cuda"
# ... (load encoder, model, and prompt as usual) ...

# Configure human voice dynamics and flow matching steps
custom_options = InferenceOptions(
    noise_temperature=0.95,         # Higher = more emotional/varied micro-expressions
    acoustic_cfg_scale=2.5,         # Adherence to the reference voice tone
    duration_cfg_scale=1.5,         # Adherence to the pacing/rhythm of the reference
    num_flow_matching_steps=15      # Higher = better audio fidelity
)

output = model.generate(
    prompt=prompt,
    text="Your text here.",
    inference_options=custom_options
)
```
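
Since `InferenceOptions` is a dataclass, variants for side-by-side listening could presumably be derived with `dataclasses.replace`. The sketch below uses a hypothetical stand-in class (field names taken from the snippet above, defaults invented) rather than the real import, so treat it as an illustration of the pattern only.

```python
from dataclasses import dataclass, replace


@dataclass
class InferenceOptionsSketch:
    """Hypothetical stand-in; the real class and its defaults may differ."""
    noise_temperature: float = 0.9
    acoustic_cfg_scale: float = 1.6
    num_flow_matching_steps: int = 10


base = InferenceOptionsSketch(num_flow_matching_steps=15)

# Derive one variant per temperature while keeping every other field fixed.
variants = [replace(base, noise_temperature=t) for t in (0.7, 0.85, 0.95)]

for options in variants:
    print(options)
```

Each variant would then be passed as `inference_options=` in its own `model.generate()` call.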

I ran into this while trying to make the voice output sound less rigid and more natural. Discovering the `InferenceOptions` object solved my issue, so surfacing it in the docs should help other developers get the most out of the model.
