Commit 11735db

Update README.md
1 parent 106f59c commit 11735db

1 file changed

Lines changed: 28 additions & 1 deletion

README.md

@@ -78,13 +78,40 @@ To represent images, create a column of SQL type `text` in your table and store

 # Query Language

-ThalamusDB supports SQL queries with semantic filter predicates. Specifically, ThalamusDB supports two types of semantic filters:
+ThalamusDB supports SQL queries with semantic filter predicates. Specifically, ThalamusDB supports two types of semantic filters (both must appear in the SQL `WHERE` clause):

 | Operator | Semantics |
 | --- | --- |
 | `NLfilter([Column], [Condition])` | Filters rows based on a condition in natural language |
 | `NLjoin([Column in Table 1], [Column in Table 2], [Condition])` | Filters row pairs using the join condition in natural language |
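
To make the two operator signatures concrete, here is a minimal sketch of one query per operator, written as Python string constants. The tables `cars` and `reviews`, their columns, and the quoting of the condition as a SQL string literal are assumptions made for illustration; only the operator names, their argument order, and their placement in the `WHERE` clause come from the text above.

```python
# Hypothetical schema for illustration only: cars(id, pic), reviews(id, body).

# Semantic filter: one column plus a natural-language condition.
FILTER_QUERY = (
    "SELECT COUNT(*) FROM cars "
    "WHERE NLfilter(pic, 'the car in the picture is red');"
)

# Semantic join: one column from each table plus a natural-language condition.
JOIN_QUERY = (
    "SELECT COUNT(*) FROM cars, reviews "
    "WHERE NLjoin(cars.pic, reviews.body, "
    "'the review describes the car in the picture');"
)


def uses_semantic_operator_in_where(sql: str) -> bool:
    """Return True if NLfilter or NLjoin appears after the WHERE keyword."""
    _, _, where_clause = sql.partition(" WHERE ")
    return "NLfilter(" in where_clause or "NLjoin(" in where_clause
```

The helper is only a string check for this sketch; it does not parse SQL and is not part of ThalamusDB.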

+# Configuring Models
+
+ThalamusDB works with models from various providers. Users specify which models to use for specific data types in a model configuration file. The configuration file also lets users configure models for specific operators (e.g., by setting the `temperature` parameter or `reasoning_effort`). You can find an example configuration file in this repository at `config/models.json`.
+
+The model configuration file contains a dictionary with a single field, `models`, that stores a list of model configurations. Each list entry is a dictionary with three fields:
+- `modalities`: a list of data modalities the model can process (a subset of "text", "image", and "audio").
+- `priority`: if multiple models can be used to serve a request, ThalamusDB prefers the ones with higher priority.
+- `kwargs`: describes the parameter settings used for each semantic operator (parameters include the model ID).
+
+The `kwargs` field is a dictionary that contains two fields: `filter` and `join`. Each field contains the settings (a mapping from parameter names to values) that are used when calling the language model for the corresponding semantic operator (semantic filter or join). The following entry is an example model configuration, setting up both semantic operators to use the GPT-5 Mini model:
+
+```json
+{
+  "modalities": ["text", "image"], "priority": 10,
+  "kwargs": {
+    "filter": {
+      "model": "gpt-5-mini",
+      "reasoning_effort": "minimal"
+    },
+    "join": {
+      "model": "gpt-5-mini",
+      "reasoning_effort": "minimal"
+    }
+  }
+}
+```
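
As a rough illustration of how such entries fit into a complete file, the sketch below wraps two entries in the top-level `models` list and picks the settings for one operator. The second entry, its model ID `example-audio-model`, and the helper `select_kwargs` are hypothetical, and the selection rule (highest `priority` among the models that list the requested modality) is an assumption based on the field descriptions above, not ThalamusDB's actual code.

```python
import json

# Illustrative file content. The audio entry and its model ID are hypothetical.
CONFIG_TEXT = """
{
  "models": [
    {
      "modalities": ["text", "image"], "priority": 10,
      "kwargs": {
        "filter": {"model": "gpt-5-mini", "reasoning_effort": "minimal"},
        "join": {"model": "gpt-5-mini", "reasoning_effort": "minimal"}
      }
    },
    {
      "modalities": ["audio"], "priority": 5,
      "kwargs": {
        "filter": {"model": "example-audio-model"},
        "join": {"model": "example-audio-model"}
      }
    }
  ]
}
"""


def select_kwargs(config: dict, modality: str, operator: str) -> dict:
    """Return the settings of the highest-priority model for a modality.

    `operator` is either "filter" or "join", matching the `kwargs` fields.
    """
    candidates = [m for m in config["models"] if modality in m["modalities"]]
    if not candidates:
        raise ValueError(f"no configured model handles {modality!r}")
    best = max(candidates, key=lambda m: m["priority"])
    return best["kwargs"][operator]


config = json.loads(CONFIG_TEXT)
print(select_kwargs(config, "image", "filter"))
```

Keeping the per-operator settings as plain dictionaries means they can be passed through unchanged as keyword arguments to whatever client issues the model call.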
+
 # Approximate Processing

 ThalamusDB is designed for approximate processing. During query processing, ThalamusDB periodically displays approximate results. These results are calculated based on evaluating semantic operators on a subset of the data. When displaying approximate results, ThalamusDB distinguishes two query types:
