Commit 6038e9b

Merge pull request #59 from linkml/claude/binary-enums-documentation-V0piC
Add documentation on preferring binary enums over booleans
2 parents d490e77 + 3e045e5 commit 6038e9b

2 files changed: +172 -0 lines changed
Lines changed: 168 additions & 0 deletions
@@ -0,0 +1,168 @@
# Prefer Binary Enums Over Booleans

When modeling data that has two possible states, it is tempting to use a boolean type (`true`/`false`). In many cases, however, a two-element enumeration (binary enum) is the better choice. This document explains why and when to prefer binary enums over booleans in your LinkML schemas.

## The Case for Binary Enums

The [Tidy Design Principles](https://design.tidyverse.org/boolean-strategies.html) from the tidyverse project give several reasons to prefer enums even when there are only two choices.

### 1. Extensibility

If you later discover a third (or fourth, or fifth) option, a boolean forces a breaking change: the slot's type has to be replaced. With an enum, you add a new permissible value and existing data stays valid.

**Example:** Consider a data submission status. You might initially think "submitted" or "not submitted" covers it:

```yaml
# Boolean approach - seems simple at first
slots:
  is_submitted:
    range: boolean
```

But what about "pending review", "rejected", or "withdrawn"? With a boolean, you're stuck. With an enum, you simply add new values:

```yaml
# Enum approach - extensible
enums:
  SubmissionStatus:
    permissible_values:
      SUBMITTED:
      NOT_SUBMITTED:
      PENDING_REVIEW:  # Easy to add later
      REJECTED:        # Easy to add later
```
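
To show how the enum replaces the boolean slot, here is a minimal sketch. The slot name `submission_status` is hypothetical and does not appear in the original example; `SubmissionStatus` is the enum defined above.

```yaml
# Hypothetical slot name; the range points at the SubmissionStatus enum.
slots:
  submission_status:
    range: SubmissionStatus
    required: true
```

A record that says `submission_status: SUBMITTED` remains valid if a value such as `WITHDRAWN` is added to the enum later, which is the extensibility argument in practice.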

### 2. Clarity of Intent

Boolean values often have asymmetric clarity. `something = TRUE` tells you what *will* happen, but `something = FALSE` only tells you what *won't* happen, not what will happen instead.

**Example from the Tidy Design Principles:** Base R's `sort()` function uses `decreasing = TRUE/FALSE`. Reading `decreasing = FALSE` leaves ambiguity:
- Does it mean "sort in increasing order"?
- Or does it mean "don't sort at all"?

Compare this with `vctrs::vec_sort()`, which uses `direction = "asc"` or `direction = "desc"`. Both options are explicit and self-documenting.

### 3. Avoiding Cryptic Negations

Boolean parameters often require mental gymnastics to interpret, especially when the name describes only one of the two states.

**Example from the Tidy Design Principles:** Base R's `cut()` function has a `right` parameter:
- `right = TRUE`: right-closed, left-open intervals `(a, b]`
- `right = FALSE`: right-open, left-closed intervals `[a, b)`

A clearer design would be `open_side = c("right", "left")` or `bounds = c("[)", "(]")`.
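
Carried over to a LinkML schema, the same idea could be sketched as an enum whose values name the interval shape directly. The enum name `IntervalClosure` and its values are assumptions made for illustration; they are not taken from an existing schema.

```yaml
# Hypothetical enum: each value states which bound is included.
enums:
  IntervalClosure:
    permissible_values:
      LEFT_CLOSED:
        description: "Lower bound included, upper bound excluded: [a, b)"
      RIGHT_CLOSED:
        description: "Lower bound excluded, upper bound included: (a, b]"
```

Neither value needs to be read as the negation of the other, which is the point of the section.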

### 4. Self-Documenting Code

Enums make data and code more readable without needing to consult documentation.

```yaml
# What does this mean? Need to check docs.
sample:
  is_control: false
---
# Self-explanatory
sample:
  sample_type: EXPERIMENTAL
```
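
A schema behind the second record might look like the sketch below. The enum name `SampleType` and the `CONTROL` value are assumptions; the original example only shows `sample_type: EXPERIMENTAL`.

```yaml
# Hypothetical enum and slot backing the sample_type field.
enums:
  SampleType:
    permissible_values:
      CONTROL:
        description: Sample from the control group
      EXPERIMENTAL:
        description: Sample from the experimental group

slots:
  sample_type:
    range: SampleType
```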

### 5. The "Name the Scale" Pattern

When converting booleans to enums, consider naming the scale with values that represent points on it. This signals that intermediate values could be added.

**Example:** Instead of `verbose = TRUE/FALSE`, use:

```yaml
enums:
  VerbosityLevel:
    permissible_values:
      NONE:
        description: No output
      MINIMAL:
        description: Errors only
      NORMAL:
        description: Standard output
      VERBOSE:
        description: Detailed output
      DEBUG:
        description: All available information
```

## When Booleans Are Acceptable

Booleans remain appropriate in certain cases:

1. **Truly binary states**: The states are fundamentally and permanently binary (e.g., physical properties like "alive/dead" in certain contexts)

2. **Well-named parameters**: The parameter name makes both states crystal clear (e.g., `include_header` where `false` clearly means "exclude header")

3. **Toggle operations**: When the operation is clearly about enabling/disabling something (`enabled = true/false`)
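
The second and third cases might look like this in a schema. This is a minimal sketch: the slot names follow the examples in the list above, and the descriptions are assumptions.

```yaml
# Boolean slots where both states are obvious from the name.
slots:
  include_header:
    range: boolean
    description: Whether to write a header row; false means the header is omitted
  enabled:
    range: boolean
    description: Whether the feature is switched on
```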

## LinkML Examples

### Binary Enum Pattern

```yaml
# NOTE: confirm each `meaning` CURIE against its ontology before reuse.
enums:
  SortDirection:
    permissible_values:
      ASCENDING:
        description: Sort from lowest to highest
        meaning: SIO:001395  # ascending order
      DESCENDING:
        description: Sort from highest to lowest
        meaning: SIO:001396  # descending order

  StrandOrientation:
    permissible_values:
      FORWARD:
        description: Forward/plus strand
        meaning: SO:0000853  # forward_strand
      REVERSE:
        description: Reverse/minus strand
        meaning: SO:0000854  # reverse_strand

  PresenceStatus:
    permissible_values:
      PRESENT:
        description: The entity is present
      ABSENT:
        description: The entity is absent
      NOT_DETERMINED:
        description: Presence could not be determined
```

### Applying to Slots

```yaml
slots:
  sort_direction:
    range: SortDirection
    description: Direction for sorting results

  strand:
    range: StrandOrientation
    description: DNA strand orientation

  presence:
    range: PresenceStatus
    description: Whether the feature was detected
```
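
To complete the picture, the slots can be attached to a class. The class name `FeatureObservation` below is hypothetical and used only for illustration; `strand` and `presence` are the slots defined above.

```yaml
# Hypothetical class that reuses two of the enum-ranged slots.
classes:
  FeatureObservation:
    description: One observation of a genomic feature
    slots:
      - strand
      - presence
```

A record for this class would then read `strand: REVERSE` and `presence: NOT_DETERMINED`, with no boolean left to interpret.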

## Summary

| Aspect | Boolean | Binary Enum |
|--------|---------|-------------|
| Extensibility | Poor - breaking change to add states | Good - add new values easily |
| Clarity | Often asymmetric | Both values explicit |
| Documentation | Often requires external docs | Self-documenting |
| Ontology mapping | No per-value mapping | Supports `meaning` annotations |
| Future-proofing | Risky | Safe |

When in doubt, prefer a two-element enum. The small additional effort pays dividends in clarity, maintainability, and extensibility.

## References

- [Tidy Design Principles: Prefer an enum, even if only two choices](https://design.tidyverse.org/boolean-strategies.html)
- [Tidy Design Principles: Explicit Strategies](https://design.tidyverse.org/explicit-strategies.html)
- [Tidy Design Principles: Extract strategies into objects](https://design.tidyverse.org/strategy-objects.html)

docs/index.md

Lines changed: 4 additions & 0 deletions
@@ -10,6 +10,10 @@ A collection of commonly used value sets
- [Agentic IDE Support](how-to-guides/agentic-ide-support.md)
- [Sync UniProt Species](how-to-guides/sync-uniprot-species.md)

## Explanations

- [Prefer Binary Enums Over Booleans](explanations/binary-enums-vs-booleans.md) - Why two-element enums are often better than boolean types

Note: this schema consists ONLY of enums, so it is normal
that classes and slots are empty.
