Skip to content

Pack fill_gap into alpha_idx#1238

Merged
taj-p merged 9 commits into
linebender:mainfrom
taj-p:tajp/reduceStripSize
Sep 21, 2025
Merged

Pack fill_gap into alpha_idx#1238
taj-p merged 9 commits into
linebender:mainfrom
taj-p:tajp/reduceStripSize

Conversation

@taj-p

@taj-p taj-p commented Sep 21, 2025

Copy link
Copy Markdown
Contributor

Fixes #1213

See #1206 (comment)

Since Strip has an alignment of 4 bytes (due to alpha_idx), the contribution of fill_gap is an additional 4 bytes. This means Strip becomes 12 bytes or 5.3 strips per cache line.

If we instead use the last bit of alpha_idx as fill_gap, Strip becomes 8 bytes or 8 Strips per cache line and much less to allocate.

The main motivations for the packing are to:

  • Reduce the cost of Recordings and Strip allocation
  • Improve cache line utilization

When I tried this on Vello Hybrid, I saw no change in performance for direct rendering, but a minor improvement (3-5% improvement) in rendering from recordings (likely due to Strip being cheaper to allocate and iterate).

@taj-p taj-p added the C-sparse-strips Applies to sparse strips variants of vello in general label Sep 21, 2025

@LaurenzV LaurenzV left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit unfortunate that this reduces the number of alpha values that can be stored by a lot, but I doubt that we will exceed 2GB under normal circumstances.

Comment thread sparse_strips/vello_common/src/strip.rs Outdated

/// Sets the alpha index.
///
/// Note that the largest value that can be stored in the alpha index is `u32::MAX - 1`, as the

@LaurenzV LaurenzV Sep 21, 2025

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't it u32::MAX / 2 (or u32::MAX << 1)?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops - fixed in bd97a68

Comment thread sparse_strips/vello_cpu/src/render.rs Outdated
.map(move |strip| {
let mut adjusted_strip = *strip;
adjusted_strip.alpha_idx += alpha_offset;
adjusted_strip.set_alpha_idx(adjusted_strip.alpha_idx().wrapping_add(alpha_offset));

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really want a wrapping add here? Wouldn't it be better to just use normal add so that at least in debug mode it panics?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. Thank you - nice catch! Fixed in a217c0c

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And fixed in hybrid in c0947d3

Comment thread sparse_strips/vello_common/src/strip.rs Outdated
@taj-p taj-p enabled auto-merge September 21, 2025 20:22
@taj-p taj-p added this pull request to the merge queue Sep 21, 2025
@taj-p taj-p removed this pull request from the merge queue due to a manual request Sep 21, 2025
@taj-p taj-p enabled auto-merge September 21, 2025 20:36
@taj-p taj-p added this pull request to the merge queue Sep 21, 2025
Merged via the queue into linebender:main with commit 6a5bdc0 Sep 21, 2025
17 checks passed
@taj-p taj-p deleted the tajp/reduceStripSize branch September 21, 2025 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

C-sparse-strips Applies to sparse strips variants of vello in general

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Evaluate whether to merge alpha_idx and fill_gap in Strip into a single packed field

2 participants