Skip to content

feat: add AddImageFromBytes API for in-memory image insertion#30

Open
mmonterroca wants to merge 4 commits intomasterfrom
feature/add-image-from-bytes
Open

feat: add AddImageFromBytes API for in-memory image insertion#30
mmonterroca wants to merge 4 commits intomasterfrom
feature/add-image-from-bytes

Conversation

@mmonterroca
Copy link
Copy Markdown
Owner

Summary

Add support for adding images from raw byte data without requiring file system access, addressing the use case described in #29.

Changes

New public API (ParagraphBuilder)

  • AddImageFromBytes(data []byte, format ImageFormat) — inline image from bytes
  • AddImageFromBytesWithSize(data, format, size) — with custom dimensions
  • AddImageFromBytesWithPosition(data, format, size, pos) — floating with positioning

New domain interface methods (Paragraph)

  • AddImageFromBytes, AddImageFromBytesWithSize, AddImageFromBytesWithPosition

Internal constructors (core)

  • NewImageFromBytes, NewImageFromBytesWithSize, NewImageFromBytesWithPosition

CLI handler optimization

  • applyImage() now uses AddImageFromBytes directly for base64 images instead of writing to a temp file and calling AddImage(path). Eliminates unnecessary disk I/O.

Tests & examples

  • Unit tests for all 3 core constructors (valid data, empty data, empty format)
  • Builder tests for all 3 new methods (error path validation)
  • Updated examples/08_images with a new section demonstrating in-memory image creation

Usage

// Image from bytes — no file system needed
builder.AddParagraph().
    AddImageFromBytes(pngBytes, domain.ImageFormatPNG).
    End()

// With custom size
builder.AddParagraph().
    AddImageFromBytesWithSize(pngBytes, domain.ImageFormatPNG, domain.NewImageSize(200, 150)).
    End()

Closes #29

Add support for adding images from raw byte data without requiring
file system access. This addresses the use case where users have
limited access to a file system (issue #29).

New public API methods on ParagraphBuilder:
- AddImageFromBytes(data, format)
- AddImageFromBytesWithSize(data, format, size)
- AddImageFromBytesWithPosition(data, format, size, pos)

New domain.Paragraph interface methods:
- AddImageFromBytes(data, format)
- AddImageFromBytesWithSize(data, format, size)
- AddImageFromBytesWithPosition(data, format, size, pos)

New internal constructors:
- core.NewImageFromBytes(id, data, format)
- core.NewImageFromBytesWithSize(id, data, format, size)
- core.NewImageFromBytesWithPosition(id, data, format, size, pos)

Includes unit tests and updated example 08_images.

Closes #29
…around

The CLI handler previously wrote base64 image data to a temporary file
and then called AddImage(path). Now it uses AddImageFromBytes directly,
eliminating unnecessary disk I/O and temp file cleanup.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ce026b2b52

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +333 to +335
if format == "" {
return nil, errors.InvalidArgument("NewImageFromBytes", "format", format, "image format is required")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Validate format for AddImageFromBytes inputs

NewImageFromBytes only checks that format is non-empty, so any arbitrary value (for example "jepg") is accepted and flows into package generation. Because MediaManager.detectContentType falls back to application/octet-stream for unknown extensions, this produces DOCX media entries with unrecognized content types instead of failing fast. This is a regression for base64 CLI images now routed through this path (applyImage), where invalid formats previously errored via AddImage/detectImageFormat; please reject unsupported formats (or normalize/validate against known constants) before creating the image.

Useful? React with 👍 / 👎.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the ability to add images directly from byte data, bypassing the need for temporary files when handling base64-encoded images. It adds several new methods to the Paragraph interface and ParagraphBuilder, along with corresponding implementations, unit tests, and an example. The review feedback suggests improving error reporting consistency by passing the invalid value to the error constructor and using a setter method for image positioning instead of direct field access via type assertion.

// NewImageFromBytes creates a new image from raw byte data.
func NewImageFromBytes(id string, data []byte, format domain.ImageFormat) (domain.Image, error) {
if len(data) == 0 {
return nil, errors.InvalidArgument("NewImageFromBytes", "data", nil, "image data cannot be empty")
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The third argument to errors.InvalidArgument should ideally be the invalid value itself for better error reporting, consistent with the check for format below.

Suggested change
return nil, errors.InvalidArgument("NewImageFromBytes", "data", nil, "image data cannot be empty")
return nil, errors.InvalidArgument("NewImageFromBytes", "data", data, "image data cannot be empty")

Comment on lines +380 to +381
docxImg := img.(*docxImage)
docxImg.position = pos
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While direct field access via type assertion is consistent with existing code in this file, using the SetPosition method (if available on the interface) or checking the error from the setter would be more robust.

Suggested change
docxImg := img.(*docxImage)
docxImg.position = pos
if err := img.SetPosition(pos); err != nil {
return nil, err
}

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds first-class support for embedding images from in-memory byte slices (no filesystem required), and updates the CLI to avoid temporary files when inserting base64 images—addressing the use case in #29.

Changes:

  • Introduces AddImageFromBytes* APIs on domain.Paragraph and ParagraphBuilder.
  • Adds core.NewImageFromBytes* constructors plus unit tests for these constructors.
  • Optimizes CLI base64 image handling to call AddImageFromBytes* directly (no temp file I/O), and updates the images example.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
internal/core/paragraph.go Implements paragraph-level AddImageFromBytes* methods and hooks them into image attachment.
internal/core/image.go Adds new in-memory image constructors (NewImageFromBytes*).
internal/core/image_test.go Adds unit tests covering the new constructors (valid bytes + error cases).
domain/paragraph.go Extends the public Paragraph interface with AddImageFromBytes*.
builder.go Adds builder chaining methods for AddImageFromBytes*.
builder_test.go Adds builder tests for error propagation on empty image data.
cmd/docxgo/handlers.go Updates CLI image handler to embed base64 images directly from bytes.
examples/08_images/main.go Demonstrates creating/embedding a PNG entirely in memory using the new API.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +329 to +336
func NewImageFromBytes(id string, data []byte, format domain.ImageFormat) (domain.Image, error) {
if len(data) == 0 {
return nil, errors.InvalidArgument("NewImageFromBytes", "data", nil, "image data cannot be empty")
}
if format == "" {
return nil, errors.InvalidArgument("NewImageFromBytes", "format", format, "image format is required")
}

Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewImageFromBytes only checks format != "" and then uses it to build the file extension/target. This allows unsupported/invalid formats (e.g. "xyz", ".png", uppercase) to flow through and produce media parts with application/octet-stream (or malformed names), which can lead to unreadable DOCX files. Consider normalizing (strings.ToLower, trim leading '.') and validating against the supported set (png/jpeg/gif/bmp/tiff/svg/webp, etc.), similar to detectImageFormat, and return InvalidArgument for unknown formats.

Copilot uses AI. Check for mistakes.
Comment on lines +342 to +356
copyData := make([]byte, len(data))
copy(copyData, data)

target := fmt.Sprintf("media/image%s.%s", id, format)

return &docxImage{
id: id,
format: format,
size: size,
originalSize: size,
data: copyData,
target: target,
description: "",
position: domain.DefaultImagePosition(),
}, nil
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewImageFromBytes clones data into copyData, but (*docxImage).Data() also returns a cloned slice. In the common flow (Paragraph.attachImage calls img.Data() and MediaManager.Add stores the passed slice), this results in an extra full-image copy vs ReadImageFromReader/NewImage and can double peak memory for large in-memory images. Consider adopting a single-copy strategy (e.g., store data directly like ReadImageFromReader does, or add an internal/raw accessor used by core/manager code paths) while still keeping external immutability guarantees.

Copilot uses AI. Check for mistakes.
Comment on lines +337 to +340
size, err := getImageDimensions(data)
if err != nil {
return nil, errors.Wrap(err, "NewImageFromBytes")
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewImageFromBytes calls getImageDimensions(data), which currently routes through strings.NewReader(string(data)) and retries decode; this forces an extra allocation proportional to image size and does unnecessary work for byte slices. Since this API is specifically for in-memory images, consider updating dimension detection to use bytes.NewReader(data) (and a single decode path) to avoid large allocations and improve performance.

Copilot uses AI. Check for mistakes.
Comment on lines +844 to 847
format := domain.ImageFormat(img.Format)
if format == "" {
format = domain.ImageFormatPNG
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

applyImage now accepts any non-empty img.Format string for base64 images and passes it to AddImageFromBytes*. Previously, unsupported formats would fail via NewImage()/detectImageFormat; now an invalid format like "xyz" will silently embed with an unknown extension/content type (likely producing a corrupted/unopenable DOCX). Please normalize/validate the format (lowercase, map jpg/jpeg, reject unknown) before calling the new APIs (or let AddImageFromBytes return a clear error).

Copilot uses AI. Check for mistakes.
Comment on lines +188 to +190
if err := p.attachImage(img, fmt.Sprintf("image%s.%s", id, format)); err != nil {
return nil, err
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These methods build sourceName from the input format (not from the created img). If NewImageFromBytes starts normalizing/validating formats (recommended), this can drift (e.g., input "JPG" normalized to "jpeg" but sourceName still uses "JPG"). Consider deriving the extension from img.Format() (or otherwise using the normalized value returned/used by the constructor) when calling attachImage, to ensure media naming/content type stays consistent.

Copilot uses AI. Check for mistakes.
- Validate format against known image formats, reject unsupported values
- Normalize format (lowercase, trim dots, map jpg->jpeg, tiff->tif)
- Extract shared normalizeImageFormat() used by both detectImageFormat
  and NewImageFromBytes
- Fix getImageDimensions to use bytes.NewReader instead of
  strings.NewReader(string(data)) to avoid unnecessary allocation
- Remove double copy in NewImageFromBytes (Data() already returns copy)
- Use img.Format() for sourceName in paragraph methods to stay
  consistent after normalization
- Fix error arg: pass data instead of nil in InvalidArgument
- Add tests for invalid format rejection and format normalization
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return nil, errors.Wrap(err, "Paragraph.AddImageFromBytesWithPosition")
}

if err := p.attachImage(img, fmt.Sprintf("image%s.%s", id, format)); err != nil {
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In AddImageFromBytesWithPosition, the media filename is built using the raw format argument. This can diverge from the image’s normalized format (img.Format()), leading to inconsistent extensions (e.g., "JPG" vs "jpeg") or even double-dots if callers pass a leading ".". Use the created image’s format (or the normalized value returned by the constructor) when generating sourceName to keep content-type/extension consistent across the three AddImageFromBytes* methods.

Suggested change
if err := p.attachImage(img, fmt.Sprintf("image%s.%s", id, format)); err != nil {
if err := p.attachImage(img, fmt.Sprintf("image%s.%s", id, img.Format())); err != nil {

Copilot uses AI. Check for mistakes.
Comment on lines +349 to +356
return &docxImage{
id: id,
format: normalized,
size: size,
originalSize: size,
data: data,
target: target,
description: "",
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NewImageFromBytes stores the caller-provided data slice directly in the docxImage. Because slices are references, callers can mutate the underlying bytes after insertion, which can corrupt the embedded image (and create data races if reused concurrently). Make a defensive copy of data (as NewImageFromPackage does) before storing it.

Copilot uses AI. Check for mistakes.
Comment on lines +493 to +502
t.Run("normalizes jpg to jpeg", func(t *testing.T) {
// Use PNG data but verify format normalization logic
data := createTestImageBytes(t, 50, 50)
img, err := NewImageFromBytes("img12d", data, "JPG")
if err != nil {
t.Fatalf("NewImageFromBytes() error = %v", err)
}
if img.Format() != domain.ImageFormatJPEG {
t.Errorf("Format() = %v, want jpeg", img.Format())
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "normalizes jpg to jpeg" test passes PNG-encoded bytes while declaring the format as JPG. This can mask real-world failures because the produced DOCX would be labeled as JPEG even though the payload is PNG. Generate actual JPEG bytes for this case (or change the assertion to only test string normalization without constructing an image from real bytes).

Copilot uses AI. Check for mistakes.
- Restore defensive copy of data in NewImageFromBytes to prevent
  caller mutation (consistent with NewImageFromPackage)
- Fix AddImageFromBytesWithPosition to use img.Format() instead of
  raw format parameter for sourceName consistency
- Fix jpg normalization test to use actual JPEG-encoded bytes instead
  of PNG bytes with JPG format declaration
@mmonterroca mmonterroca mentioned this pull request Apr 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

AddImageFromBytes

2 participants