Releases: alexferrari88/sbstck-dl

v0.7 - Archive Index Page Generation

03 Sep 08:11

This release introduces comprehensive archive index page generation functionality.

New Features

  • 🗂️ Archive Index Pages: Generate comprehensive index pages linking all downloaded posts with metadata
  • 📋 Multiple Formats: Support for HTML, Markdown, and Text archive formats
  • 🏷️ Rich Metadata: Includes post titles, publication dates, descriptions, cover images, and download timestamps
  • 📅 Smart Sorting: Posts automatically sorted by publication date (newest first)
  • 🔗 Relative Links: Archive pages use relative paths for optimal portability

Usage

# Download posts and create archive index page
go run . download --url https://example.substack.com --create-archive --output ./downloads

# Create archive in specific format
go run . download --url https://example.substack.com --create-archive --format md --output ./downloads

The archive page is generated as index.{format} in the output directory root, providing a comprehensive overview of all downloaded content.

Release v0.6.7

31 Jul 19:16

Final fix for Windows CI compatibility

This release completes the resolution of Windows CI issues:

Fixed:

  • Windows test timeouts (infinite select{} loops in test servers)
  • Cross-platform filename extraction compatibility
  • GitHub Actions workflow timeout configuration

Added file attachment download support in v0.6.4:

  • Download file attachments from Substack posts with --download-files
  • Filter by file extensions with --file-extensions
  • Customize attachment directory with --files-dir
  • Full integration with existing image download workflow

All tests now pass on Windows, macOS, and Linux. The CLI functionality remains unchanged.

Release v0.6.6

31 Jul 19:10

Complete fix for Windows CI test timeouts

  • Fixed all infinite timeout simulations in test servers
  • Added proper timeouts to prevent CI hanging
  • Improved GitHub Actions workflow with explicit test timeouts
  • No functional changes to the CLI tool - this is purely a CI/testing fix

Release v0.6.5

31 Jul 19:05

Hotfix for Windows CI test timeouts

  • Fixed Windows test hanging issue in file download test server
  • No functional changes to the CLI tool

Release v0.6.4

31 Jul 16:29

New features and improvements introduced in commit f50e111

v0.6.3

29 Jul 02:58

Bug Fixes

  • Fixed image URL replacement in HTML files: images were downloaded locally, but the HTML still linked to Substack CDN URLs instead of the local paths
  • Enhanced image URL collection to handle <a> and <source> tags in addition to <img> tags
  • Added comprehensive regression tests to prevent future occurrences

This release ensures that when using --download-images, all image references in the generated HTML files properly point to the local downloaded images rather than the original Substack CDN URLs.

v0.6.2 - Fix comma-separated URL fragments in srcset parsing

28 Jul 13:23

Bug Fixes

  • Fixed comma-separated URL fragments in srcset parsing: Resolves an issue where Substack CDN URLs containing commas in their parameters (like w_424,c_limit,f_auto,q_auto:good) were parsed incorrectly, causing malformed image paths in downloaded HTML
  • Improved srcset parsing robustness: Added regex-based parsing to handle complex URLs with embedded commas
  • Updated test coverage: Enhanced test cases to properly validate URL parsing with realistic HTTP URLs

Technical Details

  • Refactored parseSrcsetEntries() to use regex parsing for URLs with commas
  • Updated extractURLFromSrcset, extractAllURLsFromSrcset, and updateSrcsetAttribute functions
  • All tests now pass and real-world Substack posts download correctly
  • Verified fix works on multiple Substack publications

This release fixes a reported regression in which downloaded HTML contained malformed image URLs like:

server.local/substack/creator/images/post/image.png,images/post/image.png,f_webp,images/post/image.png

Images now correctly reference clean local paths like:

images/post-name/image.png
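
To show why splitting a srcset on every comma breaks these URLs, here is a minimal Go sketch of regex-based parsing. It is not the project's parseSrcsetEntries(); the `srcsetEntry` type, the `parseSrcset` function, and the pattern are assumptions for illustration. The sketch also assumes every candidate carries a width or density descriptor (such as `424w` or `2x`).

```go
package main

import (
	"fmt"
	"regexp"
)

// srcsetEntry is one image candidate: a URL plus its width or density descriptor.
type srcsetEntry struct {
	URL        string
	Descriptor string
}

// candidateRe matches "<url> <descriptor>" followed by a comma or end of input.
// A URL cannot contain whitespace, so commas inside the URL stay part of it.
var candidateRe = regexp.MustCompile(`(\S+)\s+(\d+(?:\.\d+)?[wx])\s*(?:,|$)`)

// parseSrcset extracts all candidates from a srcset attribute value.
func parseSrcset(srcset string) []srcsetEntry {
	var entries []srcsetEntry
	for _, m := range candidateRe.FindAllStringSubmatch(srcset, -1) {
		entries = append(entries, srcsetEntry{URL: m[1], Descriptor: m[2]})
	}
	return entries
}

func main() {
	srcset := "https://cdn.example/w_424,c_limit,f_auto/a.png 424w, " +
		"https://cdn.example/w_848,c_limit,f_auto/a.png 848w"
	for _, e := range parseSrcset(srcset) {
		fmt.Println(e.Descriptor, e.URL)
	}
}
```

A naive `strings.Split(srcset, ",")` would cut the first URL at `w_424`, which is the kind of fragment shown in the malformed example above.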

v0.6.1

28 Jul 12:43

Bug fix release

  • Fixed comma-separated URL bug in srcset attributes
  • Improved URL validation in srcset parsing to prevent malformed fragments
  • Added test-download folder to .gitignore

v0.6.0 - Image URL Replacement Fix

27 Jul 12:28

🔧 Bug Fix

Image URL Replacement Issue Fixed

Problem: When downloading Substack posts with images, the images were successfully downloaded locally but the HTML content still referenced Substack's CDN URLs instead of the local image paths.

Solution:

  • Fixed the image downloader to collect ALL URLs from each image element
  • Ensured all URL variants for the same image are mapped to the same local file path
  • Updated HTML content replacement to handle all image URL references

Changes

  • Enhanced image URL extraction to handle multiple URL sources per image
  • Added comprehensive test coverage for URL replacement functionality
  • Maintained backward compatibility with existing functionality

Impact

Downloaded Substack posts are now truly self-contained with all image references pointing to local files instead of external CDNs.

Full Changelog: v0.5.0...v0.6.0

v0.5.0: Local Image Downloading

17 Jul 14:09

🖼️ Local Image Downloading

This release introduces comprehensive image downloading functionality to preserve Substack posts with all their visual content locally.

✨ New Features

Image Downloading

  • --download-images flag to download all images locally with posts
  • --image-quality flag with three quality options:
    • high: 1456px width (best quality, larger files)
    • medium: 848px width (balanced quality/size)
    • low: 424px width (smaller files, mobile-optimized)
  • --images-dir flag to customize the image directory name (default: images)
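
The quality options above amount to a lookup from a flag value to a target width. A minimal Go sketch, assuming a hypothetical helper named `widthForQuality` (the project may structure this differently):

```go
package main

import "fmt"

// widthForQuality maps an --image-quality value to the target width in pixels
// listed in the release notes.
func widthForQuality(quality string) (int, error) {
	switch quality {
	case "high":
		return 1456, nil
	case "medium":
		return 848, nil
	case "low":
		return 424, nil
	default:
		return 0, fmt.Errorf("unknown image quality %q (want high, medium, or low)", quality)
	}
}

func main() {
	for _, q := range []string{"high", "medium", "low", "ultra"} {
		width, err := widthForQuality(q)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %dpx\n", q, width)
	}
}
```

Returning an error for unknown values, rather than silently falling back to a default, is a design choice of this sketch.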

Smart Content Processing

  • 🎯 Automatically detects and extracts images from complex Substack HTML structures
  • 📝 Updates HTML/Markdown content to reference local image paths
  • 🔗 Supports all Substack CDN patterns (substackcdn.com, substack-post-media.s3.amazonaws.com, legacy bucketeer domains)
  • 📁 Creates organized directory structure: {output}/images/{post-slug}/
  • 🛡️ Generates filesystem-safe filenames while preserving uniqueness

Error Handling & Reliability

  • ⚡ Integrates with existing rate limiting and retry logic
  • 🔄 Graceful handling of individual image download failures
  • 📊 Provides download success/failure summaries in verbose mode
  • ↩️ Full backwards compatibility - existing functionality unchanged

📚 Usage Examples

# Download posts with high-quality images (default)
sbstck-dl download --url https://example.substack.com --download-images

# Download with medium quality images
sbstck-dl download --url https://example.substack.com --download-images --image-quality medium

# Download with custom images directory name
sbstck-dl download --url https://example.substack.com --download-images --images-dir assets

# Download single post with images in markdown format
sbstck-dl download --url https://example.substack.com/p/post-title --download-images --format md

📂 Directory Structure

output/
├── 20231201_120000_post-title.html
└── images/
    └── post-title/
        ├── image1_1456x819.jpeg
        ├── image2_848x636.png
        └── image3_1272x720.webp

🔧 Technical Details

  • New Module: lib/images.go with comprehensive image processing
  • Test Coverage: 100% functionality coverage with real Substack HTML integration tests
  • Performance: Concurrent image downloading with configurable quality levels
  • Formats: Works with HTML, Markdown, and text output formats

💡 Why This Matters

This ensures your downloaded Substack posts remain fully accessible even if:

  • Images are deleted from Substack's CDN
  • The original Substack blog becomes unavailable
  • You need offline access to the complete content

🏗️ Development

  • Added 1,165 lines of new code across 7 files
  • Comprehensive test suite with real-world Substack content
  • Full integration with existing codebase patterns
  • Maintained 100% backwards compatibility

Full Changelog: v0.4.0...v0.5.0