Releases: alexferrari88/sbstck-dl
v0.7 - Archive Index Page Generation
This release introduces comprehensive archive index page generation functionality.
New Features
- 🗂️ Archive Index Pages: Generate comprehensive index pages linking all downloaded posts with metadata
- 📋 Multiple Formats: Support for HTML, Markdown, and Text archive formats
- 🏷️ Rich Metadata: Includes post titles, publication dates, descriptions, cover images, and download timestamps
- 📅 Smart Sorting: Posts automatically sorted by publication date (newest first)
- 🔗 Relative Links: Archive pages use relative paths for optimal portability
Usage
# Download posts and create archive index page
go run . download --url https://example.substack.com --create-archive --output ./downloads
# Create archive in specific format
go run . download --url https://example.substack.com --create-archive --format md --output ./downloads
The archive page is generated as index.{format} in the output directory root, providing a comprehensive overview of all downloaded content.
Release v0.6.7
Final fix for Windows CI compatibility
This release completes the resolution of Windows CI issues:
Fixed:
- Windows test timeouts (infinite select{} loops in test servers)
- Cross-platform filename extraction compatibility
- GitHub Actions workflow timeout configuration
Added file attachment download support in v0.6.4:
- Download file attachments from Substack posts with --download-files
- Filter by file extensions with --file-extensions
- Customize attachment directory with --files-dir
- Full integration with existing image download workflow
All tests now pass on Windows, macOS, and Linux. The CLI functionality remains unchanged.
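As a rough illustration of extension filtering and OS-independent filename extraction, the sketch below derives a file name from a URL and checks it against an allow-list. The function names fileNameFromURL and extensionAllowed are hypothetical, and treating an empty allow-list as "accept everything" is an assumption, not documented behavior of --file-extensions.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strings"
)

// fileNameFromURL returns the last path segment of rawURL.
// It uses package path (always "/"-separated) instead of path/filepath,
// so the result is the same on Windows, macOS, and Linux.
func fileNameFromURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	name := path.Base(u.Path)
	if name == "." || name == "/" {
		return "", fmt.Errorf("no file name in %q", rawURL)
	}
	return name, nil
}

// extensionAllowed reports whether name ends in one of the allowed
// extensions, ignoring case and an optional leading dot.
// Assumption: an empty allow-list accepts every file.
func extensionAllowed(name string, allowed []string) bool {
	if len(allowed) == 0 {
		return true
	}
	ext := strings.ToLower(strings.TrimPrefix(path.Ext(name), "."))
	for _, a := range allowed {
		if ext == strings.ToLower(strings.TrimPrefix(strings.TrimSpace(a), ".")) {
			return true
		}
	}
	return false
}

func main() {
	allowed := strings.Split("pdf,docx", ",")
	for _, raw := range []string{
		"https://example.com/files/report.PDF?download=1",
		"https://example.com/files/archive.zip",
	} {
		name, err := fileNameFromURL(raw)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(name, extensionAllowed(name, allowed))
	}
}
```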
Release v0.6.6
Complete fix for Windows CI test timeouts
- Fixed all infinite timeout simulations in test servers
- Added proper timeouts to prevent CI hanging
- Improved GitHub Actions workflow with explicit test timeouts
- No functional changes to the CLI tool - this is purely a CI/testing fix
Release v0.6.5
Hotfix for Windows CI test timeouts
- Fixed Windows test hanging issue in file download test server
- No functional changes to the CLI tool
Release v0.6.4
New features and improvements introduced in commit f50e111
v0.6.3
Bug Fixes
- Fixed image URL replacement in HTML files where images were downloaded locally but HTML still linked to Substack CDN URLs instead of local paths
- Enhanced image URL collection to handle <a> and <source> tags in addition to <img> tags
- Added comprehensive regression tests to prevent future occurrences
This release ensures that when using --download-images, all image references in the generated HTML files properly point to the local downloaded images rather than the original Substack CDN URLs.
v0.6.2 - Fix comma-separated URL fragments in srcset parsing
Bug Fixes
- Fixed comma-separated URL fragments in srcset parsing: Resolves issue where Substack CDN URLs containing commas in their parameters (like w_424,c_limit,f_auto,q_auto:good) were being incorrectly parsed, causing malformed image paths in downloaded HTML
- Improved srcset parsing robustness: Added regex-based parsing to handle complex URLs with embedded commas
- Updated test coverage: Enhanced test cases to properly validate URL parsing with realistic HTTP URLs
Technical Details
- Refactored parseSrcsetEntries() to use regex parsing for URLs with commas
- Updated extractURLFromSrcset, extractAllURLsFromSrcset, and updateSrcsetAttribute functions
- All tests now pass and real-world Substack posts download correctly
- Verified fix works on multiple Substack publications
This release fixes the regression reported in issue where downloaded HTML contained malformed image URLs like:
server.local/substack/creator/images/post/image.png,images/post/image.png,f_webp,images/post/image.png
Images now correctly reference clean local paths like:
images/post-name/image.png
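To show why splitting a srcset on commas fails and how a regex can avoid it, here is a small sketch. It is not the project's parseSrcsetEntries(); the parseSrcset function, the SrcsetEntry type, and the pattern are assumptions for illustration, and the pattern only handles absolute http(s) URLs that carry a width or density descriptor.

```go
package main

import (
	"fmt"
	"regexp"
)

// SrcsetEntry is one candidate from a srcset attribute.
type SrcsetEntry struct {
	URL        string
	Descriptor string // for example "424w" or "2x"
}

// srcsetEntryRe matches an absolute URL followed by whitespace and a
// descriptor. URLs cannot contain whitespace, so \S+ consumes the whole URL,
// embedded commas included, instead of stopping at the first comma.
var srcsetEntryRe = regexp.MustCompile(`(https?://\S+)\s+(\d+(?:\.\d+)?[wx])`)

// parseSrcset extracts URL/descriptor pairs from a srcset value.
// Entries without a descriptor are skipped in this simplified sketch.
func parseSrcset(srcset string) []SrcsetEntry {
	var entries []SrcsetEntry
	for _, m := range srcsetEntryRe.FindAllStringSubmatch(srcset, -1) {
		entries = append(entries, SrcsetEntry{URL: m[1], Descriptor: m[2]})
	}
	return entries
}

func main() {
	srcset := "https://substackcdn.com/image/fetch/w_424,c_limit,f_auto,q_auto:good/https%3A%2F%2Fexample.com%2Fa.png 424w, " +
		"https://substackcdn.com/image/fetch/w_848,c_limit,f_auto,q_auto:good/https%3A%2F%2Fexample.com%2Fa.png 848w"
	for _, e := range parseSrcset(srcset) {
		fmt.Println(e.Descriptor, e.URL)
	}
}
```

A naive strings.Split(srcset, ",") on the same input would yield fragments such as c_limit and f_auto, which is the shape of the malformed paths quoted above.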
v0.6.1
Bug fix release
- Fixed comma-separated URL bug in srcset attributes
- Improved URL validation in srcset parsing to prevent malformed fragments
- Added test-download folder to .gitignore
v0.6.0 - Image URL Replacement Fix
🔧 Bug Fix
Image URL Replacement Issue Fixed
Problem: When downloading Substack posts with images, the images were successfully downloaded locally but the HTML content still referenced Substack's CDN URLs instead of the local image paths.
Solution:
- Fixed the image downloader to collect ALL URLs from each image element (, , )
- Ensured all URL variants for the same image are mapped to the same local file path
- Updated HTML content replacement to handle all image URL references
Changes
- Enhanced image URL extraction to handle multiple URL sources per image
- Added comprehensive test coverage for URL replacement functionality
- Maintained backward compatibility with existing functionality
Impact
Downloaded Substack posts are now truly self-contained with all image references pointing to local files instead of external CDNs.
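The idea of mapping every URL variant of an image to one local file can be sketched as follows. This is a simplified illustration using plain string replacement, not the project's code; replaceImageURLs and the cdn.example URLs are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// replaceImageURLs rewrites every known remote URL in html to its local path.
// urlToLocal maps each URL variant of an image to the same local file.
func replaceImageURLs(html string, urlToLocal map[string]string) string {
	urls := make([]string, 0, len(urlToLocal))
	for u := range urlToLocal {
		urls = append(urls, u)
	}
	// Replace longer URLs first so a URL that is a prefix of another
	// does not leave a dangling suffix behind.
	sort.Slice(urls, func(i, j int) bool {
		if len(urls[i]) != len(urls[j]) {
			return len(urls[i]) > len(urls[j])
		}
		return urls[i] < urls[j]
	})
	for _, u := range urls {
		html = strings.ReplaceAll(html, u, urlToLocal[u])
	}
	return html
}

func main() {
	local := "images/post-name/image.png"
	variants := map[string]string{
		"https://cdn.example/w_424/image.png":  local,
		"https://cdn.example/w_848/image.png":  local,
		"https://cdn.example/w_1456/image.png": local,
	}
	html := `<a href="https://cdn.example/w_1456/image.png"><picture>` +
		`<source srcset="https://cdn.example/w_424/image.png 424w, https://cdn.example/w_848/image.png 848w">` +
		`<img src="https://cdn.example/w_1456/image.png"></picture></a>`
	fmt.Println(replaceImageURLs(html, variants))
}
```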
Full Changelog: v0.5.0...v0.6.0
v0.5.0: Local Image Downloading
🖼️ Local Image Downloading
This release introduces comprehensive image downloading functionality to preserve Substack posts with all their visual content locally.
✨ New Features
Image Downloading
- --download-images flag to download all images locally with posts
- --image-quality flag with three quality options:
  - high: 1456px width (best quality, larger files)
  - medium: 848px width (balanced quality/size)
  - low: 424px width (smaller files, mobile-optimized)
- --images-dir flag to customize the image directory name (default: images)
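The quality options above amount to a small lookup from name to target width. A minimal sketch, assuming that an empty value falls back to high (the usage examples describe high as the default); widthForQuality is a hypothetical name, not a function from the project.

```go
package main

import (
	"fmt"
	"strings"
)

// widthForQuality maps an --image-quality value to the pixel widths
// listed in the release notes. Assumption: "" means the default, high.
func widthForQuality(quality string) (int, error) {
	switch strings.ToLower(strings.TrimSpace(quality)) {
	case "", "high":
		return 1456, nil
	case "medium":
		return 848, nil
	case "low":
		return 424, nil
	default:
		return 0, fmt.Errorf("unknown image quality %q (want high, medium, or low)", quality)
	}
}

func main() {
	for _, q := range []string{"high", "medium", "low", "ultra"} {
		w, err := widthForQuality(q)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %dpx\n", q, w)
	}
}
```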
Smart Content Processing
- 🎯 Automatically detects and extracts images from complex Substack HTML structures
- 📝 Updates HTML/Markdown content to reference local image paths
- 🔗 Supports all Substack CDN patterns (substackcdn.com, substack-post-media.s3.amazonaws.com, legacy bucketeer domains)
- 📁 Creates organized directory structure: {output}/images/{post-slug}/
- 🛡️ Generates filesystem-safe filenames while preserving uniqueness
Error Handling & Reliability
- ⚡ Integrates with existing rate limiting and retry logic
- 🔄 Graceful handling of individual image download failures
- 📊 Provides download success/failure summaries in verbose mode
- ↩️ Full backwards compatibility - existing functionality unchanged
📚 Usage Examples
# Download posts with high-quality images (default)
sbstck-dl download --url https://example.substack.com --download-images
# Download with medium quality images
sbstck-dl download --url https://example.substack.com --download-images --image-quality medium
# Download with custom images directory name
sbstck-dl download --url https://example.substack.com --download-images --images-dir assets
# Download single post with images in markdown format
sbstck-dl download --url https://example.substack.com/p/post-title --download-images --format md
📂 Directory Structure
output/
├── 20231201_120000_post-title.html
└── images/
    └── post-title/
        ├── image1_1456x819.jpeg
        ├── image2_848x636.png
        └── image3_1272x720.webp
🔧 Technical Details
- New Module: lib/images.go with comprehensive image processing
- Test Coverage: 100% functionality coverage with real Substack HTML integration tests
- Performance: Concurrent image downloading with configurable quality levels
- Formats: Works with HTML, Markdown, and text output formats
💡 Why This Matters
This ensures your downloaded Substack posts remain fully accessible even if:
- Images are deleted from Substack's CDN
- The original Substack blog becomes unavailable
- You need offline access to the complete content
🏗️ Development
- Added 1,165 lines of new code across 7 files
- Comprehensive test suite with real-world Substack content
- Full integration with existing codebase patterns
- Maintained 100% backwards compatibility
Full Changelog: v0.4.0...v0.5.0