IsbatBInHossain/walrus

WALRUS: Append-Only Log Engine

WALRUS (Write-Ahead Log for Reliable Unix Storage) is a thread-safe, crash-resilient, binary append-only log engine written in C.

This project implements the foundational storage mechanics used in modern databases and message brokers (like PostgreSQL's WAL or Kafka). It guarantees durability, handles concurrent writes, and self-heals from torn writes caused by sudden power loss.

Core Features

  • Crash Consistency & Self-Healing: Each entry's header stores a CRC32 checksum of its payload. On startup, the engine scans the log from the beginning. If it finds a missing magic number or a checksum mismatch (a sign that power was lost mid-write), it truncates the corrupted trailing bytes with ftruncate().
  • Thread Safety: Appends are serialized with a POSIX mutex (pthread_mutex_t). Reads use pread(), which takes an explicit offset, so concurrent point reads never move the file descriptor's shared offset.
  • O(1) Point Reads: During initialization, the engine builds an in-memory index mapping sequential Log IDs to physical byte offsets on disk.
  • Strict Durability: Every append operation forces physical disk persistence using fsync() before returning success to the caller.
  • Binary Serialization: Each payload is preceded by a fixed 24-byte packed header, so the on-disk layout does not depend on compiler-inserted struct padding, which can differ between architectures.
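
The recovery pass described under Crash Consistency & Self-Healing can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: recover_log and crc32_bytes are hypothetical names, and the CRC-32 variant (reflected polynomial 0xEDB88320) is an assumption, since the exact checksum routine is not specified here.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define LOG_MAGIC 0xDEADC0DEu

typedef struct {
  uint64_t timestamp;
  uint32_t magic;
  uint32_t checksum;
  uint32_t payload_size;
  uint32_t flags;
} __attribute__((packed)) LogHeader;

/* Bitwise CRC-32 (assumed variant: reflected polynomial 0xEDB88320). */
static uint32_t crc32_bytes(const void *data, size_t len) {
  const uint8_t *p = data;
  uint32_t crc = 0xFFFFFFFFu;
  for (size_t i = 0; i < len; i++) {
    crc ^= p[i];
    for (int bit = 0; bit < 8; bit++)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return ~crc;
}

/* Hypothetical helper: walks entries from offset 0 and truncates the file
 * at the first entry that is incomplete or fails validation.
 * Returns the number of valid entries, or -1 on an I/O or memory error. */
static long recover_log(int fd) {
  assert(fd >= 0);
  struct stat st;
  if (fstat(fd, &st) != 0) return -1;

  off_t offset = 0;
  long valid = 0;
  while (offset + (off_t)sizeof(LogHeader) <= st.st_size) {
    LogHeader h;
    if (pread(fd, &h, sizeof h, offset) != (ssize_t)sizeof h) return -1;
    if (h.magic != LOG_MAGIC) break;          /* not a header: stop here */

    off_t end = offset + (off_t)sizeof h + (off_t)h.payload_size;
    if (end > st.st_size) break;              /* payload was cut short */

    void *buf = malloc(h.payload_size ? h.payload_size : 1);
    if (buf == NULL) return -1;
    ssize_t n = pread(fd, buf, h.payload_size, offset + (off_t)sizeof h);
    if (n < 0) { free(buf); return -1; }
    int ok = n == (ssize_t)h.payload_size &&
             crc32_bytes(buf, h.payload_size) == h.checksum;
    free(buf);
    if (!ok) break;                           /* checksum mismatch */

    offset = end;
    valid++;
  }

  /* Drop everything after the last entry that validated. */
  if (offset < st.st_size && ftruncate(fd, offset) != 0) return -1;
  return valid;
}
```

A real engine would likely record each validated offset in its in-memory index during this same pass, so that recovery and index construction share one scan.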

On-Disk Binary Format

Data is written to disk sequentially. Every entry consists of a 24-byte packed header followed immediately by the raw payload bytes.

typedef struct {
  uint64_t timestamp;    // 8 bytes: Unix timestamp
  uint32_t magic;        // 4 bytes: Magic number (0xDEADC0DE)
  uint32_t checksum;     // 4 bytes: CRC32 hash of the payload
  uint32_t payload_size; // 4 bytes: Size of the actual data
  uint32_t flags;        // 4 bytes: Reserved / padding for 8-byte alignment
} __attribute__((packed)) LogHeader;

Disk Layout: [Header 24B] [Payload X Bytes] [Header 24B] [Payload Y Bytes] ...
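
One way to keep the 24-byte layout from drifting is to check it at compile time. The sketch below repeats the header definition and adds static assertions on its size and field offsets; next_entry_offset is a hypothetical helper, not part of the documented API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
  uint64_t timestamp;
  uint32_t magic;
  uint32_t checksum;
  uint32_t payload_size;
  uint32_t flags;
} __attribute__((packed)) LogHeader;

/* The build fails if the header ever stops matching the on-disk format. */
static_assert(sizeof(LogHeader) == 24, "LogHeader must be 24 bytes");
static_assert(offsetof(LogHeader, timestamp) == 0, "timestamp at byte 0");
static_assert(offsetof(LogHeader, magic) == 8, "magic at byte 8");
static_assert(offsetof(LogHeader, checksum) == 12, "checksum at byte 12");
static_assert(offsetof(LogHeader, payload_size) == 16, "payload_size at byte 16");
static_assert(offsetof(LogHeader, flags) == 20, "flags at byte 20");

/* Hypothetical helper: where the entry after the one at `offset` begins. */
static uint64_t next_entry_offset(uint64_t offset, uint32_t payload_size) {
  return offset + sizeof(LogHeader) + payload_size;
}
```

Packing fixes the field offsets but not the byte order: a struct written this way stores its fields in the host's endianness, so a log written on a little-endian machine would need byte swapping to be read on a big-endian one.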

API Overview

// Opens the log file, verifies integrity, builds the in-memory index, and truncates torn writes.
LogEngine *log_open(const char *filepath);

// Thread-safe append. Hashes the payload, writes the header and data, and calls fsync().
// Returns a unique, sequential Log ID.
int64_t log_append(LogEngine *engine, const void *data, size_t size);

// O(1) lookup using the in-memory index. Validates the CRC32 checksum before returning.
// Caller is responsible for freeing the allocated buffer_out.
int log_read(LogEngine *engine, uint64_t id, void **buffer_out, size_t *size_out);

// Closes file descriptors, frees the offset index, and destroys mutexes.
void log_close(LogEngine *engine);
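
To illustrate how a mutex-guarded append, fsync(), an offset index, and pread() fit together, here is a simplified sketch. It is not the WALRUS implementation: SketchEngine, sketch_append, and sketch_read are hypothetical names, a 4-byte length prefix stands in for the 24-byte header, and checksum validation is omitted for brevity.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical engine state; the real LogEngine layout is not shown here. */
typedef struct {
  int fd;                /* opened read/write by the caller */
  pthread_mutex_t lock;  /* serializes appends and index access */
  off_t *offsets;        /* offsets[id] = byte offset where entry `id` starts */
  size_t count, capacity;
  off_t end;             /* offset at which the next entry is written */
} SketchEngine;

/* Appends one entry, returning its sequential ID or -1 on failure. */
static int64_t sketch_append(SketchEngine *e, const void *data, uint32_t size) {
  assert(e != NULL && data != NULL);
  int64_t id = -1;
  pthread_mutex_lock(&e->lock);

  if (e->count == e->capacity) {             /* grow the index geometrically */
    size_t cap = e->capacity ? e->capacity * 2 : 16;
    off_t *grown = realloc(e->offsets, cap * sizeof *grown);
    if (grown == NULL) goto out;
    e->offsets = grown;
    e->capacity = cap;
  }

  /* Length prefix (assumption: stands in for the real header), then payload. */
  if (pwrite(e->fd, &size, sizeof size, e->end) != (ssize_t)sizeof size) goto out;
  if (pwrite(e->fd, data, size, e->end + (off_t)sizeof size) != (ssize_t)size) goto out;
  if (fsync(e->fd) != 0) goto out;           /* durable before we report success */

  e->offsets[e->count] = e->end;
  e->end += (off_t)sizeof size + (off_t)size;
  id = (int64_t)e->count++;
out:
  pthread_mutex_unlock(&e->lock);
  return id;
}

/* Reads entry `id` into a freshly allocated buffer; the caller frees *out. */
static int sketch_read(SketchEngine *e, uint64_t id, void **out, size_t *size_out) {
  assert(e != NULL && out != NULL && size_out != NULL);

  /* The lock covers only the index lookup, because realloc may move the array. */
  pthread_mutex_lock(&e->lock);
  int known = id < e->count;
  off_t off = known ? e->offsets[id] : 0;
  pthread_mutex_unlock(&e->lock);
  if (!known) return -1;

  /* pread takes an explicit offset, so readers never disturb each other. */
  uint32_t size;
  if (pread(e->fd, &size, sizeof size, off) != (ssize_t)sizeof size) return -1;
  void *buf = malloc(size ? size : 1);
  if (buf == NULL) return -1;
  if (pread(e->fd, buf, size, off + (off_t)sizeof size) != (ssize_t)size) {
    free(buf);
    return -1;
  }
  *out = buf;
  *size_out = size;
  return 0;
}
```

In this sketch the index is updated only after fsync() succeeds, so an ID is never handed out for data that might not survive a crash. Holding the mutex across fsync() keeps the design simple at the cost of append throughput.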

Quick Start

Build

To compile the project with the test suite:

make

Basic Usage

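#include "log.h"
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    LogEngine *engine = log_open("data.log");
    if (engine == NULL) { // Assumes log_open() returns NULL on failure
        fprintf(stderr, "Failed to open data.log\n");
        return 1;
    }

    // Append data (+1 stores the terminating NUL so the payload prints as a C string)
    const char *msg = "System boot successful.";
    int64_t id = log_append(engine, msg, strlen(msg) + 1);

    // Read data (assumes a negative ID signals a failed append)
    void *buffer;
    size_t size;
    if (id >= 0 && log_read(engine, (uint64_t)id, &buffer, &size) == 0) {
        printf("Read Log %" PRId64 ": %s\n", id, (char *)buffer);
        free(buffer);
    }

    log_close(engine);
    return 0;
}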

Dependencies

  • POSIX-compliant operating system (Linux/macOS)
  • GCC or Clang
  • pthreads

About

WALRUS: A thread-safe, crash-resilient, append-only binary log engine in C with durable writes, checksum-based recovery, and O(1) indexed reads.
