Playwright-based scraper that fetches all FortiOS CLI config command
reference pages from docs.fortinet.com and saves them as structured
Markdown files with Pandoc Grid Tables.
The config/ directory contains the pre-scraped output, organized as:
config/<major>/<patch>/<section>/<config_command>.md
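The layout above can be turned into a path with a few lines of Python. This is only a sketch, not code taken from the scraper: the function name `output_path` is hypothetical, and it assumes `<major>` is the release line (for example `7.4`) while `<patch>` is the full version string (for example `7.4.0`). Check the actual config/ tree before relying on that split.

```python
from pathlib import Path


def output_path(version: str, section: str, command: str, root: str = "config") -> Path:
    """Build <root>/<major>/<patch>/<section>/<config_command>.md for one page."""
    parts = version.split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a version like '7.4.0', got {version!r}")
    # Assumption: <major> is the release line ("7.4"), <patch> the full version ("7.4.0").
    major = ".".join(parts[:2])
    return Path(root) / major / version / section / f"{command}.md"
```

For example, `output_path("8.0.0", "alertemail", "setting")` yields `config/8.0/8.0.0/alertemail/setting.md` under these assumptions; the command name `setting` is used purely for illustration.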
See versions.yaml for the full list of supported versions. Currently: 7.4.x, 7.6.x, 8.0.x.
pip install -r requirements.txt
playwright install chromium
# Scrape a single version + section (quick test)
python scrape_cli_ref.py --version 8.0.0 --section alertemail
# Scrape one full version
python scrape_cli_ref.py --version 7.4.0
# Scrape everything (skip already-scraped files)
python scrape_cli_ref.py
# Force re-scrape
python scrape_cli_ref.py --force

pytest tests/ -v

- Python 3.11+
- Pandoc system binary:
  sudo dnf install pandoc (Fedora) or sudo apt install pandoc (Debian/Ubuntu)
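Before a long scrape it can help to confirm that both requirements are met. The helper below is a minimal sketch and not part of the repository: the name `missing_prerequisites` is hypothetical, and it only checks the Python version and whether a `pandoc` executable is on PATH. It does not verify the Playwright Chromium install.

```python
import shutil
import sys


def missing_prerequisites(min_python: tuple = (3, 11)) -> list:
    """Return human-readable descriptions of unmet requirements (empty if none)."""
    problems = []
    # Compare the running interpreter against the minimum version.
    if sys.version_info < min_python:
        wanted = ".".join(str(n) for n in min_python)
        problems.append(f"Python {wanted}+ is required")
    # shutil.which returns None when the executable is not found on PATH.
    if shutil.which("pandoc") is None:
        problems.append("pandoc not found on PATH")
    return problems
```

Calling `missing_prerequisites()` and printing each returned entry gives a quick preflight report; an empty list means both checks passed.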