Command Line Parameters

Michael X. Wang edited this page Apr 25, 2026 · 4 revisions

Usage

seqwin [OPTIONS]

Required inputs

Unless --download-only is used, you must provide:

  • one of --tar-taxa, --tar-paths, or --tar-dir (target genomes)
  • one of --neg-taxa, --neg-paths, or --neg-dir (non-target genomes)
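
As a rough illustration of the rule above, the sketch below assembles a minimal command with one target source and one non-target source. It uses only flags documented on this page, and the taxon names are borrowed from the examples further down. The command is printed rather than executed, so the snippet runs even where Seqwin is not installed.

```shell
# Assemble the argument list; quoting keeps each multi-word taxon a single argument.
set -- seqwin \
  --tar-taxa "Escherichia coli O157" \
  --neg-taxa "Salmonella enterica"

# Dry run: print one argument per line to check the quoting.
printf '%s\n' "$@"

# To actually run Seqwin, execute the assembled command:
# "$@"
```

Printing one argument per line makes it easy to spot a taxon name that was split on spaces because of missing quotes.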

Options

  1. Input selection
  2. Output options
  3. Signature options
  4. NCBI download options
  5. Miscellaneous

Input selection

--tar-taxa, -t

Target NCBI taxonomy name or ID. Must be an exact match. Requires the NCBI Datasets CLI to be installed.

  • Type: repeatable string option

  • Default: none

  • Example:

    -t "Escherichia coli O157" -t "Escherichia coli O26"

--neg-taxa, -n

Non-target NCBI taxonomy name or ID. Must be an exact match. Requires the NCBI Datasets CLI to be installed.

  • Type: repeatable string option

  • Default: none

  • Example:

    -n "Salmonella enterica" -n "Klebsiella pneumoniae"

--tar-paths

Text file containing paths to target genome FASTA files, one path per line. Gzipped FASTA is supported.

  • Type: path
  • Default: none

--neg-paths

Text file containing paths to non-target genome FASTA files, one path per line. Gzipped FASTA is supported.

  • Type: path
  • Default: none
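
One possible way to produce the path list files for --tar-paths and --neg-paths is sketched below. The directory layout, the file names, and the .fna / .fna.gz extensions are assumptions made for illustration; the placeholder files are empty and only exist so that the snippet runs on its own.

```shell
# Hypothetical layout: a scratch directory with two target and one non-target genome.
workdir=$(mktemp -d)
mkdir -p "$workdir/targets" "$workdir/non_targets"

# Empty placeholders so the sketch runs end to end; use real FASTA files in practice.
: > "$workdir/targets/t1.fna"
: > "$workdir/targets/t2.fna.gz"
: > "$workdir/non_targets/n1.fna"

# Write one path per line; sort for a stable order.
find "$workdir/targets" -type f \( -name '*.fna' -o -name '*.fna.gz' \) \
  | sort > "$workdir/tar_paths.txt"
find "$workdir/non_targets" -type f \( -name '*.fna' -o -name '*.fna.gz' \) \
  | sort > "$workdir/neg_paths.txt"

cat "$workdir/tar_paths.txt"

# The lists can then be passed to Seqwin (not executed here):
# seqwin --tar-paths "$workdir/tar_paths.txt" --neg-paths "$workdir/neg_paths.txt"
```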

--tar-dir

Directory containing target genome FASTA files. Only files directly inside the directory are used (non-recursive). Gzipped FASTA is supported.

  • Type: path
  • Default: none

--neg-dir

Directory containing non-target genome FASTA files. Only files directly inside the directory are used (non-recursive). Gzipped FASTA is supported.

  • Type: path
  • Default: none

Output options

--prefix

Existing parent path where the output directory will be created.

  • Type: path
  • Default: current working directory

--title, -o

Name of the output directory created under --prefix.

  • Type: string
  • Default: seqwin-out

--overwrite

Overwrite existing output files.

  • Type: flag
  • Default: False
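
Read together, --prefix and --title suggest that results land in a directory named after the title, directly under the prefix. The sketch below computes that location and checks the prefix before a run. The variable names and the title value are hypothetical, and the check is a convenience for the caller, not a description of Seqwin's own validation.

```shell
# Values as they would be passed to --prefix and --title (hypothetical choices).
prefix=$(mktemp -d)      # --prefix must already exist
title="ecoli-o157"

# Expected output location: the title directory under the prefix.
out_dir="${prefix%/}/$title"

if [ ! -d "$prefix" ]; then
  echo "prefix does not exist: $prefix" >&2
  exit 1
fi
if [ -e "$out_dir" ]; then
  echo "note: $out_dir already exists; consider --overwrite or another --title" >&2
fi

echo "output directory: $out_dir"

# The same values would then be passed on (not executed here):
# seqwin ... --prefix "$prefix" --title "$title"
```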

Signature options

--kmerlen, -k

K-mer length.

  • Type: integer
  • Default: 21

--windowsize, -w

Window size for minimizer sketch.

  • Type: integer
  • Default: 200

--penalty-th

Node penalty threshold, from 0 to 1. If not provided, Seqwin computes it automatically.

  • Type: float
  • Default: none
  • Valid range: 0 to 1

--no-mash

Do not run Mash to estimate the node penalty threshold; use minimizer sketches instead. This is much faster, but the estimate may be biased.

If Mash is not installed, Seqwin falls back to minimizer sketches automatically.

Only used when --penalty-th is not provided.

  • Type: flag
  • Default: False

--stringency, -s

Controls the sensitivity and specificity of output signatures.

Increasing this value generally yields fewer and shorter signatures, while improving their sensitivity and specificity.

Internally, Seqwin uses this setting to adjust the estimated node penalty threshold. Only used when --penalty-th is not provided.

  • Type: integer
  • Default: 5
  • Valid range: 0 to 10
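
Because the effect of --stringency depends on the data, comparing a few values can help. The sketch below prints one command per stringency level, each with its own --title so the runs do not share an output directory. The chosen values and the file names tar_paths.txt and neg_paths.txt are hypothetical. --penalty-th is left out on purpose, since --stringency is only used when it is not provided.

```shell
# Hypothetical sweep: three stringency levels, each with its own output directory.
for s in 3 5 7; do
  set -- seqwin \
    --tar-paths tar_paths.txt \
    --neg-paths neg_paths.txt \
    --stringency "$s" \
    --title "seqwin-s$s"

  # Dry run: print the command; replace this line with "$@" to execute it.
  echo "$@"
done
```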

--min-len

Minimum length of output signatures.

  • Type: integer
  • Default: 200

--max-len

Estimated maximum length of output signatures. If not provided, no explicit limit is applied.

  • Type: integer
  • Default: none
  • Constraint: must be greater than --min-len
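
The constraint between --min-len and --max-len can be checked before launching a long run. The helper below is a hypothetical pre-flight check written for this page, not part of Seqwin; the values passed to it are examples.

```shell
# Return 0 when the pair satisfies the documented constraint, 1 otherwise.
check_len() {
  min_len=$1
  max_len=$2
  if [ "$max_len" -gt "$min_len" ]; then
    return 0
  fi
  echo "--max-len ($max_len) must be greater than --min-len ($min_len)" >&2
  return 1
}

check_len 200 1000 && echo "ok: 200..1000"
check_len 200 200 || echo "rejected: 200..200"
```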

--no-blast

Do not evaluate signature sequences with BLAST.

If NCBI BLAST+ is not installed, Seqwin skips BLAST evaluation automatically.

  • Type: flag
  • Default: False

NCBI download options

--level

Limit downloads to genomes at or above this assembly level.

Possible values, from lowest to highest: contig, scaffold, chromosome, complete

  • Type: text
  • Default: contig
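
To make "at or above" concrete, the sketch below lists the assembly levels that a given --level value would admit, based only on the order stated above. The function name is hypothetical and the logic illustrates the documented semantics; it is not Seqwin's implementation.

```shell
# Documented order, lowest to highest.
levels="contig scaffold chromosome complete"

# Print every level at or above the one given (hypothetical helper).
levels_at_or_above() {
  keep=no
  out=""
  for lv in $levels; do
    if [ "$lv" = "$1" ]; then
      keep=yes
    fi
    if [ "$keep" = yes ]; then
      out="${out:+$out }$lv"
    fi
  done
  echo "$out"
}

levels_at_or_above scaffold   # prints: scaffold chromosome complete
```

With the default of contig, every level is admitted, so no filtering by assembly level takes place.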

--source

Genome source to download from.

Supported values: genbank, refseq

  • Type: text
  • Default: genbank

--annotated

Only include annotated genomes.

  • Type: flag
  • Default: False

--exclude-mag

Exclude metagenome-assembled genomes (MAGs).

  • Type: flag
  • Default: False

--no-gzip

Do not download genomes as gzipped FASTA.

  • Type: flag
  • Default: False

--download-only

Download genome sequences only; do not run the Seqwin analysis.

  • Type: flag
  • Default: False
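
The download options can be combined in a single call. The sketch below assembles a download-only command restricted to complete RefSeq assemblies with MAGs excluded. The taxon names are reused from the examples earlier on this page, and the particular combination of filters is only an illustration. The command is printed, not executed.

```shell
# Assemble a download-only command from flags documented on this page.
set -- seqwin \
  --tar-taxa "Escherichia coli O157" \
  --neg-taxa "Salmonella enterica" \
  --level complete \
  --source refseq \
  --exclude-mag \
  --download-only

# Dry run: print one argument per line.
printf '%s\n' "$@"

# To actually start the download, execute the assembled command:
# "$@"
```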

Miscellaneous

--seed

Random seed for reproducibility.

  • Type: integer
  • Default: 42

--threads, -p

Number of parallel processes or threads to use.

  • Type: integer
  • Default: 4

--version

Show Seqwin version and exit.

--help, -h

Show help message and exit.
