Command Line Parameters
seqwin [OPTIONS]
Unless --download-only is used, you must provide:
- one of --tar-paths or --tar-taxa (target genomes)
- one of --neg-paths or --neg-taxa (non-target genomes)
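As a minimal sketch of the rule above, the two commands below each supply one target source and one non-target source. The file names targets.txt and non_targets.txt are hypothetical, and the DRY_RUN variable is a shell idiom added for this example, not a Seqwin feature: by default it only prints each command.

```shell
# Dry run by default: "echo" prints the command instead of executing it.
# Set DRY_RUN to an empty string (DRY_RUN= ) to actually run Seqwin.
DRY_RUN=${DRY_RUN-echo}

# Taxon-based inputs (needs the NCBI Datasets CLI).
$DRY_RUN seqwin -t "Escherichia coli O157" -n "Salmonella enterica"

# File-based inputs; targets.txt and non_targets.txt are hypothetical file names.
$DRY_RUN seqwin --tar-paths targets.txt --neg-paths non_targets.txt
```

Note that echo drops the quotes when printing, so the taxon names appear unquoted in the dry-run output; they are still passed as single arguments in a real run.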
--tar-taxa, -t
Target NCBI taxonomy name or ID. Must be an exact match. Requires the NCBI Datasets CLI.
- Type: repeatable string option
- Default: none
- Example: -t "Escherichia coli O157" -t "Escherichia coli O26"
--neg-taxa, -n
Non-target NCBI taxonomy name or ID. Must be an exact match. Requires the NCBI Datasets CLI.
- Type: repeatable string option
- Default: none
- Example: -n "Salmonella enterica" -n "Klebsiella pneumoniae"
--tar-paths
Text file containing paths to target genome FASTA files, one path per line. Gzipped FASTA is supported.
- Type: path
- Default: none
--neg-paths
Text file containing paths to non-target genome FASTA files, one path per line. Gzipped FASTA is supported.
- Type: path
- Default: none
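One way to produce the path files described above is to list the FASTA files in a directory with find. This is only a sketch: the helper name list_fasta, the directory names, and the set of file extensions are assumptions made for the example, not anything Seqwin prescribes.

```shell
# Print FASTA files directly inside a directory (plain or gzipped), one path per line.
# The extensions matched here are an assumption; adjust them to your files.
list_fasta() {
  find "$1" -maxdepth 1 -type f \
    \( -name '*.fna' -o -name '*.fa' -o -name '*.fasta' \
       -o -name '*.fna.gz' -o -name '*.fa.gz' -o -name '*.fasta.gz' \) | sort
}

# Hypothetical directory names; replace with your own.
if [ -d genomes/targets ] && [ -d genomes/non_targets ]; then
  list_fasta genomes/targets     > targets.txt
  list_fasta genomes/non_targets > non_targets.txt
fi
```

The resulting files can then be passed as --tar-paths targets.txt --neg-paths non_targets.txt. When every genome of a group already sits in a single directory, --tar-dir and --neg-dir avoid this step.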
--tar-dir
Directory containing target genome FASTA files. Only files directly inside the directory are used (non-recursive). Gzipped FASTA is supported.
- Type: path
- Default: none
--neg-dir
Directory containing non-target genome FASTA files. Only files directly inside the directory are used (non-recursive). Gzipped FASTA is supported.
- Type: path
- Default: none
--prefix
Existing parent path where the output directory will be created.
- Type: path
- Default: current working directory
--title, -o
Name of the output directory created under --prefix.
- Type: string
- Default: seqwin-out
--overwrite
Overwrite existing output files.
- Type: flag
- Default: False
--kmerlen, -k
K-mer length.
- Type: integer
- Default: 21
--windowsize, -w
Window size for minimizer sketch.
- Type: integer
- Default: 200
--penalty-th
Node penalty threshold, from 0 to 1. If not provided, Seqwin computes it automatically.
- Type: float
- Default: none
- Valid range: 0 to 1
--no-mash
Do not run Mash to estimate the node penalty threshold; use minimizer sketches instead. This is much faster, but the estimate may be biased.
If Mash is not installed, Seqwin falls back to minimizer sketches automatically.
Only used when --penalty-th is not provided.
- Type: flag
- Default: False
--stringency, -s
Controls the sensitivity and specificity of output signatures.
Increasing this value generally yields fewer and shorter signatures, while improving their sensitivity and specificity.
Internally, Seqwin uses this setting to adjust the estimated node penalty threshold. Only used when --penalty-th is not provided.
- Type: integer
- Default: 5
- Valid range: 0 to 10
--min-len
Minimum length of output signatures.
- Type: integer
- Default: 200
--max-len
Estimated maximum length of output signatures. If not provided, no explicit limit is applied.
- Type: integer
- Default: none
- Constraint: must be greater than --min-len
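The integer constraints documented for --stringency and --max-len can be checked in a wrapper script before launching a long run. The sketch below is illustrative only: in_range is a hypothetical helper, the values are made up, and the range bounds are assumed to be inclusive.

```shell
# Return 0 when an integer lies in [low, high] (bounds assumed inclusive).
in_range() { [ "$1" -ge "$2" ] && [ "$1" -le "$3" ]; }

# Hypothetical values for this sketch.
stringency=7
min_len=200
max_len=1000

# Warn on stderr if a value violates the documented constraints.
in_range "$stringency" 0 10 || echo "--stringency must be between 0 and 10" >&2
[ "$max_len" -gt "$min_len" ] || echo "--max-len must be greater than --min-len" >&2
```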
--no-blast
Do not evaluate signature sequences with BLAST.
If NCBI BLAST+ is not installed, Seqwin skips BLAST evaluation automatically.
- Type: flag
- Default: False
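Because Seqwin falls back or skips a step when Mash or BLAST+ is missing, and the taxa options need the NCBI Datasets CLI, it can help to check which tools are on PATH before a run. The following is a sketch under stated assumptions: tool_status is a hypothetical helper, and the executable names datasets, mash, and blastn are the usual names of these tools rather than anything confirmed by Seqwin itself.

```shell
# Report whether an executable is on PATH.
tool_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Assumed executable names: "datasets" (NCBI Datasets CLI),
# "mash" (Mash) and "blastn" (one of the NCBI BLAST+ programs).
for tool in datasets mash blastn; do
  tool_status "$tool"
done
```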
--level
Limit downloads to genomes at or above this assembly level.
Possible values, from lowest to highest: contig, scaffold, chromosome, complete
- Type: text
- Default: contig
--source
Genome source to download from.
Supported values: genbank, refseq
- Type: text
- Default: genbank
--annotated
Only include annotated genomes.
- Type: flag
- Default: False
--exclude-mag
Exclude metagenome-assembled genomes (MAGs).
- Type: flag
- Default: False
--no-gzip
Do not download genomes as gzipped FASTA.
- Type: flag
- Default: False
--download-only
Only download genome sequences without running Seqwin.
- Type: flag
- Default: False
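The download options above can be combined into a single fetch-only command. This is a sketch, not a recommended configuration: the taxa are taken from the earlier examples, the filter choices are arbitrary, and DRY_RUN and download_cmd are names introduced for this example only. By default the command is printed rather than executed.

```shell
# Dry run by default: "echo" prints the command instead of executing it.
# Set DRY_RUN to an empty string (DRY_RUN= ) to actually run Seqwin.
DRY_RUN=${DRY_RUN-echo}

# Fetch complete RefSeq assemblies only, skipping MAGs, without running the analysis.
download_cmd() {
  $DRY_RUN seqwin --download-only \
    -t "Escherichia coli O157" -n "Salmonella enterica" \
    --source refseq --level complete --exclude-mag
}

download_cmd
```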
--seed
Random seed for reproducibility.
- Type: integer
- Default: 42
--threads, -p
Number of parallel processes or threads to use.
- Type: integer
- Default: 4
--version
Show Seqwin version and exit.
--help, -h
Show help message and exit.