Command Line Interface
BayesTME provides a suite of command line utilities that let users script the pipeline end to end.
These commands are available on the PATH of the Python environment in which the bayestme package is installed.
load_spaceranger
Convert data from spaceranger to a SpatialExpressionDataset in h5 format
usage: load_spaceranger [-h] [--output OUTPUT] [--input INPUT] [-v]
Named Arguments
- --output
Output file, a SpatialExpressionDataset in h5 format
- --input
Input spaceranger dir
- -v, --verbose
Enable verbose logging
Default: False
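A first pipeline step might look like the sketch below. The directory and file names are hypothetical, and `run` is a small helper defined here (it is not part of BayesTME) that prints the command and executes it only if it is found on the PATH.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Hypothetical paths; replace with your own.
spaceranger_dir="./spaceranger_out"
dataset="./dataset.h5ad"

run load_spaceranger --input "$spaceranger_dir" --output "$dataset" --verbose
```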
filter_genes
This command will create a new SpatialExpressionDataset that has genes filtered according to adjustable criteria. One or more of the criteria can be specified.
Filter genes from dataset based on one or more criteria
usage: filter_genes [-h] [--adata ADATA] [--output OUTPUT]
[--filter-ribosomal-genes]
[--n-top-by-standard-deviation N_TOP_BY_STANDARD_DEVIATION]
[--spot-threshold SPOT_THRESHOLD]
[--expression-truth EXPRESSION_TRUTH] [-v]
Named Arguments
- --adata
Input AnnData in h5 format
- --output
Output file, AnnData in h5 format containing the filtered dataset
- --filter-ribosomal-genes
Filter ribosomal genes (based on gene name regex)
Default: False
- --n-top-by-standard-deviation
Use the top N genes with the highest spatial variance.
- --spot-threshold
Filter out genes that appear in more than the provided threshold of tissue spots.
- --expression-truth
Filter out genes not found in all expression truth datasets.
- -v, --verbose
Enable verbose logging
Default: False
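Because the criteria can be combined, a single call can apply several filters at once. The following is a minimal sketch: the file names and the value 1000 are illustrative assumptions, and `run` is a helper defined here, not a BayesTME command.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Hypothetical values; tune them for your dataset.
n_top=1000

# Drop ribosomal genes and keep only the n_top most variable genes.
run filter_genes \
  --adata ./dataset.h5ad \
  --output ./dataset_filtered.h5ad \
  --filter-ribosomal-genes \
  --n-top-by-standard-deviation "$n_top"
```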
bleeding_correction
Perform bleeding correction
usage: bleeding_correction [-h] [--adata ADATA] [--bleed-out BLEED_OUT]
[--adata-output ADATA_OUTPUT] [-i] [--n-top N_TOP]
[--max-steps MAX_STEPS]
[--local-weight LOCAL_WEIGHT] [-v]
Named Arguments
- --adata
Input file, AnnData in h5 format
- --bleed-out
Output file, BleedCorrectionResult in h5 format
- --adata-output
A new AnnData in h5 format created using the bleed corrected counts
- -i, --inplace
If provided, overwrite the input file --adata
Default: False
- --n-top
Use N top genes by standard deviation to calculate the bleeding functions. Genes will not be filtered from output dataset.
Default: 50
- --max-steps
Number of EM steps
Default: 5
- --local-weight
Initial value for the local weight, a tuning parameter for bleed correction (rho_0g from equation 1 in the paper). Defaults to sqrt(N tissue spots).
- -v, --verbose
Enable verbose logging
Default: False
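As an illustration, the call below writes both the BleedCorrectionResult and a new corrected AnnData file, leaving the input untouched. File names are hypothetical, the numeric values simply restate the documented defaults, and `run` is a helper defined here rather than part of BayesTME.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Hypothetical output names.
bleed_out="./bleed_correction_results.h5"
corrected="./dataset_corrected.h5ad"

run bleeding_correction \
  --adata ./dataset_filtered.h5ad \
  --bleed-out "$bleed_out" \
  --adata-output "$corrected" \
  --n-top 50 \
  --max-steps 5
```

Passing `-i` instead of `--adata-output` would overwrite the input file in place.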
phenotype_selection
Select values for number of cell types and lambda smoothing parameter via k-fold cross-validation.
usage: phenotype_selection [-h] [--adata ADATA] [--job-index JOB_INDEX]
[--n-fold N_FOLD] [--n-splits N_SPLITS]
[--n-samples N_SAMPLES] [--n-burn N_BURN]
[--n-thin N_THIN] [--n-gene N_GENE]
[--n-components-min N_COMPONENTS_MIN]
[--n-components-max N_COMPONENTS_MAX]
[--lambda-values LAMBDA_VALUES]
[--max-ncell MAX_NCELL] [--background-noise]
[--lda-initialization] [--output-dir OUTPUT_DIR]
[-v]
Named Arguments
- --adata
Input file, AnnData in h5 format
- --job-index
Run only this job index, suitable for running the sampling in parallel across many machines
- --n-fold
Number of times to run k-fold cross-validation.
Default: 5
- --n-splits
Split dataset into k consecutive folds for each instance of k-fold cross-validation
Default: 15
- --n-samples
Number of samples from the posterior distribution.
Default: 100
- --n-burn
Number of burn-in samples
Default: 2000
- --n-thin
Thinning factor for sampling
Default: 5
- --n-gene
Use N top genes by standard deviation to model deconvolution. If this number is less than the total number of genes, the top N by spatial variance will be selected.
Default: 1000
- --n-components-min
Minimum number of cell types to try.
Default: 2
- --n-components-max
Maximum number of cell types to try.
Default: 12
- --lambda-values
Potential values of the lambda smoothing parameter to try. Defaults to (1, 1e1, 1e2, 1e3, 1e4, 1e5)
- --max-ncell
Maximum cell count within a spot to model.
Default: 120
- --background-noise
Turn background noise on
Default: False
- --lda-initialization
Turn LDA initialization on
Default: False
- --output-dir
Output directory. N new files will be saved in this directory, where N is the number of cross-validation jobs.
- -v, --verbose
Enable verbose logging
Default: False
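To spread cross-validation across machines, each worker can run one job index. The sketch below reads the index from a hypothetical `JOB_INDEX` environment variable; that variable, the fallback value 0, the paths, and the `run` helper are assumptions made for illustration, and the valid range of job indices is not stated here.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# JOB_INDEX is a hypothetical variable set by your scheduler or wrapper script.
job_index="${JOB_INDEX:-0}"
out_dir="./phenotype_selection"

run phenotype_selection \
  --adata ./dataset_corrected.h5ad \
  --job-index "$job_index" \
  --n-components-min 2 \
  --n-components-max 12 \
  --output-dir "$out_dir"
```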
deconvolve
Deconvolve data
usage: deconvolve [-h] [--adata ADATA] [--adata-output ADATA_OUTPUT] [-i]
[--output OUTPUT] [--n-gene N_GENE]
[--n-components N_COMPONENTS] [--lam2 LAM2]
[--n-samples N_SAMPLES] [--n-burn N_BURN] [--n-thin N_THIN]
[--random-seed RANDOM_SEED] [--background-noise]
[--lda-initialization] [--expression-truth EXPRESSION_TRUTH]
[-v]
Named Arguments
- --adata
Input AnnData in h5 format, expected to be already bleed corrected
- --adata-output
A new AnnData in h5 format created with the deconvolution summary results appended.
- -i, --inplace
If provided, append deconvolution summary results to the --adata archive in place
Default: False
- --output
Path where the DeconvolutionResult will be written in h5 format
- --n-gene
Number of genes
- --n-components
Number of cell types, expected to be determined from cross validation.
- --lam2
Smoothness parameter. This tuning parameter is expected to be determined from cross validation.
- --n-samples
Number of samples from the posterior distribution.
Default: 100
- --n-burn
Number of burn-in samples
Default: 1000
- --n-thin
Thinning factor for sampling
Default: 10
- --random-seed
Random seed
Default: 0
- --background-noise
Turn background noise on
Default: False
- --lda-initialization
Turn LDA Initialization on
Default: False
- --expression-truth
Use expression ground truth from one or more matched samples that have been processed with the Seurat companion scRNA fine mapping workflow. This flag can be provided multiple times for multiple matched samples.
- -v, --verbose
Enable verbose logging
Default: False
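A deconvolution run could be scripted roughly as follows. The values of `n_components` and `lam2` are placeholders standing in for whatever cross-validation selected, the file names are hypothetical, and `run` is a helper defined here, not a BayesTME command.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Placeholder values; in practice take these from phenotype_selection.
n_components=5
lam2=1000

run deconvolve \
  --adata ./dataset_corrected.h5ad \
  --adata-output ./dataset_deconvolved.h5ad \
  --output ./deconvolution_samples.h5 \
  --n-components "$n_components" \
  --lam2 "$lam2" \
  --n-samples 100 --n-burn 1000 --n-thin 10 \
  --random-seed 0
```

Setting `--random-seed` explicitly documents the seed used for the run alongside the other sampling settings.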
select_marker_genes
Perform marker gene selection
usage: select_marker_genes [-h] [--adata ADATA] [--adata-output ADATA_OUTPUT]
[-i] [--deconvolution-result DECONVOLUTION_RESULT]
[--n-marker-genes N_MARKER_GENES] [--alpha ALPHA]
[--marker-gene-method {TIGHT,FALSE_DISCOVERY_RATE}]
[-v]
Named Arguments
- --adata
Input file, AnnData in h5 format
- --adata-output
A new AnnData in h5 format created with the deconvolution summary results appended.
- -i, --inplace
If provided, append deconvolution summary results to the --adata archive in place
Default: False
- --deconvolution-result
Input file, DeconvolutionResult in h5 format
- --n-marker-genes
Maximum number of marker genes per cell type.
Default: 5
- --alpha
Alpha cutoff for choosing marker genes.
Default: 0.05
- --marker-gene-method
Possible choices: TIGHT, FALSE_DISCOVERY_RATE
Method for choosing marker genes.
Default: TIGHT
- -v, --verbose
Enable verbose logging
Default: False
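For example, marker genes might be selected with the non-default method as sketched below. File names are hypothetical, the numeric values restate the documented defaults, and `run` is a helper defined here rather than part of BayesTME.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# One of the two documented choices: TIGHT or FALSE_DISCOVERY_RATE.
method="FALSE_DISCOVERY_RATE"

run select_marker_genes \
  --adata ./dataset_deconvolved.h5ad \
  --deconvolution-result ./deconvolution_samples.h5 \
  --adata-output ./dataset_marker_genes.h5ad \
  --n-marker-genes 5 \
  --alpha 0.05 \
  --marker-gene-method "$method"
```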
spatial_expression
Detect spatial differential expression patterns
usage: spatial_expression [-h] [--deconvolve-results DECONVOLVE_RESULTS]
[--adata ADATA] [--output OUTPUT]
[--n-cell-min N_CELL_MIN]
[--n-spatial-patterns N_SPATIAL_PATTERNS]
[--n-samples N_SAMPLES] [--n-burn N_BURN]
[--n-thin N_THIN] [--simple] [--alpha0 ALPHA0]
[--prior-var PRIOR_VAR] [--lam2 LAM2]
[--n-gene N_GENE] [-v]
Named Arguments
- --deconvolve-results
DeconvolutionResult in h5 format
- --adata
AnnData in h5 format
- --output
Path to store SpatialDifferentialExpressionResult in h5 format
- --n-cell-min
Only consider spots where there are at least <n_cell_min> cells of a given type, as determined by the deconvolution results.
Default: 5
- --n-spatial-patterns
Number of spatial patterns.
- --n-samples
Number of samples from the posterior distribution.
Default: 100
- --n-burn
Number of burn-in samples
Default: 1000
- --n-thin
Thinning factor for sampling
Default: 2
- --simple
Simpler model for sampling spatial differential expression posterior
Default: False
- --alpha0
Alpha0 tuning parameter.
Default: 10
- --prior-var
Prior variance tuning parameter.
Default: 100.0
- --lam2
Smoothness parameter, this tuning parameter expected to be determined from cross validation.
Default: 1
- --n-gene
Number of genes to consider for detecting spatial programs. If this number is less than the total number of genes, the top N by spatial variance will be selected.
- -v, --verbose
Enable verbose logging
Default: False
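A spatial differential expression run might be invoked as in this sketch. The number of spatial patterns and the file names are illustrative assumptions, the other values restate documented defaults, and `run` is a helper defined here, not a BayesTME command.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Hypothetical choice; this option has no documented default.
n_spatial_patterns=10

run spatial_expression \
  --adata ./dataset_deconvolved.h5ad \
  --deconvolve-results ./deconvolution_samples.h5 \
  --output ./sde_samples.h5 \
  --n-spatial-patterns "$n_spatial_patterns" \
  --n-cell-min 5 \
  --lam2 1
```

Note that this command takes `--deconvolve-results`, while select_marker_genes and plot_spatial_expression use `--deconvolution-result`.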
Plotting
Plots are created with separate commands:
plot_bleeding_correction
Plot bleeding correction results
usage: plot_bleeding_correction [-h] [--raw-adata RAW_ADATA]
[--corrected-adata CORRECTED_ADATA]
[--bleed-correction-results BLEED_CORRECTION_RESULTS]
[--output-dir OUTPUT_DIR] [--n-top N_TOP] [-v]
Named Arguments
- --raw-adata
Input file, AnnData in h5 format
- --corrected-adata
Input file, AnnData in h5 format
- --bleed-correction-results
Input file, BleedCorrectionResult in h5 format
- --output-dir
Output directory
- --n-top
Plot the top N genes by standard deviation
Default: 10
- -v, --verbose
Enable verbose logging
Default: False
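The plotting command needs the raw dataset, the corrected dataset, and the BleedCorrectionResult together. A minimal sketch, with hypothetical file and directory names and a `run` helper that is not part of BayesTME:

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Hypothetical plot directory; whether it must exist beforehand is not documented here.
plot_dir="./plots/bleeding_correction"

run plot_bleeding_correction \
  --raw-adata ./dataset_filtered.h5ad \
  --corrected-adata ./dataset_corrected.h5ad \
  --bleed-correction-results ./bleed_correction_results.h5 \
  --output-dir "$plot_dir" \
  --n-top 10
```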
plot_deconvolution
Plot deconvolution results
usage: plot_deconvolution [-h] [--adata ADATA] [--output-dir OUTPUT_DIR]
[--cell-type-names CELL_TYPE_NAMES] [-v]
Named Arguments
- --adata
Input file, AnnData in h5 format. Expected to be annotated with deconvolution results.
- --output-dir
Output directory.
- --cell-type-names
A comma-separated list of cell type names to use for plots. For example: --cell-type-names "type 1, type 2, type 3"
- -v, --verbose
Enable verbose logging
Default: False
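Since cell type names may contain spaces, the whole list has to reach the command as a single quoted argument. The sketch below shows one way to do that; the names, paths, and the `run` helper are illustrative assumptions.

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Quote the list so the shell passes it as one argument.
cell_types="type 1, type 2, type 3"

run plot_deconvolution \
  --adata ./dataset_deconvolved.h5ad \
  --output-dir ./plots/deconvolution \
  --cell-type-names "$cell_types"
```

Without the double quotes around `$cell_types`, the shell would split the list at each space and the command would receive several separate arguments.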
plot_spatial_expression
Plot spatial differential expression results
usage: plot_spatial_expression [-h] [--adata ADATA]
[--deconvolution-result DECONVOLUTION_RESULT]
[--sde-result SDE_RESULT]
[--output-dir OUTPUT_DIR]
[--cell-type-names CELL_TYPE_NAMES] [-v]
Named Arguments
- --adata
Input file, AnnData in h5 format
- --deconvolution-result
Input file, DeconvolutionResult in h5 format
- --sde-result
Input file, SpatialDifferentialExpressionResult in h5 format
- --output-dir
Output directory
- --cell-type-names
A comma-separated list of cell type names to use for plots. For example: --cell-type-names "type 1, type 2, type 3"
- -v, --verbose
Enable verbose logging
Default: False
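Plotting the spatial differential expression results combines three earlier outputs. A hedged sketch, assuming hypothetical file names and a `run` helper that is not part of BayesTME:

```shell
# Helper (not part of BayesTME): print a command, then run it if installed.
run() {
  echo "+ $*"
  if command -v "$1" >/dev/null 2>&1; then "$@"; fi
}

# Hypothetical inputs produced by deconvolve and spatial_expression.
deconvolution_result="./deconvolution_samples.h5"
sde_result="./sde_samples.h5"

run plot_spatial_expression \
  --adata ./dataset_deconvolved.h5ad \
  --deconvolution-result "$deconvolution_result" \
  --sde-result "$sde_result" \
  --output-dir ./plots/spatial_expression
```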