CellRanger 2016

Introduction

Cell Ranger is a set of analysis pipelines that process Chromium single-cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analysis, and more. Cell Ranger includes five pipelines relevant to the 3' and 5' Single Cell Gene Expression Solutions and related products:

  • cellranger mkfastq demultiplexes raw base call (BCL) files generated by Illumina sequencers into FASTQ files. It is a wrapper around Illumina's bcl2fastq, with additional features that are specific to 10x libraries and a simplified sample sheet format.
  • cellranger count takes FASTQ files from cellranger mkfastq and performs alignment, filtering, barcode counting, and UMI counting. It uses the Chromium cellular barcodes to generate feature-barcode matrices, determine clusters, and perform gene expression analysis. The count pipeline can take input from multiple sequencing runs on the same GEM well. cellranger count also processes Feature Barcode data alongside Gene Expression reads.
  • cellranger aggr aggregates outputs from multiple runs of cellranger count, normalizing those runs to the same sequencing depth and then recomputing the feature-barcode matrices and analysis on the combined data. The aggr pipeline can be used to combine data from multiple samples into an experiment-wide feature-barcode matrix and analysis.
  • cellranger reanalyze takes feature-barcode matrices produced by cellranger count or cellranger aggr and reruns the dimensionality reduction, clustering, and gene expression algorithms using tunable parameter settings.
  • cellranger multi is used to analyze Cell Multiplexing data. It takes FASTQ files from cellranger mkfastq as input and performs alignment, filtering, barcode counting, and UMI counting. It uses the Chromium cellular barcodes to generate feature-barcode matrices, determine clusters, and perform gene expression analysis. The cellranger multi pipeline also supports the analysis of Feature Barcode data.
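
As a rough sketch of how the pipelines above can be chained, the following shell snippet walks through mkfastq, count and aggr in order. All run IDs, paths and CSV file names are hypothetical placeholders, and the run helper is not part of Cell Ranger: by default it only prints each command, and it executes them only when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cell Ranger): print a command and
# execute it only when DRY_RUN=0 is set in the environment.
run() {
    printf '%s\n' "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# 1. Demultiplex BCL files into FASTQ files (paths are placeholders).
run cellranger mkfastq --id=demux_run1 --run=/path/to/bcl_folder --csv=samplesheet.csv

# 2. Count one sample; the FASTQ location depends on the mkfastq output.
run cellranger count --id=sampleA --transcriptome=/path/to/reference \
    --fastqs=/path/to/fastqs --sample=sampleA

# 3. Aggregate several count runs that are listed in a CSV file.
run cellranger aggr --id=all_samples --csv=aggregation.csv
```

Printing the commands first makes it easy to check IDs and paths before a long-running job is started.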

Installed version(s)

The following version is currently available...

... on environment hpc-env/8.3:

  • CellRanger/6.1.1

... on environment hpc-env/6.4:

  • CellRanger/6.1.1

... on environment hpc-uniol-env:

  • CellRanger/6.1.1

Loading / Using CellRanger

To load the desired version of the module, use the module load command, e.g.

module load hpc-env/8.3
module load CellRanger/6.1.1

Always remember: module names are case sensitive, so CellRanger must be written exactly as shown.
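
To confirm that the module was loaded correctly, you can check whether the cellranger executable is on your PATH. The following is a minimal sketch; check_cellranger is a hypothetical helper name, not something provided by the module.

```shell
#!/bin/sh
# Hypothetical helper: print the Cell Ranger version if the executable
# is on the PATH, otherwise explain how to load the module.
check_cellranger() {
    if command -v cellranger >/dev/null 2>&1; then
        cellranger --version
    else
        echo "cellranger not found: run 'module load CellRanger/6.1.1' first" >&2
        return 1
    fi
}
```

Calling check_cellranger after the module load commands prints the version on success and returns a non-zero status otherwise, which is handy at the top of a job script.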


To find out how to use CellRanger, type cellranger -h to print a help text that gets you started:

$ cellranger -h
cellranger cellranger-6.1.1
Process 10x Genomics Gene Expression, Feature Barcode, and Immune Profiling
data

USAGE:
    cellranger <SUBCOMMAND>

FLAGS:
    -h, --help       Prints help information
    -V, --version    Prints version information

SUBCOMMANDS:
    count               Count gene expression (targeted or whole-
                        transcriptome) and/or feature barcode reads from a
                        single sample and GEM well
    multi               Analyze multiplexed data or combined gene
                        expression/immune profiling/feature barcode data
    vdj                 Assembles single-cell VDJ receptor sequences from
                        10x Immune Profiling libraries
    aggr                Aggregate data from multiple Cell Ranger runs
    reanalyze           Re-run secondary analysis (dimensionality
                        reduction, clustering, etc)
    targeted-compare    Analyze targeted enrichment performance by
                        comparing a targeted sample to its cognate parent
                        WTA sample (used as input for targeted gene
                        expression)
    targeted-depth      Estimate targeted read depth values (mean reads
                        per cell) for a specified input parent WTA sample
                        and a target panel CSV file
    mkvdjref            Prepare a reference for use with CellRanger VDJ
    mkfastq             Run Illumina demultiplexer on sample sheets that
                        contain 10x-specific sample index sets
    testrun             Execute the 'count' pipeline on a small test
                        dataset
    mat2csv             Convert a gene count matrix to CSV format
    mkref               Prepare a reference for use with 10x analysis
                        software. Requires a GTF and FASTA
    mkgtf               Filter a GTF file by attribute prior to creating a
                        10x reference
    upload              Upload analysis logs to 10x Genomics support
    sitecheck           Collect linux system configuration information
    help                Prints this message or the help of the given
                        subcommand(s)


Additionally, we included some reference files (References - 2020-A (July 7, 2020)). They are located in the folder data under the software path $EBROOTCELLRANGER.
To make access easier, we created the environment variable $CELLRANGER_DATA, which points to that directory:

$ ls $CELLRANGER_DATA 
chromium-shared-sample-indexes-plate.csv
chromium-shared-sample-indexes-plate.json
chromium-single-cell-sample-indexes-plate-v1.csv
chromium-single-cell-sample-indexes-plate-v1.json
gemcode-single-cell-sample-indexes-plate.csv
gemcode-single-cell-sample-indexes-plate.json
refdata-gex-GRCh38-2020-A
refdata-gex-GRCh38-and-mm10-2020-A
refdata-gex-mm10-2020-A
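
One way to use these references is to pass a directory below $CELLRANGER_DATA to the --transcriptome option of cellranger count. The sketch below assumes the module is loaded; resolve_ref is a hypothetical helper, and the sample name and FASTQ path in the usage comment are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: resolve a bundled reference by name and fail
# early if CELLRANGER_DATA is unset or the reference does not exist.
resolve_ref() {
    if [ -z "${CELLRANGER_DATA:-}" ]; then
        echo "CELLRANGER_DATA is not set; load the CellRanger module first" >&2
        return 1
    fi
    if [ ! -d "$CELLRANGER_DATA/$1" ]; then
        echo "reference not found: $CELLRANGER_DATA/$1" >&2
        return 1
    fi
    printf '%s\n' "$CELLRANGER_DATA/$1"
}

# Example usage (sample name and FASTQ path are placeholders):
#   ref=$(resolve_ref refdata-gex-GRCh38-2020-A) &&
#   cellranger count --id=sampleA --transcriptome="$ref" \
#       --fastqs=/path/to/fastqs --sample=sampleA
```

Checking the reference path before starting avoids submitting a long job that fails immediately because of a mistyped reference name.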


Documentation

More information and a tutorial can be found here.