
<!-- README.md is generated from README.Rmd. Please edit that file -->

# chevreulProcess

This package includes functions for processing single-cell RNA-seq
datasets stored as SingleCellExperiment objects.

A demo with a developing human retina scRNA-seq dataset from Shayler et
al. is available
<a href="https://cobrinik-1.saban-chla.usc.edu/shinyproxy/app/chevreul" target="_blank" rel="noopener noreferrer">here</a>.

There are also convenient functions for:

- Clustering and dimensional reduction of raw sequencing data
- Integration and label transfer
- Louvain clustering at a range of resolutions
- Cell cycle state regression and labeling

## Installation

The source code for chevreulProcess is available on
<a href="https://github.com/whtns/chevreulProcess" target="_blank" rel="noopener noreferrer">GitHub</a>.
The released version can be installed with BiocManager.

`chevreulProcess` requires R version \>= 4.4. Get the latest stable
`R` release from [CRAN](http://cran.r-project.org/). Then install
`chevreulProcess` and its dependencies using the following code:

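``` r
# Install BiocManager only if it is missing
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("chevreulProcess")

chevreulProcess::create_project_db()
```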

You can also customize the location of the app using these steps:

``` r
install.packages("BiocManager")
BiocManager::install("chevreulProcess")
chevreulProcess::create_project_db(destdir = "/your/path/to/app")
```
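
Because the package requires R 4.4 or later, it can help to confirm the
version of the running session before installing. The following is a
minimal sketch using only base R; the variable names are illustrative.

``` r
# Minimum R version stated in the installation instructions above
required_version <- "4.4.0"

# getRversion() returns the running R version as a comparable object
meets_requirement <- getRversion() >= required_version

if (!meets_requirement) {
    message(
        "R ", as.character(getRversion()), " is older than ",
        required_version, "; please update R before installing."
    )
}
```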

## Getting Started

First, load chevreulProcess and the other required packages:

``` r
library(chevreulProcess)
library(SingleCellExperiment)
library(tidyverse)
library(ggraph)
```

## TLDR

chevreulProcess provides a single command to:

- construct a SingleCellExperiment object

- filter genes by minimum expression and ubiquity

- normalize and scale expression by any of several methods packaged in
  SingleCellExperiment

## Run clustering on a single object

By default, clustering is run at ten different resolutions between 0.2
and 2.0. To use other resolutions, pass a numeric vector to the
`resolution` argument.

``` r

data("small_example_dataset")

clustered_sce <- sce_process(small_example_dataset,
    experiment_name = "sce_hu_trans",
    organism = "human"
)
```
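
As an illustration of supplying resolutions as a numeric vector, the
sketch below builds such vectors with base R. The even spacing of the
ten values is an assumption about the default grid, and the commented
call assumes that `sce_process()` accepts the vector through a
`resolution` argument as described above.

``` r
# Ten evenly spaced resolutions between 0.2 and 2.0 (spacing is assumed)
default_like_resolutions <- seq(0.2, 2.0, length.out = 10)

# A custom, coarser set of resolutions
custom_resolutions <- c(0.5, 1.0, 1.5)

# Hypothetical usage, assuming the `resolution` argument described above:
# clustered_sce <- sce_process(small_example_dataset,
#     experiment_name = "sce_hu_trans",
#     organism = "human",
#     resolution = custom_resolutions
# )
```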

Chevreul includes tools for:

- Louvain clustering at a range of resolutions
- Dimensionality reduction of raw sequencing data
- Integration (batch correction) of multiple datasets

### Troubleshooting installation

#### Dependency management

When installing an R package like Chevreul with many dependencies,
conflicts with existing installations can arise. This is a common issue
in R package management. Here are some strategies to address this
problem:

1.  Consider
    <a href="https://rstudio.github.io/renv/articles/renv.html" target="_blank" rel="noopener noreferrer">renv</a>
    for dependency management. This tool creates isolated environments
    for each project, ensuring that package versions don’t conflict
    across different projects.

2.  Use the
    <a href="https://conflicted.r-lib.org" target="_blank" rel="noopener noreferrer">conflicted</a>
    package, which provides an alternative conflict resolution
    strategy. It makes every conflict an error, forcing you to choose
    which function to use.
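
In practice, the two strategies above might look like the following.
This is a setup sketch that assumes the renv, conflicted, and dplyr
packages are already installed; `dplyr::filter` is used only as an
example of a function name that commonly conflicts.

``` r
# Strategy 1: create an isolated, project-specific library with renv
renv::init() # set up a private library for the current project
renv::snapshot() # record the installed package versions in renv.lock

# Strategy 2: make ambiguous function names an error, then pick one
library(conflicted)
library(dplyr)
conflicts_prefer(dplyr::filter) # always use dplyr's filter()
```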

#### Slow internet connection

Installing R packages over a slow internet connection can fail,
particularly for larger packages or when using functions like
`remotes::install_github()`. The following settings can help with
bandwidth-related problems:

Set a longer timeout for downloads: `options(timeout = 9999999)`

Specify the download method: `options(download.file.method = "libcurl")`
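
The two settings above can be applied together and restored afterwards.
The sketch below uses only base R; the 600-second timeout is an
illustrative value, not a recommendation from the package authors.

``` r
# options() returns the previous values, so they can be restored later
old_options <- options(
    timeout = max(600, getOption("timeout")), # never shorten the timeout
    download.file.method = "libcurl"
)

# ... run the installation here, for example:
# BiocManager::install("chevreulProcess")

# Restore the previous settings once installation has finished
options(old_options)
```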

## Transcript-level quantification

For transcript-level analysis, users must incorporate transcript-level
data into the SingleCellExperiment object as an alternative experiment
before initiating the Chevreul processing pipeline. This step is crucial
for enabling detailed exploration at the transcript level.

Transcripts may be quantified using any of several available methods,
including alignment-free methods best used with well-annotated
transcriptomes (Salmon, Kallisto), alignment-based methods best used to
detect novel isoforms (StringTie2), or long-read methods for use with
long-read sequencing data (IsoQuant).
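
One way to attach transcript-level counts as an alternative experiment
is sketched below with the `altExp()` setter from SingleCellExperiment.
The count matrices are simulated, and the name `"transcript"` is an
assumption made for illustration; check the package documentation for
the name the Chevreul pipeline expects.

``` r
library(SingleCellExperiment)

set.seed(1)
cells <- paste0("cell", 1:10)

# Simulated gene-level counts (4 genes x 10 cells)
gene_counts <- matrix(rpois(40, lambda = 5),
    nrow = 4,
    dimnames = list(paste0("gene", 1:4), cells)
)

# Simulated transcript-level counts (8 transcripts x the same 10 cells)
transcript_counts <- matrix(rpois(80, lambda = 2),
    nrow = 8,
    dimnames = list(paste0("tx", 1:8), cells)
)

sce <- SingleCellExperiment(assays = list(counts = gene_counts))

# Store the transcript-level data as an alternative experiment
altExp(sce, "transcript") <- SummarizedExperiment(
    assays = list(counts = transcript_counts)
)

altExpNames(sce)
```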

## Integration implementation

The `sce_integrate()` function in Chevreul implements integration (batch
correction) of scRNA-seq datasets by using the
<a href="https://bioconductor.org/packages/devel/bioc/vignettes/batchelor/inst/doc/correction.html" target="_blank" rel="noopener noreferrer">batchelor</a>
package.

It accepts a list of SingleCellExperiment objects as input for
integration and stores the corresponding batch information in a metadata
field named ‘batch’. By default, it employs batchelor’s
`correctExperiments` function to preserve pre-existing data structures
and metadata from input SingleCellExperiment objects within the
integrated output.
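
For orientation, calling batchelor directly on two batches might look
like the sketch below. It uses simulated data and does not show the
internals or signature of `sce_integrate()`, which may differ. The
helper `simulate_batch()`, the batch names, and the reduced number of
dimensions (`d = 10`) are assumptions chosen to keep the example small.

``` r
library(SingleCellExperiment)
library(batchelor)

set.seed(1)

# Hypothetical helper that simulates one batch with log-transformed counts
simulate_batch <- function(prefix, n_genes = 200, n_cells = 50) {
    counts <- matrix(rpois(n_genes * n_cells, lambda = 5),
        nrow = n_genes,
        dimnames = list(
            paste0("gene", seq_len(n_genes)),
            paste0(prefix, "_cell", seq_len(n_cells))
        )
    )
    sce <- SingleCellExperiment(assays = list(counts = counts))
    logcounts(sce) <- log2(counts + 1)
    sce
}

batch1 <- simulate_batch("b1")
batch2 <- simulate_batch("b2")

# correctExperiments() carries over input assays and metadata and records
# each cell's batch of origin in the `batch` column of colData
integrated <- correctExperiments(
    batch1 = batch1, batch2 = batch2,
    PARAM = FastMnnParam(d = 10)
)

table(integrated$batch)
reducedDimNames(integrated)
```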

## Hardware requirements

Recommended minimum hardware requirements for running Chevreul are as
follows:

- RAM: A minimum of 16 GB RAM is recommended for initial analysis.
  However, for larger datasets or more complex analyses, 64 GB or more
  is advisable.
- CPU: Having multiple cores can be beneficial for parallel processing.
- Storage: Sufficient storage space is necessary, especially for
  temporary files. The exact amount depends on the size of your
  datasets.
- R version: Chevreul requires R version 4.4 or greater.

These requirements vary with the size and complexity of your dataset:
as the number of cells increases, so do the hardware requirements. For
instance, a dataset with around 8,000 cells can be analyzed with 8 GB of
RAM, whereas larger datasets or more complex analyses can benefit from
64-128 GB of RAM.
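
To see why memory needs grow with cell number, the rough calculation
below estimates the size of a single dense numeric matrix at 8 bytes
per value. The gene count of 20,000 is an assumed, illustrative figure.
Real analyses often use sparse matrices and hold several assays and
intermediate results at once, so actual usage will differ.

``` r
# Bytes needed to hold a dense double-precision matrix of genes x cells
dense_matrix_bytes <- function(n_genes, n_cells) {
    n_genes * n_cells * 8
}

# Hypothetical dataset: 20,000 genes measured in 8,000 cells
bytes <- dense_matrix_bytes(n_genes = 20000, n_cells = 8000)

# Express the result in gigabytes (10^9 bytes)
bytes / 1e9
```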

## Learn More

To learn more about using Bioconductor tools for single-cell RNA-seq
analysis, consult the book
<a href="https://bioconductor.org/books/release/OSCA/" target="_blank" rel="noopener noreferrer">Orchestrating
Single-Cell Analysis with Bioconductor</a>. The book walks through
common workflows for the analysis of single-cell RNA-seq (scRNA-seq)
data and shows how to use cutting-edge Bioconductor tools to process,
analyze, visualize, and explore such data.
