% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/qQCReport.R
\name{qQCReport}
\alias{qQCReport}
\title{QuasR Quality Control Report}
\usage{
qQCReport(
  input,
  pdfFilename = NULL,
  chunkSize = 1000000L,
  useSampleNames = FALSE,
  clObj = NULL,
  a4layout = TRUE,
  ...
)
}
\arguments{
\item{input}{A vector of files or a \code{qProject} object as returned by
\code{qAlign}.}

\item{pdfFilename}{The path and name of a PDF file to store the report.
If \code{NULL}, the quality control plots will be generated in separate
plotting windows on the standard graphical device.}

\item{chunkSize}{The number of sequences, sequence pairs (for paired-end
data) or alignments that will be sampled from each data file to collect
quality statistics.}

\item{useSampleNames}{If \code{TRUE}, the plots will be labelled using the
sample names instead of the file names. Sample names are obtained from the
\code{qProject} object, or from \code{names(input)} if \code{input} is a
named vector of file names. Please note that if there are multiple files
for the same sample, the sample names will not be unique.}

\item{clObj}{A cluster object to be used for parallel processing of multiple
input files.}

\item{a4layout}{A logical scalar. If \code{TRUE}, the output of mapping rate
and uniqueness plots will be adjusted for A4 format devices.}

\item{\dots}{Additional arguments that will be passed to the functions
generating the individual quality control plots, see \sQuote{Details}.}
}
\value{
The function is called for its side effect of generating quality
control plots. It invisibly returns a list with components that contain the
data used to generate each of the QC plots. Available components are
(depending on input data, see \sQuote{Details}):
\describe{
  \item{\emph{qualByCycle}}{: quality score boxplot}
  \item{\emph{nuclByCycle}}{: nucleotide frequency plot}
  \item{\emph{duplicated}}{: duplication level plot}
  \item{\emph{mappings}}{: mapping statistics barplot}
  \item{\emph{uniqueness}}{: library complexity barplot}
  \item{\emph{errorsByCycle}}{: mismatch frequency plot}
  \item{\emph{mismatchTypes}}{: mismatch type plot}
  \item{\emph{fragDistribution}}{: fragment size distribution plot}
}
}
\description{
Generate quality control plots for a \code{qProject} object or a vector of
fasta/fastq/bam files. The available plots vary depending on the types of
available input (fasta, fastq, bam files or \code{qProject} object;
paired-end or single-end).
}
\details{
This function generates quality control plots for all input files or the
sequence and alignment files contained in a \code{qProject} object,
allowing assessment of the quality of a sequencing experiment.
\code{qQCReport} uses functionality from the \pkg{ShortRead} package to
collect quality data and visualizes the results similarly to the
\sQuote{FastQC} quality control tool from Simon Andrews (see
\sQuote{References} below). It is recommended to create PDF reports
(\code{pdfFilename} argument), for which the plot layouts have been optimised.

Some plots will only be generated if the necessary information is available
(e.g. base qualities in fastq sequence files).

The currently available plot types are:
\describe{
  \item{\emph{Quality score boxplot}}{shows the distribution of
        base quality values as a box plot for each position in the input
        sequence. The background color (green, orange or red)
        indicates ranges of high, intermediate and low qualities.}
  \item{\emph{Nucleotide frequency}}{plot shows the frequency of A, C,
        G, T and N bases by position in the read.}
  \item{\emph{Duplication level}}{plot shows for each sample the
        fraction of reads observed at different duplication levels
        (e.g. once, twice, three times). In addition, the most
        frequent sequences are listed.}
  \item{\emph{Mapping statistics}}{shows fractions of reads that were
        (un)mappable to the reference genome.}
  \item{\emph{Library complexity}}{shows fractions of unique
        read(-pair) alignment positions, as a measure of the complexity in
        the sequencing library. Please note that this measure is not
        independent of the total number of reads in a library, and is best
        compared between libraries of similar sizes.}
  \item{\emph{Mismatch frequency}}{shows the frequency and position
        (relative to the read sequence) of mismatches in the alignments
        against the reference genome.}
  \item{\emph{Mismatch types}}{shows the frequency of
        read bases that caused mismatches in the alignments to the
        reference genome, separately for each genome base.}
  \item{\emph{Fragment size}}{shows the distribution of fragment sizes
        inferred from aligned read pairs.}
}

One approach to assessing the quality of a sample is to compare its quality
control plots to those of other samples and look for relative
differences. What counts as normal depends on the type of
experiment: a genomic re-sequencing sample with an
overrepresentation of T bases may be suspicious, while such a
nucleotide bias is normal for a directed bisulfite-sequencing sample.

Additional arguments can be passed to the internal functions that
generate the individual quality control plots using \code{\dots{}}:
\describe{
  \item{\code{lmat}:}{a matrix (e.g. \code{matrix(1:12, ncol=2)}) used
        by an internal call to the \code{layout} function to specify the
        positioning of multiple plot panels on a device page. Individual panels
        correspond to different samples.}
  \item{\code{breaks}:}{a numerical vector
        (e.g. \code{c(1:10)}) defining the bins used by
        the \sQuote{Duplication level} plot.}
}
}
\examples{
# copy example data to current working directory
file.copy(system.file(package="QuasR", "extdata"), ".", recursive=TRUE)

# create alignments
sampleFile <- "extdata/samples_chip_single.txt"
genomeFile <- "extdata/hg19sub.fa"

proj <- qAlign(sampleFile, genomeFile)

# create quality control report
qQCReport(proj, pdfFilename="qc_report.pdf")
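
# A minimal sketch, not part of the original example: capture the invisibly
# returned list to inspect the data behind the plots. Which components are
# present depends on the input data (see 'Value'); 'qcdat' is a hypothetical
# variable name, and re-running the report here is only for illustration.
qcdat <- qQCReport(proj, pdfFilename="qc_report.pdf")
names(qcdat)

# A second sketch, assuming that alignments(proj)$genome holds the bam file
# paths (FileName) and sample names (SampleName) of the project: run the
# report on a named vector of bam files and label the plots by sample name.
# 'bamFiles' and the output file name are hypothetical.
bamFiles <- alignments(proj)$genome$FileName
names(bamFiles) <- alignments(proj)$genome$SampleName
qQCReport(bamFiles, pdfFilename="qc_report_bam.pdf", useSampleNames=TRUE)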

}
\references{
FastQC quality control tool at
\url{http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/}
}
\seealso{
\code{\linkS4class{qProject}}, \code{\link{qAlign}},
\code{\link[ShortRead:ShortReadBase-package]{ShortRead}} package
}
\author{
Anita Lerch, Dimos Gaidatzis and Michael Stadler
}
\keyword{methods}
