% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/SRA_metadata.R
\name{download.SRA.metadata}
\alias{download.SRA.metadata}
\title{Downloads metadata from SRA}
\usage{
download.SRA.metadata(
  SRP,
  outdir = tempdir(),
  remove.invalid = TRUE,
  auto.detect = FALSE,
  abstract = "printsave",
  force = FALSE,
  rich.format = FALSE,
  fetch_GSE = FALSE
)
}
\arguments{
\item{SRP}{character string, a study ID given as the PRJ, SRP, ERP, DRP, GSE or SRA
accession of the study, for example "SRP226389" or "ERP116106". If a GSE is given,
it is first converted to the SRP to find the files. The call works as long as the runs
are registered on the efetch server, i.e. there is an SRP linked from the bioproject
or GSE. An example that fails is "PRJNA449388", which has no such link.}

\item{outdir}{character string, directory to save the file in, default: tempdir().
The file will be called "SraRunInfo_SRP.csv", where SRP is
the SRP argument. We advise using bioproject IDs "PRJNA...".
The directory is created if it does not exist.}

\item{remove.invalid}{logical, default TRUE. Remove runs with 0 reads (spots).}

\item{auto.detect}{logical, default FALSE. If TRUE, ORFik will add additional columns:\cr
LIBRARYTYPE: (is this Ribo-seq or mRNA-seq, CAGE etc), \cr
REPLICATE: (is this replicate 1, 2 etc),\cr
STAGE: (Which time point, cell line or tissue is this, HEK293, TCP-1, 24hpf etc),\cr
CONDITION: (is this Wild type control or a mutant etc).\cr
These values are only qualified guesses from the metadata, so always double check!}

\item{abstract}{character, default "printsave". If an abstract for the project exists,
print and save it (the file is saved to the same directory as the runinfo).
Alternatives:\cr
"print": only print it the first time it is downloaded; it cannot be printed later.\cr
"save": save it, do not print.\cr
"no": skip download of the abstract.}

\item{force}{logical, default FALSE. If TRUE, will redownload
all files needed even if they already exist. Useful if you want
auto detection but already downloaded without it.}

\item{rich.format}{logical, default FALSE. If TRUE, will fetch all Experiment and Sample attributes.
This means that different studies can have different sets of columns.}

\item{fetch_GSE}{logical, default FALSE. If TRUE, search for the GSE of the study and
append a column called GEO. The column is included even if the study is not from GEO;
all its values are then NA.}
}
\value{
a data.table of the metadata, 1 row per sample,
 with the SRR run number given in the 'Run' column.
}
\description{
Given an experiment identifier, query information from different locations of SRA
to get a complete metadata table of the experiment. It first finds the Runinfo for each
library, then the sample info. If a pubmed id is not found, it searches for one,
and then searches for the author through pubmed.
}
\details{
A common problem is that the project is not linked to an article; you will then not
get a pubmed id.

The algorithm works like this:\cr
If a GEO identifier is given, find the SRP.\cr
Then search Entrez for the project and get the sample identifier.\cr
From that, extract the run information and collect it into a final table.\cr
}
\examples{
## Originally on SRA
download.SRA.metadata("SRP226389")
## Now try with auto detection (guessing additional library info)
## Need to specify output dir as tempfile() to re-download
#download.SRA.metadata("SRP226389", tempfile(), auto.detect = TRUE)
## Originally on ENA (RCP-seq data)
# download.SRA.metadata("ERP116106")
## Originally on GEO (GSE) (save to directory to keep info with fastq files)
# download.SRA.metadata("GSE61011")
## Bioproject ID
# download.SRA.metadata("PRJNA231536")
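## A minimal sketch of using the result, not run since it needs network access.
## Assumptions: the returned table has the 'Run' column described under Value,
## and the saved file follows the "SraRunInfo_SRP.csv" naming described for outdir.
# out <- tempfile()
# meta <- download.SRA.metadata("SRP226389", outdir = out)
# head(meta$Run)
# file.exists(file.path(out, "SraRunInfo_SRP226389.csv"))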
}
\references{
doi: 10.1093/nar/gkq1019
}
\seealso{
Other sra: 
\code{\link{browseSRA}()},
\code{\link{download.SRA}()},
\code{\link{download.ebi}()},
\code{\link{get_bioproject_candidates}()},
\code{\link{install.sratoolkit}()},
\code{\link{rename.SRA.files}()}
}
\concept{sra}
