% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/aggregate_duplicates.R
\docType{methods}
\name{aggregate_duplicates}
\alias{aggregate_duplicates}
\alias{aggregate_duplicates,SummarizedExperiment-method}
\alias{aggregate_duplicates,RangedSummarizedExperiment-method}
\title{Aggregates multiple counts from the same samples (e.g., from isoforms), concatenates other character columns, and averages other numeric columns}
\usage{
aggregate_duplicates(
  .data,
  .transcript = NULL,
  feature = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE,
  ...
)

\S4method{aggregate_duplicates}{SummarizedExperiment}(
  .data,
  .transcript = NULL,
  feature = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE,
  ...
)

\S4method{aggregate_duplicates}{RangedSummarizedExperiment}(
  .data,
  .transcript = NULL,
  feature = NULL,
  .abundance = NULL,
  aggregation_function = sum,
  keep_integer = TRUE,
  ...
)
}
\arguments{
\item{.data}{A `tbl` (with at least three columns for sample, feature and transcript abundance) or `SummarizedExperiment` (more convenient if abstracted to tibble with library(tidySummarizedExperiment))}

\item{.transcript}{Deprecated; use `feature` instead. The name of the transcript/gene column}

\item{feature}{The name of the feature column as a character string}

\item{.abundance}{The name of the transcript/gene abundance column}

\item{aggregation_function}{A function used to aggregate counts (e.g., sum, median, or mean)}

\item{keep_integer}{A boolean. Whether to force the aggregated counts to integer}

\item{...}{Additional arguments passed to the aggregation function}
}
\value{
An object of the same class as the input, with aggregated transcript abundance and annotation

A `SummarizedExperiment` object
}
\description{
aggregate_duplicates() takes as input a `tbl` (with at least three columns for sample, feature and transcript abundance) or a `SummarizedExperiment` (more convenient if abstracted to tibble with library(tidySummarizedExperiment)) and returns an object of the same class as the input, in which duplicated transcripts have been aggregated.
}
\details{
`r lifecycle::badge("maturing")`

This function aggregates duplicated transcripts (e.g., isoforms, Ensembl identifiers).
For example, converting Ensembl identifiers to gene/transcript symbols often produces
 duplicates that have to be resolved. `aggregate_duplicates` takes a tibble
 and column names (as symbols; for `sample`, `transcript` and `count`) as arguments and
 returns a tibble in which transcripts sharing the same name are aggregated. All remaining columns
 are appended, with factor and boolean columns appended as character.

 Underlying custom method:
\preformatted{
data |>
  filter(n_aggr > 1) |>
  group_by(!!.sample, !!.transcript) |>
  dplyr::mutate(!!.abundance := !!.abundance |> aggregation_function())
}
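
 The aggregation idea can be illustrated on a toy table. This is only a minimal
 sketch in base R, not the package's implementation: the data frame `counts` and
 its values are made up for illustration, and `stats::aggregate` stands in for
 the grouped summary.

\preformatted{
# Hypothetical toy data: two TP53 entries per sample (e.g., two isoforms)
counts <- data.frame(
  sample  = c("s1", "s1", "s1", "s2", "s2", "s2"),
  feature = c("TP53", "TP53", "ACTB", "TP53", "TP53", "ACTB"),
  count   = c(10L, 5L, 7L, 3L, 4L, 8L)
)

# Sum the counts of rows that share the same sample and feature
aggregated <- stats::aggregate(
  count ~ sample + feature,
  data = counts,
  FUN = sum
)
}

 In this sketch the two `TP53` rows of sample `s1` (counts 10 and 5) collapse
 into a single row with count 15, leaving one row per sample and feature.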
}
\examples{
## Load the airway dataset for the examples
data("airway", package = "airway")

# Ensure a 'condition' column exists for examples that expect it
SummarizedExperiment::colData(airway)$condition <-
  SummarizedExperiment::colData(airway)$dex

# Create an aggregation column from the row names
SummarizedExperiment::rowData(airway)$gene_name <- rownames(airway)

aggregate_duplicates(
  airway,
  feature = "gene_name"
)
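
# A further sketch, assuming the 'gene_name' column created above:
# aggregate with the median instead of the default sum, and keep the
# aggregated values as they are rather than forcing them to integer.
# The object name 'aggregated_median' is chosen here for illustration only.
aggregated_median <- aggregate_duplicates(
  airway,
  feature = "gene_name",
  aggregation_function = median,
  keep_integer = FALSE
)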
}
\references{
Mangiola, S., Molania, R., Dong, R., Doyle, M. A., & Papenfuss, A. T. (2021). tidybulk: an R tidy framework for modular transcriptomic data analysis. Genome Biology, 22(1), 42. doi:10.1186/s13059-020-02233-7

Lawrence, M., Huber, W., Pagès, H., Aboyoun, P., Carlson, M., Gentleman, R., Morgan, M. T., & Carey, V. J. (2013). Software for computing and annotating genomic ranges. PLoS Computational Biology, 9(8), e1003118. doi:10.1371/journal.pcbi.1003118

Wickham, H., François, R., Henry, L., Müller, K., & Vaughan, D. (2023). dplyr: A Grammar of Data Manipulation. R package version 1.1.0. https://CRAN.R-project.org/package=dplyr
}
