---
title: 'NxtIRFdata: a data package for SpliceWiz'
author: "Alex Chit Hei Wong"
date: "`r format(Sys.Date(), '%m/%d/%Y')`"
abstract: >    
    NxtIRFdata is a data package containing ready-made BED files of Mappability
    exclusion genomic regions. It also contains a fully-functioning example
    data set with a "mock" genome and genome annotation to demonstrate the
    functionalities of SpliceWiz, a powerful and interactive analysis and
    visualisation tool for alternative splicing.
output:
    rmarkdown::html_document:
        highlight: pygments
        toc: true
        toc_float: true
vignette: >
    %\VignetteIndexEntry{NxtIRFdata: a data package for NxtIRF}
    %\VignetteEngine{knitr::rmarkdown}
    %\usepackage[utf8]{inputenc}
---

# 0 Important announcement

NxtIRFcore's full functionality (plus more) will be replaced by SpliceWiz in
Bioconductor version 3.16 onwards!

# 1 Installation

To install this package, start R (version "4.2.1") and enter: 

```{r eval=FALSE}
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("NxtIRFdata")
```

Start using NxtIRFdata:

```{r}
library(NxtIRFdata)
```

# 2 Obtaining the example NxtIRF genome and gene annotation files

Examples in SpliceWiz are demonstrated using an artificial genome and gene
annotation. A synthetic reference, with genome sequence (FASTA) and gene 
annotation (GTF) files are provided, based on the genes SRSF1, SRSF2, SRSF3, 
TRA2A, TRA2B, TP53 and NSUN5. These genes, each with an additional 100 flanking 
nucleotides, were used to construct an artificial "chromosome Z" (chrZ). 
Gene annotations, based on release-94 of Ensembl GRCh38 (hg38), were modified 
with genome coordinates corresponding to this artificial chromosome.

These files can be accessed as follows:

```{r}
example_fasta = chrZ_genome()
example_gtf = chrZ_gtf()
```

# 3 Obtaining the example BAM file dataset for SpliceWiz

The set of 6 BAM files used in the SpliceWiz vignette / example code can be
downloaded to a path of the user's choice using the following function:

```{r, results = FALSE, message=FALSE}
bam_paths = example_bams(path = tempdir())
```

Note that this downloads BAM files and not their respective BAI (BAM file 
indices). This is because SpliceWiz reads BAM files natively and does not 
require RSamtools. BAI files are provided with BAM files in their respective
ExperimentHub entries for users wishing to view these files using RSamtools.

# 4 Obtaining the Mappability Exclusion BED files

NxtIRFdata retrieves the relevant records from AnnotationHub and makes a local
copy of the BED file. This BED file is used to produce Mappability Exclusion
information to SpliceWiz.

Note that this function is intended to be called internally by SpliceWiz. Users
interested in the format or nature of the Mappability BED file can call this
function to examine the contents of the BED file

```{r, results = FALSE, message=FALSE}

# To get the MappabilityExclusion for hg38 as a GRanges object
gr = get_mappability_exclusion(genome_type = "hg38", as_type = "GRanges")

# To get the MappabilityExclusion for hg38 as a locally-copied gzipped BED file
bed_path = get_mappability_exclusion(genome_type = "hg38", as_type = "bed.gz",
	path = tempdir())

# Other `genome_type` values include "hg19", "mm10", and "mm9"
```

# 5 Accessing NxtIRFdata via ExperimentHub

The data deposited in ExperimentHub can be accessed as follows:

```{r results = FALSE, message=FALSE}
library(ExperimentHub)
```

```{r eval = FALSE}
eh = ExperimentHub()
NxtIRF_hub = query(eh, "NxtIRF")

NxtIRF_hub

temp = eh[["EH6792"]]
temp

temp = eh[["EH6787"]]
temp
```

# 5 Information about the example BAM files

For more information about the example BAM files, refer to the NxtIRFdata
package documentation:

```{r eval=FALSE}
?`NxtIRFdata-package`
```

# SessionInfo

```{r}
sessionInfo()
```