FastqFile-class {ShortRead}    R Documentation

Sampling and streaming records from fastq files

Description

FastqFile represents a path and connection to a fastq file. FastqFileList is a list of such connections.

FastqSampler draws a subsample from a fastq file. yield is the method used to extract the sample from the FastqSampler instance; see Examples.

FastqStreamer draws successive subsets from a fastq file. Repeated calls to yield eventually return the entire file; see Examples.

Usage

## FastqFile and FastqFileList
FastqFile(con, ...)
FastqFileList(...)
## S3 method for class 'ShortReadFile'
open(con, ...)
## S3 method for class 'ShortReadFile'
close(con, ...)
## S4 method for signature 'FastqFile'
readFastq(dirPath, pattern=character(), ...)

## FastqSampler and FastqStreamer
FastqSampler(con, n=1e6, readerBlockSize=1e8, verbose=FALSE)
FastqStreamer(con, n=1e6, readerBlockSize=1e8, verbose=FALSE)
yield(x, ...)

Arguments

con, dirPath

A character string giving the path to a fastq file, or (for con) an R connection (e.g., file, gzfile).

n

For FastqSampler, the size of the sample (number of records) to be drawn. For FastqStreamer, the number of records returned by each call to yield.

readerBlockSize

The number of bytes or characters to be read at one time; smaller readerBlockSize reduces memory requirements but is less efficient.

verbose

A logical value; if TRUE, display progress.

x

An instance from the FastqSampler or FastqStreamer class.

...

Additional arguments. For FastqFileList, this can either be a single character vector of paths to fastq files, or several instances of FastqFile objects.

pattern

Ignored.
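
The arguments above can be supplied by name. The following is a minimal sketch that reuses the example file from the Examples section; the readerBlockSize value of 1e5 is an arbitrary illustrative choice, not a recommendation, and the requireNamespace guard is added here only so the block does nothing when ShortRead is not installed.

```r
## Guarded so the block is a no-op when ShortRead is not installed.
if (requireNamespace("ShortRead", quietly = TRUE)) {
    library(ShortRead)
    sp <- SolexaPath(system.file("extdata", package = "ShortRead"))
    fl <- file.path(analysisPath(sp), "s_1_sequence.txt")

    ## n: records returned per call to yield
    ## readerBlockSize: amount of input read at one time
    ## (1e5 is an illustrative value chosen for this sketch)
    f <- FastqStreamer(fl, n = 50, readerBlockSize = 1e5, verbose = FALSE)
    chunk <- yield(f)
    print(length(chunk))
}
```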

Objects from the class

Available classes include:

FastqFile

A file path and connection to a fastq file.

FastqFileList

A list of FastqFile instances.

FastqSampler

Uniformly sample records from a fastq file.

FastqStreamer

Iterate over a fastq file, returning successive parts of the file.
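
The idea of returning successive parts of a file can be sketched in base R. This is an illustration of the concept only, not ShortRead's implementation; it assumes every record occupies exactly four lines, and the temporary file and record names are made up for the example.

```r
## Write a small five-record fastq file to a temporary location.
fq <- tempfile(fileext = ".fastq")
writeLines(c(
    "@r1", "ACGT", "+", "IIII",
    "@r2", "GGCA", "+", "IIII",
    "@r3", "TTAG", "+", "IIII",
    "@r4", "CCAT", "+", "IIII",
    "@r5", "AACC", "+", "IIII"
), fq)

## An open connection remembers its position, so each readLines()
## call continues where the previous one stopped.
con <- file(fq, open = "r")
n <- 2                           # records per chunk
sizes <- numeric()
while (length(lines <- readLines(con, n = 4 * n))) {
    sizes <- c(sizes, length(lines) / 4)   # records in this chunk
}
close(con)
unlink(fq)

sizes                            # 2 2 1
```

The final chunk is shorter than n when the number of records is not a multiple of n, and the loop ends once a read returns nothing.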

Methods

The following methods are available to users:

readFastq,FastqFile-method:

See also ?readFastq.

writeFastq,ShortReadQ,FastqFile-method:

See also ?writeFastq, ?"writeFastq,ShortReadQ,FastqFile-method".

yield:

Draw a single sample (FastqSampler) or the next subset of records (FastqStreamer) from the instance. For a FastqSampler, this requires that the underlying data (e.g., file) represented by the instance be visited; this may be time-consuming.
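
One way to see why drawing a uniform sample requires visiting the underlying data is reservoir sampling, sketched below in base R. This is an assumption made for illustration; this page does not state which algorithm ShortRead uses. The function name reservoir_sample and the record identifiers are hypothetical.

```r
## Reservoir sampling (Algorithm R): one pass over the records,
## keeping a uniform random sample of size n.
reservoir_sample <- function(records, n) {
    ## Start with the first n records (or all of them if fewer).
    reservoir <- records[seq_len(min(n, length(records)))]
    if (length(records) > n) {
        for (i in (n + 1):length(records)) {
            j <- sample.int(i, 1)      # uniform on 1..i
            ## Record i replaces a reservoir slot with probability n / i.
            if (j <= n) reservoir[j] <- records[i]
        }
    }
    reservoir
}

set.seed(1)
ids <- sprintf("read_%03d", 1:256)     # stand-ins for fastq records
s <- reservoir_sample(ids, 50)
length(s)
```

Every record must be examined once, because any record, including the last, may enter the sample.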

See Also

readFastq, writeFastq, yield.

Examples

sp <- SolexaPath(system.file('extdata', package='ShortRead'))
fl <- file.path(analysisPath(sp), "s_1_sequence.txt")

f <- FastqFile(fl)
rfq <- readFastq(f)

f <- FastqSampler(fl, 50)
yield(f)    # sample of size n=50
yield(f)    # independent sample of size 50

f <- FastqStreamer(fl, 50)
yield(f)    # records 1 to 50
yield(f)    # records 51 to 100

## iterating over an entire file
f <- FastqStreamer(fl, 50)
while (length(fq <- yield(f))) {
    ## do work here
    print(length(fq))
}

## Internal fields, methods, and help; for developers
ShortRead:::.FastqSampler_g$methods()
ShortRead:::.FastqSampler_g$fields()
ShortRead:::.FastqSampler_g$help("yield")


[Package ShortRead version 1.12.4 Index]