extract_barcodes.py

Extracts FASTQ records matching the specified barcodes from a pair of unmatched read FASTQ files. For each specified barcode, the matching records are written to a new pair of FASTQ files. All output files are compressed with gzip. Both input FASTQ files (the forward and reverse read files) are expected to be sorted by read name and to have no missing mates; otherwise an Exception is raised.
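The pairing check and barcode match described above can be sketched roughly as follows. This is an illustration only, not the script's actual implementation: the helper names (read_fastq, read_name, barcode_of, matching_pairs) are hypothetical, and the sketch assumes Illumina-style headers in which the barcode is the last colon-separated field of the header line.

```python
import gzip
import itertools


def read_fastq(path):
    """Yield (header, sequence, separator, quality) tuples from a FASTQ file."""
    # Assumption: inputs may be plain text or gzip-compressed.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        while True:
            record = tuple(handle.readline().rstrip("\n") for _ in range(4))
            if not record[0]:
                return
            yield record


def read_name(header):
    # Assumption: the read name is the first whitespace-delimited token,
    # with any trailing /1 or /2 mate suffix removed.
    name = header.split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def barcode_of(header):
    # Assumption: Illumina-style header ending in ":<barcode>".
    return header.rsplit(":", 1)[-1]


def matching_pairs(r1_path, r2_path, barcodes):
    """Yield (barcode, forward_record, reverse_record) for the wanted barcodes."""
    wanted = set(barcodes)
    mates = itertools.zip_longest(read_fastq(r1_path), read_fastq(r2_path))
    for rec1, rec2 in mates:
        # One file ran out before the other: a mate is missing.
        if rec1 is None or rec2 is None:
            raise Exception("missing mate: inputs differ in record count")
        # Both files must list the same read at the same position.
        if read_name(rec1[0]) != read_name(rec2[0]):
            raise Exception(f"read names out of sync: {rec1[0]} vs {rec2[0]}")
        barcode = barcode_of(rec1[0])
        if barcode in wanted:
            yield barcode, rec1, rec2
```

Reading both files in lockstep keeps memory use constant, which is why the inputs need to be sorted by read name with no missing mates.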


usage: extract_barcodes.py [-h] --r1 R1 --r2 R2 [--outdir OUTDIR]
                           --outfile-prefix OUTFILE_PREFIX
                           [-b BARCODES [BARCODES ...]]
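
An argparse parser along the following lines would accept the arguments shown in the usage above. This is a sketch under assumptions, not the script's source: the function name build_parser is hypothetical, and the help strings are abbreviated.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser(prog="extract_barcodes.py")
    parser.add_argument("--r1", required=True,
                        help="FASTQ file containing the forward reads.")
    parser.add_argument("--r2", required=True,
                        help="FASTQ file containing the reverse reads.")
    # Optional; falls back to the current working directory.
    parser.add_argument("--outdir", default=os.getcwd(),
                        help="Existing directory for the output FASTQ files.")
    parser.add_argument("--outfile-prefix", required=True,
                        help="File prefix of each output FASTQ file.")
    # nargs="+" accepts one or more barcodes after a single -b flag.
    parser.add_argument("-b", "--barcodes", nargs="+",
                        help="One or more barcodes to extract.")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

With such a parser, several barcodes are given after a single -b flag, separated by spaces.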

Named Arguments

--r1
FASTQ file containing the forward reads.
--r2
FASTQ file containing the reverse reads.
--outdir
An existing directory in which to write the FASTQ files for the extracted barcodes. Defaults to the current working directory.
--outfile-prefix
The file prefix of each output FASTQ file for a given barcode. The barcode name and the read number are appended to this prefix. For example, setting the outfile prefix to ‘output’ results in ‘output_${barcode}_R1.fastq’ and ‘output_${barcode}_R2.fastq’.
-b, --barcodes
One or more barcodes to extract from the input FASTQ file(s).
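
To make the naming scheme for --outfile-prefix concrete, the following sketch builds the pair of output paths for one barcode. The function name output_paths is hypothetical, and the existence check on --outdir is an assumption about how a pre-existing directory might be enforced. The file names follow the pattern given above exactly; whether the script appends a further extension for the gzip compression is not stated here.

```python
import os


def output_paths(outdir, prefix, barcode):
    """Return the (R1, R2) output paths for a single barcode."""
    # Assumption: a missing output directory is treated as an error
    # rather than being created on the fly.
    if not os.path.isdir(outdir):
        raise FileNotFoundError(f"output directory does not exist: {outdir}")
    return tuple(
        os.path.join(outdir, f"{prefix}_{barcode}_R{read_number}.fastq")
        for read_number in (1, 2)
    )


if __name__ == "__main__":
    for path in output_paths(os.getcwd(), "output", "ACGT"):
        print(path)
```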