illuminaBarcodeDist.py
Given a FASTQ file containing unmatched reads, tabulates the frequency at which each barcode is present. The first line of each FASTQ record must be in the standard Illumina format: @<instrument>:<run number>:<flowcell ID>:<lane>:<tile>:<x-pos>:<y-pos>[:<UMI>] <read>:<is filtered>:<control number>:<index sequence> where anything in [] is optional. The output file has three tab-delimited fields: 1) Barcode 2) Freq 3) Relative_Freq%
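As an illustration of the header layout described above, the following is a minimal sketch of how such a line could be split into its fields. The names `HEADER_RE` and `parse_header` are hypothetical and are not taken from illuminaBarcodeDist.py; the script's own parsing may differ.

```python
import re

# Hypothetical pattern for the header layout described above.
# The UMI field is optional; the index sequence is the last field.
HEADER_RE = re.compile(
    r"^@(?P<instrument>[^:\s]+):(?P<run>[^:\s]+):(?P<flowcell>[^:\s]+):"
    r"(?P<lane>[^:\s]+):(?P<tile>[^:\s]+):(?P<x>[^:\s]+):(?P<y>[^:\s]+)"
    r"(?::(?P<umi>[^:\s]+))?"
    r"\s+(?P<read>[^:\s]+):(?P<filtered>[^:\s]+):(?P<control>[^:\s]+):"
    r"(?P<index>\S+)$"
)


def parse_header(line):
    """Return the header fields as a dict; raise ValueError if malformed."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError("not a standard Illumina header: %r" % line)
    return match.groupdict()


if __name__ == "__main__":
    # Made-up example header, for illustration only.
    fields = parse_header("@M00123:45:000000000-ABCDE:1:1101:15589:1332 1:N:0:ATCACG")
    print(fields["index"])
```

In this sketch the barcode is whatever follows the last colon of the second group, so dual-index values such as `ATCACG+TTAGGC` are kept as a single string.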
usage: illuminaBarcodeDist.py [-h] -i INFILE [-o OUTFILE] [-s SAMPLE_SIZE] [-v]
Named Arguments
| -i, --infile | Input FASTQ file. May be gzip’d with a .gz extension. |
| -o, --outfile | Output file name for the barcode histogram. Default is the --infile name with the added extension ‘_barcodeHist.txt’. |
| -s, --sample-size | The number of reads to use to create the distribution, taken from the start of the file. Default: 100000 |
| -v, --version | show program’s version number and exit |
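
The tabulation described by these options can be sketched roughly as follows. This is an assumption-laden illustration, not the script's actual code: the function name `barcode_distribution` is hypothetical, the barcode is taken to be the text after the last colon of each header line, and sorting by descending frequency is a choice made for the example.

```python
import gzip
from collections import Counter
from itertools import islice


def barcode_distribution(path, sample_size=100000):
    """Count index sequences in the first `sample_size` FASTQ records.

    Returns a list of (barcode, freq, relative_freq_percent) tuples.
    """
    # Assumption: a .gz extension means the file is gzip-compressed.
    opener = gzip.open if path.endswith(".gz") else open
    counts = Counter()
    with opener(path, "rt") as handle:
        # Each FASTQ record spans four lines; the header is the first.
        headers = islice(handle, 0, None, 4)
        for header in islice(headers, sample_size):
            # Assumption: the index sequence follows the last colon.
            barcode = header.rstrip().rsplit(":", 1)[-1]
            counts[barcode] += 1
    total = sum(counts.values())
    return [(bc, n, 100.0 * n / total) for bc, n in counts.most_common()]


def write_histogram(rows, out_path):
    """Write the three tab-delimited fields, one barcode per line."""
    with open(out_path, "w") as out:
        out.write("Barcode\tFreq\tRelative_Freq%\n")
        for barcode, freq, rel in rows:
            out.write("%s\t%d\t%.2f\n" % (barcode, freq, rel))
```

Reading only the first `sample_size` records keeps the run time bounded on large files, at the cost of assuming the start of the file is representative of the whole.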