|   | infoassembly | 
Please help by correcting and extending the Wiki pages.
The initial release is a basic version. More output can be added. Please contact the authors with suggestions.
| % infoassembly sam::samspec1.4example.sam stdout -auto Read distribution: ------------------ READ_CATEGORY #READS Av_Read_Length all reads 6 10.33 first of pair 1 9.00 second of pair 1 17.00 pair 2 13.00 unpaired 4 9.00 Contig stats: ------------- CONTIG LENGTH Mean_QUAL #READS Max DEPTH Mean_DEPTH %GC ref 45 0.00 6 0 0.00 0.00 | 
Go to the input files for this example
Example 2
| % infoassembly sam::xxx.sam stdout -auto -qual qualsdist.txt Phred encoding: Sanger / Illumina 1.9 Read distribution: ------------------ READ_CATEGORY #READS Av_Read_Length all reads 2 35.00 first of pair 1 35.00 second of pair 1 35.00 pair 2 35.00 unpaired 0 Contig stats: ------------- CONTIG LENGTH Mean_QUAL #READS Max DEPTH Mean_DEPTH %GC chr20 62435964 25.36 2 2 1.97 0.00 | 
Go to the input files for this example
Go to the output files for this example
Example 3
| % infoassembly sam::../data/samspec1.4example.ref.fasta Read distribution: ------------------ READ_CATEGORY #READS Av_Read_Length all reads 6 10.33 first of pair 1 9.00 second of pair 1 17.00 pair 2 13.00 unpaired 4 9.00 Contig stats: ------------- CONTIG LENGTH Mean_QUAL #READS Max DEPTH Mean_DEPTH %GC ref 45 0.00 6 3 1.82 46.67 | 
Go to the input files for this example
Example 4
| % infoassembly sam::../data/samspec1.4example.ref.fasta -gc gcbias.txt Read distribution: ------------------ READ_CATEGORY #READS Av_Read_Length all reads 6 10.33 first of pair 1 9.00 second of pair 1 17.00 pair 2 13.00 unpaired 4 9.00 Contig stats: ------------- CONTIG LENGTH Mean_QUAL #READS Max DEPTH Mean_DEPTH %GC ref 45 0.00 6 3 1.82 46.67 | 
Go to the output files for this example
| 
Display information about assemblies
Version: EMBOSS:6.6.0.0
   Standard (Mandatory) qualifiers:
  [-assembly]          assembly   (no help text) assembly value
  [-outassembly]       outassembly (no help text) outassembly value
  [-gcbiasmetricsoutfile] outfile    [*.infoassembly] GC bias metrics
  [-qualvaluesdistoutfile] outfile    [*.infoassembly] Distribution of quality
                                  values for the reads
   Additional (Optional) qualifiers:
   -refsequence        seqset     Reference sequences in the assembly
   -windowsize         integer    [100] The size of windows on the genome that
                                  are used to bin reads. (Any integer value)
   -bisulfite          boolean    [N] If this is true, it is assumed that the
                                  reads were bisulfite treated
   Advanced (Unprompted) qualifiers: (none)
   Associated qualifiers:
   "-assembly" associated qualifiers
   -cbegin1            integer    Start of the contig/consensus sequences
   -cend1              integer    End of the contig/consensus sequences
   -iformat1           string     Input assembly format
   -iquery1            string     Input query fields or ID list
   -ioffset1           integer    Input start position offset
   -idbname1           string     User-provided database name
   "-refsequence" associated qualifiers
   -sbegin             integer    Start of each sequence to be used
   -send               integer    End of each sequence to be used
   -sreverse           boolean    Reverse (if DNA)
   -sask               boolean    Ask for begin/end/reverse
   -snucleotide        boolean    Sequence is nucleotide
   -sprotein           boolean    Sequence is protein
   -slower             boolean    Make lower case
   -supper             boolean    Make upper case
   -scircular          boolean    Sequence is circular
   -squick             boolean    Read id and sequence only
   -sformat            string     Input sequence format
   -iquery             string     Input query fields or ID list
   -ioffset            integer    Input start position offset
   -sdbname            string     Database name
   -sid                string     Entryname
   -ufo                string     UFO features
   -fformat            string     Features format
   -fopenfile          string     Features file name
   "-outassembly" associated qualifiers
   -odirectory2        string     Output directory
   -oformat2           string     Assembly output format
   "-gcbiasmetricsoutfile" associated qualifiers
   -odirectory3        string     Output directory
   "-qualvaluesdistoutfile" associated qualifiers
   -odirectory4        string     Output directory
   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit
 | 
| Qualifier | Type | Description | Allowed values | Default | 
|---|---|---|---|---|
| Standard (Mandatory) qualifiers | ||||
| [-assembly] (Parameter 1) | assembly | (no help text) assembly value | Assembly of sequence reads | |
| [-outassembly] (Parameter 2) | outassembly | (no help text) outassembly value | Assembly of sequence reads | |
| [-gcbiasmetricsoutfile] (Parameter 3) | outfile | GC bias metrics | Output file | <*>.infoassembly | 
| [-qualvaluesdistoutfile] (Parameter 4) | outfile | Distribution of quality values for the reads | Output file | <*>.infoassembly | 
| Additional (Optional) qualifiers | ||||
| -refsequence | seqset | Reference sequences in the assembly | Readable set of sequences | Required | 
| -windowsize | integer | The size of windows on the genome that are used to bin reads. | Any integer value | 100 | 
| -bisulfite | boolean | If this is true, it is assumed that the reads were bisulfite treated | Boolean value Yes/No | No | 
| Advanced (Unprompted) qualifiers | ||||
| (none) | ||||
| Associated qualifiers | ||||
| "-assembly" associated assembly qualifiers | ||||
| -cbegin1 -cbegin_assembly | integer | Start of the contig/consensus sequences | Any integer value | 0 | 
| -cend1 -cend_assembly | integer | End of the contig/consensus sequences | Any integer value | 0 | 
| -iformat1 -iformat_assembly | string | Input assembly format | Any string | |
| -iquery1 -iquery_assembly | string | Input query fields or ID list | Any string | |
| -ioffset1 -ioffset_assembly | integer | Input start position offset | Any integer value | 0 | 
| -idbname1 -idbname_assembly | string | User-provided database name | Any string | |
| "-refsequence" associated seqset qualifiers | ||||
| -sbegin | integer | Start of each sequence to be used | Any integer value | 0 | 
| -send | integer | End of each sequence to be used | Any integer value | 0 | 
| -sreverse | boolean | Reverse (if DNA) | Boolean value Yes/No | N | 
| -sask | boolean | Ask for begin/end/reverse | Boolean value Yes/No | N | 
| -snucleotide | boolean | Sequence is nucleotide | Boolean value Yes/No | N | 
| -sprotein | boolean | Sequence is protein | Boolean value Yes/No | N | 
| -slower | boolean | Make lower case | Boolean value Yes/No | N | 
| -supper | boolean | Make upper case | Boolean value Yes/No | N | 
| -scircular | boolean | Sequence is circular | Boolean value Yes/No | N | 
| -squick | boolean | Read id and sequence only | Boolean value Yes/No | N | 
| -sformat | string | Input sequence format | Any string | |
| -iquery | string | Input query fields or ID list | Any string | |
| -ioffset | integer | Input start position offset | Any integer value | 0 | 
| -sdbname | string | Database name | Any string | |
| -sid | string | Entryname | Any string | |
| -ufo | string | UFO features | Any string | |
| -fformat | string | Features format | Any string | |
| -fopenfile | string | Features file name | Any string | |
| "-outassembly" associated outassembly qualifiers | ||||
| -odirectory2 -odirectory_outassembly | string | Output directory | Any string | |
| -oformat2 -oformat_outassembly | string | Assembly output format | Any string | |
| "-gcbiasmetricsoutfile" associated outfile qualifiers | ||||
| -odirectory3 -odirectory_gcbiasmetricsoutfile | string | Output directory | Any string | |
| "-qualvaluesdistoutfile" associated outfile qualifiers | ||||
| -odirectory4 -odirectory_qualvaluesdistoutfile | string | Output directory | Any string | |
| General qualifiers | ||||
| -auto | boolean | Turn off prompts | Boolean value Yes/No | N | 
| -stdout | boolean | Write first file to standard output | Boolean value Yes/No | N | 
| -filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N | 
| -options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N | 
| -debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N | 
| -verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y | 
| -help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N | 
| -warning | boolean | Report warnings | Boolean value Yes/No | Y | 
| -error | boolean | Report errors | Boolean value Yes/No | Y | 
| -fatal | boolean | Report fatal errors | Boolean value Yes/No | Y | 
| -die | boolean | Report dying program messages | Boolean value Yes/No | Y | 
| -version | boolean | Report version number and exit | Boolean value Yes/No | N | 
The input is a standard EMBOSS assembly query.
The major assembly sources are files in SAM and BAM format. infoassembly reads a sequnece assembly.
| @CO example from SAM format specification v1.4 @HD VN:1.3 SO:coordinate @SQ SN:ref LN:45 r001 163 ref 7 30 8M2I4M1D3M = 37 39 TTAGATAAAGGATACTG * r002 0 ref 9 30 3S6M1P1I4M * 0 0 AAAAGATAAGGATA * r003 0 ref 9 30 5H6M * 0 0 AGCTAA * NM:i:1 r004 0 ref 16 30 6M14N5M * 0 0 ATAGCTTCAGC * r003 16 ref 29 30 6H5M * 0 0 TAGGC * NM:i:0 r001 83 ref 37 30 9M = 7 -39 CAGCGCCAT * | 
| @HD VN:1.0 @SQ SN:chr20 LN:62435964 @RG ID:L1 PU:SC_1_10 LB:SC_1 SM:NA12891 @RG ID:L2 PU:SC_2_12 LB:SC_2 SM:NA12891 read_28833_29006_6945 99 chr20 28833 20 10M1D25M = 28993 195 AGCTTAGCTAGCTACCTATATCTTGGTCTTGGCCG <<<<<<<<<<<<<<<<<<<<<:<9/,&,22;;<<< NM:i:1 RG:Z:L1 read_28701_28881_323b 147 chr20 28834 30 35M = 28701 -168 ACCTATATCTTGGCCTTGGCCGATGCGGCCTTGCA <<<<<;<<<<7;:<<<6;<<<<<<<<<<<<7<<<< MF:i:18 RG:Z:L2 | 
| >ref AGCATGTTAGATAAGATAGCTGTGCTAGTAGGCAGTCAGCGCCAT | 
| #QUALITY: Phred quality score #BASES: Number of bases with the quality score QUALITY #BASES 5 1 11 2 14 1 17 2 21 1 22 2 24 1 25 2 26 5 27 53 | 
| # GC bias metrics as described in GcBiasDetailMetrics class # in picard project (picard.sf.net). # The mean base quality are determined via the error rate # of all bases of all reads that are assigned to windows of a GC. GC WINDOWS READ_STARTS MEAN_BASE_QUALITY NORMALIZED_COVERAGE 20 5 1 8 1.13333 30 5 2 10 2.26667 40 4 0 0 0 50 11 1 0 0.515152 60 4 1 0 1.41667 70 5 0 0 0 | 
| Program name | Description | 
|---|---|
| assemblyget | Get assembly of sequence reads | 
Please report all bugs to the EMBOSS bug team (emboss-bug © emboss.open-bio.org) not to the original author.