|   | splitter | 
splitter splits one or more input sequences into smaller, optionally overlapping, sub-sequences. The sub-sequence size (-size) and overlap (-overlap) may be specified; by default the size is 10,000 bases with no overlap. Optionally (-feature), feature information is used.
Example 1
Split a sequence into sub-sequences of 10,000 bases (the default size) with no overlap between the sub-sequences:
| % splitter tembl:BA000025 ba000025.split Split sequence(s) into smaller sequences | 
Example 2
Split a sequence into sub-sequences of 50,000 bases with an overlap of 3,000 bases on each sub-sequence:
| % splitter tembl:BA000025 ba000025.split -size=50000 -over=3000 Split sequence(s) into smaller sequences | 
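The effect of -size and -overlap on sub-sequence coordinates can be sketched in a few lines of Python. This is a minimal illustration, not splitter's own code: it assumes that each piece is `size` bases long, that the next piece starts `size - overlap` bases after the previous one, and that -addoverlap is left at its default of No. The function name `split_ranges` is hypothetical, and the program's exact handling of the final piece may differ.

```python
def split_ranges(length, size=10000, overlap=0):
    """Return 1-based inclusive (start, end) pairs for each piece.

    Assumption: pieces are `size` long and consecutive pieces start
    `size - overlap` apart; the last piece holds whatever remains.
    """
    if length < 1:
        raise ValueError("length must be 1 or more")
    if size < 1 or overlap < 0:
        raise ValueError("size must be >= 1 and overlap >= 0")
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")

    ranges = []
    start = 0  # 0-based offset of the current piece
    while start + size < length:
        ranges.append((start + 1, start + size))
        start += size - overlap
    ranges.append((start + 1, length))  # final, possibly shorter, piece
    return ranges


# BA000025 is 2229817 bp long (see the ID line of the input file).
pieces = split_ranges(2229817, size=50000, overlap=3000)
for start, end in pieces[:3]:
    print(start, end)

# Bases shared by the first two pieces under the stated assumption.
shared = pieces[0][1] - pieces[1][0] + 1
print("shared bases:", shared)
```

Under these assumptions the first piece covers positions 1 to 50000, which agrees with the name BA000025_1-50000 in the example output, and each later piece re-reads the last 3,000 bases of the piece before it.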
| 
Split sequence(s) into smaller sequences
Version: EMBOSS:6.6.0.0
   Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Sequence(s) filename and optional format, or
                                  reference (input USA)
  [-outseq]            seqoutall  [ | 
| Qualifier | Type | Description | Allowed values | Default | 
|---|---|---|---|---|
| Standard (Mandatory) qualifiers | ||||
| [-sequence] (Parameter 1) | seqall | Sequence(s) filename and optional format, or reference (input USA) | Readable sequence(s) | Required | 
| [-outseq] (Parameter 2) | seqoutall | Sequence set(s) filename and optional format (output USA) | Writeable sequence(s) | <*>.format | 
| Additional (Optional) qualifiers | ||||
| -size | integer | Size to split at | Integer 1 or more | 10000 | 
| -overlap | integer | Overlap between split sequences | Integer 0 or more | 0 | 
| Advanced (Unprompted) qualifiers | ||||
| -feature | boolean | Use feature information | Boolean value Yes/No | No | 
| -addoverlap | boolean | Include overlap in output sequence size | Boolean value Yes/No | No | 
| Associated qualifiers | ||||
| "-sequence" associated seqall qualifiers | ||||
| -sbegin1 -sbegin_sequence | integer | Start of each sequence to be used | Any integer value | 0 | 
| -send1 -send_sequence | integer | End of each sequence to be used | Any integer value | 0 | 
| -sreverse1 -sreverse_sequence | boolean | Reverse (if DNA) | Boolean value Yes/No | N | 
| -sask1 -sask_sequence | boolean | Ask for begin/end/reverse | Boolean value Yes/No | N | 
| -snucleotide1 -snucleotide_sequence | boolean | Sequence is nucleotide | Boolean value Yes/No | N | 
| -sprotein1 -sprotein_sequence | boolean | Sequence is protein | Boolean value Yes/No | N | 
| -slower1 -slower_sequence | boolean | Make lower case | Boolean value Yes/No | N | 
| -supper1 -supper_sequence | boolean | Make upper case | Boolean value Yes/No | N | 
| -scircular1 -scircular_sequence | boolean | Sequence is circular | Boolean value Yes/No | N | 
| -squick1 -squick_sequence | boolean | Read id and sequence only | Boolean value Yes/No | N | 
| -sformat1 -sformat_sequence | string | Input sequence format | Any string | |
| -iquery1 -iquery_sequence | string | Input query fields or ID list | Any string | |
| -ioffset1 -ioffset_sequence | integer | Input start position offset | Any integer value | 0 | 
| -sdbname1 -sdbname_sequence | string | Database name | Any string | |
| -sid1 -sid_sequence | string | Entryname | Any string | |
| -ufo1 -ufo_sequence | string | UFO features | Any string | |
| -fformat1 -fformat_sequence | string | Features format | Any string | |
| -fopenfile1 -fopenfile_sequence | string | Features file name | Any string | |
| "-outseq" associated seqoutall qualifiers | ||||
| -osformat2 -osformat_outseq | string | Output seq format | Any string | |
| -osextension2 -osextension_outseq | string | File name extension | Any string | |
| -osname2 -osname_outseq | string | Base file name | Any string | |
| -osdirectory2 -osdirectory_outseq | string | Output directory | Any string | |
| -osdbname2 -osdbname_outseq | string | Database name to add | Any string | |
| -ossingle2 -ossingle_outseq | boolean | Separate file for each entry | Boolean value Yes/No | N | 
| -oufo2 -oufo_outseq | string | UFO features | Any string | |
| -offormat2 -offormat_outseq | string | Features format | Any string | |
| -ofname2 -ofname_outseq | string | Features file name | Any string | |
| -ofdirectory2 -ofdirectory_outseq | string | Output directory | Any string | |
| General qualifiers | ||||
| -auto | boolean | Turn off prompts | Boolean value Yes/No | N | 
| -stdout | boolean | Write first file to standard output | Boolean value Yes/No | N | 
| -filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N | 
| -options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N | 
| -debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N | 
| -verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y | 
| -help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N | 
| -warning | boolean | Report warnings | Boolean value Yes/No | Y | 
| -error | boolean | Report errors | Boolean value Yes/No | Y | 
| -fatal | boolean | Report fatal errors | Boolean value Yes/No | Y | 
| -die | boolean | Report dying program messages | Boolean value Yes/No | Y | 
| -version | boolean | Report version number and exit | Boolean value Yes/No | N | 
The input is a standard EMBOSS sequence query (also known as a 'USA').
Major sequence database sources defined as standard in EMBOSS installations include srs:embl, srs:uniprot and ensembl.
Data can also be read from sequence output in any supported format written by an EMBOSS or third-party application.
The input format can be specified by using the command-line qualifier -sformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: gff (gff3), gff2, embl (em), genbank (gb, refseq), ddbj, refseqp, pir (nbrf), swissprot (swiss, sw), dasgff and debug.
See: http://emboss.sf.net/docs/themes/SequenceFormats.html for further information on sequence formats.
| 
ID   BA000025; SV 2; linear; genomic DNA; STD; HUM; 2229817 BP.
XX
AC   BA000025; AP000502-AP000521;
XX
DT   09-DEC-2004 (Rel. 82, Created)
DT   17-JUN-2008 (Rel. 96, Last updated, Version 5)
XX
DE   Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region.
XX
KW   .
XX
OS   Homo sapiens (human)
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
OC   Eutheria; Euarchontoglires; Primates; Haplorrhini; Catarrhini; Hominidae;
OC   Homo.
XX
RN   [1]
RP   1-2229817
RA   Hirakawa M., Yamaguchi H., Imai K., Shimada J.;
RT   ;
RL   Submitted (21-AUG-2001) to the INSDC.
RL   Mika Hirakawa, Japan Science and Technology Corporation (JST), Advanced
RL   Databases Department; 5-3, Yonbancho, Chiyoda-ku, Tokyo 102-0081, Japan
RL   (E-mail:mika@tokyo.jst.go.jp, URL:http://www-alis.tokyo.jst.go.jp/,
RL   Tel:81-3-5214-8491, Fax:81-3-5214-8470)
XX
RN   [2]
RA   Shiina S., Tamiya G., Oka A., Inoko H.;
RT   "Homo sapiens 2,229,817bp genomic DNA of 6p21.3 HLA class I region";
RL   Unpublished.
XX
DR   EPD; EP11158; HS_TNF.
DR   EPD; EP11159; HS_LTA.
DR   EPD; EP73522; HS_HLA-B.
DR   EPD; EP73908; HS_GTF2H4.
DR   EPD; EP73940; HS_NEU1.
DR   EPD; EP74013; HS_VARS2.
DR   EPD; EP74203; HS_MRPS18B.
DR   EPD; EP74346; HS_HLA-E.
DR   EPD; EP74389; HS_BAT1.
DR   EPD; EP74485; HS_IER3.
DR   Ensembl-Gn; ENSG00000096155; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000096171; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000111971; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000137310; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000137312; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000137313; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000137331; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000137332; Homo_sapiens.
DR   Ensembl-Gn; ENSG00000137337; Homo_sapiens.
  [Part of this file has been deleted for brevity]
     ttggccccac cccagcatgt ctccaggttc ctctcagccc tggttccttt tggccctgca   2226900
     gtcacaatgg gcaacactgt gacgcaccct gtcctgtgtc acagtgtcat acactcaggc   2226960
     tcacattgcc cctaggccac ttgccagcca agggacatgg ccacattttg tgtcttctgc   2227020
     acctcagcct tgctttcaag tgcaggtgat gatggcaccc acgcagaaca aatgttattt   2227080
     gctatcttcg tcgagtttag tcatccaatt ttccaaccct cactgggcaa ggaagagtgt   2227140
     ggtttccacc aagaaggcag gatgtcagca gtcacagggg caaccaacag ggaaagccgc   2227200
     cggaaaatag accccacagg aagcacaggt gtccagtgga gatgggaacc ctgcagattt   2227260
     gaccgtcttt aagcagatta gagagattac cgttactaac aacttagcca taaaagttta   2227320
     ttagctattt tcaaaaagca taaaattatg taatataatt ttttttaaat ttccatcaat   2227380
     acaaaactaa tctgggcact gcaacttccg gtgggcaact gggataggcg gcatcatcag   2227440
     gaaggcgagc cctgccgtgc cccatgtgcc agtgccccag atggcggcag cctccccaga   2227500
     agcaccttgt atctcccctg cacagggcca gggtcccagc ttcccataca ccttctcctg   2227560
     ctttttcttt tctgtccttt cctttttcaa taaaccacct gcaaaaaggg aaaaccattc   2227620
     tgaggacaag aaacatgtca atgggaaata cacagttgcc agagggtaaa aggccctgtt   2227680
     cattctcatt gaaaagctca ggtatttctg ttaaagtctc tccttttact ttaggatgct   2227740
     gactcctgcg tccatctcaa cctgggcatc gtgccaccac cttcaagaag agaaaaacta   2227800
     agtagtgctt tgcaaagggg cagcagcatt tctcatttct gaccatgtca ggcacatggc   2227860
     catgcagatg agcaggtggg ggacacaggt gagtctccag acctgctctc ctcccacagt   2227920
     acattcttga gtctttttaa acagttgtga aaatgccaca gatgcaagca cctgtgggcc   2227980
     actcccatgg ggaccgttgc acaaggcagt gccactcatt ctcagaacct cctaccatgg   2228040
     gctatgctta gtgacccgag gccaagccaa ggaagacgcc agccacaggg tgccatcctc   2228100
     aggggcatgc tgccagcagg ggcaaagtta tccctagcaa caagatacag aaagaaagaa   2228160
     aaaaggaagg aaatgtagcc aatgggccgg ttcaggttct tgactttgcc acacaaaaga   2228220
     atttgagagc aagtccaaag taaaagtcag caagagaatt tattgcaaag tgaaagtaca   2228280
     ctctgacagc tgatcagagc agctgctcaa aagagagaca gtaccctccc ctcacgggag   2228340
     tcttacatga ttattcatga ataggtggga aggggtattg ttttaagcat gttctgtggt   2228400
     ctcttgaacg tgcatgcact gtggttgtac atatcagcac acacatctta cgtctcatta   2228460
     gcatcttaac ttccctctca gagttgtgtt tgctactatt gtaatgagca taggtcagcc   2228520
     caaggacact attcatgggt ttctgggctt cctcagatgt ggggatgcct cccttggctc   2228580
     ttctacctct ttgctgcagg atgttctaac cacaagccca ggatatggtt tgcgcactgt   2228640
     cgaacagctt gttctctcca tcaacctgac aagtctcttg tttcctttca agggaggctg   2228700
     tgaacaccct atctcactga cctcagaagg acagtacagc agtagccacc atgaccaaaa   2228760
     agatgattcc agaagtgcag gacaactccc tacccagagg ctgtggctgt gcagtaacac   2228820
     accaagaggg gagtccagct ggctctcagg gtgctcacta ccctcatctg ggggcctgga   2228880
     ggacgtcaat tcctgagaac gccacgttct agtgagtaga atgaactgag agatacacag   2228940
     caaagctcca catacttttc cttttctttg tgcccgcagt gttcttcatc agtgtgctct   2229000
     cgcttttcag ctactactgt tggctggctg gaaaaaatag aacaatagta aaaattagag   2229060
     accagtcttt ggtgatgaag agaaatattg gctacttcca gtattttcta gctttggtta   2229120
     tggttgcagt tttccagctc accttgtggg gatgaattca gaaaaaagtt acaaattgaa   2229180
     atgaacatgc cagaagtatt ggctcaaatc aacgttgtcc tattaagcca cttagtgaat   2229240
     caaaagaccg cttgttggac tgttaatctc ggtggccaga gaaaggagct gaagaaggtg   2229300
     ttgccagatc aggaacaaat aattacagcg gcaatagaaa atggaagacc acttgttcat   2229360
     aaccatttga ataagggcaa ggtgtatgga aacacattat gaactgatat tttcagtttt   2229420
     gtttgcaaga aaatgattaa taaggtgaaa taattgaagt atcacggaag atacattaaa   2229480
     aaaaaaaaaa gcctttgtac agtttgctgg agccacagat gtcctactcc agagcagaac   2229540
     aatgcctgaa tcttcagggt ccatttctgc cgcattcact agcaaccaca aatgtgactt   2229600
     aattttactt tggaaataat gcttacccat tgtgagatgc tgtaatatga accatcatta   2229660
     catgttaaca tggcacatgg aattttgagt gtctaagtta catttttaga gttgtttctt   2229720
     agtagccatg tgagtttcca ctccaaaaac acaagctaaa aacttgtttt gagtgaagga   2229780
     catctagggc aaatggtggc tgaaagtgaa tgagatc                            2229817
//
 | 
The output is a standard EMBOSS sequence file.
The results can be output in one of several styles by using the command-line qualifier -osformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: embl, genbank, gff, pir, swiss, dasgff, debug, listfile, dbmotif, diffseq, excel, feattable, motif, nametable, regions, seqtable, simple, srs, table, tagseq.
See: http://emboss.sf.net/docs/themes/SequenceFormats.html for further information on sequence formats.
| >BA000025_1-10000 Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region. gatctccagagcactcttccctgcagggcaccctcccatcccagactccaggcacctggc atgggtggacatctttactttctgggccagcttcagcagagctatgtcatcaccatagaa ctccaggattccctggttctttttggcaaagacatcaaaccctggggagatcaccgcctt ctcaataaggaattctttgccccactgggatttggggtctcctggaaatgatacactaga ttaggctagaccagggctcctgcaggggccagaggctgggtgaggtggtaggatctgtgg cttcaggatcaggaggctggtgcatcccctgccttacccacattgaccctccacagggag tggtcgttgccatcgcggaagcaatgagctgctgtcaggacccattggtcggagatgagg gccccccggcaggtctcttggctcttgggctgcaggggaacaggtgattttcagagattg cagtatgtctggcccatggccgcttttacctctggaatccaagccctgcccctccttcct ggtaccttaatagtgacatgccagggtgtcctctcctggtcagaggcgtttgctgacatg ttccccaccccgcagatggtgtctgtgagcttggagacatctgtgggtgtgaggatcaga tggggaaggaggcaagtgaggggcactgtgtccaggttcccaacacgggcctctggcggg ctcctcaccatcctccccacaccaaggagggcaaagctcactcacccagcatatgttcaa agacctggtgcagagcctttgtgtcctgcagaatgaaggcatgcctctcaccatccttct tggaccctagctcattcagttctctccagtccacatccagcttgcccaccccgatggcat agatgtctggtggggaagagggaaatcaccagactcctgtggctttggggctaccccatg agacaggaggctgtcatctgaaactcactgtgtccaatcaagacctacatgagctggacc cctgcgtcctccccactgctacctgtctgccttcatttcctgccactccctgcccttcac tctcctgcagcacacagcctctttgaagttcctcaaatccataggcatggtcacacctca ggccctttgcccagctgtgcctctgcctagttcactcctcccccccagacttccacatgg ctcactttcgtacctttttaagtcttggctcaaatgtcaccttctcagtgaggccttccc tggtcttcctgtctaaaactgcaatgccccagacaaactttcatccccactttgggaggc aaggtgggaggatcccttgaagccagaagtttgagaccagcctgggcaacatggcaacac cccttagcttgtgtcacctaccacctgctgggttctatggttttcttatcctgtttattc cctgtaatggtggaattgtgtcccccagaaagatgtgttcgagtcctaatccccagtatc tgtgactttatttggaaaaagggtctttgcagatgtaatcaagttaagattaagtcatac tagattagggtgagctctaatccaatgactgaggtccttataagaagaggtaagccagag ccaggcgtggtggctcacacctgtaatcaccaggaggcggtggttgtggtgagccaagat cgcgccattgcactccagcctgggcaacaagagcaaaaccccgtctcaaaaaaaaaaaaa gaagaggtgagccgggcacggtggctcacacctgtaatcccagcactctgggaggctgag gcgggcagatcacgaggtcaggaattcaagaccagcctgaccaacatggtgaaaccctgt 
ctctactaaaaatacaaaaattagccagacatgctggcacacacctgtaatcccagctac tcaggaggctgaggcaggagaatcgcttgaaccgggaggcggatgttgcagtgagccgag attgcaccactgcactccagcctgggcaacagagcaagactccatctcaaaaaaaaaaaa aaaaaaaaaaagtgaactggctgggcatggtggtgactcatgcctgtaatcccggcagtt tttttgaggcgaaggcaggcagatcgccttgaggccaggagtttaagaccagcctagcca acatggcgagaccatgtctctactaaaaatacaaaaatttgccgggcatggtggcacatg cctgtaatcccagcttcttgggagactgaggcacgagaatcacctgaacccaggaggcag aggttacagtgagccgggatcccgccactgcactgcagcctgggcttctgggtgacagag cgagactctgtctcaaacaaatgaacagaaaaagaagaaaggaatttggacacaaagaca caggtagtgggtctcctatctatataagagaacagcatgtaatgacacagaggcacacac agaaaagaaggcgagttgaagacagaggcagagaatgggtttatgctgccgcaagccaag gttggagctgccggcagccggaaaaggcaggaaagaattcttcccaagagccttctgagg aagcacggccctgccaacaccttgatttcagacttctaacctccagaactgtaagaaaaa gaaattctgtgttctaagccacccaggtttgtggtagtttggtaagtacttttaaatgac tgaatgaatagaaagaactcagaacacaacatggaaactaaacctcagatctggtcttcc tctgtaaaaggtagcatctgggagaagggcctaaagccacgttttcccactggaggccct ggacccacacaacaggccgcgcctgtcctccgactgtggtgccagtcagaactgccctca gacagaccacagagtctactcctctcccagcctttgcaccccttgtggcccatttttgtt [Part of this file has been deleted for brevity] cctcggtctgtctccaccaggccctgtgagggtgggtggaggctctctccaagccctcgt ttggccccaccccagcatgtctccaggttcctctcagccctggttccttttggccctgca gtcacaatgggcaacactgtgacgcaccctgtcctgtgtcacagtgtcatacactcaggc tcacattgcccctaggccacttgccagccaagggacatggccacattttgtgtcttctgc acctcagccttgctttcaagtgcaggtgatgatggcacccacgcagaacaaatgttattt gctatcttcgtcgagtttagtcatccaattttccaaccctcactgggcaaggaagagtgt ggtttccaccaagaaggcaggatgtcagcagtcacaggggcaaccaacagggaaagccgc cggaaaatagaccccacaggaagcacaggtgtccagtggagatgggaaccctgcagattt gaccgtctttaagcagattagagagattaccgttactaacaacttagccataaaagttta ttagctattttcaaaaagcataaaattatgtaatataattttttttaaatttccatcaat acaaaactaatctgggcactgcaacttccggtgggcaactgggataggcggcatcatcag gaaggcgagccctgccgtgccccatgtgccagtgccccagatggcggcagcctccccaga agcaccttgtatctcccctgcacagggccagggtcccagcttcccatacaccttctcctg 
ctttttcttttctgtcctttcctttttcaataaaccacctgcaaaaagggaaaaccattc tgaggacaagaaacatgtcaatgggaaatacacagttgccagagggtaaaaggccctgtt cattctcattgaaaagctcaggtatttctgttaaagtctctccttttactttaggatgct gactcctgcgtccatctcaacctgggcatcgtgccaccaccttcaagaagagaaaaacta agtagtgctttgcaaaggggcagcagcatttctcatttctgaccatgtcaggcacatggc catgcagatgagcaggtgggggacacaggtgagtctccagacctgctctcctcccacagt acattcttgagtctttttaaacagttgtgaaaatgccacagatgcaagcacctgtgggcc actcccatggggaccgttgcacaaggcagtgccactcattctcagaacctcctaccatgg gctatgcttagtgacccgaggccaagccaaggaagacgccagccacagggtgccatcctc aggggcatgctgccagcaggggcaaagttatccctagcaacaagatacagaaagaaagaa aaaaggaaggaaatgtagccaatgggccggttcaggttcttgactttgccacacaaaaga atttgagagcaagtccaaagtaaaagtcagcaagagaatttattgcaaagtgaaagtaca ctctgacagctgatcagagcagctgctcaaaagagagacagtaccctcccctcacgggag tcttacatgattattcatgaataggtgggaaggggtattgttttaagcatgttctgtggt ctcttgaacgtgcatgcactgtggttgtacatatcagcacacacatcttacgtctcatta gcatcttaacttccctctcagagttgtgtttgctactattgtaatgagcataggtcagcc caaggacactattcatgggtttctgggcttcctcagatgtggggatgcctcccttggctc ttctacctctttgctgcaggatgttctaaccacaagcccaggatatggtttgcgcactgt cgaacagcttgttctctccatcaacctgacaagtctcttgtttcctttcaagggaggctg tgaacaccctatctcactgacctcagaaggacagtacagcagtagccaccatgaccaaaa agatgattccagaagtgcaggacaactccctacccagaggctgtggctgtgcagtaacac accaagaggggagtccagctggctctcagggtgctcactaccctcatctgggggcctgga ggacgtcaattcctgagaacgccacgttctagtgagtagaatgaactgagagatacacag caaagctccacatacttttccttttctttgtgcccgcagtgttcttcatcagtgtgctct cgcttttcagctactactgttggctggctggaaaaaatagaacaatagtaaaaattagag accagtctttggtgatgaagagaaatattggctacttccagtattttctagctttggtta tggttgcagttttccagctcaccttgtggggatgaattcagaaaaaagttacaaattgaa atgaacatgccagaagtattggctcaaatcaacgttgtcctattaagccacttagtgaat caaaagaccgcttgttggactgttaatctcggtggccagagaaaggagctgaagaaggtg ttgccagatcaggaacaaataattacagcggcaatagaaaatggaagaccacttgttcat aaccatttgaataagggcaaggtgtatggaaacacattatgaactgatattttcagtttt gtttgcaagaaaatgattaataaggtgaaataattgaagtatcacggaagatacattaaa 
aaaaaaaaaagcctttgtacagtttgctggagccacagatgtcctactccagagcagaac aatgcctgaatcttcagggtccatttctgccgcattcactagcaaccacaaatgtgactt aattttactttggaaataatgcttacccattgtgagatgctgtaatatgaaccatcatta catgttaacatggcacatggaattttgagtgtctaagttacatttttagagttgtttctt agtagccatgtgagtttccactccaaaaacacaagctaaaaacttgttttgagtgaagga catctagggcaaatggtggctgaaagtgaatgagatc | 
| >BA000025_1-50000 Homo sapiens genomic DNA, chromosome 6p21.3, HLA Class I region. gatctccagagcactcttccctgcagggcaccctcccatcccagactccaggcacctggc atgggtggacatctttactttctgggccagcttcagcagagctatgtcatcaccatagaa ctccaggattccctggttctttttggcaaagacatcaaaccctggggagatcaccgcctt ctcaataaggaattctttgccccactgggatttggggtctcctggaaatgatacactaga ttaggctagaccagggctcctgcaggggccagaggctgggtgaggtggtaggatctgtgg cttcaggatcaggaggctggtgcatcccctgccttacccacattgaccctccacagggag tggtcgttgccatcgcggaagcaatgagctgctgtcaggacccattggtcggagatgagg gccccccggcaggtctcttggctcttgggctgcaggggaacaggtgattttcagagattg cagtatgtctggcccatggccgcttttacctctggaatccaagccctgcccctccttcct ggtaccttaatagtgacatgccagggtgtcctctcctggtcagaggcgtttgctgacatg ttccccaccccgcagatggtgtctgtgagcttggagacatctgtgggtgtgaggatcaga tggggaaggaggcaagtgaggggcactgtgtccaggttcccaacacgggcctctggcggg ctcctcaccatcctccccacaccaaggagggcaaagctcactcacccagcatatgttcaa agacctggtgcagagcctttgtgtcctgcagaatgaaggcatgcctctcaccatccttct tggaccctagctcattcagttctctccagtccacatccagcttgcccaccccgatggcat agatgtctggtggggaagagggaaatcaccagactcctgtggctttggggctaccccatg agacaggaggctgtcatctgaaactcactgtgtccaatcaagacctacatgagctggacc cctgcgtcctccccactgctacctgtctgccttcatttcctgccactccctgcccttcac tctcctgcagcacacagcctctttgaagttcctcaaatccataggcatggtcacacctca ggccctttgcccagctgtgcctctgcctagttcactcctcccccccagacttccacatgg ctcactttcgtacctttttaagtcttggctcaaatgtcaccttctcagtgaggccttccc tggtcttcctgtctaaaactgcaatgccccagacaaactttcatccccactttgggaggc aaggtgggaggatcccttgaagccagaagtttgagaccagcctgggcaacatggcaacac cccttagcttgtgtcacctaccacctgctgggttctatggttttcttatcctgtttattc cctgtaatggtggaattgtgtcccccagaaagatgtgttcgagtcctaatccccagtatc tgtgactttatttggaaaaagggtctttgcagatgtaatcaagttaagattaagtcatac tagattagggtgagctctaatccaatgactgaggtccttataagaagaggtaagccagag ccaggcgtggtggctcacacctgtaatcaccaggaggcggtggttgtggtgagccaagat cgcgccattgcactccagcctgggcaacaagagcaaaaccccgtctcaaaaaaaaaaaaa gaagaggtgagccgggcacggtggctcacacctgtaatcccagcactctgggaggctgag gcgggcagatcacgaggtcaggaattcaagaccagcctgaccaacatggtgaaaccctgt 
ctctactaaaaatacaaaaattagccagacatgctggcacacacctgtaatcccagctac tcaggaggctgaggcaggagaatcgcttgaaccgggaggcggatgttgcagtgagccgag attgcaccactgcactccagcctgggcaacagagcaagactccatctcaaaaaaaaaaaa aaaaaaaaaaagtgaactggctgggcatggtggtgactcatgcctgtaatcccggcagtt tttttgaggcgaaggcaggcagatcgccttgaggccaggagtttaagaccagcctagcca acatggcgagaccatgtctctactaaaaatacaaaaatttgccgggcatggtggcacatg cctgtaatcccagcttcttgggagactgaggcacgagaatcacctgaacccaggaggcag aggttacagtgagccgggatcccgccactgcactgcagcctgggcttctgggtgacagag cgagactctgtctcaaacaaatgaacagaaaaagaagaaaggaatttggacacaaagaca caggtagtgggtctcctatctatataagagaacagcatgtaatgacacagaggcacacac agaaaagaaggcgagttgaagacagaggcagagaatgggtttatgctgccgcaagccaag gttggagctgccggcagccggaaaaggcaggaaagaattcttcccaagagccttctgagg aagcacggccctgccaacaccttgatttcagacttctaacctccagaactgtaagaaaaa gaaattctgtgttctaagccacccaggtttgtggtagtttggtaagtacttttaaatgac tgaatgaatagaaagaactcagaacacaacatggaaactaaacctcagatctggtcttcc tctgtaaaaggtagcatctgggagaagggcctaaagccacgttttcccactggaggccct ggacccacacaacaggccgcgcctgtcctccgactgtggtgccagtcagaactgccctca gacagaccacagagtctactcctctcccagcctttgcaccccttgtggcccatttttgtt [Part of this file has been deleted for brevity] ggagaggggcaggtgcccctcctcggtctgtctccaccaggccctgtgagggtgggtgga ggctctctccaagccctcgtttggccccaccccagcatgtctccaggttcctctcagccc tggttccttttggccctgcagtcacaatgggcaacactgtgacgcaccctgtcctgtgtc acagtgtcatacactcaggctcacattgcccctaggccacttgccagccaagggacatgg ccacattttgtgtcttctgcacctcagccttgctttcaagtgcaggtgatgatggcaccc acgcagaacaaatgttatttgctatcttcgtcgagtttagtcatccaattttccaaccct cactgggcaaggaagagtgtggtttccaccaagaaggcaggatgtcagcagtcacagggg caaccaacagggaaagccgccggaaaatagaccccacaggaagcacaggtgtccagtgga gatgggaaccctgcagatttgaccgtctttaagcagattagagagattaccgttactaac aacttagccataaaagtttattagctattttcaaaaagcataaaattatgtaatataatt ttttttaaatttccatcaatacaaaactaatctgggcactgcaacttccggtgggcaact gggataggcggcatcatcaggaaggcgagccctgccgtgccccatgtgccagtgccccag atggcggcagcctccccagaagcaccttgtatctcccctgcacagggccagggtcccagc 
ttcccatacaccttctcctgctttttcttttctgtcctttcctttttcaataaaccacct gcaaaaagggaaaaccattctgaggacaagaaacatgtcaatgggaaatacacagttgcc agagggtaaaaggccctgttcattctcattgaaaagctcaggtatttctgttaaagtctc tccttttactttaggatgctgactcctgcgtccatctcaacctgggcatcgtgccaccac cttcaagaagagaaaaactaagtagtgctttgcaaaggggcagcagcatttctcatttct gaccatgtcaggcacatggccatgcagatgagcaggtgggggacacaggtgagtctccag acctgctctcctcccacagtacattcttgagtctttttaaacagttgtgaaaatgccaca gatgcaagcacctgtgggccactcccatggggaccgttgcacaaggcagtgccactcatt ctcagaacctcctaccatgggctatgcttagtgacccgaggccaagccaaggaagacgcc agccacagggtgccatcctcaggggcatgctgccagcaggggcaaagttatccctagcaa caagatacagaaagaaagaaaaaaggaaggaaatgtagccaatgggccggttcaggttct tgactttgccacacaaaagaatttgagagcaagtccaaagtaaaagtcagcaagagaatt tattgcaaagtgaaagtacactctgacagctgatcagagcagctgctcaaaagagagaca gtaccctcccctcacgggagtcttacatgattattcatgaataggtgggaaggggtattg ttttaagcatgttctgtggtctcttgaacgtgcatgcactgtggttgtacatatcagcac acacatcttacgtctcattagcatcttaacttccctctcagagttgtgtttgctactatt gtaatgagcataggtcagcccaaggacactattcatgggtttctgggcttcctcagatgt ggggatgcctcccttggctcttctacctctttgctgcaggatgttctaaccacaagccca ggatatggtttgcgcactgtcgaacagcttgttctctccatcaacctgacaagtctcttg tttcctttcaagggaggctgtgaacaccctatctcactgacctcagaaggacagtacagc agtagccaccatgaccaaaaagatgattccagaagtgcaggacaactccctacccagagg ctgtggctgtgcagtaacacaccaagaggggagtccagctggctctcagggtgctcacta ccctcatctgggggcctggaggacgtcaattcctgagaacgccacgttctagtgagtaga atgaactgagagatacacagcaaagctccacatacttttccttttctttgtgcccgcagt gttcttcatcagtgtgctctcgcttttcagctactactgttggctggctggaaaaaatag aacaatagtaaaaattagagaccagtctttggtgatgaagagaaatattggctacttcca gtattttctagctttggttatggttgcagttttccagctcaccttgtggggatgaattca gaaaaaagttacaaattgaaatgaacatgccagaagtattggctcaaatcaacgttgtcc tattaagccacttagtgaatcaaaagaccgcttgttggactgttaatctcggtggccaga gaaaggagctgaagaaggtgttgccagatcaggaacaaataattacagcggcaatagaaa atggaagaccacttgttcataaccatttgaataagggcaaggtgtatggaaacacattat gaactgatattttcagttttgtttgcaagaaaatgattaataaggtgaaataattgaagt 
atcacggaagatacattaaaaaaaaaaaaagcctttgtacagtttgctggagccacagat gtcctactccagagcagaacaatgcctgaatcttcagggtccatttctgccgcattcact agcaaccacaaatgtgacttaattttactttggaaataatgcttacccattgtgagatgc tgtaatatgaaccatcattacatgttaacatggcacatggaattttgagtgtctaagtta catttttagagttgtttcttagtagccatgtgagtttccactccaaaaacacaagctaaa aacttgttttgagtgaaggacatctagggcaaatggtggctgaaagtgaatgagatc | 
Each sub-sequence is named after the original sequence with '_start-end' appended, where 'start' and 'end' are the start and end positions of the sub-sequence. For example, splitting U01317 at a size of 50000 with no overlap gives sub-sequences named U01317_1-50000 and U01317_50001-73308.
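The naming rule can be reproduced with a short Python sketch for the no-overlap case. The function name `subsequence_names` is hypothetical, and the length of 73308 for U01317 is inferred from the end position in the example names; this only illustrates the '_start-end' convention and is not taken from splitter's source.

```python
def subsequence_names(name, length, size):
    """Build 'name_start-end' labels for non-overlapping pieces.

    Positions are 1-based and inclusive, matching the names shown
    in the example output.
    """
    names = []
    for start in range(1, length + 1, size):
        end = min(start + size - 1, length)  # last piece may be shorter
        names.append(f"{name}_{start}-{end}")
    return names


print(subsequence_names("U01317", 73308, 50000))
```

For the values above this yields the two names U01317_1-50000 and U01317_50001-73308.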
Splitting a large sequence into smaller sub-sequences can be useful when a particularly memory- or CPU-intensive application will not run quickly enough, or at all, on the full sequence. This should seldom be necessary in EMBOSS.
By default, splitter writes all the sub-sequences to a single file. In some cases, particularly where non-EMBOSS programs are used, a single sequence per file is required. To write the sub-sequences into separate files, use the command-line switch -ossingle.
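When splitter is driven from a script, the command line can be assembled programmatically. The sketch below is one possible approach, assuming Python 3.8 or later and an EMBOSS installation on the PATH if the command is actually run. The helper name `splitter_command` is hypothetical; the qualifiers it emits (-size, -overlap, -ossingle, -auto) are those listed in the qualifier table.

```python
import shlex


def splitter_command(sequence, outseq, size=None, overlap=None,
                     single_files=False):
    """Return an argument list for a splitter run."""
    args = ["splitter", sequence, outseq]
    if size is not None:
        args.append(f"-size={size}")
    if overlap is not None:
        args.append(f"-overlap={overlap}")
    if single_files:
        args.append("-ossingle")  # one output file per sub-sequence
    args.append("-auto")  # turn off prompts for unattended use
    return args


cmd = splitter_command("tembl:BA000025", "ba000025.split",
                       size=50000, overlap=3000, single_files=True)
print(shlex.join(cmd))

# To execute the command (requires EMBOSS to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems if a file name contains spaces.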
| Program name | Description | 
|---|---|
| aligncopy | Read and write alignments | 
| aligncopypair | Read and write pairs from alignments | 
| biosed | Replace or delete sequence sections | 
| codcopy | Copy and reformat a codon usage table | 
| cutseq | Remove a section from a sequence | 
| degapseq | Remove non-alphabetic (e.g. gap) characters from sequences | 
| descseq | Alter the name or description of a sequence | 
| entret | Retrieve sequence entries from flatfile databases and files | 
| extractalign | Extract regions from a sequence alignment | 
| extractfeat | Extract features from sequence(s) | 
| extractseq | Extract regions from a sequence | 
| featcopy | Read and write a feature table | 
| featmerge | Merge two overlapping feature tables | 
| featreport | Read and write a feature table | 
| feattext | Return a feature table original text | 
| listor | Write a list file of the logical OR of two sets of sequences | 
| makenucseq | Create random nucleotide sequences | 
| makeprotseq | Create random protein sequences | 
| maskambignuc | Mask all ambiguity characters in nucleotide sequences with N | 
| maskambigprot | Mask all ambiguity characters in protein sequences with X | 
| maskfeat | Write a sequence with masked features | 
| maskseq | Write a sequence with masked regions | 
| newseq | Create a sequence file from a typed-in sequence | 
| nohtml | Remove mark-up (e.g. HTML tags) from an ASCII text file | 
| noreturn | Remove carriage return from ASCII files | 
| nospace | Remove whitespace from an ASCII text file | 
| notab | Replace tabs with spaces in an ASCII text file | 
| notseq | Write to file a subset of an input stream of sequences | 
| nthseq | Write to file a single sequence from an input stream of sequences | 
| nthseqset | Read and write (return) one set of sequences from many | 
| pasteseq | Insert one sequence into another | 
| revseq | Reverse and complement a nucleotide sequence | 
| seqcount | Read and count sequences | 
| seqret | Read and write (return) sequences | 
| seqretsetall | Read and write (return) many sets of sequences | 
| seqretsplit | Read sequences and write them to individual files | 
| sizeseq | Sort sequences by size | 
| skipredundant | Remove redundant sequences from an input set | 
| skipseq | Read and write (return) sequences, skipping first few | 
| splitsource | Split sequence(s) into original source sequences | 
| trimest | Remove poly-A tails from nucleotide sequences | 
| trimseq | Remove unwanted characters from start and end of sequence(s) | 
| trimspace | Remove extra whitespace from an ASCII text file | 
| union | Concatenate multiple sequences into a single sequence | 
| vectorstrip | Remove vectors from the ends of nucleotide sequence(s) | 
| yank | Add a sequence reference (a full USA) to a list file | 
Please report all bugs to the EMBOSS bug team (emboss-bug@emboss.open-bio.org), not to the original author.