|   | dbxflat | 
These indexes allow access to flat files larger than 2 GB.
| 
% dbxflat 
Index a flat file database using b+tree indices
Basename for index files: embl
Resource name: emblresource
      EMBL : EMBL
     SWISS : Swiss-Prot, SpTrEMBL, TrEMBLnew
        GB : Genbank, DDBJ
    REFSEQ : Refseq
     FASTQ : Fastq files
     USPTO : Iguspto files
Entry format [SWISS]: embl
Wildcard database filename [*.dat]: rod.dat
Database directory [.]: embl
        id : ID
       acc : Accession number
        sv : Sequence Version and GI
       des : Description
       key : Keywords
       org : Taxonomy
Index fields [id,acc]: 
Compressed index files [Y]: 
General log output file [outfile.dbxflat]: 
 | 
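The answers given at the prompts above can also be supplied as qualifiers in a single non-interactive command. The following Python sketch only assembles such an argument list from the qualifiers documented on this page; the helper name `build_dbxflat_command` is hypothetical, and actually running the result assumes dbxflat is installed and on the PATH.

```python
def build_dbxflat_command(dbname, dbresource, idformat, filenames, directory,
                          fields=("id", "acc")):
    """Assemble a dbxflat argument list from the documented qualifiers."""
    return [
        "dbxflat",
        "-dbname", dbname,
        "-dbresource", dbresource,
        "-idformat", idformat,
        "-filenames", filenames,
        "-directory", directory,
        "-fields", ",".join(fields),
        "-auto",  # turn off prompts
    ]


# Values taken from the example session.
command = build_dbxflat_command("embl", "emblresource", "embl", "rod.dat", "embl")
print(" ".join(command))
```

The returned list can be passed to `subprocess.run` unchanged, which avoids shell quoting problems with wildcard filenames such as `*.dat`.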
| 
Index a flat file database using b+tree indices
Version: EMBOSS:6.6.0.0
   Standard (Mandatory) qualifiers:
  [-dbname]            string     Basename for index files (Any string from 2
                                  to 19 characters, matching regular
                                  expression /[A-z][A-z0-9_]+/)
  [-dbresource]        string     Resource name (Any string from 2 to 19
                                  characters, matching regular expression
                                  /[A-z][A-z0-9_]+/)
   -idformat           menu       [SWISS] Entry format (Values: EMBL (EMBL);
                                  SWISS (Swiss-Prot, SpTrEMBL, TrEMBLnew); GB
                                  (Genbank, DDBJ); REFSEQ (Refseq); FASTQ
                                  (Fastq files); USPTO (Iguspto files))
   -filenames          string     [*.dat] Wildcard database filename (Any
                                  string)
   -directory          directory  [.] Database directory
   -fields             menu       [id,acc] Index fields (Values: id (ID); acc
                                  (Accession number); sv (Sequence Version and
                                  GI); des (Description); key (Keywords); org
                                  (Taxonomy))
   -[no]compressed     boolean    [Y] Compressed index files
   -outfile            outfile    [*.dbxflat] General log output file
   Additional (Optional) qualifiers: (none)
   Advanced (Unprompted) qualifiers:
   -release            string     [0.0] Release number (Any string up to 9
                                  characters)
   -date               string     [00/00/00] Index date (Date string dd/mm/yy)
   -exclude            string     Wildcard filename(s) to exclude (Any string)
   -statistics         boolean    [N] Report I/O statistics for each input
                                  file
   -indexoutdir        outdir     [.] Index file output directory
   Associated qualifiers:
   "-directory" associated qualifiers
   -extension          string     Default file extension
   "-indexoutdir" associated qualifiers
   -extension          string     Default file extension
   "-outfile" associated qualifiers
   -odirectory         string     Output directory
   General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit
 | 
| Qualifier | Type | Description | Allowed values | Default |
|---|---|---|---|---|
| Standard (Mandatory) qualifiers | | | | |
| [-dbname] (Parameter 1) | string | Basename for index files | Any string from 2 to 19 characters, matching regular expression /[A-z][A-z0-9_]+/ | Required |
| [-dbresource] (Parameter 2) | string | Resource name | Any string from 2 to 19 characters, matching regular expression /[A-z][A-z0-9_]+/ | Required |
| -idformat | list | Entry format | EMBL (EMBL); SWISS (Swiss-Prot, SpTrEMBL, TrEMBLnew); GB (Genbank, DDBJ); REFSEQ (Refseq); FASTQ (Fastq files); USPTO (Iguspto files) | SWISS |
| -filenames | string | Wildcard database filename | Any string | *.dat |
| -directory | directory | Database directory | Directory | . |
| -fields | list | Index fields | id (ID); acc (Accession number); sv (Sequence Version and GI); des (Description); key (Keywords); org (Taxonomy) | id,acc |
| -[no]compressed | boolean | Compressed index files | Boolean value Yes/No | Yes |
| -outfile | outfile | General log output file | Output file | <*>.dbxflat |
| Additional (Optional) qualifiers | | | | |
| (none) | | | | |
| Advanced (Unprompted) qualifiers | | | | |
| -release | string | Release number | Any string up to 9 characters | 0.0 |
| -date | string | Index date | Date string dd/mm/yy | 00/00/00 |
| -exclude | string | Wildcard filename(s) to exclude | Any string | |
| -statistics | boolean | Report I/O statistics for each input file | Boolean value Yes/No | No |
| -indexoutdir | outdir | Index file output directory | Output directory | . |
| Associated qualifiers | | | | |
| "-directory" associated directory qualifiers | | | | |
| -extension | string | Default file extension | Any string | |
| "-indexoutdir" associated outdir qualifiers | | | | |
| -extension | string | Default file extension | Any string | |
| "-outfile" associated outfile qualifiers | | | | |
| -odirectory | string | Output directory | Any string | |
| General qualifiers | | | | |
| -auto | boolean | Turn off prompts | Boolean value Yes/No | N |
| -stdout | boolean | Write first file to standard output | Boolean value Yes/No | N |
| -filter | boolean | Read first file from standard input, write first file to standard output | Boolean value Yes/No | N |
| -options | boolean | Prompt for standard and additional values | Boolean value Yes/No | N |
| -debug | boolean | Write debug output to program.dbg | Boolean value Yes/No | N |
| -verbose | boolean | Report some/full command line options | Boolean value Yes/No | Y |
| -help | boolean | Report command line options and exit. More information on associated and general qualifiers can be found with -help -verbose | Boolean value Yes/No | N |
| -warning | boolean | Report warnings | Boolean value Yes/No | Y |
| -error | boolean | Report errors | Boolean value Yes/No | Y |
| -fatal | boolean | Report fatal errors | Boolean value Yes/No | Y |
| -die | boolean | Report dying program messages | Boolean value Yes/No | Y |
| -version | boolean | Report version number and exit | Boolean value Yes/No | N |
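The constraints on -dbname and -dbresource (2 to 19 characters, matching /[A-z][A-z0-9_]+/) can be checked before dbxflat is called. The Python sketch below is one possible pre-check; the function name `is_valid_index_name` is hypothetical, and requiring the expression to match the whole string is an assumption about how EMBOSS applies it.

```python
import re

# Pattern copied from the qualifier description. As written, the range A-z
# also covers the ASCII punctuation that lies between 'Z' and 'a'.
NAME_PATTERN = re.compile(r"[A-z][A-z0-9_]+")


def is_valid_index_name(name):
    """Return True if name is 2 to 19 characters long and fully matches."""
    return 2 <= len(name) <= 19 and NAME_PATTERN.fullmatch(name) is not None


for candidate in ("embl", "emblresource", "e", "9embl", "a" * 20):
    print(candidate, is_valid_index_name(candidate))
```

Of the candidates shown, only `embl` and `emblresource` pass: `e` is too short, `9embl` starts with a digit, and the last is 20 characters long.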
| 
Processing directory: /homes/user/test/embl/
Processing file: rod.dat entries: 6 (6) time: 0.0/0.0s (0.0/0.0s)
Total time: 0:00.0
Entry idlen 15 OK. Maximum ID length was 6 for 'L48662'.
Field acc acclen 15 OK. Maximum acc term length was 6 for 'L48662'.
 | 
| 
# Number of files: 1
# Release: 0.0
# Date: 00/00/00
Single filename database
rod.dat
 | 
| 
Type Identifier
Compress Yes
Pages 3
Secpages 3
Order 71
Fill 56
Level 0
Pagesize 2048
Cachesize 20000
Order2 22
Fill2 41
Secpagesize 512
Seccachesize 20000
Count 7
Fullcount 9
Kwlimit 15
Reffiles 0
 | 
| 
Type Identifier
Compress Yes
Pages 3
Secpages 0
Order 71
Fill 56
Level 0
Pagesize 2048
Cachesize 20000
Order2 22
Fill2 41
Secpagesize 512
Seccachesize 20000
Count 6
Fullcount 6
Kwlimit 15
Reffiles 0
 | 
This file contains non-printing characters and so cannot be displayed here.
This file contains non-printing characters and so cannot be displayed here.
Having created the EMBOSS indexes for this file, a database can then be defined in the file emboss.defaults as something like:
DB embl [
   type: N
   dbalias: embl (see below)
   format: embl
   method: emboss
   directory: /data/embl
   file: *.dat
   indexdirectory: /data/embl/indexes
]
The index file 'basename' given to dbxflat must match the DB name in the definition. If not, then a 'dbalias' line must be given which specifies the basename of the indexes.
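The dbalias rule can be illustrated with a small Python sketch that renders a DB definition and emits a dbalias line only when the DB name differs from the index basename. The function `render_db_definition` and its parameters are hypothetical, the attribute layout simply follows the example definition, and comparing the two names case-sensitively is an assumption.

```python
def render_db_definition(db_name, index_basename, directory, index_directory,
                         file_pattern="*.dat", entry_format="embl"):
    """Render a DB definition, adding dbalias only when the names differ."""
    lines = [f"DB {db_name} [", "  type: N"]
    if db_name != index_basename:
        # The indexes were built under another basename, so point to it.
        lines.append(f"  dbalias: {index_basename}")
    lines += [
        f"  format: {entry_format}",
        "  method: emboss",
        f"  directory: {directory}",
        f"  file: {file_pattern}",
        f"  indexdirectory: {index_directory}",
        "]",
    ]
    return "\n".join(lines)


# The DB name differs from the basename "embl", so a dbalias line is emitted.
print(render_db_definition("emblrod", "embl", "/data/embl", "/data/embl/indexes"))
```

Here `emblrod` is an invented DB name used only to trigger the alias; when both names are `embl`, the dbalias line is omitted.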
SET PAGESIZE 2048
SET CACHESIZE 200
The above values are recommended for most systems. The PAGESIZE is a multiple of the size of disc pages the operating system buffers. The CACHESIZE is the number of disc pages dbxflat is allowed to cache.
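As a rough guide to what these settings imply, the cached data can be estimated as page size multiplied by the number of cached pages. The Python sketch below does that arithmetic for the recommended values; it assumes the cache holds exactly CACHESIZE pages of PAGESIZE bytes and ignores any bookkeeping overhead.

```python
PAGESIZE = 2048   # bytes per index page, from SET PAGESIZE
CACHESIZE = 200   # number of pages that may be cached, from SET CACHESIZE

cache_bytes = PAGESIZE * CACHESIZE
print(f"{cache_bytes} bytes ({cache_bytes / 1024:.0f} KiB)")
# prints "409600 bytes (400 KiB)"
```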
RES embl [
   type: Index
   idlen: 15
   acclen: 15
   svlen: 20
   keylen: 25
   deslen: 25
   orglen: 25
]
The length definitions are the maximum lengths of 'words' in the field being indexed. Longer words will be truncated to the value set.
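The effect of the length definitions can be pictured with a short Python sketch. The limits are copied from the RES example; the function `truncate_term` is hypothetical, and counting the limit in characters while keeping the leading part of the word is an assumption about how truncation is applied.

```python
# Maximum word lengths per indexed field, from the RES definition.
FIELD_LIMITS = {"id": 15, "acc": 15, "sv": 20, "key": 25, "des": 25, "org": 25}


def truncate_term(field, term):
    """Cut a word down to the maximum length configured for its field."""
    return term[:FIELD_LIMITS[field]]


# A short identifier is kept whole; an over-long keyword is cut to 25 characters.
print(truncate_term("id", "L48662"))
print(truncate_term("key", "k" * 30))
```

Under these assumptions, two words that share their first 25 characters would be indexed as the same term in the key, des, and org fields.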
| Program name | Description | 
|---|---|
| dbiblast | Index a BLAST database | 
| dbifasta | Index a fasta file database | 
| dbiflat | Index a flat file database | 
| dbigcg | Index a GCG formatted database | 
| dbxcompress | Compress an uncompressed dbx index | 
| dbxedam | Index the EDAM ontology using b+tree indices | 
| dbxfasta | Index a fasta file database using b+tree indices | 
| dbxgcg | Index a GCG formatted database using b+tree indices | 
| dbxobo | Index an obo ontology using b+tree indices | 
| dbxreport | Validate index and report internals for dbx databases | 
| dbxresource | Index a data resource catalogue using b+tree indices | 
| dbxstat | Dump statistics for dbx databases | 
| dbxtax | Index NCBI taxonomy using b+tree indices | 
| dbxuncompress | Uncompress a compressed dbx index | 
Please report all bugs to the EMBOSS bug team (emboss-bug © emboss.open-bio.org) not to the original author.