QUERY SEQUENCE PAGE DESCRIPTION
The query page is divided into 5 sections: General, Source, References, Cross-references and Features.
The General section lets you choose the Sequence type. The type of sequence data retrieved from the
database depends on this choice, according to the following rules:
- "Genome" will return complete nucleotide sequences (as deposited in the EMBL nucleotide sequence database)
- "Polyprotein" will return the protein translations
- "Single nucleotide" will return genome regions (e.g. 5'utr, c, e1, ... depending on the user's choice)
- "Single protein" will return individual protein sequences (e.g. c, e1, e2, ... depending on the user's choice)
- "Feature protein" will return regions of the protein sequences (e.g. HVR1)
This section also lets you retrieve entries based on their accession number, description, keywords, sequence length or standard name.
For the "keywords" field, you can search for more than one keyword and combine them with the AND/OR operators.
The Source section lets you query information related to the source of the sequence (e.g. genotype, isolate, cell line, cell type).
This section also lets you select entries based on their genotype/subtype.
The lists available in the form contain all the genotypes/subtypes present in the euHCVdb database.
The genotypes/subtypes are divided into 3 levels:
- deposited (genotype retrieved from the EMBL entry)
The References section lets you query data related to journal references (e.g. authors, title word, publication year).
For the Authors field, there are 3 ways of searching references:
- "any" will return every entry matching any of the authors in any of its references
- "same entry" will return the entries that cite all the authors, possibly spread over several of their references
- "same reference" will return the entries that cite all the authors together in a single reference
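The three author search modes above can be sketched in Python as follows. This is only an illustration of the matching logic as described here, not the database's actual implementation: the record layout, the entry names and the function name match_authors are assumptions, and author names are compared case-insensitively as whole strings.

```python
# Hypothetical records: each entry lists the authors of each of its references.
ENTRIES = [
    {"id": "entry_1", "references": [["Smith", "Jones"], ["Lee"]]},
    {"id": "entry_2", "references": [["Smith"], ["Jones"]]},
    {"id": "entry_3", "references": [["Lee"]]},
]


def match_authors(entries, authors, mode="any"):
    """Return the ids of the entries matching the authors under the given mode."""
    wanted = {a.lower() for a in authors}
    hits = []
    for entry in entries:
        # One set of lower-cased author names per reference.
        refs = [{a.lower() for a in ref} for ref in entry["references"]]
        if mode == "any":
            # At least one requested author appears in at least one reference.
            ok = any(wanted & ref for ref in refs)
        elif mode == "same entry":
            # All requested authors appear somewhere in the entry's references.
            ok = wanted <= set().union(*refs)
        elif mode == "same reference":
            # All requested authors appear together in one reference.
            ok = any(wanted <= ref for ref in refs)
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        if ok:
            hits.append(entry["id"])
    return hits
```

With the sample records, searching for "Smith" and "Jones" selects entry_1 and entry_2 in "any" and "same entry" modes, but only entry_1 in "same reference" mode, because entry_2 cites the two authors in separate references.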
The Cross-references section lets you query cross-references to external databases (e.g. UniProt or EMBL).
The Features section lets you query nucleotide features
(e.g. RBS, stem_loop) with or without qualifiers (e.g. function, locus_tag)
and protein regions (e.g. HVR1, ISDR, 3D-models).
HOW TO MAKE A QUERY
When a query covers more than one criterion, every
criterion has to be satisfied (intersection).
Example: to extract the "e1" protein sequence of isolate "H77":
- Sequence type = "single protein"
- Standard name = "e1"
- Isolate = "H77"
This query will return all entries containing an H77 isolate AND an e1 protein.
Entries satisfying only one of these criteria will not be returned.
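The intersection rule can be pictured with a small Python sketch. The records, the field names and the function name query are hypothetical and only serve the illustration; case-insensitive comparison is also an assumption.

```python
# Hypothetical records; "XYZ" is a placeholder isolate name.
ENTRIES = [
    {"id": "entry_1", "isolate": "H77", "standard_name": "e1"},
    {"id": "entry_2", "isolate": "H77", "standard_name": "e2"},
    {"id": "entry_3", "isolate": "XYZ", "standard_name": "e1"},
]


def query(entries, **criteria):
    """Return the ids of the entries that satisfy every criterion (intersection)."""
    return [
        entry["id"]
        for entry in entries
        if all(
            entry.get(field, "").lower() == value.lower()
            for field, value in criteria.items()
        )
    ]


# Only entry_1 has both the H77 isolate and the e1 protein.
print(query(ENTRIES, isolate="H77", standard_name="e1"))
```

entry_2 and entry_3 each satisfy only one of the two criteria, so neither is selected.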
Multiple values (+)
Text inputs marked with "+" on the page accept several values in a single field.
The values have to be separated by a space.
For example, choosing "Genome" in the Sequence type field and "AY754636 AY754639" in the
Accession number field will return the complete nucleotide sequences of
entries AY754636 and AY754639 found in the database.
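A minimal Python sketch of how a multiple-value field could be interpreted: the field text is split on spaces and an entry is kept if its accession number equals any of the values. The function names and the placeholder accession "XX000000" are assumptions made for the illustration.

```python
def parse_multiple_values(field_text):
    """Split the content of a "+" field into its individual values."""
    return field_text.split()


def select_by_accession(accessions, field_text):
    """Keep the accession numbers that equal one of the requested values."""
    wanted = set(parse_multiple_values(field_text))
    return [acc for acc in accessions if acc in wanted]


# "XX000000" is a made-up placeholder, not a real accession number.
available = ["AY754636", "XX000000", "AY754639"]
print(select_by_accession(available, "AY754636 AY754639"))
```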
The interface also allows "imprecise" (wildcard) querying by adding "*" before or after a letter (or group of letters).
By choosing "M*" in the Accession number field and "Genome" in the Sequence type field,
the query will return all the entries with an accession number starting with the letter M. Similarly,
choosing "*5" or "*5*" will return the entries with an accession number
ending with "5" or containing "5" respectively.
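The wildcard behaviour can be sketched in Python by translating each "*" into "any run of characters" and requiring the rest of the pattern to match literally. This is an illustration only: the function names are assumptions, the sample accession strings are made-up placeholders, and the matching is case-sensitive here, which may differ from the real interface.

```python
import re


def wildcard_to_regex(pattern):
    """Turn a pattern such as "M*" into a regex where "*" matches any text."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts))


def wildcard_filter(values, pattern):
    """Keep the values that match the whole wildcard pattern."""
    regex = wildcard_to_regex(pattern)
    return [v for v in values if regex.fullmatch(v)]


# Made-up placeholder strings, not real accession numbers.
SAMPLE = ["M00001", "M00015", "AB00005", "AB00052"]

print(wildcard_filter(SAMPLE, "M*"))   # starts with M
print(wildcard_filter(SAMPLE, "*5"))   # ends with 5
print(wildcard_filter(SAMPLE, "*5*"))  # contains 5
```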
In the keywords field, the operators "AND" and "OR" (default = "AND") can be used to search for more than one keyword.
With "AND", the query will return the entries in which all the query terms are present.
With "OR", the query will return the entries in which at least one of the terms is present.
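The difference between the two operators can be shown with a short Python sketch. It assumes that the query terms are separated by spaces and compared with whole keywords, case-insensitively; how the real form parses its input is not described here, so the function name keyword_match and the sample keywords are illustrative only.

```python
def keyword_match(entry_keywords, terms, operator="AND"):
    """Tell whether an entry's keywords satisfy the query terms."""
    have = {k.lower() for k in entry_keywords}
    wanted = [t.lower() for t in terms.split()]
    if operator == "AND":
        # Every term must be present.
        return all(t in have for t in wanted)
    if operator == "OR":
        # At least one term must be present.
        return any(t in have for t in wanted)
    raise ValueError(f"unknown operator: {operator!r}")


# Illustrative keywords for a single hypothetical entry.
keywords = ["polyprotein", "envelope"]
print(keyword_match(keywords, "envelope core", "AND"))  # False: "core" is missing
print(keyword_match(keywords, "envelope core", "OR"))   # True: "envelope" is present
```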