euHCVdb®: tools help

Tools Help

Number

A tool to number a user input sequence using the AF009606 genome as a reference.
The first results page is a summary made of two tables giving the begin and end positions and the length of the reference and query sequences. Differences in length between the reference and query sequences for each sub-genomic region or protein are highlighted in red.
By clicking on a region name, users can view the numbering of a sub-region or protein in the "Number detail" page. The numbering is either absolute, i.e. starting at the first residue of the reference sequence, or relative, i.e. starting at the first residue of the sub-region or protein. The numbering for the full-length sequences is also provided.
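The relation between absolute and relative numbering can be illustrated with a minimal sketch. The function name `number_region` and the region bounds used below are hypothetical and are not taken from the AF009606 annotation or from the tool itself; positions are assumed to be 1-based and inclusive.

```python
def number_region(region_start, region_end):
    """Return (absolute, relative) position pairs for one region.

    region_start and region_end are assumed to be 1-based, inclusive
    positions on the reference sequence (hypothetical values below).
    """
    if region_start < 1 or region_end < region_start:
        raise ValueError("expected 1 <= region_start <= region_end")
    # Relative numbering restarts at 1 on the first residue of the region.
    return [(pos, pos - region_start + 1)
            for pos in range(region_start, region_end + 1)]


# Hypothetical region covering reference positions 101 to 105.
for absolute, relative in number_region(101, 105):
    print(absolute, relative)
```

Under these assumptions, the first residue of the region has absolute position 101 and relative position 1.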
When the Number tool is used with a nucleotide sequence, the "Number detail" page includes a button that lets you run the Number tool on the translation of the query sequence.

Extract

A tool to extract sequences matching a given region of the AF009606 reference genome.
The region to search can be defined with from/to positions or by a sequence pasted in the right-hand box.
The length of the extracted matching sequences is the length of the query sequence plus or minus a percentage set by the user.
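As a rough illustration of the length criterion, the sketch below computes the accepted length window for a query. The function names and the absence of rounding are assumptions made for this example; the tool may round or apply the percentage differently.

```python
def length_window(query_length, tolerance_percent):
    """Return the (lowest, highest) accepted length for a match.

    Assumption: the tolerance is applied symmetrically and not rounded.
    """
    delta = query_length * tolerance_percent / 100.0
    return query_length - delta, query_length + delta


def within_window(candidate_length, query_length, tolerance_percent):
    """Tell whether a candidate length falls inside the window."""
    low, high = length_window(query_length, tolerance_percent)
    return low <= candidate_length <= high


# Hypothetical query of 200 residues with a 5 % tolerance.
print(length_window(200, 5))
print(within_window(195, 200, 5))
```

With a query of 200 residues and a tolerance of 5 %, matches between 190 and 210 residues long would be kept under these assumptions.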
A multiple sequence alignment is computed provided that the similar sequences found are not too numerous or too long. For this alignment, all identical sequences are removed.
If the alignment has been computed, a repertoire and Shannon entropies are computed as well. For the repertoire, a parameter allows the user to hide residues whose frequency is below the chosen value.
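The repertoire and the Shannon entropy of an alignment column can be sketched as follows. This is only an illustration: the function names, the toy alignment, the base-2 logarithm and the treatment of every character (gaps included) as a symbol are assumptions, not a description of how the tool computes these values.

```python
import math
from collections import Counter


def column_frequencies(alignment, column):
    """Return the frequency of each residue in one alignment column."""
    counts = Counter(seq[column] for seq in alignment)
    total = sum(counts.values())
    return {residue: n / total for residue, n in counts.items()}


def shannon_entropy(frequencies):
    """Shannon entropy in bits (base-2 logarithm is an assumption)."""
    return -sum(p * math.log2(p) for p in frequencies.values() if p > 0)


def repertoire(frequencies, min_frequency):
    """Hide residues whose frequency is below min_frequency."""
    return {r: p for r, p in frequencies.items() if p >= min_frequency}


# Toy alignment of four distinct, equal-length sequences (made up).
alignment = ["ACGT", "ACGA", "ACCA", "ATCA"]
for col in range(len(alignment[0])):
    freqs = column_frequencies(alignment, col)
    print(col + 1, repertoire(freqs, 0.3), shannon_entropy(freqs))
```

A fully conserved column has an entropy of zero, and the entropy grows as residues become more evenly distributed in the column.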
The default sequence databank searched contains all the euHCVdb sequences (nucleotides or proteins). Users can also upload their own sequence databank, which is then used for the similarity search. The uploaded databank can be the result of a request on the database.
N.B.: This tool is sequence-based, i.e. it extracts sequences by using a similarity search program.
It is less accurate than querying the database, but this is offset by the possibility of uploading a sequence databank.
However, it allows the extraction, in one step, of sequences spanning several genomic regions or proteins, or sub-genomic or sub-protein regions.
