Description
Endpoint: /VariantValidator/tools/gene2transcripts_v2/{gene_query}/{limit_transcripts}/{transcript_set}/{genome_build}
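For anyone reproducing the queries below, here is a minimal sketch of how the URLs can be assembled from the path template. The helper name `build_url` and the `BASE` constant are illustrative only and not part of VariantValidator; the base address and the `content-type` query parameter are copied from the example URLs in this report.

```python
from urllib.parse import quote, urlencode

# Base address taken from the example URLs in this issue.
BASE = "https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2"


def build_url(gene_query, limit_transcripts, transcript_set, genome_build):
    """Fill in the four path parameters of the endpoint, percent-encoding each one."""
    parts = [gene_query, limit_transcripts, transcript_set, genome_build]
    # safe="" makes quote() encode ':' as %3A, as in the URLs shown in this issue.
    path = "/".join(quote(str(part), safe="") for part in parts)
    query = urlencode({"content-type": "application/json"})
    return f"{BASE}/{path}?{query}"


if __name__ == "__main__":
    print(build_url("HGNC:32700", "mane_select", "all", "GRCh38"))
```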
Not all genes return a correct response when queried through this endpoint. About 73 HGNC IDs do not return the expected output. There are two ways a query can fail.
1)
E.g., searching for HGNC:32700 returns the following response:
URL: https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2/HGNC%3A32700/mane_select/all/GRCh38?content-type=application%2Fjson
Response Body:
[
{
"error": "Unable to recognise gene symbol DNAAF19",
"requested_symbol": "DNAAF19"
}
]
This might be because the record matching this query in a specific table has been updated to the new gene symbol DNAAF19, whereas the original symbol for this record is CCDC103, so the lookup does not match. Searching with the old gene symbol (CCDC103) returns an accurate output:
Response Body:
[
{
"current_name": "coiled-coil domain containing 103",
"current_symbol": "CCDC103",
"hgnc": "HGNC:32700",
"previous_symbol": "",
"requested_symbol": "CCDC103",
"transcripts": [
Also note that searching with "DNAAF19" directly gives the same "Unable to recognise gene symbol DNAAF19" error.
Discovery: Itebbs22/SoftwareDevelopmentVIMMO#19
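A rough sketch of how this first failure mode could be detected automatically once the JSON body has been parsed. The function name `unrecognised_symbols` is hypothetical; the `error` and `requested_symbol` keys and the message prefix are taken from the response shown above, and other error wordings may exist that this check would miss.

```python
def unrecognised_symbols(records):
    """Return the requested symbols of records that carry an 'Unable to recognise' error."""
    failed = []
    for record in records:
        message = record.get("error", "")
        # Prefix copied from the error body in this issue (assumed to be stable).
        if message.startswith("Unable to recognise gene symbol"):
            failed.append(record.get("requested_symbol"))
    return failed


if __name__ == "__main__":
    body = [
        {
            "error": "Unable to recognise gene symbol DNAAF19",
            "requested_symbol": "DNAAF19",
        }
    ]
    print(unrecognised_symbols(body))
```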
2)
Some gene symbols throw an internal server error, potentially because no record exists or for some unknown reason.
E.g., searching for the gene HGNC:12029 (queried here by its symbol, TRAC) throws a server error instead of a "no match found" message.
URL: https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2/TRAC/False/all/GRCh38?content-type=application%2Fjson
Response Body:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>500 Internal Server Error</title>
</head><body>
<h1>Internal Server Error</h1>
<p>The server encountered an internal error or
misconfiguration and was unable to complete
your request.</p>
<p>Please contact the server administrator at
[no address given] to inform them of the time this error occurred,
and the actions you performed just before this error.</p>
<p>More information about this error may be available
in the server error log.</p>
<hr>
<address>Apache Server at rest.variantvalidator.org Port 80</address>
</body></html>
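Because this failure comes back as an HTML page rather than JSON, a client cannot simply parse the body. A minimal sketch of a check that separates the two cases is shown here; the function name and the category labels are my own, not anything the API defines.

```python
import json


def classify_response(status_code, body_text):
    """Label a raw response as 'server_error', 'not_json', 'unexpected_json' or 'ok'."""
    if status_code >= 500:
        # The Apache error page in this issue arrives with a 5xx status.
        return "server_error"
    try:
        payload = json.loads(body_text)
    except json.JSONDecodeError:
        return "not_json"
    # Every successful body shown in this issue is a JSON list of records.
    return "ok" if isinstance(payload, list) else "unexpected_json"


if __name__ == "__main__":
    print(classify_response(500, "<!DOCTYPE HTML PUBLIC ..."))
    print(classify_response(200, '[{"requested_symbol": "H19", "transcripts": []}]'))
```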
For a list of problem URLs, please check: https://github.com/Itebbs22/SoftwareDevelopmentVIMMO/blob/BED-HGNC_FIX/problem_url
For a list of problem HGNC IDs, please check: https://github.com/Itebbs22/SoftwareDevelopmentVIMMO/blob/BED-HGNC_FIX/prob_gene.txt
However, I'd like to point out that this list also includes cases where a gene had a match but returned an empty transcripts value. I added these to the list because my initial assumption was that the endpoint shouldn't return empty values.
E.g., when specifically looking for a gene (HGNC:4713) in mane_select transcripts in GRCh38:
URL: https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2/HGNC%3A4713/mane_select/all/GRCh38?content-type=application%2Fjson
Response Body:
[
{
"current_name": "H19 imprinted maternally expressed transcript",
"current_symbol": "H19",
"hgnc": "HGNC:4713",
"previous_symbol": "",
"requested_symbol": "H19",
"transcripts": []
}
]
However, if I don't limit the query to just mane_select, some records are returned with an accurate output. My code will later be updated to include only genes with mismatching queries, as described in the 1st section, and URLs that throw an internal server error, so at a later date the list will contain only the genuinely problematic ones.
Also note that this was tested only on a very small subset of the endpoint's parameters: the genes used in PanelApp, queried with mane_select on GRCh38. Genes that are not part of PanelApp, and GRCh37, were not tested; I can cover those at a later date.
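As a rough idea of the planned filtering, the sketch below drops genes whose only "problem" is an empty `transcripts` list. The function names and the shape of `results` (a dict mapping an HGNC ID to its parsed response body) are assumptions for illustration, not the actual code in the linked repository.

```python
def has_empty_transcripts(records):
    """True when the gene matched but every matched record has an empty 'transcripts' list."""
    matched = [record for record in records if "transcripts" in record]
    return bool(matched) and all(len(record["transcripts"]) == 0 for record in matched)


def drop_empty_matches(results):
    """Keep only genes that are not merely 'matched with no transcripts'."""
    return {
        gene: records
        for gene, records in results.items()
        if not has_empty_transcripts(records)
    }


if __name__ == "__main__":
    results = {
        "HGNC:4713": [{"current_symbol": "H19", "transcripts": []}],
        "HGNC:32700": [{"error": "Unable to recognise gene symbol DNAAF19",
                        "requested_symbol": "DNAAF19"}],
    }
    print(sorted(drop_empty_matches(results)))
```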
Potential Issue
Setting limit_transcripts to False also seems to fail with an internal server error for some genes. This might be due to a timeout on bigger genes, as the request loads for a while and then fails. However, this is not universal behaviour, as some genes return a good response.
E.g., HGNC:1100 (BRCA1)
URL: https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2/HGNC%3A1100/False/all/GRCh38?content-type=application%2Fjson
I am not sure why this happens and have not checked other genes for this behaviour.
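To help narrow this down, here is a hedged sketch of a probe that at least separates a client-side timeout from an error status returned by the server. It uses only the Python standard library; the function name `probe`, the 30 second default and the injectable `opener` argument are my own choices. It cannot show whether a 500 from the server was itself caused by an internal timeout.

```python
import socket
import urllib.error
import urllib.request


def probe(url, timeout_seconds=30, opener=urllib.request.urlopen):
    """Return 'ok', 'http_<code>', 'timeout' or 'unreachable' for a single GET request."""
    try:
        with opener(url, timeout=timeout_seconds) as response:
            return "ok" if response.status == 200 else f"http_{response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status such as 500.
        return f"http_{exc.code}"
    except (socket.timeout, TimeoutError):
        # The client gave up waiting before a full response arrived.
        return "timeout"
    except urllib.error.URLError as exc:
        # urllib can wrap a timeout inside URLError.reason.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "timeout"
        return "unreachable"
```

Running `probe` over the same gene with limit_transcripts set to False and to mane_select, and comparing the labels, would show whether the failure is tied to the larger unfiltered query.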