Open
Description
I was testing seqparse in an application with a specific GenBank file. One of my tests was to remove parts of the file so that it no longer matches the format, expecting the parser to throw an error. However, seqparse seems very permissive about file content, even when the content is out of format: the output was a sequence containing some incongruent characters, with an unknown type.
For now I'm using a conditional that checks whether the type is unknown to catch a possible error during file parsing (if (seq.type == 'unknown')).
Is this permissiveness the expected behavior?
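For reference, here is a minimal sketch of the guard described above. It does not call seqparse itself: `ParsedSeq`, `unexpectedChars`, and `ensureKnownType` are hypothetical names introduced for illustration, and the `seq` field holding the sequence string is an assumption about the shape of the parsed result (only the `type === 'unknown'` check comes from my actual code).

```typescript
// Hypothetical shape of a parsed record. Only the `type` field and its
// 'unknown' value come from the issue; the `seq` field is an assumption.
interface ParsedSeq {
  type: string;
  seq: string;
}

// Collect the distinct characters that are not IUPAC nucleotide codes.
// Used only to build a more helpful error message.
function unexpectedChars(seq: string): string[] {
  const allowed = /[ACGTURYKMSWBDHVN]/i;
  const unique = Array.from(new Set(seq.split("")));
  return unique.filter((ch) => !allowed.test(ch));
}

// Turn the permissive "unknown type" result into an explicit error,
// so that a malformed file fails loudly instead of passing through.
function ensureKnownType(parsed: ParsedSeq): ParsedSeq {
  if (parsed.type === "unknown") {
    const bad = unexpectedChars(parsed.seq).join("");
    throw new Error(
      `Sequence type could not be determined; unexpected characters: "${bad}"`
    );
  }
  return parsed;
}
```

Wrapping the parser's result in `ensureKnownType(...)` gives the throw I expected from the parser, but it is only a workaround: it cannot detect a malformed file whose sequence still happens to look like valid DNA.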
STX-10034 2918 bp ds-DNA linear 23-JAN-2024
DEFINITION .
ACCESSION urn.local...t-hbr8b24
KEYWORDS "Indication:PKU" "CpG Content:140" "Molecular Weight:1896700";
"Genetic Elements:RES-Base-ITR-v1, spacer_left-ITR_v2,;
VandenDriessche_PromoterSet, PmeI_site,;
Mod_Minimum_Consensus_Kozak_v2, hPAH_codop_ORF_v2, PacI_site,;
WPRE_3pUTR, bGH, spacer_right-ITR_v1, RES-Base-ITR-v1" "Long Form;
Name:RES-Base-ITR-v1, spacer_left-ITR_v2,;
VandenDriessche_PromoterSet, PmeI_site,;
Mod_Minimum_Consensus_Kozak_v2, hPAH_codop_ORF_v2, PacI_site,;
WPRE_3pUTR, bGH, spacer_right-ITR_v1, RES-Base-ITR-v1";
"Length:2918" "5' Oligo:SO-300002" "3' Oligo:SO-300002" "Parent;
Plasmid (SP-#):SP-210174" "5' Cut Site:BsaI" "3' Cut Site:BsaI";
"Tissue Specificity (Promoter Only):Liver";
"Comments/Reference:pHK11-412 with WT-ITR-v1 oligo"
"Name:pHK11-412; with WT-ITR-v1 oligo".
SOURCE
ORGANISM .
FEATURES Location/Qualifiers
misc_feature 58..111
/standard_name="RES-Base-ITR-v1"
misc_feature 112..145
/standard_name="spacer_left-ITR_v2"
misc_feature 153..551
/standard_name="VandenDriessche_PromoterSet"
misc_feature 154..225
/standard_name="SerpinEnhancer"
misc_feature 277..460
/standard_name="Mouse TTR 5pUTR (NM_013697.5)"
misc_feature 461..551
/standard_name="MVM Intron"
misc_feature 552..559
/standard_name="PmeI_site"
misc_feature 560..569
/standard_name="Mod_Minimum_Consensus_Kozak_v2"
CDS 570..1928
/translation="MSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEE
VGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFFTHLDKRSLPALTNIIKILRHDI
GATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ
FADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCG
FHEDNIPQLEDVSQFLQTCTGFRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPM
YTPEPDICHELLGHVPLFSDRSFAQFSQEIGLASLGAPDEYIEKLATIYWFTVEFGLC
KQGDSIKAYGAGLLSSFGELQYCLSEKPKLLPLELEKTAIQNYTVTEFQPLYYVAESF
NDAKEKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQK
IK*"
/standard_name="Translation 570-1928"
misc_feature 570..1928
/standard_name="hPAH_codop_ORF_v2"
misc_feature 1932..1939
/standard_name="PacI_site"
misc_feature 1940..2520
/standard_name="WPRE_3pUTR"
misc_feature 2521..2745
/standard_name="bGH"
misc_feature 2746..2806
/standard_name="spacer_right-ITR_v1"
misc_feature complement(2807..2860)
/standard_name="RES-Base-ITR-v1"
ORIGIN
gcgagcgagc gcgcagagag ggagtggcca actccatcac taggggttcc ttgtagttaa
121 tgattaaccc gccatgctac ttatcgcggc cgcgggggag gctgctggtg aatattaacc
181 aaggtcaccc cagttatcgg aggagcaaac aggggctaag tccacacgcg tggtaccgtc
241 tgtctgcaca tttcgtagag cgagtgttcc gatactctaa tctccctagg caaggttcat
301 atttgtgtag gttacttatt ctccttttgt tgactaagtc aataatcaga atcagcaggt
361 ttggagtcag cttggcaggg atcagcagcc tgggttggaa ggagggggta taaaagcccc
421 ttcaccagga gaagccgtca cacagatcca caagctcctg aagaggtaag ggtttaaggg
481 atggttggtt ggtggggtat taatgtttaa ttacctggag cacctgcctg aaatcacttt
541 ttttcaggtt ggtttaaacc gcagccacca tgagcaccgc cgtgctggaa aatcctggcc
601 tgggcagaaa gctgagcgac ttcggccaag agacaagcta catcgaggac aactgcaacc
661 agaacggcgc catcagcctg atcttcagcc tgaaagaaga agtgggcgcc ctggccaagg
721 tgctgagact gttcgaagag aacgacgtga acctgacaca catcgagagc agacccagca
781 gactgaagaa ggacgagtac gagttcttca cccacctgga caagcggagc ctgcctgctc
841 tgaccaacat catcaagatc ctgcggcacg acatcggcgc cacagtgcac gaactgagcc
901 gggacaagaa aaaggacacc gtgccatggt tccccagaac catccaagag ctggacagat
961 tcgccaacca gatcctgagc tatggcgccg agctggacgc tgatcaccct ggctttaagg
1021 accccgtgta ccgggccaga agaaagcagt ttgccgatat cgcctacaac taccggcacg
1081 gccagcctat tcctcgggtc gagtacatgg aagaggaaaa gaaaacctgg ggcaccgtgt
1141 tcaagaccct gaagtccctg tacaagaccc acgcctgcta cgagtacaac cacatcttcc
1201 cactgctcga aaagtactgc ggcttccacg aggacaatat ccctcagctt gaggacgtgt
1261 cccagttcct gcagacctgc accggcttta gactgaggcc agttgccgga ctgctgagca
1321 gcagagattt tctcggcggc ctggccttca gagtgttcca ctgtacccag tacatcagac
1381 acggcagcaa gcccatgtac acccctgagc ctgatatctg ccacgagctg ctgggacatg
1441 tgcccctgtt cagcgataga agcttcgccc agttcagcca agagatcgga ctggcttctc
1501 tgggagcccc tgacgagtac attgagaagc tggccaccat ctactggttc accgtggaat
1561 tcggcctgtg caagcagggc gacagcatca aagcttatgg cgctggcctg ctgtctagct
1621 tcggcgagct gcagtactgt ctgagcgaga agcctaagct gctgcccctg gaactggaaa
1681 agaccgccat ccagaactac accgtgaccg agttccagcc tctgtactac gtggccgaga
1741 gcttcaacga cgccaaagaa aaagtgcgga acttcgccgc caccattcct cggcctttca
1801 gcgtcagata cgacccctac acacagcgga tcgaggtgct ggacaacaca cagcagctga
1861 aaattctggc cgactccatc aacagcgaga tcggcatcct gtgcagcgcc ctgcagaaaa
1921 tcaagtgata gttaattaag agcatcttac cgccatttat tcccatattt gttctgtttt
1981 tcttgatttg ggtatacatt taaatgttaa taaaacaaaa tggtggggca atcatttaca
2041 tttttaggga tatgtaatta ctagttcagg tgtattgcca caagacaaac atgttaagaa
2101 actttcccgt tatttacgct ctgttcctgt taatcaacct ctggattaca aaatttgtga
2161 aagattgact gatattctta actatgttgc tccttttacg ctgtgtggat atgctgcttt
2221 atagcctctg tatctagcta ttgcttcccg tacggctttc gttttctcct ccttgtataa
2281 atcctggttg ctgtctcttt tagaggagtt gtggcccgtt gtccgtcaac gtggcgtggt
2341 gtgctctgtg tttgctgacg caacccccac tggctggggc attgccacca cctgtcaact
2401 cctttctggg actttcgctt tccccctccc gatcgccacg gcagaactca tcgccgcctg
2461 ccttgcccgc tgctggacag gggctaggtt gctgggcact gataattccg tggtgttgtc
2521 tgtgccttct agttgccagc catctgttgt ttgcccctcc cccgtgcctt ccttgaccct
2581 ggaaggtgcc actcccactg tcctttccta ataaaatgag gaaattgcat cgcattgtct
2641 gagtaggtgt cattctattc tggggggtgg ggtggggcag gacagcaagg gggaggattg
2701 ggaagacaat agcaggcatg ctggggatgc ggtgggctct atggctctag agcatggcta
2761 cgtagataag tagcatggcg ggttaatcat taactacacc tgcaggagga acccctagtg
2821 atggagttgg ccactccctc tctgcgcgct cgctcgctca actgaggccg cccgggcaaa
2881 gcccgggcgt cgggcgacct ttggtcgccc ggcctcag
//