Python code for genetic code

[Originally written July 18 2020]

Upon entering Verona, Caliban greeted me with yet another egg, bringing the total count of unhatched Grendels in my inventory to seven. I decided that things were getting rather out of hand, so it would be wisest to clear the backlog by hatching all of them, examining their genetics, and exporting the ones who didn’t make the cut. Before I could get started though, I wanted to put a temporary stop to the egg madness, so I isolated Caliban for a while in the Comms Room. It was also at this point that I gave up on naming the creatures after characters I actually recognized or who were major characters, and just started picking names off an internet list.

Just as I hatched the final egg, I got a notification that another one was on its way, from Viola and Brutus. I waffled a bit on whether to stash the egg in my inventory or just airlock it immediately, but decided that was a choice for another time. I had lots of genetics to look at.

And since I had so many genetics to look at, I didn’t want to manually cross-reference them all. Instead, I improved my existing gendiff script to compare a child genome to parent genomes for me (and while I was at it, I fixed both the existing version and my new mutation finder so they would no longer look at the genus gene). After completing the script, I ran it on Viola against the default Jungle and Banshee Grendel genomes, since I already had a manually generated ground truth for this.

Through a very long time spent manually cross referencing, I had previously found 14 differences, all alterations. One was the genus gene and would not be counted in the new version of the script, so that left 13 differences. Eight were pigments and pigment bleeds, two were pose genes, one was an emitter, and two others I didn’t write down since they seemed to be false positives from gendiff.exe (or perhaps I just couldn’t see what was different about them). Sure enough…

But… that receptor removal is something I didn’t see before. Removals are tricky, since by nature you only have the parent’s gene number. That means you can’t just look up the child gene number in the other parent’s difference output and see if it also appears there, and since the same gene may have different numbers in the two parents, you can’t necessarily just look for mom’s gene ID in dad’s output either. Instead, my script compares mom to dad. If the kid’s omission relative to mom was inherited from dad, who already lacked the gene, then mom’s copy of the gene will appear as an insertion when you compare her to him. That’s what my script looks for, and if it doesn’t find that insertion, it concludes that dad isn’t missing the gene, which means the kid’s omission is a mutation.
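Here is a rough sketch of that check in Python, for illustration only. It is not the actual script: the `(kind, gene_id)` tuple representation of a diff entry, the function name `classify_omission`, and the gene IDs are all made up here, and parsing the gendiff.exe output is left out entirely.

```python
def classify_omission(mom_gene_id, mom_vs_dad):
    """Decide whether a gene the kid lacks (relative to mom) was
    inherited from dad or is a fresh mutation.

    mom_vs_dad is a hypothetical, already-parsed list of
    (kind, gene_id) tuples from comparing mom to dad, where kind is
    "insertion", "omission", or "alteration" and gene_id is mom's
    number for the gene.
    """
    for kind, gene_id in mom_vs_dad:
        # If mom has the gene and dad doesn't, mom's copy shows up
        # as an insertion in the mom-vs-dad comparison.
        if kind == "insertion" and gene_id == mom_gene_id:
            return "inherited"
    # No matching insertion: dad has the gene too, so the kid's
    # omission can't have come from him.
    return "mutation"


# Made-up gene IDs, purely to show the two outcomes.
parents_diff = [("alteration", "0012"), ("insertion", "0100")]
print(classify_omission("0100", parents_diff))  # inherited
print(classify_omission("0200", parents_diff))  # mutation
```

The point of comparing the parents directly is that everything stays in mom’s numbering, so the mismatch between mom’s and dad’s gene numbers never comes into play.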

If it finds an alteration of mom’s gene in the comparison between parents, that gives us the gene ID for the same gene in dad, but otherwise, there’s no way to look it up from gendiff.exe output, thus the ???? in the output above. That’s not such a huge loss, though, because if there’s no alteration entry, then dad’s version of the gene is the same as mom’s and there’s no need to look it up anyway.
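The lookup itself could be as simple as the sketch below. Again, this is a guess at the shape of the thing rather than the real code: the dictionary of alteration entries (mom’s gene ID mapped to dad’s gene ID) and the name `dad_gene_id` are assumptions for the example, and the IDs are invented.

```python
def dad_gene_id(mom_gene_id, parent_alterations):
    """Return dad's ID for the gene mom calls mom_gene_id.

    parent_alterations is a hypothetical dict built from the
    alteration entries of the mom-vs-dad comparison, mapping mom's
    gene ID to dad's gene ID. If the gene isn't listed, dad's copy
    is identical to mom's and its number is unknown, so fall back
    to a "????" placeholder.
    """
    return parent_alterations.get(mom_gene_id, "????")


# Invented IDs: one gene altered between the parents, one not.
alterations = {"0100": "0103"}
print(dad_gene_id("0100", alterations))  # 0103
print(dad_gene_id("0200", alterations))  # ????
```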

On seeing this output, I went back to look at my script and double-check the logic, but I couldn’t find anything wrong with it. So instead I opened up the Genetics Kit. Gene 0428 in the Banshee Grendel genome is the receptor for muscle toxin, and after finding out what it was, I manually found the corresponding gene in the Jungle Grendel. And sure enough, this gene is indeed missing in Viola (and by extension, Sebastian). Which, if I understand correctly, means they’re both immune to this toxin. Interesting! And I completely missed this with my manual check!
