Saturday, November 28, 2020

COVID-19: II. Remdesivir

Release 4.3.0

With hydroxychloroquine no longer a candidate for treating COVID-19, it was time to turn attention towards modeling the more likely helpful, yet more complex molecule, Remdesivir. This proved to be a daunting task. Rather than attempting to model Remdesivir entirely at once, I divided the molecule into four separate moieties. After modeling each of the four moieties individually, the plan was to combine them into the larger final molecule, Remdesivir.

More concretely, the steps taken for modeling were the following:

  1. Model the most complex of the four moieties, the fused ring structure containing a pyrrole azine fusion.
  2. Model the second most complex of the moieties, the structure containing the furan ring to which the pyrrole azine is attached. 
  3. Model the phosphoryl group to which the furan is attached.
  4. Model the structure containing the ester linkage to which the phosphoryl group is attached.
  5. Once all four groups were properly modeled individually, model all three combinations of two adjacent groups connected to each other. That is: the pyrrole azine and furan ring group, the furan ring and phosphoryl group, and finally the phosphoryl group and the ester-containing group.
  6. Once the three combinations of adjacent groups had been properly modeled, model the two combinations of three adjacent groups. That is: the pyrrole azine, furan ring, and phosphoryl group combination; and the furan ring, phosphoryl group, and ester-containing group combination.
  7. Finally, model all four individual groups as attached to each other thus forming Remdesivir.
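
The staged plan above amounts to enumerating every run of neighbouring moieties along the chain. A minimal sketch of that enumeration follows; the list and function names are illustrative assumptions, not code from the site:

```python
# Illustrative only: the four moieties in the order they are bonded.
MOIETIES = ["pyrrole azine", "furan ring", "phosphoryl group", "propanoate ester"]


def adjacent_combinations(groups, size):
    """Return every run of `size` neighbouring groups, in chain order."""
    return [tuple(groups[i:i + size]) for i in range(len(groups) - size + 1)]


# Steps 5, 6 and 7 correspond to run lengths 2, 3 and 4.
for size in (2, 3, 4):
    for combo in adjacent_combinations(MOIETIES, size):
        print(size, " + ".join(combo))
```

This yields three pairs, two triples, and the single full chain, matching the counts in steps 5 through 7.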
While steps 5 and 6 were not strictly necessary for joining the four individual moieties together into the overall model of Remdesivir, they did serve as very useful test cases. A bit more depth on each of the four moieties follows:

Pyrrole Azine moiety: Initially, the nomenclature of this moiety was beyond the scope of my organic chemistry knowledge. As such, my first step was to study a guide for fused-ring arenes and heterocycles. Once I felt confident enough, I went ahead and created the nomenclature logic, which is as follows:
  1. When checking the locant numbering for a bicyclo fused ring, check if the ring is naphthalene. If not, proceed to step 2.
  2. Check if the ring is aromatic. If so, proceed to step 3.
  3. Determine the name of the components of the fused ring (each individual ring).
    • Determine the main component (larger bridge length) and side component (smaller bridge length). 
    • Name the main and side components
    • Generate the fusion numbering, following the format [matching locant 1 of side component, matching locant 2 of side component - matching face of main component], where the matching locants and face correspond to the two atoms that are found in both components
    • Generate full fusion name of both components including fusion numbering
  4. Recheck the locant numbering for the newly created fused ring using proper fused ring locant rules.
  5. Name the entire fused ring using the full fusion name as the name of the primary skeleton (thus ignoring all heteroatoms and pi bonds in the fused ring as they have already been accounted for). Normal rules for naming primary and auxiliary functional groups as well as radical locants apply.
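
As a rough illustration of the string assembly in step 3, the sketch below builds a full fusion name from its parts. The function names are hypothetical, and the locant pair and face letter are passed in directly rather than derived from ring geometry, which is the hard part the real logic has to solve:

```python
def fusion_descriptor(locants, face):
    """Format the bracketed fusion numbering, e.g. (2, 1) and 'f' give '[2,1-f]'."""
    return "[" + ",".join(str(n) for n in locants) + "-" + face + "]"


def fusion_name(side_prefix, locants, face, main_name):
    """Join side-component prefix, fusion numbering, and main-component name."""
    return side_prefix + fusion_descriptor(locants, face) + main_name


# The fused ring system found in Remdesivir.
print(fusion_name("pyrrolo", (2, 1), "f", "[1,2,4]triazine"))
# pyrrolo[2,1-f][1,2,4]triazine
```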
Furan Ring moiety: This moiety was certainly less complex than the previous one. The main challenge was to adopt the nomenclature specific to furan molecules: first detecting whether the ring is of the furan family, then determining how many of the normal two double bonds were saturated and applying the locants for the hydrogenated carbons appropriately. One other challenge was properly handling side skeletons when determining the stereochemistry of each carbon in the tetrahydrofuran.
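
A simplified sketch of the two checks described here, detecting a furan-family ring and choosing the hydro prefix, might look like the following. The function names and the list-of-symbols ring representation are assumptions made for illustration; stereochemistry and side skeletons are not covered:

```python
def is_furan_family(ring_symbols):
    """A five-membered ring with exactly one oxygen and four carbons."""
    return sorted(ring_symbols) == ["C", "C", "C", "C", "O"]


def furan_name(saturated_locants):
    """Name the ring from the locants of the carbons that gained a hydrogen."""
    locants = sorted(saturated_locants)
    if not locants:
        return "furan"                      # both double bonds intact
    if len(locants) == 2:
        return f"{locants[0]},{locants[1]}-dihydrofuran"
    if len(locants) == 4:
        return "tetrahydrofuran"            # fully saturated, locants omitted
    raise ValueError("hydro positions must come in pairs")


print(is_furan_family(["O", "C", "C", "C", "C"]))   # True
print(furan_name([2, 3]))                           # 2,3-dihydrofuran
print(furan_name([2, 3, 4, 5]))                     # tetrahydrofuran
```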

Phosphoryl Group moiety: The phosphoryl group moiety was less complex still than the furan ring, but it had one tricky part: the primary skeleton contained zero carbon atoms. This challenge was overcome by recognizing a phosphoryl component via its length of one (a phosphorus atom) and its attachments of two hydroxy groups and one doubly bonded oxygen. Once this detection was accomplished, all attachments could be named as normal following the (attachment 1 name - attachment 2 name)phosphoryl convention. Specifically, an extra methane was used while modeling this group so that the phosphoryl group could be named properly as a radical.
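
The detection and naming pattern can be sketched as below. This is only an assumed representation: the skeleton is a list of element symbols, attachments are plain strings, "=O" stands for the doubly bonded oxygen, and both function names are hypothetical:

```python
def is_phosphoryl(skeleton, attachments):
    """A one-atom phosphorus skeleton carrying two hydroxy groups and one =O."""
    return (
        skeleton == ["P"]
        and attachments.count("hydroxy") == 2
        and attachments.count("=O") == 1
    )


def phosphoryl_radical_name(substituents):
    """Apply the (attachment 1 name - attachment 2 name)phosphoryl pattern."""
    return "(" + "-".join(substituents) + ")phosphoryl"


print(is_phosphoryl(["P"], ["hydroxy", "hydroxy", "=O"]))   # True
print(phosphoryl_radical_name(["methoxy", "phenoxy"]))      # (methoxy-phenoxy)phosphoryl
```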

Propanoate Ester moiety: Finally, the least complex of all the moieties was the propanoate ester group. Support for this group was already entirely in place, although there was room for further ester group testing.

Once all four moieties had been modeled and named properly, it was time to begin modeling the three combinations of two adjacent groups. The combinations in more depth are as follows:

Pyrrole Azine and Furan Ring: Certainly the most complicated of the three combinations of two adjacent groups. The first challenge was to provide the user with a convenient way to add a skeleton attachment at a SPECIFIC location of the attachment to the existing part of the molecule in the interface. The impetus for this interface enhancement actually BEGAN with the modeling of chloroquine, but was delayed for the time being as it was not necessary for the user to create chloroquine in the interface. As the pyrrole azine ring was attached to the furan ring specifically at its number 7 locant, this combination NECESSITATED the creation of such an enhancement.

After hashing out a few different ways of specifying which atom of the new skeletal attachment should be attached to the target atom of the existing molecule in the workspace, I decided to go with handling a new event. The user now has two options when adding a skeletal attachment to the molecule:
  1. The existing way. That is, clicking on the attachment and dragging it to a specific target atom on the molecule. By default, this will attach the atom numbered 1 of the new attachment.
  2. If the user instead clicks and HOLDS on the attachment for one second (a long press event), the attachment will then be expanded and the user will be able to click on which specific atom of the new skeletal attachment they want to attach to the target atom of the existing molecule. The user can then drag the skeletal attachment as usual to a target atom on the molecule. 
This interface enhancement will allow creation of Remdesivir and also allow easier creation of chloroquine.
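
The two attachment paths can be summarized in a small decision sketch. The function name, its parameters, and the threshold constant are illustrative assumptions, not the site's actual event-handling code:

```python
LONG_PRESS_SECONDS = 1.0  # hold time that expands the attachment


def attachment_atom(hold_seconds, picked_atom=None):
    """Return which atom of the new attachment bonds to the target atom.

    A short press (plain click and drag) always uses atom 1. A long press
    expands the attachment so the user can pick a specific atom first.
    """
    if hold_seconds >= LONG_PRESS_SECONDS and picked_atom is not None:
        return picked_atom
    return 1


print(attachment_atom(0.2))        # 1, the existing click-and-drag behavior
print(attachment_atom(1.5, 7))     # 7, e.g. the locant used for the pyrrole azine ring
```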

The only other challenge at this step was creating and running test cases to ensure that the stereochemistry still works properly with a radical attached not at the number one locant of the radical. 

Furan Ring and Phosphoryl Group: Again, methane was used as the primary skeleton to which the phosphoryl group was attached, so that nomenclature only needed to be developed for the phosphoryl group. This combination was rather straightforward to model and test. The one tricky part was implementing proper use of enclosing characters (parentheses, brackets, braces) for nested and sufficiently complex side chains. The convention used was, from outermost enclosing characters to innermost: braces, brackets, parentheses. The reader who is also a coder might appreciate the importance of separating nomenclature demarcations from coding symbols!

Phosphoryl Group and Propanoate Ester Group: Fortunately, the work in modeling a phosphoryl radical with a primary skeleton of methane proved useful in this step. Otherwise, the one tricky part was handling the nomenclature convention of treating the phosphoryl radical attached to the amino group as  phosphoryl)amino as opposed to N-phosphoryl-2-aminopropanoate. This was essentially handled with a special case for when such a group occurs. This case may be more generalized in the future.

And with those three combinations modeled and tested, it was time to turn our attention towards the two combinations of three adjacent moieties attached. The two combinations in more depth:

Pyrrole Azine and Furan Rings and Phosphoryl Group: The challenges for joining these three groups together were rather straightforward. One involved testing the need for doubly nested side chain enclosing characters. A number of test cases were developed to aid in getting this correct. The other challenge, while still straightforward, was rather tedious: verifying the proper stereochemistry of all the atoms in the furan ring with the complexity of the larger molecule. Many test cases and some very scrupulous debugging were required, both for the interface and the search engine.

Furan Ring, Phosphoryl Group, and Propanoate Ester Group: VERY fortunately, modeling these three groups worked immediately without the need for any additional code updates.

Remdesivir: With all the pieces in place, as well as all the pieces of all the pieces in place, it was now time to combine all four individual moieties at once into the larger, final molecule, Remdesivir. As in the previous step, modeling Remdesivir worked immediately without the need for any additional code updates. The one decision made was, since we now have TRIPLY nested side chains, to use braces again to enclose a side chain which contains braces already. This convention may change in the future, but it does not introduce any ambiguities in the full IUPAC name.

And with Remdesivir fully modeled, this update has been officially finished.

Standards: Existing IUPAC naming conventions were followed as usual. In particular, the fused ring nomenclature including naming of primary and side components as well as fusion numbering and ring numbering after the fusion naming used the following article: Rasmussen, S.C. The nomenclature of fused-ring arenes and heterocycles: a guide to an increasingly important dialect of organic chemistry. ChemTexts 2, 16 (2016). 

The order of enclosing demarcations followed was from outer most side chain to inner most side chain: {}, [], () with braces being used to handle nested side chains beyond three levels.
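
The enclosing rule can be expressed as a lookup keyed on how many enclosing levels a side chain already contains. The sketch below is an assumed formulation of that rule with placeholder chain names, not the naming engine's code:

```python
# Innermost to outermost; the last pair is reused for any deeper nesting.
PAIRS = [("(", ")"), ("[", "]"), ("{", "}")]


def enclose(name, inner_levels):
    """Wrap a side-chain name.

    inner_levels is the number of enclosing levels already present inside
    `name`: 0 gives parentheses, 1 gives brackets, 2 or more gives braces.
    """
    open_ch, close_ch = PAIRS[min(inner_levels, len(PAIRS) - 1)]
    return open_ch + name + close_ch


level0 = enclose("a", 0)             # (a)
level1 = enclose(level0 + "b", 1)    # [(a)b]
level2 = enclose(level1 + "c", 2)    # {[(a)b]c}
level3 = enclose(level2 + "d", 3)    # {{[(a)b]c}d}
print(level3)
```

Reading the result from the outside in gives braces, brackets, parentheses, with the extra outer braces appearing only for the triply nested case.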

The convention for naming a phosphoryl group attached to an amino group was followed per the PubChem article on Remdesivir.

Controls: The main enhancement for this update was to allow the user to specify which atom of a new alkane chain attachment to attach to the existing molecule. This was accomplished by introducing a long press event to the alkane chain attachments. The user will first press and hold on an alkane chain for one second which will cause that alkane chain to be zoomed in on. Next, the user will drag the mouse over the atom they wish to attach to the existing molecule. Finally, the user will drag the new alkane chain over the existing molecule. If the long press event is not triggered, by default the first atom of the new alkane chain will be attached to the existing molecule.

Future Considerations: The FOREMOST question to ask is whether Remdesivir will continue to be used in treatment for Covid-19 symptoms and, if so, in what way this site can most specifically aid in the production of Remdesivir. The first idea I have to continue down this path is to fully model a synthesis pathway of the drug, as was modeled for pyrimethamine. This will hopefully aid in the detection of any future more efficient or cheaper production models.

Otherwise, with the increasing complexity of molecules being modeled, it's clear the automatic zoom-out detection needs to be improved.

Some more accurate zooming functionality for an alkane side chain attachment after the long press event would be helpful. A tutorial update would also be useful for users new to this task.

Finally, implementing a set of clockwise and counterclockwise rotation buttons would be useful for examining certain parts of the more complex molecules. Work has actually already begun on this enhancement.

Tuesday, November 10, 2020

COVID-19: I. Chloroquine

2020 has been an unprecedented and disorienting year for everyone. To be honest, I had to look back through my notes to really put myself in a pre-Covid frame of mind to make a reasonable transition for this update. What were the goals, concerns and hopes for the site back in February 2020? After refreshing my memory, I did remember that a recent objective the pathway search engine had accomplished was to independently discover a synthesis pathway for Daraprim (pyrimethamine). And as always, finding ways to improve the interface and make it more user friendly was a high priority.

But when the world changed in mid-March, I decided that I would spend as much energy as I had for the site to see if I could possibly contribute to the fight against Covid-19. I knew that it might be a long shot, and of course any work here does not merit comparison with that of our front line and essential workers, but I did want to see if there is any part the site could play in helping the world solve the pandemic.

The first idea I came up with was to model one of the most promising drugs for treatment of Covid-19. In April, I considered modeling either Remdesivir or Chloroquine. Looking at the chemical structure of the two, I considered the modeling of Chloroquine far more feasible. In fact, some of the moieties of the Remdesivir structure I had not yet acquired the chemical knowledge to model or even properly name. And at the time, Hydroxychloroquine was legitimately being considered as an effective treatment.

The first step to implement support for Chloroquine was to look at the base fused ring component of the molecule. Fortunately, support in the interface was already in place for bicyclo[4.4.0]decane, so support only had to be added for the aromatic version of the fused ring, naphthalene, and then afterwards the more specific version quinoline. Support for these two mainly involved updating the IUPAC naming engine.

Next, support for tertiary amines needed to be added to handle the N,N-diethylpentan-2-yl side chain protruding from the amino group located at locant number 5 of the quinoline. This adjustment to the interface proved to be straightforward as well. I did make a mental note at the time that it would be MUCH more efficient to allow the user to select which carbon of an alkane chain addition they wished to attach to the current molecule; the process at the time of adding a pentan-2-yl side chain involved first attaching a butyl and then attaching a methyl to the head of the butyl.

Finally, I took a look at the resulting chloroquine molecule and thought to myself hmm, this is getting pretty convoluted and messy. And as the site is also optimized to work on a small screen device, cleaning it up became even more of a priority. I decided to implement a mode by which the user could view the molecule in a line structure format: where carbons are represented by a point and other atoms by their chemical symbol. After this clean up optimization, the resulting chloroquine molecule is easier to view and interact with.

Standards: Existing IUPAC naming conventions were again followed in the modeling of chloroquine. Specifically, once the interface recognized that the bicyclo[4.4.0]decane skeleton was aromatic, it named it as naphthalene. Furthermore, once it recognized a naphthalene with a nitrogen heteroatom at the 1 locant, it named it quinoline. There was some trial and error involved to ensure proper stereochemistry naming resulted.
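
The recognition order described here can be sketched as a short chain of checks. The function name and the locant-to-element dictionary are assumptions for illustration; anything outside the three cases mentioned is deliberately left unnamed:

```python
def fused_bicycle_name(aromatic, heteroatoms):
    """Name a bicyclo[4.4.0]decane-shaped skeleton.

    heteroatoms maps a ring locant to an element symbol, e.g. {1: "N"}.
    Returns None for cases this sketch does not cover.
    """
    if not aromatic:
        return "bicyclo[4.4.0]decane"
    if not heteroatoms:
        return "naphthalene"
    if heteroatoms == {1: "N"}:
        return "quinoline"
    return None


print(fused_bicycle_name(False, {}))        # bicyclo[4.4.0]decane
print(fused_bicycle_name(True, {}))         # naphthalene
print(fused_bicycle_name(True, {1: "N"}))   # quinoline
```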

Numbering of locants for the naphthalene and quinoline molecules follows the rules per Organic Nomenclature.

Controls: The "View Atom Abbreviation Mode" toggle was added to the control buttons. This allows the user to toggle between viewing full ball and stick molecule representations and line structure representations of the molecules.

Future Considerations: As mentioned previously, implementing support for chloroquine made it clear that the interface would be much more effective if the user could select which carbon of an alkane chain would be attached to the existing molecule when adding a chain. This would allow the user to select the second carbon of pentane when wishing to add the radical pentan-2-yl.
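
To make the pentan-2-yl example concrete, here is a minimal sketch of how the chosen carbon could map to a radical name for unbranched chains. The stem table and function name are hypothetical, and only chains of up to six carbons are covered:

```python
STEMS = {1: "meth", 2: "eth", 3: "prop", 4: "but", 5: "pent", 6: "hex"}


def alkyl_radical_name(carbons, attach_at=1):
    """Name an unbranched alkyl radical attached through carbon `attach_at`."""
    if not 1 <= attach_at <= carbons:
        raise ValueError("attachment carbon is outside the chain")
    stem = STEMS[carbons]
    # Number the chain from whichever end gives the lower locant.
    locant = min(attach_at, carbons - attach_at + 1)
    if locant == 1:
        return stem + "yl"
    return f"{stem}an-{locant}-yl"


print(alkyl_radical_name(5, 2))   # pentan-2-yl
print(alkyl_radical_name(4))      # butyl
```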

AND as most of us with some knowledge of the life sciences are aware, unfortunately hydroxychloroquine proved to NOT be effective as a treatment for COVID-19. Nevertheless, I took the enhancements of the interface and search engine provided from modeling chloroquine as valuable gains for the site, and turned my attention to the drug more promising at the time: Remdesivir.


OChemdle

In light of the recent popularity of games such as Wordle and its offshoots (Worldle, Octordle, Semantle, Redactle, etc), a conversation beg...