Tuesday, February 4, 2020

Quick interface update per user feedback

Release 4.2.1

A thanks to user J.G. who wrote: "I'm a software engineer, not a chemist. Approaching this website with just memory from a basic college chemistry class many years ago, so 'helpful/not helpful' is more like 'what parts felt natural to use/easy to understand'. That said, this is so well done that I actually opted into a survey about a website. Great software. Sorry I'm not proficient enough in the subject to offer much suggestion, but a point that slowed me down starting to try to make a molecule was that I saw carbon underneath the editor and first could not figure out how to drag that in to start (rather than starting with a skeleton on the left). Tutorial cleared that right up, though. If I have to start with a skeleton, perhaps hide the 'Additions' section or make it look visibly disabled until a skeleton is used."

I hope you don't mind me sharing your review! Per feedback, the additions panel now actually IS visibly disabled until the molecule has been created by first giving it a primary skeleton. This feedback is EXACTLY what we're looking for. Keep it coming!

Saturday, February 1, 2020

Support for molecules with cycloalkane side chains, nested side chains, side chains with ether attachments, side chains attached to the parent skeleton with pi bonds and with atoms other than the first Carbon in the side chain, and introduction of support for the reactions used in the synthesis of pyrimethamine

Release 4.2.0

Perhaps the longest title of any entry thus far in the organic chem master blog. This update introduces support for molecules with more complex side chains in general. The main impetus for this update was to show the potential to take the power to unjustifiably and drastically raise prices for drugs like Daraprim away from greedy biotechnology CEOs like Martin Shkreli. That is of course a very tall task, in no small part due to chemical patent restrictions, but hopefully the synthesis pathway search engine support added in this update is a step in that direction, and one day the tool will be able to provide alternative synthesis pathways for important life-saving medicines.

With that overarching goal in mind, the particular goal of this update was to empower the search engine to independently discover the same synthesis pathway to produce the drug Daraprim (pyrimethamine) that high school students in Sydney did in 2016. This pathway can be viewed in the image here: https://en.wikipedia.org/wiki/Pyrimethamine#/media/File:Pyrimethamine_traditional_synthesis.png .

The first step in achieving the discovery was to ensure that both the intermediate molecules involved in the synthesis and of course pyrimethamine itself were supported in both the interface and the search engine. The starting molecule, 1-chloro-4-(2-cyanoethyl)benzene, was actually already supported. The next intermediate molecule, 1-chloro-4-((2Z)-1-cyano-3-hydroxypent-2-en-2-yl)benzene, required adding support for side chains (in this case the 1-cyano-3-hydroxypent-2-en-2-yl radical) that are NOT attached to their parent chain at the first Carbon of the side chain. The subsequent intermediate molecule, the etherified 1-chloro-4-((2Z)-1-cyano-3-methoxypent-2-en-2-yl)benzene, required adding support for side chains containing ethers. This of course leads to the concept of nested side chains! That is, the parent skeleton of a molecule can contain a side skeleton that itself contains a side skeleton. Previously this was allowed in neither the interface nor the search engine, to keep the modeling simpler.

And, finally, support for the molecule pyrimethamine itself, or as known by its IUPAC name, 5-(4-chlorophenyl)-6-ethylpyrimidine-2,4-diamine. Support for this molecule specifically required adding support for side chains that are cycloalkanes, which in turn required the introduction of an algorithm to determine which of two attached cycloalkanes should function as the primary skeleton. In particular, should the pyrimidine ring be considered the primary skeleton of the molecule, or should the chlorobenzene ring?
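The exact rule the engine uses is not spelled out here, so the following is only a minimal sketch of how such a choice could be made. It assumes a simplified seniority order (a ring containing heteroatoms outranks an all-Carbon ring, then more heteroatoms, then larger ring size), which agrees with the pyrimidine ring being the parent in the IUPAC name above. The names Ring, seniority_key, and choose_primary_ring are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ring:
    """Hypothetical ring model: a name plus the element symbols around the ring."""
    name: str
    atoms: tuple


def seniority_key(ring):
    # Assumed, simplified ordering: heterocycle first, then heteroatom count, then size.
    hetero_count = sum(1 for symbol in ring.atoms if symbol != "C")
    return (hetero_count > 0, hetero_count, len(ring.atoms))


def choose_primary_ring(first, second):
    """Return whichever ring ranks higher under the assumed ordering."""
    return max((first, second), key=seniority_key)


pyrimidine = Ring("pyrimidine", ("N", "C", "N", "C", "C", "C"))
chlorobenzene = Ring("chlorobenzene", ("C", "C", "C", "C", "C", "C"))

primary = choose_primary_ring(pyrimidine, chlorobenzene)
print(primary.name)
```

Under these assumptions the pyrimidine ring becomes the primary skeleton and the chlorobenzene ring becomes the 4-chlorophenyl side chain.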

As a side note, support was also added for molecules containing side skeletons bonded to the primary skeleton with a pi bond, such as propylidenecyclohexane.

After support for ALL intermediate molecules and the product was added, it was time to add support for the reactions. Support for the following three reactions was added to the pathway search engine: Ethyl propionate condensation, Diazomethane etherification, and Guanidine condensation.

And once all modifications were in place, the search engine was able to successfully "rediscover" the synthesis pathway of Daraprim.
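As a rough illustration of what "rediscovering" the pathway means, here is a minimal sketch of a breadth-first pathway search. It is not the engine's actual code: the real engine applies reaction models to molecule models, while this sketch assumes a hand-written lookup table (REACTIONS) that links the molecules and reactions named above. The function name find_pathway is hypothetical.

```python
from collections import deque

START = "1-chloro-4-(2-cyanoethyl)benzene"
ENOL = "1-chloro-4-((2Z)-1-cyano-3-hydroxypent-2-en-2-yl)benzene"
ENOL_ETHER = "1-chloro-4-((2Z)-1-cyano-3-methoxypent-2-en-2-yl)benzene"
GOAL = "5-(4-chlorophenyl)-6-ethylpyrimidine-2,4-diamine"

# Assumed lookup table: molecule -> list of (reaction name, product).
REACTIONS = {
    START: [("Ethyl propionate condensation", ENOL)],
    ENOL: [("Diazomethane etherification", ENOL_ETHER)],
    ENOL_ETHER: [("Guanidine condensation", GOAL)],
}


def find_pathway(start, goal, reactions):
    """Breadth-first search; returns the list of (reaction, product) steps or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        molecule, steps = queue.popleft()
        if molecule == goal:
            return steps
        for reaction, product in reactions.get(molecule, []):
            if product not in seen:
                seen.add(product)
                queue.append((product, steps + [(reaction, product)]))
    return None


pathway = find_pathway(START, GOAL, REACTIONS)
for reaction, product in pathway:
    print(reaction, "->", product)
```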

Standards: Per usual, IUPAC naming rules were followed. In particular, the style for nomenclature used for radicals with a Carbon atom with a locant other than 1 attached to the parent skeleton was to use the locant followed by "-yl" or "-ylidene", as in (propan-2-yl)cyclohexane. The radical suffix "ylidene" was used to indicate the radical was attached to the parent via a double bond. The condensation reactions were modeled after the Wikipedia article, employing the strong deactivation properties of the cyano group. The etherification via diazomethane reaction was also modeled after the Wikipedia article.
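The radical naming style can be sketched in a few lines. This is only an illustration for simple saturated chains, not the nomenclature engine itself, and the function names substituent_name and attach_to_parent are hypothetical.

```python
def substituent_name(stem, locant, double_bond=False):
    """Name a simple saturated side chain from its stem and attachment locant.

    stem: chain stem such as "prop" (assumed to be supplied by the caller).
    locant: position of the side chain Carbon bonded to the parent skeleton.
    double_bond: True when the side chain is attached via a double bond.
    """
    suffix = "ylidene" if double_bond else "yl"
    if locant == 1:
        return f"{stem}{suffix}"
    return f"{stem}an-{locant}-{suffix}"


def attach_to_parent(substituent, parent):
    """Wrap substituent names that contain a locant in parentheses."""
    if any(character.isdigit() for character in substituent):
        return f"({substituent}){parent}"
    return f"{substituent}{parent}"


print(attach_to_parent(substituent_name("prop", 2), "cyclohexane"))
print(attach_to_parent(substituent_name("prop", 1, double_bond=True), "cyclohexane"))
```

The two calls reproduce the names used in this entry: (propan-2-yl)cyclohexane and propylidenecyclohexane.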

Controls: No new controls were introduced. The user can still create the molecules via the molecule design tool or by entering the IUPAC name in the interface, and then click the beaker icon to perform a synthesis pathway search.

Future Considerations: Hopefully even more power can be added to the pathway search engine via support for more complex molecules, more reactions, and more efficient search techniques in the future.

Wednesday, October 23, 2019

Introduction of click to select and place element option in normal size screen version of molecule design and fix for heteroatom tutorial bug

Release 4.1.2

Also, a rather quick update. First, a bug found in step 7 of the heteroatom tutorial: https://www.organicchemmaster.com/MolGen/Tutorial/HeteroatomMolecule, involving adding a double bonded Oxygen to the 2,4,8,9-tetraazabicyclo[4.3.0]nona-1,3,6-triene molecule thus far created, resulted in an error that prevented the user from finishing the tutorial. This bug has been fixed, so the user is indeed able to create the allopurinol molecule.

Second, much of the feedback I have received thus far seems to indicate that the generally preferred method for designing a molecule is to use the click to select an element and click to place the element modality, currently in use in the small screen version. As such, I have gone ahead and introduced the click to select and click to place modality as an OPTION to use in the normal size screen version in addition to the standard drag and drop modality. This option can be selected by clicking on the drag icon in the toolbar (four arrows) to switch to click mode. Clicking on the icon again switches back to drag and drop mode.

Controls: The user can now switch between the drag and drop modality of adding an element to the current molecule and the click to select an element and click to place the element modality while in normal size screen mode. This is accomplished by clicking on the drag (four arrows) icon to switch TO click mode and clicking on the arrow/pointer icon to switch TO drag and drop mode.

Thursday, August 1, 2019

Addition of heteroatom molecule design and creation tutorial

Release 4.1.1

A VERY short update. A tutorial was added to allow the user to get familiar with creating molecules involving both nitrogen heteroatoms and heterocyclic rings. The tutorial can be found at: https://www.organicchemmaster.com/MolGen/Tutorial/HeteroatomMolecule. The molecule chosen for the tutorial was allopurinol, as it was a motivating example for the previous update.

Thursday, July 25, 2019

Introduction of support for esters, Nitrogen heteroatoms, and bicyclic molecules

Release 4.1.0

This update mainly brings support for the following, more complex, molecules in the interface and the search engine: esters, Nitrogen heteroatoms and bicyclic molecules. The impetus for the full support of esters was actually to finish the work that had begun in a previous update, that is support for alkoxy side chains of hydrocarbon rings and the molecule 2-acetoxybenzoic acid (aspirin). While adding support for heteroatoms, I actually found the pyrimidine derived DNA bases (cytosine and thymine) to be especially useful as test cases. As such, when I was looking for one more direction to expand interface support for this update it was a logical next step to also add support for the purine DNA bases (adenine and guanine) which in turn required support of bicyclic molecules. This logical flow was actually coupled with my particular interest in the molecules uric acid and allopurinol, as a number of my close friends are rapidly approaching the advanced age that requires treatment of gout symptoms.

Previously, when support was added for 2-acetoxybenzoic acid (aspirin), the alkoxy side chain was treated as an acetyl group bonded by ether linkage to benzoic acid as opposed to an ester linkage between benzoic acid and an ethyl group. As such, FULL support for esters was not necessary. This update proceeds to add that full support for esters. Once support for esters was in place, I was able to add support for LiAlH₄ reduction of esters.

The approach to adding support for heteroatoms began with deciding which heteroatom to add first. The two main contenders were Oxygen and Nitrogen, with Nitrogen being chosen somewhat arbitrarily because of my particular interest in supporting DNA bases, uric acid, and allopurinol. Support began by first modifying the interface drawing engine to allow drawing of molecules containing rings and chains of elements other than Carbon. Once this update was made, the nomenclature engine for both the interface and search engine was updated to support heteroatoms. Pyrimidine was chosen as a goal case for support, as were the DNA bases cytosine and thymine. Of note, the pyrimidine derivative nomenclature style was NOT used in this update but support for this style will be added for future updates.

Once support for Nitrogen heteroatoms was in place for both the interface and the nomenclature engines, it was time to update the interface to allow the user to create molecules with heteroatoms. Simplicity and intuitiveness were the top two priorities for this interface addition. Two basic approaches were considered for this functionality: 1) Allowing the user to click on a Carbon atom to select it to be replaced and then clicking on the atom with which to replace it. 2) Introducing a "replace" mode that the user could toggle with an "add" mode. With the replace mode the user could first select an atom in the additions panel and then click on the atom to replace in the molecule. Ultimately the second option was chosen for two reasons: 1) the user interaction flow of clicking the atom to add to the molecule first then clicking the location to add/replace the new atom was maintained and 2) clicking on an existing Carbon atom in the molecule already has the function of swapping any stereochemistry associated with that Carbon. Fortunately this process is flexible enough that changes can still be explored. One last note is the approach chosen requires less accuracy in selecting a location, which is important for small screen versions.

Following the addition of support for Nitrogen heteroatoms and pyrimidine bases, a logical next step was to add support for the purine bases, which would in turn require support for bridged ring heterocyclic molecules. The first step to allow support for bicyclic molecules was to update the drawing engine. The molecule chosen as a first model for bicyclic molecules was bicyclo[4.4.0]decane. Support also required updating the nomenclature engine and modeling on both the interface and search engine. A specific challenge was introducing support for pi bonds between two bridgeheads, e.g. bicyclo[4.4.0]dec-1(6)-ene. For now, to create a molecule with a bicyclic ring system, the user has two options: 1) Select either bicyclo[4.4.0]decane or bicyclo[4.3.0]nonane (the backbone for purine bases) from the skeleton panel and add it to the workspace or 2) Enter the IUPAC name. Future support will likely include a process to create such a cyclo skeleton by "fusing" smaller skeleton components.

With support for both Nitrogen heteroatoms and bicyclic molecules, the interface and search engine now support all five DNA/RNA bases!

Standards: Standard IUPAC nomenclature rules were followed for the naming of esters, Nitrogen heteroatoms (using aza to indicate a Nitrogen atom substitution), and bicyclic molecules (specifically the bracket enclosed style of the lengths of the bridges in descending order and delimited by periods).
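As a small illustration of the bracket style, the sketch below builds the name of an unsubstituted bicycloalkane from its three bridge lengths. The stem table only covers a few sizes, and the names STEMS and bicyclo_name are hypothetical, not part of the nomenclature engine.

```python
# Alkane stems for the total number of ring atoms (only a few sizes, for illustration).
STEMS = {7: "hept", 8: "oct", 9: "non", 10: "dec"}


def bicyclo_name(bridges):
    """Build e.g. 'bicyclo[4.3.0]nonane' from three bridge lengths.

    Each bridge length counts the atoms between the two bridgeheads,
    so the total atom count is the sum of the bridges plus the two bridgeheads.
    """
    longest, middle, shortest = sorted(bridges, reverse=True)
    total_atoms = longest + middle + shortest + 2
    return f"bicyclo[{longest}.{middle}.{shortest}]{STEMS[total_atoms]}ane"


print(bicyclo_name((4, 4, 0)))
print(bicyclo_name((0, 3, 4)))
```

The two calls give the two skeletons offered in the skeleton panel, bicyclo[4.4.0]decane and bicyclo[4.3.0]nonane, regardless of the order in which the bridge lengths are supplied.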

Numbering for atoms that are part of a bicyclic ring system follows this rule per Chapter 13 of Organic Nomenclature by James G. Traynham: "Numbering of a bicycloalkane to indicate location of substituents begins at one bridgehead, proceeds around the longest bridge to the other bridgehead, continues around the second longest bridge back to the number 1 position (original bridgehead), and is completed across the shortest bridge." Once this rule is followed, the standard IUPAC rule of minimizing the locants of the substituents is followed. When a pi bond exists between two bridgeheads, the notation of the smallest numbered bridgehead followed by the other bridgehead in parentheses is used.
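The quoted numbering rule translates fairly directly into code. The following is a minimal sketch that only assigns locants to ring positions from the bridge lengths; it does not attempt the follow-up step of minimizing substituent locants. The function name number_bicycle and the dictionary keys are hypothetical.

```python
def number_bicycle(bridges):
    """Assign locants to a bicycloalkane from its three bridge lengths.

    Numbering starts at one bridgehead (1), runs along the longest bridge to the
    other bridgehead, continues along the second longest bridge back toward
    position 1, and finishes across the shortest bridge.
    """
    longest, middle, shortest = sorted(bridges, reverse=True)
    first_bridgehead = 1
    longest_bridge = list(range(2, 2 + longest))
    second_bridgehead = longest + 2
    middle_start = second_bridgehead + 1
    middle_bridge = list(range(middle_start, middle_start + middle))
    shortest_start = middle_start + middle
    shortest_bridge = list(range(shortest_start, shortest_start + shortest))
    return {
        "first_bridgehead": first_bridgehead,
        "longest_bridge": longest_bridge,
        "second_bridgehead": second_bridgehead,
        "middle_bridge": middle_bridge,
        "shortest_bridge": shortest_bridge,
    }


decalin = number_bicycle((4, 4, 0))
print(decalin)
```

For bicyclo[4.4.0]decane the bridgeheads land on positions 1 and 6, which is where the 1(6) in bicyclo[4.4.0]dec-1(6)-ene comes from.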

Controls: No control changes were introduced for ester support in the interface. The user can simply create an ester per standard interface controls.

The Add/Replace mode toggle was introduced in the Additions panel. The user can toggle the mode by clicking the "Add/Replace" text. When the mode is "Add" mode, the selected addition will be added to the location chosen as the target in the existing molecule per normal (either by drag and drop or by click in the small screen version). When the mode is "Replace" mode, the selected addition will replace the existing atom at the location chosen as the target. For replacements that are either not yet supported or not chemically possible, no substitution will occur in replace mode.
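The behavior of the two modes can be sketched as follows. This is a simplified illustration, not the interface code: the molecule is reduced to a list of element symbols with bonds left out, and SUPPORTED_REPLACEMENTS, toggle_mode, and apply_selection are hypothetical names, with Carbon to Nitrogen assumed to be the only supported replacement.

```python
# Assumed for illustration: only Carbon -> Nitrogen replacement is supported.
SUPPORTED_REPLACEMENTS = {("C", "N")}


def toggle_mode(mode):
    """Clicking the 'Add/Replace' text flips between the two modes."""
    return "replace" if mode == "add" else "add"


def apply_selection(atoms, target, element, mode):
    """Return a new atom list after applying the selected addition at the target.

    atoms: list of element symbols (bonds are omitted to keep the sketch short).
    target: index of the atom chosen as the target.
    """
    updated = list(atoms)
    if mode == "add":
        updated.append(element)
    elif (atoms[target], element) in SUPPORTED_REPLACEMENTS:
        updated[target] = element
    # Unsupported replacements fall through and leave the molecule unchanged.
    return updated


ring = ["C", "C", "C", "C", "C", "C"]
mode = toggle_mode("add")
print(mode, apply_selection(ring, 0, "N", mode))
```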

Bicyclic molecules can be created either by adding a bicyclo[4.4.0]decane or a bicyclo[4.3.0]nonane from the skeletons panel or by typing the name of the bicyclic molecule in the name field.

Future Considerations: More heteroatom molecules will be possible in future updates including Oxygen, Sulfur, Phosphorus, Silicon, and Boron heteroatoms. The controls for editing heteroatoms may be altered per any user feedback. Also, the interface may be modified to allow the user to "fuse" together skeleton components to create bicyclic molecules. Finally, the nomenclature engine for bridged ring systems will be updated to allow proper pyrimidine derivative names and purine derivative names where appropriate as well as purine/pyrimidine substituent numbering systems.

Finally, another call to please leave some feedback! In the comments, through our user feedback page, or the survey on our site. All feedback and interaction is appreciated! Let us know if the site is helping you in your chemistry endeavors!

Tuesday, April 9, 2019

Mobile and Small-Screen Optimization (Round One), addition of support for custom skeleton chains/segments, support for Iodine and Fluorine reactions, and support for free-radical chlorination

Release 4.0.0

The main motivation of this update was to introduce full mobile and small screen device support for the interface. To achieve this goal, two paths were considered: 1) Creating Android and iOS platform apps. 2) Modifying and optimizing the existing interface to work on small/smaller screen devices. Eventually, the goal will be to create full-fledged iOS and Android platform apps. For this update, however, the decision was made to keep the existing interface and tune it to work with small screens. The main reasoning for this decision is simply that it was easier to specialize the existing interface for smaller screens than it was to create an entirely new interface. Fortunately, I was also able to take advantage of the Bootstrap framework to allow stylistic changes for different page size breaks with the existing interface.

The first task for small screen optimization was chosen to be optimizing the home, or "splash" page. This was considered a good starting point because it showcases most of the features of the interface and it also is by default the first page the user views. Most stylistic modifications for the smaller screen were straightforward: wrapping text and interactive portions of the page from the same row to the next line to accommodate a smaller screen width, modifying the molecule drawing code to work with a smaller space, and using a carousel tool to display each molecule in the "Explore" section individually rather than displaying all molecules simultaneously.

The next part of the small screen optimization process involved optimizing the Reactions, Pathways, and Contact pages. These stylistic modifications were also straightforward and mainly involved re-positioning the page elements to work on a smaller screen width.

The final, and certainly most time intensive task, was to optimize the molecule design tool. A few chief considerations were made when figuring the best course to take:

  1. A goal was to make the workspace panel containing the molecule being designed/edited as large as possible, thus taking most of or the entire width of the screen in space limited environments.
  2. Given the lower precision of touch events on a mobile device, and the general use of dragging on a mobile device to move the portion of the page currently viewed, the decision was made to move away from the drag and drop method of adding atoms/skeletal chains and towards a click to select a new atom/skeletal chain and click to add the selected item to the existing molecule. This new method is also more similar to other molecule design tools and likely is more familiar to users.
  3. Although dragging on a mobile device is generally associated with moving the portion of the page currently viewed by the user, I also wanted to maintain the ability to pan to different areas of a complex molecule as well as to zoom in and out of the molecule.

Work began attempting to satisfy the first consideration: as large a workspace panel as possible. Clearly having it take the full width (or nearly the full width) of the screen while maintaining proper height to width ratio was a good place to start. It then became a question of where the panels for the skeletal attachments and individual attachments to be added would be placed. I first attempted to keep the skeletal attachments panel to the left of the workspace and allowed the user to collapse and expand the panel as desired. A similar approach was used to keep the individual atom attachments on the bottom. This ultimately proved unsuccessful, as it required the skeletal panel to overlay the molecule workspace when expanded, which made it difficult to properly place a new skeletal attachment, for example when adding an ethyl side chain to a pentane. I learned from this failed approach that it would be easiest to keep the skeletal and individual atom attachment panels separate from the main workspace panel.

After deciding to keep the panels of skeletal and individual atom attachments separate from and NOT overlaying the main workspace panel, it came time to decide where else on the screen to place the attachment panels. It seemed obvious that the individual atom panel would be placed on the bottom, as it was in that location in the original larger screen interface. As for the skeletal attachment panel, I first placed it above the workspace, but ultimately decided against this approach for two reasons: 1) it made more sense to keep both attachment panels near each other and 2) the top of the workspace panel was already associated with the function buttons of the molecule editor (cut, paste, new, etc.). With these two reasons in mind, I placed both attachment panels below the workspace panel.

Transforming from the drag and drop approach of designing a molecule to the click to select an element and click where the element would be placed approach was relatively straightforward. The code was actually designed to abstract and separate the layers of handling interface events and handling the changes to the molecule as much as possible, so I merely had to switch from wiring a drag and drop handler to indicate that an attachment should be added to the molecule to wiring two clicks to indicate the same process. Fortunately, this approach is also more similar to existing, commonly-used molecule design tools and can even be implemented as an alternative to the drag and drop system used for a large size screen.
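The layering described above can be pictured with a minimal sketch: one editing layer that knows how to add an attachment, and two thin input layers that both end up calling it. All class and method names here are hypothetical, and keeping the selection after a placement is an assumption made for the sketch.

```python
class MoleculeEditor:
    """Editing layer: knows nothing about how the request was triggered."""

    def __init__(self):
        self.attachments = []

    def add_attachment(self, item, target):
        self.attachments.append((item, target))


class DragDropInput:
    """Input layer for drag and drop: one drop event carries both item and target."""

    def __init__(self, editor):
        self.editor = editor

    def on_drop(self, dragged_item, target):
        self.editor.add_attachment(dragged_item, target)


class ClickClickInput:
    """Input layer for click to select, then click to place."""

    def __init__(self, editor):
        self.editor = editor
        self.selected = None

    def on_panel_click(self, item):
        self.selected = item

    def on_workspace_click(self, target):
        if self.selected is None:
            return  # nothing selected yet, so the click does nothing
        # Assumption: the selection is kept so the same item can be placed again.
        self.editor.add_attachment(self.selected, target)


drag_editor = MoleculeEditor()
DragDropInput(drag_editor).on_drop("ethyl", "C3")

click_editor = MoleculeEditor()
clicks = ClickClickInput(click_editor)
clicks.on_panel_click("ethyl")
clicks.on_workspace_click("C3")

print(drag_editor.attachments == click_editor.attachments)
```

Because both input layers call the same add_attachment method, swapping one modality for the other does not touch the molecule editing code.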

Finally, work was done to satisfy the third consideration: allowing the user to drag the molecule around in the workspace panel while still affording the user the expected functionality of dragging on a mobile page, that is, navigating to a different part of the same page. This was trickier and I'm actually experimenting with the solution for now. As such I more than welcome all feedback on this approach! The approach is as follows. When the user starts a drag on a mobile device with the user's finger over the molecule workspace, the DEFAULT mobile behavior will be executed: the page will move to the position that the user has dragged to. When the user FIRST clicks on the molecule workspace, THEN performs a drag, the molecule will move around inside the workspace panel instead of the page moving. The user may click repeatedly on the molecule workspace to toggle exactly what a drag operation does. This seemed to be an appropriate compromise, but I recognize that the user will need to get used to this functionality. With that in mind, the default result of a drag will be to move the position of the page.
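The click-to-toggle behavior amounts to a two-state switch. A minimal sketch, with hypothetical class and method names and with the drag outcome reduced to a label rather than real touch handling:

```python
class WorkspaceTouchState:
    """Tracks what a drag over the molecule workspace should do."""

    def __init__(self):
        # Default mobile behavior: a drag scrolls the page.
        self.drag_moves_molecule = False

    def on_click(self):
        """Each click on the workspace toggles what a drag does."""
        self.drag_moves_molecule = not self.drag_moves_molecule

    def on_drag(self):
        """Return which action a drag should trigger in the current state."""
        return "move_molecule" if self.drag_moves_molecule else "scroll_page"


state = WorkspaceTouchState()
print(state.on_drag())  # default: the page scrolls
state.on_click()
print(state.on_drag())  # after one click: the molecule moves
state.on_click()
print(state.on_drag())  # after a second click: back to scrolling
```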

Phew! Compared to the overhaul of the interface for mobile optimization, the rest of the updates were much more minor. They were as follows:

  1. The addition of support for custom skeleton chains/segments. The impetus for this addition was to allow the user to add common elements of a more complex molecule component, such as a benzene ring, without needing to recreate the element from scratch. The user can find the custom skeleton chains/segments by cycling through the arrows in the skeletal attachments panel.
  2. The addition of support for molecules and reactions involving the elements Iodine and Fluorine.
  3. Support for the free-radical Chlorination reaction.

Standards: Similar to modeling for other supported reactions, the Fluorination and Iodination reactions were modeled after this article: https://en.wikipedia.org/wiki/Halogen_addition_reaction. The free-radical Chlorination reaction was modeled after this article: https://en.wikipedia.org/wiki/Free-radical_halogenation. No other new standards were introduced with this update.

Controls: The major change in the controls of the interface involves the molecule design process on a small screen. The user will no longer add elements to a molecule in the design process by dragging an element from either the skeletal or individual atoms panels and then dropping the element on the chosen target in the workspace panel, but will instead click an element to select it and then click again on the target to add it. To compensate for the lower precision of click events on a smaller screen, I introduced a tolerance variable to adjust how easily a click is registered. The tolerance is set at 5 pixels for now. That is, any click within 5 pixels (in both dimensions) of the portion of the workspace where the atom is rendered will register as a click on that atom.
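The tolerance check can be sketched as a simple hit test. This is an illustration only: the atom records, the bounding box layout (left, top, right, bottom), and the names CLICK_TOLERANCE_PX and atom_at are assumptions, and the sketch returns the first matching atom rather than the nearest one.

```python
CLICK_TOLERANCE_PX = 5  # tolerance value from this release


def atom_at(click_x, click_y, atoms, tolerance=CLICK_TOLERANCE_PX):
    """Return the id of the first atom whose rendered box, grown by the
    tolerance in both dimensions, contains the click; None if there is no hit."""
    for atom in atoms:
        left, top, right, bottom = atom["box"]
        inside_x = left - tolerance <= click_x <= right + tolerance
        inside_y = top - tolerance <= click_y <= bottom + tolerance
        if inside_x and inside_y:
            return atom["id"]
    return None


# Hypothetical rendered atoms with pixel bounding boxes.
rendered_atoms = [
    {"id": "C1", "box": (100, 100, 112, 112)},
    {"id": "C2", "box": (140, 100, 152, 112)},
]

print(atom_at(96, 104, rendered_atoms))   # 4 pixels left of C1: still a hit
print(atom_at(120, 104, rendered_atoms))  # 8 pixels right of C1: a miss
```

Making the tolerance a parameter keeps the door open for the user-adjustable setting mentioned under Future Considerations.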

There is also now a tutorial icon button in the upper right of the control button section which will open up all available tutorials on the small screen. I actually recommend following the Intermediate Molecule Design Tutorial to become acquainted with the new process.

Finally, the user can add a custom skeletal segment to the molecule in EITHER small screen mode or normal mode by selecting the "Custom Skeletons" page of the skeletal attachments. 

Future Considerations: One major consideration is potentially allowing the user to CHOOSE between the drag and drop approach or the click to select and click to place approach on all size screens. This would be useful if the user prefers one approach more than the other. Also to be considered is adding a "zig zag" type tool for drawing alkane chains. Finally, to be considered is the best level for the tolerance of a click action. This might even be a feature that is adjustable by the user.

Like the small screen design? Hate it???? Is this making Chemistry any easier yet??? Feel free to respond!!

Friday, January 25, 2019

Introduction of Smart Search feature, fix to bug with the Cumene process reaction, minor stylistic and interface updates, fix for bug of adding a cycloalkane to a smaller linear chain in interface

Release 3.1.0

A pretty exciting update for the search engine with this release! This time, first I'll begin with the interface and search engine bug fixes:

  1. Upon using the Discover feature of the home page, the user will now be greeted with a more friendly message upon selecting a reaction that cannot be applied to the selected molecule.
  2. A minor stylistic fix on the Reactions section was made. For example, see the first reaction of the Calvin cycle, the arrow in the View Reaction panel now looks correct.
  3. Previously, in the interface, the user was unable to design a molecule with a cyclic alkane primary skeleton and a straight-chain alkane attachment (for example propylcyclohexane) by first dragging the straight-chain alkane into the workspace and then adding the cyclic alkane. The user was required to first add the cyclohexane and then add the propyl attachment. Now either order is possible.
  4. A bug was fixed in the search engine for the modeling of the Cumene process reaction. The hydroxylation of benzene now results in the proper search engine modeling of Phenol.

And the exciting part of the update: the introduction of a Smart Search feature for the search engine. This new feature uses a heuristic calculation to guide its pathway search from the origin molecule to the goal molecule. For this release, the heuristic used in the smart search is NOT admissible; that is, it may overestimate the cost of the synthesis pathway between a given intermediate molecule and the goal molecule. The synthesis pathway found will thus POSSIBLY be sub-optimal. This was an acceptable trade-off made for the first release/iteration of the smart search. Subsequent releases will use admissible heuristics only, to guarantee optimality. I am personally more than happy to explain more of the nature of the heuristic calculation used if you private or direct message me.
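To illustrate why admissibility matters, here is a minimal sketch of a heuristic-guided (A*-style) search on a toy graph. It is not the Smart Search implementation, whose heuristic is not described here: the node labels, edge costs, and heuristic values are all made up for the example, and guided_search is a hypothetical name.

```python
import heapq

# Toy graph (made up): node -> list of (next node, step cost).
EDGES = {
    "origin": [("intermediate A", 1), ("intermediate B", 2)],
    "intermediate A": [("goal", 1)],
    "intermediate B": [("goal", 2)],
}


def guided_search(start, goal, edges, heuristic):
    """Best-first search ordered by (cost so far + heuristic estimate)."""
    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        for next_node, step_cost in edges.get(node, []):
            new_cost = cost + step_cost
            if new_cost < best_cost.get(next_node, float("inf")):
                best_cost[next_node] = new_cost
                estimate = new_cost + heuristic(next_node)
                heapq.heappush(frontier, (estimate, new_cost, next_node, path + [next_node]))
    return None


def zero_heuristic(node):
    # Never overestimates, so it is admissible (and the search is uninformed).
    return 0


def overestimating_heuristic(node):
    # Claims intermediate A is 10 away from the goal when the true cost is 1.
    return {"intermediate A": 10}.get(node, 0)


optimal = guided_search("origin", "goal", EDGES, zero_heuristic)
guided = guided_search("origin", "goal", EDGES, overestimating_heuristic)
print(optimal)
print(guided)
```

With the zero heuristic the search returns the cost 2 route through intermediate A. With the overestimating heuristic it is steered away from intermediate A and returns the cost 4 route through intermediate B, which is the kind of sub-optimal result a non-admissible heuristic can produce.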

A motivating example for introducing a heuristic was the search to find a synthesis pathway from benzene to 2-acetoxybenzoic acid (aspirin) using ONLY the Pathway Calculations search option. This search had actually been previously accomplished utilizing both the Pathway Calculations and the MolGen Reactions search options. However, removing the MolGen Reactions search option (which basically provided a very strong hint for the search engine to begin with converting benzene to phenol) would result in the search timing out. The goal was to find such a synthesis pathway without the strong hint.

As I introduced the smart search option using a heuristic, I actually discovered the aforementioned bug in the modeling of the Cumene process reaction. Figuring it was more essential/urgent to fix the modeling bug, I went ahead and did so before proceeding with implementing the smart search feature. Lo and behold, fixing this bug actually resulted in a successful synthesis pathway search from Benzene to Aspirin using ONLY the Pathway Calculations search option! There was no longer a time out issue! Running a search for a pathway from Benzene to Aspirin using ONLY the Pathway Calculations search option (and NOT the smart search feature) will now successfully find the synthesis pathway.

As I had already begun working on implementing the smart search feature, I went ahead and finished that feature as well. I ran benchmark tests on my local development environment and did indeed find that the search performs faster with the Smart Search feature turned on. Tests can actually be performed on www.organicchemmaster.com as well, if the user wishes, comparing the search WITHOUT the Smart Search feature to the search WITH the Smart Search feature. The user SHOULD see a shorter search time for the latter, but I have less control in performing benchmark tests on the server that hosts www.organicchemmaster.com than I do in my local environment.

Standards: Per usual, IUPAC naming rules were followed. Specifically, in this case the fix to the interface allowing the user to attach a cyclic alkane to a straight-chain alkane results in the properly named molecule: the cyclic alkane being designated the parent skeleton chain and taking naming precedence. Of note, the heuristic used for the Smart Search feature is by design NOT an admissible heuristic for this release/iteration. It will thus NOT necessarily guarantee an optimal synthesis pathway. Turning the feature off will STILL result in an optimal and complete (if there is one) synthesis pathway.

Controls: The only significant update to the controls is adding the new Smart Search feature option. This option can be selected in the "Search" checkbox section under the options popup.

Future Considerations: Obviously, we will eventually want the Smart Search feature to use an admissible heuristic to guarantee optimality of the pathway search. There will be a lot of choices to be taken into consideration to improve the heuristic(s) used in terms of trading off heuristic function calculation time and search time. That said, I am looking forward to utilizing the additional power this update provides to the search engine!
