Thursday, August 1, 2019

Addition of heteroatom molecule design and creation tutorial

Release 4.1.1

A VERY short update. A tutorial was added to allow the user to get familiar with creating molecules involving both nitrogen heteroatoms and heterocyclic rings. The tutorial can be found at: The molecule chosen for the tutorial was allopurinol, as it was a motivating example for the previous update.

Thursday, July 25, 2019

Introduction of support for esters, Nitrogen heteroatoms, and bicyclic molecules

Release 4.1.0

This update mainly brings support for the following, more complex, molecules in the interface and the search engine: esters, Nitrogen heteroatoms and bicyclic molecules. The impetus for the full support of esters was actually to finish the work that had begun in a previous update, that is support for alkoxy side chains of hydrocarbon rings and the molecule 2-acetoxybenzoic acid (aspirin). While adding support for heteroatoms, I actually found the pyrimidine derived DNA bases (cytosine and thymine) to be especially useful as test cases. As such, when I was looking for one more direction to expand interface support for this update it was a logical next step to also add support for the purine DNA bases (adenine and guanine) which in turn required support of bicyclic molecules. This logical flow was actually coupled with my particular interest in the molecules uric acid and allopurinol, as a number of my close friends are rapidly approaching the advanced age that requires treatment of gout symptoms.

Previously, when support was added for 2-acetoxybenzoic acid (aspirin), the alkoxy side chain was treated as an acetyl group bonded by ether linkage to benzoic acid as opposed to an ester linkage between benzoic acid and an ethyl group. As such, FULL support for esters was not necessary. This update proceeds to add that full support for esters. Once support for esters was in place, I was able to add support for LiAlH₄ reduction of esters.

The approach to adding support for heteroatoms began with deciding which heteroatom to add first. The two main contenders were Oxygen and Nitrogen, with Nitrogen being chosen somewhat arbitrarily because of my particular interest in supporting DNA bases, uric acid, and allopurinol. Support began by first modifying the interface drawing engine to allow drawing of molecules containing rings and chains of elements other than Carbon. Once this update was made, the nomenclature engine for both the interface and search engine was updated to support heteroatoms. Pyrimidine was chosen as a goal case for support, as were the DNA bases cytosine and thymine. Of note, the pyrimidine derivative nomenclature style was NOT used in this update but support for this style will be added for future updates.

Once support for Nitrogen heteroatoms was in place for both the interface and the nomenclature engines, it was time to update the interface to allow the user to create molecules with heteroatoms. Simplicity and intuitiveness were the top two priorities for this interface addition. Two basic approaches were considered for this functionality: 1) Allowing the user to click on a Carbon atom to select it to be replaced and then clicking on the atom with which to replace it. 2) Introducing a "replace" mode that the user could toggle with an "add" mode. With the replace mode the user could first select an atom in the additions panel and then click on the atom to replace in the molecule. Ultimately the second option was chosen for two reasons: 1) the user interaction flow of clicking the atom to add to the molecule first then clicking the location to add/replace the new atom was maintained and 2) clicking on an existing Carbon atom in the molecule already has the function of swapping any stereochemistry associated with that Carbon. Fortunately this process is flexible enough that changes can still be explored. One last note is the approach chosen requires less accuracy in selecting a location, which is important for small screen versions.

Following the addition of support for Nitrogen heteroatoms and pyrimidine bases, the logical next step was to add support for the purine bases, which in turn required support for bicyclic (bridged ring) molecules. The first step was to update the drawing engine, and the molecule chosen as the first model for bicyclic molecules was bicyclo[4.4.0]decane. Support also required updating the nomenclature engine and modeling on both the interface and search engine. A particular challenge was introducing support for pi bonds between two bridgeheads, e.g. bicyclo[4.4.0]dec-1(6)-ene. For now, to create a molecule with a bicyclic ring system, the user has two options: 1) Select either bicyclo[4.4.0]decane or bicyclo[4.3.0]nonane (the backbone for purine bases) from the skeleton panel and add it to the workspace or 2) Enter the IUPAC name. Future support will likely include a process to create such a bicyclo skeleton by "fusing" smaller skeleton components.

With support for both Nitrogen heteroatoms and bicyclic molecules, the interface and search engine now support all five DNA/RNA bases!

Standards: Standard IUPAC nomenclature rules were followed for the naming of esters, Nitrogen heteroatoms (using aza to indicate a Nitrogen atom substitution), and bicyclic molecules (specifically the bracket-enclosed style that lists the lengths of the bridges in descending order, delimited by periods).

Numbering for atoms that are part of a bicyclic ring system follows this rule from Chapter 13 of Organic Nomenclature by James G. Traynham: "Numbering of a bicycloalkane to indicate location of substituents begins at one bridgehead, proceeds around the longest bridge to the other bridgehead, continues around the second longest bridge back to the number 1 position (original bridgehead), and is completed across the shortest bridge." Once this rule is followed, the standard IUPAC rule of minimizing the locants of the substituents is applied. When a pi bond exists between two bridgeheads, the lower-numbered bridgehead is given first, followed by the other bridgehead in parentheses.
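The descriptor and numbering rules above can be sketched in a few lines of Python. This is only an illustration of the quoted rule, not the site's nomenclature engine; the function names and the returned dictionary layout are made up for the example, and the step of minimizing substituent locants is left out.

```python
def bridge_descriptor(bridges):
    """Bracketed descriptor: bridge lengths in descending order, period-delimited."""
    ordered = sorted(bridges, reverse=True)
    return "[" + ".".join(str(n) for n in ordered) + "]"


def number_bicycle(bridges):
    """Assign locants following the quoted rule (hypothetical helper).

    bridges: the number of atoms in each of the three bridges,
    not counting the two bridgeheads.
    """
    longest, second, shortest = sorted(bridges, reverse=True)
    first_head = 1                                    # start at one bridgehead
    longest_bridge = list(range(2, 2 + longest))      # around the longest bridge
    second_head = 2 + longest                         # the other bridgehead
    second_bridge = list(range(second_head + 1, second_head + 1 + second))
    start = second_head + 1 + second                  # finish across the shortest bridge
    shortest_bridge = list(range(start, start + shortest))
    return {
        "bridgeheads": (first_head, second_head),
        "longest_bridge": longest_bridge,
        "second_bridge": second_bridge,
        "shortest_bridge": shortest_bridge,
    }


decalin = number_bicycle([4, 4, 0])
print(bridge_descriptor([4, 4, 0]), decalin["bridgeheads"])
```

For bicyclo[4.4.0]decane the two bridgeheads come out as positions 1 and 6, which is where the 1(6) in bicyclo[4.4.0]dec-1(6)-ene comes from.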

Controls: No control changes were introduced for ester support in the interface. The user can simply create an ester per standard interface controls.

The Add/Replace mode toggle was introduced in the Additions panel. The user can toggle the mode by clicking the "Add/Replace" text. In "Add" mode, the selected addition will be added at the location chosen as the target in the existing molecule as usual (either by drag and drop, or by click in the small screen version). In "Replace" mode, the selected addition will replace the existing atom at the location chosen as the target. For replacements that are either not yet supported or not chemically possible, no substitution will occur in replace mode.
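As a rough sketch of how the two modes differ, assuming a molecule is held as a simple list of element symbols plus a list of bonded index pairs (a simplification made up for this example, along with the SUPPORTED_REPLACEMENTS table and all names below):

```python
from enum import Enum


class Mode(Enum):
    ADD = "add"
    REPLACE = "replace"


# Hypothetical table of (existing element, new element) swaps that are allowed.
SUPPORTED_REPLACEMENTS = {("C", "N")}


def toggle(mode):
    """Clicking the "Add/Replace" text flips the mode."""
    return Mode.REPLACE if mode is Mode.ADD else Mode.ADD


def apply_selection(atoms, bonds, target, element, mode):
    """Return new (atoms, bonds) after applying the selected element at target."""
    atoms, bonds = list(atoms), list(bonds)
    if mode is Mode.ADD:
        atoms.append(element)                     # new atom bonded to the target
        bonds.append((target, len(atoms) - 1))
    elif (atoms[target], element) in SUPPORTED_REPLACEMENTS:
        atoms[target] = element                   # swap the atom in place
    # Unsupported replacements fall through: no substitution occurs.
    return atoms, bonds


ring = ["C"] * 6
ring_bonds = [(i, (i + 1) % 6) for i in range(6)]
aza_ring, _ = apply_selection(ring, ring_bonds, 0, "N", Mode.REPLACE)
print(aza_ring)
```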

Bicyclic molecules can be created either by adding a bicyclo[4.4.0]decane or a bicyclo[4.3.0]nonane skeleton from the skeletons panel or by typing the name of the molecule in the name field.

Future Considerations: More heteroatom molecules will be possible in future updates including Oxygen, Sulfur, Phosphorus, Silicon, and Boron heteroatoms. The controls for editing heteroatoms may be altered per any user feedback. Also, the interface may be modified to allow the user to "fuse" together skeleton components to create bicyclic molecules. Finally, the nomenclature engine for bridged ring systems will be updated to allow proper pyrimidine derivative names and purine derivative names where appropriate as well as purine/pyrimidine substituent numbering systems.

Finally, another call to please leave some feedback, whether in the comments, through our user feedback page, or in the survey on our site. All feedback and interaction is appreciated! Let us know if the site is helping you in your chemistry endeavors!

Tuesday, April 9, 2019

Mobile and Small-Screen Optimization (Round One), addition of support for custom skeleton chains/segments, support for Iodine and Fluorine reactions, and support for free-radical chlorination

Release 4.0.0

The main motivation of this update was to introduce full mobile and small screen device support for the interface. To achieve this goal, two paths were considered: 1) Creating an Android and iOS platform app. 2) Modifying and optimizing the existing interface to work on small/smaller screen devices. Eventually, the goal will be to create full-fledged iOS and Android platform apps. For this update, however, the decision was made to keep the existing interface and tune it to work with small screens. The main reasoning for this decision is simply that it was easier to specialize the existing interface for smaller screens than it was to create an entirely new interface. Fortunately, I was also able to take advantage of the Bootstrap framework to allow stylistic changes for different page size breaks with the existing interface.

The first task for small screen optimization was chosen to be optimizing the home, or "splash" page. This was considered a good starting point because it showcases most of the features of the interface and it also is by default the first page the user views. Most stylistic modifications for the smaller screen were straightforward: wrapping text and interactive portions of the page from the same row to the next line to accommodate a smaller screen width, modifying the molecule drawing code to work with a smaller space, and using a carousel tool to display each molecule in the "Explore" section individually rather than displaying all molecules simultaneously.

The next part of the small screen optimization process involved optimizing the Reactions, Pathways, and Contact pages. These stylistic modifications were also straightforward and mainly involved re-positioning the page elements to work on a smaller screen width.

The final, and certainly most time intensive task, was to optimize the molecule design tool. A few chief considerations were made when figuring the best course to take:

  1. A goal was to make the workspace panel containing the molecule being designed/edited as large as possible, thus taking most of or the entire width of the screen in space limited environments.
  2. Given the lower precision of touch events on a mobile device, and the general use of dragging on a mobile device to move the portion of the page currently viewed, the decision was made to move away from the drag and drop method of adding atoms/skeletal chains and towards a click to select a new atom/skeletal chain and click to add the selected item to the existing molecule. This new method is also more similar to other molecule design tools and likely is more familiar to users.
  3. Although dragging on a mobile device is generally associated with moving the portion of the page currently viewed by the user, I also wanted to maintain the ability to pan to different areas of a complex molecule as well as to zoom in and out of the molecule.

Work began with the first consideration: as large a workspace panel as possible. Clearly having it take the full width (or nearly the full width) of the screen while maintaining proper height to width ratio was a good place to start. It then became a question of where the panels for the skeletal attachments and individual attachments to be added would be placed. I first attempted to keep the skeletal attachments panel to the left of the workspace and allowed the user to collapse and expand the panel as desired. A similar approach was used to keep the individual atom attachments on the bottom. This ultimately proved unsuccessful as it required the skeletal panel to overlay the molecule workspace when expanded, which made it difficult to properly place a new skeletal attachment (for example, when adding an ethyl side chain to a pentane). I learned from this failed approach that it would be easiest to keep the skeletal and individual atom attachment panels separate from the main workspace panel.

After deciding to keep the panels of skeletal and individual atom attachments separate from and NOT overlaying the main workspace panel, it came time to decide where else on the screen to place the attachment panels. It seemed obvious that the individual atom panel would be placed on the bottom, as it was in that location in the original larger screen interface. As for the skeletal attachment panel, I first placed it above the workspace, but ultimately decided against this approach for two reasons: 1) It made more sense to keep both attachment panels near each other 2) the top of the workspace panel was already associated with the function buttons of the molecule editor (cut, paste, new, etc). With these two reasons in mind, I placed both attachment panels below the workspace panel. 

Moving from the drag and drop approach of designing a molecule to the approach of clicking to select an element and then clicking where the element should be placed was relatively straightforward. The code was actually designed to abstract and separate the layers of handling interface events and handling the changes to the molecule as much as possible, so I merely had to switch from wiring a drag and drop handler to indicate that an attachment should be added to the molecule to wiring two clicks to indicate the same process. Fortunately, this approach is also more similar to existing, commonly-used molecule design tools and can even be implemented as an alternative to the drag and drop system used for a large size screen.

Finally, work was done to satisfy the third consideration: allowing the user to drag the molecule around in the workspace panel while still affording the user the expected functionality of dragging on a mobile page, that is, navigating to a different part of the same page. This was trickier and I'm actually experimenting with the solution for now. As such I more than welcome all feedback on this approach! The approach is as follows. When the user starts a drag on a mobile device with the user's finger over the molecule workspace, the DEFAULT mobile behavior will be executed: the page will move to the position that the user has dragged to. When the user FIRST clicks on the molecule workspace, THEN performs a drag, instead of the page moving the molecule will move around inside the workspace panel. The user may click repeatedly on the molecule workspace to toggle exactly what a drag operation does. This seemed to be an appropriate compromise, but I recognize that the user will need to get used to this functionality. With that in mind, the default result of a drag will be to move the position of the page.
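The toggle described above amounts to a very small state machine. Here is a minimal sketch in Python, assuming a tap on the workspace simply flips a flag; the class and method names are hypothetical and do not come from the site's code.

```python
class WorkspaceGestures:
    """Tracks what a drag over the molecule workspace should do."""

    def __init__(self):
        # Default: a drag scrolls the page, as on any mobile page.
        self.drag_moves_molecule = False

    def on_tap(self):
        """Each tap on the workspace toggles the drag behavior."""
        self.drag_moves_molecule = not self.drag_moves_molecule

    def on_drag(self, dx, dy):
        """Return which action the drag is routed to, with its offsets."""
        if self.drag_moves_molecule:
            return ("pan_molecule", dx, dy)
        return ("scroll_page", dx, dy)


gestures = WorkspaceGestures()
print(gestures.on_drag(0, -40))   # page scrolls by default
gestures.on_tap()
print(gestures.on_drag(15, 10))   # molecule pans after a tap
```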

Phew! Compared to the overhaul of the interface for mobile optimization, the rest of the updates were much more minor. They were as follows:
  1. The addition of support for custom skeleton chains/segments. The impetus for this addition was to allow the user to add common elements of a more complex molecule component, such as a benzene ring, without needing to recreate the element from scratch. The user can find the custom skeleton chains/segments by cycling through the arrows in the skeletal attachments panel.
  2. The addition of support for molecules and reactions involving the elements Iodine and Fluorine.
  3. Support for the free-radical Chlorination reaction.

Standards: Similar to modeling for other supported reactions, the Fluorination and Iodination reactions were modeled after this article: The free-radical Chlorination reaction was modeled after this article: No other new standards were introduced with this update.

Controls: The major change in the controls of the interface involves the molecule design process on a small screen. The user will no longer add elements to a molecule in the design process by dragging an element from either the skeletal or individual atoms panels and then dropping the element on the chosen target in the workspace panel, but will instead click an element to select it and then click again on the target to add it. To compensate for the lower precision of click events on a smaller screen, I introduced a tolerance variable that controls how easily a click is registered. The tolerance is set at 5 pixels for now. That is, any click within 5 pixels (in both dimensions) of the portion of the workspace where the atom is rendered will register as a click on that atom.
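To make the tolerance rule concrete, here is a small Python sketch. It assumes each atom is represented by a single rendered center point, which is a simplification, and the names (CLICK_TOLERANCE_PX, atom_at) are invented for the example.

```python
CLICK_TOLERANCE_PX = 5  # the tolerance value mentioned above


def atom_at(click, atoms, tolerance=CLICK_TOLERANCE_PX):
    """Return the label of the atom hit by a click, or None.

    click: (x, y) position of the click in workspace pixels.
    atoms: dict mapping an atom label to its rendered (x, y) center.
    A click counts as a hit when it is within `tolerance` pixels of the
    atom in BOTH dimensions; the closest such atom wins.
    """
    best_label, best_offset = None, None
    for label, (ax, ay) in atoms.items():
        dx, dy = abs(click[0] - ax), abs(click[1] - ay)
        if dx <= tolerance and dy <= tolerance:
            offset = max(dx, dy)
            if best_offset is None or offset < best_offset:
                best_label, best_offset = label, offset
    return best_label


atoms = {"C1": (100, 100), "C2": (130, 100)}
print(atom_at((104, 97), atoms))   # within 5 px of C1 in both dimensions
print(atom_at((110, 100), atoms))  # too far from either atom
```

Passing a different `tolerance` argument is one way the value could later be made adjustable by the user.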

There is also now a tutorial icon button in the upper right of the control button section which will open up all available tutorials on the small screen. I actually recommend following the Intermediate Molecule Design Tutorial to become acquainted with the new process.

Finally, the user can add a custom skeletal segment to the molecule in EITHER small screen mode or normal mode by selecting the "Custom Skeletons" page of the skeletal attachments. 

Future Considerations: One major consideration is potentially allowing the user to CHOOSE between the drag and drop approach or the click to select and click to place approach on all size screens. This would be useful if the user prefers one approach more than the other. Also to be considered is adding a "zig zag" type tool for drawing alkane chains. Finally, to be considered is the best level for the tolerance of a click action. This might even be a feature that is adjustable by the user.

Like the small screen design? Hate it???? Is this making Chemistry any easier yet??? Feel free to respond!!

Friday, January 25, 2019

Introduction of Smart Search feature, fix to bug with the Cumene process reaction, minor stylistic and interface updates, fix for bug of adding a cycloalkane to a smaller linear chain in interface

Release 3.1.0

A pretty exciting update for the search engine with this release! This time, first I'll begin with the interface and search engine bug fixes:

  1. Upon using the Discover feature of the home page, the user will now be greeted with a more friendly message upon selecting a reaction that cannot be applied to the selected molecule.
  2. A minor stylistic fix on the Reactions section was made. For example, see the first reaction of the Calvin cycle, the arrow in the View Reaction panel now looks correct.
  3. Previously, in the interface, the user was unable to design a molecule with a cyclic alkane primary skeleton and a straight-chain alkane attachment (for example propylcyclohexane) by first dragging the straight-chain alkane into the workspace and then adding the cyclic alkane. The user was required to first add the cyclohexane and then add the propyl attachment. Now either order is possible.
  4. A bug was fixed in the search engine for the modeling of the Cumene process reaction. The hydroxylation of benzene now results in the proper search engine modeling of Phenol.

And the exciting part of the update: the introduction of a Smart Search feature for the search engine. This new feature will use a heuristic calculation to guide its pathway search from the origin molecule to the goal molecule. For this release, the heuristic used in the smart search is NOT admissible, that is it will potentially overestimate the cost for the synthesis pathway between any given intermediate molecule and the goal molecule. The synthesis pathway found will thus POSSIBLY be sub-optimal. This was an acceptable trade-off made for the first release/iteration of the smart search. Subsequent releases will use admissible heuristics only to guarantee optimality. I am personally more than happy to explain more of the nature of the heuristic calculation used if you private or direct message me.
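To illustrate the trade-off in general terms, here is a toy best-first search in Python. It is not the site's Smart Search; the graph, step costs, heuristic values, and names are all invented. With a heuristic that never overestimates (here, zero everywhere) the cheapest pathway is found, while a heuristic that overestimates for one intermediate steers the search to a more expensive pathway.

```python
import heapq


def find_pathway(start, goal, reactions, heuristic):
    """Best-first search ordered by cost so far plus the heuristic estimate."""
    frontier = [(heuristic(start), 0, start, [start])]
    best_cost = {start: 0}
    while frontier:
        _, cost, molecule, path = heapq.heappop(frontier)
        if molecule == goal:
            return cost, path
        for product, step_cost in reactions.get(molecule, []):
            new_cost = cost + step_cost
            if new_cost < best_cost.get(product, float("inf")):
                best_cost[product] = new_cost
                estimate = new_cost + heuristic(product)
                heapq.heappush(frontier, (estimate, new_cost, product, path + [product]))
    return None  # no pathway found


# Invented toy graph: two routes from "origin" to "goal".
reactions = {
    "origin": [("intermediate_1", 1), ("intermediate_2", 1)],
    "intermediate_1": [("goal", 1)],   # total cost 2 (the optimal route)
    "intermediate_2": [("goal", 3)],   # total cost 4
}


def zero(molecule):
    return 0  # never overestimates, so it is admissible


def overestimating(molecule):
    # Claims intermediate_1 is 10 away from the goal when it is really 1 away.
    return {"intermediate_1": 10}.get(molecule, 0)


print(find_pathway("origin", "goal", reactions, zero))
print(find_pathway("origin", "goal", reactions, overestimating))
```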

A motivating example for introducing a heuristic was the search to find a synthesis pathway from benzene to 2-acetoxybenzoic acid (aspirin) using ONLY the Pathway Calculations search option. This search had actually been previously accomplished utilizing both the Pathway Calculations and the MolGen Reactions search options. However, removing the MolGen Reactions search option (which basically provided a very strong hint for the search engine to begin with converting benzene to phenol) would result in the search timing out. The goal was to find such a synthesis pathway without the strong hint.

As I introduced the smart search option using a heuristic, I actually discovered the aforementioned bug in the modeling of the Cumene process reaction. Figuring it was more essential/urgent to fix the modeling bug, I went ahead and did so before proceeding with implementing the smart search feature. Lo and behold, fixing this bug actually resulted in a successful synthesis pathway search from Benzene to Aspirin using ONLY the Pathway Calculations search option! There was no longer a time out issue! Running the following search for a pathway from Benzene to Aspirin using ONLY the pathway calculations search (and NOT the smart search feature) will now successfully find the synthesis pathway.

As I had already begun working on implementing the smart search feature, I went ahead and finished that feature as well. I ran benchmark tests on my local development environment and did indeed find that the search performs faster with the Smart Search feature turned on. Tests can actually be performed on as well as the user wishes, comparing the search WITHOUT the Smart Search feature to the search WITH the Smart Search feature. The user SHOULD see a shorter search time for the latter, but I have less control in performing benchmark tests on the server that hosts than I do in my local environment.

Standards: Per usual, IUPAC naming rules were followed. Specifically, in this case the fix to the interface allowing the user to attach a cyclic alkane to a straight-chain alkane results in the properly named molecule: the cyclic alkane being designated the parent skeleton chain and taking naming precedence. Of note, the heuristic used for the Smart Search feature is by design NOT an admissible heuristic for this release/iteration. It will thus NOT necessarily guarantee an optimal synthesis pathway. Turning the feature off will STILL result in an optimal and complete (if there is one) synthesis pathway.

Controls: The only significant update to the controls is adding the new Smart Search feature option. This option can be selected in the "Search" checkbox section under the options popup.

Future Considerations: Obviously, we will eventually want the Smart Search feature to use an admissible heuristic to guarantee optimality of the pathway search. There will be a lot of choices to be taken into consideration to improve the heuristic(s) used in terms of trading off heuristic function calculation time and search time. That said, I am looking forward to utilizing the additional power this update provides to the search engine!

Tuesday, December 11, 2018

Upgrade to SSL site and addition of user accounts

Release 3.0.0

Although this update does not bring any interesting interface or search engine changes, it nevertheless is a pretty major milestone for the site as a whole as it involves a transition to the secure https protocol as well as the introduction of user logins.

In order to use the most secure login tools, it was necessary to migrate the site to use the ASP.NET Core framework. Once this migration was finished, two login processes were created: 1) The process to allow the user to log in with an account specifically for the site. 2) The process to allow the user to use open authentication to log in with a Google account. The hope is that this flexibility will allow the user to log in in his or her preferred manner.

I am excited about the doors that having user logins will open such as personalized stored pathways and the ability to interact with other site users! The login page can be reached at: and the register for account page can be reached at: .

Sunday, November 4, 2018

Introduction of Intermediate Molecule Design and Intermediate Pathway Search Tutorials, fix of nomenclature bug

Release 2.11.1

A shorter update than last time. Two new tutorials were added: 1) An intermediate molecule design tutorial and 2) An intermediate pathway search tutorial.

The goals for the new intermediate level tutorials were:

  1. Get the user comfortable with designing a molecule with a more complex base skeleton.
  2. Familiarize the user with the two methods of changing types of bonds between two atoms in a molecule: clicking on the existing bond, and selecting a new bond type from the drop down list in the inspector.
  3. Familiarize the user with the New (+), Undo, Redo, Copy, and Paste control buttons.
  4. Introduce the process of adding more complex side chains to the molecule, such as acetyl groups, to the user.
  5. Familiarize the user with the process of using the zoom in, zoom out, and drag controls to edit/create more complex molecules.
  6. Familiarize the user with a more complicated pathway search. This includes using the MolGen Reaction database as a source for the pathway search and a longer time limit, of one minute, to perform the search. 
  7. Introduce the user to a pathway search that has real world applications as part of the tutorial.

The chosen molecules for the tutorials were benzene and aspirin (2-acetoxybenzoic acid). Fortunately, the framework was already in place for creating the two new tutorials, so not much of the interface needed to be modified at all. The two new tutorials are up as the Intermediate Molecule Design tutorial and the Intermediate Pathway Search tutorial.

Additionally, one nomenclature bug was fixed. Previously, a molecule that contained a methyl side chain that in turn carried two or more of the same functional group would NOT properly display the multiplying prefix (di, tri) in the radical name. One example of this is (dihydroxymethyl)cyclohexane. This was previously named, incorrectly, as (hydroxymethyl)cyclohexane, even with two hydroxyl groups attached to the methyl radical. This bug was actually discovered while creating the molecule 2-acetoxy-1-(dihydroxymethyl)benzene, a molecule the user will create on the way to creating aspirin.
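The corrected behavior can be sketched as follows. This is a simplified illustration in Python rather than the actual nomenclature engine; the function name and the MULTIPLIERS table are made up, and only the case of repeated identical groups on a methyl side chain is covered.

```python
from collections import Counter

# Multiplying prefixes for repeated identical groups (a methyl side chain
# has at most three positions available for substituents).
MULTIPLIERS = {1: "", 2: "di", 3: "tri"}


def substituted_methyl_name(groups):
    """Name a methyl side chain from the prefixes of the groups attached to it.

    groups: e.g. ["hydroxy", "hydroxy"] for two hydroxyl groups.
    """
    counts = Counter(groups)
    parts = [MULTIPLIERS[count] + name for name, count in sorted(counts.items())]
    return "".join(parts) + "methyl"


side_chain = substituted_methyl_name(["hydroxy", "hydroxy"])
print("(" + side_chain + ")cyclohexane")
```

The bug described above corresponds to dropping the count, so that two hydroxy groups were named the same way as one.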

Thanks! Again, keep the comments and emails coming!

Friday, October 19, 2018

Introduction of Basic Molecule Design and Basic Pathway Search Tutorials, Undo/Redo feature, and create molecule by IUPAC name feature

Release 2.11.0

This update was directly inspired by a comment on the previous post:

Hi, is there any intro-level text or blog post you recommend before using your tool?

Great question! I think it was indeed high time some sort of introductory text or tutorial was, for lack of better word, introduced. I played with a few options: creating a blog post detailing how to get started, creating some sort of slide based demonstration, and finally implementing an interactive tutorial. I still might add the first two options, but I was most excited about the interactive tutorial; not only would it provide the most hands on way to introduce the tool to a new user, it would also serve as a great way to clean up/test the interface while designing the tutorial. Which in fact it did quite a bit.

The three goals for the first round of tutorials were: 
  1. Get the user able to create/design a simple molecule via the interface by physically adding and attaching the atoms of the molecule. In this case, ethanol (ethan-1-ol) was chosen as a good example molecule because it is simple in structure, a very well known molecule, and very readily reactive. 
  2. Get the user able to use the create molecule by IUPAC name feature. This feature involves the user typing the IUPAC (or common) molecule name in the appropriate field, then viewing the resulting molecule in the interface. Essentially, this is the opposite of the first goal: going from the IUPAC name to the molecule structure rather than vice versa. Ethanal (commonly acetaldehyde) was chosen as it also has a simple structure and can be formed via an oxidation of ethanol.
  3. Get the user able to perform a simple pathway search between two molecules. In this case, ethanol and ethanal, logically chosen because they were created in the first two steps.

The second goal actually required adding the feature of creating a molecule by IUPAC name to the main molecule workspace page. This feature was already on the home/splash page in the first section, so it just needed to be added to the molecule workspace page as well. The user can now click on the molecule name, enter a new molecule name, press enter, and see the new molecule created in the workspace.

After the create molecule by IUPAC name feature was finished, it became clear during the tutorial implementation process that having undo functionality would be VERY helpful. Namely, to provide step by step instructions on successfully completing each tutorial, it would be necessary to undo a user's step if it was incorrect. That is, for the first goal of creating ethanol, if the user added a Chlorine atom to ethane, for example, we would want to undo the addition of Chlorine and instead instruct the user to add an Oxygen/Hydroxyl group. Undo/redo functionality had been on the plate for a while now, so it seemed like a perfect time to implement it. For now, the maximum number of undos/redos allowed was set to five, though this may be changed in the future.
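One common way to build a bounded undo/redo history is with two stacks, sketched below in Python. This is a generic pattern and only an assumption about how such a feature could work; it stores whole molecule states (represented here by their names) and the class name is hypothetical.

```python
from collections import deque


class History:
    """Bounded undo/redo history that stores snapshots of the state."""

    def __init__(self, initial, limit=5):
        self.current = initial
        self._undo = deque(maxlen=limit)  # oldest snapshots fall off the far end
        self._redo = deque(maxlen=limit)

    def apply(self, new_state):
        """Record a user edit; a fresh edit clears anything that could be redone."""
        self._undo.append(self.current)
        self.current = new_state
        self._redo.clear()

    def can_undo(self):
        return bool(self._undo)

    def can_redo(self):
        return bool(self._redo)

    def undo(self):
        if self._undo:
            self._redo.append(self.current)
            self.current = self._undo.pop()
        return self.current

    def redo(self):
        if self._redo:
            self._undo.append(self.current)
            self.current = self._redo.pop()
        return self.current


history = History("ethane")
history.apply("chloroethane")   # the incorrect tutorial step
print(history.undo())           # back to ethane
history.apply("ethanol")        # the step the tutorial asks for
print(history.current, history.can_redo())
```

The can_undo and can_redo checks are the kind of test that could decide when the corresponding icon is disabled.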

With both the undo/redo feature and the create molecule by IUPAC name feature implemented, it was possible to finish the tutorials in mind to accomplish the three goals listed above. The first two goals were accomplished with the Basic Molecule Design and Creation tutorial. The third goal was accomplished with the Basic Pathway Search tutorial.

Standards: No significant new standards were introduced. Per normal, all existing IUPAC naming rules were followed.

Controls: The controls for the undo/redo functionality should be straightforward. The icons at the top control bar of the workspace are now enabled to allow the user to click undo and redo when desired. In the case of an undo or a redo not being possible/allowed, that particular icon will be disabled.

The user can utilize the create molecule by IUPAC name feature as follows: 1) Click on the molecule name, or on the text "Click to enter molecule name" if the workspace is currently blank. 2) Enter the IUPAC or common name of the desired molecule. (Note: this feature does not yet support EVERY possible molecule, but that indeed is the goal eventually.) 3) Press "enter".

Beginning the two tutorials that were added should be straightforward. The user can either click the "Open Tutorials" button in the Instructions panel, or click on one of the tutorial links on the Help page.

Future Considerations: The two added tutorials seem to be enough to get the user started. However, of course more tutorials for advanced molecule creation/design and pathway searches need to be added. Also, we MAY increase the possible number of undo steps. Finally, the create molecule by IUPAC name feature will eventually be improved to add a certain level of tolerance. That is, we would expect the user entering "1, 3-dichlorobutane" to result in the same molecule as "1,3-dichlorobutane". That extra space should be considered a tolerable discrepancy.
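As one possible shape for that tolerance, here is a small Python sketch that removes stray whitespace around commas separating locants before the name is parsed. The function name is hypothetical, and this handles only the specific discrepancy described above.

```python
import re


def normalize_name(name):
    """Tidy a typed molecule name before it is parsed."""
    name = name.strip()
    # Drop whitespace around a comma that sits between two digits,
    # so "1, 3-dichlorobutane" becomes "1,3-dichlorobutane".
    return re.sub(r"(?<=\d)\s*,\s*(?=\d)", ",", name)


print(normalize_name("1, 3-dichlorobutane"))
```

Only whitespace between locants is touched, so spaces that are part of a name are left alone.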

I greatly appreciate the user's comment and absolutely welcome more! Please go ahead and comment!
