Thursday, March 12, 2015

Lesson 4 - SMILES (Introduction to SMILES, and how to make SMILES notations using ChemSketch)

Bismillahirrahmanirrahim and good morning.

     Alhamdulillah, praise to Allah, we still have a chance to live in His blessing. Well, today we learned something new, which is SMILES. At first, we thought that SMILES was this:


However, the SMILES that we learned is different: it is about SMILES notations. What is SMILES? SMILES is an acronym for Simplified Molecular-Input Line-Entry System. It is a specification in the form of a line notation for describing the structure of chemical species using short ASCII strings. SMILES strings can be imported by most molecule editors for conversion back into two-dimensional drawings or three-dimensional models of the molecules (source : HERE).

     SMILES notation is widely used and is also very efficient computationally. It uses atomic symbols and a set of intuitive symbols. Furthermore, it uses hydrogen-suppressed molecular graphs (HSMG), which basically means that the hydrogen atoms are left out. For example, for propane, instead of drawing the full chemical structure, we just write it as CCC. This is what we call HSMG in SMILES notation, as we neglect the hydrogen atoms.
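
     To see that the suppressed hydrogens are not really lost, here is a minimal Python sketch that recovers the molecular formula from a SMILES string. It is only an illustration under a strong assumption: the string contains nothing but aliphatic C atoms and branch parentheses (no rings, no double or triple bonds, no other elements), so the molecule is an acyclic alkane with formula CnH(2n+2). The function name alkane_formula is our own and is not part of any library.

```python
def alkane_formula(smiles: str) -> str:
    """Return the molecular formula of an acyclic, saturated hydrocarbon.

    Assumption: the SMILES uses only 'C', '(' and ')'.
    """
    if set(smiles) - set("C()"):
        raise ValueError("this sketch only understands 'C', '(' and ')'")
    carbons = smiles.count("C")
    if carbons == 0:
        raise ValueError("no carbon atoms found")
    # Every acyclic alkane has 2n + 2 hydrogens for n carbons.
    hydrogens = 2 * carbons + 2
    return f"C{carbons}H{hydrogens}"


print(alkane_formula("CCC"))       # propane -> C3H8
print(alkane_formula("CC(C)CCC"))  # 2-methylpentane -> C6H14
```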

     In terms of a graph-based computational procedure, SMILES is a string obtained by printing the symbol nodes encountered in a depth-first tree traversal of a chemical graph. The chemical graph is first trimmed to remove hydrogen atoms, and cycles are broken to turn it into a spanning tree. Where cycles have been broken, numeric suffix labels are included to indicate the connected nodes. Parentheses, "()", are used to indicate points of branching on the molecular structure. For example, 2-methylpentane can be written as CC(C)CCC. For molecular bonds, we use symbols as follows :
  • for single bond we use - symbol (can be omitted)
  • for double bond we use = symbol
  • for triple bond we use # symbol
  • for aromatic bond we use : symbol (can also be omitted)
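
     The bond symbols above can be read off a SMILES string with a small lookup table. This is only a sketch: the names BOND_ORDER and explicit_bond_orders are our own, the aromatic ":" symbol is left out, and omitted single bonds are not counted. Characters inside square brackets are skipped, because there a "-" means a negative charge, not a bond.

```python
# Bond symbol -> bond order, following the list above.
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def explicit_bond_orders(smiles: str) -> list:
    """List the orders of the bonds that are written explicitly."""
    orders = []
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch in BOND_ORDER and not in_bracket:
            orders.append(BOND_ORDER[ch])
    return orders


print(explicit_bond_orders("O=C=O"))  # carbon dioxide: two double bonds
print(explicit_bond_orders("CC#N"))   # acetonitrile: one triple bond
```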
     
     For branches, as stated before, we use parentheses to indicate them. The parentheses can be either nested or stacked. Examples of branched molecules :
  • CC(O)CC is 2-Butanol
  • OCC(C)C is iso-Butanol
  • OC(C)(C)C is tert-Butanol
     Then, for aliphatic carbon, we use an uppercase C, and for aromatic carbon, we use a lowercase c. For atomic charges, we specify the atom and its charge inside square brackets. For example :
  • [H+] proton
  • [OH-] hydroxide anion
  • [OH3+] hydronium cation
  • [Fe+2] iron(II) cation
  • [NH4+] ammonium cation
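
     A bracket atom like the ones above can be taken apart with a regular expression. The sketch below is a simplification and not a full SMILES parser: it assumes one element symbol, an optional hydrogen count, and an optional charge written sign first (for example [Fe+2]); isotopes and chirality marks are not handled. The name parse_bracket_atom is our own.

```python
import re

# [ symbol, optional H count, optional sign + magnitude ]
BRACKET_ATOM = re.compile(r"\[([A-Z][a-z]?)(?:H(\d*))?(?:([+-])(\d*))?\]")


def parse_bracket_atom(token: str):
    """Return (symbol, attached hydrogens, charge) for a bracket atom."""
    match = BRACKET_ATOM.fullmatch(token)
    if match is None:
        raise ValueError(f"not a bracket atom this sketch understands: {token}")
    symbol, h_digits, sign, charge_digits = match.groups()
    if h_digits is None:
        hydrogens = 0
    else:
        hydrogens = int(h_digits) if h_digits else 1  # a bare 'H' means one
    if sign is None:
        charge = 0
    else:
        magnitude = int(charge_digits) if charge_digits else 1
        charge = magnitude if sign == "+" else -magnitude
    return symbol, hydrogens, charge


for token in ["[H+]", "[OH-]", "[OH3+]", "[Fe+2]", "[NH4+]"]:
    print(token, parse_bracket_atom(token))
```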


     For cyclic structures in SMILES, we can design notations by breaking one single or one aromatic bond in each ring and numbering the rings in any order. The ring-breaking atoms are designated by the same digit following the atomic symbol.

Numbers indicate the start and end of a ring :
  • The same number marks the start and the end of the ring, entered immediately after the start/end atoms
  • Only the numbers 1–9 are used
  • A number should appear only twice
  • An atom can be associated with 2 consecutive numbers, e.g., naphthalene: c12ccccc1cccc2
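
     The ring-numbering rules can be checked with a short Python sketch. It assumes single-digit ring labels only, as in the rules above, and it ignores digits inside square brackets because those are hydrogen counts or charges. The first time a digit is seen it opens a ring, and the second time it closes that ring. The name count_rings is our own.

```python
def count_rings(smiles: str) -> int:
    """Count ring closures and make sure every ring label is closed."""
    open_labels = set()
    rings = 0
    in_bracket = False
    for ch in smiles:
        if ch == "[":
            in_bracket = True
        elif ch == "]":
            in_bracket = False
        elif ch in "0123456789" and not in_bracket:
            if ch in open_labels:
                open_labels.remove(ch)  # second appearance closes the ring
                rings += 1
            else:
                open_labels.add(ch)     # first appearance opens the ring
    if open_labels:
        raise ValueError(f"unclosed ring label(s): {sorted(open_labels)}")
    return rings


print(count_rings("c1ccccc1"))        # benzene
print(count_rings("c12ccccc1cccc2"))  # naphthalene
```

     For benzene this gives 1 ring and for naphthalene 2 rings, which matches the two fused rings in the structure.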


There are a few restrictions in writing SMILES notations :

     Avoid two consecutive left parentheses if possible
     Strive for the fewest number of possible branches
     Tautomeric bonds are not designated; enter the appropriate form
     A branch cannot begin a SMILES notation
     A branch cannot immediately follow a double- or triple-bond symbol
     Example: C=(CC)C is invalid, but
     C(=CC)C and C(CC)=C are valid SMILES
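
     The two hard rules about branches (a branch cannot begin the notation, and a branch cannot come right after a double- or triple-bond symbol) can be checked mechanically. The sketch below is our own illustration, with the hypothetical name check_branch_rules; it also reports unbalanced parentheses, but it does not check the softer style advice in the list.

```python
def check_branch_rules(smiles: str) -> list:
    """Return a list of rule violations (empty if none were found)."""
    problems = []
    if smiles.startswith("("):
        problems.append("a branch cannot begin a SMILES notation")
    depth = 0
    for i, ch in enumerate(smiles):
        if ch == "(":
            if i > 0 and smiles[i - 1] in "=#":
                problems.append(
                    f"branch at position {i} follows a bond symbol")
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                problems.append(f"unmatched ')' at position {i}")
                depth = 0
    if depth > 0:
        problems.append("unmatched '('")
    return problems


print(check_branch_rules("C=(CC)C"))  # one violation reported
print(check_branch_rules("C(=CC)C"))  # no violations
print(check_branch_rules("C(CC)=C"))  # no violations
```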

For isomeric and chiral SMILES :

     Isomeric configuration around a double bond is indicated by forward and backward slashes: / \
     Examples:
          –trans-1,2-dibromoethene: Br/C=C/Br
          –cis-1,2-dibromoethene: Br/C=C\Br
     Chirality is indicated by the “@” symbol
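
     The slash rule can be shown with a tiny Python sketch. It only handles the simple pattern used in the examples, one substituent on each side of a C=C bond, which is an assumption of ours and not a general parser. When both slashes point the same way the substituents are on opposite sides (trans), and when they differ the substituents are on the same side (cis). The name double_bond_geometry is our own.

```python
import re

# substituent, slash, C=C, slash, substituent
SIMPLE_ALKENE = re.compile(r"^(\w+)([/\\])C=C([/\\])(\w+)$")


def double_bond_geometry(smiles: str) -> str:
    """Classify a simple X/C=C/Y style SMILES as 'cis' or 'trans'."""
    match = SIMPLE_ALKENE.match(smiles)
    if match is None:
        raise ValueError("this sketch only handles the X/C=C/Y pattern")
    first_slash, second_slash = match.group(2), match.group(3)
    return "trans" if first_slash == second_slash else "cis"


print(double_bond_geometry("Br/C=C/Br"))   # trans-1,2-dibromoethene
print(double_bond_geometry(r"Br/C=C\Br"))  # cis-1,2-dibromoethene
```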

     Next, we will demonstrate how to generate SMILES notations using ChemSketch.

  • GENERATING SMILES NOTATION FOR BENZYLAMIDE

In ChemSketch, we first clicked Structure and chose the benzene template. We continued by clicking the carbon symbol and bonding a carbon to the benzene ring. Lastly, we clicked the nitrogen symbol and connected it to the structure. To generate the notation, we clicked Draw, then went to Tools, chose Generate, and clicked SMILES Notation. At the end, the notation of the structure is given as follows :

  • GENERATING SMILES NOTATION FOR FLUORENE 

(1) Draw the fluorene compound using the template provided in ChemSketch.

(2) Using Draw mode, highlight the compound.

(3) Then, select ‘Tools’ in the Menu Bar.

(4) Choose ‘Generate’ in ‘Tools’.

(5) Lastly, select SMILES notation.

Finally, the SMILES notation is generated.

In addition, repeat (2) until (4) and select ‘Name for Structure’ to generate the structure’s name.

  • GENERATING SMILES NOTATION FOR PARACETAMOL

Draw N-(4-hydroxyphenyl)acetamide, or paracetamol, using ChemSketch. Open Structure mode and click C for carbon, O for OH, and N for NH, which are situated on the left side. Click the benzene ring on the right side to create a benzene ring. Open Tools in ChemSketch, select Generate, then select Name for Structure to insert the name. Select Generate again and this time select SMILES Notation to insert the notation.




Thursday, March 5, 2015

Lesson 3 : PDB - Protein Data Bank

Bismillahirrahmanirrahim...

     Alhamdulillah, in today's KOS1110 class, we learned about the Protein Data Bank, aka PDB. What is the PDB? The PDB is a repository for the three-dimensional structural data of large biological molecules, such as proteins and nucleic acids. These proteins and nucleic acids are the molecules of life that are found in all organisms including bacteria, yeast, plants, flies, other animals, and humans. Understanding the shape of a molecule helps to deduce a structure's role in human health and disease, and in drug development. The structures in the archive range from tiny proteins and bits of DNA to complex molecular machines like the ribosome. These protein and nucleic acid structures are commonly obtained by X-ray crystallography, NMR spectroscopy, and so on.

     The PDB archive is available at no cost to users. The PDB archive is updated each week at the target time of Wednesday 00:00 UTC (Coordinated Universal Time).

     The PDB was established in 1971 at Brookhaven National Laboratory under the leadership of Walter Hamilton and originally contained 7 structures. After Hamilton's untimely death, Tom Koetzle began to lead the PDB in 1973, and then Joel Sussman in 1994.  In 1998, the Research Collaboratory for Structural Bioinformatics (RCSB) became responsible for the management of the PDB. In 2003, the wwPDB was formed to maintain a single PDB archive of macromolecular structural data that is freely and publicly available to the global community. It consists of organizations that act as deposition, data processing and distribution centers for PDB data.

     In addition, the RCSB PDB supports a website where visitors can perform simple and complex queries on the data, analyze, and visualize the results. More details about the PDB can be found at this website.

     The PDB data format is made to contain coordinates. The main data format of the PDB is mmCIF, which is defined by a dictionary containing 2500 definitions. Furthermore, each data item is named with a syntax such as "_chem_comp_link.details". Finally, a PDB file that conforms to this syntax can be loaded by the system.
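
     To make the mmCIF naming syntax a little more concrete, here is a minimal Python sketch that splits a data name into its two parts. It assumes the usual mmCIF convention that a data name is an underscore, then a category name, then a period, then an item name. The function name split_data_name is our own and does not come from any PDB tool.

```python
def split_data_name(name: str):
    """Split an mmCIF data name into (category, item)."""
    if not name.startswith("_") or "." not in name:
        raise ValueError(f"not an mmCIF data name: {name}")
    # Drop the leading underscore, then split at the first period.
    category, item = name[1:].split(".", 1)
    return category, item


print(split_data_name("_chem_comp_link.details"))
```

     For the example above, the category is chem_comp_link and the item is details.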

     Examples of software for the PDB are pdb_extract, ADIT, and Validation Suite. The details of the software are as follows :
  • pdb_extract : extracts information from the output of crystallographic software and merges it into mmCIF
  • ADIT : comes in two types, web-based and standalone tools. Its function is to assemble, edit, validate, and deposit structural data
  • Validation Suite : creates reports

     For more information about the PDB, you can go to the website http://www.rcsb.org/. All the PDB data are stored on this website.

     RasMol is a program for molecular graphics visualisation. It is a sophisticated, yet user-friendly, molecular graphics program for viewing molecules. The program was released as freeware, available for download from the internet, by Roger Sayle in 1992 at the BioMolecular Structures Group of Glaxo Research & Development (UK). It is used by hundreds of thousands of users world-wide to view macromolecules and to prepare publication-quality images. RasMol can be downloaded from the internet, as it is an open source program.

     What is the difference between RasMol and RASWIN? When RasMol is compiled for use by Microsoft Windows users, it is called RASWIN.

     Next, we will show 5 types of proteins which we got from the RCSB PDB website. The 5 types of proteins are :
  • Amylase
  • Trypsin
  • Pepsin
  • Protease
  • Lipase
The details of the 5 proteins are as follows :

  • Amylase
Classification: Alpha Amylase 
Structure Weight: 48016.77 
Molecule: ALPHA-1,4-GLUCAN-4-GLUCANOHYDROLASE 
Polymer: 1 
Type: protein 
Length: 425 
Chains: A
EC#: 3.2.1.1
Mutation: E208Q 
Details: COMPLEXED WITH MALTOPENTAOSE 
Organism: Bacillus subtilis 
Gene Names: amyE amyA BSU03040 

  • Pepsin (3PSG)
 
Classification: Hydrolase(acid Proteinase Zymogen) 
Structure Weight: 39551.90 
Molecule: PEPSINOGEN 
Polymer: 1 
Type: protein 
Length: 370 
Chains: A
EC#: 3.4.23.1
Organism: Sus scrofa 
Gene Name: PGA 

  • Trypsin (7PTI, the bovine pancreatic trypsin inhibitor)


Classification: Proteinase Inhibitor (trypsin) 
Structure Weight: 6558.45 
Molecule: BOVINE PANCREATIC TRYPSIN INHIBITOR 
Polymer: 1 
Type: protein 
Length: 58 
Chains: A 
Organism: Bos taurus 

  • Protease (3B4R)
Classification: Hydrolase 
Structure Weight: 50019.76 
Molecule: Putative zinc metalloprotease MJ0392 
Polymer: 1 
Type: protein 
Length: 224
Chains: A, B
EC#: 3.4.24
Fragment: Site-2 Protease residues 1-224 
Organism: Methanocaldococcus jannaschii 
Gene Name: MJ0392 

  • Lipase (1GT6)
Classification: Lipase 
Structure Weight: 59218.33 
Molecule: LIPASE 
Polymer: 1 
Type: protein 
Length: 269 
Chains: A, B
EC#: 3.1.1.3
Mutation: YES 
Details: OLEIC ACID 
Organism: Thermomyces lanuginosus 
Gene Name: LIP

Wednesday, March 4, 2015

Chemsketch: A powerful tool for scientists especially chemists.

Bismillahirrahmanirrahim..............

     Well hello there world! Hopefully we are all fine with the blessing from Allah SWT. Amiin.

     In our second class of KOS1110 we learned something new. Before that, do you remember what we learned in our first class, which was about HTML, right? Yes, insyaAllah. And this week we learned something totally different from HTML, which is how to use the software called "ChemSketch".

Figure 1 : ChemSketch - the basic.

     As scientists, we must get used to applications like this, as they provide a useful service for drawing chemical compounds, whether organic, inorganic, or polymer compounds, and for other uses such as making atomic orbitals, DNA, lipids, energy diagrams, and much more. ChemSketch is one example. It has a lot of functions that enable all the uses mentioned earlier, and it can also help scientists (and science students) to do what commonly cannot be done using normal software, such as drawing chemical structures, drawing apparatus set-ups, and so on. The basic layout of ChemSketch is as shown in Figure 1.

     We learned 6 uses of ChemSketch in our second KOS1110 class, which were :
  • Drawing energy diagram
  • Making atomic orbitals
  • Drawing apparatus (Vacuum distillator)
  • Drawing DNA - two-chain DNA strand
  • Drawing Lipids and Micelles
  • Making tables (which shows the compositions and densities of some chemical compounds)
     Unlike the first HTML assignment, which we did individually, we did this ChemSketch assignment as a group. The uses of ChemSketch that we learned and applied in the assignment are as shown in Table 1 below :

Table 1 : Uses and descriptions
  • Drawing energy diagram : An energy diagram is normally used to sketch the flow of energy along a series of chemical reactions. It can also be used to determine the activation energy and whether the reaction is endothermic or exothermic, and much more.
  • Making atomic orbitals : Atomic orbitals have their own respective shapes, which are determined by the quantum number l. For example, a p orbital is an orbital with l=1, and a d orbital has l=2. Meanwhile, a pi-bonding orbital is formed when two orbitals from two atoms overlap side-by-side to form a covalent bond (pi bonding only occurs between two atoms joined by more than one bond; the other type is sigma bonding).
  • Drawing apparatus (Vacuum distillator) : A vacuum distillation is used when the boiling point of the compound (or the solvent) is too high (Tb > 150 °C), in order to distill the compound (or the solvent off) without significant decomposition. For more information, you can click HERE
  • Drawing DNA - two-chain DNA strand : Double-stranded DNA is simply two chains of single-stranded DNA, positioned so their "bases" can interact with each other. At left is a cartoon depiction of double-stranded DNA. The sugar-and-phosphate 'backbone' is depicted in red, and the bases are depicted in blue. Importantly, the two strands travel in opposite directions; hence the structure is said to be "anti-parallel". The bases in the middle "pair up" with bases on the opposite strand, so that a type 'A' nucleotide is always opposite a type 'T', and 'G' is opposite 'C'. The attraction between the paired nucleotides is fairly weak, but when there is a whole string of them, it adds up to enough strength to hold the strands together.
  • Drawing Lipids and Micelles : Micelles are lipid molecules that arrange themselves in a spherical form in aqueous solutions. The formation of a micelle is a response to the amphipathic nature of fatty acids, meaning that they contain both hydrophilic regions (polar head groups) as well as hydrophobic regions (the long hydrophobic chain). Micelles contain polar head groups that usually form the outside as the surface of micelles, and the hydrophobic tails are inside and away from the water since they are nonpolar.
  • Making tables : The table shows us the atomic compositions of three different compounds, which are benzene, naphthalene, and quinoline.

     The entries in Table 1 are short descriptions of what we did with ChemSketch in the previous class. We hope that all of us can benefit from the information given above.

     To conclude, after learning ChemSketch, we felt that this software can help us very much as Applied Chemistry students, especially for drawing chemical structures, drawing mechanisms, and much more. This is because ChemSketch has a lot of functions to help us, and the uses shown in Table 1 are only some of them. There are a lot of other functions of this software which we have yet to discover (InsyaAllah we'll try to discover them). This wide variety of functions makes the software fit this post's title, which is "Chemsketch: A powerful tool for scientists especially chemists".

Wallahu a'lam

Lesson 2 : CHEM Sketch

Bismillahirrahmanirrahim..............

     Well hello there world! I ask how are you? ("aku tanya apa khabar", a direct Manglish translation...haha). Hopefully we are all fine with the blessing from Allah SWT. Amiin.

     Well, well well well.... In our second class of KOS1110 we learned something new. Before that, do you remember what we learned in our first class? About HTML, right? Yes exactly, what a clever child. Haha (when are we going to get to the topic....). And...this week we learned something which is totally different (really?) from HTML, which is how to use the software ChemSketch. What is ChemSketch?

(for full update please click HERE)