Visualize connected voids in the unit cell

In the previous post, I have shown how to find the void in VMD. But if you want to visualize the connected voids, it will become a little complicated. At the beginning, I tried to use watershed algorithm to do that, during which I had an idea to do that in a much simpler way. I will introduce below.

  1. Use the following Tcl script to find all voids and put a Si atom in each void. And the end of the script, a XYZ file is generated to use Si to represent the voids in the unit cell.
  2. Use VMD to open the XYZ file.
  3. In the representation setting, you can select “fragment 0” to “fragment N” to see the connected voids. VMD can figure out the connected voids and treat them as different fragments. With Selection tool, you can even move the connected voids to anywhere you want for better representation.
  4. You can also use color method “fragment” to use different colors to highlight different fragments, but the default color scheme is not good.


#Change background color to white
color Display Background white

#set sphere radius
set probe 0.8

#set the box side-length, this is a cube
set length 39.91

#the side-length is divided into N grid, 50 is enough to visualize
set step [expr { $length/1.4 }]

#delete all graphics
draw delete all

#draw box, wrap the system
pbc box
pbc wrap
#draw color green
#recalculate the bond formation
mol bondsrecalc 0

set outfile [open "" w+]

set i 0
#find void
for {set x 0.5} {$x <= $length} { set x [expr {$x + $length/$step}]} {
        for {set y 0.5} {$y <= $length} { set y [expr {$y + $length/$step}]} {
                for {set z 0.5} {$z <= $length} { set z [expr {$z + $length/$step}]} {
                        set a [[atomselect top "sqrt( sqr(x-$x) +sqr(y-$y) + sqr(z-$z) )  < 2"] num]
                        if { $a == 0 } {
#                               draw sphere "$x $y $z" radius $probe
                                incr i
                                set X($i) $x; set Y($i) $y; set Z($i) $z


puts $outfile $i
puts $outfile "Connected Pores"
for {set m 1} {$m <= $i } {incr m} {
        puts $outfile "Si $X($m)  $Y($m)  $Z($m)"
close $outfile
mol new



HEATMAP for intramolecular hydrogen bond analysis of a 10mer copolymer.

I am working on some simulations of certain copolymer. This copolymer could form intensive intramolecular hydrogen bonds. To observe how each residue interacts with other residues to form intramolecular hydrogen bonds, I choose HEATMAP to present the results.


So the above figure can show the average hydrogen bonds formed among different residue pairs on one copolymer during a period.


I did this in three step.

  1. Calculate intramolecular hydrogen bonds occupancy with VMD.
  2. Find pairs from the output files.
  3. Plot with GNUPLOT (version 5.0 patchlevel 1).


So easy. Mama will never worry about my study anymore ~~~~~~


The VMD script is below. My system only contains water and polymer, so I use “not water” to simplify my selection.

package require hbonds
mol new copolymer.prmtop type parm7 first 0 last -1 step 1 filebonds 1 autobonds 1 waitfor all
mol addfile trajectory.dcd type dcd first 1 last -1 step 1 filebonds 1 autobonds 1 waitfor all
set sel1 [atomselect top "not water"]
set sel2 [atomselect top "not water"]
hbonds -polar yes -sel1 $sel1 -sel2 $sel2 -dist 3.5 -ang 50 -plot no -writefile yes -type pair -outfile vmd_hbond_chain.dat -detailout vmd_hbond_chain_detail.dat



The BASH to find pairs is below. This is 10mer copolymer. The names of residue in AMBER are MMH1 MMM2 MMM3 … MMM9 MME10. I chose perl to speed up the floating computing. It might save several minutes.

for i in {1..10};do echo \,${i};done | tr '\n' ' ' > hbond_intramolecular_occupancy.dat
echo >> hbond_intramolecular_occupancy.dat
for i in {1..10};do
 if (( $i == 1)); then j=MMH${i};fi
 if (( $i == 10)); then j=MME${i};fi
 for m in {1..10};do
  if (( $m == 1)); then n=MMH${m};fi
  if (( $m == 10)); then n=MME${m};fi
  a+=(`grep -E "${j}-.*${n}-" vmd_hbond_chain_detail.dat | awk '{print $3}' | cut -f1 -d"%" | perl -lne '$x += $_; END { print $x/100; }'`)
 echo ${i} ${a[*]} | tr ' ' ',' >> hbond_intramolecular_occupancy.dat
 unset a
unset a



The GNUPLOT script is below. I got help from And I did some slight modifications to fit my system.

set grid front
set xrange [*:]
set terminal pngcairo
set encoding iso_8859_1
set xl "Acceptor Residue"
set yl "Donor Residue"
# Color runs from white to green
set palette rgbformula -7,2,-7
set cbrange [0:2]
set cblabel "Number of Hydrogen Bonds per Frame"
set cbtics

set output "hbond_intramolecular_occupancy.png"
set title "Intramolecular Hydrogen Bonds"
set datafile separator ","
plot 'hbond_intramolecular_occupancy.dat' matrix rowheaders columnheaders using 1:2:3 with image noti
set datafile separator
set output


I am using CHARMM force field these days. So the corresponding files for MD simulation are PSF and PDB (PRMTOP and INPCRD in AMBER). The problem is that AMBERTOOLS has power tools of CPPTRAJ and parallel version CPPTRAJ.OMP. I used them a lot for commands like hbond, lie, radial or watershell, autoimage, etc. But CPPTRAJ can not deal with PSF very well.


The solution is converting PSF to PRMTOP, then use CPPTRAJ for the calculation.

Here, we need in AMBERTOOLS.

The input script is below. Run < script in the terminal.

chamber -top top_all36_prot.rtf -param par_all36m_prot.prm -str toppar_water_ions.str -psf your.psf -crd your.crd
outparm your.prmtop


So you may use the output PRMTOP in CPPTRAJ now. You need to check the PRMTOP to find proper residue names of the script for CPPTRAJ. WAT will be changed to TIP3 (in TIP3P water box).


Sometimes, I have to calculate self-diffusion coefficient, so I never wrap box during the simulation. I always use AUTOIMAGE in CPPTRAJ to get the box-shape trajectory. There is one more problem in this case. VMD can not read the PRMTOP got from PSF. We can use PSF instead of PRMTOP, then load DCD file for visualization.

vmd your.psf box-shape.dcd


Hydrodynamics radius

This is measured by dynamic light scattering (DLS). DLS output the diffusion coefficient of some particles in the solvent, which might be proteins or polymers.

Then use Stokes-Einstein equation, you can calculate the Radius. That Radius is hydrodynamics radius.

Although the measurement of DLS assumes the particle is hard sphere, it is not in reality. The shape of the proteins or polymers and solvent around them will affect the diffusion coefficient. So the value got from DLS should be hydrodynamics radius.

Hydrodynamics radius considers the geometric effect and the water dynamics effect, so the value is larger than that of radius of gyration. The ratio between RG and RH should be around 0.7-0.8. The calculation of RH in MD simulation can be found at the following paper.

Lee, H., Venable, R. M., MacKerell, A. D. & Pastor, R. W. Molecular Dynamics Studies of Polyethylene Oxide and Polyethylene Glycol: Hydrodynamic Radius and Shape Anisotropy. Biophys J 95, 1590–1599 (2008).

Dynamic Light Scattering

I was asked to do simulation to calculate the spring constant of a polymer. Steered MD simulation is not acceptable by my advisor. Synthia in Materials Studio can only give a prediction, it is not accurate. I am looking for other methods to do this now.

A literature talking about test the spring constant of polystyrene mentioned dynamic light scattering is used for the measurement (Measuring the Spring Constant of a Single Polymer Chain, H. Jensenius and G. Zocchi). I am trying to extract useful information and do some similar simulation. I have no clue what DLS is. The following is a good explanation. Cited from

Dynamic Light Scattering (DLS) is also known as Photon Correlation Spectroscopy (PCS). DLS is used to size particles form below 5ns to several microns. This technique operates on the principle that particles move randomly in gas or liquid. ie. undergo Brownian motion. The movement (diffusion) of these particles is described by the Stokes-Einstein equation (Eq.1).

DLSFig1 150x141 Dynamic Light Scattering 1c1726234f7132fc11fad12a71b4cf54 l3 Dynamic Light Scattering

(Eq. 1)

Where the diffusion (D) is equal to the product of Boltzman’s constant ( e4feb8c2565b3905e2bbdba78607b925 l3 Dynamic Light Scattering) divided by the hydrodynamic radius of the particle (R) of the particle and the shear viscosity of the solvent ( 744efd79f33f65e5887066be3e0cc271 l3 Dynamic Light Scattering). Larger particles have a slower velocity and will have smaller coefficients of diffusion than larger particles.

DLSFig2 150x150 Dynamic Light ScatteringIn most DLS systems a laser (i.e. HeNe) of known wavelength passes through a dilute sample in solution and the intensity of scattered light is collected by a detector and deconvoluted by algorithms to determine the particle size distribution of the sample. Figure B shows a schematic of a DLS instrument.The amount of scattered light collected is dependent on the molecular weight, size, and shape of a particle as well as the refractive indices of the particle and solvent. Before reaching the detector, the scattered light from individual particles experiences interference from those scattered by other particles all of which are moving randomly due to Brownian motion. This results in random fluctuations in time.

DLSFig3 150x150 Dynamic Light ScatteringFigure C shows a typical intensity vs time plot for three differently sized particles diffusing in solution. The time scale of the fluctuations shown in the figure is dependent on the particle diffusivity and size of the particles. Smaller particles will jitter about more quickly than larger particles. The figure shows represenative time vs intensity plots for “small” (3a), “medium” (3b) and “large” (3c) particles.

To determine the numerical size of the particles it is necessary to correlate intensity to the diffusion coefficient of the particles. This is done using an autocorrelation function or ACF (eq. 2). This function examines the changes in scattered intensity over periods of time for a volume of particles. In the case of a simple monodisperse particle size distribution (PSD) the ACF is a single decaying exponential function (eq. 3). After a series of calculations a decay constant (tt) is found that is inversely proportional to the diffusivity of a particle as shown in equations 4(a-b) where K is a constant called the “scattering wave vector”. This constant relates the time scale of the diffusion process to the distance scale set by the laser wavelength. K is shown in equation 5 and is dependent on the wavelength of the laser (λl), q is the angle of detection and the index of refraction of the solvent (n). Once the coefficient of diffusion is known the hydrodynamic radius can be determined using the Stokes-Einstein equation (eq. 6). 8d6df69cb9c021d5d6a51d8a5ac73453 l3 Dynamic Light Scattering 62779230f35b8beb8d4452ef129edaa7 l3 Dynamic Light Scattering bf0664ae8897b1b186b87da444926112 l3 Dynamic Light Scattering a4d3557d736f1573434a05ca4ea2e8b7 l3 Dynamic Light Scattering ea156754ef65921610b44b64406eaa55 l3 Dynamic Light Scattering cdc0d532993f3426269f95a66ce3ac31 l3 Dynamic Light Scattering

DLSFig4 150x150 Dynamic Light ScatteringThe simplest DLS result is of a monodisperse or uniform sample with a narrow size distribution. A monodisperse sample will have a bell shaped curve of narrow width (Figure D). Such a sample is quite rare. Most samples will have either a broad Gaussian single peak PSD indicating that the sample is not uniform and has perhaps a skewed unimodal or multimodal PSD. A simple Gaussian calculation will give highly inaccurate results.

DLSFig5 150x150 Dynamic Light ScatteringIn order to analyze samples that are not perfectly monodisperse, PSS employs a proprietary inverse Laplace transform called the NICOMP distribution analysis that does not assume a particular shape for the PSD and analyzes up to four independent parameters to determine PSDs.Figure E shows a PSD for a bimodal sample containing 220nm and 340nm polystyrene latex spheres. A conventional Gaussian analysis would not be able to discern the two mean diameters and would present a result with a mean diameter somewhere in between to the two.

Calculating heat of vaporization of solvent with MD simulation.

I was asked to calculate the solvation free energy of some solutes in Dimethylacetamide (DMA) with MD simulation. I disliked to develop own force field for DMAC. It’s mainly because I didn’t think my method to develop the force field was accurate. Since I used AMBER all the time, I tried to use GAFF for DMA at the beginning to testify whether it’s good for DMAC. I planed to use density and heat of vaporization to testify it.

Dr. Justin A. Lemkul at University of Maryland helped me a lot for this (

So I did the following steps to prepare the PRMTOP and INPCRD files for the simulation.

1. Build a DMAC structure with Molefacture in VMD and optimize the structure with SQM.

2. Use Gaussian to optimize the geometry by

MP2/6-31G* opt.

3. Use Gaussian to calculate ESP by

HF/6-31G* SCF=Tight Geom=AllCheck Guess=Read Pop=MK IOp(6/33=2,6/41=10,6/42=17)

4. Use Antechamber to get PREP and FRCMOD file

antechamber -i esp.log -fi gout -o dma.prep -fo prepi -at gaff

antechamber -i dma.prep -f prepi -o dma.frcmod -p $AMBERHOME/dat/leap/parm/gaff.dat

5. Edit dma.prep, change MOL to DMA. Edit NEWPDB.PDB, change MOL to DMA.

6. Use TLEAP to build a liquid and vacuum system.

#the script to build a liquid and vacuum system

source leaprc.gaff

loadamberprep dma.prep

loadamberparams dma.frcmod

a = loadpdb NEWPDB.PDB

check a

charge a

saveamberparm a dma_vac.prmtop dma_vac.inpcrd  #files for vacuum system

solvatebox a DMA 15.0

check a

charge a

saveamberparm a dma_liquid.prmtop dma_liquid.inpcrd    #files for liquid system


The following was to do short MD simulation with NPT. I did 10ns simulation for liquid system and 100ns for vacuum system at 300K. And then restart a new system for both 1ns. And we would use the 1ns simulation results for heat of vaporization calculation.

The end of AMBER output were below.

liquid averaged result
117 molecule

NSTEP = 1000000 TIME(PS) = 11020.000 TEMP(K) = 300.22 PRESS = 19.0
Etot = 3017.4941 EKtot = 1570.5383 EPtot = 1446.9559
BOND = 579.1624 ANGLE = 766.5262 DIHED = 397.1170
1-4 NB = 298.1792 1-4 EEL = 7051.5797 VDWAALS = -942.9803
EELEC = -6702.6283 EHBOND = 0.0000 RESTRAINT = 0.0000
EKCMT = 104.7960 VIRIAL = 97.3657 VOLUME = 18100.2857
Density = 0.9352

vacuum averaged result
1 molecular

NSTEP = 1000000 TIME(PS) = 101000.000 TEMP(K) = 298.44 PRESS = 0.0
Etot = 37.5408 EKtot = 13.3439 EPtot = 24.1970
BOND = 5.0303 ANGLE = 6.4772 DIHED = 3.3979
1-4 NB = 2.5018 1-4 EEL = 60.2780 VDWAALS = 0.7379
EELEC = -54.2261 EHBOND = 0.0000 RESTRAINT = 0.0000

The following is the equation for the calculation.


Assuming Vgas >> Vliq

dHvap=Ugas – Uliq + pVgas = Ugas – Uliq + RT

U is the sum of kinetic and potential energies. So the heat of vaporization of DMA in this case is

Ugas = (13.3439 + 24.1970) / 1 = 37.5409 kcal/mol

Uliq = ( 1570.5383 + 1446.9559 ) / 117 = 25.7905 kcal/mol

RT = 8.314 * 300 / 1000 / 4.184 = 0.5961 kcal/mol

dHvap= 37.5409 – 25.7905 + 0.5961 = 12.3465 kcal/mol

The experimental data is 12.00 kcal/mol at 298K (James S. Chickos, “Enthalpies of Vaporization of Organic and Organometallic Compounds, 1880–2002,” Journal of Physical and Chemical Reference Data 32, no. 2 (2003): 519, doi:10.1063/1.1529214.)

The density is 0.9352g/ml and the experimental data is 0.937g/ml.

So the density and heat of vaporization both fit well with experimental data.

GAFF is good for DMA!!

Continue the tricky calculation of solvation free energies with TI : (

Surface Plasmon Resonance

I saw this method today in one paper. It’s used to detect the binding between a small ligand and Fc chains of human antibody. The principle is simple. But technology is cool! The following like is from YouTube, the video illustrated the pricinple.

The combination between ligands and antibody could affect the electron on the the surface, and then affect the reflect lights.

Just like many other methods, SPR magnifies the microscopic change of system situation to macroscopic phenomenon.