Extract neat trajectory file from irregular AMBER trajectory files.

I usually chose 1fs as time step and output the trajectory file each picosecond. If the job can finish as it’s planned. I could get 5ns or 10ns trajetory files. But sometimes, due to the breakdown of HPCC or some problems of the system. I can not get a neat trajectory file. They might be 3.454ns or 8.332ns, etc. You may use CPPTRAJ to extract snapshots by simply using “start..end..interval” syntax. But you will find the output number is not what your simulation have run.

What I want to do in this blog is to extract trajectory files from a set of irregular trajectory files to 100ps/frame, which is much easier for analysis.

The BASH script is below. I have two solutions. One is using “start..end..interval” syntax. The other is using “onlyframes” when output. I prefer the first one. In the extracted trajectory file, the first frame is 100ps, the second is 200ps, and so on. I make use of the time information in RST file. OUT file is also fine, but sometimes it’s not accurate. When RST is generated, MDCRD file is also updated. So the time in RST file is better. You may use CPPTRAJ in AMBERTOOLS to run the output script.

list=`ls -lrtx1 md*.rst | head -n -1` #ignore the running job
d=(`ls -lrtx1 *.mdcrd* | head -n -1`)  #ignore the running job
a=(`for i in $list; do sed -n 2p ${i} | awk '{print $2}';done`)
for ((i=0;i<${#a[*]};i++));do
        ((b=`echo ${a[$i]} | grep E | wc -l`))
        if [ $b == 1 ];then
        m=`echo ${a[$i]} | cut -f1 -d"E"`
        ((n=`echo ${a[$i]} | cut -f2 -d"E"`))
        c+=(`echo "$m*(10^$n)"|bc | cut -f1 -d"."`)
        else
        c+=(`echo ${a[$i]} |cut -f1 -d"."`)
        fi
done
unset a

#echo ${c[*]} | tr ' ' '\n'
#echo ${d[*]}
((end=${c[${#c[*]}-1]}))
#echo end = $end
for ((i=100;i<=$end;i+=100));do
        ((sum=0))
        for ((j=0;j<${#c[*]};j++));do
                if (( ${c[$j]} >= $i )); then
                ((m= $i - $sum))
                        if (( $m >= 0 ));then
                        e+=(`echo ${d[$j]}_${m}`)
                        else break
                        fi
                fi
                ((sum=${c[$j]}))
        done
done


f=(`echo ${e[*]} | tr ' ' '\n' | cut -f1 -d"_" | uniq`)
echo parm \*prmtop
for ((i=0;i<${#f[*]};i++));do
       m=`echo ${e[*]} | tr ' ' '\n' | grep ${f[$i]} | cut -f2 -d"_" | head -1`
       n=`echo ${e[*]} | tr ' ' '\n' | grep ${f[$i]} | cut -f2 -d"_" | tail -1`
       echo trajin ${f[$i]} $m $n 100
done
md=(`echo ${f[$i-1]} | cut -f1 -d"."`)
echo autoimage
echo trajout till_${md}.dcd dcd


#solution to use "onlyframes"
#for ((i=0;i<${#f[*]};i++));do
#       m=`echo ${e[*]} | tr ' ' '\n' | grep ${f[$i]} | cut -f2 -d"_" | xargs echo | tr ' ' ','`
#       echo trajin ${f[$i]}
#       echo trajout test.dcd dcd onlyframes $m
#done

unset a b c d e f

Leave a comment