Basic trajectory post-processing with mdtraj

The first step in post-processing MD data is usually to concatenate multiple trajectory files and center them on a useful molecule, such as the protein. This is easy enough to do with mdtraj, but the code isn't that memorable and the mdtraj examples don't show the steps put together.

The following script loads a series of trajectory files (.dcd here, with a .psf as the topology; swap in other formats as required), then centers all frames on the protein. After that come some optional steps: removing waters, ions, and lipid (saves space) and aligning on the protein by minimizing the RMSD.

Usage: python join.py 20 (for 20 numbered dcd files, 3PTB_1_traj.dcd through 3PTB_20_traj.dcd).

      
import mdtraj as md
import sys

n = int(sys.argv[1])

trajs = []

print('Loading trajectory files')
for i in range(1, n + 1):
    print(f'loading {i}')
    traj = md.load_dcd(f'3PTB_{i}_traj.dcd', top='../system_setup/output.psf')
    # Recenter and re-wrap molecules across the periodic box in each frame.
    traj.image_molecules(inplace=True)
    trajs.append(traj)

print('Loaded, concatenating')
mtraj = md.join(trajs)

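## Optional: remove waters, ions, and lipid.
## Adjust the residue names to match your topology.
## Select on the joined trajectory's own topology, one resname per clause.
keep = mtraj.top.select('not (resname HOH or resname POPC or resname CL or resname NA)')
mtraj = mtraj.atom_slice(keep)
## Save the first frame as a structure file that matches the sliced trajectory.
mtraj[0].save('protein.pdb')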

print('Concatenated. Now aligning on protein')
prot = mtraj.top.select('protein')
mtraj.superpose(mtraj[0], atom_indices=prot)
print('Done, saving')


mtraj.save('joined.dcd')
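
Loading many large files takes a while, so it can be worth checking that every numbered file exists before the loop starts. The following is a minimal sketch using only the standard library. The helper names numbered_trajectory_paths and missing_paths are hypothetical (they are not part of mdtraj), and the default filename pattern is simply the one used in the script above.

```python
from pathlib import Path


def numbered_trajectory_paths(n, pattern="3PTB_{i}_traj.dcd", directory="."):
    """Return the n expected trajectory paths, numbered 1..n, in load order.

    The pattern and directory defaults are assumptions taken from the
    script above; change them to match your own file naming.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [Path(directory) / pattern.format(i=i) for i in range(1, n + 1)]


def missing_paths(paths):
    """Return the subset of paths that do not exist as regular files."""
    return [p for p in paths if not p.is_file()]
```

One way to use it: call numbered_trajectory_paths(n) right after parsing sys.argv, exit with the list from missing_paths if it is non-empty, and otherwise pass each path (as a string) to md.load_dcd in the loop.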