3. Quickstart: using the hop package — hop.interactive

A typical session starts with a trajectory (which should have been RMS-fitted to a reference structure). Any topology and trajectory file suitable for MDAnalysis can be used such as PSF+DCD, PDB+XTC or a single PDB. In the following Charmm/NAMD psf and dcd files are used as examples.

We will use the high-level wrapper functions in hop.interactive:

>>> import hop
>>> from hop.interactive import (make_density, analyze_density,
...                              make_hoppingtraj, build_hoppinggraph)

3.1. Hydration sites

Hydration sites are sites of water density higher than the bulk density but one special site is the bulk. The hydration sites and the bulk site are computed in two separate steps.

3.1.1. High density sites

First build the density of the water oxygens.

>>> density = make_density(psf,dcd,filename,delta=1.0)

The density is also saved as a pickled python object so that one can easily reload it. The density is also exported as a dx file for visualization (e.g. use hop.interactive.visualize_density(), which calls VMD).

From the density one creates the site map for a given threshold (by default this is a multiple of the water bulk density):

>>> density.map_sites(threshold=2.72)

Experiment with the threshold; hop.analysis.DensityAnalysis can help to systematically explore the parameter space, and it is also helpful to load the density into a visualization software such as VMD and interactively explore contour levels. Values between 1.65 and 3 have given decent results in the past but this is system-dependent.)

3.1.2. Bulk site

For a full analysis of hopping events one also needs to define a bulk site. This is currently accomplished by calculating a second bulk density (all water not within 3.5 Å of the protein) and manually inserting the bulk site into the site map for the first density.

>>> density_bulk = make_density(psf,dcd,'bulk',delta=1.0,
...         atomselection='name OH2',
...         soluteselection='protein and not name H*',
...         cutoff=3.5
...         )

Using VMD’s VolMap can be potentially be faster — try it if the default seems too slow to you:

>>> density_bulk = make_density(psf,dcd,'bulk',delta=1.0,
...         atomselection='name OH2 and not within 3.5 of (protein and name not hydrogen)',
...         backend='VMD',load_new=False)

The bulk density should be a big, well defined volume so we choose a fairly low threshold:

>>> density_bulk.map_sites(0.6)

Add the biggest bulk site:

>>> density.site_insert_bulk(density_bulk)
>>> density.save()
>>> del density_bulk

Note

Behind the scenes, the bulk is simply prepended to the list of all sites (hop.sitemap.Density.sites), found so far. By convention the site at position 1 in the list of all sites is treated specially in many parts of hop (it has the so-called sitelabel “1”, which is simply the position in the list of sites) and hence you might encounter unexpected behaviour later if you do not insert a bulk site.

Statistics about the sites can be produced with

>>> analyze_density(density,figname)

The results figures will be named <figname>.pdf.

3.1.3. Remapping for comparing site maps

This section is only relevant if you plan on comparing site maps. Then you must compare the density to your reference density now before proceeding. You will

  1. remap this density to be defined on the same grid as the reference density (for this to work, this density must have been generated from a trajectory that has been RMS-fitted to the same reference structure as; see hop.trajectory.rms_fit_trj() and hop.trajectory.fasta2select())

    >>> ref_density = hop.sitemap.Density(filename='my_reference_density')
    >>> remapped_density = hop.sitemap.remap_density(density,ref_density)
    
  2. find the equivalence sites in the two densities and add those sites to both densities:

    >>> remapped_density.find_equivalence_sites_with(ref_density,verbosity=3)
    >>> remapped_density.save(<filename>)
    >>> ref_density.save()
    

    (You must also recalculate the reference densities hopping trajectory (see below) because some sites may have been merged into ‘equivalence sites’. See docs for hop.sitemap.find_equivalence_sites_with() and hop.graph.CombinedGraph()).

    From now on, work with the remapped density: >>> density = remapped_density

3.2. Hopping trajectory

Next we translate the dcd into a ‘hopping trajectory’ (saved in dcd format) in which coordinates for a given water oxygen are replaced by the site it visits at each time step.

>>> hops = make_hoppingtraj(density,'hop_water+bulk')

All further analysis should use this hopping trajectory (from disk) as it is computationally much cheaper to read the trajectory than to re-translate the coordinate trajectory (which is done behind the scences if the hopping trajectory is not available).

3.3. Hopping graph

The final step is to map out the graph of transitions between sites (using the hopping trajectory):

>>> tn = build_hoppinggraph(hops,density)

tn.hopgraph holds this graph (tn.graph just contains all jumps including the interstitial and off-sites). The edges of hopgraph are the rate constants k_ji (in 1/ps) for hops i –> j. They are computed from an exponential fit to the site survival function S_ji(t) for particles waiting to hop from i to j.

The density is provided to attach data to the nodes of the hopgraph. It is required for visualization and analysis (although not strictly necessary for the hopgraph itself).

Further analysis uses tn.hopgraph:

>>> h = tn.hopgraph           # main result is the 'hopgraph'
>>> h.save('hopgraph')        # save the hopping graph (necessary for cg part)
>>> h.filter(exclude={'outliers':True, 'Nmin':2, 'unconnected':True})
>>> h.show_rates()            # show all calculated rate constants (filtered graph)
>>> h.plot_fits(xrange(301))  # plot rate constant fits for t=0ps to 300ps
>>> h.plot_fits()
>>> h.export('water')         # write dot file to visualize (filtered) graph

To compare the water network based on density with another hop graph (based on ref_density), construct the CombinedGraph:

>>> h_ref = hop.graph.HoppingGraph(filename=<filename>) --- basically repeat steps from
###                                                     --- ref_density only with different labels
>>> cg = hop.graph.CombinedGraph(g0=h,g1=h_ref)
>>> cg.plot(0,'cg_h',linewidths=(0.01,))
>>> cg.plot(1,'cg_h_ref',linewidths=(0.01,))

3.4. Other topics

The following topics are not fully documented but the individual functions and classes contain some hints on how to make them work for the purposes outlined below:

3.5. Functions

hop.interactive.analyze_density(density, figure='sitestats')

Site statistics based on the density alone.

Plots site volumes, average densities and occupancy, and writes it to the pdf file <figure>.pdf

hop.interactive.build_hoppinggraph(hoppingtrajectory, density)

Compute the graph of all site hops and calculate the rate constants.

tgraph = build_hoppinggraph(hops,density)

Arguments:
hops
hop.trajectory.HoppingTrajectory object
density
hop.sitemap.Density object
Returns:tgraph, a hop.graph.TransportNetwork object
hop.interactive.build_hoppinggraph_fromfiles(hoppingtrajectory_filename, density_filename)

Compute the TransportNetwork including HoppingGraph from files.

tn = build_hoppinggraph_fromfiles(‘hoptraj’,’water_density’)

Input: hoppingtrajectory_filename filename for HoppingTrajectory psf and dcd density_filename filename for pickled Density

Output: tn hop.graph.TransportNetwork object (qv)

hop.interactive.generate_densities(*args, **kwargs)

Analyze the trajectory and generate solvent and bulk density.

generate_densities(topol, traj, atomselection=’name OW’) –> densities

This function can take a long time because it has to read the whole trajectory. Progress is printed to the screen. It saves results to pickle files. These files are hop.sitemap.Density objects and can be used to instantiate such a density object.

Arguments:
filename

name of the solvent density with bulk site

bulkname

bulk density

density_unit

unit of measurement for densities and thresholds (Molar, nm^{-3}, Angstrom^{-3}, water, SPC, TIP3P, TIP4P)

solvent_threshold : exp(1) = 2.7182818284590451

hydration sites when density > this threshold

bulk_threshold : exp(-0.5) = 0.60653065971263342

bulk site are regions with density > this threshold (and water farther away from the protein heavy atoms than cutoff)

delta : 1.0

cubic grid size in Angstrom

cutoff

bulk-water is assumed to start at this distance from the soluteselection

soluteselection : “protein and not name H*”

how to select the solute (for bulk density)

Returns:

a dict containing hop.sitemap.Density instances for the the “solvent” and the “bulk” density; the “solvent” has the bulk site (largest site in “bulk”) inserted as site 1.

Note

The “solvent” density is going to be used throughout the rest of the protocol. Should you ever remap the sites (i.e. run map_sites() with a different threshold) then you must insert the bulk site again (because the bulk site is removed for technical reasons whenever the sites change); use the saved bulk site and the hop.sitemap.Density.site_insert_bulk() method.

See also

Keyword arguments are passed on to hop.density.DensityCreator where all possible keywords are documented; the site mapping is done with hop.sitemap.Density.map_sites().

hop.interactive.hopgraph_basic_analysis(h, density, filename)

Do some simple analysis tasks on the hopgraph.

hopgraph_basic_analysis(h, density, filename)

Arguments:
h

hopgraph, a hop.graph.HoppingGraph

density

density, a hop.sitemap.Density

filename

default filename for generated files; all files and new directories are written in the directory pointed to by the path component

hop.interactive.make_density(psf, dcd, filename, delta=1.0, atomselection='name OH2', **kwargs)

Build the density by histogramming all the water oxygens in a dcd.

density = make_density(psf,dcd,filename,delta=1.0)

The function builds the density object, writes it to disk, and also exports it as a dx file for visualization (use vizualize_density(density)).

Arguments:
*psf

topology

dcd

trajectory (should be RMS fitted to a reference frame)

filename

default filename for the density

delta

grid spacing Angstrom

kwargs:
padding

increase box dimensions for 3D histogramming by padding

soluteselection cutoff

for bulk density: setting both soluteselection=’protein and not name H*’ and cutoff=3.5 A selects ‘<atomsel> NOT WITHIN <cutoff> OF <solutesel>’

Returns:

density, hop.sitemap.Density object; the density is converted to a fraction of the density of bulk TIP3P water

hop.interactive.make_hoppingtraj(density, filename, **hopargs)

Create the hopping trajectory from a density with a site map.

hops = make_hoptraj(density,filename)
Arguments:
density

density object with a site map

filename

prefix for the hop trajectory files (psf and dcd)

hopargs

keyword args to add to HoppingTrajectory such as fixtrajectory = {‘delta’:10.22741474887299}

This function relies on the density’s metadata. In particular it uses density.metadata[‘psf’] and metadata[‘dcd’] to find its input data and metadata[‘atomselection’] to define the atoms to track.

hop.interactive.make_xstal_density(pdb, filename, **kwargs)

Generate a density from the crystalwaters in a PDB.

For arguments see make_density().

(These water are typically named HOH.)

See also

Water molecules are counted as point-like particles. One can also use hop.sitemap.BfactorDensityCreator to broaden water molecules according to their B-factor.

hop.interactive.visualize_density(density)

Visualize the trajectory with the density in VMD.

visualize_density(density)
Arguments:
density

hop.sitemap.Density object