3. Quickstart: using the hop package — hop.interactive
¶
A typical session starts with a trajectory (which should have been RMS-fitted to a reference structure). Any topology and trajectory file suitable for MDAnalysis can be used such as PSF+DCD, PDB+XTC or a single PDB. In the following Charmm/NAMD psf and dcd files are used as examples.
We will use the high-level wrapper functions in hop.interactive
:
>>> import hop
>>> from hop.interactive import (make_density, analyze_density,
... make_hoppingtraj, build_hoppinggraph)
3.1. Hydration sites¶
Hydration sites are sites of water density higher than the bulk density but one special site is the bulk. The hydration sites and the bulk site are computed in two separate steps.
3.1.1. High density sites¶
First build the density of the water oxygens.
>>> density = make_density(psf,dcd,filename,delta=1.0)
The density is also saved as a pickled python object so that one can
easily reload it. The density is also exported as a dx file for
visualization (e.g. use hop.interactive.visualize_density()
,
which calls VMD).
From the density one creates the site map for a given threshold (by default this is a multiple of the water bulk density):
>>> density.map_sites(threshold=2.72)
Experiment with the threshold; hop.analysis.DensityAnalysis
can help
to systematically explore the parameter space, and it is also helpful
to load the density into a visualization software such as VMD and
interactively explore contour levels. Values between 1.65 and 3 have
given decent results in the past but this is system-dependent.)
3.1.2. Bulk site¶
For a full analysis of hopping events one also needs to define a bulk site. This is currently accomplished by calculating a second bulk density (all water not within 3.5 Å of the protein) and manually inserting the bulk site into the site map for the first density.
>>> density_bulk = make_density(psf,dcd,'bulk',delta=1.0,
... atomselection='name OH2',
... soluteselection='protein and not name H*',
... cutoff=3.5
... )
Using VMD’s VolMap can be potentially be faster — try it if the default seems too slow to you:
>>> density_bulk = make_density(psf,dcd,'bulk',delta=1.0,
... atomselection='name OH2 and not within 3.5 of (protein and name not hydrogen)',
... backend='VMD',load_new=False)
The bulk density should be a big, well defined volume so we choose a fairly low threshold:
>>> density_bulk.map_sites(0.6)
Add the biggest bulk site:
>>> density.site_insert_bulk(density_bulk)
>>> density.save()
>>> del density_bulk
Note
Behind the scenes, the bulk is simply prepended to the list
of all sites (hop.sitemap.Density.sites
), found so far. By
convention the site at position 1 in the list of all sites is
treated specially in many parts of hop (it has the so-called
sitelabel “1”, which is simply the position in the list of sites)
and hence you might encounter unexpected behaviour later if you do
not insert a bulk site.
Statistics about the sites can be produced with
>>> analyze_density(density,figname)
The results figures will be named <figname>.pdf.
3.1.3. Remapping for comparing site maps¶
This section is only relevant if you plan on comparing site maps. Then you must compare the density to your reference density now before proceeding. You will
remap this density to be defined on the same grid as the reference density (for this to work, this density must have been generated from a trajectory that has been RMS-fitted to the same reference structure as; see hop.trajectory.rms_fit_trj() and hop.trajectory.fasta2select())
>>> ref_density = hop.sitemap.Density(filename='my_reference_density') >>> remapped_density = hop.sitemap.remap_density(density,ref_density)find the equivalence sites in the two densities and add those sites to both densities:
>>> remapped_density.find_equivalence_sites_with(ref_density,verbosity=3) >>> remapped_density.save(<filename>) >>> ref_density.save()(You must also recalculate the reference densities hopping trajectory (see below) because some sites may have been merged into ‘equivalence sites’. See docs for hop.sitemap.find_equivalence_sites_with() and hop.graph.CombinedGraph()).
From now on, work with the remapped density: >>> density = remapped_density
3.2. Hopping trajectory¶
Next we translate the dcd into a ‘hopping trajectory’ (saved in dcd format) in which coordinates for a given water oxygen are replaced by the site it visits at each time step.
>>> hops = make_hoppingtraj(density,'hop_water+bulk')
All further analysis should use this hopping trajectory (from disk) as it is computationally much cheaper to read the trajectory than to re-translate the coordinate trajectory (which is done behind the scences if the hopping trajectory is not available).
3.3. Hopping graph¶
The final step is to map out the graph of transitions between sites (using the hopping trajectory):
>>> tn = build_hoppinggraph(hops,density)
tn.hopgraph holds this graph (tn.graph just contains all jumps including the interstitial and off-sites). The edges of hopgraph are the rate constants k_ji (in 1/ps) for hops i –> j. They are computed from an exponential fit to the site survival function S_ji(t) for particles waiting to hop from i to j.
The density is provided to attach data to the nodes of the hopgraph. It is required for visualization and analysis (although not strictly necessary for the hopgraph itself).
Further analysis uses tn.hopgraph:
>>> h = tn.hopgraph # main result is the 'hopgraph'
>>> h.save('hopgraph') # save the hopping graph (necessary for cg part)
>>> h.filter(exclude={'outliers':True, 'Nmin':2, 'unconnected':True})
>>> h.show_rates() # show all calculated rate constants (filtered graph)
>>> h.plot_fits(xrange(301)) # plot rate constant fits for t=0ps to 300ps
>>> h.plot_fits()
>>> h.export('water') # write dot file to visualize (filtered) graph
To compare the water network based on density with another hop graph (based on ref_density), construct the CombinedGraph:
>>> h_ref = hop.graph.HoppingGraph(filename=<filename>) --- basically repeat steps from
### --- ref_density only with different labels
>>> cg = hop.graph.CombinedGraph(g0=h,g1=h_ref)
>>> cg.plot(0,'cg_h',linewidths=(0.01,))
>>> cg.plot(1,'cg_h_ref',linewidths=(0.01,))
3.4. Other topics¶
The following topics are not fully documented but the individual functions and classes contain some hints on how to make them work for the purposes outlined below:
- Remapping densities to a reference density (see
hop.sitemap.remap_density()
). - Comparing densities and finding equivalence sites (see
hop.sitemap.find_common_sites()
andhop.sitemap.Density.find_equivalence_sites_with()
). - Comparing hopgraphs across different simulations: requires equivalence sites in
both densities; then build the
hop.graph.CombinedGraph
.
3.5. Functions¶
-
hop.interactive.
analyze_density
(density, figure='sitestats')¶ Site statistics based on the density alone.
Plots site volumes, average densities and occupancy, and writes it to the pdf file <figure>.pdf
-
hop.interactive.
build_hoppinggraph
(hoppingtrajectory, density)¶ Compute the graph of all site hops and calculate the rate constants.
tgraph = build_hoppinggraph(hops,density)
Arguments: - hops
- hop.trajectory.HoppingTrajectory object
- density
- hop.sitemap.Density object
Returns: tgraph, a hop.graph.TransportNetwork
object
-
hop.interactive.
build_hoppinggraph_fromfiles
(hoppingtrajectory_filename, density_filename)¶ Compute the TransportNetwork including HoppingGraph from files.
tn = build_hoppinggraph_fromfiles(‘hoptraj’,’water_density’)
Input: hoppingtrajectory_filename filename for HoppingTrajectory psf and dcd density_filename filename for pickled Density
Output: tn hop.graph.TransportNetwork object (qv)
-
hop.interactive.
generate_densities
(*args, **kwargs)¶ Analyze the trajectory and generate solvent and bulk density.
generate_densities(topol, traj, atomselection=’name OW’) –> densities
This function can take a long time because it has to read the whole trajectory. Progress is printed to the screen. It saves results to pickle files. These files are
hop.sitemap.Density
objects and can be used to instantiate such a density object.Arguments: - filename
name of the solvent density with bulk site
- bulkname
bulk density
- density_unit
unit of measurement for densities and thresholds (Molar, nm^{-3}, Angstrom^{-3}, water, SPC, TIP3P, TIP4P)
- solvent_threshold : exp(1) = 2.7182818284590451
hydration sites when density > this threshold
- bulk_threshold : exp(-0.5) = 0.60653065971263342
bulk site are regions with density > this threshold (and water farther away from the protein heavy atoms than cutoff)
- delta : 1.0
cubic grid size in Angstrom
- cutoff
bulk-water is assumed to start at this distance from the soluteselection
- soluteselection : “protein and not name H*”
how to select the solute (for bulk density)
Returns: a dict containing
hop.sitemap.Density
instances for the the “solvent” and the “bulk” density; the “solvent” has the bulk site (largest site in “bulk”) inserted as site 1.Note
The “solvent” density is going to be used throughout the rest of the protocol. Should you ever remap the sites (i.e. run
map_sites()
with a different threshold) then you must insert the bulk site again (because the bulk site is removed for technical reasons whenever the sites change); use the saved bulk site and thehop.sitemap.Density.site_insert_bulk()
method.See also
Keyword arguments are passed on to
hop.density.DensityCreator
where all possible keywords are documented; the site mapping is done withhop.sitemap.Density.map_sites()
.
-
hop.interactive.
hopgraph_basic_analysis
(h, density, filename)¶ Do some simple analysis tasks on the hopgraph.
hopgraph_basic_analysis(h, density, filename)
Arguments: - h
hopgraph, a
hop.graph.HoppingGraph
- density
density, a
hop.sitemap.Density
- filename
default filename for generated files; all files and new directories are written in the directory pointed to by the path component
-
hop.interactive.
make_density
(psf, dcd, filename, delta=1.0, atomselection='name OH2', **kwargs)¶ Build the density by histogramming all the water oxygens in a dcd.
density = make_density(psf,dcd,filename,delta=1.0)
The function builds the density object, writes it to disk, and also exports it as a dx file for visualization (use vizualize_density(density)).
Arguments: - *psf
topology
- dcd
trajectory (should be RMS fitted to a reference frame)
- filename
default filename for the density
- delta
grid spacing Angstrom
- kwargs:
- padding
increase box dimensions for 3D histogramming by padding
soluteselection cutoff
for bulk density: setting both soluteselection=’protein and not name H*’ and cutoff=3.5 A selects ‘<atomsel> NOT WITHIN <cutoff> OF <solutesel>’
Returns: density,
hop.sitemap.Density
object; the density is converted to a fraction of the density of bulk TIP3P water
-
hop.interactive.
make_hoppingtraj
(density, filename, **hopargs)¶ Create the hopping trajectory from a density with a site map.
hops = make_hoptraj(density,filename)Arguments: - density
density object with a site map
- filename
prefix for the hop trajectory files (psf and dcd)
- hopargs
keyword args to add to
HoppingTrajectory
such as fixtrajectory = {‘delta’:10.22741474887299}
This function relies on the density’s metadata. In particular it uses density.metadata[‘psf’] and metadata[‘dcd’] to find its input data and metadata[‘atomselection’] to define the atoms to track.
-
hop.interactive.
make_xstal_density
(pdb, filename, **kwargs)¶ Generate a density from the crystalwaters in a PDB.
For arguments see
make_density()
.(These water are typically named HOH.)
See also
Water molecules are counted as point-like particles. One can also use
hop.sitemap.BfactorDensityCreator
to broaden water molecules according to their B-factor.
-
hop.interactive.
visualize_density
(density)¶ Visualize the trajectory with the density in VMD.
visualize_density(density)Arguments: - density
hop.sitemap.Density object