
Visualization and Data Management
(Calculations in the Real World)
Prof. Geoff Hutchison
Department of Chemistry
University of Pittsburgh
May 8, 2014
http://hutchison.chem.pitt.edu/
http://avogadro.sf.net/
Common Tasks
Molecule Building & Editing
Running Calculations
Visualization
Results / Analysis
Archiving
Collaboration
There’s an app for that!
(Or... there should be...)
Several Packages
Avogadro
http://avogadro.cc
Editing, visualization
cclib
http://cclib.sf.net
Data extraction
Open Babel
http://openbabel.org
Format interconversion, batch processing
Example Workflow
[Diagram: separate programs A, B, C, and D for download, editing, database, calculation, and visualization]
Desired Workflow with Avogadro
[Diagram: Avogadro as the hub for download & share, visualization & editing, and database access; interconversion (by Open Babel) links it to calculation methods A, B, C, and D]
What is Avogadro?
Avogadro Project
Free, open source molecular editor
Cross-platform: Win, Mac, Linux...
Fast, intuitive, flexible
Extensible: plugins & scripting
Over 480,000 downloads (112k in 2013)
Dozens of contributors
Translated into 20+ languages
http://avogadro.cc/
Problem Domains
Small molecules
Organic vs. Inorganic
Biomolecules
Peptides, DNA, RNA, Sugars...
Polymers
Crystals
Slabs, Slices, etc.
Nanotubes, Nanodots, Nanorods, NanoXYZ
YOUR RESEARCH HERE
Avogadro aims to be “the best builder” for chemistry simulation and visualization
Avogadro Architecture
[Diagram: extensions (tools such as Manipulate, engines such as Balls & Sticks, colors, force fields, Python scripting) and painters (OpenGL, POV-Ray), built on Elements, Open Babel, Qt, and Eigen]
Avogadro is an Extensible Library
~10,000 lines in GUI app
~30,000 lines in library
~70,000 lines of plugins
Custom extensions for your needs
XtalOpt: GA for crystal cell generation
Packmol: Prepare for MD runs
Custom coloring / rendering
Embed libavogadro in your code / Python
Quick Run-Down on Features: Rendering
Wireframe
Spheres
Balls & Sticks
Ribbon
Sticks
H-Bond
Quick Run-Down on Features: Tools
Draw
Navigate
Select
Auto-Rotate
Bond Manipulate
Auto-Opt
Manipulate
Measure
Align
Open Babel
Open Babel (Started 2001)
Free, open source chemical toolbox
Cross-platform: Win, Mac, Linux...
Both user-tools & C++ library
Interfaces in Python, Perl, Ruby, Java, C#, PHP, R, etc.
Supports chemistry, bioinformatics, solid-state...
http://openbabel.org/
Challenges:
A Plethora of File Formats
Currently supported input types
alc -- Alchemy file
prep -- Amber PREP file
bs -- Ball & Stick file
caccrt -- Cacao Cartesian file
ccc -- CCC file
c3d1 -- Chem3D Cartesian 1 file
c3d2 -- Chem3D Cartesian 2 file
cml -- Chemical Markup Language file
crk2d -- CRK2D: Chemical Resource Kit 2D file
crk3d -- CRK3D: Chemical Resource Kit 3D file
box -- Dock 3.5 Box file
dmol -- DMol3 Coordinates file
feat -- Feature file
gam, gamout -- GAMESS Output file
gpr -- Ghemical Project file
mm1gp -- Ghemical MM file
qm1gp -- Ghemical QM file
hin -- HyperChem HIN file
jout -- Jaguar Output file
bin -- OpenEye Binary file
mmd,mmod -- MacroModel file
out,dat -- MacroModel file
car -- MSI Biosym/Insight II CAR file
sd,sdf -- MDL Isis SDF file
mdl -- MDL Molfile file
mol -- MDL Molfile
mopcrt -- MOPAC Cartesian file
mopout -- MOPAC Output file
mmads -- MMADS file
mpqc -- MPQC file
bgf -- MSI BGF file
nwo -- NWChem Output file
ent,pdb -- PDB file
pqs -- PQS file
qcout -- Q-Chem Output file
ins,res -- ShelX file
smi -- SMILES file
mol2 -- Sybyl Mol2 file
unixyz -- UniChem XYZ file
vmol -- ViewMol file
xyz -- XYZ file
Currently supported output types
alc -- Alchemy file
bs -- Ball & Stick file
caccrt -- Cacao Cartesian file
cacint -- Cacao Internal file
cache -- CAChe MolStruct file
c3d1 -- Chem3D Cartesian 1 file
c3d2 -- Chem3D Cartesian 2 file
ct -- ChemDraw Connection Table file
cht -- Chemtool file
cml -- Chemical Markup Language file
crk2d -- CRK2D: Chemical Resource Kit 2D file
crk3d -- CRK3D: Chemical Resource Kit 3D file
cssr -- CSD CSSR file
box -- Dock 3.5 Box file
dmol -- DMol3 Coordinates file
feat -- Feature file
fh -- Fenske-Hall Z-Matrix file
gamin,inp -- GAMESS Input file
gcart -- Gaussian Cartesian file
gau -- Gaussian Input file
gpr -- Ghemical Project file
gr96a -- GROMOS96 (A) file
gr96n -- GROMOS96 (nm) file
hin -- HyperChem HIN file
jin -- Jaguar Input file
bin -- OpenEye Binary file
mmod,dat,mmd -- MacroModel file
sd,sdf -- MDL Isis SDF file
mdl,mol -- MDL Molfile
mopcrt -- MOPAC Cartesian file
mmads -- MMADS file
bgf -- MSI BGF file
csr -- MSI Quanta CSR file
nw -- NWChem Input file
ent,pdb -- PDB file
pov -- POV-Ray Output file
pqs -- PQS file
report -- Report file
qcin -- Q-Chem Input file
fix,smi -- SMILES file
mol2 -- Sybyl Mol2 file
At last count... Open Babel supports 100+ formats, with more requested by users
PLUS:
Multiple software versions
Non-standard implementations!
Challenges:
Many Representations of Chemical Data
Molecular Mechanics:
Atom & bond types, no orbitals
Quantum Mechanics:
Atoms (no typing), no “bonds”
Crystallography:
Fractional coordinates
Solid State Codes:
Unit cells / translation
Daylight SMILES:
c1ccccc1
Connectivity only, no coordinates!
PLUS:
Explicit or implicit hydrogens?
Different atom typing rules!
2D vs. 3D
Proteins and biomolecules
…
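The “explicit or implicit hydrogens?” question can be made concrete with a small sketch. It assumes the organic-subset valence rule used by SMILES (an unbracketed atom takes the smallest normal valence that fits its bonds). The helper name `implicit_hydrogens` is hypothetical, and this is an illustration rather than Open Babel's own code.

```python
# Normal valences for the SMILES "organic subset" (OpenSMILES convention).
NORMAL_VALENCES = {
    "B": (3,), "C": (4,), "N": (3, 5), "O": (2,),
    "P": (3, 5), "S": (2, 4, 6),
    "F": (1,), "Cl": (1,), "Br": (1,), "I": (1,),
}


def implicit_hydrogens(element, bond_order_sum):
    """Return the implied hydrogen count for an unbracketed SMILES atom."""
    for valence in NORMAL_VALENCES[element]:
        if valence >= bond_order_sum:
            return valence - bond_order_sum
    return 0  # bonds already exceed every normal valence


# Ethanol, SMILES "CCO": heavy-atom bond order sums are 1, 2, 1.
ethanol = [("C", 1), ("C", 2), ("O", 1)]
hydrogens = [implicit_hydrogens(el, bonds) for el, bonds in ethanol]
print(hydrogens)  # [3, 2, 1]
```

A format with explicit hydrogens stores those six atoms; a SMILES string leaves them to be inferred by a rule like this one.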
Lazy Perception in Action...
Gaussian 98/03 Output → OBMol → Sybyl Mol2
Connectivity assignment
Bond perception needed: double bonds, functional groups, aromaticity
Atom typing & partial charges assigned
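A rough sketch of what “lazy” means here: the molecule keeps only what the file supplied (elements and coordinates, as in a quantum chemistry output) and works out connectivity the first time something asks for it. The class name `LazyMolecule`, the radii table, and the 0.45 Å tolerance are assumptions for illustration, not Open Babel's implementation.

```python
import math
from functools import cached_property

# Approximate covalent radii in angstroms (illustrative values).
COVALENT_RADII = {"H": 0.31, "C": 0.76, "N": 0.71, "O": 0.66}
TOLERANCE = 0.45  # assumed slack added to the sum of radii


class LazyMolecule:
    """Stores what the file provided; perceives bonds only when asked."""

    def __init__(self, atoms):
        # atoms: list of (element, (x, y, z)) tuples as read from the file
        self.atoms = atoms
        self.perception_runs = 0

    @cached_property
    def bonds(self):
        """Distance-based connectivity, computed on first access only."""
        self.perception_runs += 1
        found = []
        for i, (el_i, xyz_i) in enumerate(self.atoms):
            for j in range(i + 1, len(self.atoms)):
                el_j, xyz_j = self.atoms[j]
                cutoff = COVALENT_RADII[el_i] + COVALENT_RADII[el_j] + TOLERANCE
                if math.dist(xyz_i, xyz_j) <= cutoff:
                    found.append((i, j))
        return found


# Water, coordinates in angstroms.
water = LazyMolecule([
    ("O", (0.000, 0.000, 0.117)),
    ("H", (0.000, 0.757, -0.469)),
    ("H", (0.000, -0.757, -0.469)),
])
print(water.perception_runs)  # 0: nothing perceived yet
print(water.bonds)            # [(0, 1), (0, 2)]
print(water.perception_runs)  # 1
```

Writing an XYZ file from this object would never trigger perception; writing a format that stores bonds would.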
Solving the Chemical Representation “Problem”
[Diagram: file formats (XYZ, MDL Molfile, Sybyl Mol2, CML, PDB, etc.) converge on Open Babel, which supplies SMARTS, atom typing, bond typing, etc.]
The whole is greater than the sum of its parts:
No one person handles all file formats
Key goal reflected in “lazy evaluation”
Leave no data behind, but “perceive” as little as possible: conversion should not create data!
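One way to picture the hub idea is a registry in which each format contributes only a reader and/or a writer for a shared in-memory form, so nobody has to write every pairwise converter. The sketch below is a toy under stated assumptions: the dictionaries `READERS` and `WRITERS`, the molecule dict layout, and the `formula` output format are all hypothetical and unrelated to Open Babel's plugin API.

```python
from collections import Counter

READERS = {}  # format name -> function(text) -> molecule dict
WRITERS = {}  # format name -> function(molecule dict) -> text


def read_xyz(text):
    """Parse XYZ text: atom count, title line, then 'symbol x y z' rows."""
    lines = text.strip("\n").splitlines()
    count = int(lines[0])
    atoms = []
    for line in lines[2:2 + count]:
        symbol, x, y, z = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))
    return {"title": lines[1].strip(), "atoms": atoms}


def write_xyz(mol):
    """Write only what the molecule holds: symbols and coordinates."""
    rows = [f"{s} {x:.5f} {y:.5f} {z:.5f}" for s, x, y, z in mol["atoms"]]
    return "\n".join([str(len(mol["atoms"])), mol["title"], *rows]) + "\n"


def write_formula(mol):
    """Hypothetical summary format: element counts in alphabetical order."""
    counts = Counter(symbol for symbol, *_ in mol["atoms"])
    return "".join(f"{el}{n if n > 1 else ''}" for el, n in sorted(counts.items()))


READERS["xyz"] = read_xyz
WRITERS["xyz"] = write_xyz
WRITERS["formula"] = write_formula


def convert(text, in_format, out_format):
    """Route every conversion through the shared in-memory form."""
    return WRITERS[out_format](READERS[in_format](text))


WATER_XYZ = """3
water
O 0.000 0.000 0.117
H 0.000 0.757 -0.469
H 0.000 -0.757 -0.469
"""
print(convert(WATER_XYZ, "xyz", "formula"))  # H2O
```

Neither writer invents bonds or charges that the XYZ input never contained, which is the “conversion should not create data” rule in miniature.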
Solving the Chemical Representation “Problem”
[Diagram: Open Babel sits beneath the editor / viewer (with OpenGL graphics), analysis, and database applications]
Code reuse through open source code:
Focus on problems beyond the basics
New science, not new software development
Rapid development
Reduce non-standard file formats & bugs
Code-Reuse Example: obgrep
Match Molecular Patterns
[Diagram: database file(s) read through Open Babel]
Total 216 lines of C++ code, including blank lines & comments!
Contributed code, not originally part of the Open Babel library
Matches SMARTS molecular patterns in database file(s)
Import/Export handled by Open Babel
“Database” can be any file format, any computer, any drive
Example Workflows
Custom Monte Carlo program (385 lines)
Read Gaussian output, run calculations, write XYZ
Batch Conversion
MM optimization → DFT → INDO excitations
Crystal Structure
Fractional coordinates, convert & add hydrogens
Conversion of unit cell parameters to vectors
Editing & Visualization
Editor → DFT → View orbitals, vibrations, etc.
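The crystal-structure step (cell parameters to vectors, fractional to Cartesian) can be sketched with the standard textbook construction that places a along x and b in the xy-plane. The function names are hypothetical, and the orientation convention is an assumption; a given program may orient the cell differently.

```python
import math


def cell_vectors(a, b, c, alpha, beta, gamma):
    """Convert cell lengths (angstroms) and angles (degrees) to row vectors."""
    cos_a, cos_b, cos_g = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    sin_g = math.sin(math.radians(gamma))
    cy = (cos_a - cos_b * cos_g) / sin_g
    cz = math.sqrt(1.0 - cos_b * cos_b - cy * cy)
    return [
        (a, 0.0, 0.0),                 # a along x
        (b * cos_g, b * sin_g, 0.0),   # b in the xy-plane
        (c * cos_b, c * cy, c * cz),   # c completes the cell
    ]


def fractional_to_cartesian(frac, vectors):
    """Cartesian position = sum over i of frac[i] * cell vector i."""
    return tuple(
        sum(frac[i] * vectors[i][k] for i in range(3)) for k in range(3)
    )


# A hexagonal cell: a = b = 3 A, c = 5 A, gamma = 120 degrees.
hexagonal = cell_vectors(3.0, 3.0, 5.0, 90.0, 90.0, 120.0)
for vector in hexagonal:
    print(vector)
print(fractional_to_cartesian((1 / 3, 2 / 3, 0.5), hexagonal))
```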
Examples
Aligning molecules or fragments
# align molecules in dataset.xxx to pattern.www based on the pattern SMARTS
obabel pattern.www dataset.xxx -O outset.yyy -s SMARTS --align --append rmsd
# align the two conformers closest to the first conformer in dataset.xxx
obabel dataset.xxx -O outset.yyy --align --smallest 2 rmsd
Generate 3D coordinates and perform conformer searching
obabel ligand.babel.smi -O ligand.babel.sdf --gen3d --conformer --nconf 20 --weighted
Get a force field energy or minimize
obabel infile.xxx -otxt --energy --ff MMFF94 --append "Energy"
obabel infile.xxx -O outfile.yyy --minimize --ff UFF --steps 1500 --sd
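For batch processing, the minimize command above can be generated per file from a script. This is a minimal sketch that only builds argument lists using the flags shown on this slide; the function names and the `_min` output naming are assumptions. Running the commands (for example with `subprocess.run`) requires Open Babel to be installed and is left out here.

```python
import shlex
from pathlib import Path


def minimize_command(infile, outfile, forcefield="UFF", steps=1500):
    """Build the obabel minimize command from the slide as an argument list."""
    return [
        "obabel", str(infile), "-O", str(outfile),
        "--minimize", "--ff", forcefield, "--steps", str(steps), "--sd",
    ]


def batch_minimize_commands(infiles, out_suffix=".sdf"):
    """One command per input file; output is named <stem>_min<out_suffix>."""
    commands = []
    for infile in infiles:
        path = Path(infile)
        outfile = path.with_name(path.stem + "_min" + out_suffix)
        commands.append(minimize_command(path, outfile))
    return commands


for command in batch_minimize_commands(["a.xyz", "b.mol2"]):
    print(shlex.join(command))
```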
What is cclib?
cclib Project
Free, open source package (Python)
Parses & interprets comp. chem. results
Standardized interface for data
Algorithms beyond particular packages
ADF, Firefly, GAMESS, Gaussian, Jaguar,
NWChem, Molpro, ORCA, Psi
Charges, energies, basis sets, etc.
http://cclib.github.io/
Why cclib?
Rapidly grab data from output files
(e.g., across an entire directory at once!)
# print final SCF energy for a file
from cclib.io import ccopen  # older cclib releases: from cclib.parser import ccopen
logfile = ccopen(filename)
molecule = logfile.parse()
print(filename, molecule.scfenergies[-1])
# print HOMO and LUMO energies
if len(molecule.homos) == 1:  # closed shell (only alpha spins)
    homo = molecule.homos[0]
    moenergies = molecule.moenergies[0]
    print(filename, moenergies[homo], " | ", moenergies[homo + 1])
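The “entire directory at once” idea might look like the sketch below. It assumes output files end in `.log` and that cclib provides `ccopen` in `cclib.io`; the helper names are hypothetical. The demo at the bottom uses a stand-in parser so the sketch runs even without cclib installed.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def parse_with_cclib(path):
    """Parse one output file with cclib; None if the file is not recognized."""
    from cclib.io import ccopen  # imported lazily; needs cclib installed
    logfile = ccopen(str(path))
    return logfile.parse() if logfile is not None else None


def final_scf_energies(directory, parse=parse_with_cclib, pattern="*.log"):
    """Map each matching file name to its final SCF energy."""
    results = {}
    for path in sorted(Path(directory).glob(pattern)):
        data = parse(path)
        if data is not None and hasattr(data, "scfenergies"):
            results[path.name] = data.scfenergies[-1]
    return results


def fake_parse(path):
    """Stand-in for cclib: empty files are 'unrecognized'."""
    if path.stat().st_size == 0:
        return None
    return SimpleNamespace(scfenergies=[-75.9, -76.4])


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "water.log").write_text("stand-in output")
    Path(tmp, "empty.log").write_text("")
    Path(tmp, "notes.txt").write_text("ignored")
    demo = final_scf_energies(tmp, parse=fake_parse)
print(demo)  # {'water.log': -76.4}
```

With cclib installed, `final_scf_energies("some/directory")` would use the real parser by default.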
cclib: Current Package Support
Gaussian
GAMESS-US, GAMESS-UK, Firefly
Jaguar
NWChem
Molpro
ORCA
Psi4
cclib: Parsed Data
http://cclib.github.io/data.html
Atoms: Coordinates, Partial Charges, Spin Densities
Energies (SCF, MP, CC corrections)
Electronic Excitations (energies, osc. strengths, …)
Molecular orbitals (eigenvalues, coefficients, etc.)
Gradients, Hessians
Vibrations (frequencies, IR intensities, etc.)
Potential Energy Scan (energies, coordinates)
Fragments (fragment MOs)
cclib: Algorithms
http://cclib.github.io/methods.html
Population Analysis (C squared, Mulliken, Löwdin, Overlap)
Density Matrix calculation
Electron Density
Charge Decomposition Analysis
Mayer Bond Orders
Fragment Analysis (with custom fragments)
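To make two of the population analyses concrete, here is a sketch of the textbook definitions for a single molecular orbital: C-squared weights each basis function by its squared coefficient, while Mulliken also splits the overlap terms. This is plain Python for illustration, not cclib's implementation, and the overlap value 0.66 is an arbitrary assumption.

```python
import math


def c_squared_population(coefficients):
    """Share of each basis function: c_a^2 / sum_b c_b^2."""
    total = sum(c * c for c in coefficients)
    return [c * c / total for c in coefficients]


def mulliken_population(coefficients, overlap):
    """Gross share of each basis function: c_a * sum_b S_ab * c_b."""
    n = len(coefficients)
    return [
        coefficients[a] * sum(overlap[a][b] * coefficients[b] for b in range(n))
        for a in range(n)
    ]


# H2-like bonding orbital in a two-function basis with overlap s.
s = 0.66
overlap = [[1.0, s], [s, 1.0]]
norm = 1.0 / math.sqrt(2.0 * (1.0 + s))  # normalizes c^T S c to 1
bonding = [norm, norm]

cspa = c_squared_population(bonding)
mulliken = mulliken_population(bonding, overlap)
print(cspa)      # equal shares by symmetry
print(mulliken)  # equal shares by symmetry, summing to 1
```

For this symmetric orbital both schemes agree; they differ once the coefficients on the two centers are unequal.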
Avogadro Demos
Avogadro Demo: Small Molecule Building
Draw Tool: Click & Drag
Click on bond: Single, Double, Triple
Change Element, Click on atom: Alchemy
Auto-Adjust Hydrogens
Manually Add H after drawing
Keyboard Shortcuts:
Type element symbol to change
Type 1, 2, 3, 4 to change bond order
Control + 1, 2, 3 to change tools
Avogadro Demo: Larger Molecules
Copy/paste & Undo
Insert Fragments
Select H atom to “grow” fragment (v 1.1)
Insert SMILES
Useful for polymers
“Sculpting” using Auto-Optimization
Adjusting angles, bond lengths, dihedrals
Bond-Centric Manipulate Tool
View Properties ...
Avogadro Demo: Peptides
Custom Sequences
Helix, Sheets, Custom Conformations
Multiple Chains
Visualization methods
Avogadro Demo: Building Crystals
New Interface in v1.1
Still in development: feedback welcome
Build slabs and nanoparticles (soon)
Edit cell parameters
Build supercells (simple slabs)
Edit Fractional Coordinates (v1.1)
Detect / Set Space Group (v1.1)
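Building a supercell from fractional coordinates amounts to copying the cell contents along each axis and rescaling. The sketch below shows that bookkeeping only; the function name is hypothetical, it is not Avogadro's code, and the cell lengths must be multiplied by the repeat counts separately.

```python
from itertools import product


def build_supercell(frac_atoms, repeats):
    """Replicate atoms over na x nb x nc cells, in supercell fractional units."""
    na, nb, nc = repeats
    supercell = []
    for i, j, k in product(range(na), range(nb), range(nc)):
        for element, (fa, fb, fc) in frac_atoms:
            position = ((fa + i) / na, (fb + j) / nb, (fc + k) / nc)
            supercell.append((element, position))
    return supercell


# CsCl-type cell: one atom at the corner, one at the body center.
cscl = [("Cs", (0.0, 0.0, 0.0)), ("Cl", (0.5, 0.5, 0.5))]
slab = build_supercell(cscl, (2, 2, 1))
print(len(slab))  # 8
print(slab[0])    # ('Cs', (0.0, 0.0, 0.0))
print(slab[1])    # ('Cl', (0.25, 0.25, 0.5))
```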
Avogadro Demo: Visualization Modes
Standard Representations
Balls & Sticks, Labels, etc.
Forces
Hydrogen-Bonding
Color Modes
Avogadro Demo: Force Fields / Conformers
Auto-Optimize Tool
Extension / Menu Item
Setting Constraints
Fixed / Frozen Atoms
Ignored Atoms
General Constraints (bonds, angles, …)
Finding & Generating Conformers
Avogadro Demo: More Extensions
Spectra Viewer (IR, UV/Vis, NMR)
Import experimental data
Vibrations & Orbitals
Molecular Surfaces
External Packages
Gaussian, GAMESS, Q-Chem…
Abinit
Packmol, XtalOpt (3rd party)