Fall 2004 Team Meeting Mike Langston’s Research - UTK-EECS

PROGRESS REVIEW
Mike Langston’s Research Team
Department of Computer Science
University of Tennessee
with collaborative efforts at
Oak Ridge National Laboratory
30 November 2004
Team Members in Attendance
Nicole Baldwin, John Eblen, Mike Langston,
Jon Scharff, Josh Steadmon, Henry Suters,
Chris Symons, Yun Zhang
Team Members Absent
Daniel Lucio, Ian Watkins, Xinxia Peng
Mike Langston’s Progress Report
Fall, 2004
• Team Changes
– Graduating: Nicole Baldwin, Henry Suters
• Professional Leave Update
– ORNL, Collaborations, Team Opportunities
• Recent Talks
– IWPEC (Bergen, Norway), TMGC (Fall Creek Falls)
• Recent Visits
– LBNL (San Francisco), UCSD (San Diego), SC2004 (Pittsburgh)
• Upcoming Conference Trips
– AICCSA (Cairo), RTST (Lebanon)
• Recent Panel Duties
– NSF
• Recent Program Committee Service
– AICCSA, PDCS, IWPEC, HiCOMB
Nicole Baldwin
General Conclusions
• Bron and Kerbosch (modified version)
– “Best worst-case” algorithm
– Best experimental times (unless very sparse)
• Kose et al.
– Too much core memory, or
– Too many I/O operations
– Useful on Altix??
• Preprocessing useful for mid-range density
• Paracliques appear to correspond well with
QTL data
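As an illustration of the Bron and Kerbosch approach named above, here is a minimal Python sketch of maximal clique enumeration with pivoting. The slides do not say which modification the team used, so the pivot rule here is an assumption, and the adjacency-dictionary input format and the function name `bron_kerbosch` are hypothetical.

```python
def bron_kerbosch(adj):
    """Yield every maximal clique of an undirected graph.

    adj maps each vertex to the set of its neighbours (assumed input format).
    """
    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it, x: already-tried vertices
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot choice (an assumption): the vertex of P or X with most neighbours in P.
        pivot = max(p | x, key=lambda u: len(adj[u] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Triangle 0-1-2 with a pendant edge 2-3.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
cliques = set(bron_kerbosch(example))
```

Only vertices that are not neighbours of the pivot are branched on, which is what prunes the search on denser graphs.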
Work at ORNL
• Infrastructure proposal
– Learning experience
– Help extract what participants REALLY
need
• Sandia-ORNL GTL proposal
– Postulate queries for integrated database
– Research regulatory/metabolic pathway
reconstruction.
– Goal: build an in-house computational
biology center
John Eblen
ORNL Research
• Installed direct maximum clique codes
• Installing maximal clique codes
• Installing parallel versions of above on Altix
supercomputer
• Currently processing Dr. Gerling’s mouse
genome data set
• General goal, though, is a pipeline with tools,
procedures, etc. for processing any large data
set
Other Research
• Clique common neighbor algorithms
– Current algorithm suffers from data explosion
– Many different possible approaches
• Random graph walks
– Idea: collect statistics to help guess clique
locations
– Current algorithm seems to give little or no gain
over simply using vertex degrees
• Chordal graphs
– Clique solvable in polynomial time if graph is
chordal
– Correlation graphs should be very close to chordal
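The chordal-graph observation above can be sketched with maximum cardinality search, a standard technique that is swapped in here as an assumption rather than taken from the slides. On a chordal graph, each vertex together with its already-visited neighbours forms a clique, so the largest such set gives the maximum clique size. The function name and input format are hypothetical, and the result is only meaningful if the input really is chordal.

```python
def max_clique_size_chordal(adj):
    """Return the maximum clique size of a chordal graph.

    adj maps each vertex to the set of its neighbours (assumed input format).
    The answer is only guaranteed correct when the graph is chordal.
    """
    visited = set()
    weight = {v: 0 for v in adj}  # number of already-visited neighbours
    best = 0
    while len(visited) < len(adj):
        # Maximum cardinality search: take the unvisited vertex
        # with the most visited neighbours.
        v = max((u for u in adj if u not in visited), key=weight.get)
        # In a chordal graph, v plus its visited neighbours is a clique.
        best = max(best, weight[v] + 1)
        visited.add(v)
        for w in adj[v]:
            if w not in visited:
                weight[w] += 1
    return best


# Two triangles sharing the edge 1-2 (a chordal graph).
two_triangles = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2}}
size = max_clique_size_chordal(two_triangles)
```

This simple version rescans the vertices at every step, so it runs in quadratic time; that is still polynomial, which is the point made on the slide.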
Updates from Xinxia
• Sept.: MS in CS defense
• Oct.: poster presentation at the 7th Annual
Conference on Computational Genomics
(http://www.tigr.org/conf/cg/)
• Nov.: oral presentation at CAMDA 2004
(http://www.camda.duke.edu/camda04)
Jon Scharff
Possible Future Directions
• Look into possibility of adding threads to
VC branching (with Chris?)
• Look at other possible maximal clique
algorithms (see Faisal)
• Look into parallelizing maximal clique
codes (with Yun?)
Group Webpage
We have a design, now we need content…
Henry Suters
• Defended thesis: Crown Reductions and
Decompositions: Theoretical Results and Practical
Methods
• Beginning part time position at ORNL
• Beginning a project on using the structure of MCS
graphs to aid in kernelization
• Triplets!
Chris Symons
Cluster Editing
• FPT
• Solvable in O(1.92^k + n^3)
• Implemented O(3^k + n^3) version
• Easily parallelizable
• Produces a disjoint
union of cliques
k=3: 2 insertions; 1 deletion
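A minimal sketch of the kind of 3^k search tree mentioned above, assuming the textbook branching on a conflict triple: u adjacent to both v and w, while v and w are not adjacent. Each branch spends one edit (delete uv, delete uw, or insert vw). This is not the team's implementation; the function names and the edge-list input format are hypothetical.

```python
from itertools import combinations


def find_conflict(vertices, edges):
    """Return (u, v, w) with edges uv and uw but no edge vw, or None."""
    for u in vertices:
        nbrs = [v for v in vertices if frozenset((u, v)) in edges]
        for v, w in combinations(nbrs, 2):
            if frozenset((v, w)) not in edges:
                return u, v, w
    return None


def cluster_editing(vertices, edges, k):
    """Return a set of at most k edited vertex pairs that turns the graph
    into a disjoint union of cliques, or None if k edits are not enough."""
    edges = frozenset(frozenset(e) for e in edges)
    conflict = find_conflict(vertices, edges)
    if conflict is None:
        return set()  # already a disjoint union of cliques
    if k == 0:
        return None
    u, v, w = conflict
    # Three branches: delete uv, delete uw, or insert vw.
    for edit in (frozenset((u, v)), frozenset((u, w)), frozenset((v, w))):
        # Symmetric difference toggles the pair: present edges are deleted,
        # absent ones are inserted.
        result = cluster_editing(vertices, edges ^ {edit}, k - 1)
        if result is not None:
            return result ^ {edit}  # net set of edited pairs
    return None


# Path 0-1-2-3: one deletion (edge 1-2) leaves two disjoint edges.
solution = cluster_editing(range(4), [(0, 1), (1, 2), (2, 3)], 1)
```

The three branches are independent of one another, which is consistent with the "easily parallelizable" remark on the slide.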
Cluster Editing
• The following 2 rules
produce a k^2 kernel
• Rule 1:
– If 2 vertices have more than
k common neighbors, they
must have an edge between
them
– If 2 vertices have more than
k non-common neighbors,
they must not have an edge
between them
– If they have both more than k common
and more than k non-common
neighbors, then this is a “no”
instance
• Rule 2: Delete connected
components that are
cliques.
[Figure: Rule 1: If k=1, edge (u,v) must be inserted.]
[Figure: Rule 2: component 2 (a clique) can be removed; component 1 remains.]
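The two rules can be sketched in Python as follows. This is a minimal illustration under stated assumptions: "non-common neighbors" of u and v is taken to mean vertices adjacent to exactly one of them (excluding u and v themselves), and the function names and adjacency-dictionary format are hypothetical.

```python
from itertools import combinations


def classify_pairs(adj, k):
    """Rule 1: return (permanent, forbidden) vertex pairs, or None for a "no" instance."""
    permanent, forbidden = set(), set()
    for u, v in combinations(adj, 2):
        common = len(adj[u] & adj[v])
        # Assumed definition: neighbours of exactly one of u, v, excluding u and v.
        non_common = len((adj[u] ^ adj[v]) - {u, v})
        if common > k and non_common > k:
            return None
        if common > k:
            permanent.add(frozenset((u, v)))   # edge must be present
        elif non_common > k:
            forbidden.add(frozenset((u, v)))   # edge must be absent
    return permanent, forbidden


def drop_clique_components(adj):
    """Rule 2: return a copy of adj without connected components that are cliques."""
    seen, keep = set(), {}
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:  # depth-first search collects one component
            x = stack.pop()
            for y in adj[x]:
                if y not in comp:
                    comp.add(y)
                    stack.append(y)
        seen |= comp
        # A component is a clique iff every vertex is adjacent to all the others.
        if any(len(adj[x]) != len(comp) - 1 for x in comp):
            for x in comp:
                keep[x] = set(adj[x])
    return keep


# u and v share two neighbours a and b, so with k=1 the edge (u, v) is forced.
g = {"u": {"a", "b"}, "v": {"a", "b"}, "a": {"u", "v", "b"}, "b": {"u", "v", "a"}}
permanent, forbidden = classify_pairs(g, 1)

# A triangle (a clique component) next to a path 3-4-5.
h = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3, 5}, 5: {4}}
reduced = drop_clique_components(h)
```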
Yun Zhang
Work at ORNL
• Maximum Common Subgraph problem (with
Chris)
– Transform to Maximum Clique (MC) problem by
constructing an association graph
• MC in association graph ⇔ MCS of two graphs
• Solve MC problem using VC codes
– Can handle directed, labeled graphs
– Can’t handle very large graphs
• Due to the huge size of association graph (|V1|×|V2|)
• Preprocess association graph (with Henry)
– Applications: biology, chemistry, pattern recognition
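A minimal sketch of the association-graph construction described above, for small undirected, unlabeled graphs, assuming the induced variant of Maximum Common Subgraph. Each node pairs a vertex of the first graph with a vertex of the second, and two nodes are joined when the pairing preserves both adjacency and non-adjacency. The slides solve the resulting MC problem with VC codes; a simple exhaustive clique search is swapped in here to keep the example self-contained. All function names are hypothetical.

```python
from itertools import combinations, product


def association_graph(adj1, adj2):
    """Build the association graph with |V1| x |V2| nodes."""
    nodes = list(product(adj1, adj2))
    assoc = {n: set() for n in nodes}
    for (a1, b1), (a2, b2) in combinations(nodes, 2):
        # Compatible pairings: distinct vertices on both sides and the
        # same adjacency status in both graphs.
        if a1 != a2 and b1 != b2 and ((a2 in adj1[a1]) == (b2 in adj2[b1])):
            assoc[(a1, b1)].add((a2, b2))
            assoc[(a2, b2)].add((a1, b1))
    return assoc


def max_clique(adj):
    """Exhaustive maximum clique search (a stand-in for the VC-based solver)."""
    best = []

    def grow(clique, candidates):
        nonlocal best
        if len(clique) > len(best):
            best = clique
        if len(clique) + len(candidates) <= len(best):
            return  # cannot beat the best clique found so far
        for v in list(candidates):
            candidates = candidates - {v}
            grow(clique + [v], candidates & adj[v])

    grow([], set(adj))
    return best


triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
path = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
assoc = association_graph(triangle, path)
mapping = max_clique(assoc)  # list of (vertex in triangle, vertex in path) pairs
```

Even for these three-vertex inputs the association graph has nine nodes, which shows how quickly the |V1|×|V2| size noted above grows.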
Other Work
• Cleanup and update VC codes (with Faisal)
– Modularize all codes
• Package VC codes as libraries
– Provide a suite of applications: VC, MC, MCS, SAT,
…
– Three kernelization methods: LP, Network Flow,
Crown Reduction
• Look into parallelizing maximal clique codes
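One way a VC library can serve an MC application, sketched under assumptions: a vertex set is a clique in a graph exactly when the remaining vertices form a vertex cover of the complement graph, so a minimum vertex cover of the complement yields a maximum clique. The simple bounded search tree below stands in for the team's VC codes, which are not shown in the slides; function names are hypothetical and no kernelization is applied.

```python
from itertools import combinations


def vertex_cover(edges, k):
    """Return a vertex cover of size at most k, or None (2^k search tree)."""
    if not edges:
        return set()
    if k == 0:
        return None
    u, v = next(iter(edges))
    # Any cover must contain u or v, so branch on both choices.
    for pick in (u, v):
        rest = {e for e in edges if pick not in e}
        cover = vertex_cover(rest, k - 1)
        if cover is not None:
            return cover | {pick}
    return None


def max_clique_via_vc(adj):
    """Maximum clique as the complement of a minimum vertex cover
    of the complement graph."""
    vertices = list(adj)
    complement = {(u, v) for u, v in combinations(vertices, 2)
                  if v not in adj[u]}
    # Try increasing cover sizes; the first success is a minimum cover.
    for k in range(len(vertices) + 1):
        cover = vertex_cover(complement, k)
        if cover is not None:
            return set(vertices) - cover


# Triangle 0-1-2 with a pendant edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
clique = max_clique_via_vc(graph)
```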