Project: Research and development
– Topic: pick one from the lecture notes (ln3 – ln11)
– Example: a MapReduce algorithm for graph simulation
Development:
– Pick a research paper from the reading list of ln3 – ln11
– Implement its main algorithms
– Conduct its experimental study
Distribution:
– Algorithms: technical depth, performance guarantees
– Prove the correctness of your algorithms, and give their complexity analysis and
performance guarantees
15%
– Justification (experimental evaluation)
20%
10%
Report: in the form of a technical report/research paper
– Introduction: problem statement, motivation
– Related work: survey
– Techniques: algorithms, illustration via intuitive examples
– Correctness/complexity/property/proofs
– Experimental evaluation
– Possible extensions
Project: Survey
Topic: pick one topic from a lecture note (ln3 – ln11)
Example: distributed graph query engines; distributed algorithms for querying graphs
Distribution:
– Select 5-6 representative papers, independently
– Develop a set of criteria: the most important issues in that line of research,
based on your own understanding; justify your criteria
10%
– Evaluate each of the papers based on your criteria
10%
15%
– A table to summarize the assessment; based on your criteria, draw and
justify your conclusions and recommendations for various applications
10%
Project report and presentation – 15%
• A clear problem statement
• Motivation and challenges
• Key ideas, techniques/approaches
• Key results – what you have got, intuitive examples
• Findings/recommendations for different applications
• Demonstration: a must if you do a development project
• Presentation: question handling (show that you have developed a good
understanding of the line of work)
Project list:
Project 1: Recall regular path queries:
• Input: A node-labeled directed graph G, a pair of nodes s and t in G, and a regular
expression R
• Question: Does there exist a path p from s to t that satisfies R?
• Develop two algorithms for evaluating regular path queries:
– a sequential algorithm using 2-hop covers; and
– an algorithm in MapReduce
• Prove the correctness of your algorithms and give a complexity analysis
• Experimentally evaluate your algorithms, especially their scalability
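A simple sequential baseline can help when checking the answers of the two Project 1 algorithms. The Python sketch below is not the 2-hop-cover algorithm: it runs a breadth-first search over pairs (graph node, automaton state). It rests on two assumptions that are not part of the project statement: R has already been compiled into an NFA (nfa_start, nfa_accept and nfa_delta are hypothetical parameter names), and a path satisfies R when the labels of all its nodes, including s and t, spell a word in the language of R.

```python
from collections import deque


def rpq_reachable(labels, adj, s, t, nfa_start, nfa_accept, nfa_delta):
    """Return True if some path from s to t spells a word accepted by the NFA.

    labels:    dict node -> label
    adj:       dict node -> iterable of successor nodes
    nfa_delta: dict (state, label) -> set of next states (no epsilon moves)
    Assumption: the word of a path is the label sequence of all its nodes.
    """
    # Consume the label of s before moving along any edge.
    frontier = deque((s, q) for q in nfa_delta.get((nfa_start, labels[s]), ()))
    seen = set(frontier)
    while frontier:
        v, q = frontier.popleft()
        if v == t and q in nfa_accept:
            return True
        for w in adj.get(v, ()):
            for q2 in nfa_delta.get((q, labels[w]), ()):
                if (w, q2) not in seen:
                    seen.add((w, q2))
                    frontier.append((w, q2))
    return False


# Toy instance: R = a b* c as a three-state NFA (states 0, 1, 2; 2 accepting).
labels = {1: "a", 2: "b", 3: "b", 4: "c"}
adj = {1: [2, 4], 2: [3], 3: [4, 2]}
delta = {(0, "a"): {1}, (1, "b"): {1}, (1, "c"): {2}}
found = rpq_reachable(labels, adj, 1, 4, 0, {2}, delta)
```

Each (node, state) pair is enqueued at most once, so the search terminates on cyclic graphs as well.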
Project 2: GPath. Extend XPath to query directed, node-labeled graphs.
Design GPath, a query language for graphs. A GPath query Q starts from a context node
v in a graph G, traverses G and returns all the nodes that are reachable from v by
following Q. GPath should support the child axis, wildcard *, descendant-or-self (//),
and filters (aka qualifiers, such as [p = c]). Justify your design.
Develop an algorithm that, given a GPath query Q, a graph G, and a context node v in G,
computes Q(G), the set of nodes reachable from v in G by following Q.
Give a complexity analysis of your algorithm and show its correctness.
Experimentally evaluate your algorithm
Project 3: Study keyword search in graphs.
Pick a semantics for keyword search and an algorithm for implementing keyword search
based on the semantics
Justify your choice: semantics and algorithm
Implement the algorithm in whatever language you like
Experimentally evaluate your implementation
Demonstrate your keyword search support
Project 4: Recall bounded graph simulation
• Implement an algorithm that, given a pattern Q and a graph G, computes the
maximum match of Q in G via bounded simulation
• Develop optimization strategies
• Experimentally evaluate your algorithm, especially its scalability with the size of G
• Write a survey on revisions of conventional graph simulation, as related work
Project 5: Recall graph simulation
• Develop a MapReduce algorithm that, given a pattern Q and a graph G, computes
the maximum match of Q in G via graph simulation
• Develop optimization strategies
• Experimentally evaluate your algorithm, especially its scalability with the size of G
• Write a survey on revisions of conventional graph simulation as part of the related
work
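The fixpoint that a MapReduce algorithm for Project 5 has to reproduce can be sketched sequentially as follows. This is a naive refinement loop, not an optimized algorithm; the function name and the dictionary-based graph encoding are assumptions made for illustration. It assumes the usual definition: a pattern node u matches a graph node v if they carry the same label and every pattern edge (u, u') is matched by some edge (v, v') where u' matches v'. Following one common convention, the maximum match is reported as empty when some pattern node has no match.

```python
def graph_simulation(q_labels, q_adj, g_labels, g_adj):
    """Naive fixpoint for the maximum match of pattern Q in graph G.

    q_labels, g_labels: dict node -> label
    q_adj, g_adj:       dict node -> iterable of successor nodes
    Returns a dict: pattern node -> set of matching graph nodes.
    """
    # Start from all label-compatible pairs.
    sim = {
        u: {v for v, lab in g_labels.items() if lab == q_labels[u]}
        for u in q_labels
    }
    changed = True
    while changed:
        changed = False
        for u in q_labels:
            for u2 in q_adj.get(u, ()):
                # Keep v only if some successor of v still matches u2.
                keep = {
                    v for v in sim[u]
                    if any(w in sim[u2] for w in g_adj.get(v, ()))
                }
                if keep != sim[u]:
                    sim[u] = keep
                    changed = True
    # Assumed convention: no match at all if a pattern node is unmatched.
    if any(not matches for matches in sim.values()):
        return {u: set() for u in q_labels}
    return sim


# Toy instance: pattern A <-> B; only nodes 1 and 2 of G lie on such a cycle.
q_labels = {"A": "a", "B": "b"}
q_adj = {"A": ["B"], "B": ["A"]}
g_labels = {1: "a", 2: "b", 3: "a", 4: "b"}
g_adj = {1: [2], 2: [1], 3: [4]}
match = graph_simulation(q_labels, q_adj, g_labels, g_adj)
```

A sequential reference of this kind is mainly useful for checking the output of the parallel algorithm on small graphs.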
Project 6: Recall subgraph isomorphism
• Develop two algorithms that, given a pattern Q and a graph G, compute the
maximum match of Q in G via subgraph isomorphism, one in MapReduce and one in BSP
• Develop optimization strategies to reduce parallel computational cost and data
shipment cost
• Experimentally evaluate your algorithms, especially their scalability with the size of G
• Write a survey on parallel algorithms for subgraph isomorphism
Project 7: Recall strongly connected components (Lecture 2)
• Implement a MapReduce algorithm that, given a graph G, computes all (maximal)
strongly connected components of G
• Develop optimization strategies
• Experimentally evaluate your algorithm, especially its scalability with the size of G
• Write a survey on parallel algorithms for computing strongly connected components,
as part of the related work.
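To validate the output of a parallel implementation on small inputs, a sequential reference is handy. The Python sketch below uses Kosaraju's two-pass algorithm with explicit stacks; it is a swapped-in sequential technique, not a MapReduce algorithm, and the function name and adjacency-dictionary encoding are assumptions made for illustration.

```python
def strongly_connected_components(adj):
    """Kosaraju's algorithm. adj: dict node -> list of successors.

    Returns a list of frozensets, one per strongly connected component.
    """
    nodes = set(adj)
    for succs in adj.values():
        nodes.update(succs)

    # Pass 1: iterative DFS on G, recording nodes in order of completion.
    order, visited = [], set()
    for root in nodes:
        if root in visited:
            continue
        visited.add(root)
        stack = [(root, iter(adj.get(root, ())))]
        while stack:
            v, successors = stack[-1]
            for w in successors:
                if w not in visited:
                    visited.add(w)
                    stack.append((w, iter(adj.get(w, ()))))
                    break
            else:
                # All successors of v are explored: v is finished.
                order.append(v)
                stack.pop()

    # Reverse every edge of G.
    radj = {v: [] for v in nodes}
    for v, succs in adj.items():
        for w in succs:
            radj[w].append(v)

    # Pass 2: DFS on the reversed graph, in decreasing completion order.
    assigned, components = set(), []
    for root in reversed(order):
        if root in assigned:
            continue
        assigned.add(root)
        component, stack = set(), [root]
        while stack:
            v = stack.pop()
            component.add(v)
            for w in radj[v]:
                if w not in assigned:
                    assigned.add(w)
                    stack.append(w)
        components.append(frozenset(component))
    return components


sccs = strongly_connected_components({1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4], 6: []})
```

Comparing components as sets of frozensets avoids depending on the order in which they are reported.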
Project 8: Recall kNN joins (Lecture 2)
• Implement a MapReduce algorithm for evaluating kNN join queries
• Develop optimization strategies
• Experimentally evaluate your algorithm
• Write a survey on parallel algorithms for kNN queries and kNN join queries, as part
of the related work.
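A brute-force reference can serve as ground truth when testing a MapReduce kNN join on small data. The sketch below assumes that the kNN join of R and S returns, for each point of R, its k nearest points in S under Euclidean distance; the function name and the index-based output format are assumptions made for illustration.

```python
import heapq
import math


def knn_join(R, S, k):
    """Brute-force kNN join.

    R, S: lists of equal-dimension coordinate tuples.
    Returns a dict: index i in R -> list of indices of the k points of S
    closest to R[i], nearest first (Euclidean distance is assumed).
    """
    result = {}
    for i, r in enumerate(R):
        result[i] = heapq.nsmallest(
            k, range(len(S)), key=lambda j: math.dist(r, S[j])
        )
    return result


neighbors = knn_join([(0, 0), (5, 4)], [(1, 0), (5, 5), (0, 2)], 2)
```

The cost is quadratic in the input sizes, so this is only meant for correctness checks, not as a baseline for scalability experiments.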
Project 9: Recall keyword search with Steiner-tree semantics (Lecture 2)
• Implement a MapReduce algorithm for keyword search with Steiner-tree semantics
• Develop optimization strategies
• Experimentally evaluate your algorithm
• Write a survey on parallel algorithms for keyword search as part of the related work.
Project 10: Recall PageRank (Lecture 2)
• Implement two algorithms for PageRank, in
– BSP
– GRAPE
• Develop optimization strategies
• Experimentally evaluate your algorithms, especially their scalability with the size of G
• Write a survey on parallel algorithms for PageRank, as part of the related work.
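Before moving to BSP or GRAPE, it can help to have a sequential power-iteration version of PageRank to compare against. The sketch below is one common formulation and makes several assumptions that the project statement does not fix: a damping factor of 0.85, uniform redistribution of the rank of nodes without out-edges, and an L1 convergence threshold. The function name and parameters are hypothetical.

```python
def pagerank(adj, d=0.85, tol=1e-10, max_iter=200):
    """Sequential power iteration.

    adj: dict node -> list of successors.
    Assumptions: damping factor d, dangling nodes spread their rank
    uniformly over all nodes, stop when the L1 change drops below tol.
    """
    nodes = list(set(adj) | {w for succs in adj.values() for w in succs})
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        # Rank held by nodes with no out-edges in this round.
        dangling = sum(rank[v] for v in nodes if not adj.get(v))
        new = {v: (1.0 - d) / n + d * dangling / n for v in nodes}
        for v in nodes:
            succs = adj.get(v, ())
            for w in succs:
                new[w] += d * rank[v] / len(succs)
        delta = sum(abs(new[v] - rank[v]) for v in nodes)
        rank = new
        if delta < tol:
            break
    return rank


ranks = pagerank({1: [2], 2: [3], 3: [1]})
```

Each loop iteration corresponds naturally to one superstep of a vertex-centric BSP program, which makes per-iteration comparison with a parallel run straightforward.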
Project 11: Recall strongly connected components (Lecture 2)
• Implement two algorithms for computing strongly connected components, in
– BSP
– GRAPE
• Develop optimization strategies
• Experimentally evaluate your algorithms, especially their scalability with the size of G
• Write a survey on parallel algorithms for computing strongly connected components,
as part of the related work.
Project 12: Recall bounded simulation (Lecture 3)
• Implement a parallel algorithm for graph pattern matching via bounded simulation,
in GRAPE
• Develop optimization strategies
• Experimentally evaluate your algorithm, especially its scalability with the size of G
• Write a survey on parallel algorithms for graph pattern matching, as part of the
related work.
Project 13: Recall graph partitioning: given a directed graph G and a natural number n,
we want to partition G into n fragments of roughly even size such that the total
number of border nodes, |Vf|, is minimized
• Read existing work on graph partitioning
• Develop an approximation algorithm for graph partitioning
• Implement your algorithm in any parallel programming model of your choice
• Develop optimization strategies
• Experimentally evaluate your algorithm, especially its scalability with the size of G
and with |Vf|
• Write a survey on graph partitioning algorithms, as part of the related work.
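For the experimental study of a partitioning algorithm, the objective has to be measured on each partition produced. The sketch below computes the set of border nodes Vf and the fragment sizes for a given assignment of nodes to fragments. It assumes that a border node is a node with at least one edge to or from a node in another fragment; the function name and input encoding are hypothetical.

```python
from collections import Counter


def partition_quality(edges, part):
    """Evaluate a partition of a directed graph.

    edges: iterable of (u, v) pairs
    part:  dict node -> fragment id
    Returns (border, sizes): the set of border nodes Vf and a dict
    fragment id -> number of nodes in that fragment.
    Assumption: a node is a border node if one of its edges crosses fragments.
    """
    border = set()
    for u, v in edges:
        if part[u] != part[v]:
            border.add(u)
            border.add(v)
    sizes = dict(Counter(part.values()))
    return border, sizes


border, sizes = partition_quality(
    [(1, 2), (2, 3), (3, 4)], {1: 0, 2: 0, 3: 1, 4: 1}
)
```

Reporting both |Vf| and the largest fragment size makes the trade-off between balance and border size visible in the evaluation.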