CSE5544 Final Project
Interactive Visualization Tool(s) for IEEE Vis
Publication Exploration and Analysis
Team Name: Publication Miner
Team Members: Xiaonan Ji & Tong Zhao
May 1st 2017
Motivation
Dataset: Visualization Publication Data Collection - 2,752 IEEE Visualization (IEEE VIS)
publications from 1990-2015.
Problem & Significance
• Potential audiences: academics and students working in the Visualization domain.
• Provide a rapid overview of the IEEE Vis research community and help the audience
understand its development over the past 25 years.
• Facilitate exploration, analysis, and knowledge discovery regarding:
• Specific research topics and publications
• Relationships among multiple research topics and publications
• Highlight impactful publications and help the audience scope research that
addresses existing gaps or challenges.
What We Did
Two visualizations approaching the VisPubData dataset from different perspectives:
• Map projection of publication content similarity, with interactive exploration & analysis
• Citation/reference network, with interactive exploration & analysis
Visualization 1
Map projection of publication content similarity
Interactive exploration & analysis
Objective & Prototype
[Figure: prototype sketch of the map projection across Year 1, Year 2, and Year 3. Labels: new area; mature area; unexplored area; innovation; gap?; filled gap; important publication & its highlighted citations/references; potential future work; clustering -> topic?]
Generate an interactive map projection (2D scatter plot) that places the 2,752
publications according to text features derived from their titles, abstracts, and
keywords.
Questions to answer
• How have different research topics developed over the 25 years?
• How is a research topic explored and filled in by relevant publications?
• What are the most impactful milestone publications? What are their citations and references?
• What patterns can be identified? What new opportunities can be identified?
Workflow of Implementation
Dataset (.csv)
• Title
• Abstract
• Keywords
(Effective) Text Feature Development with NLP
• Tokenization, POS tagging & chunking
• Extraction of noun phrases (NPs)
• Stemming (Porter stemmer)
• N-gram features: 1-gram; 1- & 2-gram
• Weighting: count; tf-idf
Dimensionality Reduction
• Truncated SVD
• t-SNE
2D Map Projection
• Sizing: #citations
• Coloring: type
• Filtering: year
User Interaction
• Examine publication affinity and clustering from positions in the 2D map.
• Adjust the level of detail by zooming in and out.
• Identify important publications by their dot size.
• Search for publications of interest by DOI or text matching.
• Click a publication to show its references and citations.
• Hover over a publication to see its title and keywords instantly.
• Examine the evolution over time with year selection/filtering.
Tools: Python for the data processing steps; D3 for the 2D map projection.
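The text-feature step of the workflow can be illustrated with a minimal sketch, assuming plain whitespace tokenization and a smoothed idf. It builds tf-idf vectors over 1- and 2-grams and compares publications by cosine similarity. The function names and the sample titles are hypothetical; noun-phrase extraction and Porter stemming are omitted, and this is not the project's actual code.

```python
import math
from collections import Counter

def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return all 1..n_max-grams."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams += [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    return grams

def tfidf_vectors(docs, n_max=2):
    """Return one L2-normalized sparse tf-idf vector (term -> weight) per document."""
    counts = [Counter(ngrams(d, n_max)) for d in docs]
    # Document frequency: iterating a Counter yields each distinct term once.
    df = Counter(term for c in counts for term in c)
    n_docs = len(docs)
    vectors = []
    for c in counts:
        # Smoothed idf (an assumption), so no term gets a zero weight.
        vec = {t: tf * (math.log((1 + n_docs) / (1 + df[t])) + 1)
               for t, tf in c.items()}
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors.append({t: w / norm for t, w in vec.items()} if norm else vec)
    return vectors

def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the shorter vector
    return sum(w * b.get(t, 0.0) for t, w in a.items())

# Made-up titles for illustration only (not taken from the dataset).
titles = [
    "fast volume rendering of scalar fields",
    "interactive volume rendering on the gpu",
    "graph layout for large social networks",
]
vecs = tfidf_vectors(titles)
sims = [[cosine(u, v) for v in vecs] for u in vecs]
```

In this toy input the two volume rendering titles share terms and so score higher with each other than with the graph layout title. In the workflow above, vectors of this kind would then be reduced with Truncated SVD and t-SNE to obtain the 2D positions; library implementations such as scikit-learn's TfidfVectorizer, TruncatedSVD, and TSNE are one possible choice, though the slides only state that Python was used.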
Highlights
To support understanding, exploration, and discovery:
• NLP for effective text feature development, followed by dimensionality reduction.
• Map projection accommodating 2,752 publication dots. The distribution and clustering of
publications (research topics) in the 2D space reflect their similarity in the text feature
space. Three modes can be selected: title, abstract, and keywords.
• Adjustable level of detail: zoom in and out to examine clusters and individual
publications.
• Highlighting of important publications (milestones) that have many citations.
• Instant information display for the publication under the mouse pointer.
• Locating publications of interest by searching with a DOI or user keyword(s).
• Suggestion of related publications (references and citations) for a user-selected publication.
• Selection/filtering of publication years to examine the evolution over the past 25 years.
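The interaction features listed above (year filtering, DOI/keyword search, related publications, and citation-based sizing) can be sketched as simple data operations. This is an illustrative sketch only: the `Publication` record, the function names, the placeholder DOIs, and the square-root sizing rule are all assumptions, and the actual tool implements the interaction in D3.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Publication:
    doi: str
    title: str
    year: int
    keywords: list = field(default_factory=list)
    references: list = field(default_factory=list)  # DOIs this paper cites

def filter_by_year(pubs, start, end):
    """Keep publications whose year lies in [start, end]."""
    return [p for p in pubs if start <= p.year <= end]

def search(pubs, query):
    """Match an exact DOI, or a substring of the title or of any keyword."""
    q = query.strip().lower()
    return [p for p in pubs
            if q == p.doi.lower()
            or q in p.title.lower()
            or any(q in k.lower() for k in p.keywords)]

def related(pubs, doi):
    """Return (references, citations) of one publication as lists of DOIs."""
    by_doi = {p.doi: p for p in pubs}
    references = [r for r in by_doi[doi].references if r in by_doi]
    citations = [p.doi for p in pubs if doi in p.references]
    return references, citations

def dot_radius(n_citations, base=3.0, scale=1.5):
    """Dot size grows with the square root of citations (assumed rule)."""
    return base + scale * math.sqrt(n_citations)

# Placeholder records for illustration; not real publications or DOIs.
pubs = [
    Publication("10.0000/a", "Volume rendering basics", 1995,
                ["volume rendering"]),
    Publication("10.0000/b", "Graph layout at scale", 2004,
                ["graph visualization"], ["10.0000/a"]),
    Publication("10.0000/c", "GPU volume rendering", 2010,
                ["volume rendering", "gpu"], ["10.0000/a", "10.0000/b"]),
]
recent = filter_by_year(pubs, 2000, 2015)
hits = search(pubs, "volume")
refs, cites = related(pubs, "10.0000/b")
```

Square-root scaling is used here so that dot area, rather than radius, is proportional to the citation count; the slides do not say which scaling the tool applies.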
Analysis & Findings
• We were able to identify many dominant research topics, including volume rendering,
flow visualization, vector/tensor visualization, multi-dimensional visualization, graph
visualization, user interaction and interface, visual data mining, etc.
• While InfoVis, SciVis, and VAST have different focuses, they overlap in areas such as
user interaction and interface, multi-dimensional visualization…
• Some research topics have followed different patterns of evolution over the years:
• "Volume rendering" - stable growth throughout the period.
• "Visualization of text document" - "slack" development: initial attention in 1995, but many follow-up studies
were not published until the 2000s or after 2010.
• "Graph visualization" - began to draw more attention around 2000; the issue of "clutter graph" was raised in 2004, and
many related solutions such as "parallel coordinate" and "focus and context" were applied to this area afterwards.
• Many milestone publications from the early years (1990s) continue to influence
recent studies, as reflected by the continued growth of their citation counts.
• Some research trends and potential opportunities can be identified; for example, the
cognitive exploration of a dataset has been drawing increasing attention in recent years.
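The evolution patterns above were read off the map with the year filter. A complementary numeric check, not described in the slides, is to count per year the publications whose text mentions a topic term. A minimal sketch, assuming records are (year, text) pairs; the function name and the sample records are made up for illustration.

```python
from collections import Counter

def topic_counts_by_year(records, term):
    """Count, per year, the records whose text contains the term."""
    term = term.lower()
    counts = Counter(year for year, text in records if term in text.lower())
    return dict(sorted(counts.items()))

# Made-up records for illustration only (not from the dataset).
records = [
    (1995, "Visualizing text documents"),
    (2001, "Text document maps"),
    (2001, "Flow fields"),
    (2011, "Topic models for text document collections"),
]
trend = topic_counts_by_year(records, "text document")
```

Gaps between the years that appear in `trend` would correspond to the kind of "slack" development noted above.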
Visualization 2
Citation/Reference network
Interactive exploration & analysis
Thank you! 