CS 363 Comparative Programming Languages

Computer Systems Lab
TJHSST
Current Projects
In-House, pt 6
• Algorithms for Computational Comparative Historical Linguistics
• Optimizing Genetic Algorithms for Cypher Decoding
• Decision Trees for Career Guidance
• Archival of Articles via RSS and Datamining Performed on Stored Articles
• Sabermetric Statistics in Baseball
Algorithms for Computational Comparative Historical Linguistics
Over time, languages change by regular,
systematic processes. By comparing the state
of a language now and in the past, it is
possible to deduce the exact changes that
occurred and the order in which they
occurred. These changes also split languages
into descendants, so it is also possible,
using modern languages as input, to infer the
probable structure of their parent language.
My goal is to develop algorithms by which
computers may efficiently analyze the
historical structure of languages and
language families.
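
As a rough illustration of why the order of changes can be recovered, the sketch below applies two sound changes to the same word in both possible orders. The rules, the word "keki", and the names `PALATALIZE`, `RAISE_E`, and `apply_changes` are invented for this example and do not describe any real language or the project's actual algorithms.

```python
import re

# Hypothetical toy sound changes, invented for illustration only.
# Each rule is a (regex pattern, replacement) pair.
PALATALIZE = (r"k(?=i)", "ch")  # k becomes ch before i
RAISE_E = (r"e", "i")           # every e becomes i


def apply_changes(word, rules):
    """Apply each sound change to the word, one after another, in order."""
    for pattern, replacement in rules:
        word = re.sub(pattern, replacement, word)
    return word


# The same two changes give different results depending on their order.
palatalize_first = apply_changes("keki", [PALATALIZE, RAISE_E])
raise_first = apply_changes("keki", [RAISE_E, PALATALIZE])
print(palatalize_first, raise_first)
```

Because the two orderings leave different modern forms, comparing the attested form against each candidate ordering is one simple way to rule orderings out.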
Optimizing Genetic Algorithms for Cypher Decoding
Over the past several years, genetic
algorithms have come into wide use
because of their ability to find good
solutions to computing problems very
quickly. They imitate natural selection by
crossing over candidate solutions encoded
as chromosome-like strings, with preference
given to the fitter solutions produced. They
hold great promise in the field of
cryptology, where they may be used to
quickly find good partial solutions, thus
eliminating much of the intense manual
labor that goes into identifying initial
coding schemes.
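
The crossover and selection steps described above can be sketched for a simple substitution cipher. This is a minimal sketch under stated assumptions, not the project's implementation: each chromosome is assumed to be a 26-letter key, and the fitness function is a toy stand-in that counts common words in the decoded text (a serious attempt would more likely score letter or n-gram frequencies). All function names and parameters are hypothetical.

```python
import random
import string

ALPHABET = string.ascii_lowercase


def decode(ciphertext, key):
    """Replace each cipher letter with the plaintext letter the key proposes."""
    return ciphertext.translate(str.maketrans(ALPHABET, key))


def fitness(key, ciphertext, common_words):
    """Toy fitness (an assumption): count common words in the decoded text."""
    text = decode(ciphertext, key)
    return sum(text.count(word) for word in common_words)


def crossover(parent_a, parent_b, rng):
    """Order crossover: keep a slice of parent_a, fill the rest in parent_b's order."""
    i, j = sorted(rng.sample(range(len(parent_a)), 2))
    middle = parent_a[i:j]
    rest = [c for c in parent_b if c not in middle]
    return "".join(rest[:i]) + middle + "".join(rest[i:])


def mutate(key, rng):
    """Swap two letters so the key stays a valid permutation."""
    a, b = rng.sample(range(len(key)), 2)
    chars = list(key)
    chars[a], chars[b] = chars[b], chars[a]
    return "".join(chars)


def evolve(ciphertext, common_words, generations=50, size=40, seed=0):
    """Return the fittest key found after a fixed number of generations."""
    rng = random.Random(seed)
    population = ["".join(rng.sample(ALPHABET, 26)) for _ in range(size)]

    def score(key):
        return fitness(key, ciphertext, common_words)

    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: size // 2]  # preference given to fitter keys
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(size - len(parents))
        ]
        population = parents + children
    return max(population, key=score)
```

Keeping the fitter half of each generation unchanged means the best score never decreases, which suits the goal of quickly reaching a good partial solution that a person can then finish by hand.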
Decision Trees for Career Guidance
This research project will be an investigation
into the design and implementation of various
decision trees for career guidance. A decision
tree takes a situation described by a set of
parameters and outputs a Boolean decision
for that situation. This
project will take into account many aspects
associated with decision trees including
database building, searching and sorting, and
algorithms for accessing data.
My project uses numerous decision trees to
serve as a career guidance tool for young
adults. A user fills out a form with specified
fields, which the group of decision trees then
analyzes until a field of study or occupation
is given to the user as the outcome. This
group of decision trees will be built through
database building techniques.
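
To make the idea concrete, here is a minimal sketch of one such tree, assuming each internal node asks a Boolean question about the form answers and each leaf names an outcome. The classes `Node` and `Leaf`, the function `decide`, and the form fields and outcomes are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class Leaf:
    outcome: str  # suggested field of study or occupation


@dataclass
class Node:
    question: Callable[[dict], bool]  # Boolean test on the form answers
    if_yes: Union["Node", Leaf]
    if_no: Union["Node", Leaf]


def decide(tree, answers):
    """Walk from the root to a leaf, answering one Boolean test per node."""
    while isinstance(tree, Node):
        tree = tree.if_yes if tree.question(answers) else tree.if_no
    return tree.outcome


# Hypothetical form fields and outcomes, invented for illustration.
career_tree = Node(
    question=lambda a: a["likes_math"],
    if_yes=Node(
        question=lambda a: a["likes_building_things"],
        if_yes=Leaf("engineering"),
        if_no=Leaf("statistics"),
    ),
    if_no=Leaf("journalism"),
)

suggestion = decide(career_tree, {"likes_math": True, "likes_building_things": False})
print(suggestion)
```

Each node still makes only a Boolean decision; chaining several of them is what lets the group of trees arrive at a named field rather than a yes or no.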
Archival of Articles via RSS and Datamining Performed on Stored Articles
RSS (Really Simple Syndication,
encompassing Rich Site Summary and RDF
Site Summary) is a web syndication format
used by many blogs and news websites to
distribute information. It saves people from
having to visit several sites repeatedly to
check for new content. Many RSS newsfeed
aggregators are currently available to the
public, but none of them archives anything
beyond the RSS metadata. The purpose of this
project is to create an RSS aggregator that
archives the text of the articles linked to
in the RSS feeds in a linkable, searchable
database and, if all goes well, adds a
datamining capability as well.
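
The archiving step might look roughly like the sketch below. It assumes an RSS 2.0 feed (RSS 1.0/RDF feeds use XML namespaces and would need different lookups), stores articles in SQLite, and takes the page-downloading step as a caller-supplied `fetch` function. The table layout and the names `parse_feed`, `archive`, and `search` are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def parse_feed(rss_xml):
    """Return (title, link) pairs for each <item> in an RSS 2.0 document."""
    root = ET.fromstring(rss_xml)
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in root.iter("item")
    ]


def archive(db, rss_xml, fetch):
    """Store the full text of every linked article not already archived.

    fetch is a caller-supplied function mapping a link to the article text.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS articles "
        "(link TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    for title, link in parse_feed(rss_xml):
        seen = db.execute("SELECT 1 FROM articles WHERE link = ?", (link,)).fetchone()
        if seen is None:
            db.execute(
                "INSERT INTO articles VALUES (?, ?, ?)", (link, title, fetch(link))
            )
    db.commit()


def search(db, term):
    """Return (title, link) for archived articles whose text contains term."""
    rows = db.execute(
        "SELECT title, link FROM articles WHERE body LIKE ?", (f"%{term}%",)
    )
    return rows.fetchall()
```

Passing `fetch` in as a parameter keeps the sketch free of network code; a real aggregator would supply a function that downloads the page and extracts the article text from it.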
An Analysis of Sabermetric Statistics in Baseball
For years, baseball theorists have pondered the
most basic question of baseball statistics:
which statistic most accurately predicts which
team will win a baseball game? With this
information, baseball teams can rely on
technology-driven, statistics-based scouting
organizations. The book Moneyball addresses
the advent of sabermetric statistics in the
1980s and 1990s and shows how radical
baseball thinkers instituted a new era of
baseball scouting and player analysis. This
project analyzes which baseball statistic is
the single most important. It has been found
that newer formulas, such as OBP (on-base
percentage), OPS (on-base plus slugging), and
Runs Created, correlate better with the number
of runs a team scores than traditional statistics
such as batting average.
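
For reference, the statistics named above can be computed as in the sketch below. It uses the standard definitions of batting average, OBP, slugging, and OPS, and the basic version of Runs Created; the function and parameter names are my own, and the sample stat line is invented rather than taken from any real player or team.

```python
def batting_average(hits, at_bats):
    """Traditional statistic: hits per at-bat."""
    return hits / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP: times on base divided by at-bats, walks, hit-by-pitch, and sac flies."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging(total_bases, at_bats):
    """SLG: total bases per at-bat."""
    return total_bases / at_bats


def ops(hits, walks, hit_by_pitch, at_bats, sac_flies, total_bases):
    """OPS: on-base percentage plus slugging."""
    obp = on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies)
    return obp + slugging(total_bases, at_bats)


def runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (H + BB) * TB / (AB + BB)."""
    return (hits + walks) * total_bases / (at_bats + walks)


# Invented stat line, for illustration only.
avg = batting_average(hits=150, at_bats=500)
obp = on_base_percentage(hits=150, walks=50, hit_by_pitch=0, at_bats=500, sac_flies=0)
rc = runs_created(hits=150, walks=50, total_bases=250, at_bats=500)
print(round(avg, 3), round(obp, 3), round(rc, 1))
```

Unlike batting average, OBP and Runs Created both credit walks, which is one reason they can track run scoring more closely.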