Jornadas de Seguimiento de Proyectos, 2007
Programa Nacional de Tecnologías Informáticas
Parallel Architectures and Applications
(Cooperative Project)
TIN2004-07440-C02 (01 and 02)
Ramón Beivide*
Universidad de Cantabria
José Miguel-Alonso**
Universidad del País Vasco
Abstract
This research project deals with interconnection technologies for parallel and distributed
computing systems. Nowadays, interconnection networks are ubiquitous in the computing
field. At the chip level, networks are present in clustered microprocessors, on-chip
multiprocessors and systems on chip. Some workstations and servers begin to use switched
networks to interconnect their different modules. Clusters, cc-NUMAs, MPPs and SANs rely
on their underlying interconnection subsystem for obtaining high performance and robustness.
Some designs for IP switches begin to consider internal structures based on interconnection
networks. Finally, Grid Computing is a hot topic pursuing efficient designs for wide area
networks. This project pursues new research contributions in the fields mentioned above
including architectural and technological proposals and software development.
Keywords: Interconnection networks, performance evaluation, parallel and distributed
computing.
1. Project Goals
The main outcomes of a cooperative research project are summarized in this report. Two
coordinated research teams belonging to the University of Cantabria (subproject 1) and to the
University of the Basque Country (subproject 2) support this project. The group in Cantabria is
officially composed of five regular professors, one system manager and three PhD students. The
group in the Basque Country is composed of three professors (and two PhD students that did not
form part of the grant-requesting team).
The research project began in September 2004 and ends in September 2007. Subproject 1
received direct funding of 157800 € (40800 € for manpower) plus a grant for one PhD student.
Subproject 2 received direct funding of 34400 €.
The goals of this research project were organized around five main research lines:
1. Bibliographic study and state-of-the-art in parallel computing. As the research project deals
with parallel architectures and applications, a permanent goal is to acquire in some cases, or to
maintain in others, enough know-how in the research field to propose new ideas and solutions.
When the project was proposed, its emphasis was on interconnection
subsystems. Hence, a continuous bibliographic analysis had to be carried out across the broad spectrum of
networks present today in different platforms: multi-core on-chip architectures, multiprocessors,
clusters, supercomputers and grids.
2. Simulation Tools. The design, development and maintenance of accurate simulation tools
constitute key aspects for a research group in interconnection networks. The objectives of this
project included the development of simulation tools designed by our research groups and the
adaptation of external ones to our simulation platforms. In the same way, the design and analysis
of adequate benchmarks and load models were also targets to be considered.
3. Topological Issues. Many of the design problems of interconnection networks have a
topological nature. Some of the targets programmed in this project pivoted around topological
issues including the proposal of new network geometries and lay-outs. Routing, fault-tolerance,
partitioning and support for collective communication were also targets of our research project.
Special emphasis was placed on very large networks, such as those used by the “top ten”
supercomputers. In addition, contributions to the design of algebraic error-correcting codes were
envisaged due to their intrinsic topological nature.
4. Architectural proposals. The proposals of complete network and router architectures
constitute another goal of this project. First, the topic of fault-tolerance was considered as a critical
one for forthcoming network technologies. In addition, routing and deadlock avoidance were
included as research topics. The design of scalable high-performance routers was considered as
another important issue. Special emphasis was also placed on small networks for current
multi-core on-chip architectures and big networks for high-end supercomputers. Finally, the
architecture of network interfaces was also considered as another research goal of the project.
5. Parallel Applications. The targets of this research line were related to the parallelization of
scientific codes that require high computational power. Two application suites were considered.
The first one is based on a set of bio-inspired algorithms denoted as Estimation of Distribution
Algorithms. The second one is a physics application devoted to the search of the Higgs boson based
on artificial neural networks. Clusters and Grid computing were declared as target platforms to run
the resulting parallel codes. In addition, load monitoring and load modeling were announced as
other research topics of the project.
2. Project Achievements
2.1. Simulation Tools
Supporting the advances in the other project goals, a complete evaluation framework has been
developed and maintained. Currently, we are working with two main simulation platforms, SICOSYS
(Simulator of Communicating Systems) and INSEE (Interconnection Network Simulation and
Evaluation Environment), and around these tools we have a complex infrastructure that allows us
to evaluate different interconnection network usage scenarios.
SICOSYS was initially planned almost 10 years ago, with the aim of achieving the precision of hardware
simulators with much lower computational effort. In order to be able to analyze current proposals,
the simulator is under continuous development. The tool has more than 80,000 lines of C++ code. Its
flexible software design has made it possible for different researchers to use a single coherent tool
in very different scenarios without problems.
The main new features added to the tool support the analysis of fault-tolerant techniques in the
interconnection network; in addition, the simulator is able to report power consumption measures.
Besides, different architectural router proposals have been added to the simulator.
SICOSYS has been optimized to scale on current computing infrastructures.
Nowadays, commercial multi-core processors have made it possible to acquire affordable small-scale
SMP systems. Our computing infrastructure follows this trend. Evaluating different
architectural proposals using HTC (High Throughput Computing) requires large amounts of DRAM,
which can increase the infrastructure cost dramatically. To mitigate this drawback, we
have parallelized SICOSYS. Now, it is possible to run one simulation efficiently with 2 to 4 threads
per computing node. We have improved the response time of multi-billion-cycle simulations and are
able to simulate very large networks without reducing the accuracy of the original tool [1].
SICOSYS has been designed with a functional interface to external simulators. In the past, that
interface was used with execution-driven simulators like Rsim and SimOS. Now, we have added
GEMS (General Execution-driven Multiprocessor Simulator). GEMS is able to run unmodified
Solaris® 8/9/10 software on top of a simulated state-of-the-art superscalar processor with a complex
memory hierarchy, using SICOSYS as the interconnection network simulator. The tool relies on
SIMICS® to simulate the full system activity, including I/O. Consequently, we now have a
complete framework to perform full-system simulation with precise processor/memory/network
details. Using the complete tool, we are able to evaluate OS-dependent workloads such as
transactional or commercial applications with a high level of confidence. Moreover, the
development of new benchmarks has been greatly simplified. GEMS is actively supported
by the group of Prof. Mark D. Hill at the University of Wisconsin-Madison and has more than one
hundred registered users. Many multiprocessor researchers use this tool in their recent papers.
Learning and later integrating GEMS was possible through direct collaboration with the
original developers. One member of the U. Cantabria team (Valentin Puente) spent a
sabbatical year at UW-Madison collaborating with the group of Prof. Mark D. Hill.
Although not as expensive, in terms of memory and CPU utilization, as a hardware simulator,
SICOSYS is still a heavyweight tool. It is not possible, using current hardware, to run this tool to
simulate a network with thousands of routers. For this reason, we are working on a
complementary, lightweight simulation environment called INSEE. It is less precise, in terms of
timing, than SICOSYS, but allows us to experiment with very large networks.
The main two modules of INSEE are a network simulator (FSIN) and a traffic generation module
(TrGen). Regarding improvements to FSIN, the most relevant one has been the extension of
capabilities to deal with many different topologies, including k-ary n-cubes and trees. An important
redesign has allowed us to describe all its behavior in a single configuration file, so a single binary can
simulate many different architectural proposals. Output capabilities have also been enhanced,
generating files that can be imported directly by data analysis tools.
The traffic generation capabilities of TrGen have also been greatly improved. Originally, it was only
able to generate traffic using a small selection of synthetic traffic patterns. In the current version,
the selection is wider. Also, we have included the capability of injecting traffic to the FSIN
simulator using traces obtained from actual applications. This has allowed us to perform more
realistic evaluations of performance. Finally, TrGen allows FSIN to interact with a full-system
simulator, Simics®, allowing simulations with workloads generated by actual, running applications.
The details of these tools are described in [2, 3, 4, 5]. The availability of these tools has allowed us
to perform experiments in the fields of topologies for interconnection networks [6] and congestion
control [7, 8, 9]. Also, experimentation with different ways of injecting traffic into the network
has allowed us to detect some pitfalls in some “classic” approaches; that is why we propose
new ways of generating traffic in [10, 11].
Note that the INSEE tools have been developed at the UPV/EHU, but their
design and use have benefited from cooperation with the group at the U. of Cantabria
and with Dr. Cruz Izu of the University of Adelaide.
Recently we have started a line of collaboration with the BSC/CNS and IBM Research
Laboratories–Zurich. One of our immediate objectives is to share our knowledge on simulation
tools and application characterization using traces. Later, we will work on topologies and
scheduling strategies for large-scale high-performance supercomputers. We have held several
meetings hosted by the BSC/CNS, and one of the members of the UPV/EHU team (Javier
Navaridas, a PhD student under the supervision of Dr. Miguel-Alonso) has been visiting the
Zurich laboratories from November 2006 to February 2007.
Concerning parallel workloads, we are actively using the NAS Parallel Benchmarks (OpenMP or MPI
versions), older benchmarks such as SPLASH 1&2, and some transactional benchmarks such as
static web applications (SURGE) or Online Transaction Processing (SPECjbb).
Regarding traffic modeling, we have been working on different ways of providing realistic traffic to
our network simulators. This has been done using three different approaches: i) Full-system
simulation: A collection of machines is simulated using Simics®, running unmodified MPI
applications. The network simulator performs the exchange of messages; ii) Trace-based simulation:
Instrumented versions of MPI applications are run on actual machines (our own clusters, or
machines such as BSC/CNS Mare Nostrum). Detailed traces of the exchanged messages are
obtained. These traces are used to feed simulators. The challenge here is to respect causal
relationships; iii) Synthetic traffic: The usual approach is to consider a collection of
independent sources of traffic that use some probability distribution functions to compute message
sizes, destinations, and inter-generation times. We have introduced a "burst" mode in which traffic
sources are not totally independent, which reflects the coupled nature of most parallel applications.
A recent paper submitted to EuroPar 2007 [12] compares the last two options. Paper [2] evaluates
the first one.
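As an illustration of the "independent sources" model of synthetic traffic described above, the following minimal Python sketch generates a message list that a network simulator could consume. It is not the TrGen implementation: the function name, the parameters and the chosen distributions (Bernoulli injection per cycle, uniform destinations, fixed message size) are assumptions made only for this example, and the "burst" mode is not modeled.

```python
import random

def independent_sources(num_nodes, load, cycles, msg_size=16, seed=0):
    """Yield (cycle, source, destination, size) tuples.

    Hypothetical model: each node is an independent Bernoulli source that
    generates a message with probability `load` in every cycle, so the
    inter-generation times are geometrically distributed. Destinations are
    uniform over all other nodes and the message size is fixed.
    """
    rng = random.Random(seed)
    for cycle in range(cycles):
        for src in range(num_nodes):
            if rng.random() < load:
                dst = rng.randrange(num_nodes - 1)
                if dst >= src:          # skip the source itself
                    dst += 1
                yield cycle, src, dst, msg_size

trace = list(independent_sources(num_nodes=16, load=0.1, cycles=1000))
```

Because every source draws its decisions independently, nothing in this model couples the nodes; that lack of coupling is the limitation that the "burst" mode mentioned in the text is meant to address.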
2.2. Topological Issues
The objectives achieved within this workpackage can be grouped around two topics: distance-reduced practicable topologies and algebraic codes.
With respect to topologies, dozens of parallel computers of different sizes have been designed around
Torus interconnection networks. Typically, a 2D Torus arranges its N nodes in a square Mesh with
wraparound links. Above a few thousand nodes, it has been shown that parallel computers should
use 3D topologies, with a cubic 3D Torus being the most desirable solution. Parallel machines such as
the HP GS1280, based on the Alpha 21364, and the Cray X1E vector computer have used two-dimensional Tori. Others, such as the Cray T3D and T3E, have used three-dimensional Tori by
piling up several 2D Tori planes. The IBM BlueGene is a notable example of a massively parallel
computer that joins 2^6×2^5×2^5 (that is, 64×32×32) nodes in a mixed-radix 3D Torus. In addition, each dimension of a
Torus network may have a different number of nodes, leading to rectangular and prismatic
topologies for two and three dimensions respectively. These topologies are denoted as mixed-radix
networks. Mixed-radix Tori are often built for practical reasons of packaging and modularity. The
HP GS1280 employs a 2D rectangular network and the IBM BlueGene a 3D prismatic one.
However, mixed-radix Tori have two important drawbacks: first, they are no longer edge-symmetric and second, the distance-related network parameters (diameter and average distance) are
quite far from the optimum values of square and cubic topologies. The edge asymmetry introduces
load imbalance in these networks, and for many traffic patterns the load on the longer dimensions
is larger than the load on the shorter ones. In addition, maximum and average delays are too long
as they depend on the poor values of diameter and average distance exhibited by these networks.
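The distance penalty of a mixed-radix Torus can be checked numerically. The sketch below (an illustration written for this report, with hypothetical function names) computes the diameter and average distance of a standard Torus from its side lengths and compares a square 8×8 network with a rectangular 16×4 network of the same size.

```python
from itertools import product

def ring_distance(a, b, k):
    """Shortest hop count between positions a and b on a ring of k nodes."""
    d = abs(a - b) % k
    return min(d, k - d)

def torus_metrics(dims):
    """Return (diameter, average distance) of a torus with the given sides.

    A torus is vertex-symmetric, so measuring from node (0, ..., 0) suffices.
    The average is taken over the other N - 1 nodes.
    """
    dists = [sum(ring_distance(0, c, k) for c, k in zip(node, dims))
             for node in product(*(range(k) for k in dims))]
    n = len(dists)
    return max(dists), sum(dists) / (n - 1)

square = torus_metrics((8, 8))     # 64 nodes, square: diameter 8
rect = torus_metrics((16, 4))      # 64 nodes, mixed-radix: diameter 10
```

With the same 64 nodes, the rectangular network has both a larger diameter (10 versus 8) and a larger average distance than the square one.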
During this project we have analyzed and introduced a family of 2D toroidal networks which
includes standard and twisted Tori configured as square or rectangular networks. We have
proposed the use of the Gaussian integers as the appropriate tool for defining, analyzing and
exploiting this set of networks. Using this algebraic tool we have solved some interesting
problems on this type of network, such as optimal unicast and broadcast routing and the
perfect placement of resources. Some of the results of this research line have been
published in several papers and others have been submitted for publication. Most of the material
generated on this topic has been collected in [38]. In [13], we have considered practicable lay-outs
for this kind of networks. Also, in [14] we have explored a particular network case as a suitable
topology for on-chip networks. In [47, 48] the routing problem over circulant graphs has been
revisited.
Moreover, we have proposed and analyzed alternative mixed-radix 2D and 3D Torus topologies that
avoid the two above-mentioned problems by adequately twisting the wraparound links of one or
two network dimensions. The performance exhibited by these twisted Tori is notably higher than
the one obtained when using their standard mixed-radix counterparts. Some of the results of this
research line have been published in a recently accepted paper [6].
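The effect of twisting can be explored with a small breadth-first search. The sketch below assumes one specific construction that is not spelled out in this report: a 2a×a network (2a columns, a rows) in which the wraparound links of the short dimension are shifted by a columns. The function names are hypothetical, and the code is only meant to compare distances in the standard and twisted variants.

```python
from collections import deque

def neighbors(x, y, a, twisted):
    """Neighbors of (x, y) in a 2a x a torus (2a columns, a rows).

    Assumption: in the twisted variant, wrapping around the row
    dimension also shifts the column by a.
    """
    cols, rows = 2 * a, a
    shift = a if twisted else 0
    result = [((x + 1) % cols, y), ((x - 1) % cols, y)]
    # Moving up: wrapping past the last row shifts the column by `shift`.
    if y + 1 < rows:
        result.append((x, y + 1))
    else:
        result.append(((x + shift) % cols, 0))
    # Moving down: the inverse of the wraparound above.
    if y - 1 >= 0:
        result.append((x, y - 1))
    else:
        result.append(((x - shift) % cols, rows - 1))
    return result

def metrics(a, twisted):
    """Diameter and average distance, by BFS from node (0, 0)."""
    dist = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        node = queue.popleft()
        for nxt in neighbors(*node, a, twisted):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    n = len(dist)
    return max(dist.values()), sum(dist.values()) / (n - 1)

standard = metrics(4, twisted=False)   # 8 x 4 standard torus
twisted = metrics(4, twisted=True)     # 8 x 4 twisted torus
```

Under this assumed construction, the 8×4 twisted network has diameter 4, against 6 for the standard 8×4 Torus, and a smaller average distance as well.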
With respect to the field of Algebraic Codes, the design of error-correcting codes for two-dimensional
signal spaces has recently been considered in the technical literature. Hamming and Lee distances
have been proved to be inappropriate metrics to deal with QAM signal sets and other related
constellations. To our knowledge, the first author who modeled certain QAM constellations
with quotient rings of Gaussian integers was Klaus Huber. In his papers, Huber introduced a new
distance for use in QAM-like constellations, denoted as the Mannheim metric. The rationale behind
this metric is to consider the Manhattan (or taxicab) metric modulo a two-dimensional grid.
Based on this seminal work, we have proposed perfect codes for different multidimensional signal
spaces. To solve these problems, we have introduced an original relationship among the fields of
Graph Theory, Number Theory and Coding Theory. One of our main findings is the proposal of a
suitable metric over quadratic, hexagonal and four-dimensional constellations of signal points. This
metric is the distance among vertices of a new class of Cayley graphs defined over integer rings
which include Gaussian, Eisenstein-Jacobi and Lipschitz Graphs. Hence, such graphs represent
mathematical models of the multidimensional constellations under study. A problem in Graph
Theory known as the perfect dominating set calculation has been solved over these graphs. A
sufficient condition for obtaining such a set has been given for each case. Obtaining these
dominating sets leads directly to the construction of perfect codes for the alphabets under
consideration.
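A small brute-force check may help to make the link between perfect dominating sets and perfect codes concrete. The sketch below builds the graph over the Gaussian integers modulo π = 3 + 4i, where each residue class v is adjacent to v ± 1 and v ± i, and verifies that the multiples of β = 2 + i form a perfect 1-dominating set. The choice of π and β, the adjacency rule and all function names are assumptions made for this example; the sufficient conditions derived in the cited papers are not reproduced here.

```python
def reduce_mod(z, pi):
    """Canonical representative of the Gaussian integer z modulo pi.

    Gaussian integers are (real, imag) tuples. The quotient z / pi is
    rounded to the nearest Gaussian integer; the norm of pi is odd in the
    example below, so the rounding never hits a tie.
    """
    (x, y), (p, q) = z, pi
    norm = p * p + q * q
    # z * conj(pi) = (x*p + y*q) + (y*p - x*q) i
    qr = (2 * (x * p + y * q) + norm) // (2 * norm)
    qi = (2 * (y * p - x * q) + norm) // (2 * norm)
    # remainder = z - (qr + qi*i) * pi
    return (x - (qr * p - qi * q), y - (qr * q + qi * p))

def closed_ball(v, pi):
    """v together with its four neighbours v+1, v-1, v+i, v-i."""
    x, y = v
    steps = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
    return {reduce_mod((x + dx, y + dy), pi) for dx, dy in steps}

def is_perfect_1_dominating(code, vertices, pi):
    """True if every vertex lies in the closed ball of exactly one codeword."""
    return all(sum(v in closed_ball(c, pi) for c in code) == 1
               for v in vertices)

pi = (3, 4)     # modulus, norm 25
beta = (2, 1)   # divides pi, since (2 + i)^2 = 3 + 4i; norm 5
norm = pi[0] ** 2 + pi[1] ** 2
vertices = {reduce_mod((x, y), pi) for x in range(norm) for y in range(norm)}
# Codewords: all multiples of beta, reduced modulo pi.
code = {reduce_mod((x * beta[0] - y * beta[1], x * beta[1] + y * beta[0]), pi)
        for x, y in vertices}
perfect = is_perfect_1_dominating(code, vertices, pi)
```

In this instance the 25 vertices are partitioned into 5 balls of 5 vertices each, one around every codeword, so any single step away from a codeword can be corrected unambiguously.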
Papers concerning codes over Gaussian integers have been published in the proceedings of notable
Information Theory conferences or are under review in journals. Specifically, in [39] and
[40] perfect codes over quotient rings of Gaussian integers are considered. The metric applied to
these codes is the distance induced by a circulant Gaussian graph. Also, the weight distribution of
these circulants has been presented in [41]. In addition, perfect codes over any quotient ring of
Gaussian integers are presented in [42]. In this paper we have shown, as well, that Lee perfect
codes are a particular case of our Gaussian perfect codes. In [39], 1-perfect error correcting codes
over hexagonal constellations modeled by quotient rings of Eisenstein-Jacobi integers have been
introduced. Recently, in [43], a method for finding certain perfect t-dominating sets over degree six
circulant graphs has been considered. A preliminary approach to degree eight graphs and codes
over them is being considered. In this case, the graphs will be built over certain subsets of the
integer quaternions. Perfect 1-dominating sets are obtained and perfect codes over these sets are
compared, in some cases, with perfect Lee codes, which we have considered in [44]. Most of this
work corresponds to the PhD dissertation of C. Martínez [50].
2.3. Architectural Proposals
Among the different tasks that were raised in this objective, most effort has been directed towards
the improvement of packet routers, and to the increase of the fault-tolerant capabilities of the
network. Thus, very significant results were obtained in [15] (not completely explored yet) giving
rise to a new strategy to implement adaptive routing in irregular networks. This method relies on
the use of a deadlock avoidance mechanism developed by our groups in previous projects.
Another area in which our work has produced results of impact is the development of a new
fault-tolerant mechanism for medium and large size k-ary n-cube networks [16, 17]. This
mechanism makes it possible to tolerate, at network level, every time-space combination of failures
provided that the network topology remains connected. The application of this new mechanism to
other network structures can give new interesting results. In [45], we present an exhaustive
evaluation of the proposal under different realistic scenarios.
Work has also been carried out on many other issues related to hardware support for improving
communication operations inside the router. New solutions to several
congestion problems, included in [18, 19, 20], have been proposed, with special emphasis on an
effect discovered (or at least published for the first time) by our groups: the instability of
channel occupation in networks whose diameter exceeds twenty nodes [21]. Additionally,
with respect to network interfaces, an in-depth analysis of the main characteristics of the
network interfaces currently on the market (and with future prospects) has been
made in collaboration with the BSC/CNS.
It is worth remarking that an important part of the challenges of interconnection networks has
moved inside the chip. The ability to integrate several processors on the same chip
has turned the interconnection network into a key element for the performance of new multi-core
architectures. For this reason, our groups are making an effort to acquire the necessary
expertise and tools to make contributions in interconnection mechanisms suitable for this new
environment. As a consequence of this effort, a line of collaboration with the BSC/CNS has
produced significant results [22, 23] on kilo-instruction processors as a way of removing
bottlenecks in the access to shared resources. Specifically, a new router structure for this type of
system has been developed whose performance surpasses that of the architectures used so
far. These results have been submitted for publication [46].
Finally, we want to mention that, in cooperation with Dr. Izu of the U. of Adelaide, we have
studied some congestion-control mechanisms that can be incorporated in large-scale
interconnection networks. In particular, we have studied the mechanism included in IBM’s
BlueGene/L (that gives priority to in-transit traffic, even at the cost of delaying new injections) and
proposed and studied LBR (Local Buffer Restriction), an extension of the Bubble routing
mechanism to adaptive virtual channels. Results of these studies are summarized in [7, 8, 9].
2.4. Parallel Applications
One of the objectives of this project was to perform, in cooperation with the Intelligent Systems
Group of the UPV/EHU, the parallelization of a particular class of bio-inspired algorithms called
EDAs (Estimation of Distribution Algorithms). These algorithms have proven very successful
when solving complex optimization problems, but are very CPU-consuming. If we were able to
accelerate them, via parallelization, we could reach better solutions in shorter times, and/or explore
more solutions. This is precisely what we have done. The details of the way EDAs work, and how
parallelization has been done, are in [24]. This paper also includes a preliminary performance
evaluation of the parallel solutions, extended in [25].
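For readers unfamiliar with EDAs, the following minimal Python sketch shows the sample, select and re-estimate loop that these algorithms share. It uses the simplest univariate model (in the style of UMDA) on a toy bit-counting problem, not the Bayesian-network models parallelized in [24, 25]; the function name, the parameters and the toy fitness function are assumptions made for this example only.

```python
import random

def umda_onemax(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA maximizing the number of ones in a bit string."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits                      # initial model: uniform
    best = None
    for _ in range(generations):
        # 1. Sample a population from the current probabilistic model.
        pop = [[int(rng.random() < p) for p in probs] for _ in range(pop_size)]
        # 2. Evaluate (fitness = number of ones) and select the fittest.
        pop.sort(key=sum, reverse=True)
        selected = pop[:n_select]
        if best is None or sum(pop[0]) > sum(best):
            best = pop[0]
        # 3. Re-estimate the model from the selected individuals.
        probs = [sum(ind[i] for ind in selected) / n_select
                 for i in range(n_bits)]
        # Keep probabilities away from 0 and 1 to avoid premature fixation.
        probs = [min(max(p, 0.05), 0.95) for p in probs]
    return best

best = umda_onemax()
```

In this loop the individuals of one generation are sampled and evaluated independently of each other, which makes those steps natural candidates for parallel execution; which phases were actually parallelized in the project is described in [24], not in this sketch.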
We have collaborated with the group of Prof. Ubide at the Faculty of Chemistry of the
UPV/EHU, in order to apply our fast, parallel version of EDAs to a complex problem in
quantitative chemistry: the creation of multivariate calibration models for chemical applications.
The problem consists of obtaining data (light spectra) of the concentration of species taking part in
controlled chemical reactions, selecting the most relevant data, and creating a prediction model.
This model can be fed with measured data from new reactions with unknown concentrations of
the same species; the model should predict those concentrations. The application of EDAs has
proven to be a very successful approach, because of the quality of the obtained models. In [26, 27,
28] we discuss the approaches to the problem used traditionally by chemists, and compare those
with new approaches based on the “intelligent” search in the space of solutions performed by an
EDA. In [29] we extend this work evaluating different mechanisms of input-data reduction, and
different EDAs.
As a result of this work, Alexander Mendiburu presented his PhD dissertation in January 2006 [30]
under the supervision of Dr. Jose Miguel-Alonso and Dr. Jose Antonio Lozano.
In the field of grid computing, we have been working on economic models to schedule applications
in a grid environment. The results of this work are in [31]. What we have done is an extension to
the grid architecture, introducing a collection of modules that, when launching an application,
select resources taking into account not only availability and adequacy, but also a cost model.
Implementation has been done using a Globus 4 environment [www.globus.org/toolkit], and the
GridWay [www.gridway.org] meta-scheduler.
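The idea of selecting resources by availability, adequacy and cost can be illustrated with a few lines of Python. This is a purely hypothetical sketch: the `Resource` fields, the pricing rule and the site names are invented for the example, and it does not use or reproduce any Globus or GridWay interface or the cost model of [31].

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    free_cpus: int              # availability
    cost_per_cpu_hour: float    # hypothetical price

def select_resource(resources, cpus_needed, hours, budget):
    """Cheapest resource that is available, adequate and within budget.

    Returns None when no resource qualifies.
    """
    candidates = [r for r in resources
                  if r.free_cpus >= cpus_needed
                  and r.cost_per_cpu_hour * cpus_needed * hours <= budget]
    return min(candidates, key=lambda r: r.cost_per_cpu_hour, default=None)

pool = [Resource("site-a", 64, 0.12),
        Resource("site-b", 8, 0.05),
        Resource("site-c", 32, 0.09)]
choice = select_resource(pool, cpus_needed=16, hours=10, budget=20.0)
```

Here the cheapest site is rejected because it lacks enough free CPUs, so the selection falls on the cheapest of the sites that can actually run the job within the budget.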
Another objective of this project was to perform, in cooperation with the IFCA (Instituto de Física
de Cantabria - Institute of Physics of Cantabria), the parallelization of an Artificial Neural Network
algorithm. Currently, a multi-layer perceptron (MLP) is used to search for the Higgs boson at CERN
(European Organization for Nuclear Research). MLPfit is a sequential application for the design,
implementation and use of multilayer perceptrons. It has been developed by DAPNIA, which
depends on the French Atomic Energy Commission, and adopted by CERN for the
implementation of artificial neural networks for High-Energy Physics. The details of how parallelization
has been carried out are in [32]. This work has also produced several Master's theses [33, 34, 35, 36].
One of these students has joined our group as a PhD student. In addition, a formal collaboration
with IFCA has been established under a new contract [37] for participating in the Crossgrid
initiative. Crossgrid is a European project whose objective is the creation, management and
exploitation of a Europe-wide Grid computing environment, permitting the interactive use of
applications that are extremely intensive in terms of computation and data. The applications making
use of the Crossgrid project’s infrastructure are related to the fields of biomedicine, meteorology
and High-Energy Physics. Also, in [49] we have presented an evaluation of openMosix
environments.
3. Indicators of results
Collaboration with research groups
- Dr. Cruz Izu – Dep. of Computer Science, U. of Adelaide, Australia.
- Prof. Ernst Gabidulin. Moscow Institute of Physics and Technology. Russia.
- Prof. Mark D. Hill, UW-Madison. USA.
- Prof. Mateo Valero and Jesús Labarta. BSC/UPC.
- Dr. Carlos Ubide – Faculty of Chemistry, UPV/EHU.
- Intelligent Systems Group – Computer Science Sc., UPV/EHU (www.sc.ehu.es/isg)
- BSC / CNS (www.bsc.es)
- IBM Research Laboratories–Zurich (www.zurich.ibm.com)
- I2Bask – Research Network of the Basque Country (www.i2bask.net)
- IFCA (CSIC/U. Cantabria) (www.ifca.unican.es)
Patents: “Mecanismo de encaminamiento tolerante a fallos altamente escalable”. Inventors:
Valentín Puente Varona, José Ángel Gregorio Manasterio, Fernando Vallejo Alonso, Ramón
Beivide Palacio. U. Cantabria. Request #: P200500530. Request date: 01-03-05.
PhD dissertations:
- Dr. Alex Mendiburu (supervised by Dr. Miguel-Alonso) UPV/EHU
- Dr. Carmen Martínez (supervised by Dr. Ramón Beivide) U. Cantabria
Current PhD students: 2 at the UPV/EHU, 3 at U. of Cantabria, 1 at U. of Burgos
Master’s theses: 1 at the UPV/EHU, 4 at U. of Cantabria
Participation in other projects: HiPEAC Network of Excellence (European Project IST-004408)
Articles in international journals: 7 accepted plus 2 submitted
Papers in international conferences: 21 accepted and 3 submitted
Papers in national conferences: 7 accepted
Other publications (technical reports): 3
4. References
1. P. Prieto, V. Puente, P. Abad and J.Á. Gregorio, "Paralelización de un Entorno de Evaluación
de Redes de Interconexión", XVII Jornadas de Paralelismo, 2006.
2. Javier Navaridas, Fco. Javier Ridruejo, J. Miguel-Alonso. "Evaluation of Interconnection
Networks Using Full-System Simulators: Lessons Learned". Proc. 40th Annual Simulation
Symposium, Norfolk, VA, March 26-28, 2007.
3. F.J. Ridruejo, J. Miguel. "Simulación de redes de interconexión utilizando tráfico real".
Actas de las XVI Jornadas de Paralelismo. Thomson, 2005 (ISBN 84-9732-430-7). Pages:
109 - 116.
4. F.J. Ridruejo, J. Miguel-Alonso. "INSEE: an Interconnection Network Simulation and
Evaluation Environment". Lecture Notes in Computer Science, Volume 3648 / 2005
(Proc. Euro-Par 2005), Pages 1014 - 1023.
5. F.J. Ridruejo, A. Gonzalez, J. Miguel-Alonso. "TrGen: a Traffic Generation System for
Interconnection Network Simulators". International Conference on Parallel Processing,
2005. 1st. Int. Workshop on Performance Evaluation of Networks for Parallel, Cluster
and Grid Computing Systems (PEN-PCGCS'05). ICPP 2005 Workshops. 14-17 June
2005 Page(s): 547 – 553
6. Jose M. Camara, Miquel Moreto, Enrique Vallejo, Ramon Beivide, Jose Miguel-Alonso,
Carmen Martinez, Javier Navaridas. "Mixed-radix Twisted Torus Interconnection
Networks". Proc. 21st IEEE International Parallel & Distributed Processing Symposium (IPDPS '07), Long Beach, CA, March 26-30, 2007.
7. F.J. Ridruejo, J. Navaridas, J. Miguel-Alonso, C. Izu. "Evaluation of Congestion Control
Mechanisms using Realistic Loads". Actas de las XVII Jornadas de Paralelismo. Albacete,
September 2006.
TIN2004-07440-C02
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
C. Izu, J. Miguel-Alonso, J.A. Gregorio. "Effects of injection pressure on network
throughput". Proc. PDP'06 - 14th Euromicro Conference on Parallel, Distributed and
Network-based Processing. Montbéliard-Sochaux, 15-17 Feb. 2006, France. Pp. 91-98
J. Miguel-Alonso, C. Izu, J.A. Gregorio. "Improving the Performance of Large
Interconnection Networks using Congestion-Control Mechanisms". Technical report
EHU-KAT-IK-06-05. Department of Computer Architecture and Technology, The
University of the Basque Country.
C. Izu, J. Miguel-Alonso, J.A. Gregorio. "Evaluation of Interconnection Network
Performance under Heavy Non-uniform Loads". Lecture Notes in Computer Science,
Volume 3719 / 2005 (Proc. ICA3PP 2005), Pages 396 - 405.
J. Miguel-Alonso, F.J. Ridruejo, C. Izu, J.A. Gregorio and V. Puente. "Are independent
traffic sources suitable for the evaluation of interconnection networks?". Technical report
EHU-KAT-IK-07-05. Department of Computer Architecture and Technology, The
University of the Basque Country.
F.J. Ridruejo, J. Navaridas, J. Miguel-Alonso and Cruz Izu. “Realistic Evaluation of
Interconnection Network Performance at High Loads". Submitted to EuroPar 2007.
E. Vallejo, R. Beivide and C. Martinez, «Practicable Lay-outs for Optimal Circulant
Graphs» The 13th Euromicro Conference on Parallel, Distributed and Network-Based
Processing (PDP'05)
C. Martínez, E. Vallejo, R. Beivide, C. Izu and M. Moretó, “Dense Gaussian Networks:
Suitable Topologies for On-Chip Multiprocessors”, International Journal of Parallel
Programming, Vol: 34, pp. 193-21, Junio 2006.
V. Puente, J.A. Gregorio, F. Vallejo, R. Beivide and C. Izu, “High-performance adaptive
routing for networks with arbitrary topology”, Journal on System Architecture, Vol 52,
Issue 6, June 2006, pp.345-358
V.Puente, J.A.Gregorio, F.Vallejo y R.Beivide, “Immunet: A Cheap and Robust FaultTolerant Packet Routing Mechanism”, International Symposium on Computer
Architecture (ISCA-2004).
V. Puente, J.A. Gregorio, "Immucube: Scalable Fault Tolerant Routing for k-ary n-cube
Networks", Aceptado para publicación en IEEE Transactions on Parallel and Distributed
Systems.
C.Izu, R.Beivide y J.A. Gregorio, “The Case of Chaotic Routing Revisited”, WDDD
Workshop. In conjunction with the International Symposium on Computer Architecture
(ISCA-2004)
C.Izu y R.Beivide, “Understanding Buffer Management for Cut-through 1D Rings”,
EURO-PAR 2004.
J. Miguel-Alonso, J.A. Gregorio, V. Puente, F. Vallejo, R. Beivide, P. Abad, “Desequilibrio
de Carga en Redes k-ary n-cube”, XV Jornadas de Paralelismo (2004).
J. Miguel-Alonso, J.A. Gregorio, V. Puente, F. Vallejo and R. Beivide, “Load unbalance in
k-ary n-cube Networks”, EURO-PAR 2004.
E. Vallejo, M. Galluzzi, A. Cristal, F. Vallejo, R. Beivide, P. Stenström, J.E. Smith and M.
Valero, “Implementing Kilo-Instruction Multiprocessors”, IEEE International
Conference on Pervasive Services 2005 (ICPS'05); Santorini, Greece.
E. Vallejo, F. Vallejo, R. Beivide, M. Galluzzi, A. Cristal, P. Stenström, J.E. Smith, M. Valero,
“KIMP: Multicheckpointing Multiprocessors”, XVII Jornadas de Paralelismo, 2005.
24. A. Mendiburu, J.A. Lozano, J. Miguel-Alonso. "Parallel implementation of EDAs based
on probabilistic graphical models". IEEE Transactions on Evolutionary Computation,
Aug. 2005 Vol. 9, N. 4, Pages 406 - 423.
25. A. Mendiburu, J. Miguel-Alonso and J.A. Lozano. "Implementation and performance
evaluation of a parallelization of Estimation of Bayesian Network Algorithms". Parallel
Processing Letters, Vol. 16, No. 1 (2006) 133-148.
26. A. Mendiburu, J. Miguel-Alonso, J.A. Lozano, "Evaluation of Parallel EDAs to Create
Chemical Calibration Model", e-science, p. 118, Second IEEE International Conference
on e-Science and Grid Computing (e-Science'06), 2006.
27. A. Mendiburu, J. Miguel-Alonso, J.A. Lozano, M. Ostra and C. Ubide, "Parallel EDAs to
create multivariate calibration models for quantitative chemical applications", Journal of
Parallel and Distributed Computing, Volume 66, Issue 8, Parallel Bioinspired Algorithms,
August 2006, Pages 1002-1013.
28. A. Mendiburu, J. Miguel-Alonso, J. A. Lozano, M. Ostra and C. Ubide. "Calibración
multivariante de reacciones químicas utilizando EDAs paralelos". Actas del IV Congreso
Español sobre Metaheurísticas, Algoritmos Evolutivos y Bioinspirados (MAEB'2005) II.
M.G. Arenas et al. (editors). Thomson, 2005 (ISBN 84-9732-467-6). Pages: 881 - 888.
29. A. Mendiburu, J. Miguel-Alonso, J. A. Lozano, M. Ostra and C. Ubide. "Parallel and
Multi-objective EDAs to create multivariate calibration models for quantitative chemical
applications". International Conference on Parallel Processing, 2005. 1st. Int. Workshop
on Parallel Bioinspired Algorithms. ICPP 2005 Workshops. 14-17 June 2005. Pages 596-603.
30. Alexander Mendiburu Alberro. “Parallel implementation of Estimation of Distribution
Algorithms based on probabilistic graphical models. Application to chemical calibration
models”. PhD thesis, Department of Computer Architecture and Technology,
UPV/EHU, January 2006.
31. Jose Antonio Pascual Saiz. Sistema de compraventa de recursos para Grid Computing.
Final-year project supervised by J. Miguel-Alonso. Facultad de Informática
UPV/EHU, 2006.
32. A. Gutierrez, F. Cavero and R. Menéndez de Llano. Parallelization of a
Neural Net Training Program in a Grid Environment. Proceedings of PDP 2004,
IEEE Computer Society. 2004.
33. Pablo Abad. Análisis y Evaluación de Sistemas de Colas en Entornos de Computación
de Alto Rendimiento. Final-year project supervised by Rafael Menéndez de Llano.
Universidad de Cantabria. 2004.
34. Alejandro Gutierrez. Implementación y Evaluación de la Paralelización de una
Aplicación de Física de Altas Energías Basada en Redes Neuronales Artificiales para la
Detección del Bosón de Higgs. Final-year project supervised by Rafael Menéndez
de Llano. Universidad de Cantabria. 2004.
35. David Hernández Sanz. Instalación, Análisis y Evaluación de un Cluster Openmosix
para Entornos de Alta Productividad y de Alto Rendimiento. Final-year project
supervised by Rafael Menéndez de Llano. Universidad de Cantabria. 2005.
36. Víctor Sancibrián Grijuela. Evaluación de Distintos Bancos de Pruebas para Diferentes
Arquitecturas de Supercomputadores. Final-year project supervised by Rafael
Menéndez de Llano. Universidad de Cantabria. 2005.
37. “Development of Grid Environment for Interactive Applications (CROSSGRID)”.
Subcontract of IST-2001-32243. CSIC / Universidad de Cantabria. Principal
investigator: Rafael Menéndez de Llano Rozas. 2004.
38. C. Martínez, R. Beivide, E. Stafford, M. Moretó and E. Gabidulin. “Modeling Toroidal
Networks with the Gaussian Integers”. Submitted to IEEE Transactions on Computers.
2007.
39. C. Martínez, R. Beivide and E. Gabidulin. “Perfect Codes from Circulant Graphs”.
Accepted for publication, subject to revisions. IEEE Transactions on Information Theory.
2007.
40. C.Martínez, R.Beivide, J. Gutierrez and E.Gabidulin, “On the Perfect t-Dominating Set
Problem in Circulant Graphs and Codes over Gaussian Integers”, 2005 IEEE
International Symposium on Information Theory (ISIT 2005); Adelaide, Australia.
41. E.Gabidulin, C.Martínez, R.Beivide and J.Gutierrez, “On the Weight Distribution of
Gaussian Graphs with an Application to Coding Theory”, Eighth International
Symposium on Communication Theory and Applications (ISCTA ’05); Ambleside, Lake
District, UK.
42. C. Martínez, M. Moretó, R. Beivide and E. Gabidulin, “A Generalization of Perfect Lee
Codes over Gaussian Integers”, 2006 IEEE International Symposium on Information
Theory (ISIT'06).
43. C. Martínez, E. Stafford, R. Beivide and E. Gabidulin, “Perfect Codes over
Eisenstein-Jacobi Graphs”, Proc. of Tenth Int. Workshop on Algebraic and
Combinatorial Coding Theory (ACCT-10), 2006.
44. C. Martínez, E. Stafford, R. Beivide and E. Gabidulin. “Perfect Codes over Lipschitz
Integers”. Submitted to 2007 IEEE International Symposium on Information Theory.
45. V. Puente, J.A. Gregorio, F. Vallejo and R. Beivide, “Dependable Routing in Interconnection
Networks of Flexible Topology”, Submitted to IEEE Transactions on Computers. 2007.
46. P. Abad, V. Puente, P. Prieto and J.A. Gregorio, “Rotary Router: An Efficient
Architecture for CMP Interconnection Networks”, Submitted to ISCA 2007, San Diego,
California, USA, 2007.
47. D. Gomez-Perez, J. Gutierrez, A. Ibeas, C. Martinez, “On routing on circulant graphs of
degree four”, International Symposium on Symbolic and Algebraic Computation 2005.
48. D. Gómez, J. Gutierrez, A. Ibeas, C. Martínez and R. Beivide, “On Finding a Shortest Path
in Circulant Graphs with Two Jumps”, The Eleventh International Computing and
Combinatorics Conference (COCOON 2005); Kunming, Yunnan, P.R. China.
49. D. Hernández, R. Menéndez de Llano, P. Abad, C. Izu. “Evaluación de un cluster
OpenMosix para entornos de alta productividad y de alto rendimiento”, XVI Jornadas de
Paralelismo, 2005.
50. C. Martínez. “Codes and Graphs over Complex Integer Rings”. PhD Dissertation. To be
defended on 12 March 2007.