P-SAM: A Post-Simulation Analysis Module for Agent-Based Models

S. M. Niaz Arifin¹, Ryan C. Kennedy¹, Kelly E. Lane², Agustín Fuentes³, Hope Hollocher², Gregory R. Madey¹
¹Department of Computer Science and Engineering, ²Department of Biological Sciences, ³Department of Anthropology
University of Notre Dame

Agent-based models (ABMs) can produce large volumes of textual output, which often contain inherent logical structures that can be naturally expressed in terms of abstract mathematical notions such as graphs and relations. It is crucial to analyze this voluminous textual output effectively and to produce the desired visualization. Analysis and visualization also play important roles in the verification and validation (V&V) of ABMs. P-SAM (Post-Simulation Analysis Module) is designed to analyze and visualize the post-simulation output of ABMs, with special emphasis on biological simulation models. As a case study, P-SAM is applied to a biological simulation model named LiNK [1]. LiNK analyzes the spread of pathogens among macaque monkeys on the Indonesian island of Bali. Results indicate the importance of using P-SAM to perform V&V of LiNK by allowing internal validity checking and tracing of the model entities.

Design

[Figure: P-SAM architecture. LiNK → LiNK Output File → Writer (Analysis & Serialization) → Infection Data (DOT File), Roaming Infection Data (DOT File), Summary Data (Text File) → Reader (Visualization) → Graphical User Interface]

• The core P-SAM architecture consists of two Perl programs: the writer and the reader
• The writer analyzes and serializes the simulation output: it takes the LiNK file as its input and serializes it into three files (Infection Data, Roaming Infection Data, and Summary Data), using the DOT graph-description language for hierarchical drawings of directed graphs
• The reader builds the interactive visualization structures: it reads in the serialized DOT files and projects the information into the GUI

We utilize the following Perl modules [2]:
• Tk/Tcl: to access the Tk library and build the GUI
• Graph: to create graph data structures and use the corresponding graph algorithms
• GraphViz: to visualize the graphs

Analysis and Visualization

Infection Statistics
• Allows interactive probing: user can interact
• An infection event occurs when a macaque transmits the pathogen to another macaque inside a temple, allowing self-transmission
• Shows all initially infected macaques, macaques that did not infect any other, and all macaques that took part in infection events, with details (timestep, infected macaque, and temple) of each infection event

Roaming Infection Statistics
• Allows interactive probing
• A roaming infection event occurs when a macaque transmits the pathogen to another macaque outside a temple
• Each event is accompanied by location information: the latitude, the longitude, and the landscape in which the infection took place, such as City, Forest, Rice Field, River, Road, and Coast

Birth and Death Statistics
• In a birth event, a mother gives birth to an infant
• A death event records the time, place, and cause of death (aging, dispersal, or pathogen)
• Lists all birth events with the corresponding timesteps, and all death events with the corresponding timesteps, causes of death, and temple locations

Summary Statistics
• Input/Output Statistics: original parameters from the LiNK file (acquired immunity, virulence, infectivity, etc.)
• Line Count and Event Statistics: counts for initially infected macaques, infections, roaming infections, the total number of infected macaques, etc.
• Temple Statistics: number of unique infections occurring at each temple
• Runtime and Memory Statistics: memory and runtime consumed by P-SAM

Pathogen Transmission Graphs
• Visually tracks the pathogen transmission record
• Particularly helpful in V&V
• Nodes represent macaques, edges represent events
• Node 27.2969.0: a female macaque with temple 27 as its natal temple and 2969 as its id
• Events are listed with the timestep and location where the infection occurred
• Macaque 27.2969.0 infected macaque 27.2775.0 at timestep 1, in temple 27, and so on
• Autoinfection: macaques 27.2863.0 and 27.2805.1

More information, including some results about LiNK and P-SAM, can be found in [3].

Future Work
• Performance: We are working in collaboration with CRC [4] to improve P-SAM runtime. Short-term plans include preprocessing of input, code profiling, and code optimization
• Generalization: We envision P-SAM being useful to other types of biological ABMs that produce large volumes of textual output, which may include different types of agents (e.g. humans, mosquitoes, monkeys)

References
1. Kennedy, R.C., et al., “A GIS Aware Agent-Based Model of Pathogen Transmission,” International Journal of Intelligent Control and Systems, 14(1): 51-61, March 2009.
2. CPAN: http://www.cpan.org
3. LiNK: http://www.nd.edu/~macaque/

Acknowledgements
4. Paul Brenner, CRC

P-SAM: A Post-Simulation Analysis Module for Agent-Based Models

S. M. Niaz Arifin∗, Ryan C. Kennedy†, and Gregory R. Madey‡
Department of Computer Science and Engineering, University of Notre Dame

Abstract

Agent-based models (ABMs) can produce large volumes of textual output, potentially in the range of hundreds of gigabytes. In most cases, this output contains inherent logical structures that can be naturally expressed in terms of abstract mathematical notions such as graphs and relations.
It is crucial to be able to analyze this voluminous textual output effectively and to produce the desired visualization with ease. Appropriate analysis and visualization also play important roles in the verification and validation (V&V) of ABMs. We have developed a software module, called P-SAM (Post-Simulation Analysis Module), to analyze and visualize the post-simulation output of ABMs, with special emphasis on biological simulation models. P-SAM differs from conventional statistical software tools by emphasizing the visualization that arises from the interaction between abstract entities present in the textual output, with the goal of automating post-simulation analysis tasks for ABMs. P-SAM, though still in an embryonic form, has been designed to handle large data files distributed over a high-performance network. To achieve this, it must be able to take full advantage of the network’s computing system, data storage system, data repositories, and visualization environments. As a case study, this poster describes the application of P-SAM to a biological simulation model named ‘LiNK’ that analyzes the spread of pathogens among long-tailed macaque monkeys on the Indonesian island of Bali. Reported results indicate the importance of using P-SAM to perform V&V of the LiNK model.¹

The core P-SAM architecture consists of two programs (written in Perl) called the writer and the reader. The writer takes the LiNK file as its input and, after analysis, serializes it into separate files, some of which are written in the DOT format, which allows hierarchical drawings of directed graphs. Once analysis and serialization are complete, the reader enables visualization by building the Graphical User Interface (GUI). It reads in the serialized DOT files and projects the information into the GUI. P-SAM works on relations involving relevant entities (e.g. agents) defined by the user.
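To make the writer's serialization step concrete, here is a minimal sketch in Python (P-SAM itself is written in Perl, and this is not its code). It shows how infection events could be turned into a DOT directed graph with one labeled edge per event, and how autoinfections appear as self-loops. The `InfectionEvent` record, its field names, and both function names are assumptions made for illustration; the actual LiNK output format is not described here. The first sample event is the one given on the poster; the timestep and temple of the second are made up.

```python
from typing import NamedTuple


class InfectionEvent(NamedTuple):
    """One infection event. Field names are illustrative, not LiNK's."""
    source: str    # infecting macaque, e.g. "27.2969.0"
    target: str    # infected macaque
    timestep: int
    temple: int


def events_to_dot(events, graph_name="infections"):
    """Serialize infection events as a DOT digraph, one edge per event."""
    lines = [f"digraph {graph_name} {{"]
    for ev in events:
        label = f"t={ev.timestep}, temple {ev.temple}"
        # Macaque ids contain dots, so they are quoted to be valid DOT ids.
        lines.append(f'    "{ev.source}" -> "{ev.target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


def autoinfections(events):
    """Return the events in which a macaque infects itself (a self-loop)."""
    return [ev for ev in events if ev.source == ev.target]


events = [
    # From the poster: 27.2969.0 infected 27.2775.0 at timestep 1 in temple 27.
    InfectionEvent("27.2969.0", "27.2775.0", 1, 27),
    # Hypothetical autoinfection; timestep and temple are made up.
    InfectionEvent("27.2863.0", "27.2863.0", 2, 27),
]
dot_text = events_to_dot(events)
print(dot_text)
print("autoinfections:", [ev.source for ev in autoinfections(events)])
```

The resulting text can be rendered by any Graphviz front end; keeping the graph as plain DOT text is what lets a separate reader program load it later without re-running the analysis.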
The visualization process enables the user to visually analyze Infection Statistics, Roaming Infection Statistics, Birth and Death Statistics, Pathogen Transmission Graphs, and Summary Statistics.

P-SAM still faces the challenges of efficiently processing gigabytes of data and minimizing the response time for large data sets run over the campus network. In collaboration with the Center for Research Computing (CRC) at the University of Notre Dame, we plan the following future improvements:

• Preprocessing: decomposing the large LiNK file into smaller files grouped by event type, which are then sorted and analyzed
• Profiling: using profilers (Devel::Profile and Devel::NYTProf) to measure the frequency and duration of function calls and to collect performance data
• Code Optimization: applying optimization techniques once profiling has identified the hotspots

Once the above are implemented within the cyberinfrastructure setting offered by CRC, we expect P-SAM to be able to perform more complex, multi-run analyses involving large data sets with improved runtime, and to efficiently handle the storage, management, integration, and visualization of data produced by the LiNK model. All of these, linked by the CRC network, would create opportunities for scholarly innovation and discovery by LiNK users. This, in turn, would allow the biologists to further validate and refine their model. We also envision P-SAM being useful to other types of ABMs.

∗ [email protected]
† [email protected]
‡ [email protected]
¹ See http://www.nd.edu/~macaque/ for more information
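The preprocessing step planned above (decomposing one large output file into smaller per-event files) could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the planned Perl implementation: it assumes, hypothetically, that each line of the output file begins with a token naming its event type, and the function name, file names, and sample records are all invented for the example. The file is streamed line by line so that it never has to fit in memory.

```python
import os
import tempfile


def split_by_event(in_path, out_dir):
    """Stream in_path once, writing each line to a per-event file in out_dir.

    Assumption (hypothetical): the first whitespace-separated token of each
    line names the event type, e.g. "INFECTION", "BIRTH", "DEATH".
    Returns a dict mapping event type to the number of lines written.
    """
    handles = {}
    counts = {}
    try:
        with open(in_path) as src:
            for line in src:
                if not line.strip():
                    continue  # skip blank lines
                event = line.split(None, 1)[0]
                if event not in handles:
                    path = os.path.join(out_dir, f"{event}.txt")
                    handles[event] = open(path, "w")
                handles[event].write(line)
                counts[event] = counts.get(event, 0) + 1
    finally:
        for fh in handles.values():
            fh.close()
    return counts


# Usage with a tiny made-up log; the record layout is illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    log = os.path.join(tmp, "link_output.txt")
    with open(log, "w") as fh:
        fh.write("INFECTION 1 27.2969.0 27.2775.0 27\n"
                 "BIRTH 2 27.2969.0\n"
                 "INFECTION 3 27.2775.0 27.2805.1 27\n")
    counts = split_by_event(log, tmp)
    print(counts)
```

Each per-event file can then be sorted and analyzed on its own, which is also a natural unit for distributing work across machines.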