AAAI-08 Tutorial on Computational Workflows for Large-Scale Artificial Intelligence Research
Part VII: Future Challenges in Computational Workflows and Opportunities for AI Research

Yolanda Gil ([email protected])
USC Information Sciences Institute
AAAI-08 Tutorial, July 13, 2008

Scientific Collaborations: Publications
[from Science, April 2005]

Sharing Data Collection: LIGO (ligo.caltech.edu)

Sharing Computing Resources

Ongoing Research

Workflow Lifecycle [Deelman and Gil 06]
[Figure: workflow lifecycle diagram. Artifacts: Workflow Template, Workflow Instance, Executable Workflow, Data Products. Transitions: Populate with data; Map to available resources; Execute; Adapt, Modify. Supporting elements: Workflow and Component Libraries; Data, Metadata Catalogs; Resource, Application Component Descriptions; Compute, Storage and Network Resources; Data, Metadata, Provenance Information.]

Workflow Creation
Workflow completion
  • Workflows as components of other workflows
Automatic workflow assembly from libraries of components
  • [McDermott 02] [McIlraith & Son 03] [Blythe et al 04] …
Interleaving workflow composition and execution
  • Automatically add data conversion and formatting components [Gil et al 07]
“Science of design” for computational workflows as software artifacts
  • [Deelman & Gil 07] [Gil et al 07] [Gil 08]

Workflow Catalogs
Workflow description and formal representation
Workflow discovery
  • [Goderis et al 06] [Goderis et al 07]
Query-based workflow matching
  • [Goble et al 06]
Workflow reuse and repurposing
  • W3C semantic workflow language activity [Horrocks and Li 02] [Baader 01]
Workflow sharing
  • [DeRoure & Goble 07]

Workflow Learning
1) From a user’s demonstration of service invocations [Burstein et al 08] [Kim & Gil 08]
2) From tutorial instruction [Groth & Gil 08]
3) Generalizing from examples
(from [Burstein et al 08])

Five Opportunities for Future Research

1) Reduce Setup Cost -> Workflow as First Class Citizen in Scientific Research
Today: Workflow design and implementation is costly
  • Developed through collaboration
    – Application scientists in several areas, software engineers, distributed systems experts, etc.
  • Developed over many months
    – Must adapt existing code, must create “glue” code
  • Validated and refined over time
Goal: Must be done by scientists themselves at minimal cost:
  • To create them
  • To understand them
  • To learn to use them for research
  • To adapt them for another purpose or analysis variant
  • To refine/update them over time

2) Workflow Centered User Interaction
  • Workflow template as selected method
  • User visibility into the data analysis process
  • User steering during execution based on results
  • Interleaving generation and execution (data-driven adaptation)
  • Recording provenance
  • Automation of non-experiment critical, routine tasks

3) Workflows for Cross-Disciplinary Analyses -> Enable Integrative Science
Today: Workflow systems can generate detailed provenance and metadata for new data products
  • Describe individual datasets so they can be used by others
  • Reuse of new data products by other systems is currently rare
    – Reuse is common within systems/communities
Goal: Workflows generating data that is used across disciplines
  • Meaningful reuse of data products (results) by other workflows
  • True test of the utility of provenance and metadata information

4) Using Workflows for Educating New (and Old!) Scientists
Today: Scientific analyses are less and less accessible to newcomers
  • Steep learning curve that includes a variety of areas of expertise
    – Application science(s), modeling, software engineering, distributed computing, etc.
Goal: Workflow systems could be configured to enable learning of additional capabilities on-demand
  • Could isolate less proficient users from advanced capabilities while enabling them to learn and practice what they learn
  • Everyone should be able to contribute as they learn

5) Workflows as Efficient Instruments of Systematic Exploration and Discovery
Today: Workflows manually selected by user
  • User decides what data/analysis to conduct
  • Not a systematic exploration of space
  • Visualization is only one way to understand results
  • Human is bottleneck, current practice will not scale
Goal: Workflows conduct automated heuristic discovery and pattern detection
  • Automate systematic exploration of all possible workflows
  • Formulate heuristics for scientific discovery: recurring domain-independent data analysis patterns [Simon 82]
  • Search for patterns (or pattern types)
  • Workflows could include pattern detection and discovery components

Cyberinfrastructure: Not Just Big Iron
“The Federal government must rebalance R&D investments to:
  • Create a new generation of well-engineered, scalable, easy-to-use software suitable for computational science that can reduce the complexity and time to solution for today’s challenging scientific applications and can create accurate models and simulations that answer new questions
  • Design, prototype, and evaluate new hardware architectures that can deliver larger fractions of peak hardware performance on key applications
  • Focus on sensor- and data-intensive computational science applications in light of the explosive growth of data”
President’s Information Technology Advisory Committee (PITAC) report on “Computational Science: Ensuring America’s Competitiveness”, May 2005

Tomorrow’s Cyberinfrastructure Layers Enabled by Knowledge-Rich Workflow Systems [Gil 08]
[Figure: cyberinfrastructure layers diagram. Labels: Portals; Interfaces; Workflow-Centered Portals; Heuristic Discovery; Data Services; Workflow Sharing; Application Tools; Workflow Systems; Resource Sharing; Resource Access.]

“As We May Think”
“Wholly new forms of encyclopedias will appear, ready made with a mesh of associative trails running through them […]. The lawyer has at his touch the associated opinions and decisions of his whole experience, and of the experience of friends and authorities. The patent attorney has on call the millions of issued patents, with familiar trails to every point of his client's interest. […] The chemist, struggling with the synthesis of an organic compound, has all the chemical literature before him in his laboratory, with trails following the analogies of compounds, and side trails to their physical and chemical behavior. […] There is a new profession of trail blazers, those who find delight in the task of establishing useful trails through the enormous mass of the common record.
The inheritance from the master becomes, not only his additions to the world's record, but for his disciples the entire scaffolding by which [their additions] were erected.”
--- Vannevar Bush, 1945
http://www.theatlantic.com/unbound/flashbks/computer/bushf.htm
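
Illustrative sketch: The Workflow Lifecycle slide [Deelman and Gil 06] names three workflow forms (template, instance, executable) and the transitions between them (populate with data, map to available resources, execute). The Python below is one minimal way to picture that progression. It is a sketch under stated assumptions, not the design of any workflow system covered in the tutorial: the class names, the round-robin resource mapping, the provenance record layout, and the toy components are all hypothetical, and steps are assumed to be listed in dependency order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str      # hypothetical component name from a component library
    inputs: tuple  # names of the data variables this step consumes
    output: str    # name of the data variable this step produces


@dataclass
class WorkflowTemplate:
    steps: list  # assumed to be listed in dependency order

    def populate(self, bindings):
        """Template -> instance: bind every external input to a dataset."""
        produced = {s.output for s in self.steps}
        needed = {v for s in self.steps for v in s.inputs} - produced
        missing = needed - set(bindings)
        if missing:
            raise ValueError(f"unbound inputs: {sorted(missing)}")
        return WorkflowInstance(self.steps, dict(bindings))


@dataclass
class WorkflowInstance:
    steps: list
    bindings: dict

    def map_to_resources(self, resources):
        """Instance -> executable: assign each step a resource.

        Round-robin is an assumption made for illustration only.
        """
        assignment = {
            s.name: resources[i % len(resources)]
            for i, s in enumerate(self.steps)
        }
        return ExecutableWorkflow(self.steps, self.bindings, assignment)


@dataclass
class ExecutableWorkflow:
    steps: list
    bindings: dict
    assignment: dict

    def execute(self, components):
        """Run steps in order; return data products and a provenance log."""
        data = dict(self.bindings)
        provenance = []
        for s in self.steps:
            args = [data[v] for v in s.inputs]
            data[s.output] = components[s.name](*args)
            provenance.append({
                "step": s.name,
                "resource": self.assignment[s.name],
                "inputs": list(s.inputs),
                "output": s.output,
            })
        return data, provenance


# Toy usage: a two-step analysis over a small dataset.
template = WorkflowTemplate([
    Step("clean", ("raw",), "cleaned"),
    Step("summarize", ("cleaned",), "summary"),
])
components = {
    "clean": lambda xs: [x for x in xs if x is not None],
    "summarize": lambda xs: sum(xs) / len(xs),
}
instance = template.populate({"raw": [1.0, None, 3.0]})
executable = instance.map_to_resources(["node-a", "node-b"])
data, provenance = executable.execute(components)
```

Keeping the three forms as separate types mirrors the slide's point that the same template can be populated with different data and mapped to different resources, while the provenance log records what was actually run and where.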