Cloudbus Workflow Engine
Contents
- Introduction
- Workflow Engine Documentation
- Software Download and License
- Compatibility
- Bug Reports and Feedback
- Credits
- Links
Introduction
Cloudbus Workflow Engine (WFE) lets users link standalone applications and execute the resulting workflows on distributed environments such as clusters, grids, and clouds. The WFE provides an XML-based workflow language in which users define tasks and their dependencies. It supports several scheduling algorithms as plug-in components, so resource allocation decisions can be made with different techniques, in both online and offline mode. It also uses a tuple spaces approach to enable an event-driven scheduling architecture that simplifies workflow execution. The latest version of the WFE has been integrated into the Cloudbus Broker and supports multiple computing platforms.
The Cloudbus Workflow Engine is a significant extension of the Gridbus Workflow Engine (GWFE), which was initially developed under the Gridbus Project at the Grid Computing and Distributed Systems (GRIDS) Lab, Dept. of Computer Science and Software Engineering, the University of Melbourne, Australia. The project is partially supported by an Australian Research Council Discovery Project grant, Storage Technology, and the University of Melbourne.
Components of the Workflow Engine
Users interact with the Workflow Management System (WMS) through a web portal. The following key components help users execute and manage scientific applications.
Grid Portal: The primary user interface for any application is a Web Portal that encompasses the following functionalities:
1. A workflow editor, which enables users to compose new workflows and modify existing ones.
2. A submission page, through which users can upload all input files needed to run a workflow, including the workflow description file, credentials, and services files.
3. A monitoring and output visualization page, which allows users to monitor multiple workflow executions in progress. The most common monitoring activity is keeping track of the status of each task through the workflow monitor, which provides a graphical representation of workflow tasks that is updated in real time. The application's output is presented in the form of images where applicable.
4. A resource information page, which shows the characteristics of all available computing resources.
5. An application-specific page, which in the current implementation generates application workflow description files from application-specific parameters supplied as input.
Workflow Editor: The workflow editor provides a Graphical User Interface (GUI) with drag-and-drop facilities that allows users to create new workflows and modify existing ones, based on an XML-based workflow language (xWFL). Using the editor, expert users can design workflows for complex scientific procedures according to the workflow language and the nature of the application, whereas novice users can reuse these workflows with minor changes. In the editor, workflows are created graphically as a Directed Acyclic Graph (DAG) of nodes and links. A node represents the computational activities of a particular task in the workflow, and a link specifies the data flow between two tasks.
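The node-and-link structure described above can be illustrated with a minimal Python sketch. It is not part of the WFE: the task names are hypothetical, and the standard-library `graphlib` module (Python 3.9 or later) stands in for whatever ordering logic the engine actually uses.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a task (a node), and each value is the
# set of tasks whose output it consumes (its incoming links in the DAG).
workflow = {
    "align": set(),
    "filter": {"align"},
    "stats": {"align"},
    "report": {"filter", "stats"},
}


def execution_order(dag):
    """Return one valid order in which the tasks can run.

    Raises graphlib.CycleError if the links form a cycle, that is,
    if the graph is not a valid DAG.
    """
    return list(TopologicalSorter(dag).static_order())


order = execution_order(workflow)
print(order)
```

Any order returned this way runs a task only after all tasks it depends on, which is the property a DAG-based workflow relies on.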
Workflow Monitor: The Workflow Monitor provides a GUI for viewing the status of each task in the workflow. Users can easily view the ready, executing, stage-in, and completed tasks. Task status is represented using different colors. Users can also view the execution site of each task, the number of tasks being executed (in the case of a parameter sweep application), and the failure history of each task. The displayed workflow structure is editable, so users can drag tasks and group or separate tasks of interest when the workflow contains many tasks. The monitor interacts with the workflow engine (WFE) through an event mechanism built on the tuple space model. In the backend, a database server stores the state of each task for each application workflow. Whenever a task changes state, the monitoring interface is notified and the status values are stored. This enables multiple users to access the monitoring interface from different locations. The monitoring interface does not support deleting or inserting individual tasks at run time. However, users can add and delete tasks at construction time using the workflow editor.
Workflow Engine: Scientific application portals submit task definitions, along with their dependencies, to the WFE in the form of the workflow language. The WFE then schedules the tasks in the workflow application through the middleware services and manages their execution on distributed resources. The key components of the WFE are: workflow submission, workflow language parser, resource discovery, dispatcher, data movement, and workflow scheduler.
The WFE is designed to support an XML-based WorkFlow Language (xWFL). This facilitates user-level planning at submission time. The workflow language parser converts the workflow description from XML into objects such as Tasks, Parameters, Data Constraints (workflow dependencies), and Conditions that the workflow scheduler can access. The resource discovery component of the WFE can query information services such as Globus MDS, directory services, and replica catalogues to locate suitable resources for executing the tasks in the workflow, coordinating with middleware technologies such as the Cloudbus Broker. The WFE uses the Cloudbus Broker as its dispatcher component to deploy and manage task execution on various middleware. As middleware, the Cloudbus Broker mediates access to distributed resources by (a) discovering resources, (b) deploying and monitoring task execution on selected resources, (c) accessing data from local or remote data sources during task execution, and (d) collating and presenting results.
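As a rough illustration of the parsing step, the Python sketch below turns a small XML description into task records with parameters and dependencies. The element and attribute names (`workflow`, `task`, `param`, `link`, `from`, `to`) are invented for this example and are not the real xWFL schema, and the `parse_workflow` function is hypothetical rather than the WFE parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow description. The element and attribute names are
# assumptions made for illustration, not the actual xWFL schema.
WORKFLOW_XML = """
<workflow name="demo">
  <task id="A" executable="preprocess.sh"/>
  <task id="B" executable="analyse.sh">
    <param name="threshold" value="0.5"/>
  </task>
  <link from="A" to="B"/>
</workflow>
"""


def parse_workflow(xml_text):
    """Convert the XML text into a dict of task records keyed by task id."""
    root = ET.fromstring(xml_text)
    tasks = {}
    for node in root.findall("task"):
        tasks[node.get("id")] = {
            "executable": node.get("executable"),
            "params": {p.get("name"): p.get("value")
                       for p in node.findall("param")},
            "depends_on": [],
        }
    # Each link records a data dependency: "to" consumes the output of "from".
    for link in root.findall("link"):
        tasks[link.get("to")]["depends_on"].append(link.get("from"))
    return tasks


tasks = parse_workflow(WORKFLOW_XML)
print(tasks["B"])
```

A scheduler could then read `depends_on` to decide which tasks are ready, without ever touching the XML itself.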
The WFE is designed to be loosely coupled and flexible, using a tuple spaces model, an event-driven mechanism, and a subscription/notification approach in the workflow scheduler, which is managed by the workflow coordinator component. The data movement component of the WFE transfers data between distributed resources using the SFTP and GridFTP protocols. The workflow executor is the central component of the WFE. With help from the dispatcher component, it interacts with the resource discovery component to find suitable compute resources at run time (depending on the scheduling algorithm used), submits tasks to resources, and controls input data transfer between task execution nodes.
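The subscription/notification idea can be sketched in a few lines of Python. This is only an in-memory stand-in under stated assumptions: the `TupleSpace` and `Scheduler` classes, the status strings, and the task names are all hypothetical, and the code does not use or imitate the IBM TSpaces API.

```python
from collections import defaultdict


class TupleSpace:
    """Tiny in-memory stand-in for a tuple space with notifications."""

    def __init__(self):
        self.tuples = []                      # every (task_id, status) written
        self.subscribers = defaultdict(list)  # status -> list of callbacks

    def subscribe(self, status, callback):
        self.subscribers[status].append(callback)

    def write(self, task_id, status):
        self.tuples.append((task_id, status))
        for callback in list(self.subscribers[status]):
            callback(task_id)


class Scheduler:
    """Marks a task ready once all of its parent tasks have completed."""

    def __init__(self, space, parents):
        self.space = space
        self.parents = parents  # task -> set of parent tasks
        self.done = set()
        space.subscribe("completed", self.on_completed)

    def start(self):
        for task, deps in self.parents.items():
            if not deps:
                self.space.write(task, "ready")

    def on_completed(self, task_id):
        self.done.add(task_id)
        for task, deps in self.parents.items():
            if task_id in deps and deps <= self.done:
                self.space.write(task, "ready")


space = TupleSpace()
parents = {"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}}
scheduler = Scheduler(space, parents)
# Simulated executor: "runs" a ready task by reporting it completed at once.
space.subscribe("ready", lambda task: space.write(task, "completed"))
scheduler.start()
completed = [task for task, status in space.tuples if status == "completed"]
print(completed)
```

The scheduler and the simulated executor never call each other directly. Each one reacts only to tuples written into the shared space, which is what keeps the components loosely coupled.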
Gridbus Workflow Engine Documentation
For architecture, API, and installation documentation, refer to the following sources:
- Jia Yu and Rajkumar Buyya, A Novel Architecture for Realizing Grid Workflow using Tuple Spaces, Proceedings of the 5th IEEE/ACM International Workshop on Grid Computing (GRID 2004, Nov. 8, 2004, Pittsburgh, USA), IEEE Computer Society Press, Los Alamitos, CA, USA.
- Jia Yu, Rajkumar Buyya and Chen-Khong Tham, Cost-based Scheduling of Workflow Applications on Utility Grids, Proceedings of the 1st IEEE International Conference on e-Science and Grid Computing (e-Science 2005, IEEE CS Press, Los Alamitos, CA, USA), Dec. 5-8, 2005, Melbourne, Australia.
- Tianchi Ma and Rajkumar Buyya, Critical-Path and Priority based Algorithms for Scheduling Workflows with Parameter Sweep Tasks on Global Grids, Proceedings of the 17th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2005), IEEE CS Press, Los Alamitos, CA, USA, Oct. 24-27, 2005, Rio de Janeiro, Brazil.
- Jia Yu and Rajkumar Buyya, A Taxonomy of Scientific Workflow Systems for Grid Computing, Special Issue on Scientific Workflows, SIGMOD Record, ACM Press, Volume 34, Number 3, Sept. 2005.
- Jia Yu and Rajkumar Buyya, Taxonomy of Workflow Management Systems for Grid Computing, Journal of Grid Computing, Volume 3, Numbers 3-4, Pages: 171-200, Springer Science+Business Media B.V., New York, USA, Sept. 2005.
- Jia Yu and Rajkumar Buyya, A Budget Constrained Scheduling of Workflow Applications on Utility Grids using Genetic Algorithms, Workshop on Workflows in Support of Large-Scale Science, Proceedings of the 15th IEEE International Symposium on High Performance Distributed Computing (HPDC 2006), June 19-23, 2006, Paris, France.
- Jia Yu and Rajkumar Buyya, Scheduling Scientific Workflow Applications with Deadline and Budget Constraints using Genetic Algorithms, Scientific Programming Journal, Volume 14, Issue 3-4, Pages: 217 - 230, ISSN: 1058-9244, IOS Press, Amsterdam, The Netherlands, Nov 2006.
- Jia Yu, Michael Kirley, and Rajkumar Buyya, Multi-objective Planning for Workflow Execution on Grids, Proceedings of the 8th IEEE/ACM International Conference on Grid Computing (Grid 2007, IEEE CS Press, Los Alamitos, CA, USA), Sept. 19-21, 2007, Austin, Texas, USA.
- Mustafizur Rahman, Srikumar Venugopal, and Rajkumar Buyya, A Dynamic Critical Path Algorithm for Scheduling Scientific Workflow Applications on Global Grids, Proceedings of the 3rd IEEE International Conference on e-Science and Grid Computing (e-Science 2007, IEEE CS Press, Los Alamitos, CA, USA), Dec. 10-13, 2007, Bangalore, India.
- Rajiv Ranjan, Mustafizur Rahman, and Rajkumar Buyya, A Decentralized and Cooperative Workflow Scheduling Algorithm, Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008, IEEE CS Press, Los Alamitos, CA, USA), May 19-22, 2008, Lyon, France.
- Suraj Pandey and Rajkumar Buyya, Scheduling of Scientific Workflows on Data Grids, TCSC Doctoral Symposium, Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008, IEEE CS Press, Los Alamitos, CA, USA), May 19-22, 2008, Lyon, France.
- Mustafizur Rahman and Rajkumar Buyya, An Autonomic Workflow Management System for Global Grids, TCSC Doctoral Symposium, Proceedings of the 8th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2008, IEEE CS Press, Los Alamitos, CA, USA), May 19-22, 2008, Lyon, France.
Cloudbus Workflow Engine related Documentation
- Suraj Pandey, Letizia Sammut, Rodrigo N. Calheiros, Andrew Melatos, and Rajkumar Buyya, Scalable Deployment of a LIGO Physics Application on Public Clouds: Workflow Engine and Resource Provisioning Techniques, Cloud Computing for Data-Intensive Applications, 3-25pp, Li, Xiaolin, Qiu, Judy (Eds.), ISBN: 978-1-4939-1904-8, Springer, Berlin, Germany, 2014.
- Suraj Pandey, William Voorsluys, Mustafizur Rahman, Rajkumar Buyya, James Dobson, Kenneth Chiu, A Grid Workflow Environment for Brain Imaging Analysis on Distributed Systems. Concurrency and Computation: Practice and Experience, Volume 21, Number 16, Pages: 2118-2139, ISSN: 1532-0626, Wiley Press, New York, USA, November 2009.
- Suraj Pandey, Linlin Wu, Siddeswara Guru, and Rajkumar Buyya, A Particle Swarm Optimization (PSO)-based Heuristic for Scheduling Workflow Applications in Cloud Computing Environments, Proceedings of the 24th IEEE International Conference on Advanced Information Networking and Applications (AINA 2010), Perth, Australia, April 20-23, 2010. - Best Paper Award.
- Suraj Pandey, Kapil Kumar Gupta, Adam Barker and Rajkumar Buyya, Minimizing Execution Cost when using Globally Distributed Cloud Services, Proceedings of the 24th IEEE International Conference on Advanced Information Networking and Applications (AINA 2010), Perth, Australia, April 20-23, 2010.
- Suraj Pandey, William Voorsluys, Mustafizur Rahman, Rajkumar Buyya, James Dobson, Kenneth Chiu, Brain Image Registration Analysis Workflow for fMRI Studies on Global Grids. In Proceedings of the 23rd IEEE International Conference on Advanced Information Networking and Applications (AINA-09), Bradford, UK, May 2009.
- Suraj Pandey, Dileban Karunamoorthy and Rajkumar Buyya, Workflow Engine for Clouds, Cloud Computing: Principles and Paradigms, R. Buyya, J. Broberg, A.Goscinski (eds), ISBN-13: 978-0470887998, Wiley Press, New York, USA, 2010. (in press, accepted on Dec. 10, 2009).
- Suraj Pandey, Rajkumar Buyya. Scheduling and Management Techniques for Data-Intensive Application Workflows, Data Intensive Distributed Computing: Challenges and Solutions for Large-scale Information Management, T. Kosar (ed), IGI Global, USA, 2009. (in press, accepted on July 24, 2009).
Software and License
The WFE software and documents are released as "open source" under the GPL license. Copyright The Cloudbus Project, CLOUDS Lab, The University of Melbourne, 2011.
LATEST CODE: WorkFlow Engine code (released in 2016)
Compatibility
For compatibility, install the following software:
- Java 2 SDK version 1.4
- MySQL version 3.23.49
- IBM TSpaces version 3
- CoG 1.1
- Globus 2.4
- Gridbus Broker 2.4.3
Bug Reports and Feedback
For WFE bug reports and feedback, please email raj@csse.unimelb.edu.au.
Credits
This release is developed by:
- Suraj Pandey, PhD, CLOUDS Lab @ The University of Melbourne.
- William Voorsluys, PhD student, CLOUDS Lab @ The University of Melbourne.
- Mustafizur Rahman, PhD, CLOUDS Lab @ The University of Melbourne.
- Dileban Karunamoorthy, Sheng Niu, Dong Leng (MEDC Students)
The Gridbus Workflow Engine (previous releases) was developed by:
- Jia Yu, PhD, CLOUDS Lab @ The University of Melbourne.
- Tianchi Ma, Research Fellow, CLOUDS Lab @ The University of Melbourne.
- Rajkumar Buyya, Project Leader, CLOUDS Lab @ The University of Melbourne.
Links
For additional information, please refer to the following websites: