RUNNING A WORKFLOW WITHOUT WORKFLOWS: A BASIC ALGORITHM FOR DYNAMICALLY CONSTRUCTING AND TRAVERSING AN IMPLIED DIRECTED ACYCLIC GRAPH IN A NON-DETERMINISTIC ENVIRONMENT
Fedir Smilianets
fedor.smile@gmail.com
National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" (Ukraine)
https://orcid.org/0000-0002-0061-7479
Oleksii Finogenov
National Technical University of Ukraine "Igor Sikorsky Kyiv Polytechnic Institute" (Ukraine)
https://orcid.org/0000-0002-1708-5632
Abstract
This paper introduces a novel algorithm for dynamically constructing and traversing Directed Acyclic Graphs (DAGs) in workflow systems, particularly targeting distributed computation and data processing domains. Traditional workflow management systems rely on explicitly defined, rigid DAGs, which can be cumbersome to maintain, especially when the system changes frequently. Our proposed algorithm avoids explicit DAG construction; instead, it iteratively builds and executes the workflow based on the available data and operations. Through a combination of entities such as Data Kinds, Operators, and Data Units, the algorithm implicitly forms a DAG, thereby simplifying workflow management. We demonstrate the algorithm’s functionality and assess its performance through a series of tests in a simulated environment. The paper discusses the implications of this approach, focusing on cycle avoidance and computational complexity, and suggests future enhancements and potential applications.
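To make the idea of an implied DAG concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that a Data Kind is a plain string label, that an Operator declares the kinds it consumes and the single kind it produces, and that only one Data Unit exists per kind. The names `Operator`, `run`, and the example operators are hypothetical. The loop fires any operator whose inputs are available, so the execution order emerges from data availability rather than from a declared graph.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Operator:
    """Hypothetical operator: consumes some Data Kinds, produces one."""
    name: str
    inputs: tuple[str, ...]   # Data Kinds consumed
    output: str               # Data Kind produced
    fn: Callable[..., Any]


def run(operators, initial):
    """Fire operators whose input kinds are available until nothing new can run."""
    units = dict(initial)     # Data Kind -> value (assumption: one Data Unit per kind)
    fired = []                # firing order, i.e. one traversal of the implied DAG
    progress = True
    while progress:
        progress = False
        for op in operators:
            # Crude cycle guard (an assumption of this sketch): an operator
            # fires at most once, and an existing kind is never re-produced.
            if op.name in fired or op.output in units:
                continue
            if all(kind in units for kind in op.inputs):
                units[op.output] = op.fn(*(units[kind] for kind in op.inputs))
                fired.append(op.name)
                progress = True
    return units, fired


# Operators are listed out of dependency order on purpose.
operators = [
    Operator("count", ("tokens",), "length", len),
    Operator("tokenize", ("text",), "tokens", str.split),
    Operator("summary", ("tokens", "length"), "report",
             lambda tokens, n: f"{n} tokens, first={tokens[0]}"),
]

units, order = run(operators, {"text": "build the graph lazily"})
print(order)            # ['tokenize', 'count', 'summary']
print(units["report"])  # 4 tokens, first=build
```

No edge list is ever written down in this sketch: the dependency of `summary` on `tokenize` and `count` follows only from the kinds each operator declares. A practical system would also need a policy for multiple Data Units per kind and a more careful cycle check than the at-most-once rule used here.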
Keywords:
distributed computing, directed acyclic graph, pipeline processing