Deadline Danger Detection Solutions for an Industrial Batch System using Tivoli Workload Scheduler for z/OS
Authors
Ernstsen, Emil ; Greve, Peter Borch
Term
4th term
Education
Publication year
2021
Abstract
Financial institutions run large batch systems where many jobs must be scheduled and monitored to meet strict deadlines. In collaboration with BEC, we investigate how to detect deadline risks in BEC's Tivoli Workload Scheduler for z/OS (TWS) without replacing the scheduler. We design and implement the Dangerous Deadline Detection Program (D3P), a tool that interfaces with TWS to retrospectively analyze completed batch days and per-job history, and that provides a proof of concept for live monitoring. Methodologically, we build on the concepts of latest start, critical network, and critical path, and compare two approaches: an IBM-inspired partial re-implementation and a modified D3P method. The work covers design considerations, iterative feedback with BEC, and experiments on multiple closely related solutions. Findings indicate that estimating individual job execution times is difficult due to widely varying contexts, and that the two IBM-inspired latest-start methods produce similar results; the D3P method may offer advantages where there are more deadline jobs and longer critical paths. We also note limitations, including the inability to test IBM's built-in solution in BEC's environment.
[This abstract has been generated with the help of AI directly from the project's full text]
