High-level design

This chapter aims to provide high-level overview on how ppacer works.

Top level

Starting from the top, we can say that ppacer is a set of Go packages that can be used to define and compile a program(s) for scheduling and executing directed acyclic graphs of tasks. These programs are compiled into cross-platform and statically-linked binaries with no dependencies, like most Go programs. Tasks are defined via regular Go code, meaning the processes defined within ppacer are embedded into the program and are not parsed dynamically at runtime.

Ppacer scheduler uses a database to persist information on processes, their schedules, tasks, runs and other metadata. By default SQLite is used, but ppacer can support any other database that provides a driver for standard database/sql package. For details please check Databases.

Scheduler - the Heart of ppacer

The main objective of using ppacer is to setup and run a scheduler. To have a running ppacer Scheduler we need to initialize it and start it. Scheduler on the startup does the following steps:

Synchronize dag.Registry, DAG runs and task queues with the database.
Setup and start DagRunWatcher in a separate goroutine, to detect new DAG runs.
Setup and start TaskScheduler, in a separate goroutine, to coordinate tasks scheduling for active DAG runs.
Register HTTP endpoints and finally return *http.ServerMux.

In other words we can say that ppacer Scheduler performs some housekeeping on the startup and eventually exposes its API in form of HTTP server. In particular tasks scheduled to be executed are stored in Scheduler internal queue which can be accessed through one of the mentioned endpoints.

Executor

Executor executes tasks. It has information about the same dag.Registry as Scheduler. When Executor is initialized and running, it starts a never-ending loop, to ask ppacer Scheduler, via HTTP, about new tasks to be executed. Tasks are executed in separate goroutines. Executor also let Scheduler know about executed task status - it doesn’t have access to the main ppacer database.

Scheduler and Executor can be used in the same program - in this case Executor just needs to be started in a separate Goroutine, as in ppacer hello world example. Because of facts that Executor communicates with Scheduler via HTTP and has information about dag.Registry of processes we can have defined Executor in a separate binary or even in multiple binaries placed on different computers or k8s nodes connected in a network.