Pants Daemon
The Pants Daemon (pantsd) is a system introduced to enable Pants to keep information about the build warm in memory between runs. It consists of a process running in the background (currently, one for each buildroot), which listens to filesystem events and keeps a build graph warm. It then passes that graph to subsequent runs that request it.
This document outlines all the moving pieces of pantsd, and explains how it works through an end-to-end run.
ProcessManager
ProcessManager is a class designed to keep track of processes. Besides changing their state (paused, terminating...), it allows a process to fork in different ways. Classes that extend ProcessManager can (and often do) inject code to be run before and after forking by overriding the functions pre_fork, post_fork_child, post_fork_parent. An example of one of these functions is in NailgunExecutor, where we start a Nailgun server that we can connect to, after forking the main Pants process.
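The hook pattern described above can be sketched roughly as follows, assuming a POSIX system where os.fork() is available. ForkHooks, PipeWriter and fork_once are hypothetical names used only for illustration, and the hook signatures are simplified; only the three hook names (pre_fork, post_fork_child, post_fork_parent) come from the text.

```python
import os


class ForkHooks:
    """Toy base class (hypothetical): calls hooks around a single os.fork()."""

    def pre_fork(self):
        """Runs in the original process, before forking."""

    def post_fork_child(self):
        """Runs in the child process, after forking."""

    def post_fork_parent(self, child_pid):
        """Runs in the parent process, after forking."""

    def fork_once(self):
        self.pre_fork()
        pid = os.fork()
        if pid == 0:
            # Child: run the hook, then exit without returning to the caller.
            status = 1
            try:
                self.post_fork_child()
                status = 0
            finally:
                os._exit(status)
        self.post_fork_parent(pid)
        return pid


class PipeWriter(ForkHooks):
    """Example subclass: the child sends one message back through a pipe."""

    def pre_fork(self):
        # Created before the fork so both processes inherit the descriptors.
        self.read_fd, self.write_fd = os.pipe()

    def post_fork_child(self):
        os.close(self.read_fd)
        os.write(self.write_fd, b"hello from child")
        os.close(self.write_fd)

    def post_fork_parent(self, child_pid):
        os.close(self.write_fd)
        with os.fdopen(self.read_fd, "rb") as reader:
            self.message = reader.read()
        os.waitpid(child_pid, 0)
```

Calling PipeWriter().fork_once() runs pre_fork in the caller, post_fork_child in the forked child, and post_fork_parent back in the caller, which is the same division of labour the overridable hooks give a ProcessManager subclass.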
The PantsDaemon Class
The PantsDaemon class is responsible for managing the lifetime of the pantsd process and the services associated with it. A PantsDaemon is created with the inner class Factory, and it has two modes of initialization:
- Stub initialization, which parses options, launches watchman, and not much else. This mode is used to determine whether it needs to fully initialize the daemon or not.
- Full initialization, which spawns the daemon process, initializes all the services, the legacy engine and the native code.
Initialization is encapsulated in the PantsDaemon.Factory.create() method.
PantsDaemon is a ProcessManager, which means one can know if it's alive, or if it needs to restart.
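The end-to-end walkthrough later in this document says the restart check looks at whether fingerprinted options have changed. A minimal sketch of that idea, assuming the fingerprint is a hash over a canonical encoding of the relevant option values; the function names, signatures and option keys below are hypothetical and are not the Pants implementation:

```python
import hashlib
import json


def fingerprint_options(options):
    """Hash a dict of option values in a key-order-independent way."""
    canonical = json.dumps(options, sort_keys=True).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def needs_restart(is_alive, stored_fingerprint, current_options):
    """Restart if nothing is running or the relevant options changed."""
    if not is_alive:
        return True
    return stored_fingerprint != fingerprint_options(current_options)
```

Sorting the keys before hashing means that two invocations with the same option values produce the same fingerprint regardless of the order in which the options were parsed.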
A PantsDaemon can be launched with launch(), which terminates any process it was already running and calls daemon_spawn() to fork a new one. The new process runs the code in PantsDaemon.post_fork_child(), which, in short, uses os.spawnve to execute the pants_daemon.py:launch() function, which in turn calls PantsDaemon.run_sync(). run_sync() does many things, but the vital ones are calling _setup_services() to spin up the services and _run_services() to start an infinite loop that polls them. More on services later.
Pailgun
The Nailgun Protocol is designed to allow clients to make command-line requests to a Nailgun Server. It supports an interface similar to Process, except that it is not concerned with making operations hermetic, and it supports streaming access to stdin/stdout. Pailgun is an extension of the Nailgun Protocol that clients use to ask the Pants Daemon to spawn pants invocations.
The protocol is subject to change slightly in #6579.
In Pantsd, PailgunServer is the class responsible for reading Pailgun requests and handling them, by spawning PailgunHandlers in new threads.
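To make the shape of a request more concrete, here is a rough sketch of Nailgun-style chunk framing. It assumes the chunk layout from the public Nailgun protocol description (a 4-byte big-endian payload length, a 1-byte chunk type, then the payload) and the usual chunk types A (argument), E (environment), D (working directory) and C (command). Pailgun's own extensions are not shown, and the helper names are hypothetical.

```python
import struct

# Chunk header: payload length (uint32, big-endian) followed by a 1-byte type.
HEADER = struct.Struct(">Ic")


def encode_chunk(chunk_type, payload=b""):
    """Frame one chunk as header + payload."""
    return HEADER.pack(len(payload), chunk_type) + payload


def decode_chunks(data):
    """Yield (chunk_type, payload) pairs from a buffer of complete chunks."""
    offset = 0
    while offset < len(data):
        length, chunk_type = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        yield chunk_type, data[offset:offset + length]
        offset += length


def encode_request(command, args, env, cwd):
    """Encode arguments, environment and working directory, then the command."""
    parts = [encode_chunk(b"A", arg.encode()) for arg in args]
    parts += [encode_chunk(b"E", f"{key}={value}".encode()) for key, value in env.items()]
    parts.append(encode_chunk(b"D", cwd.encode()))
    parts.append(encode_chunk(b"C", command.encode()))
    return b"".join(parts)
```

For example, encode_request("./pants", ["list", "src/scala::"], {"HOME": "/tmp"}, "/repo") produces a byte string that decode_chunks() splits back into two A chunks, one E chunk, one D chunk and a final C chunk.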
Services
A Pants Daemon process has several services that it polls in order. Every service runs in a separate thread, and can be paused, resumed or terminated. Services can communicate with each other.
Examples of services are SchedulerService, which takes care of listening for filesystem events and responding to them to keep a warm Graph, and PailgunService, which listens to SchedulerService and manages the lifetime of a PailgunServer responsible for spawning pants runs when requested by clients. PailgunService takes a DaemonPantsRunner as one of its arguments, to use as a template for spawning those runs.
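The lifecycle described above (one thread per service, with pause, resume and terminate) can be sketched as follows. ToyService and start_services are hypothetical stand-ins, not the Pants classes, and the "work" is just a counter.

```python
import threading
import time


class ToyService:
    """Hypothetical stand-in for a daemon service; not the Pants class."""

    def __init__(self):
        self._active = threading.Event()
        self._active.set()
        self._terminated = threading.Event()
        self.ticks = 0

    def run(self):
        while not self._terminated.is_set():
            # While paused, wake up periodically to notice termination.
            if self._active.wait(timeout=0.01):
                self.ticks += 1  # one unit of "work"
                time.sleep(0.001)

    def pause(self):
        self._active.clear()

    def resume(self):
        self._active.set()

    def terminate(self):
        self._terminated.set()


def start_services(services):
    """Start one daemon thread per service and return the threads."""
    threads = [threading.Thread(target=service.run, daemon=True) for service in services]
    for thread in threads:
        thread.start()
    return threads
```

Using events rather than plain booleans lets a paused service block cheaply while still noticing a terminate() call within one timeout interval.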
PailgunService
A PailgunService is a PantsService which spins up and polls a PailgunServer.
A PailgunServer is a TCPServer with ThreadingMixIn, which listens for Pailgun requests on a socket and spins up instances of PailgunHandler to handle them. It overrides ThreadingMixIn.process_request_thread() to spin up one thread and one handler per request. A PailgunServer holds a reference to the class DaemonPantsRunner, which can be used to run pants from the server.
A PailgunHandler is a class that parses the requests sent to the server, and uses DaemonPantsRunner to invoke pants with the environment and arguments specified by the request.
A DaemonPantsRunner implements a run() method that creates an instance of LocalPantsRunner, which will be used to run the requested pants command.
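The relationship between server, handler and runner can be sketched with the standard library's socketserver module. This is an illustration under simplifying assumptions, not the Pants code: ToyServer, ToyHandler, EchoRunner and request are hypothetical names, the wire format is a single line of text rather than Pailgun, and the process_request_thread() override only records which thread handled the request.

```python
import socket
import socketserver
import threading


class EchoRunner:
    """Hypothetical stand-in for a runner class; it just formats the args."""

    def __init__(self, args):
        self.args = args

    def run(self):
        return "ran: " + " ".join(self.args)


class ToyHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One request = one line of space-separated arguments.
        args = self.rfile.readline().decode().split()
        result = self.server.runner_class(args).run()
        self.wfile.write(result.encode() + b"\n")


class ToyServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    daemon_threads = True
    allow_reuse_address = True

    def __init__(self, address, runner_class):
        self.runner_class = runner_class  # a class, instantiated per request
        self.handled_in = []
        super().__init__(address, ToyHandler)

    def process_request_thread(self, request, client_address):
        # Runs in the per-request thread started by ThreadingMixIn.process_request().
        self.handled_in.append(threading.current_thread().name)
        super().process_request_thread(request, client_address)


def request(port, line):
    """Send one line to the server on localhost and return its one-line reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:
        conn.sendall(line.encode() + b"\n")
        return conn.makefile("r").readline().strip()
```

Passing the runner as a class rather than an instance mirrors the description above: the server holds a reference to the class, and each request gets its own fresh runner.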
An end-to-end run with Pantsd
To understand how pantsd is spun up and shut down, here is what happens when we run pantsd for the first time.
If we run the command ./pants --enable-pantsd list src/scala::, the following happens:
- PantsRunner::run() is called, which will prompt parsing of bootstrap options. --enable-pantsd is a bootstrap option.
- In that function, we determine whether we need to run in pantsd mode or local mode. If we choose local mode, an instance of LocalPantsRunner is created and the run will continue as if pantsd didn't exist.
- Since --enable-pantsd was toggled on, we will create an instance of RemotePantsRunner, and run it. RemotePantsRunner will:
  - Maybe launch pantsd, by calling PantsDaemon.Factory.maybe_launch(). This method will:
    - Create a stub instance of PantsDaemon.
    - Check if any of the fingerprinted options have changed by calling ProcessManager.needs_restart(). If they have (or there is no pantsd running at the moment), it will fully initialize an instance of PantsDaemon by calling PantsDaemon.Factory.create(), and launch it with PantsDaemon.launch(). In either case, it will return a PantsDaemon.Handle to a process running PantsDaemon.run_sync(), which will poll all the services.
  - With a handle to the pantsd process, RemotePantsRunner will now call _run_pants_with_retry(), which will try to _connect_and_execute() to the port supplied by the handle, probably more than once. To do that, it will create a NailgunClient instance which will use the pailgun protocol described above to tell the pantsd process to invoke pants, with a call along the lines of: result = client.execute('./pants', *self._args, **modified_env)
- After that request is finished, it will record the result and the local process will exit. But before that happens, this is what happens from the pantsd side:
- The PailgunService will receive that request, and it will handle it as follows:
  - PailgunService is endlessly polling for requests via calling PailgunServer.handle_request().
  - When that method receives a request, it will call SocketServer._handle_request_noblock(), which will call ThreadingMixIn.process_request() (it would usually call SocketServer.process_request, but PailgunServer also extends ThreadingMixIn, which intentionally overrides this function).
  - ThreadingMixIn.process_request() will spin up a new Thread and call PailgunServer.process_request_thread() (it would usually call ThreadingMixIn.process_request_thread(), but we override it).
  - (note: we are now in a separate thread) PailgunServer.process_request_thread() will create an instance of PailgunHandler, and call PailgunHandler.handle_request().
  - PailgunHandler.handle_request() will create an instance of DaemonPantsRunner, and it will run() it. Creating an instance of DaemonPantsRunner with DaemonPantsRunner.create() means that it will call the SchedulerService to get a warm graph.
- DaemonPantsRunner will create an instance of LocalPantsRunner with the graph (and options, and such) it got from the SchedulerService, and run as if the daemon didn't exist.
- The PailgunServer will wait until LocalPantsRunner is finished in handle_request.
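The client-side retry step in the walkthrough above (trying to connect and execute "probably more than once") can be sketched as a small loop. This is only an illustration of the retry idea: run_with_retry, its parameters and the choice of ConnectionError as the retryable failure are assumptions, not the behaviour of _run_pants_with_retry().

```python
import time


def run_with_retry(connect_and_execute, attempts=3, delay=0.01):
    """Call connect_and_execute() until it succeeds or attempts run out."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect_and_execute()
        except ConnectionError as error:
            # Assumed retryable case: the daemon is not accepting connections yet.
            last_error = error
            if attempt < attempts:
                time.sleep(delay)
    raise last_error
```

A caller would pass a zero-argument callable that opens the connection and runs the request, so that each retry starts from a fresh connection rather than reusing a failed one.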