From 0c4ea3e7bb37b6ff14a2973deceda79b9f255cf5 Mon Sep 17 00:00:00 2001
From: bnewbold
Date: Mon, 27 Jun 2022 12:11:34 -0700
Subject: adding old notes files

---
 doc/plan.txt | 113 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 doc/spec.txt |   7 ++++
 2 files changed, 120 insertions(+)
 create mode 100644 doc/plan.txt
 create mode 100644 doc/spec.txt

diff --git a/doc/plan.txt b/doc/plan.txt
new file mode 100644
index 0000000..bea2b14
--- /dev/null
+++ b/doc/plan.txt
@@ -0,0 +1,113 @@
+
+rustup run nightly cargo install clippy
+rustup run nightly cargo clippy
+
+startup:
+x bind sockets
+- [optional] bind rpc server+client
+x register signal handlers
+x populate config
+- [optional] spawn rpc thread
+x spawn initial set of children
+x enter main event loop
+    select!
+
+rpc mechanism:
+    use JSON-RPC: https://github.com/ethcore/jsonrpc-core
+    worker side receives client request, creates a new reply channel, sends
+    request+channel to event loop, blocks on reply channel. this all
+    happens in per-client thread?
+    event loop selects() on rpc requests. when one is received, processes,
+    sends reply down channel, closes channel
+
+## Process Lifetime
+
+process table is pid_t -> offspring
+entries are only removed by the SIGCHLD handler, which calls waitpid and thus has already reaped them
+entries hold a timer guard, so after they are destroyed the timer shouldn't fire
+
+spawn:
+    init
+    checkin childhood (seconds) later
+
+shutdown:
+    notified, send USR2
+    checkin shutdown later: if not dead term it, either way reap
+
+term:
+    notified, send TERM
+    checkin shutdown later; if not dead kill it
+
+kill:
+    dead, send KILL
+
+upgrade:
+    spawn new generation; keep 'replaces' linkage
+    when each new generation is healthy, notify the 'replace-ee' to shut down
+
+timer: check_alive:
+    healthy; or kill and re-spawn (based on ack mode)
+
+signal: child died:
+    find which child it was (by pid)
+    if it was in infancy or healthy and attempts ok, try to respawn, possibly with backoff
+    otherwise, just reap from brood
+
+## Most Basic
+
+x bind to any supplied socket(s)
+x set up environment variables
+x register signal handlers
+x fork() for each child, with some delay in between
+x in each forked copy, execve() to the supplied program
+x in the parent, wait on children, waiting for failures. restart on failure
+- reset signal mask in child processes
+
+then, probably want to shift to an event-driven (single threaded?) setup
+(including signal handling)
+(event-driven seems like it might not work well, so threads+channels instead)
+
+- timers
+- signals: chan_signal
+- child termination
+- RPC commands
+
+threads:
+- shepard: holds sub-process state machines; chan_select!{} on other threads
+- chan_signal (internal to library)
+? timers ?
+- rpc: workers spawned on socket connects
+    -> line-based? JSON-RPC? gRPC?
+
+sub-process states (TODO: look at daemontools):
+- healthy
+- dead
+- starting
+
+## Socket Passing
+
+fcntl F_GETFD: to get fd flags
+FD_CLOEXEC: flag of whether to keep open ("close on exec", true means it isn't passed)
+
+EINHORN_FD_COUNT
+EINHORN_FD_0
+
+-4 and -6 flags (forces IPv4 or IPv6, respectively)
+
+## Command Socket
+
+EINHORN_SOCK_PATH
+
+
+## Child API
+
+listen for USR2 (graceful shutdown)
+parse environment variables
+
+## Rust child impl
+
+https://doc.rust-lang.org/std/os/unix/io/trait.FromRawFd.html
+
+----
+
+Using chan instead of std::sync::mpsc because Select/select! is unstable.
diff --git a/doc/spec.txt b/doc/spec.txt
new file mode 100644
index 0000000..e8662fd
--- /dev/null
+++ b/doc/spec.txt
@@ -0,0 +1,7 @@
+
+Signals to daemon itself:
+    USR2, INT -> graceful shutdown all offspring and exit
+    TERM, QUIT -> terminate all offspring and exit (this is a bit faster)
+    HUP -> upgrade all offspring
+
+NB: QUIT and ALRM are different from einhorn
-- 
cgit v1.2.3
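
Editorial sketch (not part of the commit): the "Socket Passing", "Child API" and "Rust child impl" notes in doc/plan.txt name EINHORN_FD_COUNT, EINHORN_FD_0 and the FromRawFd trait but give no code. Below is one way a child might pick up its inherited sockets. It assumes EINHORN_FD_COUNT holds a decimal count and each EINHORN_FD_<n> holds a decimal descriptor number; the notes only name the variables, so that layout is an assumption. The function names parse_inherited_fds and listeners_from_fds are hypothetical and do not come from the project.

```rust
use std::net::TcpListener;
use std::os::unix::io::{FromRawFd, RawFd};

/// Collect inherited descriptor numbers via a caller-supplied lookup, so the
/// parsing can be exercised without touching the real process environment.
/// Assumed layout: EINHORN_FD_COUNT = n, then EINHORN_FD_0 .. EINHORN_FD_{n-1}.
fn parse_inherited_fds<F>(lookup: F) -> Result<Vec<RawFd>, String>
where
    F: Fn(&str) -> Option<String>,
{
    let count = lookup("EINHORN_FD_COUNT")
        .ok_or_else(|| "EINHORN_FD_COUNT is not set".to_string())?
        .trim()
        .parse::<usize>()
        .map_err(|e| format!("bad EINHORN_FD_COUNT: {}", e))?;

    let mut fds = Vec::with_capacity(count);
    for i in 0..count {
        let key = format!("EINHORN_FD_{}", i);
        let raw = lookup(&key).ok_or_else(|| format!("{} is not set", key))?;
        let fd = raw
            .trim()
            .parse::<RawFd>()
            .map_err(|e| format!("bad {}: {}", key, e))?;
        if fd < 0 {
            return Err(format!("{} is negative: {}", key, fd));
        }
        fds.push(fd);
    }
    Ok(fds)
}

/// Wrap each inherited descriptor in a TcpListener.
///
/// Safety: every fd must be an open, listening TCP socket that nothing else
/// in this process owns, because the returned listeners close it on drop.
unsafe fn listeners_from_fds(fds: &[RawFd]) -> Vec<TcpListener> {
    fds.iter()
        .map(|&fd| unsafe { TcpListener::from_raw_fd(fd) })
        .collect()
}
```

In a real child the lookup would be `|k| std::env::var(k).ok()`, and the resulting descriptors would go to `listeners_from_fds` inside an `unsafe` block. This only works if the parent cleared FD_CLOEXEC on those descriptors before exec, as the notes describe.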