On 2003-04-15 at 15:45 -0400, Jorge Arellano Cid wrote:
When a server that listens on a UDS has started and its socket's filename is removed from the filesystem, is it notified somehow, maybe with a SIGPIPE signal or something? Or not?
Not to the best of my knowledge. I wrote a short C program to test it, and on FreeBSD the server is not notified. Code available on request (I might well have messed up -- I caught myself writing Perl far too often). I've tested both a straight accept() and a non-blocking accept socket with poll(); poll() doesn't report any of POLLHUP, POLLERR or POLLNVAL either. :^(
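For anyone who wants to repeat the experiment without the C program, here is a rough Python sketch of the same test. It is not the original code; the function name, the "SocketTest" filename and the timeout are made up for illustration. It binds a unix-domain socket in a scratch directory, unlinks the path, and polls the listening descriptor. An empty result means nothing was reported, which is what I'd expect, though I have not checked every platform.

```python
import os
import select
import socket
import tempfile


def events_after_unlink(timeout_ms=200):
    """Listen on a unix-domain socket, unlink its path, then poll the listener.

    Returns whatever poll() reports; an empty list means no notification.
    """
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "SocketTest")  # illustrative name only
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        srv.bind(path)
        srv.listen(1)
        srv.setblocking(False)

        poller = select.poll()
        # POLLHUP, POLLERR and POLLNVAL are reported whether or not they are
        # requested; they are listed here only to make the intent obvious.
        poller.register(
            srv,
            select.POLLIN | select.POLLHUP | select.POLLERR | select.POLLNVAL,
        )

        os.unlink(path)  # remove the filename out from under the server
        return poller.poll(timeout_ms)
    finally:
        srv.close()
        os.rmdir(workdir)


if __name__ == "__main__":
    print(events_after_unlink())
```

The listening socket keeps working for connections that were already established; only new connect() calls by filename fail, which is why the server sees nothing.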
For the 0.7.1.2 bm_srv12, I'm considering `fuser -k bm_srv12` and a hackish:
fuser is platform-specific. You'd need to code logic for multiple commands.

lsof(8) is fairly portable, but not as widespread as it should be. Certainly various *BSD systems have it as an optional package, and we make sure it's on our Solaris systems too. You can't invoke lsof(8) on a filename and get output if the filename represents a unix-domain socket. :^( Use either "lsof -U" for a list of all unix-domain sockets, or "lsof -c /commandnameregexp/", which would give:

  COMMAND   PID USER FD TYPE DEVICE     SIZE/OFF NODE NAME
  [...]
  unixsocke 367 me   3u unix 0xcf251c00 0t0           /home/me/SocketTest

"lsof -U" gives output in identical format, except that every row will have TYPE == "unix".

There's also sockstat(8) on FreeBSD (but not OpenBSD), but again you'd need to grep for the filename. "sockstat -u" gives:

  USER COMMAND  PID FD PROTO  ADDRESS
  [...]
  me   unixsock 367 3  stream /home/me/SocketTest
  [...]
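The "grep the filename" step for lsof could look roughly like the Python sketch below. The function names are mine, and the parsing assumes the lsof column layout shown above (COMMAND first, PID second, the socket path somewhere later in the row) and a path without spaces. It will not work on sockstat output, where the columns are in a different order.

```python
import subprocess


def lsof_socket_owners(listing, path):
    """Return (command, pid) pairs for rows of lsof output that mention `path`.

    Assumes lsof's default layout: COMMAND in column 1, PID in column 2.
    """
    owners = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip the header, "[...]" elisions and anything without a numeric PID.
        if len(fields) >= 3 and fields[1].isdigit() and path in fields[2:]:
            owners.append((fields[0], int(fields[1])))
    return owners


def lsof_unix_sockets():
    """Run "lsof -U" and return its output (requires lsof to be installed)."""
    result = subprocess.run(
        ["lsof", "-U"], capture_output=True, text=True, check=False
    )
    return result.stdout


# The sample row from this message, used as stand-in input.
SAMPLE = """\
COMMAND   PID USER FD TYPE DEVICE     SIZE/OFF NODE NAME
unixsocke 367 me   3u unix 0xcf251c00 0t0           /home/me/SocketTest
"""

if __name__ == "__main__":
    print(lsof_socket_owners(SAMPLE, "/home/me/SocketTest"))
```

On a live system you would pass lsof_unix_sockets() instead of SAMPLE. An empty result only tells you that lsof listed nobody with that path, not why.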
<dpi cmd='bye'
(without the closing ">" an exit(1) is forced!)
For the new version, why not just have a proper "bye" command? Any distributed system needs a way to shut down cleanly.

I've worked cleaning up the mess after an application system which didn't have one; the original programmer had tried to use signals, but had ended up using SIGKILL. This thing took several minutes to start up each time, losing money when it happened. Diagnosed, added shutdown stuff to all the IDL definitions for the various components, and put together a quick tool to remove all the old state files from the working directory which had made the tool so slow. Hey presto, decent coordinated shutdown which actually worked, reliably, and startup time of a few seconds maximum (despite being written for DCE -- don't ask).

I've seen a few other situations, none as severe, but my perhaps limited experience is that provided your control protocol can be trusted (nothing untrusted can send messages, eg HTTP) then every independently running part of a system needs a start-up, a reinitialise, and a shutdown command. Preferably also "dump state" and "dump statistics" commands but that's because I'm nosy (okay, I like debugging).

A _really_ sick way would be to create a token, fork a child to connect to your own unix socket and write a ping command with the token as data to its connection. If you get that coming back in, it's still your socket. If you don't get it but something else replies to the child then you've lost the socket and should exit. :^)

-Phil
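A minimal Python sketch of the token-ping idea, under some loud assumptions: the "ping <token>" wire format and the function name are invented here, and instead of forking a child the probe connection is made from the same process. That shortcut relies on a unix-domain stream connect() completing against the listen backlog and on a small write fitting in the socket buffer. A real server would fork as described, or feed the probe through its normal event loop, and would have to cope with genuine clients queued ahead of the probe.

```python
import os
import secrets
import socket
import tempfile


def still_owns_socket(listener, path, timeout=1.0):
    """Send a random token to `path`; True if it arrives on `listener`.

    Side effect: leaves `listener` with a timeout set.
    """
    message = b"ping " + secrets.token_hex(16).encode() + b"\n"

    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.settimeout(timeout)
        probe.connect(path)      # fails if the filename is gone
        probe.sendall(message)   # lands in whoever owns the path now
    except OSError:
        return False
    finally:
        probe.close()

    listener.settimeout(timeout)
    try:
        conn, _ = listener.accept()
        with conn:
            conn.settimeout(timeout)
            # The message is short, so one recv() is assumed to be enough.
            return conn.recv(len(message)) == message
    except OSError:
        # Timed out: the ping went somewhere else, so the socket is lost.
        return False


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "SocketTest")  # illustrative name only
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    srv.bind(path)
    srv.listen(5)
    print(still_owns_socket(srv, path))   # path intact
    os.unlink(path)
    print(still_owns_socket(srv, path))   # path removed
    srv.close()
    os.rmdir(workdir)
```

If another process has bound a new socket at the same path, the probe's connect() and write succeed but the ping never reaches the original listener, so the accept() times out and the check reports the socket as lost.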