Re: [Dillo-dev] Unix domain sockets

April 16, 2003

      On 2003-04-15 at 15:45 -0400, Jorge Arellano Cid wrote:
...
> When a server that listens on a UDS has started and its socket's
> filename is removed from the filesystem, is it notified somehow,
> maybe with a SIGPIPE signal or something? Or not?
Not to the best of my knowledge.  I wrote a short C program to test it,
and on FreeBSD the server is not notified.  Code available on request (I
might well have messed up -- I caught myself writing Perl far too often).

I've tested a plain blocking accept() and a non-blocking accept socket
with poll(); poll() reports no POLLHUP, POLLERR or POLLNVAL either.  :^(
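
For anyone who wants to repeat the experiment without the C program, here
is a minimal sketch in Python rather than C.  It is not the test program
mentioned above; the function name poll_after_unlink and the socket name
are made up for illustration, and it assumes a Unix-like system where
select.poll() is available.

```python
import os
import select
import socket
import tempfile


def poll_after_unlink(timeout_ms=200):
    """Bind a listening UDS, unlink its path, return what poll() reports.

    Hypothetical helper: an empty list means the listener saw no event
    (no POLLIN, POLLHUP, POLLERR or POLLNVAL) after the unlink.
    """
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "SocketTest")
        srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        try:
            srv.bind(path)
            srv.listen(1)
            poller = select.poll()
            # POLLHUP, POLLERR and POLLNVAL are always reported by poll(),
            # so registering for POLLIN is enough to see them.
            poller.register(srv.fileno(), select.POLLIN)
            os.unlink(path)
            return poller.poll(timeout_ms)
        finally:
            srv.close()
```

If the listener were told about the unlink, the returned list would
contain an (fd, eventmask) pair; an empty list matches the behaviour
described above.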
...
> For the 0.7.1.2 bm_srv12, I'm considering `fuser -k bm_srv12`
> and a hackish:
fuser is platform-specific.  You'd need to code logic for multiple
commands.

lsof(8) is fairly portable, but not as widespread as it should be.
Certainly various *BSD systems have it as an optional package, and we
make sure it's on our Solaris systems too.  You can't invoke lsof(8) on
a filename and get output if the filename represents a unix-domain
socket.  :^(  Instead use either "lsof -U" for a list of all unix-domain sockets, or
"lsof -c /commandnameregexp/" which would give:

COMMAND   PID USER   FD   TYPE     DEVICE SIZE/OFF   NODE NAME
[...]
unixsocke 367   me    3u  unix 0xcf251c00      0t0   /home/me/SocketTest

"lsof -U" gives output in identical format, except that every row will
have TYPE == "unix".

There is also sockstat(8) on FreeBSD (but not OpenBSD), but again you'd
need to grep for the filename.  "sockstat -u" gives:

USER     COMMAND    PID   FD PROTO  ADDRESS                                    
[...]
me       unixsock   367    3 stream /home/me/SocketTest
[...]
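
The "grep for the filename" step for both listings could look roughly
like the sketch below.  It only parses text that has already been
captured from "lsof -U" or "sockstat -u"; the function name
pids_for_socket and the pid_column parameter are assumptions for
illustration, and the column positions are taken from the sample output
above rather than from any formal specification.

```python
def pids_for_socket(listing, socket_path, pid_column):
    """Return PIDs from rows whose last field is socket_path.

    pid_column is the zero-based index of the PID field:
    1 for the lsof layout shown above, 2 for the sockstat layout.
    """
    pids = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip headers, blank lines and rows for other sockets.
        if len(fields) <= pid_column or fields[-1] != socket_path:
            continue
        if fields[pid_column].isdigit():
            pids.append(int(fields[pid_column]))
    return pids


LSOF_SAMPLE = """\
COMMAND   PID USER   FD   TYPE     DEVICE SIZE/OFF   NODE NAME
unixsocke 367   me    3u  unix 0xcf251c00      0t0   /home/me/SocketTest
"""

SOCKSTAT_SAMPLE = """\
USER     COMMAND    PID   FD PROTO  ADDRESS
me       unixsock   367    3 stream /home/me/SocketTest
"""
```

Matching on the last whitespace-separated field is fragile if the socket
path contains spaces, so treat this as a starting point only.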
...
> <dpi cmd='bye'
> (without the closing ">" an exit(1) is forced!)
For the new version, why not just have a proper "bye" command?

Any distributed system needs a way to shut down cleanly.  I've worked on
cleaning up the mess left by an application system which didn't have one.
The original programmer had tried to use signals, but had ended up
using SIGKILL.  This thing took several minutes to start up each time,
losing money when it happened.  I diagnosed it, added shutdown stuff to
all the IDL definitions for the various components, and put together a
quick tool to remove all the old state files from the working directory
which had made the tool so slow.

Hey presto, decent coordinated shutdown which actually worked, reliably,
and startup time of a few seconds maximum (despite being written for DCE
-- don't ask).

I've seen a few other situations, none as severe, but my perhaps limited
experience is that provided your control protocol can be trusted
(nothing untrusted can send messages, e.g. HTTP) then every independently
running part of a system needs a start-up, a reinitialise, and a
shutdown command.  Preferably also "dump state" and "dump statistics"
commands, but that's because I'm nosy (okay, I like debugging).
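
To make the command set concrete, here is a minimal sketch of a
dispatcher for those five commands.  Everything in it is hypothetical:
the class name Component, the command spellings and the reply strings
are invented for illustration and are not part of dpi or any existing
protocol.

```python
class Component:
    """Hypothetical component that understands five control commands."""

    def __init__(self):
        self.running = False
        self.generation = 0          # bumped on every reinitialise
        self.stats = {"commands": 0}

    def handle(self, command):
        """Dispatch one control command and return a short text reply."""
        handlers = {
            "start": self._start,
            "reinit": self._reinit,
            "shutdown": self._shutdown,
            "dump-state": self._dump_state,
            "dump-stats": self._dump_stats,
        }
        handler = handlers.get(command)
        if handler is None:
            return "error: unknown command"
        self.stats["commands"] += 1
        return handler()

    def _start(self):
        self.running = True
        return "ok"

    def _reinit(self):
        self.generation += 1
        return "ok"

    def _shutdown(self):
        # A real component would flush and remove its state files here.
        self.running = False
        return "ok"

    def _dump_state(self):
        return "running=%s generation=%d" % (self.running, self.generation)

    def _dump_stats(self):
        return "commands=%d" % self.stats["commands"]
```

Unknown commands get an error reply instead of killing the component,
which is the opposite of the forced exit(1) trick quoted above.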

A _really_ sick way would be to create a token, fork a child to connect
to your own unix socket and write a ping command with the token as data
to its connection.  If you get that coming back in, it's still your
socket.  If you don't get it but something else replies to the child
then you've lost the socket and should exit.  :^)
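
A rough sketch of that token check follows.  It departs from the
description in one respect: instead of forking a child, it connects from
the same process, which works for a stream socket because connect()
completes against the listen backlog before accept() is called.  The
function name still_owns_socket and the "ping " prefix are made up, and
the sketch assumes no other client connects at the same moment.

```python
import os
import secrets
import socket


def still_owns_socket(srv, path, timeout=0.5):
    """Return True if a ping sent to `path` arrives on our listener `srv`."""
    message = b"ping " + secrets.token_hex(16).encode()
    old_timeout = srv.gettimeout()
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        # Fails if the path is gone or nothing is listening on it.
        client.connect(path)
        client.sendall(message)
        # If another server now owns the path, our accept() times out.
        srv.settimeout(timeout)
        conn, _ = srv.accept()
        with conn:
            conn.settimeout(timeout)
            return conn.recv(64) == message
    except OSError:
        # Covers missing path, refused connection and timeouts.
        return False
    finally:
        client.close()
        srv.settimeout(old_timeout)
```

A False result is the cue to exit, as suggested above; a real server
would also have to cope with a genuine client being accepted instead of
the ping.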

-Phil

Phil Pennock