Existing
We have debugging infrastructure. For example:
GNU Hurd debugging, including rpctrace, and more.
To Do
glibc's sotruss
"Checkpoint/restart allows the state of a set of processes to be saved to persistent storage, then restarted at some future time" -- quoting from Jonathan Corbet's 2010 Linux Kernel Summit report.
Such a facility would be very useful for reproducing failures, for example. On the other hand, it is questionable how much it can help with debugging failures in the interactions between GNU Hurd servers, as the relevant state is typically spread across several processes.
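The save-then-restart idea quoted above can be illustrated with a toy, application-level sketch. This is not how a system-level checkpoint/restart facility works: such a facility captures whole processes transparently, whereas here the program saves its own state explicitly. The function names (`checkpoint`, `restart`, `demo`) and the file name `counter.ckpt` are made up for illustration.

```python
import json
import os
import tempfile


def checkpoint(state, path):
    """Save the state dictionary to persistent storage."""
    # Write to a temporary file first, then rename it into place, so an
    # interrupted write never leaves a truncated checkpoint behind.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def restart(path):
    """Load a previously saved state dictionary."""
    with open(path) as f:
        return json.load(f)


def demo():
    # Hypothetical file name inside a fresh temporary directory.
    path = os.path.join(tempfile.mkdtemp(), "counter.ckpt")

    state = {"steps_done": 0}
    for _ in range(3):
        state["steps_done"] += 1

    checkpoint(state, path)

    # Work done after the checkpoint is not part of the saved state.
    state["steps_done"] += 1

    restored = restart(path)
    return state["steps_done"], restored["steps_done"]
```

The sketch also hints at the concern raised above: it covers one process with one piece of state. When the relevant state is spread across several cooperating processes, each would have to be captured at a mutually consistent point for a restart to reproduce their interaction.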
Continued in http://lwn.net/Articles/414264/, which introduces DMTCP, http://dmtcp.sourceforge.net/.
crash server, GDB gcore, http://code.google.com/p/google-coredumper/
http://lwn.net/Articles/415728/, or http://lwn.net/Articles/415471/ -- just two examples; there is a lot more of this kind of work for Linux.