Original Report

In the GCC testsuite, at this point:

Running /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/unsorted.exp ...

... expect had gone bonkers:

$ ps --all --format=hurd-long -w
  PID  UID  PPID  PGrp  Sess TH  Vmem   RSS %CPU     User   System Args
[...]
 3567 1000 10295  3567  3567  2  137M  856K 98.2     5hrs 28 hours expect -- /usr/share/dejagnu/runtest.exp --tool gcc
[...]

Last lines of gcc/testsuite/gcc/gcc.sum:

PASS: gcc.c-torture/unsorted/q.c,  -O2 -flto -flto-partition=none  
PASS: gcc.c-torture/unsorted/q.c,  -O2 -flto  
PASS: gcc.c-torture/unsorted/r.c,  -O0  
PASS: gcc.c-torture/unsorted/r.c,  -O1  
PASS: gcc.c-torture/unsorted/r.c,  -O2  
PASS: gcc.c-torture/unsorted/r.c,  -O3 -fomit-frame-pointer  
PASS: gcc.c-torture/unsorted/r.c,  -O3 -g  
PASS: gcc.c-torture/unsorted/r.c,  -Os  
PASS: gcc.c-torture/unsorted/r.c,  -O2 -flto -flto-partition=none  

Last lines of gcc/testsuite/gcc/gcc.log:

Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/  -w  -O2 -flto -flto-partition=none   -c  -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c    (timeout = 300)
spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -flto-partition=none -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c
PASS: gcc.c-torture/unsorted/r.c,  -O2 -flto -flto-partition=none  
Executing on host: /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/  -w  -O2 -flto   -c  -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c    (timeout = 300)
spawn /media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/xgcc -B/media/data/home/tschwinge/tmp/gcc/hurd.build/gcc/ -w -O2 -flto -c -o /home/tschwinge/tmp/gcc/hurd.build/gcc/testsuite/gcc/r.o /home/tschwinge/tmp/gcc/hurd/gcc/testsuite/gcc.c-torture/unsorted/r.c

The root filesystem is sort-of deadlocked: syncfs -c / doesn't finish -- even without -s. But it is fine to spawn new processes, execute new commands, etc.

GDB on 3567:

(gdb) info threads
  2 Thread 3567.2  0x011aaf4c in mach_msg_trap () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/mach_msg_trap.S:2
* 1 Thread 3567.1  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
(gdb) bt
#0  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
#1  0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27
#2  0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31
#3  0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89
#4  __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814
#5  0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86
#6  0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37
#7  0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62
#8  0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466
#9  0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119
#10 0x0105bea6 in ?? () from /usr/lib/libexpect.so.5.44.1.15
#11 0x0105bf6d in ?? () from /usr/lib/libexpect.so.5.44.1.15
#12 0x0105c229 in exp_getptyslave () from /usr/lib/libexpect.so.5.44.1.15
#13 0x0103e4b2 in ?? () from /usr/lib/libexpect.so.5.44.1.15
#14 0x01087d79 in ?? () from /usr/lib/libtcl8.5.so.0
#15 0x01088beb in ?? () from /usr/lib/libtcl8.5.so.0
#16 0x0108826a in Tcl_EvalEx () from /usr/lib/libtcl8.5.so.0
#17 0x0108985f in TclEvalObjEx () from /usr/lib/libtcl8.5.so.0
[...]
(gdb) bt full
#0  0x011aaf9c in swtch_pri () at /build/buildd-eglibc_2.11.2-7-hurd-i386-6JVoJz/eglibc-2.11.2/build-tree/hurd-i386-libc/mach/swtch_pri.S:2
No locals.
#1  0x011ac824 in __spin_lock_solid (lock=0x131e8e0) at spin-solid.c:27
No locals.
#2  0x011aca1d in __mutex_lock_solid (lock=0x131e8e0) at mutex-solid.c:31
No locals.
#3  0x0122dd0b in __mutex_lock (oldmem=0x8076458, bytes=94) at ../mach/lock-intern.h:89
No locals.
#4  __libc_realloc (oldmem=0x8076458, bytes=94) at malloc.c:3814
        ar_ptr = <value optimized out>
        nb = 104
        newp = 0x68
        oldp = 0x8076450
        oldsize = 104
        __func__ = "__libc_realloc"
#5  0x0121de62 in _IO_vasprintf (result_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n", args=0x15f3c9c "") at vasprintf.c:86
        sf = {_sbf = {_f = {_flags = -72515584, 
              _IO_read_ptr = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
              _IO_read_end = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
              _IO_read_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
              _IO_write_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
              _IO_write_ptr = 0x80764b5 "", _IO_write_end = 0x80764bc "\201\004", 
              _IO_buf_base = 0x8076458 "expect: ../sysdeps/mach/hurd/fork.c:466: __fork: Unexpected error: (os/kern) urefs overflow.\n", 
              _IO_buf_end = 0x80764bc "\201\004", _IO_save_base = 0x0, _IO_backup_base = 0x0, _IO_save_end = 0x0, _markers = 0x0, _chain = 0x0, _fileno = 20046008, 
              _flags2 = 0, _old_offset = 23018464, _cur_column = 0, _vtable_offset = 49 '1', _shortbuf = "\001", _lock = 0x0, _offset = 85643859813466072, 
              _codecvt = 0x1304583, _wide_data = 0x15f3bc0, _freeres_list = 0x0, _freeres_buf = 0x15f3c58, _freeres_size = 0, _mode = -1, 
              _unused2 = "\240;_\001\236D0\001\005\000\000\000\340;_\001(K3\001\000\000\000\000\005\000\000\000\000\000\000\000\250\230\060\001\260\303\031\001"}, 
            vtable = 0x131b9c0}, _s = {_allocate_buffer = 0x122cdf0 <__libc_malloc>, _free_buffer = 0x122cd20 <__libc_free>}}
        ret = <value optimized out>
        needed = 94
#6  0x01206d1b in ___asprintf (string_ptr=0x15f40c0, format=0x13098a8 "%s%s%s:%u: %s%sUnexpected error: %s.\n") at asprintf.c:37
        done = 1
#7  0x011e2fc8 in __assert_perror_fail (errnum=19, file=0x1305a98 "../sysdeps/mach/hurd/fork.c", line=466, function=0x1305acf "__fork") at assert-perr.c:62
        errbuf = "\334\r", '\000' <repeats 14 times>, "\f\265\032\001\000\000\000\000x\262\004\b\000\000\000\000\000\000\000\000\377\377\377\377 \262\004\b\250\065\063\001\070A_\001k\000\000\000\000\000\000\000ı2\001\002", '\000' <repeats 11 times>"\366, \377\377\377\270\235\004\bk\000\000\000X=_\001\037\343\037\001\377\377\377\377\000\000\000\000s=_\001\361\t\006\001\070A_\001\362\t\006\001\350\t\006\001\000\000\000\000\304B_\001\350\t\006\001\330\377_\001\033\000\000\000)\036\024\001\364\267\025\001x=_\001Bq\022\001\000\000\000\000\350:\b\b\230=_\001\364\267\025\001X\313\031\b S\005\b\230=_\001\004\334\f\001X\313\031\b\000\022\030\b\300L\005\b\364\267\025\001\370\021\030\b S\005\b\bB_\001\a\365\f\001 S\005\b\320\021\030\b\001\000\000\000\274A_\001,\316\024\001\001\000\000\000\001\000\000\000\344"...
        buf = <value optimized out>
#8  0x012586c8 in __fork () at ../sysdeps/mach/hurd/fork.c:466
        newproc = 122
        sigthread_refs = 4
        portnames = 0x63000
        porttypes = 0x64000
        sigthread = 130
        state = {gs = 1, fs = 18236712, es = 20390776, ds = 17004532, edi = 1, esi = 143348, ebp = 23020160, esp = 0, ebx = 23020088, edx = 23020016, 
          ecx = 23020028, eax = 3966371413, eip = 23020088, cs = 18236712, efl = 0, uesp = 20138312, ss = 18488015}
        newtask = 121
        thread = 139
        thread_refs = 65534
        statecount = <value optimized out>
        nportnames = 142
        nporttypes = 142
        env = {{__jmpbuf = {20037620, 23068628, 23020252, 23020128, 23019736, 19231503}, __mask_was_saved = 0, __saved_mask = 4222451713}}
        pid = <value optimized out>
        err = EKERN_INVALID_ADDRESS
        __PRETTY_FUNCTION__ = "__fork"
        ss = 0x1376808
        threads = 0x65000
        nthreads = 2
        stopped = 0
        i = 2
#9  0x011f92e9 in do_system (line=0x15f42dc "/bin/stty sane > /dev/ttypa") at ../sysdeps/posix/system.c:119
        status = <value optimized out>
        save = <value optimized out>
        pid = <value optimized out>
        sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = 524288, sa_flags = 0}
        omask = 0
[...]

fork failed here:

   458        /* We have one extra user reference created at the beginning of this
   459           function, accounted for by mach_port_names (and which will thus be
   460           accounted for in the child below).  This extra right gets consumed
   461           in the child by the store into _hurd_sigthread in the child fork.  */
   462        if (thread_refs > 1 &&
   463            (err = __mach_port_mod_refs (newtask, ss->thread,
   464                                         MACH_PORT_RIGHT_SEND,
   465                                         thread_refs)))
   466          LOSE;

This is in the parent, before signal thread setup, registering with the proc server, and starting the new process.

The error is 19, EKERN_UREFS_OVERFLOW.

(This is likely also the reason why the error path did not execute successfully.)

On 2010-11-30 and 2010-12-04, when I had again started the GCC testsuite, it failed again, but at another position (understandably), but with the same symptoms as shown below. In particular, the thread_refs values were the same ones.

Discussion

This is likely simply a programming error in glibc's fork implementation.

Bounty

There is a FOSS Factory bounty (p273) on this task.