TODO



* Design
** physmem should probably support marking a container as being the
   default paging container for a certain task (providing a specific
   virtual address for this container).
   Then the last-resort pager in a task can run on memory in this
   container, and physmem can freely unmap that memory temporarily
   (for reorganization).  The last-resort pager would use physmem as
   its pager.  This relieves the need to wire down the memory the
   pager is running on.
   This can also be used for startup code, which is currently mapped
   directly from the starting task to the started task.  This is not
   acceptable under the paranoid constraint that all mappings must be
   installed by physmem, to avoid DoS attacks on the page tables.

   Deep integration in libhurd-cap-server to handle page fault
   messages transparently as RPC messages could be provided by hooks.
   Alternatively, a special server thread could be designated to
   handle the page faults.

   This does not eliminate all need to wire down memory.  Buffers for
   receiving string items must not page fault, and although physmem
   could be trusted to handle such a page fault, the server has no way
   to enforce the use of a trusted pager for such memory on the client
   side.  So, either some form of wiring must still be supported, or
   containers or other trusted buffer objects must be used instead of
   string items.


* libl4
** The main TODO list for libl4 is in the file libl4/TODO.
** We need cancellable forms of ipc() and lipc()!


* configure.ac
** Allow user to specify location of libc.a.


* laden
** Implement the Generic Booting Protocol (Appendix J
   l4-x2-20040823.pdf and newer).
** Overlaps between modules and destination regions should be resolved
   intelligently.
** Support for sigma1 needs to be added when sigma1 exists.
** Shutdown should sleep a couple of seconds before reboot.  How can
   this be done without any operating system?  (Maybe use the BIOS.)
** When L4 supports it, the UTCB area of the rootserver should be set
   in the KIP.
** Memory descriptors need to be constructed and handled carefully;
   verify that everything is all right.  In particular: conventional
   memory overriding non-conventional memory in the descriptor list is
   not supported, but should be.
** Add loaded modules as bootloader specific types to memory
   descriptor list (for sigma0 and wortel).  But check with the
   Generic Booting Protocol specification first!
** Fix the memory descriptors: Consistently set the high value right.
   Mark all bootloader stuff as bootloader specific, so that L4 does
   not accidentally scribble over it.  This includes the GRUB info as
   well as all modules beyond the rootserver module.
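
The override behaviour wanted for the memory descriptor items above
can be sketched as a lookup in which a later descriptor takes
precedence over an earlier one covering the same address.  This is
only a sketch under that assumption; the names mem_desc, mem_type and
mem_type_of are hypothetical and do not come from laden or libl4, and
the type values are illustrative only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical memory types; names and values are illustrative.  */
enum mem_type
{
  MEM_UNDEFINED = 0,
  MEM_CONVENTIONAL,
  MEM_RESERVED,
  MEM_BOOTLOADER
};

/* Hypothetical descriptor: covers [LOW, HIGH], HIGH inclusive.  */
struct mem_desc
{
  uintptr_t low;
  uintptr_t high;
  enum mem_type type;
};

/* Return the type of ADDR.  The whole list is scanned and the last
   matching descriptor wins, so a conventional entry that follows a
   non-conventional one overrides it for the overlapping range.  */
static enum mem_type
mem_type_of (const struct mem_desc *descs, size_t count, uintptr_t addr)
{
  enum mem_type type = MEM_UNDEFINED;
  size_t i;

  assert (descs != NULL || count == 0);
  for (i = 0; i < count; i++)
    if (addr >= descs[i].low && addr <= descs[i].high)
      type = descs[i].type;

  return type;
}
```

With a reserved descriptor for 0x0-0xfffff followed by a conventional
one for 0x1000-0x1fff, an address inside the second range resolves to
conventional memory and the rest of the first range stays reserved.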


* root server libraries
** More code should be explicitly shared by the root servers.
** Use ptmalloc, not malloc+USE_MALLOC_LOCK.


* wortel
** Use the Generic Booting Protocol (Appendix J l4-x2-20040823.pdf and
   newer).  Needs corresponding support in laden.
** Conventional memory overriding non-conventional memory in the
   descriptor list is not supported, but should be.


* libhurd-slab
** Ideally this would be a feature in glibc.
** Support having the pager reap stuff (needs a wrapper around reap()
   that does locking).

* libhurd-ihash
** Can be merged back into the Hurd if the callers are changed.


* libhurd-cap-server
** Implement propagation support, so that worker threads, such as
   those for select or notifications, can propagate RPCs to another
   thread.  This must update the pending_rpc table (the worker thread
   can then return with ENOREPLY) for cancellation support.  Of
   course, the new receiver thread must be able to deal with
   cancellation.

   One problem is that the new processing thread can't know which RPC
   is cancelled.  Yuck!

   So, maybe, to cancel, the manager could just propagate the
   cancellation request.  For this to work, we need to be able to
   differentiate between normal pending workers and such sub-managers.
** Implement cap transfer.
** Implement reference management and a no-sender callback when the
   last reference by a client is dropped.
** Use of <atomic.h>, which is not a public header file!
** It should be allowed to call hurd_cap_obj_rele() with only one
   reference.
** Neal points out that the placement of the cap-class argument in
   hurd_cap_class_init and hurd_cap_class_create is very much
   divergent.


* L4 (for lack of a better place)
** Check that L4 does not schedule the client when the server makes a
   non-blocking reply.
** Check that L4 does schedule the server when the client makes a
   blocking call.
** What happens with map and grant items if IPC is aborted due to an
   xfer timeout?
   Answer: Current implementation: They are processed up to the string
   item in which the page fault occurred.
** Wishlist for ABI changes:
*** [ia32] Use %fs or %gs:4 for the TCB pointer instead of %gs:0, to
    free that one for the ia32 TLS ABI.
    Answer: Current patch uses %gs:4 for UTCB and %gs:0 for TLS.
    Problem: As the gs segment is not 4GB in size (to allow small
    space protection), %gs:OFFSET access to TLS is not allowed.
*** Use the xfer timeout of the other side for pagefault timeouts,
    instead of the minimum (so pageouts on your side don't abort IPC
    operations if you need to restrict the xfer timeout to zero).
    Alternatively: Have another set of xfer timeouts for that use.
    Solution: Patches for both have been developed.  The problem with
    the first approach is its limited generality (but it should be ok
    for us).  The problem with the alternative approach is that if
    multiple page faults occur on both sides, the semantics are
    unclear and sometimes undesirable.
*** Deferred cancellation: If you want to safely cancel an IPC
    operation in another thread, you need heavy high-level support
    (sigstate) to avoid race conditions.  It would be very useful to
    have support for this at a rather low level of L4-only (without
    massive libc support), for example in hardware drivers (IRQ
    handler).
    Proposal: Extend ExchangeRegisters to allow atomically aborting
    pending IPC operations or set a deferred cancellation flag.  On
    the next IPC, the destination thread will then abort the IPC
    operation before even starting it if the cancel flag is set
    (and clear it).
    Response: The L4 people suggest using preemption delay, but this
    has trust issues, is not well defined (cpu time vs instruction
    count) and only works if the interacting threads run on the same
    CPU.  Could be done for IRQ handlers, for example, but not for user
    threads.
*** IPC to stopped threads should succeed: If a thread is stopped
    while in an IPC receive operation, any attempt to send a message to
    it (with a short timeout) will fail because the dest thread is not
    ready.  But this means that you cannot safely stop threads if the
    IPC operation cannot be retried (which is usually the case if the
    timeout is short, for example in the case of server reply messages
    in an RPC context).  Stopping threads is necessary for debugging,
    cancellation etc.
    The same problem occurred when implementing debugging support in
    Sawmill (paper found on the net).
    Again, this could be worked around with heavy wrappers around IPC
    operations, and corresponding information in sigstate etc.  But
    this is undesirable.
    Suggestion: Allow an IPC operation to a stopped thread to succeed.
    This is possible because only MRs and string items need to be
    transferred, and usually no cooperation by the destination thread
    is necessary.
    Problem: There is one potentially troublesome boundary condition:
    If string items are transferred, and a page fault occurs in the
    receiver's address space, what should happen?  Should the
    receiver's thread state be modified to fake a page fault message
    to its pager?  IMO, this would be OK.  In this case, the pager
    could immediately process the page fault (in case it is running),
    and send its reply - which would be received even if the thread is
    still stopped, and transferring string items could be resumed.
    Alternatively, the actual page fault message could be delayed until
    the thread is resumed, in which case the likelihood is increased
    that an xfer timeout will occur.
    Our own needs are more modest: As we will always send reply
    messages with timeout 0, all page faults in the receiver will abort
    the IPC anyway.
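
The deferred cancellation proposal in the wishlist above can be
modelled in user space as a per-thread flag that is tested and
cleared atomically when an IPC operation starts.  This is only a
sketch of the intended semantics, not L4 code: thread_state,
request_cancel and ipc_begin are hypothetical names, and
request_cancel merely stands in for the proposed ExchangeRegisters
extension.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-thread state; not an L4 data structure.  */
struct thread_state
{
  atomic_bool cancel_pending;
};

enum ipc_result
{
  IPC_OK,
  IPC_CANCELED
};

/* Stand-in for the proposed ExchangeRegisters extension: mark the
   target thread so that its next IPC operation is aborted.  */
static void
request_cancel (struct thread_state *t)
{
  assert (t != NULL);
  atomic_store (&t->cancel_pending, true);
}

/* Stand-in for IPC entry.  The flag is read and cleared in a single
   atomic step, so a cancellation request is consumed exactly once
   and cannot be lost between the check and the clear.  */
static enum ipc_result
ipc_begin (struct thread_state *t)
{
  assert (t != NULL);
  if (atomic_exchange (&t->cancel_pending, false))
    return IPC_CANCELED;

  /* The actual message transfer would start here.  */
  return IPC_OK;
}
```

A request made while the target is not in an IPC operation aborts
only the next operation; the one after that proceeds normally.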

** Bugs:
*** See patches in README


* Servers
** The task server can hang if it needs to create a thread and is out
   of memory while physmem wants to create a worker thread.  Task
   will then contact physmem to allocate more memory, physmem will
   contact task to create a new worker thread, and the system will
   deadlock.  This needs some hackery to break out of it.
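
One conceivable way to break the cycle described above, offered
purely as an assumption and not as the planned fix, is for task to
serve thread creation requests that come from physmem out of a
statically allocated reserve, so that this path never calls back
into physmem.  All names below (reserve_alloc, thread_mem_alloc,
RESERVE_SLOTS, SLOT_SIZE) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define RESERVE_SLOTS 4
#define SLOT_SIZE 256

/* Statically allocated reserve; never obtained from physmem.  */
static _Alignas (max_align_t) unsigned char reserve[RESERVE_SLOTS][SLOT_SIZE];
static size_t reserve_used;

/* Hand out one slot from the reserve, or NULL if it is exhausted.
   Slots are never returned in this sketch.  */
static void *
reserve_alloc (size_t size)
{
  assert (size <= SLOT_SIZE);
  if (reserve_used == RESERVE_SLOTS)
    return NULL;
  return reserve[reserve_used++];
}

/* Allocate memory for a new thread.  Requests made on behalf of
   physmem use the reserve, so they cannot trigger a call back into
   physmem.  All other requests go through ASK_PHYSMEM.  */
static void *
thread_mem_alloc (size_t size, int for_physmem,
                  void *(*ask_physmem) (size_t))
{
  if (for_physmem)
    return reserve_alloc (size);

  assert (ask_physmem != NULL);
  return ask_physmem (size);
}
```

The reserve only bounds the problem: once it is used up, requests
from physmem fail instead of deadlocking, and refilling it would have
to happen at a point where calling physmem is safe.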

Copyright 2003, 2004 Free Software Foundation, Inc.
Written by Marcus Brinkmann <marcus@gnu.org>

This file is free software; as a special exception the author gives
unlimited permission to copy and/or distribute it, with or without
modifications, as long as this notice is preserved.
 
This file is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY, to the extent permitted by law; without even the
implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.