bgoglin edited this page Mar 2, 2012 · 89 revisions

Welcome to the CCI Status page. We will track progress weekly here.

UDP LAN

Working capabilities: RMA (both read and write), RNR on the receiver side, short messages, connection management.

Working on:

  • RNR: Implementation finished, just need to double check with Scott that everything is fine.

  • mss < SOCK_MIN_MSS: when the MSS specified during the connection handshake is lower than the minimum MSS associated with the socket, reject the socket.
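
The MSS check above could be sketched roughly as follows. This is only an illustration, not the transport's actual code: the value given to SOCK_MIN_MSS here, the `conn_decision_t` type, and the `check_handshake_mss()` helper are all assumptions made for the example.

```c
#include <stdint.h>

/* Assumed placeholder value; the real SOCK_MIN_MSS is defined by the transport. */
#ifndef SOCK_MIN_MSS
#define SOCK_MIN_MSS 128u
#endif

/* Hypothetical outcome of examining a connection request. */
typedef enum {
    CONN_ACCEPT = 0,
    CONN_REJECT = 1
} conn_decision_t;

/* Reject the request when the MSS carried in the handshake is
 * below the minimum MSS associated with the socket. */
static conn_decision_t check_handshake_mss(uint32_t requested_mss)
{
    return (requested_mss < SOCK_MIN_MSS) ? CONN_REJECT : CONN_ACCEPT;
}
```

With this sketch, a request carrying exactly SOCK_MIN_MSS is accepted and anything smaller is rejected.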

Code checking:

  • clang: last run on 02/01/2012 (UDP branch)

UDP WAN

GNI

Supports cci_connect/cci_accept and associated events. Supports cci_send and associated events. cci_send performance leaves much to be desired, and it is not yet clear where the problem is. This week will center on fixing the performance issue(s).

OFA Verbs

Updated 2012/01/14

Working:

  • Connect, accept, reject

  • Connect full MSG payload

  • MSGs: RO (RU will be ordered since IB is ordered)

  • RMA: Write and Read with/without completion MSG

  • Run through valgrind - all verbs leaks fixed (still 6KB in the ltdl code)

  • Tests: pingpong and stream

  • Reviewed with clang static analyzer

ToDo:

  • MSGs: UU provide immediate completion, handle cleanup after transmission

  • MSGs: UU further investigate using UD rather than RC QP

  • Missing: set_opt, get_opt

  • Replace rdma_getaddrinfo to avoid recent lib dependency (maybe)

  • Test fence

  • Many-to-one tests/overrun issues?

  • Handle spurious disconnects, etc.

Ethernet

Should have MSGs by end of Jan. RMA in Feb.

See the HowTo if you want to use it.

Update 2012/03/05

  • RO connection (receive side only, send completions are still RU)

  • Updated to latest connect/accept events, handle connect nacks.

  • Need to tune reliability. Resend is very slow; it should be made faster without killing the bandwidth.

  • MSG reliability seems to work. Stream is fine.

  • Still about +300ns for RU vs UU.

  • Driver works with kernels >= 2.6.32

  • Reliable connect/accept/reject (acks for all these) with timeout, and nacks if closed in the meantime

  • Basic event-delivery through syscalls (list of events in the kernel)

  • UU MSG

  • Interfaces can be removed/modified by the kernel without breaking everything, even if endpoints use them

TODO

  • TODO RO send completion events
  • TODO SILENT flag completion ordering
  • TODO Optimize reliability
  • TODO Force a delayed ack as soon as we have a misorder
  • TODO Immediate ack if still misordered after many other receives?
  • TODO Don't change the timer if already scheduled
  • TODO If not piggyacking the bitmap, add a force_ack_with_bitmap flag
  • TODO Throttle the sender to avoid multiple resends because the receiver can only ack 32 in advance?
  • TODO MSG send flags
  • TODO Basic disconnect that just closes everything, discuss the semantics about pending MSGs later
  • TODO More checks in connect messages (and MSG): at least req_seqnum
  • TODO RMA
  • TODO Proper event-delivery through mmapped shared buffer to avoid syscalls
  • TODO Improve ack status and ccieth_pkt_ack_status_to_errno()
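
Several of the reliability TODOs above (detecting a misorder, and throttling the sender because the receiver can only ack 32 in advance) can be illustrated with a small sketch. It is not the driver's code: `struct rx_window`, `rx_record()`, `tx_may_send()` and the bitmap layout (bit i stands for seqnum next_expected + i) are assumptions made for the example. Only the window size of 32 comes from the list above.

```c
#include <stdint.h>
#include <stdbool.h>

#define ACK_WINDOW 32  /* the receiver can only ack 32 in advance */

/* Hypothetical receiver state: every seqnum before next_expected arrived;
 * bit i of bitmap is set when seqnum next_expected + i arrived early. */
struct rx_window {
    uint32_t next_expected;
    uint32_t bitmap;
};

enum rx_result { RX_IN_ORDER, RX_MISORDERED, RX_DUPLICATE, RX_OUT_OF_WINDOW };

/* Record an incoming seqnum and tell the caller what happened.
 * RX_MISORDERED is the point where a delayed ack could be forced. */
static enum rx_result rx_record(struct rx_window *w, uint32_t seqnum)
{
    int32_t d = (int32_t)(seqnum - w->next_expected);  /* wrap-safe distance */

    if (d < 0)
        return RX_DUPLICATE;          /* already delivered */
    if (d >= ACK_WINDOW)
        return RX_OUT_OF_WINDOW;      /* cannot be described by the bitmap */
    if (w->bitmap & (1u << d))
        return RX_DUPLICATE;          /* already recorded */

    w->bitmap |= 1u << d;
    /* Slide the window past every packet that is now contiguous. */
    while (w->bitmap & 1u) {
        w->bitmap >>= 1;
        w->next_expected++;
    }
    return d == 0 ? RX_IN_ORDER : RX_MISORDERED;
}

/* Sender-side throttle: do not run further ahead of the oldest unacked
 * packet than the receiver's bitmap can acknowledge. */
static bool tx_may_send(uint32_t next_seqnum, uint32_t oldest_unacked)
{
    return (uint32_t)(next_seqnum - oldest_unacked) < ACK_WINDOW;
}
```

In this sketch, a sender whose oldest unacked packet is seqnum 0 may send seqnums 0 through 31 and must wait before sending 32.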

Open questions

  • Should piggyack in MSG contain the acked bitmap as well?
  • Would simplify management because explicit acks and piggyacks would be the same
  • Should we immediately resend some packets when a selective ack arrives? Only if not resent recently?
  • Resend is currently very slow (0.5s), but we don't want to resend too often because it would waste throughput
  • Can we generate events from user-space without breaking future shared-memory mapped ring?
  • At least send and accept events are good candidates
  • If we want to nack MSG when target endpoint or connection doesn't exist, we need the source ep/conn id in all MSG headers
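
One possible answer to the selective-ack question above is sketched here, purely as a basis for discussion. It resends a packet right away only when the selective ack shows a hole (a later packet arrived, this one did not) and the packet was not sent recently. `struct tx_pkt`, `should_resend_on_sack()`, the 32-bit bitmap layout (bit i stands for seqnum acked_upto + i) and the MIN_RESEND_GAP_US value are all assumptions, not existing code or agreed semantics.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed minimum delay between two sends of the same packet. */
#define MIN_RESEND_GAP_US 1000u

/* Hypothetical per-packet sender state. */
struct tx_pkt {
    uint32_t seqnum;
    uint64_t last_sent_us;  /* time of the most recent (re)send */
};

/* acked_upto: first seqnum the receiver has not received yet.
 * bitmap: bit i set when seqnum acked_upto + i was received. */
static bool should_resend_on_sack(const struct tx_pkt *p, uint32_t acked_upto,
                                  uint32_t bitmap, uint64_t now_us)
{
    int32_t d = (int32_t)(p->seqnum - acked_upto);  /* wrap-safe distance */

    if (d < 0 || d >= 32)
        return false;  /* already acked, or not covered by this bitmap */
    if (bitmap & (1u << d))
        return false;  /* selectively acked */
    if ((bitmap >> d) == 0)
        return false;  /* no later packet arrived, so no sign of loss yet */

    /* A hole: resend, but only if it was not resent recently. */
    return now_us - p->last_sent_us >= MIN_RESEND_GAP_US;
}
```

Packets that fail this test would still be covered by the regular resend timer, so the rate limit trades a slightly later recovery for less wasted throughput.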