1RMA: Re-envisioning Remote Memory Access for Multi-tenant Datacenters

1 Background & Issues

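1RMA redraws the division of responsibility between the NIC and software: some of the work that used to be done entirely on the NIC is moved out to software
- Hardware focuses on RMA read/write and encryption
- Software handles congestion control (CC), op pacing, and timeout policy
Design goals: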
1. No Connections
  - No connection state to cache
  - Every op is treated as independent => per-op retry/failure recovery
2. Small-sized ops, solicitation based
  - Hardware solicitation window to prevent TCP incast
3. Software Congestion Control
  - dynamic
4. Software-defined resource allocation
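  - Unlike traditional RDMA, which has to cache a lot of state on the NIC to support a lossless network environment
  - Priority determines how many resources are allocated to a request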
5. First-Class Security
  - Applications are allowed to perform key rotation themselves
  - Each memory region is protected by a different key

Step1~2 Get $K_d$, $RegionID$
Step3 A request can be issued to the NIC only when the solicitation window has free space
- SW: chunking, SW->NIC issue rate, congestion control (slow)
- HW: admission control using the solicitation window (fast)
Step4 Sign the request with $K_d$
Step6 Encrypt the response with $K_d$
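
A minimal sketch of Steps 3 and 4, to make the hardware admission check and the request signature concrete. Everything here is an assumption for illustration: the class and function names are hypothetical, the window is budgeted in bytes, and HMAC-SHA256 is only a stand-in for whatever primitive the NIC actually uses with $K_d$.

```python
import hashlib
import hmac


class SolicitationWindow:
    """Hypothetical byte-budget window: admits a request only if it fits."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.in_flight_bytes = 0

    def try_admit(self, size_bytes):
        # Fast admission check: refuse anything that would overflow the window.
        if self.in_flight_bytes + size_bytes > self.capacity_bytes:
            return False
        self.in_flight_bytes += size_bytes
        return True

    def release(self, size_bytes):
        # Called when the response for an admitted request has arrived.
        self.in_flight_bytes = max(0, self.in_flight_bytes - size_bytes)


def sign_request(k_d, region_id, offset, length):
    """Sign the request fields with K_d (HMAC-SHA256 is an assumed stand-in)."""
    msg = f"{region_id}:{offset}:{length}".encode()
    return hmac.new(k_d, msg, hashlib.sha256).digest()


def verify_request(k_d, region_id, offset, length, tag):
    """Recompute the tag on the receiving side and compare in constant time."""
    expected = sign_request(k_d, region_id, offset, length)
    return hmac.compare_digest(expected, tag)
```

Software would call `try_admit` before handing a chunk to the NIC and back off when it returns `False`; the signature binds the request to the fields it names, so a tampered length or a wrong key fails verification.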

RRT: static table that stores the memory range each (RegionID, $K_r$) maps to
CST: single in-flight operation
Solicitation window: admission control; limits how many packets in the FIFO may enter the network
Number of memory regions for RMA scales with tasks, not task-pairs
- manageable with finite resources
Timeout: an op that waits too long without entering the window simply times out
- Avoids head-of-line blocking and provides a congestion signal
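
The timeout rule above can be sketched as a FIFO of pending ops in front of the window. This is a toy model, not the NIC's actual logic: `PendingQueue`, the slot-based window, and the explicit `now` argument are all assumptions made so the behavior is easy to follow.

```python
from collections import deque


class PendingQueue:
    """Hypothetical FIFO of ops waiting for a free solicitation-window slot."""

    def __init__(self, window_slots, timeout):
        self.window_slots = window_slots  # free slots currently in the window
        self.timeout = timeout            # max time an op may wait in the FIFO
        self.queue = deque()              # entries are (op_id, enqueue_time)

    def submit(self, op_id, now):
        self.queue.append((op_id, now))

    def tick(self, now):
        """Return (admitted, timed_out) op ids as of time `now`."""
        admitted, timed_out = [], []
        remaining = deque()
        for op_id, enqueued_at in self.queue:
            if now - enqueued_at > self.timeout:
                # Waited too long: fail it instead of letting it block the FIFO.
                timed_out.append(op_id)
            elif self.window_slots > 0:
                self.window_slots -= 1
                admitted.append(op_id)
            else:
                remaining.append((op_id, enqueued_at))
        self.queue = remaining
        return admitted, timed_out

    def complete(self):
        # An admitted op finished, so its window slot becomes free again.
        self.window_slots += 1
```

The `timed_out` list is what software congestion control would consume as a congestion signal, for example by lowering its issue rate before retrying.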

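RMA write: ask the remote to perform an RMA read of local memory
Con: costs one extra RTT
Pro:
- Reuses the RMA read mechanism; no redesign needed
- The client times out later than the remote
  - On a disconnect, this avoids the client not knowing whether its write to remote memory succeeded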
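
A small in-process simulation of the write-as-remote-read idea, assuming hypothetical `Host`, `rma_read`, and `handle_write_request` names. No real network or NIC is involved; the point is only that the target pulls the data with the same read path.

```python
class Host:
    """Hypothetical host with named memory regions (simulated in-process)."""

    def __init__(self, name):
        self.name = name
        self.memory = {}  # region name -> bytearray

    def rma_read(self, remote, region, offset, length):
        # One-sided read of `remote`'s memory.
        return bytes(remote.memory[region][offset:offset + length])

    def handle_write_request(self, initiator, src_region, src_offset,
                             dst_region, dst_offset, length):
        # Serve a write by reading the payload from the initiator's memory,
        # reusing the read mechanism, then storing it locally.
        data = self.rma_read(initiator, src_region, src_offset, length)
        self.memory[dst_region][dst_offset:dst_offset + length] = data
        return True


def rma_write(initiator, target, src_region, src_offset,
              dst_region, dst_offset, length):
    # The write request itself is the extra hop: it asks the target to read.
    return target.handle_write_request(
        initiator, src_region, src_offset, dst_region, dst_offset, length)
```

Usage: give the client a `src` region and the server a `dst` region, then call `rma_write(client, server, "src", 0, "dst", 0, 5)`; the server's `dst` region ends up holding the first five bytes of the client's `src`.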

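Key rotation is done with an RMA write
Low cost: install a new region key $K_r$ in 1 RRT
Problems with the traditional approach
- High transient connection usage: connections using the new key must be established first
- Bursts of connection failure: auth failures spike while the key is being swapped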
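
A toy model of why rotation is cheap when there are no connections: swapping the key is a single table update, and there is no per-connection state to rebuild. `RegionTable`, `install_key`, and the HMAC-SHA256 check are assumptions for illustration, not the actual table layout or primitive.

```python
import hashlib
import hmac


def sign(key, msg):
    """Tag a message with a region key (HMAC-SHA256 is an assumed stand-in)."""
    return hmac.new(key, msg, hashlib.sha256).digest()


class RegionTable:
    """Hypothetical region table: RegionID -> current region key."""

    def __init__(self):
        self.keys = {}

    def install_key(self, region_id, new_key):
        # Rotation is one table update; nothing else has to be re-established.
        self.keys[region_id] = new_key

    def authorize(self, region_id, msg, tag):
        key = self.keys.get(region_id)
        if key is None:
            return False
        return hmac.compare_digest(sign(key, msg), tag)
```

After `install_key` runs with a new key, tags made with the old key stop verifying and tags made with the new key verify immediately.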