- Title
Mitigation of NUMA and synchronization effects in high-speed network storage over raw Ethernet.
- Authors
González-Férez, Pilar; Bilas, Angelos
- Abstract
Current storage trends dictate placing fast storage devices in all servers and using them as a single distributed storage system. In this converged model where storage and compute resources co-exist in the same server, the role of the network is becoming more important: network overhead is becoming a main limitation to improving storage performance. At the same time, server consolidation dictates building servers that employ non-uniform memory architectures (NUMA) to scale memory performance and bundling multiple network links to increase network throughput. In this work, we use Tyche, an in-house protocol for network storage based on raw Ethernet, to examine and address (a) performance implications of NUMA servers on the end-to-end path and (b) synchronization issues with multiple network interfaces (NICs) and multicore servers. We evaluate NUMA and synchronization issues on a real setup with multicore servers and six 10 GBits/s NICs on each server and we find that: (a) NUMA effects have a significant negative impact and can reduce throughput by almost 2× on our servers with as few as eight cores (16 hyper-threads). We design protocol extensions that almost entirely eliminate NUMA effects by encapsulating all protocol structures in a 'channel' concept and then carefully mapping channels and their resources to NICs and NUMA nodes. (b) The traditional inline approach, where each thread accesses the NIC to post storage requests, is preferable to a queuing approach that trades locks for context switches, especially when the protocol is NUMA-aware. Overall, our results show that dealing with NUMA affinity and synchronization issues in network storage protocols allows network throughput between the target and initiator to scale by a factor of 2× and beyond 60 GBits/s.
- Subjects
NON-uniform memory access; COMPUTER programming ability testing; COMPUTER performance; COMPUTER storage devices; ETHERNET
- Publication
Journal of Supercomputing, 2016, Vol 72, Issue 11, p4129
- ISSN
0920-8542
- Publication type
Article
- DOI
10.1007/s11227-016-1726-7