Tsinghua Science and Technology  2021, Vol. 26 Issue (4): 452-463    doi: 10.26599/TST.2020.9010018
    
Trident: Efficient and Practical Software Network Monitoring
Xiaohe Hu, Yang Xiang, Yifan Li, Buyi Qiu, Kai Wang, Jun Li*
Department of Automation, Tsinghua University, Beijing 100084, China.
Yunshan Networks, Beijing 100084, China.
Research Institute of Information Technology, Tsinghua University, Beijing 100084, China.

Abstract  

Network monitoring is receiving more attention than ever, driven by the need for self-driving networks to tackle increasingly severe network management challenges. Advanced management applications rely on traffic data analyses, which require network monitoring to flexibly provide comprehensive traffic characteristics. Moreover, in virtualized environments, software network monitoring is constrained by the available resources and the requirements of cloud operators. This paper proposes Trident, a policy-based network monitoring system at the host. Trident uses a novel monitoring approach, off-path configurable streaming, which offers remote analyzers a fine-grained, holistic view of the network traffic. A novel fast-path packet classification algorithm and a corresponding cached flow form are also proposed to improve monitoring efficiency. Evaluated in a practical deployment, Trident demonstrates negligible interference with forwarding and requires no additional software dependencies. Trident has been deployed in the production networks of several Tier-IV datacenters.



Key words: cloud networking; software network monitoring; network programmability; network management
Received: 17 January 2020      Published: 12 January 2021
Fund: National Natural Science Foundation of China (61872212)
Corresponding Authors: Jun Li     E-mail: hu-xh14@mails.tsinghua.edu.cn; xiangyang@yunshan.net.cn; liyifan18@mails.tsinghua.edu.cn; buyi@yunshan.net; wangkai@yunshan.net; junl@tsinghua.edu.cn
About author:
Xiaohe Hu received the BEng degree from Tsinghua University, China in 2014. He is now a PhD candidate at the Department of Automation, Tsinghua University, China. His research interests include software-defined networking, cloud datacenter networks, and network monitoring and management.
Yang Xiang received the BS degree from Jilin University, China in 2008, and the PhD degree from Tsinghua University, China in 2013. He is currently a software engineer at Yunshan Networks. His research interests include software-defined networking, network architecture, and intrusion detection.
Yifan Li received the BEng degree from Tsinghua University, China in 2018. He is now a PhD candidate at the Department of Automation, Tsinghua University, China. His research interests include network verification, cloud datacenter networks, and network monitoring and management.
Buyi Qiu received the BEng degree from Northeastern University, China in 2012. He is now a software engineer at Yunshan Networks. His research interests include cloud datacenter networks, network monitoring, and network troubleshooting.
Kai Wang received the BS degree from Nanjing University, China in 2009 and the PhD degree from Tsinghua University, China in 2015. He is currently a software engineer at Yunshan Networks. His research interests include network security and software-defined networking.
Jun Li received the BEng and MEng degrees from Tsinghua University, China in 1985 and 1988, respectively, and the PhD degree from New Jersey Institute of Technology, USA in 1997. Currently, he is a professor at the Research Institute of Information Technology, Tsinghua University, China. His research interests include network security, pattern recognition, and image processing.
Cite this article:

Xiaohe Hu,Yang Xiang,Yifan Li,Buyi Qiu,Kai Wang,Jun Li. Trident: Efficient and Practical Software Network Monitoring. Tsinghua Science and Technology, 2021, 26(4): 452-463.

URL:

http://tst.tsinghuajournals.com/10.26599/TST.2020.9010018     OR     http://tst.tsinghuajournals.com/Y2021/V26/I4/452

Fig. 1 Basic framework of a self-driving network.
Category | Local counting | Direct streaming
On-path | Hash-based: UMON[13], On-path-FCAP[29] | Port mirroring: OpenStack Tap-as-a-Service[10]
On-path | Sketch-based: On-path-SMON[29] | Configured forwarding rules: Open vSwitch[11], VFP[12]
Off-path | Hash-based: Trumpet[14], Off-path-FCAP[29] | Configured monitoring policies: Trident
Off-path | Sketch-based: SketchVisor[15], Off-path-SMON[29] |
Table 1 Summary of the software network monitoring work in the data plane.
Fig. 2 TSS algorithm example with a two-field rule set.
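For readers unfamiliar with the tuple space search (TSS) algorithm named in Fig. 2, the general idea can be sketched in Python as follows. This is a minimal illustration and not the paper's implementation. It assumes a two-field rule set of IPv4 source and destination prefixes; the class name `TupleSpaceSearch`, the priorities, and the action strings are hypothetical. Rules that share the same pair of prefix lengths (a "tuple") are stored in one hash table, and a lookup probes every table once.

```python
from ipaddress import ip_address, ip_network


def mask(value: int, length: int) -> int:
    """Keep the top `length` bits of a 32-bit IPv4 address."""
    return value & (((1 << length) - 1) << (32 - length))


class TupleSpaceSearch:
    """Illustrative two-field TSS classifier (hypothetical names)."""

    def __init__(self):
        # One hash table per tuple (src prefix length, dst prefix length).
        self.tables = {}

    def add_rule(self, src_prefix, dst_prefix, priority, action):
        src = ip_network(src_prefix)
        dst = ip_network(dst_prefix)
        tup = (src.prefixlen, dst.prefixlen)
        key = (int(src.network_address), int(dst.network_address))
        table = self.tables.setdefault(tup, {})
        old = table.get(key)
        # Keep only the highest-priority rule for identical keys.
        if old is None or priority > old[0]:
            table[key] = (priority, action)

    def classify(self, src_ip, dst_ip):
        """Return (action, number of hash tables probed)."""
        src = int(ip_address(src_ip))
        dst = int(ip_address(dst_ip))
        best = None
        probes = 0
        for (src_len, dst_len), table in self.tables.items():
            probes += 1
            hit = table.get((mask(src, src_len), mask(dst, dst_len)))
            if hit is not None and (best is None or hit[0] > best[0]):
                best = hit
        return (best[1] if best else None), probes


tss = TupleSpaceSearch()
tss.add_rule("10.0.0.0/8", "0.0.0.0/0", 1, "count")
tss.add_rule("10.1.0.0/16", "192.168.0.0/16", 5, "mirror")
tss.add_rule("10.2.0.0/16", "172.16.0.0/16", 3, "sample")
action, probes = tss.classify("10.1.2.3", "192.168.5.5")
```

In this sketch the lookup cost grows with the number of distinct tuples rather than with the number of rules, which is why the number of hash table lookups is a natural metric for comparing such schemes.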
Fig. 3 Fast- and slow-path framework.
Fig. 4 Architecture of Trident. Trident performs off-path traffic monitoring within the host hypervisor and interacts with the remote controller and analyzers.
Fig. 5 Example of the USS construction process.
Rule set | ACL1_100 | ACL1_103 | ACL1_104 | FW1_100 | FW1_103 | FW1_104
TSS with megaflow | 1 | 1 | 1 | 3 | 2 | 2
USS with uniflow | 1 | 1 | 1 | 1 | 1 | 1
Table 2 Comparison of fast-path hash table lookup times for TSS with megaflow and USS with uniflow. The x in rule sets ACL1_x and FW1_x represents the rule size of the set.
Fig. 6 Average cached flow size of the megaflow and uniflow, representing the fast path caching hit rate. Y-axis is shown in the log scale. On each flow entry, the space size is calculated by multiplying each field size.
Traffic pattern | Header statistics: Mon | Header statistics: Copy | Header statistics: Sum | Packet mirroring: Mon | Packet mirroring: Copy | Packet mirroring: Sum
Random-64 | 12.23 | 11.91 | 24.14 | 58.32 | 9.37 | 67.69
Random-512 | 10.43 | 8.72 | 19.15 | 59.18 | 9.83 | 69.01
Random-1024 | 11.38 | 14.15 | 25.53 | 62.69 | 13.28 | 75.97
Random-1458 | 12.15 | 16.21 | 28.36 | 100 | 17.85 | 117.85
CAIDA | 14.68 | 3.64 | 18.32 | 55.97 | 3.58 | 59.55
Table 3 Trident CPU usage (%) at 2 × 10^5 pps. The Random-x traffic pattern represents packets with randomly varying source IP and ports, with the packet length set to x. Mon represents the CPU usage of the Trident process. Copy represents the CPU usage of the copy overhead introduced to the forwarding path. Sum represents the sum of Mon and Copy.
Fig. 7 Example of the traffic rate when Trident monitors and compresses the header statistics.
Fig. 8 Demo showing that Trident can dynamically vary the sampling ratio to keep the CPU usage at 10%. The sampling ratio is the number of monitored packets divided by the number of forwarded packets.
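The sampling ratio defined in the Fig. 8 caption, and one way it could be adjusted toward a CPU budget, can be sketched in Python as below. This is an illustrative sketch only; the source does not describe Trident's control logic. The function names, the clamping bounds, and the assumption that CPU usage scales roughly linearly with the sampling ratio are all hypothetical.

```python
def sampling_ratio(monitored_packets: int, forwarded_packets: int) -> float:
    """Sampling ratio as defined in the caption: monitored / forwarded."""
    if forwarded_packets == 0:
        return 0.0
    return monitored_packets / forwarded_packets


def next_sampling_ratio(ratio: float, cpu_usage: float, target: float = 10.0,
                        min_ratio: float = 0.001, max_ratio: float = 1.0) -> float:
    """Propose the next sampling ratio for a CPU target given in percent.

    Assumption (not from the source): CPU usage is roughly proportional
    to the sampling ratio, so the ratio is rescaled by target / measured.
    """
    if cpu_usage <= 0:
        return max_ratio
    proposed = ratio * target / cpu_usage
    # Clamp to hypothetical bounds so the ratio stays in a sane range.
    return max(min_ratio, min(max_ratio, proposed))


# Example: at full sampling the monitor uses 20% CPU, so halve the ratio.
ratio = next_sampling_ratio(1.0, 20.0)
```

A real controller would likely smooth the measured CPU usage over a time window before reacting, to avoid oscillating on short bursts.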
[1]   Juniper, Expel complexity with a self-driving network, 2020.
[2]   Feamster N. and Rexford J., Why (and how) networks should run themselves, in Proc. Applied Networking Research Workshop, Montreal, Canada: ACM, 2018, p. 20.
[3]   Jiang J. C., Sekar V., Stoica I., and Zhang H., Unleashing the potential of data-driven networking, in Proc. 9th Int. Conf. on Communication Systems and Networks, Bengaluru, India, 2017, pp. 110-126.
[4]   Yuan Y. F., Lin D., Mishra A., Marwaha S., Alur R., and Loo B. T., Quantitative network monitoring with NetQRE, in Proc. Conf. of the ACM Special Interest Group on Data Communication, Los Angeles, CA, USA, 2017, pp. 99-112.
[5]   Gupta A., Harrison R., Canini M., Feamster N., Rexford J., and Willinger W., Sonata: Query-driven streaming network telemetry, in Proc. 2018 Conf. of the ACM Special Interest Group on Data Communication, Budapest, Hungary, 2018, pp. 357-371.
[6]   Al-Fares M., Radhakrishnan S., Raghavan B., Huang N., and Vahdat A., Hedera: Dynamic flow scheduling for data center networks, in Proc. 7th USENIX Conf. on Networked Systems Design and Implementation, San Jose, CA, USA, 2010, p. 19.
[7]   Curtis A. R., Mogul J. C., Tourrilhes J., Yalagandula P., Sharma P., and Banerjee S., DevoFlow: Scaling flow management for high-performance networks, in Proc. ACM SIGCOMM 2011 Conf., Toronto, Canada, 2011, pp. 254-265.
[8]   Roesch M., Snort-lightweight intrusion detection for networks, in Proc. 13th USENIX Conf. on System Administration, Seattle, WA, USA, 1999, pp. 229-238.
[9]   Yuan Z. L., Xue Y. B., and van der Schaar M., BitMiner: Bits mining in internet traffic classification, in Proc. 2015 ACM Conf. on Special Interest Group on Data Communication, London, UK, 2015, pp. 93-94.
[10]   OpenStack, Tap as a Service (TAPaaS), 2020.
[11]   Pfaff B., Pettit J., Koponen T., Jackson E. J., Zhou A., Rajahalme J., Gross J., Wang A., Stringer J., Shelar P., et al., The design and implementation of Open vSwitch, in Proc. 12th USENIX Conf. on Networked Systems Design and Implementation, Oakland, CA, USA, 2015, pp. 117-130.
[12]   Firestone D., VFP: A virtual switch platform for host SDN in the public cloud, in Proc. 14th USENIX Conf. on Networked Systems Design and Implementation, Boston, MA, USA, 2017, pp. 315-328.
[13]   Wang A., Guo Y., Hao F., Lakshman T. V., and Chen S. Q., UMON: Flexible and fine grained traffic monitoring in Open vSwitch, in Proc. 11th ACM Conf. on Emerging Networking Experiments and Technologies, Heidelberg, Germany, 2015, p. 15.
[14]   Moshref M., Yu M. L., Govindan R., and Vahdat A., Trumpet: Timely and precise triggers in data centers, in Proc. 2016 ACM SIGCOMM Conf., Florianopolis, Brazil, 2016, pp. 129-143.
[15]   Huang Q., Jin X., Lee P. P. C., Li R. H., Tang L., Chen Y. C., and Zhang G., SketchVisor: Robust network measurement for software packet processing, in Proc. Conf. of the ACM Special Interest Group on Data Communication, Los Angeles, CA, USA, 2017, pp. 113-126.
[16]   sFlow, 2020.
[17]   NetFlow, 2020.
[18]   Tcpdump, 2020.
[19]   Yu M. L., Jose L., and Miao R., Software defined traffic measurement with OpenSketch, in Proc. 10th USENIX Conf. on Networked Systems Design and Implementation, Boston, MA, USA, 2013, pp. 29-42.
[20]   Liu Z. X., Manousis A., Vorsanger G., Sekar V., and Braverman V., One sketch to rule them all: Rethinking network flow monitoring with UnivMon, in Proc. 2016 ACM SIGCOMM Conf., Florianopolis, Brazil, 2016, pp. 101-114.
[21]   Li Y. L., Miao R., Kim C., and Yu M. L., FlowRadar: A better NetFlow for data centers, in Proc. 13th USENIX Conf. on Networked Systems Design and Implementation, Santa Clara, CA, USA, 2016, pp. 311-324.
[22]   Moshref M., Yu M. L., Govindan R., and Vahdat A., SCREAM: Sketch resource allocation for software-defined measurement, in Proc. 11th ACM Conf. on Emerging Networking Experiments and Technologies, Heidelberg, Germany, 2015, p. 14.
[23]   Handigol N., Heller B., Jeyakumar V., Mazières D., and McKeown N., I know what your packet did last hop: Using packet histories to troubleshoot networks, in Proc. 11th USENIX Conf. on Networked Systems Design and Implementation, Seattle, WA, USA, 2014, pp. 71-85.
[24]   Zhu Y. B., Kang N. X., Cao J. X., Greenberg A., Lu G. H., Mahajan R., Maltz D., Yuan L. H., Zhang M., Zhao B. Y., et al., Packet-level telemetry in large datacenter networks, in Proc. 2015 ACM Conf. on Special Interest Group on Data Communication, London, UK, 2015, pp. 479-491.
[25]   Benson T., Anand A., Akella A., and Zhang M., MicroTE: Fine grained traffic engineering for data centers, in Proc. Seventh Conf. on Emerging Networking Experiments and Technologies, Tokyo, Japan, 2011, p. 8.
[26]   Rasley J., Stephens B., Dixon C., Rozner E., Felter W., Agarwal K., Carter J., and Fonseca R., Planck: Millisecond-scale monitoring and control for commodity networks, in Proc. 2014 ACM Conf. on SIGCOMM, Chicago, IL, USA, 2014, pp. 407-418.
[27]   Wundsam A., Levin D., Seetharaman S., and Feldmann A., OFRewind: Enabling record and replay troubleshooting for networks, in Proc. 2011 USENIX Conf. on USENIX Annu. Technical Conference, Portland, OR, USA, 2011, p. 29.
[28]   Suh J., Kwon T. T., Dixon C., Felter W., and Carter J., OpenSample: A low-latency, sampling-based measurement platform for commodity SDN, in Proc. 2014 IEEE 34th Int. Conf. on Distributed Computing Systems, Madrid, Spain, 2014, pp. 228-237.
[29]   Zha Z. L., Wang A., Guo Y., Montgomery D., and Chen S. Q., Instrumenting Open vSwitch with monitoring capabilities: Designs and challenges, in Proc. Symp. on SDN Research, Los Angeles, CA, USA, 2018, p. 16.
[30]   Gupta P. and McKeown N., Packet classification using hierarchical intelligent cuttings, in Proc. Hot Interconnects, Stanford, CA, USA, 1999.
[31]   Singh S., Baboescu F., Varghese G., and Wang J., Packet classification using multidimensional cutting, in Proc. 2003 Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communications, Karlsruhe, Germany, 2003, pp. 213-224.
[32]   Qi Y., Xu L., Yang B., Xue Y., and Li J., Packet classification algorithms: From theory to practice, in Proc. IEEE INFOCOM 2009, Rio de Janeiro, Brazil, 2009, pp. 648-656.
[33]   Srinivasan V., Suri S., and Varghese G., Packet classification using tuple space search, ACM SIGCOMM Comput. Commun. Rev., 1999, vol. 29, no. 4, pp. 135-146.
[34]   McCanne S. and Jacobson V., The BSD packet filter: A new architecture for user-level packet capture, in Proc. USENIX Winter 1993 Conf., San Diego, CA, USA, 1993, p. 2.
[35]   Begel A., McCanne S., and Graham S. L., BPF+: Exploiting global data-flow optimization in a generalized packet filter architecture, in Proc. Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication, Cambridge, MA, USA, 1999, pp. 123-134.
[36]   Yu M. L., Rexford J., Freedman M. J., and Wang J., Scalable flow-based networking with DIFANE, in Proc. ACM SIGCOMM 2010 Conf., New Delhi, India, 2010, pp. 351-362.
[37]   Liu Z., Sun S. J., Zhu H., Gao J. Q., and Li J., BitCuts: A fast packet classification algorithm using bit-level cutting, Comput. Commun., 2017, vol. 109, pp. 38-52.
[38]   Jacobson V., Compressing TCP/IP headers for low-speed serial links, 1990.
[39]   Degermark M., Nordgren B., and Pink S., IP header compression, 1999.
[40]   Jonsson L. E., Pelletier G., and Sandlund K., The Robust Header Compression (ROHC) framework, 2007.
[41]   Taylor D. E. and Turner J. S., ClassBench: A packet classification benchmark, IEEE/ACM Trans. Netw., 2007, vol. 14, no. 3, pp. 499-511.
[42]   CAIDA, The CAIDA anonymized Internet traces 2016 Dataset, 2020.
[43]   LWN, Introducing AF_PACKET V4 support, 2020.
[44]   Data Plane Development Kit (DPDK), 2020.