Cloud flow: Network flow analysis and application traffic monitoring

How can you determine where and when your data is flowing to the cloud? In this tip, learn about network flow analysis and application traffic monitoring for cloud computing migrations.

Here’s the situation: You recently started working for a company where a cloud computing migration is under way....

The company has budgeted for IPS/IDS appliances in the cloud, but hasn’t yet purchased them, let alone deployed them. So, between the company’s data centers and those in the cloud provider’s network, the only devices you currently manage are routers and switches.

How can you tell what traffic is headed to and from the cloud?

One answer is network flow analysis (NFA), which leverages the existing flow-reporting tools in routers and some switches to provide much more complete application traffic monitoring.

With NFA for cloud flow, it’s possible to determine who’s connecting to which servers, which applications use the most bandwidth, how long users stay connected to a given service on average, and many other flow-based metrics. All you need to collect, analyze and report on this data are your existing network devices and some open source tools. The example here uses a router and open source NFA tools.

First, some basic terminology: A flow describes a collection of packets with common source and destination IP addresses, port numbers and IP-protocol numbers. Cisco Systems Inc. developed a system of flow tracking called NetFlow in the mid-1990s. A rival specification called sFlow appeared later. While NetFlow today is supported by a number of vendors, the Internet Engineering Task Force now defines open flow-format standards in its IP Flow Information Export (IPFIX) working group.
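To make the definition concrete, here is a minimal Python sketch that groups packets into flows by those five fields. The FlowKey type, the aggregate_flows helper and the sample packets are hypothetical illustrations, not part of NetFlow or of any tool discussed here, and real exporters also apply timeouts and track additional fields.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The five fields that identify a flow in the definition above."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int  # IP protocol number: 6 = TCP, 17 = UDP


def aggregate_flows(packets):
    """Group (FlowKey, byte_count) pairs into per-flow packet and byte totals."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for key, size in packets:
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)


# Hypothetical packets; the addresses come from the documentation ranges.
web = FlowKey("192.0.2.10", "198.51.100.5", 49152, 443, 6)
dns = FlowKey("192.0.2.10", "198.51.100.53", 53001, 53, 17)
packets = [(web, 1500), (dns, 91), (web, 400)]

flows = aggregate_flows(packets)
for key, totals in flows.items():
    print(key.dst_ip, key.dst_port, key.protocol, totals)
```

Packets that share all five fields land in the same flow, so the two packets to port 443 are counted together while the DNS query forms a flow of its own.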

Most NFA systems have three components: A sensor is an agent on the network device (usually a router or switch, though this could be a firewall or standalone probe) that gathers data about the traffic it sees. A collector is a server that receives and stores sensor records. Finally, there’s a reporting system that analyzes and produces reports on flow data. The collector and reporting system can reside on the same system, but this is not mandatory.

Enabling NetFlow on Cisco and Cisco-like routers is simple: Turn it on for a given interface, and then tell the router what version of NetFlow to use and where to send the flow records:

router# conf t
router(config)# int s0/3
router(config-if)# ip route-cache flow
router(config-if)# exit
router(config)# ip flow-export version 9
router(config)# ip flow-export destination <collector IP> 5678

This example uses NetFlow version 9, the most recent variant, which covers both IPv4 and IPv6. For sites using IPv4 only, NetFlow v5 is sufficient, though keep in mind that the IETF’s IPFIX work is based on NetFlow v9. The command syntax is similar on modular Cisco switches, such as those in the Catalyst 4xxx, 65xx and 76xx product lines. NetFlow isn’t supported on Cisco’s smaller, fixed-configuration switches. Other vendors’ syntax may differ, but the basic concepts of enabling flow collection and naming a collector are similar. For example, products from Arista Networks Inc., Dell Inc., Ericsson Inc., and even open source projects like Quagga use command-line interfaces (CLIs) that are similar to Cisco’s IOS, but to be clear, your mileage may vary. Also, some vendors support sFlow instead of NetFlow, but the collector described here can handle both report formats.
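The version setting shows up on the wire: every NetFlow export datagram begins with a 16-bit version number, and a v9 datagram carries a fixed 20-byte header as described in RFC 3954. The following Python sketch decodes that header. The function names and the sample datagram are hypothetical, built locally for illustration rather than captured from a router, and the sketch does not decode the templates and records that follow the header.

```python
import struct

# NetFlow v9 export header: version, count, sysUptime (ms), UNIX seconds,
# sequence number, source ID. 20 bytes in network byte order.
V9_HEADER = struct.Struct("!HHIIII")


def export_version(datagram: bytes) -> int:
    """Return the version number from the first two bytes of a datagram."""
    (version,) = struct.unpack_from("!H", datagram)
    return version


def parse_v9_header(datagram: bytes) -> dict:
    """Decode the fixed header of a NetFlow v9 export datagram."""
    fields = V9_HEADER.unpack_from(datagram)
    version, count, uptime_ms, unix_secs, sequence, source_id = fields
    if version != 9:
        raise ValueError(f"expected NetFlow v9, got version {version}")
    return {
        "version": version,
        "count": count,            # number of FlowSets records in this datagram
        "uptime_ms": uptime_ms,    # milliseconds since the exporter booted
        "unix_secs": unix_secs,    # export time, seconds since the epoch
        "sequence": sequence,      # lets a collector detect lost datagrams
        "source_id": source_id,
    }


# Hypothetical sample header, packed here purely for illustration.
sample = V9_HEADER.pack(9, 2, 123456, 1297876207, 42, 0)
header = parse_v9_header(sample)
print(export_version(sample), header)
```

A collector can use the same first two bytes to tell v5 (version 5) from v9 (version 9) exports before choosing how to decode the rest of the datagram.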

For flow collection and reporting, we’ll use the nfdump and nfsen tools, available with many Linux and BSD distributions. For this demonstration, we’ll install nfsen on a FreeBSD 8.2 server using the nfsen port:

collector# cd /usr/ports/net-mgmt/nfsen
collector# make install clean

This will take care of installing all dependencies, including nfdump.

Once the tools are installed, it's necessary to configure nfdump and nfsen. To set up nfdump, create a directory to store flow data (in this example: /home/flows/router1) and then, to verify the system is collecting flows from the router, kick off nfcapd (part of the nfdump suite) in foreground mode:

collector# nfcapd -b <collector IP> -p 5678 -l /home/flows/router1

In a new terminal window, run a quick check of nfdump, a command-line tool, to show that the collector is indeed receiving flow data:

collector# nfdump -R /home/flows/router1/nfcapd.201102161705

Date flow start          Duration Proto      Src IP Addr:Port          Dst IP Addr:Port   Packets    Bytes Flows
2011-02-16 17:10:07.330     0.000 UDP                        ->                               1       91     1
2011-02-16 17:10:07.927     0.032 TCP                        ->                               2      255     1
2011-02-16 17:10:07.943     0.000 UDP                        ->                               1       91     1
2011-02-16 17:11:06.215     0.000 UDP                        ->                               1      380     1
2011-02-16 17:11:06.215     0.000 UDP                        ->                               1      380     1
Summary: total flows: 221, total bytes: 635830, total packets: 841, avg bps: 61639, avg pps: 10, avg bpp: 756
Time window: 2011-02-16 17:09:59 - 2011-02-16 17:11:21
Total flows processed: 221, Blocks skipped: 0, Bytes read: 11520
Sys: 0.009s flows/second: 23265.6    Wall: 0.003s flows/second: 69650.2  
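The summary line is easy to post-process if you want to feed these counters into a script. The Python sketch below parses the Summary line exactly as printed above and cross-checks the reported average bytes per packet. The parse_summary helper is a hypothetical illustration, and it assumes the comma-separated "name: integer" layout shown here, which may differ in other nfdump versions.

```python
import re

# The summary line copied from the nfdump output above.
SUMMARY = ("Summary: total flows: 221, total bytes: 635830, total packets: 841, "
           "avg bps: 61639, avg pps: 10, avg bpp: 756")


def parse_summary(line: str) -> dict:
    """Turn an nfdump 'Summary:' line into a dict of integer counters."""
    prefix = "Summary:"
    if not line.startswith(prefix):
        raise ValueError("not an nfdump summary line")
    fields = re.findall(r"([a-z ]+): (\d+)", line[len(prefix):])
    return {name.strip(): int(value) for name, value in fields}


stats = parse_summary(SUMMARY)
print(stats)

# Cross-check: average bytes per packet is total bytes over total packets.
print(stats["total bytes"] // stats["total packets"], stats["avg bpp"])
```

Dividing 635830 bytes by 841 packets gives 756 once the fraction is dropped, which matches the avg bpp value nfdump reports.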

We now have a working NFA system, but we’re not done yet.

While nfdump provides a text-based command-line tool to analyze flow data, we’re more interested in nfsen, the Web-based front end of nfdump. To use nfsen, edit the /usr/local/etc/nfsen.conf file to name our router as a sensor and then run nfsen reconfig to update the collector.

Then, after killing the nfcapd process used earlier, kick off nfsen by running /usr/local/etc/rc.d/nfsen start (or onestart if nfsen is not yet enabled in /etc/rc.conf). Also, start the collector’s Web server if you haven’t done so already.

Now, browse to http://<collector name or IP>/nfsen, and the traffic graphs should begin filling in with flow data (see Figure 1). By default, nfsen uses 24-hour graphs, though this and many other attributes can be customized. Because the display is browser based, however, it’s accessible to anyone who can reach the Web server running on the collector. It may make sense to restrict access using passwords or other access controls (such as firewall rules or access control lists on routers), and to prevent eavesdropping by serving the interface only over SSL.

More to the point, it's also possible to use the Web interface to drill down on flow data, for example, finding the top-10 flow types sorted by IP address, port number or other criteria (see Figure 2). The nfsen toolkit is endlessly customizable: One could, for example, use the analyzer to send email or a text message to an admin whenever a given user tries to access a given service at the cloud provider. The data available from NFA is useful, both for troubleshooting and security monitoring.
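The logic behind a top-10 report or a simple alert is straightforward. The Python sketch below ranks flow records by total bytes and picks out flows from one address to one service port. It is not how nfsen implements these features; the record layout, the top_talkers and watched_flows helpers and the sample data are all hypothetical.

```python
from collections import Counter


def top_talkers(flows, key="src_ip", n=10):
    """Return the n largest values of `key`, ranked by total bytes."""
    totals = Counter()
    for flow in flows:
        totals[flow[key]] += flow["bytes"]
    return totals.most_common(n)


def watched_flows(flows, user_ip, service_port):
    """Select flows from one source address to one destination port."""
    return [f for f in flows
            if f["src_ip"] == user_ip and f["dst_port"] == service_port]


# Hypothetical flow records; the addresses come from the documentation ranges.
flows = [
    {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.5", "dst_port": 443, "bytes": 635},
    {"src_ip": "192.0.2.11", "dst_ip": "198.51.100.5", "dst_port": 22, "bytes": 380},
    {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.7", "dst_port": 53, "bytes": 91},
]

print(top_talkers(flows, n=2))                 # busiest source addresses
print(top_talkers(flows, key="dst_port", n=1)) # busiest destination port

# A real deployment would send email or a text message here instead of printing.
for hit in watched_flows(flows, "192.0.2.11", 22):
    print("ALERT:", hit["src_ip"], "->", hit["dst_ip"], "port", hit["dst_port"])
```

Changing the key argument re-sorts the same records by a different criterion, which is the same idea as choosing IP address or port number in the Web interface.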

Other NFA resources
The nfdump/nfsen tools are only one of three major NFA toolkits. The others are the flow-tools and flowd projects. Of these, flow-tools is much more widely used and has many more command-line utilities, but lacks the ability to track NetFlow v9 flows (and thus isn’t suitable for IPv6 monitoring). The flowd project is newer and supports IPv6, but doesn’t have a Web-based front end like nfsen.

Finally, for those interested in a more in-depth look, I recommend Michael W. Lucas’ book Network Flow Analysis. It offers an excellent overview of the topic; even though its examples use flow-tools, Lucas clearly explains concepts and analysis techniques that apply to any NFA system.

During a cloud computing deployment, especially in the early stages, security professionals may not have all the security monitoring tools in place that they’d like. But every deployment uses routers, and that’s where NFA can help. By leveraging the existing flow-analysis features available on most routers, NFA offers a wealth of information, not only on how much traffic moves to and from the cloud, but also on what’s happening with cloud-based applications.

About the author:
David Newman has been breaking computer networks for more than 20 years. His company, Network Test, is an independent third-party test lab and engineering services consultancy focused on network device performance characterization. Newman is author or coauthor of IETF RFCs 4814, 3511, and 2647.
