In some cases, packet loss can occur on a server or intermediate device. The problem might be hardware-related – perhaps the device simply cannot handle peak traffic loads, or there might be faulty cabling on one or more network interfaces. The problem might be software-related – a bug or misconfiguration that results in intermittent packet loss at seemingly random intervals. When packet loss affects application performance, you can use Network Visibility to pinpoint the nodes, devices, and TCP connections where the packets are getting dropped.  

Application Symptoms

A DevOps engineer is responsible for monitoring the performance of a mission-critical app. One day, she opens the Application Dashboard and notices that the Order-Tier, and the application flows to and from this tier, have suddenly turned red. She decides to investigate.

Network Diagnosis

  1. She switches over to the Network Dashboard and sees that the Order-Tier and Payment-Tier, and the network link between them, show a spike in PIE (Performance Impacting Events) and network errors.
  2. She decides to analyze an individual transaction. She goes to the Transaction Snapshots page and double-clicks on a Very Slow transaction. The Transaction Overview shows that 97% of the delay is on the link between Order-Tier and Payment-Tier. She drills down into the Order-Tier and goes to the Network tab.
  3. Scanning the dashboard for this transaction, she sees immediately that:
    1. The Network Impact on Transactions graph (top left) shows a spike in PIE and Very Slow Calls around the snapshot time range.
    2. She sees the correlation between Errors and Very Slow Calls, so she looks at the Network Errors - Contributors chart. All of these errors are SYN blackholes: the Order-Tier node is trying to establish connections to the Payment-Tier node, but something in the middle is silently dropping the connection requests. When this happens, the culprit is usually a firewall or other intermediate device that is dropping packets. The requesting tier repeatedly retries the connection, which introduces significant delays.
    3. The Retransmissions Per Min chart confirms that the retransmissions during the time window of interest are all Data Retransmits, which means that the drops are occurring on an intermediate device rather than at an application node.

  4. Given this information, she now wants to pinpoint the specific TCP connections where the packet drops are occurring. She enables Advanced Diagnostic monitoring on the relevant network agents to collect TCP diagnostic metrics. She then goes to the Network Dashboard and clicks on the link between Order-Tier and Payment-Tier. This shows her all of the connections with elevated PIE and errors.
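
To make the delay caused by SYN blackholes concrete, the following is a minimal sketch of how connection setup time grows when the first few SYN packets are silently dropped. It assumes a 1-second initial retransmission timeout that doubles after each attempt and a limit of 6 SYN retries; these values are illustrative assumptions (they match commonly cited Linux defaults), not figures from this page, and the actual behavior depends on the operating system and its TCP settings. The function names are hypothetical.

```python
def syn_retry_schedule(initial_rto=1.0, retries=6):
    """Return (retransmit_times, give_up_time), in seconds after the first SYN.

    Assumes exponential backoff: the timeout doubles after every attempt.
    Both default values are assumptions, not measurements.
    """
    times = []
    elapsed = 0.0
    rto = initial_rto
    for _ in range(retries):
        elapsed += rto          # wait one timeout, then retransmit the SYN
        times.append(elapsed)
        rto *= 2                # back off before the next attempt
    give_up = elapsed + rto     # final wait before connect() reports a timeout
    return times, give_up


def setup_delay(dropped_syns, initial_rto=1.0, retries=6):
    """Extra connection setup delay when the first `dropped_syns` SYNs are lost.

    Returns None if every attempt is dropped and the connection never opens.
    """
    times, _ = syn_retry_schedule(initial_rto, retries)
    if dropped_syns == 0:
        return 0.0
    if dropped_syns > retries:
        return None
    return times[dropped_syns - 1]


times, give_up = syn_retry_schedule()
print(times)           # [1.0, 3.0, 7.0, 15.0, 31.0, 63.0]
print(give_up)         # 127.0
print(setup_delay(2))  # 3.0
```

Under these assumptions, losing only the first two SYNs adds about 3 seconds before the connection opens, which is enough on its own to push a transaction into the Very Slow range.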

With the affected connections identified, she contacts the network-management team in her organization and says: "We're seeing intermittent SYN blackholes and retransmissions on these connections. Can you please investigate and fix the problem?"
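
A request like this is easier to act on with a simple reproduction attached. The sketch below is one possible way to probe a connection from the requesting host using only the Python standard library; it is not part of Network Visibility. The function name `classify_connect` and the host name in the usage comment are hypothetical. A timeout means that no reply arrived at all, which is consistent with a silently dropped SYN but can also mean that the target host is down, so treat the result as a hint rather than proof.

```python
import socket


def classify_connect(host, port, timeout=3.0):
    """Attempt one TCP connection and label the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "connected"   # handshake completed
    except socket.timeout:
        return "timeout"         # no SYN-ACK and no RST within the timeout
    except ConnectionRefusedError:
        return "refused"         # the peer or a device actively rejected the SYN
    except OSError as exc:
        return f"error: {exc}"   # e.g. name resolution failure or no route


# Hypothetical usage from an Order-Tier host:
# print(classify_connect("payment-tier.example.internal", 8443))
```

Running the probe several times and recording how often it returns "timeout" gives the network team a rough measure of how intermittent the drops are.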
