
...making Linux just a little more fun!

Is "PING" the right tool to measure packet losses in WAN interface ?

Ramanathan Muthaiah [rus.cahimb at gmail.com]
Fri, 22 Dec 2006 06:09:23 +0530

Gang,

Recently, there were some discussions at my workplace regd packet losses in WAN interface. And then, one folk in IT dept came up with output of "ping" command to highlight that there are no packet losses.

Am sure, this is not the correct way to measure packet losses.

I feel, they should be monitored over a period of time at the gateway router and the traffic in this router should be analysed for dropped packets / timeouts.

Is this true ?

NOTE: Am not working in the IT dept but one of the affected parties.

/Ram




Kapil Hari Paranjape [kapil at imsc.res.in]
Fri, 22 Dec 2006 09:29:33 +0530

Hello,

On Fri, 22 Dec 2006, Ramanathan Muthaiah wrote:

> Recently, there were some discussions at my workplace regd packet
> losses in WAN interface. And then, one folk in IT dept came up with
> output of "ping" command to highlight that there are no packet losses.
> 
> Am sure, this is not the correct way to measure packet losses.

True enough. "ping" is/was a useful way to check that the link is up and get a rough idea of round-trip times. That's it.

Kapil.




Benjamin A. Okopnik [ben at linuxgazette.net]
Thu, 21 Dec 2006 22:36:38 -0600

On Fri, Dec 22, 2006 at 06:09:23AM +0530, Ramanathan Muthaiah wrote:

> Gang,
> 
> Recently, there were some discussions at my workplace regd packet
> losses in WAN interface. And then, one folk in IT dept came up with
> output of "ping" command to highlight that there are no packet losses.
> 
> Am sure, this is not the correct way to measure packet losses.
> 
> I feel, they should be monitored over a period of time at the gateway
> router and the traffic in this router should be analysed for dropped
> packets / timeouts.
> 
> Is this true ?

In theory, the router itself should have on-board tools to measure this - I can't think of anything else that would do the job. The problem is that traffic on the WAN side is meaningless for those purposes - you don't know what part of it is supposed to be routed into the LAN - and the traffic on the LAN side doesn't tell you anything, since you don't know how many packets were supposed to get through. I've heard of hardware traffic analyzers that are supposed to let you troubleshoot that kind of problem, but they're reputed to be extremely expensive.

The best bet is to find out if your router has a telnet interface (many of them do), log in, and snoop around. If it's got an 'ifconfig' command, that might go a long way toward answering your question.
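As a rough illustration of what "snooping around" could yield if the router turns out to be a Linux box: the Python sketch below compares two snapshots of interface counters in the `/proc/net/dev` layout and reports how many packets were dropped in between. The interface name `eth0`, the helper names, and the sample numbers are all made up for illustration; a router running something else will expose its counters differently.

```python
# Sketch: compare two /proc/net/dev-style snapshots taken some time apart.
# Assumption: the router is Linux-based; names and numbers are hypothetical.

def parse_proc_net_dev(text):
    """Return {iface: counters} for every 'name: ...' line in the text."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # not a complete counter line
        nums = [int(f) for f in fields[:16]]
        stats[name.strip()] = {
            "rx_packets": nums[1], "rx_errs": nums[2], "rx_drop": nums[3],
            "tx_packets": nums[9], "tx_errs": nums[10], "tx_drop": nums[11],
        }
    return stats

def counter_delta(before, after, iface):
    """Difference of each counter between two parsed snapshots."""
    return {key: after[iface][key] - before[iface][key] for key in after[iface]}

HEADER = (
    "Inter-|   Receive                                                |  Transmit\n"
    " face |bytes    packets errs drop fifo frame compressed multicast"
    "|bytes    packets errs drop fifo colls carrier compressed\n"
)
# Hypothetical counters, e.g. sampled a few minutes apart
SNAPSHOT_T0 = HEADER + "  eth0: 1000000 5000 0 10 0 0 0 0 800000 4000 0 2 0 0 0 0\n"
SNAPSHOT_T1 = HEADER + "  eth0: 1500000 7000 3 25 0 0 0 0 1200000 6000 0 2 0 0 0 0\n"

delta = counter_delta(parse_proc_net_dev(SNAPSHOT_T0),
                      parse_proc_net_dev(SNAPSHOT_T1), "eth0")
print(delta)
```

On a real system the two snapshots would come from reading `/proc/net/dev` twice; counters that keep climbing between samples are the thing to watch.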

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *



Benjamin A. Okopnik [ben at linuxgazette.net]
Thu, 21 Dec 2006 22:54:53 -0600

On Fri, Dec 22, 2006 at 09:29:33AM +0530, Kapil Hari Paranjape wrote:

> Hello,
> 
> On Fri, 22 Dec 2006, Ramanathan Muthaiah wrote:
> > Recently, there were some discussions at my workplace regd packet
> > losses in WAN interface. And then, one folk in IT dept came up with
> > output of "ping" command to highlight that there are no packet losses.
> > 
> > Am sure, this is not the correct way to measure packet losses.
> 
> True enough. "ping" is/was a useful way to check that the link is up
> and get a rough idea of round-trip times. That's it.

I've also found that it can be used, with the '-f' option, to stress-test a suspect 10Mb/S NIC (I've never tried it on a faster network; _caveat geek._) "su -c 'ping -f hostname'" is a perfectly lovely way to entertain yourself on a network full of crap hardware, and a good way to produce reports that show why you need new cards.

Ram, if you can convince your IT folks to OK it, you might want to try this - do note that it'll probably DoS your router while you run it. It's not a complete diagnostic - a clean run does not prove that the link is good - but if you get serious packet loss from doing it, then there's most likely a problem. 30 seconds of ping-flooding ought to be more than enough.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *



Predrag Ivanovic [predivan at ptt.yu]
Fri, 22 Dec 2006 15:18:07 +0100

On Thu, 21 Dec 2006 22:54:53 -0600 Benjamin A. Okopnik wrote:

<ping>

> 
> I've also found that it can be used, with the '-f' option, to
> stress-test a suspect 10Mb/S NIC (I've never tried it on a faster
> network; _caveat geek._) "su -c 'ping -f hostname'" is a perfectly
> lovely way to entertain yourself on a network full of crap hardware, and
> a good way to produce reports that show why you need new cards.

Ben, would using 'ping -f' on localhost produce any meaningful results? I would like to stress-test the NIC on my machine, to see if it is responsible for (sometimes very high) packet loss. Look at these two consecutive ping runs:

---
[1]
ping www.yahoo.com
PING www.yahoo-ht2.akadns.net (209.73.186.238): 56 octets data
64 octets from 209.73.186.238: icmp_seq=0 ttl=51 time=179.6 ms
64 octets from 209.73.186.238: icmp_seq=2 ttl=51 time=155.8 ms
64 octets from 209.73.186.238: icmp_seq=3 ttl=51 time=135.9 ms
64 octets from 209.73.186.238: icmp_seq=4 ttl=51 time=138.1 ms
64 octets from 209.73.186.238: icmp_seq=7 ttl=51 time=144.8 ms
64 octets from 209.73.186.238: icmp_seq=9 ttl=51 time=152.9 ms
64 octets from 209.73.186.238: icmp_seq=10 ttl=51 time=192.8 ms
64 octets from 209.73.186.238: icmp_seq=11 ttl=51 time=200.9 ms
64 octets from 209.73.186.238: icmp_seq=13 ttl=51 time=174.9 ms
64 octets from 209.73.186.238: icmp_seq=15 ttl=51 time=128.4 ms
64 octets from 209.73.186.238: icmp_seq=16 ttl=51 time=142.1 ms
64 octets from 209.73.186.238: icmp_seq=18 ttl=51 time=144.8 ms
64 octets from 209.73.186.238: icmp_seq=19 ttl=51 time=164.3 ms
 
--- www.yahoo-ht2.akadns.net ping statistics ---
21 packets transmitted, 13 packets received, 38% packet loss
round-trip min/avg/max = 128.4/158.1/200.9 ms
[2]
 ping www.yahoo.com
PING www.yahoo-ht2.akadns.net (209.73.186.238): 56 octets data
64 octets from 209.73.186.238: icmp_seq=0 ttl=51 time=234.2 ms
64 octets from 209.73.186.238: icmp_seq=1 ttl=51 time=165.9 ms
64 octets from 209.73.186.238: icmp_seq=2 ttl=51 time=149.1 ms
64 octets from 209.73.186.238: icmp_seq=3 ttl=51 time=168.4 ms
64 octets from 209.73.186.238: icmp_seq=4 ttl=51 time=162.9 ms
64 octets from 209.73.186.238: icmp_seq=5 ttl=51 time=184.9 ms
64 octets from 209.73.186.238: icmp_seq=6 ttl=51 time=179.3 ms
64 octets from 209.73.186.238: icmp_seq=8 ttl=51 time=192.4 ms
64 octets from 209.73.186.238: icmp_seq=12 ttl=51 time=177.4 ms
64 octets from 209.73.186.238: icmp_seq=13 ttl=51 time=191.4 ms
64 octets from 209.73.186.238: icmp_seq=14 ttl=51 time=185.6 ms
64 octets from 209.73.186.238: icmp_seq=15 ttl=51 time=122.1 ms
64 octets from 209.73.186.238: icmp_seq=16 ttl=51 time=158.8 ms
  
--- www.yahoo-ht2.akadns.net ping statistics ---
17 packets transmitted, 13 packets received, 23% packet loss
round-trip min/avg/max = 122.1/174.8/234.2 ms
---
If the NIC is OK, the next thing to check would be the cable modem, i.e. the coaxial cable that goes from modem to NIC. Since I crimped that cable, and it uses twist-on connectors, *it is* possible that it causes the trouble (sometimes, resetting the modem or reattaching the coax helps). The NIC uses the via-rhine kernel driver (VIA Rhine PCI Fast Ethernet driver).

I didn't mean to hijack the thread, but this somehow seems related ;)
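The gaps in icmp_seq in those runs show which probes went missing, not just how many. A small Python sketch that pulls them out of a saved transcript follows; the function name is made up, and it assumes sequence numbering starts at 0, as it does in the output above (some ping implementations start at 1).

```python
import re

SEQ_RE = re.compile(r"icmp_seq=(\d+)")
SENT_RE = re.compile(r"(\d+) packets transmitted")

def missing_sequences(transcript, first_seq=0):
    """List the icmp_seq numbers that were sent but never answered."""
    seen = {int(n) for n in SEQ_RE.findall(transcript)}
    sent = SENT_RE.search(transcript)
    if sent:
        total = int(sent.group(1))
    else:
        # No summary line: fall back to the highest sequence number seen
        total = max(seen) - first_seq + 1 if seen else 0
    return [n for n in range(first_seq, first_seq + total) if n not in seen]

# Run [2] from the message above
RUN_2 = """\
64 octets from 209.73.186.238: icmp_seq=0 ttl=51 time=234.2 ms
64 octets from 209.73.186.238: icmp_seq=1 ttl=51 time=165.9 ms
64 octets from 209.73.186.238: icmp_seq=2 ttl=51 time=149.1 ms
64 octets from 209.73.186.238: icmp_seq=3 ttl=51 time=168.4 ms
64 octets from 209.73.186.238: icmp_seq=4 ttl=51 time=162.9 ms
64 octets from 209.73.186.238: icmp_seq=5 ttl=51 time=184.9 ms
64 octets from 209.73.186.238: icmp_seq=6 ttl=51 time=179.3 ms
64 octets from 209.73.186.238: icmp_seq=8 ttl=51 time=192.4 ms
64 octets from 209.73.186.238: icmp_seq=12 ttl=51 time=177.4 ms
64 octets from 209.73.186.238: icmp_seq=13 ttl=51 time=191.4 ms
64 octets from 209.73.186.238: icmp_seq=14 ttl=51 time=185.6 ms
64 octets from 209.73.186.238: icmp_seq=15 ttl=51 time=122.1 ms
64 octets from 209.73.186.238: icmp_seq=16 ttl=51 time=158.8 ms

--- www.yahoo-ht2.akadns.net ping statistics ---
17 packets transmitted, 13 packets received, 23% packet loss
"""

print(missing_sequences(RUN_2))  # [7, 9, 10, 11]
```

Here three of the four losses are consecutive, which may hint at short bursts rather than evenly spread loss, though two short runs are not much to go on.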

Pedja

-- 
 Complicated == Learning Experience
                      -- Joe Bowman



Benjamin A. Okopnik [ben at linuxgazette.net]
Fri, 22 Dec 2006 09:11:08 -0600

On Fri, Dec 22, 2006 at 03:18:07PM +0100, Predrag Ivanovic wrote:

> On Thu, 21 Dec 2006 22:54:53 -0600
> Benjamin A. Okopnik wrote:
> 
> <ping>
> > 
> > I've also found that it can be used, with the '-f' option, to
> > stress-test a suspect 10Mb/S NIC (I've never tried it on a faster
> > network; _caveat geek._) "su -c 'ping -f hostname'" is a perfectly
> > lovely way to entertain yourself on a network full of crap hardware, and
> > a good way to produce reports that show why you need new cards.
> 
> Ben, would using 'ping -f' on localhost produce any meaningful results?

I'm afraid not; if you consider it in terms of the TCP/IP 4-layer model, the ICMP packets would be routed to 'lo' at layer #2 (the internet layer), and would never get as far as the hardware interface. You need an actual remote host to "talk" to so your packets can go out through the NIC, onto the wire, and into the other NIC (and a short way into the other host's stack, where they'll be answered with an echo reply.)
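A tiny Python sketch of that point, for anyone scripting such a test: refuse targets that can never reach the wire. The helper name is made up, and the check is deliberately narrow. It only recognises loopback addresses; the machine's own non-loopback addresses are typically delivered locally as well, and this sketch does not catch those.

```python
import ipaddress

def leaves_the_host(target_ip):
    """False for loopback targets, which are handled by 'lo' and never touch the NIC."""
    return not ipaddress.ip_address(target_ip).is_loopback

for target in ("127.0.0.1", "::1", "209.73.186.238"):
    verdict = "exercises the NIC" if leaves_the_host(target) else "stays on 'lo'"
    print(target, "->", verdict)
```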

> I would like to stress-test NIC on my machine, I'd like to see if it is 
> responsible for (sometimes very high) packet loss.
> Look at this 2 consecutive ping runs:
> ---

[ snippage ]

> 21 packets transmitted, 13 packets received, 38% packet loss

[ snipperoonie ]

> 17 packets transmitted, 13 packets received, 23% packet loss

Your NIC is certainly a part of that chain, but I don't know that it would be the first thing I'd suspect - there's probably a fair amount of hardware between you and Yahoo. Have you tried running 'traceroute' on those addresses? That can be nicely instructive, in a bit of visual detail.

> If NIC is OK, next to check would be cable modem i.e coaxial cable that goes from modem to NIC.

If you're going to go one step at a time, I'd first look at the RJ-45 cable connecting the NIC to the modem (assuming that you've already run the host-to-host test that we were just talking about.) Having another system available would also allow you to definitely determine whether everything up to the modem is OK or not: if another host, with its own RJ-45 patch cable, is still flaky, then it's somewhere between the modem and the end you're pinging.

The other advantage of 'traceroute' is you're going to see all the bars and houses of ill repute that your packets are going to visit before they get where they're going. You could ping each of those IPs in turn (going from closest to most remote), and watch the loss ratio. In most cases by far, though, the problem is local - and it's usually either the patch cable from the modem to the router (or host), or from the cable drop to the modem.
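The "ping each of those IPs in turn" routine can be scripted. The sketch below is one hedged way to do it in Python: it runs the system `ping` with `-c` (a packet count), parses the summary line, and stops at the first hop whose loss exceeds a threshold. The function names, the 20-packet count, and the 5% threshold are arbitrary choices for illustration, and the summary regex only covers the two common wordings ("13 packets received" and "13 received").

```python
import re
import subprocess

SUMMARY_RE = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")

def parse_loss(output):
    """Return packet loss in percent from ping's summary line, or None if absent."""
    match = SUMMARY_RE.search(output)
    if not match:
        return None
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        return None
    return 100.0 * (sent - received) / sent

def ping_loss(host, count=20):
    """Run 'ping -c COUNT host' and return the measured loss percentage."""
    proc = subprocess.run(["ping", "-c", str(count), host],
                          capture_output=True, text=True)
    return parse_loss(proc.stdout)

def first_lossy_hop(hops, count=20, threshold=5.0, probe=ping_loss):
    """Walk hops from closest to most remote; return (hop, loss) where loss starts."""
    for hop in hops:
        loss = probe(hop, count)
        if loss is not None and loss > threshold:
            return hop, loss
    return None

# Usage, with the hop list taken from a traceroute run (needs network access):
#   first_lossy_hop(["172.17.248.1", "213.137.109.113", "213.137.99.193"])
```

One caveat: some routers give low priority to ICMP addressed to themselves, so loss that shows up at one intermediate hop but not at the hops beyond it is not necessarily a real fault.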

> Since I crimped that cable, and it is with twist-on connectors, *it is* possible that it causes the trouble.
> (sometimes, resetting the modem/reattaching the coax helps).

Oh yeah. If you made it yourself, and you're not fairly experienced with RG-8 (or RG-6 which I prefer), it may well be the source of the problem. As a former boss of mine, a lab manager at Hughes Aircraft, had scrawled on the whiteboard for a rather clueless MMW engineer (who had tried running an IF signal via a piece of wire, and was wondering why it wasn't working), "400MHz is NOT DC!"

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *



Predrag Ivanovic [predivan at ptt.yu]
Sat, 23 Dec 2006 18:22:57 +0100

On Fri, 22 Dec 2006 09:11:08 -0600 Benjamin A. Okopnik wrote:

<snip>

> Your NIC is certainly a part of that chain, but I don't know that it
> would be the first thing I'd suspect - there's probably a fair amount of
> hardware between you and Yahoo. Have you tried running 'traceroute' on
> those addresses? That can be nicely instructive, in a bit of visual
> detail.

I ran a traceroute -v -n to ISP.

---
root@deus:/usr/ports/pedja/mathomatic#traceroute -v -n ptt.yu
traceroute to ptt.yu (212.62.32.65), 30 hops max, 40 byte packets
 1  172.17.248.1 36 bytes to 172.19.224.147  10.593 ms  7.322 ms  158.193 ms
 2  213.137.109.113 36 bytes to 172.19.224.147  12.411 ms  497.248 ms  124.917 ms
 3  213.137.99.193 36 bytes to 172.19.224.147  12.509 ms  12.468 ms  10.190 ms
 4  213.137.97.173 36 bytes to 172.19.224.147  43.376 ms  71.164 ms  46.475 ms
 5  213.137.107.97 36 bytes to 172.19.224.147  22.555 ms  30.928 ms  21.622 ms
--- 

and to yahoo.com(just the relevant part below).

---
root@deus:/usr/ports/pedja/mathomatic#traceroute -v -n yahoo.com
traceroute: Warning: yahoo.com has multiple addresses; using 66.94.234.13
traceroute to yahoo.com (66.94.234.13), 30 hops max, 40 byte packets
 1  172.17.248.1 36 bytes to 172.19.224.147  20.192 ms  42.007 ms  5.684 ms
 2  213.137.109.113 36 bytes to 172.19.224.147  10.098 ms  29.912 ms  56.426 ms
 3  213.137.99.193 36 bytes to 172.19.224.147  11.681 ms  25.007 ms  12.274 ms
 4  213.137.97.173 36 bytes to 172.19.224.147  36.362 ms  25.909 ms  33.405 ms
 5  213.137.107.125 36 bytes to 172.19.224.147  15.544 ms  26.807 ms  13.219 ms
---

There is a difference in hop 5, as you can see, but I don't think it's relevant in any way. I may have mentioned that a few months back cable users got moved behind a proxy, with no more public IP addresses (Windows machines + net 24/7 = disaster). So, I ran traceroute on the proxy (the IP address that ip-lookup.net claims I use):

---
root@deus:/usr/ports/pedja/mathomatic#traceroute -n -v 213.137.109.129
traceroute to 213.137.109.129 (213.137.109.129), 30 hops max, 40 byte packets
 1  172.17.248.1 36 bytes to 172.19.224.147  18.481 ms  5.958 ms  7.685 ms
 2  213.137.109.113 36 bytes to 172.19.224.147  13.821 ms  21.541 ms  20.453 ms
 3  213.137.109.114 36 bytes to 172.19.224.147  36.690 ms  16.368 ms  7.293 ms
 4  213.137.109.113 36 bytes to 172.19.224.147  16.869 ms  14.729 ms  14.616 ms
<snip> 
28  213.137.109.113 36 bytes to 172.19.224.147  40.092 ms  42.882 ms  83.277 ms
29  213.137.109.114 36 bytes to 172.19.224.147  87.868 ms  30.703 ms  44.451 ms
30  213.137.109.113 36 bytes to 172.19.224.147  68.395 ms  61.603 ms  65.174 ms
---

and ping:

---
root@deus:/usr/ports/pedja/mathomatic#ping 213.137.109.129
PING 213.137.109.129 (213.137.109.129): 56 octets data
64 octets from 213.137.109.129: icmp_seq=0 ttl=61 time=18.5 ms
64 octets from 213.137.109.129: icmp_seq=1 ttl=61 time=34.3 ms
<snip>
64 octets from 213.137.109.129: icmp_seq=11 ttl=61 time=19.6 ms
64 octets from 213.137.109.129: icmp_seq=12 ttl=61 time=111.8 ms
 
--- 213.137.109.129 ping statistics ---
13 packets transmitted, 13 packets received, 0% packet loss
round-trip min/avg/max = 14.3/38.7/111.8 ms
---

I'm not sure how to interpret traceroute output, though.

> > If NIC is OK, next to check would be cable modem i.e coaxial cable that
> > goes from modem to NIC.
> 
> If you're going to go one step at a time, I'd first look at the RJ-45
> cable connecting the NIC to the modem (assuming that you've already run
> the host-to-host test that we were just talking about.) Having another
> system available would also allow you to definitely determine whether
> everything up to the modem is OK or not: if another host, with its own
> RJ-45 patch cable, is still flaky, then it's somewhere between the modem
> and the end you're pinging.

The RJ-45 cable came included with the modem (Motorola SB 5101E, btw), and I tried another cable/modem combination; the same thing happens, so it's rather safe to conclude that the problem is something else, I think.

> The other advantage of 'traceroute' is you're going to see all the bars
> and houses of ill repute that your packets are going to visit before
> they get where they're going. You could ping each of those IPs in turn
> (going from closest to most remote), and watch the loss ratio. In most
> cases by far, though, the problem is local - and it's usually either the
> patch cable from the modem to the router (or host), or from the cable
> drop to the modem.
> 
> > Since I crimped that cable, and it is with twist-on connectors, *it is*
> > possible that it causes the trouble. (sometimes, resetting the
> > modem/reattaching the coax helps).
> 
> Oh yeah. If you made it yourself, and you're not fairly experienced
> with RG-8 (or RG-6 which I prefer), it may well be the source of the
> problem.

I googled 'crimping for dummies', and made the cable several times, but without proper tools/quality cable/connectors, I really can't be sure, can I? BTW, tools etc. are on my shopping list; any recommendations (dealers, manufacturers...)?

> As a former boss of mine, a lab manager at Hughes Aircraft, had
> scrawled on the whiteboard for a rather clueless MMW engineer (who had
> tried running an IF signal via a piece of wire, and was wondering why it
> wasn't working), "400MHz is NOT DC!"

<laugh> Many engineers at my workplace are like that; they drive a friend of mine (he has an engineering mindset, lots of clue but not a formal degree) absolutely mad, particularly with the 'I saw it work in a book, so it must be right' attitude.

Ben, I'd like to thank you for your time and information, it's been fun as always, and I learned something in the process :)

Pedja

-- 
 You can lead an idiot to knowledge but you cannot make him think.  You can,
 however, rectally insert the information, printed on stone tablets, using a
 sharpened poker.  -- Nicolai



Benjamin A. Okopnik [ben at linuxgazette.net]
Sat, 23 Dec 2006 12:43:00 -0600

On Sat, Dec 23, 2006 at 06:22:57PM +0100, Predrag Ivanovic wrote:

> 
> I ran a traceroute -v -n to ISP.
> --
> root@deus:/usr/ports/pedja/mathomatic#traceroute -v -n ptt.yu
> traceroute to ptt.yu (212.62.32.65), 30 hops max, 40 byte packets

The right thing to do from there - assuming you suspect the problem is somewhere upstream - is to ping each of the hosts along the way and watch for losses. Again, though, chances are high that your problem is local.

> and ping:
> ---
> root@deus:/usr/ports/pedja/mathomatic#ping 213.137.109.129
> PING 213.137.109.129 (213.137.109.129): 56 octets data

[snip]

> --- 213.137.109.129 ping statistics ---
> 13 packets transmitted, 13 packets received, 0% packet loss
> round-trip min/avg/max = 14.3/38.7/111.8 ms

Note that the primary problem isn't there anymore - all your packets came back.

> I'm not sure how to interpret traceroute output, though.

What I normally do is watch the display - if it shows '*' for an intermediate host, either that host isn't answering or the packets are being lost. In any case, the most useful part of it is being able to see that chain of hosts: you can now ping them in turn and see where the loss starts.
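For longer traces, the same reading can be done mechanically. The Python sketch below parses saved traceroute output into one record per hop, counting '*' timeouts and collecting the responding addresses to ping afterwards. The function names are made up; the first three sample lines are from the trace to ptt.yu quoted above, while the fourth (" 4  * * *") is a hypothetical line added only to show what a silent hop looks like.

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
IP_RE = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
RTT_RE = re.compile(r"(\d+(?:\.\d+)?) ms")

def parse_traceroute(text):
    """Return one dict per hop line: ttl, first address seen, RTTs, timeout count."""
    hops = []
    for line in text.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header or prompt line
        rest = match.group(2)
        addresses = IP_RE.findall(rest)
        hops.append({
            "ttl": int(match.group(1)),
            "addr": addresses[0] if addresses else None,
            "rtts": [float(value) for value in RTT_RE.findall(rest)],
            "timeouts": rest.count("*"),
        })
    return hops

def responding_addresses(hops):
    """Hop addresses in path order, without duplicates, ready to be pinged in turn."""
    ordered = []
    for hop in hops:
        if hop["addr"] and hop["addr"] not in ordered:
            ordered.append(hop["addr"])
    return ordered

SAMPLE = """\
traceroute to ptt.yu (212.62.32.65), 30 hops max, 40 byte packets
 1  172.17.248.1 36 bytes to 172.19.224.147  10.593 ms  7.322 ms  158.193 ms
 2  213.137.109.113 36 bytes to 172.19.224.147  12.411 ms  497.248 ms  124.917 ms
 3  213.137.99.193 36 bytes to 172.19.224.147  12.509 ms  12.468 ms  10.190 ms
 4  * * *
"""

hops = parse_traceroute(SAMPLE)
for hop in hops:
    print(hop["ttl"], hop["addr"], hop["rtts"], "timeouts:", hop["timeouts"])
print(responding_addresses(hops))
```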

> RJ-45 cable came included with the modem(Motorola SB 5101E, btw), 
> and I tried another cable/modem combination, same thing happens, so it's 
> rather safe to conclude that problem is something else, I think.  

Agreed. Although one time I got surprised by two bad cables in a row, and oh brother did that frustrate the hell out of me! (Also had a similar experience with two bad NICs.)

So, assuming that your patch cable is good, it's down to the modem, the coax, the jack, or the upstream cable/plant. I'd focus on the first two.

> I googled 'crimping for dummies', and made the cable several times, but without proper
> tools/quality cable/connectors, I really can't be sure, can I?
> btw, tools etc are on my shopping list, any recommendations(dealers, manufacturers...)? 

I got religion about crimping tools a long time ago, and Paladin and Klein share the godhead. There are a few lesser deities out there as well (I used a Ziotek, which has a mildly sucky stripper, for ~1000 RJ-45 connections, and it did a pretty good job), but those two always work - even with dicey connectors.

> > As a former boss of mine, a lab manager at Hughes Aircraft, had
> > scrawled on the whiteboard for a rather clueless MMW engineer (who had
> > tried running an IF signal via a piece of wire, and was wondering why it
> > wasn't working), "400MHz is NOT DC!"
> 
> <laugh>
> Many engineers at my workplace are like that, they make a friend of mine
> (he has an engineering mindset, lots of clue but not a formal degree)
> absolutely mad, particularly 'I saw it work in a book, so it must be right' attitude.

We had a fellow at Hughes - can't remember his name now, but a very sweet and friendly Chinese MMW scientist - whom I caught one day while he was trying to jam an SMP connector into an E-band waveguide, grinding away at the job vigorously and looking puzzled at their failure to fit (E-band WG has a rectangular hole about 1/8"x1/16"; an SMP connector is a ~5/8" diameter hex nut.) I took it away from him - he had managed to grind off the gold plating on both; the waveguide, at least, would have to be replated, and the solid coax would have to be remade - and explained that

1) Waveguides carry the RF.
2) Coax carries the IF/modulation (so far, so good - he was nodding
        along.)
3) To get the IF signal off the RF carrier, you needed a pickup - i.e.,
        a Gunn or an IMPATT diode mounted in a waveguide with a coax takeoff.
        He knew this intellectually, but had somehow failed to connect the
        books with physical reality.
I then took a pickup from a stack of them *on his desk*, 6" away from his elbow; got him another, non-destroyed piece of waveguide; bolted the whole thing up with the 10-24 machine screws and nuts - of which he had a bin, same as there was on every desk in that lab - and hooked it up to the power meter. He was effusively grateful, and promised to remember the process from then on.

Brilliant MMW designer, by the way, and excellent at estimating, "by feel", the size of a resistive waveguide insert for a given dB drop (which is Deep Black Magick, I assure you.) He got sniped by a headhunter from TRW later, and last I heard - mid-90s - was making over 200 big ones a year.

> Ben, I'd like to thank you for your time and informations, it's been fun as always, 
> and I learned something in the process :) 

[grin] Always happy to assist with the bits that I know; I'm getting the same thing in exchange, and figure that to be a more than fair deal.

-- 
* Ben Okopnik * Editor-in-Chief, Linux Gazette * http://LinuxGazette.NET *
