Some good/common questions with some relatively good answers :-)

--sdo

============================================================
Tue May 26, 1998

>> I installed tcptrace version 4.1.3 on sunos4.1.4 together with
>> tcpdump version 3.4a6 and libpcap version 0.4a6 and made a few
>> tests, however I am very puzzled by the lack of traffic in the
>> reverse direction!
>> 
>> Can you throw some light on what the problem might be?

Probably.  It appears that you're doing the "packet grabbing" on the
same machine that's hosting the connection.  That's a problem under
SunOS (and maybe other systems).  Their packet capturing pseudo-device
can't grab both ends of a connection whose home is the local machine.
I never can remember which end it omits, but given your output, it
appears that it can't see packets SENT by the local host.  The easiest
solution is to run tcpdump on a machine OTHER than the ones you're
monitoring.

============================================================
Fri Jul 24, 1998

>> I have been trying to find out when you place a diamond and when an
>> arrow and why.

The "diamond" means that the segment was sent with a PUSH (so most
segments will have these except for large data transfers).  This feature
is now documented in the on-line docs.


>> The second thing is that you should create a margin around the
>> plot.  Unfortunately, XPLOT will only zoom out as far as the
>> original plot. I often find myself trying to figure out what is
>> happening at the handshake and closing of a connection and I cannot
>> see clearly these locations because they run off the side of the
>> plot.

Yes, this is a problem.  The only way to do it is to add invisible
points outside the normal range; xplot won't let me control the area
otherwise.  In this particular case, however, there's an easy
solution: use the middle mouse button to drag the graph until the
lower corner is completely visible.  That way, you can scroll anywhere
on the screen, even outside the boundaries of the points being plotted.


============================================================
Mon Jul 27, 1998

>> do you know if I'd be able to find the precompiled versions of
>> tcpdump or a similar package for NT from somewhere?

I'm sorry, I don't keep up on Microsoft software.  Lots of people have
asked me this question and I don't know the answer.  I'm pretty sure
that tcpdump does NOT run under NT.  The only packet-grabbing software 
that I'm aware of for Wintel machines is etherpeek (Windows/NT/Mac):

	http://www.aggroup.com/

It's nice software, but it's commercial.

============================================================
Tue Jul 28, 1998

>> I've just tried tcptrace. It seems to be really good. But, I've got a
>> problem: I can't trace any UDP packets. It's like we can only decode TCP
>> packets. Is it a tcptrace feature? 
>> 
>> Thanks in advance.

:-)

Yes, that's why it's called _TCP_trace!!!  Seriously, I've never had
any reason to look at UDP packets.  Tcptrace was designed to explore
TCP's protocol details and there's not much protocol detail to UDP to
look into.

(Update: version 5 DOES support UDP a little, but you need to add the
"-u" option, because it ignores UDP by default.)


============================================================
Wed Jul 29, 1998


>> I was trying to measure tcp throughput in an experimental testbed
>> here in xxxxx initially using netperf so as to have a roughly idea
>> how much it would be.
>> 
>> I tried afterwards to have the same results in the tcptrace. The
>> problem is that the use of tcpdump program seems to degrade the
>> performance up to 75%.
>> 
>> Parameters used in the tcpdump are: -i interface -w file
>> 
>> For example without tcpdump : TPUT: 45Mbps
>> 		with tcpdump: TPUT: 11Mbps
>> 
>> If the use of the tcpdump degradates performance then how else can
>> someone use your tcptrace program in Free BSD?

If you want really good data from tcpdump, you need to run it on a
machine on the same LAN as the tcp sender, but NOT on the same
machine.  You also need to make sure that the data that you're writing
is going to a local file system, rather than a remote file system,
which would cause extra network traffic.

From what you describe, I'm assuming that you're running tcpdump on
the SAME machine as one connection endpoint.  I've never seen a case
in which running tcpdump on a reasonably fast machine would slow it
down enough to cause that sort of degradation.  One other possibility
is that the machine is busy enough that tcpdump is not seeing all the
packets (which IS common; the kernel drops them if tcpdump gets
behind).  Tcptrace only counts the packets that it sees, so if it only
sees half of them, its throughput estimate will be off by half.

The other possibility that I alluded to is that you're writing the
packet data to a file system on a different machine.  That competing
network traffic might congest the network enough to skew the results.


>> And a last question: In case of TPUT which line should I consider?
>> I suspect it should be the blue one.

That's the running average.  Network throughput is difficult to
represent discretely.  Normally, you take bytes/second over a short
period of time.  The blue line is a running average of that figure
from the beginning of the connection.  If you want a numeric answer,
just use "-l" with tcptrace and look at the figure that it prints
out.
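
To make the "running average" idea concrete, here is a minimal Python
sketch.  It is NOT tcptrace's code; the function name and the
(time, bytes) input format are assumptions made for illustration.

```python
def running_average(samples):
    """Return [(time, bytes_per_second)] averaged from the start of the connection.

    samples: list of (time_in_seconds, payload_bytes), sorted by time.
    This input format is hypothetical, chosen only for the example.
    """
    if not samples:
        return []
    start = samples[0][0]
    total = 0
    curve = []
    for when, nbytes in samples:
        total += nbytes
        elapsed = when - start
        if elapsed > 0:  # skip the first instant: no time has passed yet
            curve.append((when, total / elapsed))
    return curve
```

Because every point divides ALL the bytes seen so far by ALL the time
elapsed so far, the curve smooths out short bursts and stalls.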

============================================================
Sat Sep 12, 1998

>> How do I print the plots from xplot?

This is from the xplot README:

>>>> Clicking the left button while SHIFT is pressed causes xplot to
>>>> drop a postscript file in the current directory.  The title is
>>>> used as the first part of the filename if there has been a title
>>>> plot command.  Otherwise, "xplot" is used.  The file ends in PS.#
>>>> where # is a serial number.  Xplot is careful not to write over a
>>>> previously dumped postscript file, and # is incremented until an
>>>> unused filename is found.
>>>> 
>>>> Clicking the middle button while SHIFT is pressed similarly
>>>> causes xplot to drop a postscript file, but this will be scaled
>>>> suitably to allow the figure to be included in a document.  You
>>>> might have to fiddle with the constants in emit_PS() and
>>>> recompile to get the figure sized the way you want it.
>>>> 
>>>> If you didn't like the size of the figure produced by
>>>> SHIFT-Middle, Clicking the right button while SHIFT is pressed
>>>> will produce a postscript plot just like the middle button, but
>>>> it will take less vertical space.  Again, you can fiddle with the
>>>> constants in emit_PS() and recompile if you don't like these
>>>> sizes.

Just a note about these files.  The magic first line that gets created in
them is nonstandard.  The first line of a postscript file is supposed to
look like:

%!PS (and possibly more stuff)

but xplot generates

%!POSTSCRIPT

That's not correct, but some older printers don't mind.  My experience
with newer printers is that they WON'T recognize it and will misbehave
in annoying ways.  If you change the first line to %!PS it'll work
fine.  If you have the xplot source, you might as well fix this and
recompile (that's what I always do).

============================================================
Sat Sep 12, 1998

>> I just installed tcptrace and xplot, but I'm having trouble 
>> figuring out how to load the neat plots into xplot?
>> Any chance of including a typical session in the docs?

There's not much to it.  Tcptrace creates plot files with the suffix
".xpl".  To see a single plot:

    xplot a2b_tsg.xpl

A VERY nice feature for lining up plots is the "-x" option, which locks
all graphs to the same X axis.  For example, you can:

    xplot -x a2b_tsg.xpl  b2a_tsg.xpl

and it'll show you BOTH plots.  When you zoom in on one, the other
zooms in too.  Very handy, particularly if you line them up across your
screen like:

  +----------------------------------------+
  |                                        | screen
  | +-----------------------------------+  |
  | |                                   |  |
  | | xplot graph 1                     |  |
  | |                                   |  |
  | |                                   |  |
  | +-----------------------------------+  |
  | +-----------------------------------+  |
  | |                                   |  |
  | | xplot graph 2                     |  |
  | |                                   |  |
  | |                                   |  |
  | +-----------------------------------+  |
  |                                        |
  +----------------------------------------+

============================================================
Tue Sep 15, 1998

>> xplot doesn't work

If you type

   xplot [FILE].xpl

and see some sort of strange syntax error warnings, type

   xplot -v

   output:
	   xplot version 0.90

If you see something that doesn't look much like that, make sure that
you're running the correct "xplot" program.  There are probably a lot of
programs around with the same name.  You want the one from:

 	ftp://mercury.lcs.mit.edu/pub/shep/

============================================================
Tue Oct  6, 1998

>> Do programs such as snoop identify the application (e.g. telnet,
>> ftp) simply by knowledge of the standard port numbers, or is there
>> something else in the packet somewhere that identifies it?

It's just from the port numbers.  RFC 1700 

	http://www.cis.ohio-state.edu/htbin/rfc/rfc1700.html

gives the official purposes of all of the low-numbered ports.  Unix
machines usually have a subset of this document in /etc/services,
which is where snoop et al. get their application information.  If
you have a lot of traffic on official ports that AREN'T in this file,
you can add more entries to make the snoop (and tcptrace)
information easier to read.
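
The lookup itself is simple.  Here is a rough Python sketch of how a
tool might turn /etc/services-style text into a port table.  The
function name is made up for illustration, and this is not how snoop
or tcptrace are actually implemented.

```python
def parse_services(text):
    """Map (port, protocol) -> service name from /etc/services-style text.

    Each useful line looks like:  name  port/protocol  [aliases]  [# comment]
    """
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a "name port/proto" line, skip it
        name = fields[0]
        port, proto = fields[1].split("/", 1)
        if port.isdigit():
            # keep the first name seen for a given port and protocol
            table.setdefault((int(port), proto), name)
    return table
```

Given the two lines "ftp 21/tcp" and "telnet 23/tcp", looking up
(23, "tcp") in the resulting table gives "telnet".  Nothing in the
packet is consulted other than the port number.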

============================================================
Wed Dec  2, 1998

>> [...referring to the TSG graphs...] interpretation of some of the
>> graphics, like the "3" and diamonds and arrows and such.

"3" is a triple duplicate ACK, the kind of thing that usually triggers 
fast retransmit in the sender.

The "diamonds" are probably what you're seeing at the tops of some of
the data segments.  That means the segment was sent with a PUSH.

>> From the code, there are a bunch of types of ACKS and things. A
>> bunch of different colors as well.

SACKs are purple, normal ACKs are green.  SYNs and FINs are orange
(unless they're retransmitted, then they're red)

If there's a blue diamond on an ACK, that means that it doesn't create
a usable RTT sample (because something that preceded that data was
retransmitted).

If there's a red diamond on an ACK, it ALSO means that there's no RTT
sample for it, but in this case it's because the data being ACKed was
retransmitted.
============================================================
Mon Dec 14, 1998

>> Is there any way to always use the same set of flags??

You can store commonly-used tcptrace arguments in "~/.tcptracerc"
(comments start with '#') or in the environment variable "TCPTRACEOPTS".

============================================================
Wed Dec 16, 1998

>> it seems to be running VERY slowly, and using very little CPU time

It might be stuck trying to resolve IP addresses.  Try running it with
"-n" to NOT resolve.  I almost always use "-n" when looking at
non-local traces.  It sometimes takes FOREVER to resolve the names
otherwise.

A quick check of what it's doing is to use the "-t" option that gives
you visual feedback of its progress; it displays the number of packets
processed so far, percentage done (unless compressed), and the
"elapsed trace file time".  If it's not doing at least several hundred
packets per second, then it's probably stuck doing DNS lookups.
============================================================
Wed Dec 16, 1998

>> with "-t", why is the percentage done more than 100%

Probably one of:

1) If it's a live file, i.e., one that tcpdump is currently writing, then the
   "percentage done" is based on the original size of the file.  As
   such, depending on how fast the file is growing compared to the
   horsepower of the processor, the figure might be way off.

2) If it's a compressed file, tcptrace doesn't know how long it'll be.  For
   gzipped files of headers only, it usually runs up to around 375%
   (100% + 275%, or almost 4-to-1 compression).
============================================================
Thu Dec 17, 1998

>> What are "post-loss acks"?

Tcptrace tries to gather _ALL_ RTT samples, not just some of them as
several TCP implementations do.  I called the case causing trouble
"Post Loss":

  Post-loss: an ACK arrives for a segment that was only xmitted once.
   However, at least one of the segments that preceded it was
   retransmitted, so this ACK was delayed until a CUMACK could be
   sent.  It is therefore not a valid RTT sample.

Tcptrace counts them and (optionally) marks them on the TSG output,
but otherwise ignores them.
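
The distinction can be sketched in a few lines of Python.  This is an
illustration only, not tcptrace's code: the function name is
hypothetical, and treating "preceded" as "the earlier segments newly
covered by this same ACK" is an assumption.

```python
def classify_ack(xmit_counts):
    """Classify the RTT sample produced by one cumulative ACK.

    xmit_counts: how many times each segment newly covered by the ACK
    was transmitted, in sequence order.  The LAST entry is the segment
    the ACK is "for".  (Hypothetical input format.)
    """
    if not xmit_counts:
        raise ValueError("the ACK must cover at least one segment")
    if xmit_counts[-1] > 1:
        # the acked segment itself was retransmitted: which copy is being ACKed?
        return "ambiguous"
    if any(count > 1 for count in xmit_counts[:-1]):
        # sent once, but the ACK waited for an earlier retransmission
        return "post-loss"
    return "valid"
```

Here classify_ack([1, 1, 1]) is "valid", classify_ack([2, 1]) is
"post-loss", and classify_ack([1, 2]) is "ambiguous".  Only the
"valid" case yields a trustworthy RTT sample.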

============================================================
Thu Dec 17, 1998

>> What does this mean in the long output with RTT stats: 
>> 
>>           For the following 5 RTT statistics, only ACKs for
>>           multiply-transmitted segments (ambiguous ACKs) were
>>           considered.  Times are taken from the last instance
>>           of a segment.
>> 

This is some pretty old stuff that probably isn't useful anymore.
When I first started looking at this, I was studying some TCPs
without the Karn/Partridge stuff in them and this was a big deal.

The point was to gather some stats about RTTs that might be confused
by older TCPs, those that were taking samples from "ambiguous ACKs"
(those for segments that were retransmitted).  For those ACKs, I kept
track separately of the max,min,avg,stdev.  For the other RTT stats,
those ACKs are ignored, as expected.

I agree that the text is confusing, but I think that it's accurate
given that explanation.

============================================================
Tue Dec 22, 1998


>> Is it possible to use it for monitoring the whole traffic between
>> two hosts (from and to all ports) in one throughput graph?

The traffic module can do that.  First, you'd need to generate a dump
file that contained ONLY the traffic between those two hosts (how to
do that will depend on the kind of file).  Assuming it's a tcpdump
file, you could:

  tcpdump -r oldfile -w newfile host THEHOST1 and host THEHOST2

then just use something like

	tcptrace -xtraffic" -B" newfile

to get bytes/second between the hosts for all traffic.

============================================================
Tue Jan 12, 1999

>> I get the message indicating that there is the presence of hardware 
>> duplicates. What exactly does that mean? Two ethernet cards with 
>> the same MAC address? I don't think it could be a duplicate IP 
>> address.

That's a sanity check.  It means that tcptrace saw 2 packets with the
same TCP header and identical ID fields at the IP level.  Because the
IP headers (IPv4) are the same, it's unlikely that this is a
retransmission (a retransmitted segment normally goes out in a new IP
datagram with a new ID).  Most likely, it means that those packets are
crossing the local network twice, as in:

       Router                Sender                  Receiver
         |                     |                         |
       ==========================================================
pkt1:    ^---------------------<
pkt2:    >-----------------------------------------------^

once from the sender to a router (or hub), and then again from the router to
the receiver.  Tcptrace flags the situation because otherwise those packets would
be seen as retransmissions when they really aren't.  If you're seeing a lot of
these (well, probably any at all), then there's a bad setup on your network.
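
A minimal Python sketch of that check follows.  It is not tcptrace's
code: the function name is hypothetical, and reducing "same TCP
header" to "same sequence number" is a simplifying assumption.

```python
def classify_packets(packets):
    """Label each packet as new, a hardware duplicate, or a retransmission.

    packets: iterable of (ip_id, tcp_seq) in capture order.
    (Hypothetical, simplified input: real headers carry far more.)
    """
    ids_by_seq = {}  # tcp_seq -> set of IP IDs already seen with it
    labels = []
    for ip_id, seq in packets:
        ids = ids_by_seq.setdefault(seq, set())
        if ip_id in ids:
            # same TCP data AND same IP ID: the very same datagram seen twice
            labels.append("hardware duplicate")
        elif ids:
            # same TCP data in a different datagram: a real retransmission
            labels.append("retransmission")
        else:
            labels.append("new")
        ids.add(ip_id)
    return labels
```

For the capture [(10, 1000), (10, 1000), (11, 2000), (12, 1000)] the
labels are new, hardware duplicate, new, retransmission.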
    
============================================================
Thu Mar  4, 1999

>> I just have a question regarding the congestion window plot. I am
>> wondering if this congestion is the cwnd on the sending side or it
>> is the MIN (cwnd, adv_win) where adv_win is the receiving side
>> advertised window size. Can you please explain how the congestion
>> window is calculated (I assumed it is calculated since the tcp
>> packet does not carry the congestion window information).

The title of that plot is misleading; it has been renamed "outstanding
data" in the version that I'm working on now.  There's a high
probability that what it's plotting is similar to the sender's
congestion window, but in fact it is just a heuristic that plots the
amount of non-acked data on the network (the difference between the
highest byte sent and the highest byte ACKed).
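
As a rough illustration of that heuristic, here is a Python sketch.
It is an assumption-laden toy, not tcptrace's implementation: the
event tuples and function name are invented, and sequence number
wraparound is ignored.

```python
def outstanding_data(events):
    """Return the un-ACKed byte count after each event.

    events: list of ("data", seq, length) or ("ack", ackno) tuples as
    seen near the sender, in time order.  (Hypothetical format.)
    """
    highest_sent = None   # sequence number just past the last byte sent
    highest_acked = None  # highest cumulative ACK seen so far
    samples = []
    for event in events:
        if event[0] == "data":
            _, seq, length = event
            if highest_sent is None:
                # first data segment: nothing sent or ACKed before it
                highest_sent = highest_acked = seq
            highest_sent = max(highest_sent, seq + length)
        elif event[0] == "ack" and highest_sent is not None:
            highest_acked = max(highest_acked, event[1])
        else:
            continue  # ACKs before any data, or unknown event types
        samples.append(highest_sent - highest_acked)
    return samples
```

For two 500-byte segments starting at 1000, an ACK for 1500, a third
segment, and an ACK for 2500, the result is [500, 1000, 500, 1000, 0].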
============================================================
Tue Mar  9, 1999

>> Can I use the program in "real-time" mode?

Sort of, but it depends on what you're trying to do.  For example:

	tcpdump -s4096 -w- | tcptrace -e stdin

will create data files for all connections as they're opened.  For
functions that give answers WHILE processing, this works fine.  For
functions that don't answer until they're done, this will require a
little more fiddling.  For example:

	tcpdump -c100 -s4096 -w- | tcptrace -e -l stdin

will grab the next 100 packets and then stop, while extracting the
contents and then doing a "long" output listing.
============================================================
Mon Apr 19, 1999

Somebody pointed out that there's now a windows port of tcpdump.  I
don't know anything about it except the URL:

http://netgroup-serv.polito.it/tools/analyzer/Install/windump/
============================================================
Tue Apr 27, 1999

>> What does the "truncated data" and "truncated packets" mean ??  I
>> ask this because I do not see any anomalies using our sniffer and
>> tcpdump, but this field has us confused.

"truncated" refers to the difference between the size of the packets
"on the wire" and the number of bytes saved in the dump file.  With
tcpdump and snoop, for example, you can set the "snap length", which
controls the maximum amount of data saved from each packet.  A
1500-byte packet grabbed with a 128-byte snap-length is "truncated".
Because some of the analysis from various modules and features
requires full packet data, tcptrace counts and prints the truncated
segments ("truncated packets") and the amount of missing data
("truncated data").
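
The arithmetic is straightforward.  A small Python sketch, assuming
each packet is described by a hypothetical (wire_length,
captured_length) pair; tcptrace's own accounting may differ in detail
(for example, in which bytes it counts as "data"):

```python
def truncation_stats(packets):
    """Return (truncated_packets, missing_bytes).

    packets: iterable of (wire_length, captured_length) in bytes.
    (Hypothetical input format for illustration.)
    """
    truncated_packets = 0
    missing_bytes = 0
    for wire_len, captured_len in packets:
        if captured_len < wire_len:
            truncated_packets += 1
            missing_bytes += wire_len - captured_len
    return truncated_packets, missing_bytes
```

A 1500-byte packet grabbed with a 128-byte snap length counts as one
truncated packet with 1372 bytes missing; a 60-byte packet is saved
whole and adds nothing to either counter.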
============================================================
Wed Apr 28, 1999

>> I was wondering if you could please explain what the following actually
>> mean in the detailed output:
>> 
>> data xmit time:        1.377 secs

That's the elapsed time of the connection from the first segment
containing data to the last segment containing data.  It discounts the
SYN and FIN handshaking.


>> idletime max:         126384.9 ms

That's the longest period during which no packets were sent (data or ACK)
from that side of the connection.
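
Both figures can be sketched in Python as follows.  The function name
and the (time, payload_bytes) input are assumptions for illustration;
this is not taken from the tcptrace source.

```python
def timing_stats(packets):
    """Return (data_xmit_secs, idletime_max_ms) for ONE side of a connection.

    packets: list of (time_in_seconds, payload_bytes) sent by that
    side, sorted by time.  SYNs, FINs, and pure ACKs have 0 payload.
    """
    # data xmit time: first data-bearing segment to last data-bearing segment
    data_times = [when for when, nbytes in packets if nbytes > 0]
    data_xmit = data_times[-1] - data_times[0] if data_times else 0.0

    # idletime max: longest gap between consecutive packets of any kind
    times = [when for when, _ in packets]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    idle_max_ms = max(gaps) * 1000.0 if gaps else 0.0
    return data_xmit, idle_max_ms
```

For packets at 0.0 (SYN), 0.25 and 0.75 (data), 4.0 and 4.5 (no
data), the data xmit time is 0.5 secs and the idletime max is
3250.0 ms.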
============================================================
Thu May 20, 1999

>> Is there a version of tcpdump for wintel machines?

I'm told that there's one here:

  http://netgroup-serv.polito.it/analyzer/install/windump/default.htm

but I've never used it.
============================================================
Tue Jun  1, 1999

>> 2. I get some 'Z' letters printed out by xplot on a sequence number
>> graph, and I couldn't find anything about this in the doc I found. Do
>> you know what they mean ?

Those are "zero windows".  That's when the receiver of the data
advertises a receive window of 0, meaning that it can't accept any
more data.  That normally means that the receiving application can't
keep up.  Normally, they're followed by a "gratuitous ACK" from the
receiver advertising additional buffer space, which will cause the
sender to send more data.  If the sender is impatient (which it's
allowed to be) or the gratuitous ACK is lost, the sender can send
additional data anyway (a zero window probe).
============================================================
Wed Oct 13, 1999

>> I actually wanted very selective outputs.  For example I just want
>> tcptrace to give me only "actual data pkts", "rexmt data pkts",
>> "data xmit time" and the "RTT avg".  I couldnt get it to work with
>> the filtering option. I'd appreciate any help on this.

There's no automatic way to do this.  A simple script should do the
trick.  You might start with something like:

tcptrace -r -l -n input/all.snoop.gz | egrep '(actual data pkts)|(rexmt data pkts)|(data xmit time)|(RTT avg)'
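
If you would rather do the filtering in a script than in egrep, the
same idea looks like this in Python.  It is only a sketch: the
function name is made up, and it assumes nothing about the report
beyond the labels appearing somewhere on the line.

```python
WANTED = ("actual data pkts", "rexmt data pkts", "data xmit time", "RTT avg")

def select_lines(report, wanted=WANTED):
    """Keep only the report lines that mention one of the wanted labels.

    report: the text of tcptrace's long output, captured however you like.
    """
    return [line for line in report.splitlines()
            if any(label in line for label in wanted)]
```

From there it is easy to split each kept line on ":" and collect the
values per connection.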
============================================================
Tue Nov  9, 1999

>> What's with all of the stupid quotes for module args

As the modules evolved, it became clear that it wasn't possible to
keep the names of the arguments that the module writers needed
separate from the arguments that the main program wanted.  Rather than
make the argument names even MORE non-intuitive, I decided that all
module arguments must be in the same shell argument as the -x switch
that enables it.  That means that if you want to pass "-G -I" to the
traffic module, you need to say:

   tcptrace -xtraffic"-G -I"
      or
   tcptrace -xtraffic" -G -I"

note that the Unix shell will package this the same as:

   tcptrace "-xtraffic-G -I"

but this seems less clear to me somehow.  The module writer then has
to parse a long, ugly string, but there's a support routine to do it
for them (see the existing modules as examples).
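
To see how the shell packages these forms, you can ask Python's
standard shlex module, which follows POSIX shell quoting rules.  This
only demonstrates the quoting; the helper name is made up, and it says
nothing about how tcptrace parses the string afterwards.

```python
import shlex

def shell_words(command_line):
    """Split a command line into the argv words a POSIX shell would pass."""
    return shlex.split(command_line)

# The quoted part is glued onto -xtraffic, so the module name and its
# arguments arrive in ONE argv entry.
print(shell_words('tcptrace -xtraffic"-G -I"'))   # ['tcptrace', '-xtraffic-G -I']
print(shell_words('tcptrace -xtraffic" -G -I"'))  # ['tcptrace', '-xtraffic -G -I']
print(shell_words('tcptrace "-xtraffic-G -I"'))   # ['tcptrace', '-xtraffic-G -I']
```

The first and third forms produce exactly the same argument; the
second differs only by the extra space after the module name.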

============================================================
Tue Jan 18, 2000


>> Not being completely sure how to interpret out of order packets, I
>> would be grateful if you could confirm that such conditions occur
>> following packet losses

Out of order packets can occur in lots of cases.  Let's say that TCP
sends the following segments:

1 2 3 4 5 6 2'

meaning that segment 2 was retransmitted (as 2') for some reason.  If
tcptrace sees those packets near the sender, it will mark 2' as a
retransmission because it already saw the first instance of the
segment.

However, if you grab the packets "close" to the receiver, tcptrace
will likely see:

1 3 4 5 6 2'

because segment "2" didn't arrive.  In that case, tcptrace will mark
segment 2 as being "out of order" because it can't tell the
difference.
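
The two viewpoints can be reproduced with a short Python sketch.  It
is a simplification for illustration (segments are just numbers, and
the function name is invented), not tcptrace's actual logic.

```python
def classify_segments(observed):
    """Label each observed segment the way a trace analyzer might.

    observed: segment numbers in the order they appear in the trace.
    """
    seen = set()
    highest = None
    labels = []
    for seg in observed:
        if seg in seen:
            labels.append("retransmission")  # already saw this segment once
        elif highest is not None and seg < highest:
            labels.append("out of order")    # never seen, but below the highest
        else:
            labels.append("in order")
        seen.add(seg)
        highest = seg if highest is None else max(highest, seg)
    return labels
```

Near the sender, classify_segments([1, 2, 3, 4, 5, 6, 2]) labels the
final 2 as "retransmission".  Near the receiver, where the first copy
never showed up, classify_segments([1, 3, 4, 5, 6, 2]) labels it
"out of order".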
