problem with arp who-has

Douglas O'Neal
07-24-2004, 10:36 PM
I am running a linux cluster on an isolated network
and I am having a network issue that I cannot resolve.
The cluster nodes are dual-processor Intel systems
running Red Hat Enterprise WS 2.1, kernel version
2.4.9-e.12smp and are connected via SMC TigerSwitch
gigabit switches. When I run 'tcpdump broadcast'
from any node I see a series of packets reading

arp who-has biowolf001 (Broadcast) tell biowolf001

with every node in the cluster represented in the
tcpdump and the packets are repeated every half
second for every node. If I shut down a node, the
arps for that node continue so I believe that the
switches are at fault. I have tried configuring
individual ports to add static addressing for each
node with no effect. Can anybody either tell me
that this is not a problem or point out the solution?
Thanks.

Doug
--
Dr. Douglas O'Neal
Manager, Bioinformatics Center
Delaware Biotechnology Institute
(302) 831-3456

P Gentry
07-24-2004, 10:39 PM
Douglas O'Neal <oneal@dbi.udel.edu> wrote in message news:<br50dv$fv2$1@news.udel.edu>...
> I am running a linux cluster on an isolated network
> and I am having a network issue that I cannot resolve.
> The cluster nodes are dual-processor Intel systems
> running Red Hat Enterprise WS 2.1, kernel version
> 2.4.9-e.12smp and are connected via SMC TigerSwitch
> gigabit switches. When I run 'tcpdump broadcast'
> from any node I see a series of packets reading
>
> arp who-has biowolf001 (Broadcast) tell biowolf001
>
> with every node in the cluster represented in the
> tcpdump and the packets are repeated every half
> second for every node. If I shut down a node, the
> arps for that node continue so I believe that the
> switches are at fault. I have tried configuring
> individual ports to add static addressing for each
> node with no effect. Can anybody either tell me
> that this is not a problem or point out the solution?
> Thanks.
>
> Doug

Hope by now someone/somehow has resolved this issue for you. No one
has posted so I'll offer what I can, which ain't much.
Questions:
-- which switch model exactly (SMC8608SX 8 port)?
-- how many switches?
-- how many nodes?
-- what kind of cluster (biowolf -> Beowulf)?

It seems every node in your cluster is looking for the MAC of
biowolf001 as if it were a default gateway (who has Elvis?) and is
getting no response that sticks. The ARPs are coming too frequently.
Without the full dump line, however, it's hard to tell if this is
what's going on (I'm used to Ethereal output). Is this (biowolf001) a
switch? Are these the only ARPs? Are the nodes' ARP tables loaded
from files (you did say this was an isolated cluster)? Are the
switches' ports statically configured (did you leave them that way)?
Any trunks? Is STP running to be sure there are no redundant paths
creating storms (if everything is static and proper you shouldn't
need it, but run it as a double check now to make sure)? More than
the one "default" VLAN? I have no idea if VLANs are useful in your
case or not.
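
One way to check who is asking for whom from tcpdump's text output is
sketched below. It tallies the (target, sender) pair of every who-has
line and separates out requests where target and sender are the same
host, as in the line quoted above; that self-directed form is what is
usually called gratuitous ARP. The regular expression assumes the
"who-has X tell Y" wording tcpdump prints, and biowolf002 is a made-up
host name used only for the sample.

```python
import re
from collections import Counter

# Assumed tcpdump text form: "... who-has TARGET [(Broadcast)] tell SENDER ..."
ARP_REQUEST = re.compile(r"who-has (\S+)(?: \([^)]*\))? tell ([^\s,]+)")


def tally_arp_requests(lines):
    """Count ARP requests per (target, sender) pair.

    Returns the full tally plus a dict of hosts that asked for their
    own address (target == sender) and how often they did so.
    """
    pairs = Counter()
    for line in lines:
        match = ARP_REQUEST.search(line)
        if match:
            pairs[(match.group(1), match.group(2))] += 1
    self_directed = {t: n for (t, s), n in pairs.items() if t == s}
    return pairs, self_directed


# Sample input; biowolf002 is hypothetical.
sample = [
    "arp who-has biowolf001 (Broadcast) tell biowolf001",
    "arp who-has biowolf002 (Broadcast) tell biowolf002",
    "arp who-has biowolf001 (Broadcast) tell biowolf001",
]
pairs, self_directed = tally_arp_requests(sample)
print(self_directed)
```

If every node shows up only in the self-directed tally, each node is
announcing itself rather than looking for biowolf001.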

I'm not sure why you should be getting any ARPs to speak of, as
surely each node and switch can hold the tables in memory. Though I
don't know that you can eliminate the timeouts, they default to 5
minutes on the above model. I guess I'm asking why everything
regarding ARP isn't static?
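
A minimal sketch of what "static" could look like on the node side:
the helper below turns a host-to-MAC table into `arp -s host hw_addr`
commands (the net-tools syntax for a permanent entry). It only prints
the commands and does not run them. The function name and the MAC
addresses are assumptions for illustration; the addresses come from
the range reserved for documentation, not from the real cluster.

```python
import re

# Colon-separated 48-bit MAC address, e.g. 00:00:5e:00:53:01
MAC = re.compile(r"^[0-9a-f]{2}(:[0-9a-f]{2}){5}$", re.IGNORECASE)


def static_arp_commands(table):
    """Return one 'arp -s' command per host, sorted by host name."""
    commands = []
    for host, mac in sorted(table.items()):
        if not MAC.match(mac):
            raise ValueError(f"bad MAC address for {host}: {mac!r}")
        commands.append(f"arp -s {host} {mac.lower()}")
    return commands


# Hypothetical table; replace with the real node addresses.
nodes = {
    "biowolf001": "00:00:5E:00:53:01",
    "biowolf002": "00:00:5E:00:53:02",
}
for command in static_arp_commands(nodes):
    print(command)
```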

If you're running at 1000 Mb/s, I don't know that this ARP traffic is
a burden, but my gut tells me something is amiss. Sorry to be so
little help, but maybe something here will spark a brain cell for
you. I have a feeling it's something fairly obvious.

HTH a little,
prg
email above disabled