The standard network tools (e.g. ifconfig, route and netstat) can't set up some of the features used in newer LVSs, e.g. routing based on src_addr. For this we use iproute2, which allows routing based on almost any of the parameters of a packet (src, dest, proto, tos...). iproute2 is available as iproute2-current.tar.gz. iproute2 implements functionality similar to Cisco's IOS.
For nodes attached to only one network (leaf nodes, i.e. there is only one possible route for packets), ifconfig and route are just fine. If multiple routes exist, then iproute2 is needed.
Presumably routing in Linux and the setup of LVS will move more toward using iproute2. The configure script will use the iproute2 package to do some configuration if you have it installed.
Instead of aliases (e.g. eth0:110) iproute2 uses labels. ip_tables is based on the same underlying code and also requires labels to recognise ip_aliases. If you want to see the network as ip_tables sees it, you need the iproute2 tools.
iproute2 is not compatible with ifconfig, route and netstat. Entries added by the iproute2 tools are not seen by ifconfig/route etc., so the output of ifconfig/route etc. will be incorrect. You can't tell from looking at the output of ifconfig/route whether iproute2 commands have been run; you just have to know. The iproute2 tools, on the other hand, correctly interpret the results of ifconfig/route commands and will give the correct state of the network.
Unfortunately the user interface to iproute2 is not easy.
The documentation is not easy to read (although it was all Julian needed).
Ratz suggested "Policy Routing Using Linux" by Matthew G. Marsh, Pub Sams 2001, ISBN 0-672-32052-5, to get you started (it helped me). (Oct 2002) Ratz has just found that the book is also online
Padraig Brady padraig (at) antefactor (dot) com suggests Linux Advanced Routing and Traffic Control HOWTO.
See Guide to IP Layer Network Administration with Linux (http://linux-ip.net/) where Appendix C has information on using the iproute2 tools.
The output from the commands is difficult to parse (see the comments in the configure script for more details) - i.e. it's not machine readable. If the route is 0/0 then it is not listed in the output and the next output item shifts one field. This means that you have to know the route before you can parse the output. Ratz is developing a wrapper for iproute2 that will give machine readable output. (To have a command line utility which is not machine readable is intolerable.)
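One way to cope with the shifting fields, until a proper wrapper is available, is to look values up by keyword rather than by position. The following is only a sketch: route_field is a hypothetical helper (it is not Ratz's wrapper), and it assumes that each value in a line of ip route output directly follows its keyword (via, dev, src...).

```shell
# route_field LINE KEYWORD
# Print the token that follows KEYWORD in one line of `ip route` output.
# Prints nothing and returns 1 if the keyword is absent.
# Looking fields up by keyword avoids relying on their position in the line.
route_field() {
    line=$1
    key=$2
    # Split the line on whitespace; token 1 is the prefix (or "default").
    set -- $line
    shift
    while [ $# -ge 2 ]; do
        if [ "$1" = "$key" ]; then
            printf '%s\n' "$2"
            return 0
        fi
        shift
    done
    return 1
}
```

For example, route_field "8.8.8.8/29 proto kernel scope link src 8.8.8.8" src picks out the source address no matter how many fields precede it, and the same call with via returns nothing because that line has no gateway.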
There are other problems
Joe, Dec 2003
The latest on Alexey's ftp site is 2.4.7 from Jan 2002. Is this really the latest?
Alejandro Mery amery (at) geeks (dot) cl 24 Dec 2003
2.4.7-now-ss010824 is the official latest 'stable' but Bert Hubert (ahu) (Bert's website, http://ds9a.nl/) from lartc.org had an 'almost-branch' with some fixes and improvements with the date 2002-10-20. Bert's code is downloadable at http://ds9a.nl/cgi-bin/viewcvs.cgi/iproute2-ahu/iproute2-ahu.tar.gz?tarball=1&only_with_tag=HEAD. Sadly both Bert's and Alexey's code are unmaintained.
Example:
In a normally functioning LVS-DR, with routing set up by "route", the realservers will be sending packets with the following routing
- src_addr=VIP dest_addr=0/0. dest=0/0 - route via default gw
- src_addr=RIP dest_addr=RIP network. dest=RIP network - route to RIP network
In LVS-DR a packet leaving the realserver can exit via the default gw or the director. In the standard setup, packets with dst_addr=RIPnetwork are put onto the local network and all other packets are sent to the default gw.
If instead the routing is set up by "iproute2", packets with src_addr=VIP are sent to the default gw, while packets with src_addr=RIP are put onto the local network. The realservers will be sending packets with the following routing
- src_addr=VIP dest_addr=0/0. src=VIP - route via default gw
- src_addr=RIP dest_addr=RIP network. src=RIP - route to RIP network
The result for a normal working LVS will be the same (i.e. the LVS will still work). However with the standard setup, packets with src_addr=RIP cannot get to the outside world (the director does not have a default route to 0/0). If a process needs this (e.g. the operator needs to telnet out, or the realserver needs DNS), then those packets from the RIP can be NAT'ed out via the director (or you can set up the realservers as if they are part of a 3-Tier LVS). For security, all packets from the VIP have to go out the default gw (including any to, say, the DIP, which will be dropped by rules on the default gw, to prevent spoofing).
- src_addr=VIP dest_addr=RIP network. src=VIP - route via default gw, will be dropped
- src_addr=RIP dest_addr=0/0. src=RIP - route to RIP network. If the director has the correct NAT rules, then these packets can pass to the outside world.
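A minimal sketch of how routing by src_addr is commonly expressed with iproute2: a rule selects a separate routing table for packets from the VIP, and that table holds only the default route. This is not necessarily what the configure script does. The helper name print_src_routing and the table number 100 are assumptions, and the function only prints the commands so they can be inspected before being run as root.

```shell
# print_src_routing VIP DEFAULT_GW DEV
# Print (do not run) iproute2 commands for source-based routing on a
# LVS-DR realserver. Table number 100 is an arbitrary placeholder.
print_src_routing() {
    vip=$1
    gw=$2
    dev=$3
    # Packets with src_addr=VIP consult table 100 instead of the main table.
    echo "ip rule add from $vip table 100"
    # Table 100 holds only a default route: everything from the VIP leaves via the default gw.
    echo "ip route add default via $gw dev $dev table 100"
    # Packets with src_addr=RIP still use the main table, which is assumed to
    # contain only the route to the RIP network (no default route).
    echo "ip route flush cache"
}
```

Calling print_src_routing with the VIP, the address of the default gw and the outgoing device prints three commands. The addresses are whatever applies to your own network; none are taken from the example above.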
Lawrence Strydom laurie (at) midafrica (dot) com 26 May 2003
Is it possible to set up heartbeat between a Linux and a Windose box? The MS box will be the master node and the Linux box will provide redundancy. (don't ask! it is what the client wants)
Horms
It should be theoretically possible to run heartbeat on Windows. But to my knowledge no one has done this in the past. The heartbeat code is reasonably portable (between different Unix-like operating systems) but it is likely that you will need to do quite a lot of work to get it to compile and work correctly on Windows. I have no experience with using cygwin so I can't comment any further than that.
(with Julian)
(I needed this information to setup a one-net LVS-NAT LVS. However since it is about routing and not LVS specifically, maybe I should move it elsewhere.)
The routes added with route go into the kernel FIB (Forwarding Information Base) route table. The contents are displayed with route (or netstat -r).
Following an icmp redirect, the route updates go into the kernel's route cache (route -C).
You can flush the route cache with
    echo 1 > /proc/sys/net/ipv4/route/flush
or
    ip route flush cache
Here's the route cache on the realserver before any packets are sent.
    realserver:/etc/rc.d# route -C
    Kernel IP routing cache
    Source          Destination     Gateway         Flags Metric Ref    Use Iface
    realserver      director        director              0      1        0 eth0
    director        realserver      realserver      il    0      0        9 lo
With icmp redirects enabled on the director, repeatedly running traceroute to the client shows the routes changing from 2 hops to 1 hop. This indicates that the realserver has received an icmp redirect packet telling it of a better route to the client.
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.932 ms  0.562 ms  0.503 ms
     2  client (192.168.1.254)  1.174 ms  0.597 ms  0.571 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.72 ms  0.581 ms  0.532 ms
     2  client (192.168.1.254)  0.845 ms  0.559 ms  0.5 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  client (192.168.1.254)  0.69 ms *  0.579 ms
Although route shows no change in the FIB, the route cache has changed. (The new route of interest is bracketed by >< signs in the table below.)
    realserver:/etc/rc.d# route -C
    Kernel IP routing cache
    Source          Destination     Gateway         Flags Metric Ref    Use Iface
    client          realserver      realserver      l     0      0        8 lo
    realserver      realserver      realserver      l     0      0     1038 lo
    realserver      director        director              0      1      138 eth0
    >realserver     client          client                0      0        6 eth0<
    director        realserver      realserver      l     0      0        9 lo
    director        realserver      realserver      l     0      0      168 lo
Packets to the client now go directly to the client instead of via the director (which you don't want).
It takes about 10mins for the client's route cache to expire (experimental result). The timeouts may be in /proc/sys/net/ipv4/route/gc_*, but their location and values are well encrypted in the sources :) (some more info from Alexey at LVS archives)
Here's the route cache after 10mins.
    realserver:/etc/rc.d# route -C
    Kernel IP routing cache
    Source          Destination     Gateway         Flags Metric Ref    Use Iface
    realserver      realserver      realserver      l     0      0     1049 lo
    realserver      director        director              0      1      139 eth0
    director        realserver      realserver      l     0      0        0 lo
    director        realserver      realserver      l     0      0      236 lo
There are no routes to the client anymore. Checking with traceroute, shows that 2 hops are initially required to get to the client (i.e. the routing cache has reverted to using the director as the route to the client). After 2 iterations, icmp redirects route the packets directly to the client again.
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.908 ms  0.572 ms  0.537 ms
     2  client (192.168.1.254)  1.179 ms  0.6 ms  0.577 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.695 ms  0.552 ms  0.492 ms
     2  client (192.168.1.254)  0.804 ms  0.55 ms  0.502 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  client (192.168.1.254)  0.686 ms  0.533 ms *
Now turn off icmp redirects on the director.
    director:/etc/lvs# echo 0 > /proc/sys/net/ipv4/conf/all/send_redirects
    director:/etc/lvs# echo 0 > /proc/sys/net/ipv4/conf/default/send_redirects
    director:/etc/lvs# echo 0 > /proc/sys/net/ipv4/conf/eth0/send_redirects
Checking routes on the realserver -
    realserver:/etc/lvs# netstat -rn
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
    127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
    0.0.0.0         director        0.0.0.0         UG        0 0          0 eth0
nothing has changed here.
Flush the kernel route cache and display it -
    realserver:/etc/lvs# ip route flush cache
    realserver:/etc/lvs# route -C
    Kernel IP routing cache
    Source          Destination     Gateway         Flags Metric Ref    Use Iface
    realserver      director        director              0      1        0 eth0
    director        realserver      realserver      l     0      0        1 lo
There are now no routes to the client.
Now when you send packets to the client, the route stays via the director, needing 2 hops to get to the client. There are no one-hop packets to the client.
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.951 ms  0.56 ms  0.491 ms
     2  client (192.168.1.254)  0.76 ms  0.599 ms  0.574 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.696 ms  0.562 ms  0.583 ms
     2  client (192.168.1.254)  0.62 ms  0.603 ms  0.576 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.692 ms *  0.599 ms
     2  client (192.168.1.254)  0.667 ms  0.603 ms  0.579 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.689 ms  0.558 ms  0.487 ms
     2  client (192.168.1.254)  0.61 ms  0.63 ms  0.567 ms
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.705 ms  0.563 ms  0.526 ms
     2  client (192.168.1.254)  0.611 ms  0.595 ms *
    realserver:/etc/rc.d# traceroute client
    traceroute to client (192.168.1.254), 30 hops max, 40 byte packets
     1  director (192.168.1.9)  0.706 ms  0.558 ms  0.535 ms
     2  client (192.168.1.254)  0.614 ms  0.593 ms  0.573 ms
The kernel route cache
    realserver:/etc/rc.d# route -C
    Kernel IP routing cache
    Source          Destination     Gateway         Flags Metric Ref    Use Iface
    client          realserver      realserver      l     0      0       17 lo
    realserver      realserver      realserver      l     0      0        2 lo
    realserver      director        director              0      1        0 eth0
    >realserver     client          director              0      0       35 eth0<
    director        realserver      realserver      l     0      0       16 lo
    director        realserver      realserver      l     0      0       63 lo
shows that the only route to the client (labelled with ><) is via the director.
For send_redirects, what's the difference between all, default and eth0?
Julian
see the LVS archives
When the kernel needs to check for a feature (e.g. send_redirects) it uses calls like:
    if (IN_DEV_TX_REDIRECTS(in_dev)) ...
These macros are defined in /usr/src/linux/include/linux/inetdevice.h
The macro returns a value using expression from all/<var> and <dev>/<var>. So, these macros check for example for: all/send_redirects || eth0/send_redirects or all/hidden && eth0/hidden.
When you create eth0 for the first time using ifconfig eth0 ... up, the kernel internally copies default/send_redirects to eth0/send_redirects, i.e. default/ contains the initial values the device inherits when it is created. This is the safest way for a device to appear with correct conf/<dev>/ values.
When we put a value in all/<var>, you can assume that we set the <var> for all devices in this way:
    operator   all/<var>   the macro returns
    &&         0           0
    &&         1           the value from <dev>/<var>
    ||         0           the value from <dev>/<var>
    ||         1           1
This scheme allows the different devices to have different values for their vars. E.g. if we set all/send_redirects to 0, the 3rd line applies, i.e. the result from the macro is the real value in <dev>/send_redirects. If we set all/send_redirects to 1 then, according to the 4th line, the macro always returns 1 regardless of <dev>/send_redirects.
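The combination rule Julian describes can be modelled in a few lines of shell. This is only a sketch of the description above, not the kernel macros themselves; the function names effective_or and effective_and are made up for the illustration.

```shell
# effective_or ALL DEV
# Model of a macro that combines all/<var> || <dev>/<var> (e.g. send_redirects).
effective_or() {
    if [ "$1" -ne 0 ] || [ "$2" -ne 0 ]; then
        echo 1
    else
        echo 0
    fi
}

# effective_and ALL DEV
# Model of a macro that combines all/<var> && <dev>/<var> (e.g. hidden).
effective_and() {
    if [ "$1" -ne 0 ] && [ "$2" -ne 0 ]; then
        echo 1
    else
        echo 0
    fi
}
```

With this model, effective_or 1 0 gives 1: setting only eth0/send_redirects to 0 is not enough while all/send_redirects is still 1. That is why the director example above writes 0 to all/, default/ and eth0/.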
how to debug/understand TCP/IP packets?
Julian
The RFC documents http://www.ietf.cnri.reston.va.us/rfc.html are your friends. The numbers you need:
    793   TRANSMISSION CONTROL PROTOCOL
    1122  Requirements for Internet Hosts -- Communication Layers
    1812  Requirements for IP Version 4 Routers
    826   An Ethernet Address Resolution Protocol
For tcpdump, see man tcpdump.
for Microsoft NT _server_
Steve (dot) Gonczi (at) networkengines (dot) com
there is a uSoft supplied packet capture utility as well.
also - W. Richard Stevens: TCP/IP Illustrated, Vol 1, a good intro into packet layouts and protocol basics. (anything by Stevens is good - Joe).
Ivan Figueredo idf (at) weewannabe (dot) com
for windump - http://netgroup-serv.polito.it/windump/
Packets leaving a LVS-DR realserver can have src_addr=VIP or src_addr=RIP. If the default gw is different for each packet, it would be nice to have a command line testing tool like ping or traceroute to test the route. The normal tools will create packets with src_addr=RIP and you won't be able to test the packets with src_addr=VIP.
Roberto Nibali ratz (at) tac (dot) ch 22 May 2001
maybe hping can help you.
Joe
Ah, the file hping2.8 is the man page i.e. {hping2}.8 - I thought it was v2.8 of hping.
How about:
    ip route get $IP
didn't know about "get". yes that works. It's like a -C with iptables. I'd still like to send a packet and see where it goes rather than getting an answer about where it is expected to go.
Julian
Not possible with src interface "lo" but possible with source address configured in "lo". Oh yes, "source interface" for some tools means "get one address from this iface and use it". In most of the cases these tools don't do the Right Thing.
from iproute2
    $ ping -I src dst
    $ arping -I if -s src dst
see Julian's notes and patches to handle the arp problem with iproute2 (this is somewhat developmental).
from Julian
This will look at the routing tables and tell you the route to xxx.xxx.xxx.xxx
    ip route get xxx.xxx.xxx.xxx
If you already have a route from A to B, and want to add another, you can't, you have to append the extra route.
dynnema dynnema (at) yahoo (dot) com Mar 22 2002
Lets say I got one RS and two NAT DIRs.
    RS:   RIP1: 192.168.1.2/24 dev eth0
          RIP2  192.168.2.2/24 dev eth0:10
    DIR1: VIP:  x.x.x.69 eth0:110  DIP 192.168.1.1
    DIR2: VIP:  x.x.x.70 eth0:110  DIP 192.168.2.1
I add the first route
    ip route add src 192.168.1.2 via 192.168.1.1
but then I can't add the second route:
    ip route add src 192.168.2.2 via 192.168.2.1
    RTNETLINK answers: File exists
Careful reading of IProute mailing list was very useful. It should be
    ip route append src 192.168.2.2 via 192.168.2.1
Joe: we used to have ip aliases with ifconfig. We still have ip aliases, but as of kernel 2.1.128, the semantics have changed. Be careful using the old style ip aliases (e.g. eth0:1, lo:127) with the newer tools (e.g. iproute2), which expect a different syntax.
Ratz 25 Nov 2003
ipchains doesn't recognize aliases either, because since the 2.2.x kernel we have moved to the iproute2 architecture.
Note | |
---|---|
Joe: in other parts of the HOWTO, I've incorrectly said the changeover started with the 2.4 kernels. Hopefully this error has been fixed. The change from 2.2 to 2.4 involved the different packet path through the kernel and the replacement of ipchains with netfilter (http://www.netfilter.org). Netfilter is most familiar through its user space tool iptables which defines rule set for packets. |
Packet filtering on aliases stopped working after the decay of ipfwadm in the old 2.0.x kernel days. Today you can still filter on so-called ip aliases, but as the name implies you specify the IP ADDRESS as a classifier and, if you want to restrict it further, you add the underlying _physical_ interface definition to the classifying rule.
iproute2 is compatible with ifconfig/route/netstat but not vice versa. The two biggest issues people new to iproute2 have to struggle with are:
Ratz ratz (at) drugphish (dot) ch 07 Mar 2007
Before the arrival of the Linux kernel version 2.2 a network device named eth0:3 was actually a "real" (kernel-wise) network device by the name of eth0:3. You could filter on that device and you could route on that device (please send this packet out eth0:3).
After that the Linux network model changed and the so-called logical/virtual devices were degraded to aliases. The nomenclature was never standardised, so in the 2.0.x kernels, a device eth0:3 was called an alias, but it was a real independent device. In later kernels, the name alias meant "another name for". In current kernels, an alias is actually a string related to an IP address, nothing more and nothing less. It has no semantic meaning whatsoever, besides being a backwards compatible string for the ifconfig tool.
The label is optional for secondary IP addresses. Secondary IPs configured with iproute2 without an explicit label do not show up in ifconfig.
Note | |
---|---|
If the first IP configured on an interface with ip addr add is 192.168.1.1/24, then any subsequent addresses in that network (192.168.1.2/24..192.168.1.254/24) will be secondary addresses and 192.168.1.1 will be the primary address. If the primary address is removed, then all secondary addresses will also be removed. If another address not in that network is added (e.g. 10.10.1.1/16 or 192.168.1.5/30) then it will be another primary address. |
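The primary/secondary rule in the note can be sketched as a comparison of networks: a new address is treated as secondary when an existing primary has the same prefix length and falls in the same network. This is my reading of the note rather than a statement about the kernel source, and the helper names (ip_to_int, network_of, classify) are hypothetical. The arithmetic assumes a shell with 64-bit integers.

```shell
# ip_to_int A.B.C.D
# Convert a dotted-quad address to a 32-bit integer (runs in a subshell so
# the IFS change does not leak to the caller).
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# network_of ADDR/LEN
# Print "network/LEN", with the network part as an integer.
network_of() {
    addr=${1%/*}
    len=${1#*/}
    mask=$(( len == 0 ? 0 : (0xFFFFFFFF << (32 - len)) & 0xFFFFFFFF ))
    echo "$(( $(ip_to_int "$addr") & mask ))/$len"
}

# classify NEW_ADDR/LEN EXISTING_PRIMARY/LEN...
# Print "secondary" if NEW shares network and prefix length with an
# existing primary address, otherwise "primary".
classify() {
    new=$(network_of "$1")
    shift
    for p in "$@"; do
        if [ "$(network_of "$p")" = "$new" ]; then
            echo secondary
            return 0
        fi
    done
    echo primary
}
```

Using the addresses from the note, classify 192.168.1.2/24 192.168.1.1/24 reports secondary, while 10.10.1.1/16 and 192.168.1.5/30 are each reported as primary.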
Going the other direction, an alias configured with ifconfig always shows up in ip addr show.
    ifconfig intf:label ip.ad.dr.ess netmask ne.tm.as.k broadcast br.oa.dc.ast up
is essentially the same as
    ip addr add ip.ad.dr.ess/cidr brd + dev intf label intf:label
    ip link set dev intf up
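When translating an ifconfig command into its ip addr equivalent, the dotted netmask has to become a prefix length (the /cidr part). A small sketch of that conversion follows. mask_to_cidr is a hypothetical helper, and it only checks each octet on its own, so it does not reject non-contiguous masks such as 255.0.255.0.

```shell
# mask_to_cidr NETMASK
# Convert a dotted netmask (e.g. 255.255.255.0) to a prefix length (24).
# Runs in a subshell so the IFS change stays local.
mask_to_cidr() (
    IFS=.
    set -- $1
    bits=0
    for octet in "$@"; do
        case $octet in
            255) bits=$((bits + 8)) ;;
            254) bits=$((bits + 7)) ;;
            252) bits=$((bits + 6)) ;;
            248) bits=$((bits + 5)) ;;
            240) bits=$((bits + 4)) ;;
            224) bits=$((bits + 3)) ;;
            192) bits=$((bits + 2)) ;;
            128) bits=$((bits + 1)) ;;
            0)   ;;
            *)   echo "invalid netmask octet: $octet" >&2; exit 1 ;;
        esac
    done
    echo "$bits"
)
```

For instance, a netmask of 255.255.255.248 corresponds to /29, so ip.ad.dr.ess netmask 255.255.255.248 becomes ip.ad.dr.ess/29.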
Sentences like "Network aliases or IP aliases or device aliases don't work with netfilter anymore" are not correct, since it's rather the other way around, but generally not a correct sentence, since no packet filtering mechanism ever worked with pure strings :). If you want to filter on an alias, find out its corresponding IP address (using ip addr show) and filter based on the IP address and the physical underlying interface. So:
    1.1.1.1 eth0
    1.1.2.1 eth0:3
    1.1.3.1 eth0:foobar
If you want to filter based on eth0:3, you set up a filter as follows on eth0 and 1.1.2.1 (didn't check on the correct syntax):
    iptables -t filter -A INPUT -j DROP -i eth0 -s 1.1.2.1 ...
Note | |
---|---|
You can't filter on eth0:3, and iptables doesn't use labels either. So you can't use alias names such as eth0:3 as interfaces in iptables rules. |
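Finding the address and physical interface behind a label can be scripted from the output of ip addr show. The sketch below is a hypothetical helper, label_to_filter. It assumes the output layout shown elsewhere in this section: a numbered header line per device, and inet lines whose last field is the label.

```shell
# label_to_filter LABEL
# Read `ip addr show` output on stdin and print "device address" for every
# inet line whose last field is LABEL.
label_to_filter() {
    awk -v label="$1" '
        # Header line such as "2: eth0: <BROADCAST,...>": remember the device.
        /^[0-9]+: / { dev = $2; sub(/:$/, "", dev) }
        # Address line ending in the label: strip the prefix length and print.
        $1 == "inet" && $NF == label {
            addr = $2
            sub(/\/.*/, "", addr)
            print dev, addr
        }
    '
}
```

Running ip addr show | label_to_filter eth0:3 would print the device and address to put into the interface and address matches of a filter rule.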
A Linux host/router behaves like a modern managed (application) switch: IP addresses are no longer assigned to network interface cards. An IP address is attached to the host, and this confuses most people, especially when they have multiple NICs in their node, configure a different IP address on each NIC, ping one IP address and get the reply from a seemingly different collision domain. For example, even if a machine is physically connected only via eth0 with IP address 192.168.1.1, and eth1 with IP address 10.10.10.10 is not wired to a switch, you will be able to ping 10.10.10.10 through eth0. This is because IP addresses do not belong to network interface cards. It is also one reason why you can filter per network interface card and also per IP address. In some cases, the machine will have a route to (say) 10.10.10.10, but you don't have a routing entry for /32 IP addresses and they'll still reply.
Note | |
---|---|
Joe: the Marsh book, p8, explains that from 2.2.x, when you configure an IP on a NIC, the route to that network is configured as well (i.e. you don't have to add a route to the network, as you did with the 2.0.x kernels). If you don't want the route, you have to configure a host (i.e. /32) address. |
There is no assignment anymore of IP addresses to network interface cards. An IP address is attached to the host and this confuses most people.
Joe
yes. So why are addresses configured with the name of the NIC (eth0, eth1)? Why not just tell the kernel which IPs it has and let it figure out what to do with all the NICs?
How?
An IP alias or logical/virtual device is simply a string, which you'll clearly see in the output of ip addr show:
    root@laphish2:~# ip addr show
    1: lo: <LOOPBACK,UP,10000> mtu 16436 qdisc noqueue
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
    2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:08:74:9d:e7:0a brd ff:ff:ff:ff:ff:ff
    3: eth1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:0b:db:22:82:53 brd ff:ff:ff:ff:ff:ff
        inet 192.168.1.32/24 brd 192.168.1.255 scope global eth1
    4: vmnet8: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:50:56:c0:00:08 brd ff:ff:ff:ff:ff:ff
        inet 172.16.39.1/24 brd 172.16.39.255 scope global vmnet8
    5: vmnet1: <BROADCAST,MULTICAST,UP,10000> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:50:56:c0:00:01 brd ff:ff:ff:ff:ff:ff
        inet 192.168.136.1/24 brd 192.168.136.255 scope global vmnet1
If I add a new IP address to the host, I can specify a physical interface to which it will add the routing entries and that "alias" string (label in iproute2 speak):
    root@laphish2:~# ip addr add 7.7.7.7/32 brd + dev eth0 label "eth0joe_bloggs"
    root@laphish2:~# ip addr add 8.8.8.8/29 brd + dev eth0 label "eth0:ratzfatz"
    root@laphish2:~# ip addr show dev eth0
    2: eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:08:74:9d:e7:0a brd ff:ff:ff:ff:ff:ff
        inet 7.7.7.7/32 scope global eth0joe_bloggs
        inet 8.8.8.8/29 brd 8.8.8.15 scope global eth0:ratzfatz
    root@laphish2:~# ip route show dev eth0 table main
    8.8.8.8/29 proto kernel scope link src 8.8.8.8
Even the eth0 is a string with no special meaning:
    root@laphish2:~# ip link set dev eth0 down
    root@laphish2:~# ip link set dev eth0 name kkk
    root@laphish2:~# ip addr show dev eth0
    Device "eth0" does not exist.
    root@laphish2:~# ip addr show dev kkk
    2: kkk: <BROADCAST,MULTICAST> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:08:74:9d:e7:0a brd ff:ff:ff:ff:ff:ff
        inet 7.7.7.7/32 scope global kkk
        inet 8.8.8.8/29 brd 8.8.8.15 scope global kkk:2
    root@laphish2:~# ip link set dev kkk up
    root@laphish2:~# ip addr show dev kkk
    2: kkk: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:08:74:9d:e7:0a brd ff:ff:ff:ff:ff:ff
        inet 7.7.7.7/32 scope global kkk
        inet 8.8.8.8/29 brd 8.8.8.15 scope global kkk:2
This should remedy the last concerns regarding Linux networking on the link and addressing level. What is interesting though is that if you rename an existing device, the associated labels will get renamed as well and enumerated, so ifconfig will find them.
And for some final fun:
    root@laphish2:~# ip addr add 9.9.9.9/29 brd + dev kkk label "kkklllllvvvv"
    root@laphish2:~# ifconfig
    eth1      Link encap:Ethernet  HWaddr 00:0B:DB:22:82:53
              inet addr:192.168.1.32  Bcast:192.168.1.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:71064 errors:0 dropped:0 overruns:0 frame:0
              TX packets:59472 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:71731379 (68.4 MiB)  TX bytes:7894335 (7.5 MiB)
              Interrupt:11 Base address:0x8800
    kkk       Link encap:Ethernet  HWaddr 00:08:74:9D:E7:0A
              inet addr:7.7.7.7  Bcast:0.0.0.0  Mask:255.255.255.255
              UP BROADCAST MULTICAST  MTU:1500  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
              Interrupt:11 Base address:0x6c00
    kkk:2     Link encap:Ethernet  HWaddr 00:08:74:9D:E7:0A
              inet addr:8.8.8.8  Bcast:8.8.8.15  Mask:255.255.255.248
              UP BROADCAST MULTICAST  MTU:1500  Metric:1
              Interrupt:11 Base address:0x6c00
    kkklllllvvvv: error fetching interface information: Device not found
    root@laphish2:~# ip addr show dev kkk
    2: kkk: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:08:74:9d:e7:0a brd ff:ff:ff:ff:ff:ff
        inet 7.7.7.7/32 scope global kkk
        inet 8.8.8.8/29 brd 8.8.8.15 scope global kkk:2
        inet 9.9.9.9/29 brd 9.9.9.15 scope global kkklllllvvvv
Not only does ifconfig not show all the IP addresses configured to kkk, it also barfs about an unknown device, which actually is a label for an IP address. Thankfully we have iproute2, which displays the exact state of configuration.
One of the problems with the iproute2 utils is that the syntax is not machine readable (and difficult for humans too). Ratz has built some wrappers around these utils.
Ratz 25 Nov 2003
If you guys are interested I'll offer my first semi-official release of some of the replacement tools I've written for ifconfig/route. You can download them from Ratz's wrappers http://www.drugphish.ch/~ratz/iproute2/
It's still not really scriptable (I wrote it with really gross bash constructs and by using external tools ;). BUT it addresses some architectural principles, such as separation of concerns, correctness, flexibility, conceptual integrity, coupling and cohesion! You are given two tools to maintain almost everything network related. (I'm aware that iptables/netfilter and mii-tool, ethtool are also network related)
ifconfig gives you the (wrong) impression that eth0:0 is an interface, just like the others in the output of ifconfig -a. This is not true. The iproute2 tools correctly display the relationship between aliases/labels and their corresponding physical interface.
Example:
    laphish:~ # ifconfig -a | grep -A2 eth0
    eth0      Link encap:Ethernet  HWaddr 00:20:E0:68:71:3A
              inet addr:172.23.2.131  Bcast:172.23.255.255  Mask:255.255.0.0
              inet6 addr: fe80::220:e0ff:fe68:713a/64 Scope:Link
    --
    eth0:0    Link encap:Ethernet  HWaddr 00:20:E0:68:71:3A
              inet addr:10.98.43.233  Bcast:10.98.43.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    --
    eth0:foo  Link encap:Ethernet  HWaddr 00:20:E0:68:71:3A
              inet addr:10.23.7.233  Bcast:10.23.7.255  Mask:255.255.255.0
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
    laphish:~ #
Sure, one could argue that all HWaddr of those "interfaces" are the same and thus something with the interpretation of them being _real_ physical interfaces must be fishy. But it gives you the wrong idea of connection or entity relationship between link and ip layer.
Now let's compare the same output for iproute2:
    laphish:~ # ip addr show dev eth0
    2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 100
        link/ether 00:20:e0:68:71:3a brd ff:ff:ff:ff:ff:ff
        inet 172.23.2.131/16 brd 172.23.255.255 scope global eth0
        inet 10.23.7.233/24 brd 10.23.7.255 scope global eth0:foo
        inet 10.98.43.233/24 brd 10.98.43.255 scope global eth0:0
        inet 10.239.10.1/24 brd 10.239.10.255 scope global eth0
        inet6 fe80::220:e0ff:fe68:713a/64 scope link
    laphish:~ #
As you can see, we have a physical interface (link layer entity) called eth0, and associated with this interface we have 5 (not 4 like with ifconfig) IP addresses. And you can certainly spot the labels, which in ifconfig were displayed as independent interfaces, at the end of each line starting with inet, right?
Plus you will certainly have noted that in the second output we have one additional address which was not shown in the ifconfig output but is very well routable and _is_ a valid configuration. I simply didn't want to put an alias there.
Tools like ipchains and iptables and their underlying state machine are better off matching for ip addresses and the _one_ physical interface those are attached to than trying to fiddle around with a label that is optional and doesn't give you real valuable information. Additionally, with iproute2 you have a better approach to conceptual integrity, which is one of the key ingredients of architectures, in that you say that even if I have multiple addresses for one interface I still send out the packet through the physical interface and not through a labeled, aliased or virtual interface.
ifconfig is an example of a "hiding complexity" tool. Hiding complexity is a concept the software industry has not yet adopted to the extent that we can trust it, and thus ifconfig is broken by design.
The reasons why people still use those deprecated tools are: