Doing this from source code is now described in the LVS-mini-HOWTO. Two methods of setup are described below.
Ultra Monkey is a packaged set of binaries for LVS, including Linux-HA for director failover and ldirectord for realserver failover. It's written by Horms, one of the LVS developers. Ultra Monkey was used on many of the server setups sold by VA Linux and presumably made lots of money for them. Ultra Monkey has been around since 2000 and is mature and stable. Questions about Ultra Monkey are answered on the LVS mailing list. Ultra Monkey is mentioned in many places in the LVS-HOWTO.
Ben Hollingsworth ben (dot) hollingsworth (at) bryanlgh (dot) org 29 Jun 2007
There are step-by-step instructions on how to install Ultra Monkey LVS in a 2-Node HA/LB Setup on CentOS/RHEL4 (http://www.jedi.com/obiwan/technology/ultramonkey-rhel4.html).
Dan Thagard daniel (at) gehringgroup (dot) com 3 Jul 2007
I recently set up LVS using the Ultramonkey RPMs. The following is, to the best of my understanding, a complete howto for setting up CentOS 5 with LVS: a generic CentOS 5 x64 install on 2 PCs using Ultramonkey with a streamlined/HA topology and Apache. The examples below assume two hosts, ws01.testlab.local (10.0.0.10) and ws02.testlab.local (10.0.0.20), sharing the virtual IP 10.0.0.100.
Configure the network settings for each adapter.
Select the system packages.
Configure the system packages.
Set firewall to 'Disabled' and click 'Forward'.
Edit the '/etc/group' file
vi /etc/group
Su to root
su -
Install the dries yum repository by creating dries.repo in the /etc/yum.repos.d/ directory with the following contents
[/etc/yum.repos.d/dries.repo]
[dries]
name=Extra Fedora rpms dries - $releasever - $basearch
baseurl=http://ftp.belnet.be/packages/dries.ulyssis.org/redhat/el5/en/x86_64/dries/RPMS
Install the dries GPG key
rpm --import http://dries.ulyssis.org/rpm/RPM-GPG-KEY.dries.txt
Update your local packages and install some additional ones
yum update -y && yum -y install lynx libawt xorg-x11-deprecated-libs nx freenx arptables_jf httpd-devel
Correct the release version so the system identifies itself as Red Hat Enterprise Linux 5
mv /etc/redhat-release /etc/redhat-release.orig && \
echo "Red Hat Enterprise Linux Server release 5 (Tikanga)" > /etc/redhat-release
Install the arptables-noarp-addr and perl-Mail-POP3Client RPMs (change the cd path to wherever you downloaded Ultramonkey to)
cd /usr/local/src/Ultramonkey && \
rpm -Uvh arptables-noarp-addr-0.99.2-1.rh.el.um.1.noarch.rpm && \
rpm -Uvh perl-Mail-POP3Client-2.17-1.el5.centos.noarch.rpm
Install Ultramonkey
yum install -y heartbeat*
Download the Ultramonkey config files for your desired topology from http://www.ultramonkey.org into the /etc/ha.d/ directory and edit them to match your configuration. Examples as follows:
[/etc/ha.d/authkeys]
auth 2
2 sha1 Ultramonkey!

[/etc/ha.d/ha.cf]
logfacility local0
mcast eth0 225.0.0.1 694 1 0
auto_failback off
node ws01.testlab.local
node ws02.testlab.local
ping 10.0.0.1
respawn hacluster /usr/lib64/heartbeat/ipfail
apiauth ipfail gid=haclient uid=hacluster

[/etc/ha.d/haresources]
ws01.testlab.local \
        ldirectord::ldirectord.cf \
        LVSSyncDaemonSwap::master \
        IPaddr2::10.0.0.100/24/eth0/10.0.0.255

[/etc/ha.d/ldirectord.cf]
checktimeout=10
checkinterval=2
autoreload=yes
logfile="/var/log/ldirectord.log"
quiescent=no

# Virtual Service for HTTP
virtual=10.0.0.100:80
        fallback=127.0.0.1:80
        real=10.0.0.10:80 gate
        real=10.0.0.20:80 gate
        service=http
        request="alive.html"
        receive="I'm alive!"
        scheduler=wrr
        persistent=1800
        protocol=tcp
        checktype=negotiate

# Virtual Service for HTTPS
virtual=10.0.0.100:443
        fallback=127.0.0.1:443
        real=10.0.0.10:443 gate
        real=10.0.0.20:443 gate
        service=https
        request="alive.html"
        receive="I'm alive!"
        scheduler=wrr
        persistent=1800
        protocol=tcp
        checktype=negotiate
Set the permissions on authkeys
chmod 600 /etc/ha.d/authkeys
Start the httpd server
httpd -k start
Create alive.html in the /var/www/html folder with the following text (this must match the request/receive strings used for the health check in ldirectord.cf)
I'm alive!
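For example, a one-liner (not part of the original steps) that creates the file on each realserver:
echo "I'm alive!" > /var/www/html/alive.html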
Edit the /etc/hosts file to include the FQDN of all of the machines in your LVS (not strictly necessary, but it helps avoid problems)
# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1       localhost.localdomain localhost
10.0.0.10       ws01.testlab.local ws01
10.0.0.20       ws02.testlab.local ws02
::1             localhost6.localdomain6 localhost6
Edit the /etc/sysconfig/network-scripts/ifcfg-lo file to add your virtual IP on a loopback alias (lo:0)
DEVICE=lo
IPADDR=127.0.0.1
NETMASK=255.0.0.0
NETWORK=127.0.0.0
BROADCAST=127.255.255.255
ONBOOT=yes
NAME=loopback

DEVICE=lo:0
IPADDR=10.0.0.100
NETMASK=255.255.255.255
NETWORK=10.0.0.0
BROADCAST=10.0.0.255
ONBOOT=yes
NAME=loopback
Edit the /etc/sysconfig/network-scripts/ifcfg-eth0 file to match this (edit the IP address for each director/real server, change from eth0 to whatever active interface you are using):
[/etc/sysconfig/network-scripts/ifcfg-eth0 on ws01]
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.0.10
NETMASK=255.255.252.0
GATEWAY=10.0.0.1

[/etc/sysconfig/network-scripts/ifcfg-eth0 on ws02]
DEVICE=eth0
ONBOOT=yes
BOOTPROTO=static
IPADDR=10.0.0.20
NETMASK=255.255.252.0
GATEWAY=10.0.0.1
Restart the network
service network restart
Enable packet forwarding and the ARP-hiding settings (arp_ignore/arp_announce) in the /etc/sysctl.conf file
net.ipv4.ip_forward = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.eth0.arp_announce = 2
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
Reload the sysctl.conf settings
/sbin/sysctl -p
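To confirm the settings took effect, you can query them back (a quick check, not part of the original steps):
sysctl net.ipv4.ip_forward net.ipv4.conf.all.arp_ignore net.ipv4.conf.all.arp_announce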
Make sure all services are set to start at system boot (ldirectord is removed from init here because heartbeat starts it via haresources).
chkconfig httpd on && chkconfig --level 2345 heartbeat on && chkconfig --del ldirectord
Stop any running ldirectord and start the heartbeat service
/etc/init.d/ldirectord stop && /etc/init.d/heartbeat start
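Once heartbeat is up on both nodes, a few commands are worth running on the active director to check the result (a suggested sanity check, not from the original howto):
# the virtual services and realservers from ldirectord.cf should be listed
ipvsadm -L -n
# the VIP (10.0.0.100) should appear on eth0 of the active director
ip addr show eth0
# ldirectord logs its health checks here
tail /var/log/ldirectord.log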
Keepalived is written by Alexandre Cassen Alexandre (dot) Cassen (at) free (dot) fr, and is based on vrrpd for director failover. Health checking for realservers is included. It has a lengthy but logical conf file and sets up an LVS for you. Alexandre released code for this in late 2001. There is a keepalived mailing list and Alexandre also monitors the LVS mailing list (May 2004, most of the postings have moved to the keepalived mailing list). The LVS-HOWTO has some information about Keepalived.
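To give a flavour of that conf file, here is a minimal sketch of a virtual_server block for an LVS-DR HTTP service with health checking; the addresses, weights and timeout values are illustrative assumptions, not taken from the text above.
virtual_server 10.0.0.100 80 {
    delay_loop 6          # seconds between health checks
    lb_algo wrr           # weighted round robin
    lb_kind DR            # LVS-DR forwarding
    protocol TCP

    real_server 10.0.0.10 80 {
        weight 1
        HTTP_GET {
            url {
                path /alive.html
                status_code 200
            }
            connect_timeout 3
        }
    }

    real_server 10.0.0.20 80 {
        weight 1
        HTTP_GET {
            url {
                path /alive.html
                status_code 200
            }
            connect_timeout 3
        }
    }
}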
Volker Jaenisch volker (dot) jaenisch (at) inqbus (dot) de 2007-07-04
http://sourceforge.net/projects/ipvsman/
ipvsman is a curses-based GUI for the IPVS loadbalancer, written in Python. ipvsmand is a monitoring daemon that keeps ipvs in the desired load-balancing state, as ldirectord or keepalived do.
Clint Byrum cbyrum (at) spamaps (dot) org 27 Sep 2004
I'd like to set up a two-node Heartbeat/LVS load balancer using Soekris Net4801 machines. These have a 266MHz Geode CPU, 3 Ethernet ports, and 128MB of RAM. The OS (probably LEAF) would live on a CF disk. If these are overkill, I'd also consider a Net4501, which has a 133MHz CPU, 64MB RAM, and 3 Ethernet ports.
I'd need to balance about 300 HTTP requests per second, totaling about 150kB/sec, between two servers. I'm doing this now with the servers themselves (big dual P4 3.02GHz servers with lots and lots of RAM). This is proving problematic, as failover and ARP hiding are just a major pain. I'd rather have a dedicated LVS setup.
1) anybody else doing this?
2) IIRC, using the DR method, CPU usage is not a real problem because reply traffic doesn't go through the LVS boxes, but there is some RAM overhead per connection. How much traffic do you guys think these should be able to handle?
Ratz 28 Sep 2004
The Net4801 machines are horribly slow, but enough for your purpose. The limiting factor on those boxes is almost always the cache size. I've waded through too many data sheets of those Geode derivatives to give you specifics on your particular processor, but I would be surprised if it had more than 16KB of instruction and data cache each.
16k unified cache. :-/
Make sure that your I/O rate is as low as possible, or the first thing to blow is your CF disk. I've worked with hundreds of those little boxes in all shapes, sizes and configurations. The biggest common-mode failures were CF disks due to temperature problems and I/O pressure (MTTF was 23 days); the only other problems were really bad NICs locking up half of the time.
I haven't ever had an actual CF card blow on me. LEAF is made to live on read-only media, so it's not like it will be written to a lot.
Sorry, 'blow' is an exaggeration; I mean they simply fail because the cells only have a limited write endurance.
RO doesn't mean that there's no I/O going to your disk, as you correctly noted. If you plan on using them 24/7, I suggest you monitor the block I/O on your RO partitions using the values from /proc/partitions or the wonderful iostat tool. Then extrapolate from about 4 hours' worth of samples, check your CF vendor's specification for how many writes it can endure, and see how long you can expect the thing to run.
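A rough sketch of that sampling on a 2.6 kernel with /proc/diskstats (the device name and sample window are assumptions to adjust for your box):
# field 10 of /proc/diskstats is the total number of sectors written to the device
DEV=hda                      # the CF device; adjust for your hardware
before=$(awk -v d=$DEV '$3==d {print $10}' /proc/diskstats)
sleep 14400                  # sample 4 hours, as suggested above
after=$(awk -v d=$DEV '$3==d {print $10}' /proc/diskstats)
# sectors are 512 bytes; scale the 4-hour sample to bytes written per day
echo "bytes written per day: $(( (after - before) * 512 * 6 ))"
# compare that with the write endurance figure in your CF vendor's data sheet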
I have to add that thermal issues added to our high failure rates. We wanted to ship those nifty little boxes to every branch of a big customer to build a big VPN network. Unfortunately the customer is in the automobile industry, which means those boxes were put in the strangest places imaginable in garages, sometimes causing major heat build-up. Also, as is usual in this sector of industry, people are used to reliable hardware, so they don't think twice about simply shutting down the power of the whole garage at the end of a working day. Needless to say, this adds to the reduced lifetime of a CF.
I then did a reliability analysis using the MGL (multiple Greek letter, derived from the beta-factor model) model to calculate the average risk in terms of failure*consequence, and we had to refrain from using those nifty little things. The cost of repair (detection of failure -> replacement of product) at a customer site would have exceeded the income our service provided through a mesh of those boxes.
If these are overkill, I'd also consider a Net4501, which has a 133Mhz CPU, 64MB RAM, and 3 ethernet.
I'd go with the former ones, just to be sure ;).
Forgive me for being frank, but it sounds like you wouldn't go with either of them.
I don't know your business case, so it's very difficult to give you a definite answer. I'm only giving you a (somewhat intimidating) experience report; someone else might just as well give you a much better one.
I'd need to balance about 300 HTTP requests per second, totaling about 150kB/sec, between two servers.
So one can assume a typical request to your website is about 512 bytes, which is rather high. But that's not really an issue for LVS-DR.
I didn't clarify that. The 150kB/sec is outgoing. This isn't for all of the website, just the static images/html/css.
I'm doing this now with the servers themselves (big dual P4 3.02 Ghz servers with lots and lots of RAM). This is proving problematic as failover and ARP hiding are just a major pain. I'd rather have a dedicated LVS setup.
I'd have to agree with this.
1) anybody else doing this?
Maybe. Stupid questions: how often did you have to fail over, and how often did it work out of the box?
Maybe once every 2 or 3 months I'd need to do some maintenance and switch to the backup. Every time there was some problem with noarp not coming up or some weird routing issue with the IPs. Complexity bad. :)
So frankly speaking: your HA solution didn't work as expected ;).
2) IIRC, using the DR method, CPU usage is not a real problem because reply traffic doesn't go through the LVS boxes, but there is some RAM overhead per connection. How much traffic do you guys think these should be able to handle?
This is very difficult to say, since these boxes also impose limits through their inefficient PCI buses, their rather broken NICs and the dramatically reduced cache. Also, it would be interesting to know whether you're planning on using persistence in your setup.
Persistency is not a requirement. Note that most of the time a client opens a connection once, and keeps it up as long as they're browsing with keepalives.
Yes, provided most clients use HTTP/1.1. But at the application level you don't need persistence anyway.
But to give you a number to start with, I would say those boxes should be able (given your constraints) to sustain 5Mbit/s of traffic at about 2000pps (~350 bytes/packet) and only consume 30 MByte of your precious RAM when running without persistence. This assumes every packet of your 2000pps is a new client opening a new connection to the LVS, with each entry staying in the connection table for an average of 1 minute.
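As a back-of-the-envelope check on that memory figure (assuming the ~128 bytes per connection entry commonly quoted in the LVS documentation; the allowance for overhead is a guess):
# 2000 new connections/s, each kept for ~60s, ~128 bytes per ip_vs connection entry
echo $(( 2000 * 60 * 128 / 1024 / 1024 )) MB   # ~14 MB; allowing a similar amount again
                                               # for overhead lands near the 30 MB above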
As mentioned previously, your HW configuration is very hard to compare to actual benchmarks, so please take those numbers with a grain of salt.
That's not encouraging. I need something fairly cheap, otherwise I might as well go down the commercial load balancer route.
Well, I have given you numbers which are (on second look) rather low estimates ;). Technically, your system should be able to deliver 25000pps (yes, 25k) at a 50Mbit/s rate. You would then, if every packet were a new client, consume just about all the memory of your system :). So I would place the performance of your machine somewhere in between those two numbers.
Bubba Parker sysadmin (at) citynetwireless (dot) net 27 Sep 2004
In my tests, the Soekris net4501, 4511, and 4521 were all able to route almost 20Mbps at wire speed. I would suspect the 4801 to handle in excess of 50Mbps, but remember: your Soekris board has 3 NICs, and what they don't tell you is that they all share the same interrupt, so performance degradation is exponential with many packets per second.
Ratz 28 Sep 2004
For all Geode-based boards I've received more technical documentation than I was ever prepared to dive into. Most of the time you get a very accurate depiction of your hardware, including the south and north bridges, and there you can see that the interrupt lines are hardwired and require interrupt sharing.
However, this is not a problem, since there aren't a lot of devices on the bus that would occupy it anyway, and if you're really unhappy about the bus speed, use setpci to reduce the latency for the NIC's IRQs.
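For example, checking whether the NICs really share an IRQ and then poking at the PCI latency timer might look like this (the bus address and value are placeholders; take the real address from lspci):
# see which devices share an interrupt line
cat /proc/interrupts
# show the current latency timer of one NIC (bus address is an example)
setpci -s 00:0e.0 latency_timer
# adjust it (value in hex) to change how long the NIC may hold the bus
setpci -s 00:0e.0 latency_timer=40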
Newer kernels have excellent handling for shared IRQs btw.
Did you measure exponential degradation? I know you get a pretty steep performance reduction once you push the pps too high, but I never saw exponential behaviour.
Peter Mueller 2004-09-27
What about not using these Soekris boxes and just using those two beefy servers? e.g. http://www.ultramonkey.org/2.0.1/topologies/ha-overview.html or http://www.ultramonkey.org/2.0.1/topologies/sl-ha-lb-overview.html
Clint Byrum 27 Sep 2004
That's what I'm doing now. The setup works, but its complexity causes issues: bringing up IPs over here, moving them from eth0 to lo over there, running noarpctl on that box. It's all very hard to keep track of. It's much simpler to just have two boxes running LVS and not worry about what's on the servers.
Simple things are generally easier to fix if they break. It took me quite a while to find a simple typo in a script on my current setup, because it was very non-obvious at what layer things were failing.
Malcolm Turnbull Malcolm (dot) Turnbull (at) crocus (dot) co (dot) uk 03 Jun 2003 has released a bootable ISO image of his Loadbalancer.org appliance software. The link was at http://www.loadbalancer.org/modules.php?name=Downloads&d_op=viewdownload&cid=2 but is now dead (Dec 2003). Checking the website (Apr 2004) I find that the code is available as a 30-day demo (http://www.loadbalancer.org/download.html, link dead Feb 2005).
Here's the original blurb from Malcolm
The basic idea is to create an easy-to-use layer 4 switch appliance to compete with Coyote Point Equalizer/Cisco LocalDirector... All my source code is GPL, but the ISO distribution contains files that are non-GPL to protect the work and allow vendors to licence the software. The ISO requires a license before you can legally use it in production.
Burn it to CD and then use it to boot a spare server with a Pentium/Celeron + ATAPI CD + 64MB RAM + 1 or 2 NICs + 20GB HD.
root password is: loadbalancer
ip address is: 10.0.0.21/255.255.0.0
web based login is: loadbalancer
web based password is: loadbalancer

The default setup is DR, so just plug it straight into the same hub as your web servers and have a play. Download the manuals for configuration info.