16 February 2012

Using TMG as a hardware load balancer for web-based apps

There is always a debate about what constitutes a hardware firewall and what does not. I have found a similar contradiction when it comes to load balancing. Since you are actually after the service being provided, and both use a combination of hardware and software, the distinction does not really matter.

The deployment referred to here is a TMG array dedicated to internal traffic, acting as a load balancer. It is connected only to the Internal Network.

In most cases I have seen, load balancing is used mainly for fault tolerance.
There are of course a few ways to go about load balancing with this goal in mind. I am going to rate them based on the following criteria:

Ease of adding or removing hosts from the NLB array (Add/Remove)
As load increases or additional functionality is required, the number of array members might increase or decrease.

Ability to stop traffic for a host for maintenance
This pertains to the ability to temporarily stop all traffic to an array member for a known period of planned downtime.

Validation
This process checks whether the service you are load balancing is actually active on the array members, to ensure traffic is not routed to an unresponsive member.

Networking requirements
WNLB has implications at the physical switch level since it involves manipulating MAC addresses. In a static environment this can be managed easily, but adding dynamic virtualization into the mix makes it a nightmare.

Virtual Implications
Same as the networking requirements, but there is also another layer of virtual switching that needs to be catered for.

Fault tolerance
This is rated based on the actual effectiveness of the NLB as a means of improving fault tolerance.

DNS Round Robin (RR)  
This relies on multiple host entries in DNS that point to different IP addresses. The theory is that as DNS requests are answered, the load is sent to alternating servers. There are a few drawbacks to this though.

Adding / Removing hosts
Since load balancing relies entirely on DNS lookups, everything happens here. Adding a host is as simple as adding a host entry for the additional IP. Removing a host is the same in reverse. The catch is that a client's DNS cache will keep pointing to the old IP until the cached entry expires.
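To make the caching catch concrete, here is a minimal Python sketch of round robin DNS as seen by two clients. It is a toy simulation, not a real resolver: the class names, the IP addresses and the 300 second TTL are all assumptions made for illustration.

```python
class RoundRobinDns:
    """Toy DNS server that rotates the order of its host entries per query."""

    def __init__(self, addresses, ttl_seconds):
        self.addresses = list(addresses)
        self.ttl_seconds = ttl_seconds
        self._offset = 0

    def resolve(self):
        # Return all entries, starting one position later on each query.
        n = len(self.addresses)
        rotated = [self.addresses[(self._offset + i) % n] for i in range(n)]
        self._offset = (self._offset + 1) % n
        return rotated

    def remove(self, address):
        # Removing a host is just deleting its host entry.
        self.addresses.remove(address)
        self._offset = 0


class CachingClient:
    """Toy client that keeps using its cached answer until the TTL expires."""

    def __init__(self, dns):
        self.dns = dns
        self._cached = None
        self._expires_at = 0.0

    def lookup(self, now):
        if self._cached is None or now >= self._expires_at:
            self._cached = self.dns.resolve()[0]
            self._expires_at = now + self.dns.ttl_seconds
        return self._cached


# Hypothetical addresses and TTL, for illustration only.
dns = RoundRobinDns(["10.0.0.11", "10.0.0.12"], ttl_seconds=300)
client_a = CachingClient(dns)
client_b = CachingClient(dns)

first_a = client_a.lookup(now=0)    # first entry in the rotation
first_b = client_b.lookup(now=0)    # next entry in the rotation

dns.remove("10.0.0.11")             # host taken out of DNS
stale_a = client_a.lookup(now=60)   # cache still points at the removed host
fresh_a = client_a.lookup(now=301)  # TTL expired, client re-resolves
```

In this sketch client A keeps sending traffic to the removed host for the rest of the TTL, which is the behaviour that makes planned maintenance awkward with this method.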

Ability to stop for maintenance
Since there is no way to instantly update client DNS, you would have to wait until all traffic dies down for a host before you could work on it.

Validation
There is no validation method.

Network requirements
One really good thing is that there are no network requirements. No switch configuration at all.

Virtual Machine Implications

Fault tolerance

Windows Network Load balancing (WNLB) 
The traditional Microsoft approach to load balancing is to use WNLB. It is a feature that is installed on the host array members. The members then collectively provide a virtual IP (VIP) that clients can be directed to. WNLB can be deployed in Unicast or Multicast mode. Unicast is simple to configure at switch level but results in switch flooding, so one should isolate the NLB in its own VLAN. Multicast requires static CAM and ARP entries to be created at switch level, pointing to the physical ports on the switch.
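When planning those static entries, it helps to know which cluster MAC address the switch will see. The following is a minimal Python sketch that derives it from the VIP. The prefixes used here (02-BF for Unicast, 03-BF for Multicast, 01-00-5E-7F for IGMP Multicast) are the ones WNLB is commonly documented to use; treat them as an assumption and confirm the real value in the cluster properties. The function name and the sample VIP are hypothetical.

```python
import ipaddress


def wnlb_cluster_mac(vip, mode):
    """Derive the expected WNLB cluster MAC address from the virtual IP.

    Assumed prefixes: unicast 02-BF, multicast 03-BF, IGMP 01-00-5E-7F.
    """
    octets = ipaddress.IPv4Address(vip).packed
    if mode == "unicast":
        raw = bytes([0x02, 0xBF]) + octets
    elif mode == "multicast":
        raw = bytes([0x03, 0xBF]) + octets
    elif mode == "igmp":
        # IGMP multicast only carries the last two octets of the VIP.
        raw = bytes([0x01, 0x00, 0x5E, 0x7F]) + octets[2:]
    else:
        raise ValueError(f"unknown WNLB mode: {mode}")
    return "-".join(f"{b:02X}" for b in raw)


# Hypothetical VIP, for illustration only.
unicast_mac = wnlb_cluster_mac("192.168.1.50", "unicast")
multicast_mac = wnlb_cluster_mac("192.168.1.50", "multicast")
igmp_mac = wnlb_cluster_mac("192.168.1.50", "igmp")
```

For the sample VIP this gives 02-BF-C0-A8-01-32 in Unicast mode and 03-BF-C0-A8-01-32 in Multicast mode, which is the address the static CAM and ARP entries would need to reference.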

Adding / Removing hosts
The host needs to have the WNLB feature added, and then the host needs to join the WNLB cluster. The physical network configuration has to be updated to cater for the new host. You can have a maximum of 32 hosts in a WNLB cluster.

Ability to stop for maintenance
Using the WNLB console, drainstop the node and then leave it disconnected from the NLB cluster.

Validation
The NLB can only converge if everything is set up and configured correctly. If there is an issue with a particular host, it will not join the NLB. Validation is only at host level though, and not at application level.

Network requirements
Because of the way both Unicast and Multicast work, it is important that the network configuration is done correctly. This really slows down the ability to dynamically add and remove hosts from the NLB array.

Virtual Machine Implications 
All of the network requirements still apply, plus some additional overriding settings that have to be configured on both VMware and Hyper-V. One of the biggest problems is that when you start migrating VMs from host to host, the network configuration has to cater for this. So it is only suitable for static VM implementations.

Fault tolerance

TMG Farm Publishing 
As a reverse proxy, TMG also works by providing a VIP for the applications you publish. In an array configuration, TMG itself is load balanced natively using its own flavor of WNLB. The advantage here is that TMG is generally deployed on physical hardware. The resulting WNLB network requirements now have to be configured correctly only once, despite load balancing numerous apps. This is similar to deploying a hardware device such as an F5 BigIP, where network configuration is equally important.  Have a

Adding / Removing hosts
TMG farm publishing requires a server farm network object to be created. Additional servers can be added and removed easily.

Ability to stop for maintenance
By controlling the server farm properties you can drain or resume individual servers.

Validation
TMG checks the hosts in the farm for a valid response. If the verification check fails for one server, no additional traffic will be routed there. By performing an HTTP verification you can, for instance, prevent traffic from being routed if the server is up but the IIS site is down.
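The idea behind an HTTP verification can be sketched in a few lines of Python. This is not TMG's implementation, only an illustration of an application-level check: the function names are hypothetical, and treating any 2xx or 3xx status as a valid response is an assumption.

```python
import urllib.error
import urllib.request


def is_valid_response(status):
    """Assumption for this sketch: any 2xx or 3xx status counts as valid."""
    return 200 <= status < 400


def verify_server(url, timeout=2.0):
    """Return True only if the web application itself answers with a valid status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_valid_response(response.status)
    except urllib.error.HTTPError as error:
        # The server is up, but the site returned an error status (e.g. 503).
        return is_valid_response(error.code)
    except OSError:
        # Connection refused, timeout, DNS failure: the server is unreachable.
        return False
```

A call such as verify_server("http://web01.internal.example/") (a hypothetical farm member) returns False both when the machine is unreachable and when it responds to the network but the site answers with an error, which is the case a ping-style check would miss.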

Network requirements
Other than TMG's own WNLB, there are no specific network requirements.

Virtual Machine Implications 
Since TMG just addresses the individual hosts, there are no specific network requirements, so if VMs move between hosts or from switch to switch there is no impact.

Since the farm can dynamically grow or shrink, TMG can also be configured with all the potential hosts in a farm. The failed validation for the offline hosts will prevent traffic from being routed to them until they are online and the application is responsive.
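The combined effect of draining and validation on a pre-populated farm can be sketched as follows. This is a simplified Python model, not how TMG is implemented: plain round robin selection is an assumption (TMG's own session affinity is not modelled), and the class names and host names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FarmMember:
    address: str
    draining: bool = False   # set by the administrator for maintenance
    verified: bool = True    # result of the last validation check


class ServerFarm:
    """Toy farm that only hands out members that are verified and not draining."""

    def __init__(self, members):
        self.members = list(members)
        self._next = 0

    def pick(self):
        n = len(self.members)
        for step in range(n):
            candidate = self.members[(self._next + step) % n]
            if candidate.verified and not candidate.draining:
                self._next = (self._next + step + 1) % n
                return candidate.address
        return None  # no member is currently eligible


# web03 is configured in advance but not online yet, so validation fails.
farm = ServerFarm([
    FarmMember("web01"),
    FarmMember("web02"),
    FarmMember("web03", verified=False),
])

before = [farm.pick() for _ in range(4)]   # only web01 and web02 get traffic

farm.members[0].draining = True            # drain web01 for maintenance
during = [farm.pick() for _ in range(2)]   # everything goes to web02

farm.members[2].verified = True            # web03 comes online and validates
after = [farm.pick() for _ in range(2)]    # web03 starts receiving traffic
```

Nothing in the farm definition changes when web03 comes online; it starts receiving requests as soon as its validation succeeds.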

Fault tolerance

Added features:
Since we are using TMG, we also get the same benefits as when publishing a site to the outside world. This includes the ability to pre-authenticate requests, perform SSL offloading, apply name and path restrictions, etc.
Since all traffic is logged you can also analyse it with products such as Webspy and Fastvue.

TMG can only be used as a load balancer for HTTP and HTTPS traffic.
The effective WNLB throughput limit is about 500 Mbps, although higher is possible.

Using a TMG array, you can have a very effective and cost-efficient load balancer for web traffic. If you already have the skill level to manage a TMG environment, this is a good starting point. (Not really sure if the ISA or TMG teams ever intended for it to be used like this - but it works really well...)

If, however, you have multi-Gbps performance requirements or need more advanced load balancing on other protocols, you will need to look at devices like the F5 BigIP.
