Monday, March 16, 2009

Clustered Reverse Proxy with Fedora

I was given the enviable task of setting up a reverse proxy in Fedora.

A reverse proxy is software installed on a machine that has network access to both an external and an internal network, where it acts as a bridge between the two. A normal (forward) proxy, as installed in most company networks, lets users on the internal network reach the external network. A reverse proxy does the opposite: it lets clients on the external network reach services hosted on the internal network. It sits at the front end of the network and proxies only the specified traffic through to the internal network.

Here is a diagram of my lab network. In this example we can see:
  • Multiple clients are connecting to the proxy cluster (more on clustering in the next article).
  • Only one node in the cluster has possession of the shared IP address.
  • The Apache web service on each proxy is configured to listen on the shared IP address.
  • The host web server behind the resource zone is serving traffic to the active node.
  • All servers shown in this lab were built from the Fedora 10 Installation DVD. Required dependencies were installed using YUM, so it helps if the machines are built while connected to a network with internet access.

Configuring the Reverse Proxy Server

Networking:
ETH0:
  • DHCP or Statically assigned.
  • Must be on the same network as the clients, or reachable by them.
  • In VMware use "Bridged" so you can access it from your host.
ETH1:
  • Statically assigned.
  • Crossover cable (in VMware use "Host Only").
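
As a rough sketch of the static assignment for ETH1, the helper below writes a Fedora-style ifcfg file. The function name and the address used in the usage note are placeholders of mine, not values from this lab, so substitute your own.

```shell
# Write a static ifcfg file for one interface.
# Usage: write_static_ifcfg <file> <device> <ip> <netmask>
# (hypothetical helper; the values you pass in are your own)
write_static_ifcfg() {
    cat > "$1" <<EOF
DEVICE=$2
BOOTPROTO=none
IPADDR=$3
NETMASK=$4
ONBOOT=yes
EOF
}
```

For example, as root: write_static_ifcfg /etc/sysconfig/network-scripts/ifcfg-eth1 eth1 192.168.0.10 255.255.255.0, followed by "service network restart". The 192.168.0.10 address is only an assumed example.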

Software:
  • Install the base system and the Apache web server (httpd). In Fedora 10 mod_proxy is already included and ready to be enabled.
  • Install gcc ( yum install gcc ). It is required to compile the mod_proxy_html module.
  • Follow the procedure in LISTING 2 to get "mod_proxy_html" installed. Mod_proxy_html is required for rewriting URL links in the web pages served from behind the resource firewall. All links will need to point to the shared IP or hostname.
  • Edit /etc/httpd/conf/httpd.conf. After all the LoadModule directives, ensure that the two lines in LISTING 1 appear.

LISTING 1:
(/etc/httpd/conf/httpd.conf )
LoadFile /usr/lib/libxml2.so
LoadModule proxy_html_module modules/mod_proxy_html.so
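
The LoadFile path above is the 32-bit location. On a 64-bit Fedora install the libraries live under /usr/lib64 instead, so it may be worth checking where libxml2.so actually is before restarting httpd. A small sketch (the function name is mine):

```shell
# Print a LoadFile line for the first directory that contains libxml2.so.
# Usage: libxml2_loadfile_line <dir> [<dir> ...]
# (hypothetical helper; it only looks in the directories you pass in)
libxml2_loadfile_line() {
    for dir in "$@"; do
        if [ -e "$dir/libxml2.so" ]; then
            echo "LoadFile $dir/libxml2.so"
            return 0
        fi
    done
    echo "libxml2.so not found in: $*" >&2
    return 1
}
```

A typical call would be: libxml2_loadfile_line /usr/lib64 /usr/lib. Note that the unversioned libxml2.so file is normally provided by the libxml2-devel package.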

LISTING 2:
wget "http://apache.webthing.com/mod_proxy_html/mod_proxy_html.c"
yum install httpd-devel libxml2 libxml2-devel
apxs -c -a -I /usr/include/libxml2 -i mod_proxy_html.c

The apxs command above will insert the "LoadModule" directive for proxy_html_module into httpd.conf. I needed to edit the path to read "modules/mod_proxy_html.so".
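
If you would rather script that path edit than do it by hand, something like the following should work. It is only a sketch: the function name is made up, and it assumes the directive sits on a single line starting with LoadModule.

```shell
# Point the proxy_html_module LoadModule line at modules/mod_proxy_html.so,
# whatever path apxs wrote there. Other lines are left untouched.
# Usage: fix_proxy_html_path <path-to-httpd.conf>   (hypothetical helper)
fix_proxy_html_path() {
    fix_tmp=$(mktemp) || return 1
    sed 's|^\(LoadModule[[:space:]][[:space:]]*proxy_html_module[[:space:]][[:space:]]*\).*$|\1modules/mod_proxy_html.so|' \
        "$1" > "$fix_tmp" && cat "$fix_tmp" > "$1"
    fix_status=$?
    rm -f "$fix_tmp"
    return $fix_status
}
```

Run it as root against /etc/httpd/conf/httpd.conf (take a backup copy first). Afterwards "httpd -t" checks the configuration syntax, and "httpd -M" lists the loaded modules, so proxy_html_module should appear there once httpd can load it.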

Restart httpd on your proxy servers after installing mod_proxy_html with:
# service httpd restart
Create a reverse proxy config file in /etc/httpd/conf.d/reverse_proxy.conf.

Here is my example (read up on the reasoning here: http://www.apachetutor.org/admin/reverseproxies):
ProxyRequests off
ProxyPass /test/ http://192.168.0.200/
ProxyHTMLURLMap http://192.168.0.200 /test

<Location /test/>
ProxyPassReverse /
SetOutputFilter proxy-html
ProxyHTMLURLMap / /test/
ProxyHTMLURLMap /test /test
RequestHeader unset Accept-Encoding
</Location>

RewriteEngine on
RewriteRule ^/test$ test/ [R]


The rewrite rule at the end handles URLs that are missing the trailing slash on the directory. A request for /test is redirected so that the trailing slash is added automatically.
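
To see which request paths that pattern catches, here is a quick check using grep with the same regular expression. This only exercises the pattern ^/test$ and is not mod_rewrite itself; the function name is mine.

```shell
# Return success when the path matches the RewriteRule pattern ^/test$,
# i.e. the request is exactly "/test" with no trailing slash.
needs_trailing_slash() {
    printf '%s\n' "$1" | grep -Eq '^/test$'
}

# Try a few sample paths against the pattern.
for path in /test /test/ /test/page2.html /testing; do
    if needs_trailing_slash "$path"; then
        echo "$path -> matched, the rule would redirect it"
    else
        echo "$path -> not matched, left alone by this rule"
    fi
done
```

Only the bare /test matches, so requests that already carry the slash, or that go deeper into the directory, pass straight through to the ProxyPass mapping.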

Restart httpd with:
# service httpd restart
Testing:
Create two HTML pages on the internal web server. One should link to the other using the IP address of the internal httpd server, so we can see how the HTML links are rewritten when the page is fetched through the proxy.
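
A minimal way to generate the two pages is sketched below. The function name, the page names (index.html and page2.html) and the page contents are my own choices for illustration.

```shell
# Create two test pages that link to each other by absolute URL.
# Usage: make_test_pages <docroot> <internal-ip>   (hypothetical helper)
make_test_pages() {
    cat > "$1/index.html" <<EOF
<html><body>
<h1>Page 1</h1>
<a href="http://$2/page2.html">Go to page 2</a>
</body></html>
EOF
    cat > "$1/page2.html" <<EOF
<html><body>
<h1>Page 2</h1>
<a href="http://$2/index.html">Back to page 1</a>
</body></html>
EOF
}
```

On the internal server, run it as: make_test_pages /var/www/html 192.168.0.200 (assuming the default Fedora DocumentRoot). Then load /test/ through the proxy and view the page source. If the rewriting is working, the link should read /test/page2.html rather than http://192.168.0.200/page2.html.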

My next instalment will cover how to create a simple auto-failover cluster using heartbeat from the Linux HA Project.