
How to log client IP from Apache behind Elastic Load Balancer

This is my solution to make Apache HTTP Server log the actual remote client IP address from behind the Amazon Elastic Load Balancer (ELB), with a little help from SetEnvIf. [2 minutes]

The Problem

The problem is that behind an Elastic Load Balancer (ELB), Apache sees the load balancer's IP address as REMOTE_ADDR. However, the ELB also sets the X-Forwarded-For HTTP header to a comma-delimited list of IP addresses, such as client, proxy1, proxy2.
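As a rough illustration of that header format, here is a minimal Python sketch that splits an X-Forwarded-For value into its components. The function name split_forwarded_for and the sample addresses are made up for this example; they are not part of Apache or the ELB.

```python
def split_forwarded_for(header_value):
    """Split an X-Forwarded-For value into a list of addresses.

    The leftmost entry is the client as reported by the first proxy;
    later entries are the proxies the request passed through.
    """
    return [part.strip() for part in header_value.split(",") if part.strip()]


# Hypothetical sample value using documentation-only addresses.
hops = split_forwarded_for("203.0.113.7, 198.51.100.2, 192.0.2.10")
client_ip = hops[0]
print(client_ip)  # 203.0.113.7
```

Only the first entry is of interest for logging, which is what the rest of this post relies on.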

Some suggest replacing %h (REMOTE_ADDR) in the NCSA common log format (%h %l %u %t \"%r\" %>s %O) with the X-Forwarded-For header:

LogFormat "\"%{X-Forwarded-For}i\" %l %u %t \"%r\" %>s %O" xfwd_common

This approach has two problems:

1. Broken log formatting

Comma-separated IP addresses violate the NCSA common and combined log formats and generally break applications that attempt to extract the log fields.

Above I added quotes around X-Forwarded-For to make it easier to extract by regex. Supporting this modified format in Splunk involves adapting the access-extractions transform to use [[qstring:clientip]] (quoted string) instead of [[nspaces:clientip]] (no-spaces string).

2. Missing IP for unproxied requests

Direct or unproxied HTTP requests lack the X-Forwarded-For header, so the clientip is logged as "-". If all clients connect via the load balancer this won't happen, but in practice developers and monitoring agents may want to skip the load balancer.
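Both problems can be reproduced with a small Python sketch. It assumes a simplified regex for the common log format and made-up log lines; real log parsers differ in detail, so treat it as an illustration rather than a reference parser.

```python
import re

# Simplified pattern for an NCSA common log line (an assumption for this demo):
# host ident user [timestamp] "request" status bytes
COMMON_LOG = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_host(line):
    """Return the host field, or None if the line does not fit the format."""
    match = COMMON_LOG.match(line)
    return match.group("host") if match else None


tail = '- - [10/Oct/2012:13:55:36 +0000] "GET / HTTP/1.1" 200 2326'

standard = "203.0.113.7 " + tail
# Problem 1: a quoted, comma-separated X-Forwarded-For list in the first field.
proxied = '"203.0.113.7, 198.51.100.2" ' + tail
# Problem 2: header absent, so Apache writes a dash inside the quotes.
unproxied = '"-" ' + tail

print(parse_host(standard))   # 203.0.113.7
print(parse_host(proxied))    # None
print(parse_host(unproxied))  # "-"
```

The proxied line fails to parse because the space after the comma splits the address list across two fields, and the unproxied line parses but yields no usable address.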

The Solution

My solution keeps the standard log format and still logs unproxied IPs. It uses SetEnvIf to set the client IP from REMOTE_ADDR initially, then overwrites it with the first component of the X-Forwarded-For header if that header is present, meaning the request is proxied:

SetEnvIf REMOTE_ADDR "(.+)" CLIENTIP=$1
SetEnvIf X-Forwarded-For "^([0-9.]+)" CLIENTIP=$1
LogFormat "%{CLIENTIP}e %D %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" trueip_combined

The third line defines the custom trueip_combined log format, which uses CLIENTIP in place of %h and uses %D in place of the never-used ident field (%l) to log request latency in microseconds.
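To make the order of the two SetEnvIf rules easier to follow, here is a minimal Python sketch of the same selection logic. The function resolve_client_ip, the plain dict of headers, and the sample addresses are assumptions for illustration; Apache matches header names case-insensitively, which this sketch does not attempt to model.

```python
import re

# Same pattern as the second SetEnvIf directive: leading digits and dots.
FIRST_IPV4 = re.compile(r"^([0-9.]+)")


def resolve_client_ip(remote_addr, headers):
    """Start from REMOTE_ADDR, then let the first X-Forwarded-For
    component override it when the header is present and matches."""
    client_ip = remote_addr  # SetEnvIf REMOTE_ADDR "(.+)" CLIENTIP=$1
    match = FIRST_IPV4.match(headers.get("X-Forwarded-For", ""))
    if match:  # SetEnvIf X-Forwarded-For "^([0-9.]+)" CLIENTIP=$1
        client_ip = match.group(1)
    return client_ip


# Proxied request: the first X-Forwarded-For entry wins.
print(resolve_client_ip("10.0.0.5", {"X-Forwarded-For": "203.0.113.7, 198.51.100.2"}))
# 203.0.113.7

# Direct request: no header, so REMOTE_ADDR is kept.
print(resolve_client_ip("192.0.2.44", {}))
# 192.0.2.44
```

One caveat the sketch makes visible: because the pattern accepts only digits and dots, an IPv6 entry such as 2001:db8::1 would be cut off at the first colon and logged as 2001.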

P.S. If you're running WordPress on the EC2 instance, check out my how-to for optimizing WordPress on EC2 micro instances.

Comments

  1. Awesome article. Incorporated this into my web server AMI this evening.

  2. Thankyou so much! :)

  3. Thanks for the info. However, I added it to my /etc/apache2/sites-available for this domain as you suggested, and I'm still getting the ELB internal ip.

    I tested printing the $_SERVER["HTTP_X_FORWARDED_FOR"] value on a php page and i do get the forwarded ip. Do you have any suggestions? I'm really scratching my head on this one...

  4. Terry,

    Try adding the values to httpd.conf.

