Device Detection

by Bruce Szalwinski

Background

The good folks who power Apache Mobile Filter (AMF) use version 2 of the Device Repository from 51Degrees and currently have no plans to update their software to use version 3. Since we currently use the AMF handler to do device detection, and since 51Degrees has announced the end of life for version 2, this gives us an opportunity to write our own handler. We attempted this back in the version 2 days, but 51Degrees offered no Perl API and the C code was pretty shaky. With version 3, the 51Degrees folks now offer a Perl API that wraps a much more robust C API. With that, the stage is set to tackle writing our own Apache handler. I’ll use the Apache::Test module to help drive the development. This article from last decade was very helpful in learning how to use this powerful module. Full source code is available at DeviceDetection.

Requirements

Analyze web traffic by a user-specified set of device properties.

Implementation

An Apache handler lets us customize the default behavior of the web server. We will write a handler that reads the user agent from the request, detects the device associated with that user agent, creates an environment variable for each requested device property and writes the values to a log file. Let’s get started.

Write tests first

To test our handler, we’ll send requests to an Apache server, passing in various user agent strings and validating that we get back the expected device ID values.

use strict;
use warnings FATAL => 'all';

use Apache::TestTrace;
use Apache::Test qw(plan ok have_lwp);
use Apache::TestRequest qw(GET);
use Apache::TestUtil qw(t_cmp);
use Apache2::Const qw(HTTP_OK);

use JSON;

plan tests => 6, have_lwp;

detect_device('','15364-5690-17190-18092');
detect_device('unknown', '15364-5690-17190-18092');
detect_device(
 "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36",
  "15364-18110-25377-18092");

sub detect_device {
  my ($user_agent, $device_id) = @_;

  Apache::TestRequest::user_agent(
    reset => 1,
    agent => $user_agent
  );

  my $response = GET '/cgi-bin/index.cgi';
  my $json = decode_json $response->content;

  debug "response", $response;

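  # Two checks per request: the ID is present, and it is the expected one.
  # The second check guards against undef because warnings are fatal here.
  ok defined($json->{_51D_ID});
  ok defined($json->{_51D_ID}) && $json->{_51D_ID} eq $device_id;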
}

Great, we have a unit test, and it fails miserably because we don’t have an Apache server. We need an Apache server that we can start, stop and configure for every test run. Conveniently, the Apache::Test module puts a “whole, pristine and isolated” Apache server at our disposal. Cool, we have a server. Next, the handler will push device properties into the environment via the subprocess environment table, so we need a way to capture those values. Line 28 of the test shows the call to a CGI script. The CGI script simply grabs the variables the handler pushed into the environment and returns them to the test as JSON.


#!/usr/bin/perl

use CGI qw(:standard -no_xhtml -debug);
use JSON;

print header('application/json');

my %properties;

while ( my ($key, $value) = each(%ENV)) {
  if ( $key =~ /^_51D/) {
    $properties{$key} = $value;
  }
}

print encode_json \%properties;
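
The prefix filter in that script is easy to try without Apache. Here is a minimal sketch that pulls the same logic into a function and runs it against a made-up environment hash. The function name prefixed_vars and the sample values are hypothetical, chosen only for illustration, and JSON::PP is used because it ships with Perl.

```perl
use strict;
use warnings;
use JSON::PP;

# Return a hash ref holding only the entries whose key starts with $prefix.
# (prefixed_vars is a hypothetical helper, not part of the project.)
sub prefixed_vars {
    my ($prefix, $env) = @_;
    my @keys = grep { /^\Q$prefix\E/ } keys %$env;
    my %found;
    @found{@keys} = @{$env}{@keys};
    return \%found;
}

# Made-up environment, standing in for %ENV under Apache.
my %fake_env = (
    _51D_ID         => '15364-18110-25377-18092',
    HTTP_USER_AGENT => 'Mozilla/5.0',
    PATH            => '/usr/bin',
);

my $props = prefixed_vars('_51D', \%fake_env);
print JSON::PP->new->canonical->encode($props), "\n";
```

Only the _51D_ID entry survives the filter, which is the shape of the JSON the test decodes.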

OK, so we have a failing test, a web server, and a way of communicating between the two. We have a little bit of wiring to do to let the server know about our CGI script as well as our handler. By convention, Apache::Test will look for a file called t/conf/extra.conf.in. This file contains configuration directives that will be added to httpd.conf before starting the server. We’ll take this opportunity to configure the execution of our index.cgi test harness, configure our log format and set up our handler.


PerlSwitches -w

ScriptAlias /cgi-bin @ServerRoot@/cgi-bin
<Location /cgi-bin>
  SetHandler cgi-script
  Options +ExecCGI +Includes
</Location>

LogFormat "%{_51D_ID}e|%{User-Agent}i" combined

PerlTransHandler +CDK::51DegreesFilter
PerlSetEnv DeviceRepository @ServerRoot@/data/51Degrees-Lite.dat
PerlSetEnv DevicePropertyList ScreenPixelsHeight,BatteryCapacity
PerlSetEnv DevicePrefix _51D
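
The LogFormat above only logs _51D_ID, but the same %{NAME}e syntax works for any variable the handler exports. As a rough sketch, assuming the handler names each variable by joining the prefix and the property name and upper-casing the result, the following builds the variable names and a matching pipe-delimited format string from the two settings. The helpers env_names and log_format and the format nickname "devices" are hypothetical.

```perl
use strict;
use warnings;

# Turn a comma-separated property list into environment variable names,
# assuming the PREFIX_NAME, upper-cased, naming convention.
sub env_names {
    my ($prefix, $property_list) = @_;
    my @names = grep { length } split /\s*,\s*/, $property_list;
    return map { uc("${prefix}_$_") } @names;
}

# Build a pipe-delimited LogFormat string: one %{NAME}e field per property,
# followed by the User-Agent request header.
sub log_format {
    my ($prefix, $property_list) = @_;
    my @fields;
    push @fields, '%{' . $_ . '}e' for env_names($prefix, $property_list);
    return join '|', @fields, '%{User-Agent}i';
}

# "devices" is a made-up nickname for the generated format.
print 'LogFormat "',
    log_format('_51D', 'ScreenPixelsHeight,BatteryCapacity'),
    qq{" devices\n};
```

For the property list in the configuration above, this prints a directive that logs _51D_SCREENPIXELSHEIGHT and _51D_BATTERYCAPACITY ahead of the user agent.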

Man, when is this guy ever going to get around to writing some code? Almost there. The Apache::TestRunPerl and Apache::TestMM modules combine to provide everything needed to start, configure and stop Apache, as well as run all of the individual unit tests. These get added to our Build.PL script. The test action normally just executes tests. We need to subclass Module::Build and override this action so that we can start the server before the tests execute and stop it when they complete. It would also be nice to produce JUnit-style output of the test results so that they can be published by the build server.


use Module::Build;
use ModPerl::MM ();
use Apache::TestMM qw(test clean);
use Apache::TestRunPerl ();
use IO::File;

my $class = Module::Build->subclass(
    class => 'CDK::Builder',
    code => q{
	sub ACTION_test {
	    my $self = shift;
	    $self->do_system('t/TEST -start-httpd');
	    $self->SUPER::ACTION_test();
	    $self->do_system('t/TEST -stop-httpd');
	}
    }
);

my $build = $class->new (
  module_name => 'CDK::51DegreesFilter',
  license => 'perl',
  test_file_exts => [qw(.t)],
  use_tap_harness => 1,
  tap_harness_args => {
    sources => {
      File => {
        extensions => ['.tap', '.txt'],
      },
    },
    formatter_class => 'TAP::Formatter::JUnit',
  },
  build_requires => {
      'Module::Build' => '0.30',
      'TAP::Harness'  => '3.18',
  },
  test_requires => {
      'Apache::Test' => 0,
  },
  requires => {
      'mod_perl2' => 0,
      'FiftyOneDegrees::PatternV3' => 0,
      'JSON' => 0,
      'Apache2::Filter' => 0,
      'Apache2::RequestRec' => 0,
      'Apache2::RequestUtil' => 0,
      'Apache2::Log' => 0,
      'Apache2::Const' => 0,
      'APR::Table' => 0
  }
);

Apache::TestMM::filter_args();
Apache::TestRunPerl->generate_script();

$build->create_build_script;

 

Handler

Finally. At this point, writing the handler is pretty anticlimactic. It reads the user agent from the request headers and passes it to the getMatch function from 51Degrees. A set of device properties is returned as a JSON object. Each requested property, defined by DevicePropertyList, is added to the environment via subprocess_env(). The AMF handler used a caching mechanism to avoid detection costs for previously seen user agents. The 51D folks said the new version was faster, so I wouldn’t need it. Performance testing will prove this out.


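# $dataset and $prefix are not defined in this excerpt; they are presumably
# initialized elsewhere in the module (see the full source).
sub handler {
  my $f = shift;  # the request object for this request

  # Fall back to an empty string when no User-Agent header is sent.
  my $user_agent = $f->headers_in->{'User-Agent'} || '';
  my $json = FiftyOneDegrees::PatternV3::getMatch($dataset, $user_agent);
  my %properties = %{ decode_json($json) };

  # Export each property as PREFIX_NAME, upper-cased, e.g. _51D_ID.
  while ( my ($key, $value) = each(%properties) ) {
    my $dkey = uc("${prefix}_${key}");
    $f->subprocess_env($dkey => $value);
  }

  # DECLINED lets Apache carry on with its normal request processing.
  return Apache2::Const::DECLINED;
}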
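
For comparison, here is roughly what a per-user-agent cache looks like. This is a minimal sketch of the general idea only, not AMF's actual implementation: make_cached_detector is a hypothetical helper, and the detection function is a stand-in that just counts how often it is called.

```perl
use strict;
use warnings;

# Wrap a detection function so each distinct user agent is detected once.
# (make_cached_detector is a hypothetical helper for illustration.)
sub make_cached_detector {
    my ($detect) = @_;
    my %cache;
    return sub {
        my ($user_agent) = @_;
        $cache{$user_agent} = $detect->($user_agent)
            unless exists $cache{$user_agent};
        return $cache{$user_agent};
    };
}

# Stand-in for the real detection call; it only counts invocations.
my $calls = 0;
my $slow_detect = sub {
    my ($user_agent) = @_;
    $calls++;
    return "device-for-$user_agent";
};

my $detect = make_cached_detector($slow_detect);
$detect->('Mozilla/5.0') for 1 .. 3;
print "detection ran $calls time(s)\n";   # prints "detection ran 1 time(s)"
```

A cache like this grows without bound and, in a multi-process server, each process would hold its own copy, so skipping it when detection is already fast keeps the handler simpler.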

 

Performance

To test performance, I set up JMeter with 5 threads on a sandbox machine and looped over a set of 350K unique user agents. The JMeter instance made requests to Apache running on a second sandbox machine with the new handler installed. With 2,428,264 requests under its belt, the average response time was 10ms. For v2, with caching, the average response time was 16ms.
