Rate Limiting With Repose, the Restful Proxy Service Engine



Chad Lung is a software engineer on the Rackspace Cloud Integration team and is the maintainer of Atom Hopper. Be sure to check out his personal blog at http://www.giantflyingsaucer.com/blog/ and follow @chadlung on Twitter.

I recently wrote an article introducing Repose, a sponsored open-source project that is built to scale for the cloud. Repose is used within Rackspace as a key element of our internal OpenStack.

Repose has many features, such as rate limiting, client authentication, translation, API validation, versioning, and logging, with more on the way. Today I want to show you how you can use Repose for your own projects, and in particular I’m going to focus on rate limiting. Since Repose doesn’t care what programming language my particular web service is written in, I’m going to write a very simple Node.js API server and then use Repose to enforce rate limiting on it. At the same time, I will also be using Repose’s HTTP Logging and IP Identity filters.

Note: I will be assuming from here that you are working in a Linux or OS X environment.

Creating the Node.js API Server

Make sure you have Node.js installed and ready to go. On your computer create a new folder called APIDemo and add a new JavaScript file called app.js in it.

The contents of the app.js file are very simple: they return the current date to the user.

app.js:

var express = require('express');
var app = module.exports = express();

// Very minimal API demo that returns
// the current date info
app.get('/api/getdate', function(req, res){
 res.send(new Date());
});

app.listen(8080);
console.log('API demo server listening on port 8080');

To execute this code successfully, you must have Express installed. You can do that from the command line easily by going into the APIDemo folder and running:

$ npm install express

Start the API server by running:

$ node app.js

With a web browser, use the following URL to hit the API and have the date returned back to you: http://localhost:8080/api/getdate

Setting up and Configuring Repose

Repose requires that a few folders exist on your system. Create the following folders:

  * Configuration files are located in: /etc/repose/
  * The EAR file drop location is located at: /usr/share/repose/filters/
  * The standalone location for Repose is at: /usr/share/lib/repose/ (you can also run Repose in a container such as Apache Tomcat)
  * Log files are located at: /var/log/repose/
  * The deployment location (where the EAR file is extracted) is at: /var/repose/

You will also need to ensure that the user account that is running Repose has the necessary access to read and execute on the appropriate folders.

With those folders created, we can move on to gathering the binary artifacts required to run Repose. Alternatively, you can grab the source code from the GitHub repository for Repose and compile the code yourself.

Here are the steps to get the configuration and binary files copied over:

  1. Copy all of the example Repose configuration files from this location into the /etc/repose/ folder.
  2. Copy the IP Identity configuration example file from this location into the /etc/repose/ folder.
  3. Copy the Rate Limiting configuration example file from this location into the /etc/repose/ folder.
  4. Copy the HTTP Logging configuration example file from this location into the /etc/repose/ folder.
  5. Copy the latest EAR filter bundle file from this location and place the filter-bundle-x.x.x.ear file into the /usr/share/repose/filters/ folder.
  6. Copy the latest valve-x.x.x.jar file from this location and place it into the /usr/share/lib/repose/ folder.

Note: I’m using Repose’s IP Identity filter since it’s very easy to use for demonstration purposes. In a real service, you might prefer to use the Repose Client Authentication filter, which supports the OpenStack Identity Service authentication scheme.

Next we will modify the example IP Identity and Rate Limiting configuration files to suit our needs. Let’s start with the IP Identity file. We’ll modify it to accept requests from localhost only. Of course, this is just for demonstration purposes and does not reflect a real world scenario.

ip-identity.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>

<ip-identity  xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
   xmlns='http://docs.api.rackspacecloud.com/repose/ip-identity/v1.0'
   xsi:schemaLocation='http://docs.api.rackspacecloud.com/repose/ip-identity/v1.0'>

   <quality>0.2</quality>

    <white-list quality="1.0">
        <ip-address>127.0.0.1</ip-address>
    </white-list>

</ip-identity>

Save the changes to the file and then edit the Rate Limiting config file next. What we will do here for the demonstration is lock down our API endpoint to only accept 1 HTTP GET request per minute for standard users. This value can be configured easily and any changes you make will get picked up by Repose and reloaded automatically.

rate-limiting.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>

<rate-limiting delegation="false" xmlns="http://docs.rackspacecloud.com/repose/rate-limiting/v1.0">
    <!--
        Defining a limit group.

        The following headers can be found in the class
        com.rackspace.cloud.powerapi.http.PowerApiHeader in the Power API
        Filterlet library, maven group id com.rackspace.cloud.powerapi, artifact
        id filterlet.

        Groups are matched on the HTTP header: X-PP-Groups
        User information is matched on the HTTP header: X-PP-User
    -->
    <limit-group id="standard-ip-limits" groups="IP_Standard">
        <limit uri="/*" uri-regex="/(.*)" http-methods="GET" unit="MINUTE" value="1" />
    </limit-group>

    <limit-group id="standard-ip-limits-superuser" groups="IP_Super">
        <limit uri="/*" uri-regex="/(.*)" http-methods="GET" unit="SECOND" value="5" />
    </limit-group>
</rate-limiting>
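To make the effect of unit="MINUTE" value="1" concrete, here is a small conceptual sketch of a fixed-window limiter in JavaScript. This is not Repose's code, and I am assuming a simple fixed window purely for illustration; Repose's own accounting may differ. The `createLimiter` function and its arguments are hypothetical names that mirror the `unit` and `value` attributes above.

```javascript
// Window length in milliseconds for the two units used in this article.
const UNIT_MS = { SECOND: 1000, MINUTE: 60 * 1000 };

// Hypothetical helper: returns allow(user, nowMs), which reports whether
// a request from `user` at time `nowMs` fits within the limit.
function createLimiter(unit, value) {
  const windowMs = UNIT_MS[unit];
  if (!windowMs) throw new Error('Unsupported unit: ' + unit);
  const windows = new Map(); // user -> { start, count }

  return function allow(user, nowMs) {
    const current = windows.get(user);
    if (!current || nowMs - current.start >= windowMs) {
      // First request, or the previous window has expired: start a new one.
      windows.set(user, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count < value) {
      current.count += 1;
      return true;
    }
    return false; // over the limit for this window
  };
}

const allow = createLimiter('MINUTE', 1);
console.log(allow('127.0.0.1', 0));     // true: first request in the window
console.log(allow('127.0.0.1', 30000)); // false: second request 30 s later
console.log(allow('127.0.0.1', 60000)); // true: a new window has started
```

Each user gets an independent counter, which is why the identity step matters: the limit applies per caller rather than to the API as a whole.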

Save the changes to the file and then edit the following config file next.

http-logging.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>

<http-logging xmlns="http://docs.rackspacecloud.com/repose/http-logging/v1.0">
    <!-- The id attribute is to help the user easily identify the log -->
    <!-- The format includes what will be logged.  The arguments with % are a subset of the apache mod_log_config
         found at http://httpd.apache.org/docs/2.2/mod/mod_log_config.html#formats -->
    <http-log id="my-special-log" format="Response Code Modifiers=%200,201U\tModifier Negation=%!401a\tRemote IP=%a\tLocal IP=%A\tResponse Size(bytes)=%b\tRemote Host=%h\tRequest Method=%m\tServer Port=%p\tQuery String=%q\tTime Request Received=%t\tStatus=%s\tRemote User=%u\tURL Path Requested=%U\n">
        <targets>
            <!-- The actual log file -->
            <file location="/var/log/repose/repose.log"/>
        </targets>
    </http-log>
</http-logging>
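Because the format above writes each entry as tab-separated Label=value pairs, the resulting log is easy to post-process. The following is a minimal sketch of a parser, assuming the \t and \n sequences in the format end up as real tab and newline characters in the log file. The `parseLogLine` name and the sample line are made up for illustration; the sample is not captured from a real Repose log.

```javascript
// Turn one "Label=value<TAB>Label=value..." log line into an object.
function parseLogLine(line) {
  const entry = {};
  const fields = line.replace(/\r?\n$/, '').split('\t');
  for (const field of fields) {
    const eq = field.indexOf('=');
    if (eq === -1) continue; // skip anything that is not Label=value
    // Split on the first "=" only, so values such as query strings
    // may themselves contain "=".
    entry[field.slice(0, eq)] = field.slice(eq + 1);
  }
  return entry;
}

// Hypothetical sample line using a few of the labels from the format above.
const sample =
  'Remote IP=127.0.0.1\tRequest Method=GET\tQuery String=?a=b\tStatus=200\tURL Path Requested=/api/getdate\n';
console.log(parseLogLine(sample));
```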

Save the changes to the file and then edit the following config file next. We will just set a few simple defaults.

container.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>

<repose-container xmlns='http://docs.rackspacecloud.com/repose/container/v2.0'>
    <deployment-config http-port="8888" connection-timeout="30000" read-timeout="30000">
        <deployment-directory auto-clean="false">/var/repose</deployment-directory>

        <artifact-directory check-interval="60000">/usr/share/repose/filters</artifact-directory>

        <logging-configuration href="log4j.properties"/>

    </deployment-config>
</repose-container>

Save the changes to the file. The final file to modify is the following:

system-model.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>

<system-model xmlns="http://docs.rackspacecloud.com/repose/system-model/v2.0">
  <repose-cluster id="repose">
    <nodes>
      <node id="node1" hostname="localhost" http-port="8888"/>
    </nodes>
    <filters>
      <!--
      <filter name="header-id-mapping" />
      -->
      <filter name="ip-identity" />
      <filter name="rate-limiting" />
      <filter name="http-logging" />
      <filter name="default-router"/>
    </filters>
    <destinations>
      <endpoint id="openrepose" protocol="http" hostname="localhost" root-path="/" port="8080" default="true"/>
    </destinations>
  </repose-cluster>
</system-model>

This file enables the filters we want to use and defines the order in which they are called. It also sets the endpoint, which I pointed back to our Node.js API server running on port 8080. Repose will be running on port 8888, but in a real-world environment you would probably use port 80 for Repose.
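Filter order matters because the rate limiting configuration matches on the X-PP-User and X-PP-Groups headers, so something earlier in the chain has to set them. The sketch below is a conceptual model only and is not how Repose is implemented. The function names are hypothetical, and I am assuming the identity step tags each request with the caller's IP and the IP_Standard group.

```javascript
// Conceptual model: a filter returns null to continue, or a response to stop.

// Assumed behavior of an identity step: tag the request for later filters.
function ipIdentity(req) {
  req.headers['x-pp-user'] = req.remoteAddr;
  req.headers['x-pp-groups'] = 'IP_Standard';
  return null;
}

// A deliberately naive limiter: counts requests per user, never resets.
function makeRateLimiter(limit) {
  const counts = new Map();
  return function rateLimiting(req) {
    const user = req.headers['x-pp-user'] || 'unknown';
    const used = (counts.get(user) || 0) + 1;
    counts.set(user, used);
    return used > limit ? { status: 413 } : null;
  };
}

// Run the filters in order; the first one to return a response wins.
function runChain(filters, req, origin) {
  for (const filter of filters) {
    const response = filter(req);
    if (response) return response; // short-circuit, origin is never called
  }
  return origin(req);
}

const chain = [ipIdentity, makeRateLimiter(1)];
const api = () => ({ status: 200 });
const request = () => ({ remoteAddr: '127.0.0.1', headers: {} });
console.log(runChain(chain, request(), api).status); // 200
console.log(runChain(chain, request(), api).status); // 413
```

If the two filters were swapped in this model, the limiter would see no user header and every caller would share the single 'unknown' bucket.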

Note: Repose requires Java, so make sure you have it installed.

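From the /usr/share/lib/repose/ folder, issue this command to make Repose listen on port 8888 (which is configured to proxy to the Node.js API server on port 8080). Substitute the version number of the valve JAR you downloaded: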

$ java -jar valve-2.3.5.jar start -p 8888 -s 8188 -c /etc/repose/

Port 8188 is the port Repose listens on for a shutdown command; in a production environment, make sure to block access to port 8188 from outside networks. The shutdown command can be triggered with a simple HTTP GET to this address: http://localhost:8188/

The final argument tells Repose to look in the /etc/repose/ folder for the configuration files.

Go back to the Node.js app and start it if it’s not already running.

Using a web browser, we will make a request through Repose to the API server to get the date. Try it out: http://localhost:8888/api/getdate

If you hit refresh a second time within a minute, you should get a message back that looks similar to this:

{
    "overLimit" : {
        "code" : 413,
        "message" : "OverLimit Retry...",
        "details" : "Error Details...",
        "retryAfter" : "2012-10-10T21:21:31Z"
    }
}
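A client can use the retryAfter value to decide how long to back off. Here is a minimal sketch, assuming the response body has the shape shown above; the `secondsUntilRetry` helper is a hypothetical name of my own, not part of Repose.

```javascript
// Given the JSON text of an over-limit response, return how many whole
// seconds to wait before retrying (0 if no wait is needed).
// `now` is injectable so the function can be checked with a fixed time.
function secondsUntilRetry(body, now = new Date()) {
  const parsed = JSON.parse(body);
  if (!parsed.overLimit || !parsed.overLimit.retryAfter) return 0;
  const retryAt = new Date(parsed.overLimit.retryAfter);
  const waitMs = retryAt.getTime() - now.getTime();
  return Math.max(0, Math.ceil(waitMs / 1000));
}

const overLimitBody = JSON.stringify({
  overLimit: {
    code: 413,
    message: 'OverLimit Retry...',
    details: 'Error Details...',
    retryAfter: '2012-10-10T21:21:31Z',
  },
});

// 30 seconds before the retry time:
console.log(secondsUntilRetry(overLimitBody, new Date('2012-10-10T21:21:01Z'))); // 30
```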

Now, go ahead and modify the rate limiting file to accept 10 requests per second.

rate-limiting.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>

<rate-limiting delegation="false" xmlns="http://docs.rackspacecloud.com/repose/rate-limiting/v1.0">
    <!--
        Defining a limit group.

        The following headers can be found in the class
        com.rackspace.cloud.powerapi.http.PowerApiHeader in the Power API
        Filterlet library, maven group id com.rackspace.cloud.powerapi, artifact
        id filterlet.

        Groups are matched on the HTTP header: X-PP-Groups
        User information is matched on the HTTP header: X-PP-User
    -->
    <limit-group id="standard-ip-limits" groups="IP_Standard">
        <limit uri="/*" uri-regex="/(.*)" http-methods="GET" unit="SECOND" value="10" />
    </limit-group>

    <limit-group id="standard-ip-limits-superuser" groups="IP_Super">
        <limit uri="/*" uri-regex="/(.*)" http-methods="GET" unit="SECOND" value="5" />
    </limit-group>
</rate-limiting>

You should be able to hit the API 10 times per second now.

When you are finished with your experiment, shut Repose down by hitting this URL: http://localhost:8188/

Rate limiting is only one small piece of what Repose can do. To learn more about Repose, start with the Open Repose website, which links to the source code on GitHub. It is also the right place to find our documentation, including an FAQ and wiki; the wiki has the most current information. If you have ideas about how Repose can grow to suit your needs, you are welcome to contribute back to this project.

Repose is available as open source under the Apache License version 2.0.
