Recently I’ve been looking at setting up a POC environment for a solution involving streaming media. I’ve got some streaming media servers that deliver content over RTMP and some degree of infrastructure cleverness that claims to give improved performance. So how do I test that?

Well, I need the capability of submitting requests for content and evaluating the quality of service as I tweak the infrastructure. Features along these lines:

  • Simulating particular access patterns, for example a large number of users all requesting some popular content.
  • Defining extended test runs: some parts of the infrastructure take a while to “warm up”, and performance measures are best taken over extended periods of time.
  • Some way of determining KPIs such as the time taken to start streaming or the amount of “stutter” experienced.

Also I want efficient use of test client resources. I may be simulating tens or hundreds of users, so I just need to retrieve the stream of content; I don’t need to have it rendered, so there is no need for video graphics.

Now there are quite a few clients able to do these kinds of things. I chose Flazr, which is an open-source Java application. In this article I am going to:

  1. Describe some simple uses of Flazr.
  2. Explain a problem I hit and give the code for the fix I developed.
  3. Show an extension I developed, which enables Flazr to be aware of some load-balancing capabilities in my infrastructure. This exploits a very small subset of SMIL.

Testing with Flazr

Initially I imported the Flazr 0.7 source into my Rational Software Developer, Eclipse-based development environment.

I then added the libraries delivered with Flazr to my classpath.

I can then run the RTMP client.

Stream Content, Get Metrics

The simplest case is just to specify the URL for the stream to be played:

       rtmp://myhost/myapp/mycontent

I won’t describe my streaming media server here; there are many possible products you can use for that purpose.

This streams the content and displays some useful metrics:

first media packet for channel: [0 AUDIO c6 #1 t0 (0) s0], after 219ms

and

finished in 26 seconds, media duration: 11 seconds

From this I have a measure of the responsiveness of my server. I also note that although the media duration was only 11s, it took 26s to stream it, so there is a lot of stutter. And in fact if I stream this content through a conventional viewer there is indeed quite a bit of stutter.
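When running many tests it helps to pull these two numbers out of the log automatically. The following is a minimal sketch, assuming the log lines always have exactly the shape of the two examples quoted above; the class and method names are my own invention and are not part of Flazr.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Flazr: extracts KPIs from log lines
// shaped like the two examples quoted in this article.
class FlazrLogMetrics {
    private static final Pattern FIRST_PACKET =
            Pattern.compile("first media packet for channel: .*, after (\\d+)ms");
    private static final Pattern FINISHED =
            Pattern.compile("finished in (\\d+) seconds, media duration: (\\d+) seconds");

    /** Time to the first media packet in ms, or -1 if the line does not match. */
    static long startupMillis(String line) {
        Matcher m = FIRST_PACKET.matcher(line);
        return m.find() ? Long.parseLong(m.group(1)) : -1;
    }

    /** Seconds spent beyond the media duration (a crude stutter measure), or -1 if no match. */
    static long extraSeconds(String line) {
        Matcher m = FINISHED.matcher(line);
        if (!m.find()) {
            return -1;
        }
        long elapsed = Long.parseLong(m.group(1));
        long media = Long.parseLong(m.group(2));
        return elapsed - media;
    }

    public static void main(String[] args) {
        String first = "first media packet for channel: [0 AUDIO c6 #1 t0 (0) s0], after 219ms";
        String done = "finished in 26 seconds, media duration: 11 seconds";
        System.out.println("startup ms: " + startupMillis(first));
        System.out.println("extra seconds: " + extraSeconds(done));
    }
}
```

For the run above this gives a startup time of 219ms and 15 seconds of extra streaming time.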

More Demanding Workloads

I can ramp up the workload by asking Flazr to spawn a number of simulated clients, each retrieving the stream:

-load 5 rtmp://myhost/myapp/mycontent

These 5 are executed in parallel using the Java SE 5 (JSE 1.5) Executor capability.

We can adjust the degree of parallelism by controlling the thread pool size.

          -load 5 -threads 2 rtmp://myhost/myapp/mycontent

We then get 5 downloads completed, but done just two at a time, in the two parallel threads. In the limiting case we can have just one thread and hence get sequential retrieval.
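The effect of the pool size can be seen in isolation with a small sketch. I am assuming here that the pool behaves like a fixed-size pool from java.util.concurrent; the class name and the sleep that stands in for a download are purely illustrative, and the lambdas need Java 8 or later.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: simulates "-load N -threads M" with sleeping tasks.
class BoundedLoadSketch {
    /** Returns {completed downloads, peak number running at once}. */
    static int[] run(int load, int threads) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < load; i++) {
            executor.execute(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max); // remember the highest concurrency seen
                try {
                    Thread.sleep(50); // stand-in for streaming the content
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    active.decrementAndGet();
                    completed.incrementAndGet();
                }
            });
        }
        executor.shutdown(); // no more work will be submitted
        executor.awaitTermination(10, TimeUnit.SECONDS);
        return new int[] { completed.get(), peak.get() };
    }

    public static void main(String[] args) throws InterruptedException {
        int[] result = run(5, 2);
        System.out.println("completed=" + result[0] + ", peak concurrency=" + result[1]);
    }
}
```

All 5 simulated downloads complete, and the peak concurrency never exceeds the pool size.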

If you try this with Flazr 0.7 you will find that in fact the parallelism is not so controlled and Flazr itself does not shut down when the last retrieval completes. I’ll explain how I fixed that in a moment, but first I want to mention one other invocation style.

Flazr Scripts

The “-load” option described above allows you to stream several copies of the same content in parallel. If you need to emulate a more mixed workload you can instead put a list of URLs in a file and then use a command such as

     -file myscript

to initiate these streams. You can again control the number of parallel streams by using the “-threads” option:

    -threads 3 -file myscript

The Halting Problem

As mentioned earlier, when streaming in parallel, Flazr does not exit when the last stream completes. This is very inconvenient if you want to run Flazr as part of some larger test.

The reason for this behaviour is that Flazr is using an Executor, and this has a worker thread which waits for new work items to appear. It is necessary to issue a shutdown request in order for Flazr to exit.

I modified RtmpClient.java in package com.flazr.rtmp.client. This is the modified code, which I’ll explain in the next couple of sections.

        if(options.getClientOptionsList() != null) {
            logger.info("file driven load testing mode, lines: {}",
                    options.getClientOptionsList().size());
            int line = 0;
            for(final ClientOptions tempOptions :
                    options.getClientOptionsList()) {
                line++;
                logger.info("running line #{}", line);
                for(int i = 0; i < tempOptions.getLoad(); i++) {
                    final int index = i + 1;
                    final int tempLine = line;
                    executor.execute(new Runnable() {
                        @Override public void run() { 
                            logger.info("line #{}, spawned connection #{}"
                                    , tempLine, index);
                            connect(tempOptions);                          
                            logger.info("line #{}, finished connection #{}"
                                    , tempLine, index);
                        }
                    });
                }                         
            }
            // by default the executor hangs around, ask it to go away
            logger.info("queueing shutdown request");
            executor.execute(new Runnable() {
                    @Override public void run() {
                        logger.info("Turning out the lights … ");
                        executor.shutdown();
                    }
                });
            return;
        }

The most important change is to arrange for a shutdown to be requested.

Queue a Shutdown

The Flazr code creates an Executor request for each line in the script file. These requests are processed by the Executor in the order in which they are created. Hence if I add one last request to the list, a request to shut down, I know that this will be the last request to be actioned.

There is one corner case to consider: what happens if that shutdown request is issued while other threads are still active? Fortunately this is handled by the Executor framework. After a shutdown request the executor will not accept any new work, but it will wait for current requests to complete.

So we get the desired behaviour: the Flazr script completes and Flazr then stops.
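The same idea can be tried outside Flazr. The following is a minimal sketch, assuming a fixed-size thread pool with a FIFO queue; the class name and the sleeping jobs are hypothetical stand-ins for the streaming connections.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative only: the last queued job asks the executor to shut down.
class QueuedShutdownSketch {
    /** Returns the number of jobs that finished, or -1 if the pool never terminated. */
    static int runAndShutDown(int jobs, int threads) throws InterruptedException {
        final ExecutorService executor = Executors.newFixedThreadPool(threads);
        final AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            executor.execute(() -> {
                try {
                    Thread.sleep(20); // stand-in for one streaming connection
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                finished.incrementAndGet();
            });
        }
        // Queued behind the real jobs, so it is picked up only after
        // every earlier job has been handed to a worker thread.
        executor.execute(executor::shutdown);
        // Jobs already running when shutdown() is called are allowed to finish.
        boolean terminated = executor.awaitTermination(10, TimeUnit.SECONDS);
        return terminated ? finished.get() : -1;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("jobs finished before exit: " + runAndShutDown(5, 2));
    }
}
```

The pool terminates by itself once the jobs are done, so the JVM can exit without any external kill.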

Which Executor?

However, there is a further wrinkle. The original code had:

   executor.execute(new Runnable() {
      @Override public void run() {                           
         final ClientBootstrap bootstrap
                  = getBootstrap(executor, tempOptions);
         bootstrap.connect(
              new InetSocketAddress(
                  tempOptions.getHost(),
                  tempOptions.getPort()
                   ));
         }
    });

Note that the executor is passed down to the ClientBootstrap. Under the covers the IO code will queue additional work on that executor, and this happens after the initiation of this job. This introduces a race condition with the shutdown request: we can hit the shutdown before the parallel IO execution is requested.

Hence I changed this code to use the

      connect(tempOptions);

method, which Flazr uses elsewhere. This creates a dedicated, separate executor.
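To make the race concrete, here is a small sketch that forces the unlucky ordering with a latch. It assumes a standard thread pool with the default rejection policy; the empty follow-up task is a hypothetical stand-in for the IO work that the bootstrap hands to the shared executor.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative only: a job that submits follow-up work to a shared
// executor after that executor has been shut down.
class ShutdownRaceSketch {
    /** Returns true if the late follow-up submission was rejected. */
    static boolean lateSubmissionRejected() throws InterruptedException {
        final ExecutorService executor = Executors.newFixedThreadPool(2);
        final CountDownLatch shutdownDone = new CountDownLatch(1);
        final AtomicBoolean rejected = new AtomicBoolean(false);

        // The "connection" job: it hands more work to the same executor,
        // but (forced by the latch) only after shutdown has been requested.
        executor.execute(() -> {
            try {
                shutdownDone.await();
                executor.execute(() -> { /* stand-in for IO worker */ });
            } catch (RejectedExecutionException e) {
                rejected.set(true); // the executor no longer accepts new work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // The queued shutdown request.
        executor.execute(() -> {
            executor.shutdown();
            shutdownDone.countDown();
        });

        executor.awaitTermination(10, TimeUnit.SECONDS);
        return rejected.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("late submission rejected: " + lateSubmissionRejected());
    }
}
```

Giving each connection its own executor avoids this, because the shutdown of the job-level executor can no longer reject the IO work.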

SMILing

My infrastructure attempts to optimise performance by using a load distribution capability. The user requests

     http://somehost/someapp/somecontent

and receives an XML file, in SMIL format, which contains the URL that this client should use to stream the content. Hence different clients will get the same content from different places.
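To give a feel for what a test client has to do with such a response, here is a rough sketch. It is not the Flazr extension itself. I am assuming, purely for illustration, that the response carries the stream URL in the src attribute of the first video element; the host name in the sample is made up, and real responses may split the URL across other elements, which this sketch does not handle.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: assumes the stream URL is the src attribute of the
// first <video> element in the SMIL response.
class SmilRedirectSketch {
    /** Returns the src of the first video element, or null if there is none. */
    static String streamUrl(String smil) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(smil.getBytes(StandardCharsets.UTF_8)));
        NodeList videos = doc.getElementsByTagName("video");
        if (videos.getLength() == 0) {
            return null;
        }
        String src = ((Element) videos.item(0)).getAttribute("src");
        return src.isEmpty() ? null : src; // getAttribute returns "" when absent
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical response body; the host name is invented.
        String sample = "<smil><body>"
                + "<video src=\"rtmp://edgehost/myapp/mycontent\"/>"
                + "</body></smil>";
        System.out.println(streamUrl(sample));
    }
}
```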

I added code to interpret these redirection responses; I’ll describe how in my next posting.