Category: C#

Concurrency: Aggregating concurrent requests to a heavy function

I have been dealing with some interesting optimization problems these days, and one of them was a heavy method being called multiple times with the same input in a very short period of time.

The method DownloadCustomersInternal is used to download customer data from a server asynchronously and returns a Task. It resides in a class that is shared across multiple components, and it often gets called by these components almost at the same time. When there’s a lot of data to download, this method can take a while to finish, it makes sense for the callers of this method to wait on the same task if there’s already one. So a wrapper method DownloadCustomers is written:

private Task _downloadTask;
private readonly object _downloadLock = new object();

private Task DownloadCustomers()
{
    lock (_downloadLock)
    {
        // If there is already a download task, just return this one.
        if (_downloadTask != null)
            return _downloadTask;

        // Perform the real download and assign it's task to the download task reference.
        _downloadTask = DownloadCustomersInternal();

        // When the download process is done, clear the download task reference.
        return _downloadTask = _downloadTask.ContinueWith(task =>
        {
            lock (_downloadLock)
                _downloadTask = null;
        });
    }
}

The wrapper method simply checks if there is already a _downloadTask and returns it if there is. This ensures that all calls to this method get the same Task instance before _downloadTask is set to null, which only happens after DownloadCustomersInternal is finished asynchronously. Because the code sets _downloadTask to null uses the same lock, it guarantees thread safety of the variable.

This will resolve the problem by returning the same Task to callers who requests the data while the data is being downloaded, but what if the method is called right after a previous download is finished? Downloading the same data in very close succession can also be considered redundant, so it would be nice to introduce some sort of “cooling period” before the next download can be initiated. Therefore, an updated version of the above method is written:

private Task _downloadTask;
private readonly object _downloadLock = new object();
private DateTime _lastDownloadTime = DateTime.MinValue;
private static readonly TimeSpan MinimumDownloadInterval = TimeSpan.FromSeconds(5);

private Task DownloadCustomers()
{
    lock (_downloadLock)
    {
        // If there is already a download task, just return this one.
        if (_downloadTask != null)
            return _downloadTask;

        // Check if the time between this download request and the previous one is too close.
        var intervalFromLastDownload = DateTime.Now - _lastDownloadTime;
        if (intervalFromLastDownload < MinimumDownloadInterval)
        {
            // If this download request is too close to the previous one,
            // make this request wait until the minimum download interval is passed,
            // then perform the download.
            var waitTaskSource = new TaskCompletionSource<bool>();
            _downloadTask = waitTaskSource.Task.ContinueWith(_=>DownloadCustomersInternal()).Unwrap();

            // Force the wait in a background thread to ensure a non-blocking UI experience.
            ThreadPool.QueueUserWorkItem(_ =>
            {
                Thread.Sleep(MinimumDownloadInterval - intervalFromLastDownload);
                waitTaskSource.SetResult(true);
            });
        }
        else
            // If this download request is far enough from the last one,
            // just perform the normal download.
            _downloadTask = DownloadCustomersInternal();

        // When the download process is done, always clear the download task reference
        // and update the download timestamp.
        return _downloadTask = _downloadTask.ContinueWith(task =>
        {
            lock (_downloadLock)
            {
                _downloadTask = null;
                _lastDownloadTime = DateTime.Now;
            }
        });
    }
}

The key in this update is determining if a call is too close to a previous call when there is no task running. If the call is too close, simply wait in the background until MinimumDownloadInterval elapses. The waiting code in the background is also thread safe because waitTaskSource.Task.ContinueWith (line 22) is assigned to _downloadTask in the same scope of the first lock, and waitTaskSource is only set to finish when the background wait is elapsed.

Advertisements

Search algorithm and the golden ratio

From time to time, I need to insert an item to the correct index in an already sorted collection. One good example is a list of contacts, where the list is already sorted by the last name, and displayed somewhere in the UI. If the user adds a new contact, the new one needs to appear at the correct position.

So I wrote 2 little helper methods to do just that:

private static int ConditionalIndex<T>(this IList<T> list, T item, Func<T, T, int> comparer)
{
    var length = list.Count;
    for (var i = 0; i < length; i++)
        if (comparer(list[i], item) > 0)
            return i;
    return length;
}

public static void ConditionalInsert<T>(this IList<T> list, T item, Func<T, T, int> comparer)
{
    var index = list.ConditionalIndex(item, comparer);
    if (index >= list.Count) list.Add(item);
    else list.Insert(index, item);
}

And I use them like this:

var newContact = new Contact
{
    FirstName = "David",
    LastName = "Anderson"
};

MyContacts.ConditionalInsert(newContact, (c1,c2) => string.Compare(c1.LastName, c2.LastName));

This has been working fine until one day the list had grown big enough to take quite a few seconds to just add a new contact, and the user experience isn’t that good anymore.

The problem is that the ConditionalIndex method is just using a simple loop to search the index. So if the list has 10,000 items and it happens to be so unlucky that the new item needs to be inserted to the end of the list, it would take the method to go through all 10,000 items and execute that comparer delegate 10,000 times.

Therefore, I decided to optimize it to use a smarter algorithm.

After some digging around, I’ve chosen the Golden Section Search. It is a method that effectively reduces the range to be searched to find the target much faster.

To illustrate this idea in a simple way, let’s say we have a list of numbers from 0 to 9 but missing the number 7, and I want to insert 7 to this list. Golden section search will find the correct index by doing the following:

  1. Determine a split index, using the length of the search range divided by the Golden Ratio. In this case, it would be 10 ÷ 1.618 rounded up to integer, which is 6.
  2. Compare the item at index 6 with the item to be inserted, which is comparing number 6 and 7.
  3. If the item in the list is smaller, take the list before the split index as the new search range and repeat the above 2 steps. If the item is larger, take the list after the split index. If the item is the same, then the correct index is found.
  4. In our example, it would mean to take 8 and 9. The new search range length is 2 so the new split index is 2 ÷ 1.618 ≈ 1.
  5. Compare 9 to 7 means taking 8 as the new search range.
  6. Since there’s only 1 item left in the search range, simply compare 8 to 7 and we get the correct index to insert is the index before 8, which is 7.

My actual implementation of this algorithm is as follow:

public static int ConditionalIndex<T>(this IList<T> list, T item, Func<T, T, int> comparer)
{
    var length = list.Count;
    if (length == 0)
        return 0;

    const double goldenRatio = 1.618;
    var startIndex = 0;
    var endIndex = length - 1;
    while (true)
    {
        // If the start index is larger then the end index,
        // that means the search has already passed all necessary
        // search ranges, just return the start index.
        if (startIndex > endIndex)
            return startIndex;

        // If the start index is the same as the end index,
        // that means there is only 1 item left to compare.
        // Compare this item and return the result.
        if (startIndex == endIndex)
            // If the item in the list is larger than the given item,
            // retrun the index of the item in the list, otherwise return the next index.
            return comparer(list[startIndex], item) > 0 ? startIndex : startIndex + 1;

        // If there are still multiple items in the search range,
        // find the split index of the search range.
        var splitIndex = (int)Math.Round(endIndex - (endIndex - startIndex) / goldenRatio);

        // Compare the item sitting at the spliting point.
        var result = comparer(list[splitIndex], item);
        if (result > 0)
            // If this item is larger than the given item,
            // keep searching the range before the splitting point.
            endIndex = splitIndex - 1;
        else if (result < 0)
            // If this item is smaller than the given item,
            // keep searching the range after it.
            startIndex = splitIndex + 1;
        else
            // Otherwise, return the split index.
            return splitIndex;
    }
}

This has resulted in a huge boost in my stress test on a list that has 500,000 items, almost 100 times faster.

Although the use of the golden ratio is suppose to give you the best split index in general, you need to use Math.Round to round it up to an int for the most accurate result. I found that this is actually a trade off compare to loosing a bit of the precision and just ignore rounding the result casting it directly to and int:

var splitIndex = (int)(endIndex - (endIndex - startIndex) / goldenRatio);

This has given me another 1 to 2% speed boost in my stress test.

These extension methods are part of the EnumerationExtensions class in my open sourced Codify library on GitHub.

Reactive Drag in WPF

It has been way too long since I last updated my blog. Well I’ve been a bit lazy and busy at the same time, you guys get the excuse 😀

So without further ado, let’s get down to the fun stuff: doing a proper drag in WPF by using the Reactive Extensions.

I have covered some RX basics and usages 2 years ago for Silverlight/WP7. This is not only a revisit but more on using RX to solve a more complicated case as an example. Let’s start with the basics again:

var mouseDownEvents = Observable.FromEventPattern<MouseEventArgs>(this, "MouseLeftButtonDown");
var mouseMoveEvents = Observable.FromEventPattern<MouseEventArgs>(this, "MouseMove");
var mouseUpEvents = Observable.FromEventPattern<MouseEventArgs>(this, "MouseLeftButtonUp");

var dragEvents =
    from mouseDownPosition in mouseDownEvents
    from mouseMovePosition in mouseMoveEvents.StartWith(mouseDownPosition)
                                             .TakeUntil(mouseUpEvents)
    select mouseMovePosition;

dragEvents.Subscribe(eventPattern =>
{
    // Move things on UI.
});

This is a very simple drag example using RX, it is basically saying, react to all mouse move events, only after the mouse down event has happened and before any mouse up events have occurred. In the subscribe lambda expression, you can do things like get the current mouse position and use it to move things around on the UI.

However, in WPF a more appropriate way to do drag and drop, is to use the System.Windows.DragDrop.DoDragDrop method, so that you can define the drag source, drag data, allowed effects, etc…

dragEvents.Subscribe(eventPattern =>
{
    DragDrop.DoDragDrop(this, this, DragDropEffects.All);
});

It all looks nice and easy, except that:

  1. The DoDragDrop method is a synchronous method. Your code won’t pass that line until you drop.
  2. The method captures the mouse so no mouse up event will fire.

If no mouse up event is fired, the subscription won’t know when to stop, that means even after you dropped the object, the subscription will still react to mouse movements on the current element – a memory leak.

So it seems it’s reasonable enough to change the MouseLeftButtonUp event query to a Drop event query:

// This is the non-working query:
// var mouseUpEvents = Observable.FromEventPattern<MouseEventArgs>(this, "MouseLeftButtonUp");

// This should work:
var dropEvents = Observable.FromEventPattern<MouseEventArgs>(this,"Drop");

var dragEvents =
    from mouseDownEvent in mouseDownEvents
    from mouseMoveEvent in mouseMoveEvents.StartWith(mouseDownEvent)
                                          .TakeUntil(dropEvents)
    select mouseMoveEvent;

// ...

If you play around with this code for a while, you may realize that there are 2 major catches:

  1. If the current element is part of a complicated UI, you have no idea which element’s “Drop” event you should listen to. The drop could happen at anywhere.
  2. Even if you use VisualTreeHelper or Window.GetWindow to get the root element, if the drop happens on somewhere that does not allow dropping (e.g. the AllowDrop property is not set to true), then no Drop event will be raised.

So it looks like we can’t rely on drop either.

If you look at the code carefully, the drag detection is not a drag detection, it is really a “drag start” detection. As soon as you detect the starting of a drag, you can call DoDragDrop method and the framework can handle it from there. So we can see the problem as: how can I detect the drag start moment and not worry about the rest of the mouse movements?

Well logically the drag start is the one immediate mouse move event right after a mouse down event, so accordingly, we can modify the query to this:

var dragStartEvents =
    from mouseDownEvent in mouseDownEvents
    from mouseMoveEvent in mouseMoveEvents.StartWith(mouseDownEvent)
                                          .TakeUntil(mouseUpEvents)
                                          .Take(1)
    select mouseMoveEvent;

dragStartEvents.Subscribe(eventPattern =&amp;gt;
{
    DragDrop.DoDragDrop(this, this, DragDropEffects.All);
});

Now we are specifying query to only get 1 mouse move event after a mouse down event before any mouse up events. Now you can drag the hell out to anywhere without worrying about detecting where it is dropped.

In fact, the StartWith method includes the mouse down event before the first move event, so the subscription is not really reacting to the first move event, but rather, the mouse down event before that. I know this seems awkward, but I can reproduce this behaviour in .NET 4.0/4.5.

So if this annoys you, we can make the query a little bit error proof:

var dragStartEvents =
    from mouseDownEvent in mouseDownEvents
    from mouseMoveEvent in mouseMoveEvents.StartWith(mouseDownEvent)
                                          .TakeUntil(mouseUpEvents)
                                          .Skip(1)
                                          .Take(1)
    select mouseMoveEvent;

Now, this looks a solid query. It stops at mouse up events, skips the first mouse down event then take the real first mouse move event.

Preventing Accidental Drag

If you are a perfectionist, you may want to consider accidental drag detection. Basically we allow the mouse movement within a very small area around the mouse down position without triggering the drag. To do this, we need to alter the event queries a bit so that instead of getting event sequences, we get mouse position sequences:

var mouseDownEvents = Observable.FromEventPattern<MouseEventArgs>(this, "MouseLeftButtonDown").Select(pattern => pattern.EventArgs.GetPosition(this));
var mouseMoveEvents = Observable.FromEventPattern<MouseEventArgs>(this, "MouseMove").Select(pattern => pattern.EventArgs.GetPosition(this));
var mouseUpEvents = Observable.FromEventPattern<MouseEventArgs>(this, "MouseLeftButtonUp").Select(pattern => pattern.EventArgs.GetPosition(this));

Then we need to incorporate one more “Where” filter in the drag start detection query:

var dragEvents = 
    from mouseDownPosition in mouseDownEvents
    from mouseMovePosition in mouseMoveEvents
                                  .Where(mouseMoveEvent => 
                                             Math.Abs(mouseMovePosition.X - mouseDownPosition.X) > SystemParameters.MinimumHorizontalDragDistance
                                             || Math.Abs(mouseMovePosition.Y - mouseDownPosition.Y) > SystemParameters.MinimumVerticalDragDistance
                                        )
                                  .StartWith(mouseDownPosition)
                                  .TakeUntil(mouseUpEvents)
                                  .Skip(1)
                                  .Take(1)
    select mouseMovePosition;

There you go folks, take the reactive looking functional monad code (or whatever you want to call it) and rock hard out there!

C#: Implicit type cast and class conflict

Before jumping into the topic, be clear of 3 things:

  1. This post is about implicit operator overloading.
  2. This is a C# only language feature.
  3. This post is gonna focus on how to resolve class conflict from multiple web references. If you are only wondering what implicit operator is, head to this intro page.

If you haven’t lost your way, then let the party begin 😀

The Problem

It’s perfectly fine if your have multiple classes using the same name in different name spaces. Although people usually don’t do that, but it could be a common case when adding multiple web references from the same provider.

One good example is the Bing Maps SOAP services. There are 4 SOAP services that you can reference from Bing Maps: Geocode, Imagery, Route, and Search. When you add one of these services to your project as a web reference, you need to give it a name space, and Visual Studio can auto generate the necessary proxy classes to be used with the web reference. If you add a second one, you need to give it a different name space. So far it’s pretty straight forward.

Now, the problem is, on Bing Maps server side, all of the services could be sharing some common classes in their back-end. These common classes are usually located under a common name space and all of the services just import this name space and use the same common classes. However, when you add the web references, you give a different name space to each one of them, and the generated classes are all under the name space you’ve given to the respective reference. Therefore, if you add all 4 Bing Maps web references, you will notice that there are some duplicate classes in each name space. For example, the ExecutionOptions class., it exists in all 4 name spaces.

When you have created an ExceptionOptions object of the Geocode service name space, and you want to use it with other services, the compiler complains that the object you’ve created can’t be used with them because it’s not in the correct name space.

The normal way of getting over this problem is that you re-create one ExceptionOptions object for each name space:

using MyApp.BingMaps.Geocode;
using MyApp.BingMaps.Imagery;
using MyApp.BingMaps.Route;
using MyApp.BingMaps.Search;

...

var exOptGeo = new BingMaps.Geocode.ExecutionOptions();
exOptGeo.SuppressFaults = true;
var exOptImg = new BingMaps.Imagery.ExecutionOptions();
exOptImg.SuppressFaults = true;
var exOptRte = new BingMaps.Route.ExecutionOptions();
exOptRte.SuppressFaults = true;
var exOptSrh = new BingMaps.Search.ExecutionOptions();
exOptSrh.SuppressFaults = true;

var geoReq = new GeocodeRequest { ExecutionOptions = exOptGeocode };
var imgReq = new ImageryRequest { ExecutionOptions = exOptImg };
var rteReq = new RouteRequest { ExecutionOptions = exOptRte };
var srhReq = new SearchRequest { ExecutionOptions = exOptSrh };

The code looks very repetitive in this way. It could be much more tedious if the object to re-create is complex.

Since the request objects are all different, we can’t really re-use them, but we can definitely re-use the ExecutionOptions object, which is 100% the same in all 4 name spaces.

Implicit Operator Overloading

Our solution is via Implicit Operator Overloading. The implicit operator is mainly used to eliminate unnecessary casts to improve source code readability. It does not require programmer to cast one type to another. In our case, it will be used to eliminate the re-creation code of each ExecutionOptions object.

Due to that all of the generated proxy classes are partial classes. Without modifying the generated code, we can create our own class file that extends any of the proxy classes like this:

namespace MyApp.BingMaps.Imagery
{
    public partial class ExecutionOptions
    {
        public int NewProperty() { get; set; }
    }
}

By extending the ExecutionOptions class like above, you can directly use NewProperty like any other generated properties:

var exOpt = new BingMaps.Geocode.ExecutionOptions();
exOpt.NewProperty = 1;

Knowing this, we can overload the implicit operator in this partial class as below:

namespace MyApp.BingMaps.Imagery
{
    public partial class ExecutionOptions
    {
        public static implicit operator ExecutionOptions(MyApp.BingMaps.Geocode.ExecutionOptions value)
        {
            return new ExecutionOptions() { ExecutionOption = value.ExecutionOption };
        }
    }
}

The implicit operator is a short cut to recreate the object of the target type from another type. In our case, we overload the implicit operator in the ExecutionOptions partial class under the BingMaps.Imagery name space. The operator automatically handles type casting from a BingMaps.Geocode.ExecutionOptions object by re-creating a local ExecutionOptions type in the class itself.

Now, we can do an implicit type cast without even noticing it:

var exOpt = new BingMaps.Geocode.ExecutionOptions { SuppressFaults = true };
var imgReq = new ImageryRequest { ExecutionOptions = exOpt };

As you can see, we can assign a BingMaps.Geocode.ExecutionOptions to a BingMaps.Imagery.ExecutionOptions without even casting it. If we overload the implicit operator for each ExecutionOptions class in the other 3 name spaces, we can simplify our code to something as below:

var exOpt = new BingMaps.Geocode.ExecutionOptions  { SuppressFaults = true };

var geoReq = new GeocodeRequest { ExecutionOptions = exOpt };
var imgReq = new ImageryRequest { ExecutionOptions = exOpt };
var rteReq = new RouteRequest { ExecutionOptions = exOpt };
var srhReq = new SearchRequest { ExecutionOptions = exOpt };

Update: “var” is used in the code samples to shorten the code for better readability. Thanks for the tip Jeff 😉