
Functional Caching Part 3: Decoration

August 2nd, 2010

    Now that we have established the cached function pattern and adapted it to add concurrency support, it’s time to put the functions in action.

    First, let’s create a home for our GetOrder method:

    public interface IOrderRepository
    {
      Order GetOrder(int id);
    }

    This interface represents a repository (data store) which contains orders, like a database, web service, or XML file. It abstracts away persistence details such as connection strings, URLs, file paths, and query mechanisms. Using an interface communicates to the reader that GetOrder is not defined in terms of any concrete details.

    The only decision we’ve made so far is the signature of GetOrder; we haven’t had to consider class vs. struct, making the class abstract, sealing it, or making GetOrder virtual. We also haven’t locked implementers into a single inheritance hierarchy. By delaying these minor decisions, interfaces keep the focus on what needs to be done, not how it’s done.

    Before we implement IOrderRepository, let’s see how it would be used:

    public class OrderDiscountService
    {
      private readonly IOrderRepository _orders;

      public OrderDiscountService(IOrderRepository orders)
      {
        _orders = orders;
      }

      public void ApplyDiscount(int orderId, double percentage)
      {
        var order = _orders.GetOrder(orderId);

        order.ApplyDiscount(percentage);
      }
    }

    OrderDiscountService declares its dependency on IOrderRepository by accepting it in the constructor. This separates the conceptual dependency on the GetOrder operation from any details of its execution. The actual IOrderRepository implementation is left up to configuration, a concern for some other class.

    There is a subtle beauty in this. Classes define their boundaries through their dependencies. When those boundaries are interfaces, which have no inherent implementation, a class stands on its own: it can be fully understood without the mental weight of those external details.

    An outstanding benefit of this style is that most classes start to look and feel the same: declare constructor dependencies and implement an interface using private methods. It normalizes complexity with a format that can handle problems large and small. This consistency also pays dividends over time by making it less painful to ease back into past codebases.

    The Core Implementation

    We can write GetOrder by querying an NHibernate session:

    public sealed class OrderRepository : IOrderRepository
    {
      private readonly ISession _session;

      public OrderRepository(ISession session)
      {
        _session = session;
      }

      public Order GetOrder(int id)
      {
        return _session.Linq<Order>().Where(order => order.Id == id).SingleOrDefault();
      }
    }

    An NHibernate session represents a single conversation with a database. We don’t want to know the gory details of creating one of those suckers, so we declare a dependency on it instead. This communicates to the reader that this class participates in an existing conversation.

    The query is fairly straightforward. We select the one order whose ID matches the parameter, or null if that ID doesn’t exist. This class is really only interesting as the NHibernate implementation; we can imagine similar Entity Framework or LINQ to SQL versions.

    Wrappers with Mad Cache

    OrderRepository is very cohesive: its only concern is to query. Adding more functionality to it, such as caching, would muddle the class’s core intent. In situations such as this, we can use the Decorator pattern to wrap the class with new functionality.

    Decoration here means implementing IOrderRepository again and also accepting it in the constructor. The outer GetOrder is written in terms of the inner one, altering it by using a cached version of the function:

    public sealed class CachedOrderRepository : IOrderRepository
    {
      private readonly Func<int, Order> _getOrder;

      public CachedOrderRepository(IOrderRepository innerRepository)
      {
        _getOrder = innerRepository.GetOrder;

        _getOrder = _getOrder.Cached();
      }

      public Order GetOrder(int id)
      {
        return _getOrder(id);
      }
    }

    We declare that GetOrder passes the call through to the _getOrder member. To initialize the function, first we assign it to the inner, uncached implementation:

    _getOrder = innerRepository.GetOrder;

    This means exactly the same as:

    _getOrder = new Func<int, Order>(innerRepository.GetOrder);

    The C# compiler is smart enough to see that _getOrder is of type Func&lt;int, Order&gt;, a delegate type, and allows us to assign the method directly, rather than requiring us to type out something it can infer. Neat.

    Next, we get a cached version of the function we just assigned:

    _getOrder = _getOrder.Cached();

    Voila! CachedOrderRepository connects caching to GetOrder – it doesn’t actually implement either of them. All three concepts are neatly separated, and each has a very small surface area for external changes to affect it. Our intent is clear: cache the inner repository’s GetOrder method.
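    Because each decorator implements the same interface it wraps, decorators can also be stacked. As a minimal sketch (all names here are hypothetical, with stand-in types so it compiles on its own), a logging decorator composes with the repository exactly the way the caching one does:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins so this sketch is self-contained.
public class Order { public int Id; public Order(int id) { Id = id; } }

public interface IOrderRepository
{
  Order GetOrder(int id);
}

public sealed class InMemoryOrderRepository : IOrderRepository
{
  public Order GetOrder(int id) { return new Order(id); }
}

// A second decorator in the same style: wrap the interface, add one concern.
public sealed class LoggingOrderRepository : IOrderRepository
{
  private readonly IOrderRepository _inner;
  private readonly List<string> _log;

  public LoggingOrderRepository(IOrderRepository inner, List<string> log)
  {
    _inner = inner;
    _log = log;
  }

  public Order GetOrder(int id)
  {
    _log.Add("GetOrder(" + id + ")"); // record the call, then pass it through
    return _inner.GetOrder(id);
  }
}

public static class DecoratorDemo
{
  public static int Run()
  {
    var log = new List<string>();
    IOrderRepository orders =
      new LoggingOrderRepository(new InMemoryOrderRepository(), log);

    orders.GetOrder(1);
    orders.GetOrder(2);

    return log.Count; // one log entry per call
  }
}
```

    Callers still see only IOrderRepository; whether the instance behind it logs, caches, or both is invisible to them.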

    Some Assembly Required

    Like when creating a Lego structure or writing a story, we must weave our various building blocks together in a single context. This action is called composition, the natural yang to the yin of fine-grained objects. In our case, this means creating an object graph: calling the series of constructors necessary to create an instance of OrderDiscountService:

    var session = …;
    var orders = new OrderRepository(session);
    var discountService = new OrderDiscountService(orders);

    discountService.ApplyDiscount(7, 0.10);

    We used the non-cached version of the repository because we are only applying a discount to a single order. There is no return on investment for caching a single lookup. However, if we were to apply a discount to a series of orders which may contain duplicates, we could use the cached version to avoid retrieving the same order multiple times:

    var session = …;
    var orders = new CachedOrderRepository(new OrderRepository(session));
    var discountService = new OrderDiscountService(orders);

    foreach(var orderId in new[] { 1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7 })
    {
      discountService.ApplyDiscount(orderId, 0.10);
    }

    The only difference is that we set orders to a cached version of the repository. OrderDiscountService has no idea that the IOrderRepository it receives is actually a graph with its own composed behaviors. The structure of the graph is a configuration detail of which none of the classes has any knowledge.

    Summary

    We defined an interface for our GetOrder example method and saw how it would be used. We discussed some benefits of interfaces and reflected on using them to create isolatable objects. We implemented IOrderRepository in a typical query-based fashion and seamlessly added caching using the Decorator pattern, keeping it separate from the core query concern.

    We also created multiple configurations of these objects to address slight differences in context. Soon, we will see how we can simply declare these relationships and hand off the manual labor to a framework.

    Functional Caching Part 2: Concurrency

July 30th, 2010

      Last time, we defined caching in terms of functions and extended the Func<,> type with the ability to create cached versions. This is powerful because it can cache the results of any operation based on a key.

      However, the Dictionary<,>-based caching mechanism is rather simplistic. It doesn’t handle core caching scenarios such as expiration, dependencies, and notifications. Even more fundamental, though, is the ability to be read and written by multiple threads at the same time.

      Thread safety is the art of not crossing the streams. Here, a stream is a single thread of execution, and crossing means two or more threads simultaneously operating on the same data. Data being read while also being written is inherently unstable; safety means accurately predicting the outcome of multithreaded code.

      This mental model is focused on safety from the effects of threads and is heavy in mechanism. Threads are really an implementation detail of the intent to perform simultaneous actions, known as concurrency. Our goal, then, is to create cached functions which support concurrent usage.

      Here is another extension method in the same style as the original Cached method. This one simply locks the cache variable during each lookup:

      public static Func<TKey, TValue> CachedConcurrently<TKey, TValue>(
        this Func<TKey, TValue> valueFunction)
      {
        var cache = new Dictionary<TKey, TValue>();

        return key =>
        {
          TValue value;

          lock(cache)
          {
            if(!cache.TryGetValue(key, out value))
            {
              value = valueFunction(key);

              cache[key] = value;
            }
          }

          return value;
        };
      }

      The cache variable is never referenced outside of this method and thus serves as the perfect synchronization object.
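      To see the lock earning its keep, here is a self-contained sketch (the demo class and names are hypothetical; the extension method reproduces the algorithm above so the sample compiles alone). Many tasks hammer the cached function, yet the underlying function runs exactly once per distinct key:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ConcurrencyDemo
{
  // Same algorithm as CachedConcurrently above, reproduced for self-containment.
  public static Func<TKey, TValue> CachedConcurrently<TKey, TValue>(
    this Func<TKey, TValue> valueFunction)
  {
    var cache = new Dictionary<TKey, TValue>();

    return key =>
    {
      TValue value;

      lock (cache)
      {
        if (!cache.TryGetValue(key, out value))
        {
          value = valueFunction(key);
          cache[key] = value;
        }
      }

      return value;
    };
  }

  public static int Run()
  {
    var calls = 0;

    // Count how many times the "expensive" function actually executes.
    Func<int, string> lookup = id =>
    {
      Interlocked.Increment(ref calls);
      return "Order " + id;
    };

    var cached = lookup.CachedConcurrently();

    // 1000 concurrent invocations, but only 10 distinct keys.
    Parallel.For(0, 1000, i => cached(i % 10));

    return calls; // the lock guarantees one underlying call per key: 10
  }
}
```

      Without the lock, two threads could both miss the cache for the same key and each invoke valueFunction, or worse, corrupt the dictionary mid-write.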

      Summary

      We identified the need for caching functions which may be invoked concurrently. We also created a new method which adds concurrency to the algorithm established by the Cached method. We are starting to see the power of the function composition pattern and its implementation in C#.

      Functional Caching

July 29th, 2010

          Consider this typical method for fetching orders:

        Order GetOrder(int id)

        In a certain context, such as a web request, we may want to return the same order instance for the same ID. This would ensure GetOrder(1) == GetOrder(1) holds true. It also removes the need to go to the data store multiple times for the same ID.

        Dictionary<,> is very useful for this kind of caching:

        private readonly Dictionary<int, Order> _cache = new Dictionary<int, Order>();

        public Order GetOrder(int id)
        {
          Order order;

          if(!_cache.TryGetValue(id, out order))
          {
            order = QueryOrder(id);

            _cache[id] = order;
          }

          return order;
        }

        First, we check if we’ve queried for the order before. If so, we use it. If not, we perform the query and add the order to the cache.

        This process isn’t specific to orders and begs to be generalized:

        public class Cache<TKey, TValue>
        {
          private readonly Func<TKey, TValue> _valueFunction;
          private readonly Dictionary<TKey, TValue> _cache = new Dictionary<TKey, TValue>();

          public Cache(Func<TKey, TValue> valueFunction)
          {
            _valueFunction = valueFunction;
          }

          public TValue GetValue(TKey key)
          {
            TValue value;

            if(!_cache.TryGetValue(key, out value))
            {
              value = _valueFunction(key);

              _cache[key] = value;
            }

            return value;
          }
        }

        This factors out the unique part of the operation, the actual retrieval of the value, into whatever method is encapsulated by valueFunction. We went from caching one specific method, GetOrder, to caching any method which takes a single argument and has a return value.
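        To underline that generality, here is a sketch caching a completely different operation with the same class (the demo names are hypothetical; Cache&lt;,&gt; is reproduced from above so the sample stands alone):

```csharp
using System;
using System.Collections.Generic;

// The Cache<,> class from the post, reproduced for self-containment.
public class Cache<TKey, TValue>
{
  private readonly Func<TKey, TValue> _valueFunction;
  private readonly Dictionary<TKey, TValue> _cache = new Dictionary<TKey, TValue>();

  public Cache(Func<TKey, TValue> valueFunction)
  {
    _valueFunction = valueFunction;
  }

  public TValue GetValue(TKey key)
  {
    TValue value;

    if (!_cache.TryGetValue(key, out value))
    {
      value = _valueFunction(key);
      _cache[key] = value;
    }

    return value;
  }
}

public static class CacheDemo
{
  public static int Run()
  {
    var calls = 0;

    // Cache string lengths instead of orders: same class, different types.
    var lengths = new Cache<string, int>(text =>
    {
      calls++; // count how often the wrapped function actually runs
      return text.Length;
    });

    lengths.GetValue("functional");
    lengths.GetValue("caching");
    lengths.GetValue("functional"); // served from the cache

    return calls; // 2: the repeated key never re-runs the function
  }
}
```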

        GetOrder can now delegate caching duties:

        private readonly Cache<int, Order> _cache =
          new Cache<int, Order>(id => QueryOrder(id));

        public Order GetOrder(int id)
        {
          return _cache.GetValue(id);
        }

        Here, the results of the lambda expression id => QueryOrder(id) will be cached.

        A Simpler Approach

        We’ve established the key/value caching pattern and implemented it in object-oriented fashion as the Cache<,> class. While this solution is adequate for the scenario presented, there is an easier and (arguably) clearer way to express the same concept.

        Memoization is an approach to caching that uses function composition to build methods which remember their output. The basic idea is to take a function and wrap it in a new function which caches the results:

        public class OrderRepository
        {
          private readonly Func<int, Order> _query;

          public OrderRepository()
          {
            _query = id => QueryOrder(id);

            _query = _query.Cached();
          }

          public Order GetOrder(int id)
          {
            return _query(id);
          }

          // …
        }

        We create a function representing the order query, then call an extension method to get a cached version. This states our intent, "cache the results of this query", much more clearly than the original GetOrder.

        Cache<,> is no longer necessary, since both cached and uncached operations are the same type, Func<,>. This symmetry enables some nice scenarios, such as the seamless caching above.

        Nitty Gritty

        Cached has all the elements of Cache<,>:

        public static Func<TKey, TValue> Cached<TKey, TValue>(
          this Func<TKey, TValue> valueFunction)
        {
          var cache = new Dictionary<TKey, TValue>();

          return key =>
          {
            TValue value;

            if(!cache.TryGetValue(key, out value))
            {
              value = valueFunction(key);

              cache[key] = value;
            }

            return value;
          };
        }

        The extension method applies to any one-argument function. It returns a function with the same signature, and whose body wraps the original and caches its results.

        The new function is a closure, a lambda expression which can capture variables. The code is packaged to be run later, and referenced variables such as valueFunction and cache will be evaluated at that time. This means that although cache is only scoped to a method call, it will be referenced by the closure and thus avoid garbage collection.
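        The captured cache can be observed directly. In this self-contained sketch (the demo class and names are hypothetical; the extension method reproduces Cached from above), the dictionary created inside Cached survives between invocations of the returned function:

```csharp
using System;
using System.Collections.Generic;

public static class ClosureDemo
{
  // The Cached extension from the post, reproduced for self-containment.
  public static Func<TKey, TValue> Cached<TKey, TValue>(
    this Func<TKey, TValue> valueFunction)
  {
    var cache = new Dictionary<TKey, TValue>();

    return key =>
    {
      TValue value;

      if (!cache.TryGetValue(key, out value))
      {
        value = valueFunction(key);
        cache[key] = value;
      }

      return value;
    };
  }

  public static int Run()
  {
    var calls = 0;

    // The lambda captures 'calls', letting us watch the underlying work.
    Func<int, int> square = x => { calls++; return x * x; };

    var cachedSquare = square.Cached();

    cachedSquare(4);
    cachedSquare(4); // second call hits the captured cache dictionary
    cachedSquare(5);

    return calls; // 2: one underlying call per distinct key
  }
}
```

        The cache local goes out of scope when Cached returns, yet the closure keeps it alive, which is exactly why the memoized results persist.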

        Summary

        A common approach to caching is using a dictionary. We identified that as an implementation detail and focused on the intent to cache an operation’s results. We also composed building blocks such as functions and extension methods to help us hide the how and bring out the why.