Archive

Posts Tagged ‘Caching’

Functional Caching Part 4: Dependency Injection

October 7th, 2010 2 comments

    Last time, we explored how to work caching functions into an application. We deconstructed the solution, then composed its elements in various ways to tweak behavior.

    We also identified a pattern: assembling an object graph. The process, known as Dependency Injection, is very mechanical; it requires creating and managing an ever-increasing number of objects, each of which can be arbitrarily complex. That manual labor is a significant obstacle to writing in the Dependency Injection style.

    This issue has spawned several frameworks, known as Inversion of Control containers. These systems separate the need for another object from the details of how to get it.

    This is exactly what we did before: OrderDiscountService needs an instance of IOrderRepository, and doesn’t know or care that we used CachedOrderRepository to fulfill the contract. Accordingly, this technique fits neatly into a container.

    Containers

    A container creates and manages the object instances used in an application. It is an ecosystem, where each object is an inhabitant that has its own unique relationships with other objects. Object relationships are described as B-depends-on-A, where B is an object that calls upon object A to do some work.

    Since B needs an instance of A, it has a choice:

    1. Create A directly
    2. Retrieve A from a static property or method
    3. Accept A through a constructor argument
    4. Accept A through a settable property

    3 and 4 externalize the dependency; B’s creator determines the instance of A. This makes B simpler than 1 or 2, as it doesn’t have to know how to call an external accessor. B is also more cohesive because it has less infrastructure code and is more focused on its core responsibility.

    The main purpose of a container is to understand the contours of these relationships and use that knowledge to create fully-functional objects. This frees up developers to concentrate on the project-specific tasks in the classes, instead of all the connective tissue between them.

    Autofac

    My container of choice is Autofac. It is clean and cohesive, with an excellent API*. The basic flow of Dependency Injection with Autofac is:

    1. Create a container builder
    2. Register all application types with the builder
    3. Build the container
    4. Resolve a top-level object out of the container

    Here is how we would configure OrderDiscountService to use the non-caching OrderRepository:

    var builder = new ContainerBuilder();

     

    // …Register ISession because OrderRepository requires it…

     

    builder

      .RegisterType<OrderRepository>()

      .As<IOrderRepository>()

      .InstancePerDependency();

     

    builder.RegisterType<OrderDiscountService>().InstancePerDependency();

    This registers the OrderRepository type under the IOrderRepository interface, meaning any other object which has a dependency on IOrderRepository will be given an instance of OrderRepository. The InstancePerDependency instruction tells the container to create a new instance every time an object has a dependency on IOrderRepository.

    Here is how we would resolve an instance of OrderDiscountService from the container:

    var container = builder.Build();

     

    var discountService = container.Resolve<OrderDiscountService>();

     

    var orderId = 1;

     

    discountService.ApplyDiscount(orderId, 0.10);

    The builder creates a container with all of the registrations. We then resolve an instance of OrderDiscountService and use it. The container encapsulates the messy, rote details of wiring the object graph. It inspects the constructor to determine its dependencies. When it finds one on IOrderRepository, it knows to create an instance of OrderRepository because of our prior registration.

    To add caching, we simply have to change the object which is registered as IOrderRepository: 

    builder

      .Register(c =>

        new CachedOrderRepository(new OrderRepository(c.Resolve<ISession>())))

      .As<IOrderRepository>()

      .SingleInstance();

    Here, we declare that when an object depends on IOrderRepository, we create an instance of OrderRepository and wrap it in CachedOrderRepository. Other objects remain oblivious to the new capabilities.

    We use the context variable c to fulfill
    OrderRepository
    ‘s dependency on ISession. We have to spell out the dependencies because we want to run arbitrary registration code. This is what the container automagically does for us when we use the RegisterType method.

    We also change the lifetime to SingleInstance, which tells the container to use the same instance to fulfill all dependencies. This extends the benefits of caching to all objects with minimal effort.

    Summary

    This series has explored caching as implemented in Dependency Injection style. First, we defined result caching for any method with a single parameter and a return value. Next, we defined concurrent caching, allowing multiple threads to safely invoke the caching method. Then, we seamlessly weaved caching into an implementation of IOrderRepository using the Decorator pattern.

    In this final installment, we expressed the same object graph as last time, but in a declarative fashion. The boring tab-A-slot-B mechanics are the container’s specialty; by delegating that task, we gain more time to concentrate on providing business value.

    * I got the chance to meet Nick Blumhardt, Autofac’s founder, at PDC 2008. Let’s just say I trust his code.

    Functional Caching Part 3: Decoration

    August 2nd, 2010 No comments

      Now that we have established the cached function pattern and adapted it to add concurrency support, it’s time to put the functions in action.

      First, let’s create a home for our GetOrder method:

      public interface IOrderRepository

      {

        Order GetOrder(int id);

      }

      This interface represents a repository (data store) which contains orders, like a database, web service, or XML file. It abstracts away persistence details such as connection strings, URLs, file paths, and query mechanisms. Using an interface communicates to the reader that GetOrder is not defined in terms of any concrete details.

      The only decision we’ve made so far is the signature of GetOrder; we haven’t had to consider class vs. struct, making the class abstract, sealing it, or making GetOrder virtual. We also haven’t locked implementers into a single inheritance hierarchy. By delaying these minor decisions, interfaces keep the focus on what needs to be done, not how it’s done.

      Before we implement IOrderRepository, let’s see how it would be used:

      public class OrderDiscountService

      {

        private readonly IOrderRepository _orders;

       

        public OrderDiscountService(IOrderRepository orders)

        {

          _orders = orders;

        }

       

        public void ApplyDiscount(int orderId, double percentage)

        {

          var order = _orders.GetOrder(orderId);

       

          order.ApplyDiscount(percentage);

        }

      }

      OrderDiscountService declares its dependency on IOrderRepository by accepting it in the constructor. This separates the conceptual dependency on the GetOrder operation from any details of its execution. The actual IOrderRepository implementation is left up to configuration, a concern for some other class.

      There is a subtle beauty in this. Classes define their boundaries through their dependencies. When those boundaries are interfaces, which have no inherent implementation, a class stands on its own: it can be fully understood without the mental weight of those external details.

      An outstanding benefit of this style is that most classes start to look and feel the same: declare constructor dependencies and implement an interface using private methods. It normalizes complexity with a format that can handle problems large and small. This consistency also pays dividends over time by making it less painful to ease back into past codebases.

      The Core Implementation

      We can write GetOrder by querying an NHibernate session:

      public sealed class OrderRepository : IOrderRepository

      {

        private readonly ISession _session;

       

        public OrderRepository(ISession session)

        {

          _session = session;

        }

       

        public Order GetOrder(int id)

        {

          return _session.Linq<Order>().Where(order => order.Id == id).SingleOrDefault();

        }

      }

      An NHibernate session represents a single conversation with a database. We don’t want to know the gory details of creating one of those suckers, so we declare a dependency on it instead. This communicates to the reader that this class participates in an existing conversation.

      The query is fairly straightforward. We select the one order whose ID matches the parameter, or null if that ID doesn’t exist. This class is really only interesting because it is the NHibernate version of it; we can imagine similar Entity Framework or LINQ to SQL implementations.

      Wrappers with Mad Cache

      OrderRepository is very cohesive: its only concern is to query. Adding more functionality to it, such as caching, would muddle the class’s core intent. In situations such as this, we can use the Decorator pattern to wrap the class with new functionality.

      Decoration here means implementing IOrderRepository again and also accepting it in the constructor. The outer GetOrder is written in terms of the inner one, altering it by using a cached version of the function:

      public sealed class CachedOrderRepository : IOrderRepository

      {

        private readonly Func<int, Order> _getOrder;

       

        public CachedOrderRepository(IOrderRepository innerRepository)

        {

          _getOrder = innerRepository.GetOrder;

       

          _getOrder = _getOrder.Cached();

        }

       

        public Order GetOrder(int id)

        {

          return _getOrder(id);

        }

      }

      We declare that GetOrder passes the call through to the _getOrder member. To initialize the function, first we assign it to the inner, uncached implementation:

      _getOrder = innerRepository.GetOrder;

      This means exactly the same as:

      _getOrder = new Func<int, Order>(innerRepository.GetOrder);

      The C# compiler is smart enough to see that _getOrder is of type Func<int, Order>, a delegate type, and allows us to assign the method directly, rather than have us type out something it can infer. Neat.

      Next, we get a cached version of the function we just assigned:

      _getOrder = _getOrder.Cached();

      Voila! CachedOrderRepository connects caching to GetOrder – it doesn’t actually implement either of them. All three concepts are neatly separated, and each has a very small surface area for external changes to affect it. Our intent is clear: cache the inner repository’s GetOrder method.

      Some Assembly Required

      Like when creating a Lego structure or writing a story, we must weave our various building blocks together in a single context. This action is called composition, the natural yang to the yin of fine-grained objects. In our case, this means creating an object graph: calling the series of constructors necessary to create an instance of OrderDiscountService:

      var session = …;

      var orders = new OrderRepository(session);

      var discountService = new OrderDiscountService(orders);

       

      discountService.ApplyDiscount(7, 0.10);

      We used the non-cached version of the repository because we are only applying a discount to a single order. There is no return on investment for caching the instance. However, if we were to apply a discount to a series of orders which may contain duplicates, we could use the cached version to avoid retrieving the same order multiple times:

      var session = …;

      var orders = new CachedOrderRepository(new OrderRepository(session));

      var discountService = new OrderDiscountService(orders);

       

      foreach(var orderId in new[] { 1, 2, 2, 3, 4, 5, 5, 5, 6, 7, 7 })

      {

        discountService.ApplyDiscount(orderId, 0.10);

      }

      The only difference is that we set orders to a cached version of the repository. OrderDiscountService has no idea that the IOrderRepository it receives is actually a graph with its own composed behaviors. The structure of the graph is a configuration detail of which none of the classes has any knowledge.

      Summary

      We defined an interface for our GetOrder example method and saw how it would be used. We discussed some benefits of interfaces and reflected on using them to create isolatable objects. We implemented IOrderRepository in a typical query-based fashion and seamlessly added caching using the Decorator pattern, keeping it separate from the core query concern.

      We also created multiple configurations of these objects to address slight differences in context. Soon, we will see how we can simply declare these relationships and hand off the manual labor to a framework.

      Functional Caching Part 2: Concurrency

      July 30th, 2010 No comments

        Last time, we defined caching in terms of functions and extended the Func<,> type with the ability to create cached versions. This is powerful because it can cache the results of any operation based on a key.

        However, the Dictionary<,>-based caching mechanism is rather simplistic. It doesn’t handle core caching scenarios such as expiration, dependencies, and notifications. Even more fundamental, though, is the ability to be read and written by multiple threads at the same time.

        Thread safety is the art of not crossing the streams. Here, a stream is a single thread of execution, and crossing means two or more threads simultaneously operating on the same data. Data being read while also being written is inherently unstable; safety means accurately predicting the outcome of multithreaded code.

        This mental model is focused on safety from the effects of threads and is heavy in mechanism. Threads are really an implementation detail of the intent to perform simultaneous actions, known as concurrency. Our goal, then, is to create cached functions which support concurrent usage.

        Here is another extension method in the same style as the original Cached method. This one simply locks the cache variable during each lookup:

        public static Func<TKey, TValue> CachedConcurrently<TKey, TValue>(

          this Func<TKey, TValue> valueFunction)

        {

          var cache = new Dictionary<TKey, TValue>();

         

          return key =>

          {

            TValue value;

         

            lock(cache)

            {

              if(!cache.TryGetValue(key, out value))

              {

                value = valueFunction(key);

         

                cache[key] = value;

              }

            }

         

            return value;

          };

        }

        The cache variable is never referenced outside of this method and thus serves as the perfect synchronization object.

        Summary

        We identified the need for caching functions which may be invoked concurrently. We also created a new method which adds concurrency to the algorithm established by the Cached method. We are starting to see the power of the function composition pattern and its implementation in C#.

        Functional Caching

        July 29th, 2010 No comments
            Consider this typical method for fetching orders:

          Order GetOrder(int id)

          In a certain context, such as a web request, we may want to return the same order instance for the same ID. This would ensure GetOrder(1) == GetOrder(1) holds true. It also removes the need to go to the data store multiple times for the same ID.

          Dictionary<,> is very useful for this kind of caching:

          private readonly Dictionary<int, Order> _cache = new Dictionary<int, Order>();

           

          public Order GetOrder(int id)

          {

            Order order;

           

            if(!_cache.TryGetValue(id, out order))

            {

              order = QueryOrder(id);

           

              _cache[id] = order;

            }

           

            return order;

          }

          First, we check if we’ve queried for the order before. If so, we use it. If not, we perform the query and add the order to the cache.

          This process isn’t specific to orders and begs to be generalized:

          public class Cache<TKey, TValue>

          {

            private readonly Func<TKey, TValue> _valueFunction;

            private readonly Dictionary<TKey, TValue> _cache = new Dictionary<TKey, TValue>();

           

            public Cache(Func<TKey, TValue> valueFunction)

            {

              _valueFunction = valueFunction;

            }

           

            public TValue GetValue(TKey key)

            {

              TValue value;

           

              if(!_cache.TryGetValue(id, out value))

              {

                value = _valueFunction(id);

           

                _cache[key] = value;

              }

           

              return value;

            }

          }

          This factors out the unique part of the operation, the actual retrieval of the value, into whatever method is encapsulated by valueFunction. We went from caching one specific method, GetOrder, to caching any method which takes a single argument and has a return value.

          GetOrder can now delegate caching duties:

          private readonly Cache<int, Order> _cache =

            new Cache<int, Order>(id => QueryOrder(id));

           

          public Order GetOrder(int id)

          {

            return _cache.GetValue(id);

          }

          Here, the results of the lambda expression id => QueryOrder(id) will be cached.

          A Simpler Approach

          We’ve established the key/value caching pattern and implemented it in object-oriented fashion as the Cache<,> class. While this solution is adequate for the scenario presented, there is an easier and (arguably) clearer way to express the same concept.

          Memoization is an approach to caching that uses function composition to build methods which remember their output. The basic idea is to take a function and wrap it in a new function which caches the results:

          public class OrderRepository

          {

            private
          readonly Func<int, Order> _query;

           

            public OrderRepository()

            {

              _query = id => QueryOrder(id);

           

              _query = _query.Cached();

            }

           

            public Order GetOrder(int id)

            {

              return _query(id);

            }

           

            // …

          }

          We create a function representing the order query, then call an extension method to get a cached version. This states our intent, "cache the results of this query", much clearer than the original GetOrder.

          Cache<,> is no longer necessary, since both cached and uncached operations are the same type, Func<,>. This symmetry enables some nice scenarios, such as the seamless caching above.

          Nitty Gritty

           Cached has all the elements of Cache<,>:

          public static Func<TKey, TValue> Cached<TKey, TValue>(

            this Func<TKey, TValue> valueFunction)

          {

            var cache = new Dictionary<TKey, TValue>();

           

            return key =>

            {

              TValue value;

           

              if(!cache.TryGetValue(key, out value))

              {

                value = valueFunction(key);

           

                cache[key] = value;

              }

           

              return value;

            };

          }

          The extension method applies to any one-argument function. It returns a function with the same signature, and whose body wraps the original and caches its results.

          The new function is a closure, a lambda expression which can capture variables. The code is packaged to be run later, and referenced variables such as valueFunction and cache will be evaluated at that time. This means that although cache is only scoped to a method call, it will be referenced by the closure and thus avoid garbage collection.

          Summary

          A common approach to caching is using a dictionary. We identified that as an implementation detail and focused on the intent to cache an operation’s results. We also composed building blocks such as functions and extension methods to help us hide the how and bring out the why.