Stein’s Coding Adventures

February 22, 2009

Unit testing: InternalsVisibleTo is your friend

Filed under: C#, Design | Stein @ 17:08

Where should you put your unit tests? In the same assembly or in an assembly of its own?

Phil Haack organized a poll on the subject which at the time of this writing shows that nine out of ten developers would go for separate projects. One out of ten would put them in the same assembly, and a small fraction would simply do “other”. Are there, by the way, any more options available?

But when the tests are in a different assembly than the code being tested – wouldn’t the classes and methods under test need to be public? Declaring them as public just for the sake of testing would definitely break the encapsulation. Until quite recently, I belonged to the 10% that put their tests along with the code to be tested, for this particular reason. This solved the encapsulation problem, but came at the price of having to ship the full tests to the customer. It also made it hard for me to put the test data – in this case, images – as resources in the assembly, because that would further have increased the binary footprint of the application.

Somehow I managed to miss the introduction of the InternalsVisibleTo attribute, which has been around since .NET 2.0. You add it to the assembly containing the code under test and let it name your test assembly, which can then access any internals it needs.

When working with signed assemblies, you will need to declare both the name and the public key of the assembly being referenced. You may well run into some hassle getting the public key string of the unit test assembly. Luckily, Tyler Holmes gives a simple description of how this can be done.
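As a minimal sketch of the setup described above: the assembly name MyProduct.Tests and the PriceCalculator class are made up for illustration, and the public key in the comment is a placeholder rather than a real key.

```csharp
using System.Runtime.CompilerServices;

// Placed in the assembly under test, typically in AssemblyInfo.cs.
// "MyProduct.Tests" is a hypothetical name for the test assembly.
[assembly: InternalsVisibleTo("MyProduct.Tests")]

// For signed assemblies the full public key (not the key token) is appended:
// [assembly: InternalsVisibleTo("MyProduct.Tests, PublicKey=<full public key as hex>")]

namespace MyProduct
{
    // Internal, so it stays out of the public API, yet the test
    // assembly named above can still instantiate and call it.
    internal class PriceCalculator
    {
        internal decimal AddVat(decimal net, decimal rate)
        {
            return net + net * rate;
        }
    }
}
```

With this in place, a test in MyProduct.Tests can call new PriceCalculator().AddVat(...) directly, without the class ever becoming public.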

I’m sure there are still situations where you’d want to keep the tests and the code being tested together. However, access to internal classes and methods shouldn’t be the main reason.

January 29, 2009

Dual Platform P/Invoke

Filed under: C# | Stein @ 18:18

How do you make your product work both in 32 and 64 bit environments?

If you are creating a 100% managed .NET application, the JIT compiler will do most of the work for you, but it will leave you on your own as soon as you cross the border to the unmanaged world. A native DLL called through P/Invoke must be either a 32 bit or a 64 bit DLL; a single binary cannot have several word lengths at the same time. But if we could ship both a 32 and a 64 bit version of the same DLL, we could at least make our application work regardless of the situation.

So, how do we choose which DLL to bind to once the application has started? Unfortunately, the DllImport attribute only accepts a constant file name, which essentially means that the name must be decided at compile time. The best solution I have found so far is to make one proxy for the 32 bit version and one for the 64 bit version, each bound to the file name of its own DLL, and to select the proxy matching the platform at run time based on the word length. Both proxies can be compiled into the same assembly.

Unfortunately, this comes at the price of some duplication. As mentioned, two proxies, rather than one, will be required. Furthermore, I’m not aware of any way of letting the static P/Invoke methods themselves implement a common interface, so this adds an extra level of indirection. On the other hand, if you want to be able to replace the DLL with a mock object, an interface could be quite handy anyway.

As an example, let’s say we have an unmanaged codec that we want to access using the following interface:

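    using System;                          // IntPtr
    using System.Runtime.InteropServices;  // DllImport, used by the proxies below

    public interface ICodec {
        int Decode(IntPtr input, IntPtr output, long inputLength);
    }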

In a real-world scenario, it is likely that this interface would contain more than a single method. We create two proxies, one for the 32 bit (x86) mode:

    public class CodecX86 : ICodec {
        private const string dllFileName = @"Codec.x86.dll";

        [DllImport(dllFileName)]
        static extern int decode(IntPtr input, IntPtr output, long inputLength);

        public int Decode(IntPtr input, IntPtr output, long inputLength) {
            return decode(input, output, inputLength);
        }
    }

And one for the 64 bit platforms. Note that it is identical except for the class name and the file name constant.

    public class CodecX64 : ICodec {
        private const string dllFileName = @"Codec.x64.dll";

        [DllImport(dllFileName)]
        static extern int decode(IntPtr input, IntPtr output, long inputLength);

        public int Decode(IntPtr input, IntPtr output, long inputLength) {
            return decode(input, output, inputLength);
        }
    }

Finally, we create a factory that will give us an instance that is compatible with the currently running platform:

    public class CodecFactory {
        ICodec instance = null;

        public ICodec GetCodec() {
            if (instance == null) {
                if (IntPtr.Size == 4) {
                    instance = new CodecX86();
                } else if (IntPtr.Size == 8) {
                    instance = new CodecX64();
                } else {
                    throw new NotSupportedException("Unknown platform");
                }
            }
            return instance;
        }
    }

Using this pattern, the two DLLs can be deployed alongside each other, and your application will automatically run in native mode regardless of your operating system’s pointer size.
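The interface also pays off when testing, as hinted at earlier. A rough sketch of how the native codec could be swapped for a fake: FakeCodec and ImageLoader are hypothetical names, and treating a return value of 0 as success is an assumption, not something the codec above specifies.

```csharp
using System;

public interface ICodec {
    int Decode(IntPtr input, IntPtr output, long inputLength);
}

// Hypothetical stand-in for the native codec: records the call and
// returns a canned result without touching any unmanaged DLL.
public class FakeCodec : ICodec {
    public long LastInputLength = -1;
    public int Result = 0;

    public int Decode(IntPtr input, IntPtr output, long inputLength) {
        LastInputLength = inputLength;
        return Result;
    }
}

// Hypothetical consumer that only knows about the interface.
public class ImageLoader {
    private readonly ICodec codec;

    public ImageLoader(ICodec codec) {
        if (codec == null) throw new ArgumentNullException("codec");
        this.codec = codec;
    }

    // Returns true when the codec reports success (assumed here to be 0).
    public bool TryDecode(long inputLength) {
        return codec.Decode(IntPtr.Zero, IntPtr.Zero, inputLength) == 0;
    }
}
```

Production code would hand ImageLoader whatever CodecFactory returns, while a unit test passes a FakeCodec and never loads a native DLL.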

January 6, 2009

Keep the objects consistent!

Filed under: C#, Design | Stein @ 23:30

The best way to avoid checking all the time whether an object is consistent is to never allow it to get inconsistent.

I’ll try to illustrate the problem with a simple example. The following class expects both the Name and Address properties to actually contain values rather than null. This is illustrated by the ToString() method, but in a real scenario it would apply to most methods of the class. (By the way, note the cool syntactically sweet one-liner for defining fields and properties.)

    class Person {
        public FullName Name { get; set; }
        public Address Address { get; set; }

        public Person() {
            // Not really doing anything. Just put it here in the
            // code sample to show that there is no fancy initialization going on.
        }

        public override string ToString() {
            return String.Format("Person (Name={0}, Street={1}, City={2})", Name.FullName, Address.Street, Address.City);
        }
    }

It gives quite a lot of responsibility to the user of the class, who is entrusted to assign the properties after having invoked the constructor. Failing to do so will at worst give null reference errors when code depending on the contents of a member tries to use it. Adding assertions or error handling wherever a field is about to be used would improve the situation somewhat, but clutters the code. In the example above, one workaround could be to introduce null checks in the ToString() method to ensure that a proper “Name missing” or “Null” message is shown, rather than a NullReferenceException being thrown. String.Format on its own does not help here: it renders a null argument as an empty string, but when Name itself is null, Name.FullName throws before String.Format is even called.

However, there are still several problems with this design:

  • Adding new member fields is risky. It can be very tricky to keep track of all the places where the Person class is used. How can you be sure that the new field is assigned correctly wherever Person instances are created? If it is just an internal class it might be manageable. If not, let’s just hope that a unit test will catch it.
  • Even if the object is initialized correctly, there is no guarantee that it will stay consistent. What would happen if someone assigns null to the Name property?
  • Adding null checks and proper handling will definitely mess up the code.
  • Other problems as well, but mentioning them would move focus away from the point I’m trying to make…

If the object is guaranteed to be consistent when it is created, and impossible to bring into an inconsistent state, there should be no need to have null checks scattered all over your class. Check the values at the gates, that is, in the constructor and in the property setters, and you can trust them to be correct afterwards. If the object is immutable, it is of course sufficient to do the check during construction. Use the readonly (C#) or final (Java) keyword to get support from the compiler in enforcing this, and to state your intentions.
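For a class that has to stay mutable, checking at the gates could look roughly like the sketch below. The FullName type here is a minimal hypothetical stand-in with a Value field, and the Address property is left out to keep the sample short.

```csharp
using System;

// Minimal hypothetical stand-in for the name type used in the post.
sealed class FullName {
    public readonly string Value;

    public FullName(string value) {
        if (value == null) throw new ArgumentNullException("value");
        Value = value;
    }
}

class Person {
    private FullName name;

    public Person(FullName name) {
        Name = name; // goes through the same gate as later assignments
    }

    public FullName Name {
        get { return name; }
        set {
            // The only place where name is written, so the only check needed.
            if (value == null) throw new ArgumentNullException("value");
            name = value;
        }
    }

    public override string ToString() {
        // No null check here: the gates guarantee name is set.
        return String.Format("Person (Name={0})", name.Value);
    }
}
```

Both new Person(null) and person.Name = null fail immediately with an ArgumentNullException, at the place where the mistake is made rather than somewhere far away in ToString().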

Actually, immutability is a great design concept that should be used more. Eric Lippert has written some great posts on this topic that I recommend! He starts out with “Immutability in C# Part 1: Kinds of immutability”, defining what it is all about, and then goes on with 10 more articles exploring how various common data structures can be given immutable implementations.

I’ll round off this post with a modified version of the class that guarantees that name and address can never be null. If the name and address objects are immutable, then this object will also be truly immutable.

    sealed class Person {
        readonly FullName name;
        readonly Address address;

        public FullName Name {
            get {
                return name;
            }
        }

        public Address Address { get { return address; } }

        public Person(FullName name, Address address) {
            if (name == null) throw new ArgumentNullException("name");
            if (address == null) throw new ArgumentNullException("address");

            this.name = name;
            this.address = address;
        }

        public override string ToString() {
            return String.Format("Person (Name={0}, Street={1}, City={2})", Name.FullName, Address.Street, Address.City);
        }
    }

January 2, 2009

Logging using Lambda Expressions

Filed under: C# | Stein @ 16:19

Trace logging is a good thing, isn’t it? Lately, I have seen quite a lot of code roughly similar to the following:

  logger.Debug(String.Format("Logging in user {0} from {1}",
      user.FullName, user.GetFullHostName()));

This code has two significant drawbacks. There is an overhead associated with building the string to be logged, that we have to pay regardless of whether the string is actually used. Also, the actual code building the string might throw run-time exceptions. In the given example, this will happen if user is a null reference.

A common way of speeding up logging is to add an extra check of the log level prior to constructing the string:

  if (logger.IsDebugEnabled) {
      logger.DebugFormat("Logging in user {0} from {1}",
          user.FullName, user.FullHostName);
  }

This eliminates the string generation overhead, but is more verbose than the straightforward version. Furthermore, if the string generation throws an exception, the code will succeed or fail depending on the configured log level. This is particularly nasty, because, honestly, how often do you run at full log level while developing?

  delegate string StringCreator();

If the logging framework accepted a string delegate as parameter instead of the actual string, it would be up to the logging framework to decide whether or not to invoke the method. Also, any exceptions during its execution can be caught. It will definitely still be possible to do stupid stuff, but many of the most common mistakes can be gracefully handled.

  logger.Debug(() => "Logging in user " + user.FullName);

I guess this still comes at the price of extra methods being generated under the hood. However, I don’t think that is too high a price to pay for reducing the risk of errors and increasing the clarity of the code.
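A rough sketch of what such a delegate-taking logger could look like. LazyLogger is a hypothetical class, not the API of any existing framework, and it collects its output in a list only to keep the sample self-contained.

```csharp
using System;
using System.Collections.Generic;

delegate string StringCreator();

// Hypothetical logger, not the API of any existing framework.
class LazyLogger {
    public bool IsDebugEnabled;
    public readonly List<string> Lines = new List<string>();

    public void Debug(StringCreator createMessage) {
        if (!IsDebugEnabled) return; // the delegate is never invoked

        string message;
        try {
            message = createMessage();
        } catch (Exception e) {
            // A broken log statement must not break the caller.
            message = "<log message failed: " + e.GetType().Name + ">";
        }
        Lines.Add(message);
    }
}
```

With this design, logger.Debug(() => "Logging in user " + user.FullName) behaves the same for the caller at every log level: a null user results in a marker line in the log when debug logging is on, and in nothing at all when it is off.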

Another good improvement would be to incorporate the use of String.Format in the logging framework, allowing the developer to write:

  logger.Debug("Logging in user {0} from {1}", user.FullName, user.FullHostName);

This would definitely come with some overhead, but looks nice and clean. The string needs only to be generated if the detail level of the log is verbose enough, although the array containing the arguments will definitely need to be assembled. Also, any exceptions during the actual string generation can be caught and logged appropriately.
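The format-string variant could be sketched along the same lines. FormatLogger is again a hypothetical class that collects its output in a list for the sake of the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical logger, not the API of any existing framework.
class FormatLogger {
    public bool IsDebugEnabled;
    public readonly List<string> Lines = new List<string>();

    public void Debug(string format, params object[] args) {
        // The args array has already been built by the caller at this point.
        if (!IsDebugEnabled) return; // but no string is formatted

        string message;
        try {
            message = String.Format(format, args);
        } catch (FormatException e) {
            // For example a placeholder without a matching argument.
            message = "<bad log format \"" + format + "\": " + e.Message + ">";
        }
        Lines.Add(message);
    }
}
```

Note the limit of this approach: the arguments themselves are evaluated by the caller before Debug is entered, so an expression like user.FullName with a null user still throws regardless of the log level. Only failures in the formatting step can be caught inside the logger.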

Maybe there already are some frameworks out there supporting these syntaxes?
