Thursday, June 12, 2008

MSDN Articles Online (.chm)

Glenn Block (@gblock on Twitter) just posted a tweet that mentioned the existence of these.  I had no idea they were there and I'm sure others didn't either - so here you go: http://is.gd/vZE

Enjoy!

Sunday, June 8, 2008

Extension Methods are way cool!

OK.  So everyone's heard of LINQ by now.  Most everyone has even heard of some of the cool features of C# 3.0 (lambdas), but in my mind the coolest of them, extension methods, goes largely unnoticed.  Extension methods are the plumbing on which LINQ and some of the other cool features in the C# 3.0 libraries are implemented.  They are, in my opinion, the best feature C# has introduced since Generics, and are possibly one of the best features added to traditional languages EVER!

Consider the following - you have a class that someone else wrote.  On their class, they've provided a public interface for doing all of the things you need, but there are several additional things that you've implemented (as a separate utility set of functions) that it would be nice to add to the class's public interface.  Unfortunately, the class is marked 'sealed', or it is the base of a large hierarchy of classes that you simply can't add your functionality to (since you can't cause classes in a vendor's library to derive from your 'new' version of their base class).

Extension methods to the rescue - all you need to do is declare a static class in your library (which you probably already have called 'StringUtils' or something like that :)), and provide some static methods on it that use the 'this' modifier on their first parameter.  Magically, the compiler will then 'add' this method to every expression whose type is compatible with the type of the 'this-marked' parameter.

For example:

public static class StringUtils
{
    public static string RemoveAll(this string s, params string[] args)
    {
        string ret = s;
        foreach (string sremove in args)
            ret = ret.Replace(sremove, string.Empty);
        return ret;
    }
}

by the way, of course I know this is the most horrible way to implement this function - it's just an example so don't tell me how crappy my code is or that I should be using StringBuilder or yada yada yada...!

The point of this is that after declaring such a function, all objects of type 'string' syntactically receive a member called 'RemoveAll' that has a single 'params' argument.  This is VERY cool.
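To make the "syntactically receive a member" point concrete, here is a minimal sketch of the two ways the same method can be called. The class is repeated so the snippet compiles on its own; 'RemoveAllDemo' and the sample strings are just made up for illustration.

```csharp
using System;

public static class StringUtils
{
    // Same idea as the RemoveAll example: strip every occurrence of each argument.
    public static string RemoveAll(this string s, params string[] args)
    {
        string ret = s;
        foreach (string sremove in args)
            ret = ret.Replace(sremove, string.Empty);
        return ret;
    }
}

// Hypothetical demo class, not part of any real library.
public static class RemoveAllDemo
{
    public static string ViaExtensionSyntax()
    {
        // Reads as if 'string' itself had a RemoveAll member.
        return "a-b_c-d".RemoveAll("-", "_");
    }

    public static string ViaStaticSyntax()
    {
        // The same call written as the ordinary static method it really is.
        return StringUtils.RemoveAll("a-b_c-d", "-", "_");
    }
}
```

Both forms compile to the same static call; the extension syntax is purely a compile-time convenience, which is why it works even on a sealed class like 'string'.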

The coolest thing about this - you can also do it for interfaces, enums, and various other types that you can't possibly provide "code" for in a more traditional way.
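Here's a small sketch of what that looks like for an enum and for an interface. The 'Weekday' enum, 'IsWeekend' and 'CountWhere' are all invented for this example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical enum used only for this illustration.
public enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Saturday, Sunday }

public static class MoreExtensions
{
    // Extends an enum, which cannot declare methods of its own.
    public static bool IsWeekend(this Weekday day)
    {
        return day == Weekday.Saturday || day == Weekday.Sunday;
    }

    // Extends an interface: every IEnumerable<T> implementation picks this up.
    public static int CountWhere<T>(this IEnumerable<T> items, Predicate<T> test)
    {
        int count = 0;
        foreach (T item in items)
            if (test(item))
                count++;
        return count;
    }
}
```

With these in scope, 'Weekday.Sunday.IsWeekend()' and 'someList.CountWhere(x => x > 10)' both read like ordinary member calls, even though neither an enum nor an interface can carry an implementation of its own.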

What else?

Much of the code that I write on a day-to-day basis works with tree-based data structures.  Some of these structures can get very complicated and much of my unit test code needs to do asserts over a large part of a tree (after performing some complex operation).  As a for instance, consider an expression parser.  Such a parser would presumably build an AST for the expression it's given and return that AST for further processing.  ASTs for all but the most simple expressions can get very tedious to 'check' for validity when writing a parser.

I've recently begun using extension methods for my base node class to help with my unit testing.  I directly put the unit test 'asserts' into the extension methods, and these extension methods are in NO WAY suitable to exist in the library that is being tested (why on earth would I want to have all this extra junk in my library just to support unit tests).  As a matter of fact, my libraries even target .NET 3.0 (C# 2.0) rather than .NET 3.5, C# 3.0.  However, that doesn't stop me from being able to use extension methods in my unit testing code (which doesn't get deployed to my clients, so I don't require them to have 3.5, I just have to have it on my dev machine and build machine).

Here's a simple example of how some of my unit testing code looks:

[Test]
public void FormulaTests2()
{
    PrimaryLexer l = new PrimaryLexer();
    StringReaderAdapter sra = new StringReaderAdapter("a / (b + c)", 0);
    InforceScriptLexerFilter lf = new InforceScriptLexerFilter(sra, l);
    InforceScriptSemanticParser sp = new InforceScriptSemanticParser(lf);

    RootFormula rf = sp.Parse();

    // look for 'a' and '/'
    rf.Body.Is<BinaryOp>()
        .OperatorIs(InforceScriptTokenId.Slash)
        .Left.Is<InvokeOp>()
        .IdRef.Is<IdReference>()
        .NameIs("a");
    // look for 'b' and '+'
    rf.Body.Is<BinaryOp>()
        .Right.Is<BinaryOp>()
        .OperatorIs(InforceScriptTokenId.Plus)
        .Left.Is<InvokeOp>()
        .IdRef.Is<IdReference>()
        .NameIs("b");
    // look for 'c'
    rf.Body.Is<BinaryOp>()
        .Right.Is<BinaryOp>()
        .Right.Is<InvokeOp>()
        .IdRef.Is<IdReference>()
        .NameIs("c");
}

In order to make all this possible, I defined a few extension methods:

internal static class TreeAssertions
{
    public static T Is<T>(this ExpressionBase node) where T : ExpressionBase
    {
        Assert.IsInstanceOfType(typeof(T), node, "wrong node type");
        return (T)node;
    }

    public static IdReference NameIs(this IdReference node, string name)
    {
        Assert.AreEqual(name, node.Id.Name, "names don't match");
        return node;
    }

    public static BinaryOp OperatorIs(this BinaryOp node, InforceScriptTokenId op)
    {
        Assert.AreEqual(op, node.OperatorTokenId);
        return node;
    }
}

As you can see, the 'Is' test checks the type of a node, and then returns the node, so I can continue checking other things for the same node (assuming it 'passed' the check).  The same is true for the 'NameIs' and 'OperatorIs' checks.  This style of chained calls is generally referred to as a 'fluent interface'.  (I first thought of it as 'Literate Programming', but that term, for which the venerable D. Knuth is given the credit, describes something different: writing a program as a prose document with the code woven into it.)  However, in order to do this sort of thing in the past, I'd have needed to put all these methods on my base class for the tree nodes, something that would have absolutely been the 'wrong' thing to do (since this test code should not be part of the library-proper).
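If you want to play with the check-and-return pattern without my parser, here is a stripped-down sketch that compiles on its own. The 'Node', 'Name' and 'Binary' classes are hypothetical stand-ins for my real AST types, and plain exceptions stand in for the NUnit asserts.

```csharp
using System;

// Hypothetical stand-ins for the real AST node classes.
public class Node { }

public class Name : Node
{
    public string Text;
}

public class Binary : Node
{
    public char Op;
    public Node Left;
    public Node Right;
}

public static class NodeChecks
{
    // Check the runtime type, then hand the node back typed as T so the chain can continue.
    public static T Is<T>(this Node node) where T : Node
    {
        T typed = node as T;
        if (typed == null)
            throw new InvalidOperationException("wrong node type, expected " + typeof(T).Name);
        return typed;
    }

    // Check the operator and return the same node for further checks.
    public static Binary OpIs(this Binary node, char op)
    {
        if (node.Op != op)
            throw new InvalidOperationException("expected operator " + op);
        return node;
    }

    // Check the name and return the same node for further checks.
    public static Name TextIs(this Name node, string text)
    {
        if (node.Text != text)
            throw new InvalidOperationException("expected name " + text);
        return node;
    }
}
```

Because each check returns a typed node, the next call in the chain is resolved against that type, so a chain such as 'tree.Is<Binary>().OpIs('/').Left.Is<Name>().TextIs("a")' stops at the first check that fails.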

I can't wait to see what else I can find to use these methods for.  I've already found it to be an amazing benefit to my productivity and the readability of my tests.

Thursday, June 5, 2008

CI / TeamCity is Seriously COOL!

OK... I'll be the first to admit that I'm just getting into Agile processes and I'm still a bit skeptical.  At first, I wondered whether CI (Continuous Integration) builds were really worthwhile.  Now, I've got a TeamCity site & build agent up and going, and I'm totally SOLD!

Here are the benefits as I see them for our situation:

  1. We know almost immediately when someone broke the build (they know too!)
  2. We have better checkin quality now that people are tired of getting those 'compilation failed' emails.
  3. We always have a source to go to for a 'current' build - no need to get the sources and build on your own machine, or go ask a 'build master' to get you a build.
  4. We have other 'automation' points that we can hook into when we're ready to move on to bigger & better methods.

As an example of #4, I hope to soon have our NUnit tests running as part of an automated build.  I also think we can have automated installer builds going if we wanted to.  And, best of all, by virtue of TeamCity's ability to 'watch' our source control server for updates, and its ability to run any arbitrary command line, NAnt, or MSBuild (or many more) task in response to those updates, the sky is the limit!

I can't wait to get more 'good stuff' implemented on TeamCity.

Wednesday, June 4, 2008

The Day of Bugs

Ok, I can honestly say that today was one of the weirder days that I've had in a long while.  I don't know about others, but I can say with confidence that I've never personally identified a bug in Visual Studio in my career.  I've seen plenty of them mentioned by other folks, I've seen 'features' that I'd be inclined to call a bug (but could be interpreted either way), but I've never really found a bug myself.

Today, I found two.  I guess it's a case of 'when it rains, it pours'.  One of them was known long before I 'found' it, but obviously not known to me.  The other, I'm pretty confident, is still unknown to 'everyone'.

Bug #1 - Dynamic Version vs. BAML.

Ok...  So we've done a fair bit of playing with WPF on my project, and we've done some custom control development (user controls) in WPF for use in our application.  WPF is very convenient for being able to prototype and design the UX/UI of something without being bogged down by all the crap you have to do to customize WinForms (our users seem to never like the 'way it is').  I can say that I feel pretty comfortable that my skills with WPF, while not the best on the block, are probably up there along with most of the folks currently doing WPF development.  I've done a ton of data binding work, and feel pretty confident that I know most of the tricks there - especially thanks to the wonderful work of Bea Costa!

So, I was very puzzled when one of our developers started having issues with running our application's forms that use one of my user controls.  The user control was pretty simple - it was a list of names that had alternating highlights for the rows (one row was white, one was gray, etc.).  It did a few other things, but mostly that was the gist of it.  This is one of the simpler controls we have.  Anyway, the weird thing was that the 'bug' that we kept seeing only appeared when running the debug build of our application, and it only appeared when our UI was being used from the APL application.  It never appeared in release builds, and it never appeared when running debug mode in our UI test harness.

So, naturally, I looked first at the APL runtime, thinking it was a bad install on this dev's machine.  We then took his build and his APL workspace and ran it on my machine.  To my surprise, it crashed on my machine too.  Then, we tried running one of my builds on his machine - it worked (also to my surprise!).  So then, I concluded it was a problem with his machine.

Two days later, after he had gotten some other work done and managed to uninstall all of .NET 2.0 through 3.5, VS2005 and VS2008, and then reinstall all of them (carefully in order), he tried it again.  BOOM!  It still didn't work.  I brought him my old laptop, and I had IT set it up for him to be able to use it instead of his desktop, thinking we'd be rebuilding his desktop from scratch.  All the while, still being puzzled by the fact that the behavior ran around to different machines and environments and was so skittish.

Later that day, he came over to my desk and told me the problem started appearing on his release builds too.  I thought, "oh great - a viral bug!".  He then said that the problem also started appearing on my builds.  At this point, I thought - "ok, there's gotta be something else going on here".

The bad part about this bug was this - whenever you ran the application, it would look like it wanted to pop up an error dialog, in fact it would show the thread exception dialog (System.Windows.Forms.ThreadExceptionDialog) briefly (actually several of them on top of each other), but then the application would disappear before you could do anything.  Looking back, the problem was apparently on one of WPF's "special" threads, and APL doesn't react very nicely when threads other than the main UI thread in the .NET AppDomain throw exceptions.

Finally recognizing that I might be able to do something about this, I went into the code and added an exception handler with a plain-old message box in it (e.ToString()).  Looking at the exception text, I saw that it said something about a XAML parse error and that my ValueConverter couldn't be loaded (I had a ValueConverter as a static resource in my XAML for getting the backcolor brush for doing the highlighting).  This error message pointed me to Rob Relyea's blog post (along with several MSDN forums posts).  The only thing was that his post didn't apply completely to my issue.  But, the workarounds did.  It turned out that I was using AssemblyVersion("1.0.*") in my files (which I really like for our 'in dev' work), but it was causing problems.  It seems that the reason the bug was so 'fleeting' was that there must have been a timing issue on the fourth part of the version number (the revision), since it's based on a timestamp.
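For anyone curious how a timestamp ends up in the version: as far as I understand the documented behavior of the wildcard, the build part is the number of days since January 1, 2000 and the revision part is the number of seconds since local midnight, divided by 2. The sketch below decodes such a version back into a build time; 'AutoVersion' and 'ToBuildTime' are names I made up for this example.

```csharp
using System;

// Hypothetical helper, assuming the wildcard scheme described above.
public static class AutoVersion
{
    // Decode the build/revision parts that a "1.0.*" AssemblyVersion generates.
    public static DateTime ToBuildTime(Version v)
    {
        return new DateTime(2000, 1, 1)
            .AddDays(v.Build)          // build = days since 2000-01-01
            .AddSeconds(v.Revision * 2); // revision = seconds since midnight / 2
    }
}
```

You can feed it 'typeof(SomeType).Assembly.GetName().Version' to see when an assembly was compiled.  Under this scheme two compiles of identical source a few seconds apart carry different version numbers, which fits with the bug coming and going depending on build timing.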

Apparently, my computer is too fast (most of the time), so I didn't see this bug on my builds, just on my colleague's!  As I said, this bug has been known about for a long time, and while I'm not thrilled with the workaround, it's there and working, so I'll live with it.

BTW, I spent nearly 4 days chasing this bug, off and on (my colleague did most of the legwork).  It wasn't much fun, being that we couldn't get an error message for the first 3 of those.

Bug #2 - VS2008 STL vs. NUnit, C++/CLI, and ME!

So, I spent the last 4 days chasing what I thought must have been a bug in our code, only to figure out that I'm pretty sure it's a bug in MS's STL implementation (for Debug builds).  However, this bug is really hard to find (though it, at least, is VERY consistent - i.e. happens every time and is very reproducible).

First, some background on our design.  Our application consists of two major pieces, a UI, and a backend calculation engine.  The UI is written in .NET 3.0/3.5 (C#), and the backend is written in C++ (native).  However, these systems need to be able to share a common file format.  To meet this need, we developed a generic file library in native C++ code that can be used to read/write files for our system.  The files are basically like MS's structured storage, but with the features and interfaces that we desired for our system (along with a design oriented towards meeting our required performance characteristics).

All of our file structures are built upon this file library, along with several other native libraries that support it and some of the other 'shared' features.  We also have a generic 'key' library (this is a domain concept for us - you can think of it as a property bag with some special 'matching' features).

In order to support using these libraries from both C++ and C# code, we decided we'd write the implementations in native C++ using traditional object-oriented design principles (many of which were lifted from my C#/Java experiences), and then write a thin C++/CLI (the managed C++ language) wrapper around this library using the IJW (it just works) interop supported by C++/CLI.  Then, our C# clients would call into the C++/CLI managed library (not even knowing that it's implemented in C++) just as they would call into managed C# code, but be able to use the underlying data structures and implementation of the native libraries.  I think this design is pretty elegant, and we solved quite a few interesting issues when developing it.  We've been using it now for some time and it's working quite well.

So...  now, the bug.  Just recently we needed to add a new feature to the 'key' library.  This library is very simple and the feature was also quite simple.  I added it to the native code, added it in the managed C++/CLI library, and added the C# unit test code to exercise it.  By convention, we only run our unit tests in Release mode, unless we're debugging them, since we only want to take the time to test in one build environment and it makes most sense to test the bits that are going out the door...

Anyway, so I tested in release mode, the tests all passed, and I checked in.  I was happy and I went home for the day.  The next day, I happened to be compiling in Debug mode and I decided to run what I was working on.  I had forgotten that my startup project in VS2008 was set to the unit tests for the 'key' library, so the unit tests ran instead of what I was intending to test.  To my surprise, the unit tests for the 'key' library 'exploded' (they didn't fail, they caused an AccessViolationException that was caught by VS2008 and popped up in the debugger!).  The AV was showing up on the destructor call for one of our unmanaged C++ objects (native library).

To make matters worse, the bug only showed up when the finalizer was called for the class, and even weirder, only showed up when the variables were not deterministically destroyed (i.e. using IDisposable and 'using').  Since my unit test code wasn't using 'using' anywhere, I saw the 'explosion'.  But, I saw it only when the finalizer was called (much later than the offending code, obviously).  I invested some time writing code to track which object was the culprit, and after figuring out which (using 'value numbering', a technique I use a lot in single-threaded debugging of applications with lots of object instances and no unique identifiers in them), I followed the code.  It was not at all obvious why there was a problem.  In fact, I looked at it for several days and couldn't figure out what the problem was.
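For readers who haven't run into the two destruction paths before, here is a rough C# sketch of the idea. Our real wrapper is written in C++/CLI around a native object; 'KeyWrapper', its 'Id' counter and 'IsReleased' are hypothetical names used only to show the shape of it, including the 'value numbering' trick of stamping each instance with a sequence number.

```csharp
using System;
using System.Threading;

// Hypothetical managed wrapper around a native resource.
public class KeyWrapper : IDisposable
{
    private static int nextId;

    // 'Value numbering': a sequence number so one instance can be told apart in a debugger or log.
    public readonly int Id = Interlocked.Increment(ref nextId);

    public bool IsReleased { get; private set; }

    // Deterministic path: runs at the end of a 'using' block.
    public void Dispose()
    {
        Release();
        GC.SuppressFinalize(this);
    }

    // Non-deterministic path: runs on the finalizer thread at some later time,
    // and only if Dispose was never called.
    ~KeyWrapper()
    {
        Release();
    }

    private void Release()
    {
        if (IsReleased)
            return;
        // The native destructor call would go here.
        IsReleased = true;
    }
}
```

A test that wraps the object in 'using (KeyWrapper k = new KeyWrapper()) { ... }' releases it right at the closing brace, while a test that just lets it go out of scope leaves the release to the garbage collector, long after the code that caused the trouble has finished.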

I then posted here, and called my Arch. Evangelist at MS and talked with him about it, and still couldn't figure out what it was (the spoiler is the last post in the thread).  I finally had to resort to the 'commenting' technique to track down the bug.  I first commented out all unit tests and started adding them back in one by one.  Once I found which unit test failed, I commented out the entire test and started adding code back in block by block.  Once I found the offending block, I looked down into the C++ code and STILL couldn't find anything wrong with it.

At that point, I decided I needed to think outside the box.  I looked at what was different between the various method calls that worked and didn't work, and decided that the throwing of the exception might be a problem.  First, I removed all the code in the offending method.  Now, my test failed, but it didn't explode.  So, I put the code back, and looked at the exception more thoroughly - I decided to move it to the head of the function.  That also caused my test to fail, but it didn't explode.  So then I concentrated on the code before the exception (in the original function).  I kept saying to myself - there's NOTHING wrong with this code!  (you can see the code in the MSDN post).  Finally, I thought - "it has to be this code, so let's assume it's broken and figure out when and why".  I then restructured the code (as described in the MSDN post) and determined that it was absolutely something in the 'begin()' or 'end()' STL vector calls.  There's no way I messed that up - that's their code.  Voilà - bug #2.

For this one, I'm going to have to figure out how to submit it to MS.  I'm sure nobody's seen this one (at least as far as I can tell from searching).

Whew!  Looking forward (hopefully) to not having very many more of those days!