Thursday, February 28, 2013

How We Manage Development - Organizing TFS

In the first post, I presented an overview of how we have architected our development infrastructure to support multiple clients. In this article, we will look at how we organize TFS to support our projects.

Team Foundation Server contains a lot of features to support project management. We have chosen to keep Team Foundation Server a developer-only environment, and manage the customer-facing project management somewhere else using SharePoint. In fact, even our business consultants and project managers do not have access to, and rarely see, this magical piece of software we call TFS. (I bet you some are reading these articles to find out!)
There are several reasons for this, and these are choices we have made that I'm sure not everyone agrees with.
- Developers don't like to do bookkeeping. We want to make sure the system is efficient and only contains information that is relevant to be read or updated by developers.
- We need a filter between the development queue and the user/consultant audience. Issues can be logged elsewhere and investigated. AX is very data and parameter driven, so a big portion of perceived "bugs" end up being data- or setup-related issues. If anyone can log issues or bugs in TFS and assign them to a developer, the developers will quickly start ignoring the lists and email alerts because there are too many of them. TFS is for actual development work only; any support issues or investigative tasks for developers/technical consultants are outside of our TFS scope.
- Another reason to filter is project scope and budget. Being agile is hot these days, but there is a timeline and budget so someone needs to manage what is approved for the developers to work on. Only what is approved gets entered in TFS, so that TFS is the actual development queue. That works well, whether you go waterfall or agile.
- Finally, access to TFS means access to a lot of data. There are security features in TFS, of course, but they are nothing like XDS for example. This means check-in comments, bug descriptions, everything would be visible. There are probably ways to get this set up correctly, but for now it's an added bonus of not having to deal with that setup.

Each functional design gets its own requirement work item in TFS. Underneath, as child work items, we add the development "tasks". Initially, we just add one "initial development" task. We don't actually split out pieces of the FDD, although we easily could, I guess. When we get changes, bugs, basically any change after the initial development, we add another sub-task we can track. So, for example, the tasks could look like this in TFS:

This allows us to track any changes after the initial development/release of the code for the FDD. We can use this as a metric for how well we are doing as a development group (number of bugs), how many change requests we get (perhaps a good measure of the quality of the FDD writing), etc. It also allows us to make sure we don't lose track of loose ends in any development we do. When a change is agreed upon, the PM can track our TFS number for that change. If the PM doesn't have a tracking number, it means we never got the bug report.
It is typically the job of the project's development lead to enter and maintain these tasks, based on emails, status calls with PMs, etc. I will explain more about this in the next article.
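
To make the structure above a bit more concrete, here is a minimal sketch of one requirement per FDD with child tasks, and of the kind of count we use as a metric. This is plain Python for illustration only, not the TFS object model; the class names, the `kind` values and the sample task titles are all made up.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Task:
    title: str
    kind: str  # hypothetical categories: "initial", "bug", "change request"


@dataclass
class Requirement:
    fdd: str  # one requirement work item per functional design, e.g. "FDD03"
    tasks: list = field(default_factory=list)

    def add_task(self, title, kind):
        self.tasks.append(Task(title, kind))

    def follow_up_counts(self):
        """Count the tasks added after the initial development, per kind."""
        return Counter(t.kind for t in self.tasks if t.kind != "initial")


# Invented sample data: one initial task, then follow-up work on the same FDD.
req = Requirement("FDD03")
req.add_task("Initial development", "initial")
req.add_task("Field not defaulting on create", "bug")
req.add_task("Add extra filter to the form", "change request")
req.add_task("Wrong rounding on totals", "bug")

print(req.follow_up_counts())
```

Counting per kind is what lets the same list double as a bug metric and a change-request metric.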

On to the touchy topic of branching. There are numerous ways to do this, and all have benefits and drawbacks. If you want to know all about branching and the different strategies, you can check out the branching guide written by the Microsoft Visual Studio ALM Rangers. It is very comprehensive, but an interesting read nonetheless. There is no one way of doing this, and what we chose to adopt is loosely based on what the Rangers call the "code promotion" plan.

Depending on the client we may throw a TEST branch in between MAIN and UAT. We have played with this a lot and tried different strategies on new projects. Overall this one seems to work best, assuming you can simplify it this way. Having another TEST in between may be necessary depending on the complexity of the implementation (factors such as client developers doing work on their systems as well - needing a merge and integrated test environment, or a client that actually understands and likes the idea of a pure UAT - those are rare).
One could argue that UAT should always be moved as a whole into your production environment. And yes, that is how it SHOULD be, but few clients and consultants understand or want that. For that very reason, and due to the diversity of our clients' wants and needs, we decided on newer projects to make our branches a bit more generic, like this:

This allows us to be more consistent in TFS across projects, while allowing a client to apply these layers/models as they see fit. It also allows us to change strategies as far as different environment copies at a client, and which code to apply where. All without being stuck with defining names such as "UAT" or "PROD" in our branch names.

So, the only branch that is actually linked to Dynamics AX is the MAIN branch. Our shared AOS works against that branch, and that is it. Anything beyond that is ALL done in Source Control Explorer in Visual Studio. We move code between branches using Visual Studio. We do NOT make direct changes in any other branch but MAIN. Moving code through branch merging has a lot of benefits:

1) We can merge individual changes forward, such as FDD03, but leave FDD04 in MAIN so it is not moved. This requires developers to ALWAYS do check-ins per FDD, and never check in code for multiple FDDs at the same time. That is critical, but makes sense - you're working on a given task. When you go work on something else (done or not), you check in the code you're putting aside, and check out your next task.
2) By moving individual changes, TFS will automatically know any conflicts that arise. For example, FDD02 may add a field to a table, and the next FDD03 may also add a field to that same table. But both fields are in the same XPO in TFS - the table's XPO. Now, TFS knows which change you made when... so whether we want to only move FDD02, or only FDD03, TFS will know which code changes belong to which FDD. (more on this in the next article) This removes the messy, manual step of merging XPOs on import.
3) The code move in itself becomes a change as well. For clients requiring strict documentation for code moves (SOX for example), the TFS tracking can show that in the code move, no code was changed (i.e. proving that UAT and PROD are the same code). We also get time stamps of the code move, user IDs, etc. Perfect tracking, for free.
4) With the code move itself being a change set, we can also easily REVERT that change if needed.
5) It now becomes easy to see which pieces of code have NOT been moved yet. Just look for pending merges between branches.
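
The "pending merges" idea from the list above boils down to a set difference per pair of adjacent branches. Here is a minimal sketch of that, assuming a MAIN > TEST > RELEASE order and tracking only the original changeset IDs from MAIN (in reality each merge is its own changeset in the target branch, and TFS does this bookkeeping for you). The changeset numbers are invented.

```python
# Assumed branch order for this sketch; a given project may have more or fewer.
BRANCHES = ["MAIN", "TEST", "RELEASE"]


def pending_merges(merged, source, target):
    """Original changeset IDs that reached `source` but not yet `target`."""
    return sorted(merged[source] - merged[target])


# Which original MAIN changesets have reached each branch (invented data).
merged = {
    "MAIN": {101, 102, 103, 104},
    "TEST": {101, 102, 104},
    "RELEASE": {101},
}

# Walk each adjacent pair of branches and list what still has to move.
for source, target in zip(BRANCHES, BRANCHES[1:]):
    print(source, "->", target, pending_merges(merged, source, target))
```

Here changeset 103 was deliberately left behind in MAIN while 104 moved on, which is exactly the "move FDD03 but not FDD04" situation.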

So, I can look at my history of changes in MAIN, and inquire into the status of an individual change. In the first screenshot, you see a changeset that was merged all the way into RELEASE. My mouse is hovered over the second branch (test) and you can see details. The second screenshot shows a change that has not been moved into RELEASE yet.

Obviously you can also get lists of all pending changes etc. We actually have some custom reports in that area, but you can get that information out of the box as well.

So, for any of you that have played with this and branching, the question looming is: how can you merge all changes for one particular FDD when there are multiple people working on multiple FDDs? TFS does not have a feature to "branch by work item" (if you check in code associating it to a work item, one could in theory decide to move all changes associated with that work item, right?). Well, the easiest way we found is to make sure the check-in comment is ALWAYS prefixed with the FDD number. So, when I need to merge something from MAIN to TEST, I just pick all the changesets with that FDD in the title:

Yes, you may need several merges if there are a lot of developers doing a lot of changes for a lot of FDDs. However, it forces you to stay with it, and it actually gets better: if you merge 5 changesets for the same FDD at the same time, the next merge further down to the next branch will only be for the 1 merge-changeset (which contains all 5). Also consider the alternative of you moving the XPOs manually and having to strip out all the unrelated code during the import. After more than 2 years of using TFS heavily, I know what I prefer :-) (TFS, FYI).
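
We do this selection by eye in the merge dialog, but the convention is simple enough to express in a few lines. The following is only a sketch of the idea: the `Changeset` class, the exact comment format ("FDD" followed by digits at the start of the comment) and the sample history are assumptions for the example, not something TFS provides.

```python
import re
from dataclasses import dataclass


@dataclass
class Changeset:
    id: int
    comment: str


# Assumed convention: the comment starts with the FDD number, e.g. "FDD03: ...".
FDD_PREFIX = re.compile(r"^\s*(FDD\d+)\b", re.IGNORECASE)


def fdd_of(changeset):
    """Return the FDD number a changeset is tagged with, or None."""
    match = FDD_PREFIX.match(changeset.comment)
    return match.group(1).upper() if match else None


def changesets_for(history, fdd):
    """IDs of all changesets whose comment is prefixed with the given FDD."""
    return [cs.id for cs in history if fdd_of(cs) == fdd.upper()]


history = [
    Changeset(201, "FDD03: add customer field"),
    Changeset(202, "FDD04: new report"),
    Changeset(203, "FDD03 - fix label"),
    Changeset(204, "cleanup"),
]

print(changesets_for(history, "FDD03"))
# Changesets without a prefix cannot be attributed to an FDD at all.
print([cs.id for cs in history if fdd_of(cs) is None])
```

The second list is the reason the prefix has to be there ALWAYS: anything untagged cannot be picked for a merge by FDD.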

Once code is merged, all that needs to happen is a right-click by the lead developer to start a new build - no further action required. This will spin up a TFS build using our workflow steps, which imports all the code, compiles, and produces the layer (AX 2009) or model (AX 2012). More details on this in the next article.

Once the build is completed (successfully or not), we get an alert via email. If the build was successful, the model or layer will be on a shared folder - ready to be shipped to the client for installation, with certainty that it compiles. We actually have a custom build report we can then run which, depending on the client's needs, will show which FDDs and which tasks/bugs were fixed in this release, potentially a list of the changed objects in the build, etc. Without the custom report, here's what standard TFS will give you:

As you can tell by the run time (almost 3 hours), this is AX 2012 :-)
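
The shape of the build described above (run the steps in order, stop at the first failure, always send an alert) can be sketched as follows. This is not our actual TFS build workflow; the step names, the `run_build` function and the failing compile step are hypothetical stand-ins.

```python
def run_build(steps, notify):
    """Run steps in order, stop at the first failure, always send one alert."""
    completed = []
    for name, step in steps:
        try:
            step()
        except Exception as exc:
            notify(f"Build FAILED at '{name}': {exc}")
            return False, completed
        completed.append(name)
    notify("Build succeeded: " + ", ".join(completed))
    return True, completed


def failing_compile():
    # Stand-in for a compile step that finds errors.
    raise RuntimeError("compile errors found")


messages = []  # collects the alerts instead of sending email
steps = [
    ("import code", lambda: None),
    ("compile", failing_compile),
    ("export model", lambda: None),
]

ok, done = run_build(steps, messages.append)
print(ok, done, messages)
```

The point of stopping early is that nothing gets exported to the shared folder unless every earlier step succeeded.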

This has become a lengthy post, but hopefully interesting. Next article, I will show you some specific examples (including some conflicts in merging etc) - call it a "day in the life of a Sikich AX developer"...

Wednesday, February 27, 2013

How We Manage Development - Architecture

With all the posts about TFS and details on builds etc, I thought it was time to take a step back and explain how we manage development in general. What infrastructure do we have, how does this whole TFS business work in reality?! Let me take you through it.

First, let's talk about our architecture. We currently have two Hyper-V servers. These host a development server per client. These development servers contain almost everything except the AX database; we use a physical SQL server for that. So the development servers contain AOS, SSRS/SSAS, SharePoint/EP, Help Server and IIS/AIF (where needed). Each development server is tied to a specific project in TFS for that client.
In addition to the main AOS that is used to develop in (yes we still do the "multiple developers - one AOS" scenario), there is a secondary AOS on each machine that is exclusively used for the builds. Typically developers do not have access to this AOS or at least never go in (in fact the AOS is usually shut off when it's not building, to save resources on the virtual machine).

There are several advantages to having a virtual development server per client:
- Each server can have whichever version of AX kernel we need (potentially different versions of SQL etc)
- If clients wish to perform upgrades we can do that without potentially interfering with any other environments
- Any necessary third-party software can be installed without potentially interfering with any other environments
- The virtual servers can be shut down or archived when no development is being done

I'm sure there is more, but those are the obvious ones.

Traditionally, VARs perform the customer development on a development server at the customer site. We used to do this as well (back in the good ol' days). There are several drawbacks to that:
- Obviously it requires an extra environment that, if the client is not doing their own development, may solely be for the consultant-developers. With AX 2012 resource requirements per AOS server have skyrocketed and some clients may not have the resources or budget to support another environment.
- We need source control. This is the 21st century, and developing a business-critical application like AX without some form of source control is irresponsible. MorphX source control can be easily used (although it has limited capabilities), but any other source control software (that integrates with AX) requires another license of sorts, and in the case of TFS you also need a SQL database, etc. Especially for clients not engaging in development efforts, this is a cost solely to support the consultant-developers.
- Segregation of code and responsibilities between VAR and clients that do their own development. Developing in multiple layers is the way to do that. But working with multiple developers in multiple layers (or multiple models) with version control enabled is a nightmare.
- No easy adding of more resources (=developers) for development at critical times, since remote developers need remote access, user accounts, VPNs, etc. Definitely doable, but not practical.

As with most things development in the Dynamics AX world, any new customer has no issue with this and finds it logical and accepts the workflow of remote development, deploying through builds, etc. It is always customers that have implemented AX the traditional way (customers upgrading, etc) that have a hard time changing. It is interesting how this conflicts with a lot of VARs switching to off-shore development, essentially the ultimate remote development scenario. At our AX practice within Sikich we only use internal developers that work out of our Denver office. (Want to join us in Denver?)

We don't have all of our clients up on this model, but all new clients are definitely set up this way. As of this writing, we have 11 AX2009 client environments and 9 AX2012 client environments running on our Hyper-Vs and tied into and managed by TFS.

With that out of the way, we'll look at our TFS setup in the next post.

Wednesday, February 6, 2013

AX 2012's Hidden Compile Errors

This post title caught your attention, huh? Well, it should. For about half a year or longer now, we've been struggling with a strange issue with AX 2012 compiles. We finally found more or less what the issue was, but were unable to reliably replicate it. Finally in October, one of our developers here at Sikich took the time to reproduce an existing issue in a vanilla AX environment, reliably enough for Microsoft to investigate. And I'm glad to say that we have now officially received a kernel hotfix.

For future reference, this issue pertains to all current versions of AX 2012 as of this date (February 6th, 2013). This means:

AX 2012 RTM
AX 2012 CU1
AX 2012 CU2
AX 2012 CU3
AX 2012 CU4
AX 2012 FPK
AX 2012 FPK CU3
AX 2012 FPK CU4
AX 2012 R2

Obviously if you received kernel hotfixes for any of these versions, you may have received this kernel hotfix as well. I don't have build numbers for all of these yet.

Ok, so on to the issue. As you are probably aware as a reader of this blog, we heavily rely on automated builds using TFS. We started noticing builds completing successfully without error, but once installed at a client they resulted in compile errors. Depending on the type of error this may or may NOT result in a CIL generation error as well.
We found out that indeed there were compile errors, but they were never captured in our automated build. Turns out that it is not our build process, but AX that is not reporting the compile errors.

Let me provide you with the example we gave Microsoft. Note this is not a direct CODE example per se, but a compile error nonetheless. There are plenty of other opportunities for compile errors not being thrown; this is just one easy and quick example. Things we saw were missing methods, missing variable declarations, all sorts of things that SHOULD result in a compile error but didn't. Note that an individual compile on only 1 object always correctly results in an error. This issue applies to full compiles or multiple-object compiles (full AOT compile, compiling an AOT project, etc).

Let's start by creating a new project for a service we'll be creating. I called mine "DMTestService". Next, we'll create a service. We create a new class called "TestServiceClass", and add a method "MyOperation". I'm not going to put any code in it since we're just interested in the compile error and not in it actually doing anything.

public void MyOperation()
{
}

Next, we create a new service (right-click the project and select New > Service or go to the Services node in the AOT, right-click it and select "New Service"). We will call this service "TestService". Right-click the new TestService and select "Properties". In the properties find the "Class" property, and enter the name of the class we created previously (in my case "TestServiceClass").

Next, right-click on the "Operations" node under your service, and select "Add Operation".

On the Add Service Operation dialog, click the Add checkbox for the MyOperation method, and click OK. This will add the operation to the service.

Alright, good enough. So now if you right-click compile the service, no errors. Now let's break this! Open the MyOperation method on the TestServiceClass and make the method static.

public static void MyOperation()
{
}

This will make the operation on your service invalid, as there is no longer an instance method called "MyOperation". And indeed, if you right-click on the TestService and select "Compile", you get this compile error:

And now what we've all been (not so much) waiting for... Right-click your PROJECT and select compile... The error goes away... no errors in the project!!
If you take an XPO of this and import somewhere else, you'll also notice the XPO import does not throw a compile error. If you do a full AOT compile, no error is thrown.

For those of you questioning if this really is a compile error: it's not necessarily about the TYPE of compile error. The problem is that errors are reported in the compiler output but cleared out later. The issue has to do with the number of compile passes AX does.
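
To picture the symptom, here is a toy model of a multi-pass compile where a later pass wipes the error list built by an earlier one. To be clear, this is NOT how the AX compiler is implemented (I don't have that information); the function, the assumption that the error is only detected in the first pass, and the error text are all invented purely to illustrate "reported, then cleared".

```python
def compile_objects(objects, passes, keep_errors_between_passes):
    """Toy compile: returns the errors left in the log after all passes."""
    errors = []
    for pass_no in range(passes):
        if not keep_errors_between_passes:
            errors = []  # the flaw being illustrated: earlier findings are lost
        if pass_no == 0:
            # Assumption for this toy: errors are only detected in the first pass.
            for name, problems in objects.items():
                errors.extend(f"{name}: {p}" for p in problems)
    return errors


# Invented error text for the broken service operation from the example.
objects = {
    "TestServiceClass": [],
    "TestService": ["operation MyOperation is not an instance method"],
}

# One pass (like compiling a single object): the error is reported.
print(compile_objects(objects, passes=1, keep_errors_between_passes=False))
# Several passes with the log reset each time: the error silently disappears.
print(compile_objects(objects, passes=2, keep_errors_between_passes=False))
# Several passes with the log preserved: the error survives.
print(compile_objects(objects, passes=2, keep_errors_between_passes=True))
```

The middle case is the one that bit us: the build log ends up clean even though an error was found along the way.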

Thanks to the members of my development team for figuring out an easy way to reproduce the issue so we could report it to Microsoft!

If you wish to get the hotfix from Microsoft support, the KB number is 2797772. It is listed on the "hotfixes" page on PartnerSource, if you have access to that (no download link, you'll have to request it from Microsoft).