Bulk file import into ClearCase

If you have used a good source control system such as Subversion or Perforce, you will know that adding a whole pile of files (as you would at the beginning of a project) is a trivial task. In the case of Subversion it's just two commands, svn add and svn commit, and you are done. If however you have the misfortune of having to use IBM Rational ClearCase, you are a little out of luck.

So before you go and start adding the two hundred files one by one using cleartool mkelem, take a look at this option:

clearfsimport -nsetevent -recurse c:\tmp\srcfolder\new c:\views\destination_view_folder\srcfolder

The catch is that the files must initially live somewhere outside your view, which is unlike other source control systems. With clearfsimport you have the files elsewhere and then 'import' them into your view. Clearfsimport also supports a '-preview' switch to tell you what it plans to do without actually doing it.
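For example, this reports the planned actions for the same paths as above without changing anything:

clearfsimport -preview -nsetevent -recurse c:\tmp\srcfolder\new c:\views\destination_view_folder\srcfolder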

Now that you can do this easily, you may want to consider moving dependencies into source control. So if you depend on a particular version of some control or tool, add it to source control and wire your system to use the controlled item. Managing dependencies in this manner can alleviate all sorts of problems and gives you an extra level of control over your development environment.

T41 VGA out on Xubuntu

Video output to an external monitor is easy, right? Well, it was on Windows XP, but with Xubuntu (8.04 Hardy Heron) running on an IBM ThinkPad T41 it turned out to be a little trickier than hitting Function-F7.

First of all, hitting Function-F7 doesn't do anything, even though all the other function buttons (keyboard light, screen brightness, etc.) work fine. If you leave the external monitor plugged in and reboot, you should see the port come to life. Next you run your favourite video player (like VLC) and notice that the video only appears on the laptop's LCD display, and not on the external monitor. To solve this simply:

  1. sudo apt-get install xvattr
  2. xvattr -a XV_CRTC -v 1

Done. That should switch the video output to the VGA port. To switch back use this:

  1. xvattr -a XV_CRTC -v -1

If you haven't had a chance to look at Xubuntu, you should check it out. It's essentially Ubuntu but with the Xfce desktop environment. Much smaller and faster than GNOME and KDE. Great for old hardware (like my T41).

Ruby on Rails

So have you played with Ruby or Rails yet? No? Well, you might want to check them out. The Ruby language is very interesting, and Rails, which is built with Ruby, is a very powerful web application development framework.

To get you started take a look at these sites:
http://www.rubyonrails.org/ – All about Rails.
http://www.ruby-lang.org/en/ – All about Ruby.

Then you should check out the fifteen-minute hands-on tutorial here. It is a great overview of the language and will get you started. If you are interested and want to get an environment set up, there are instructions on rubyonrails.org for various platforms. The only additional tool I would recommend is RadRails from Aptana, especially if you are already familiar with Eclipse.

Ruby is a dynamic, duck-typed programming language. First, take a look at this:

class Dummy
end

dummyA = Dummy.new
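# define_method is private, so we invoke it via send
# to add a method to the class at runtime.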
Dummy.send(:define_method, :hi) do
  print "Hello world!\n"
end

dummyA.hi

~/ruby$ ruby dynamic.rb
Hello world!

Here we have dynamically defined a method on Dummy called hi. Pretty straightforward. Now combine this with method_missing, which is called when… you guessed it… you call a method that doesn't exist.

class Dummy
  def method_missing(m, *args)
    puts("There's no method called #{m} here please try again.")
  end
end
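
Continuing the same script (let's say it is saved as missing.rb), a call to any undefined method is now intercepted:

dummyA.bye

~/ruby$ ruby missing.rb
There's no method called bye here, please try again.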

So now we have a way of creating new methods dynamically, plus a way of intercepting calls to methods that don't exist and seeing exactly what the user tried to call. So when the user does something like this:

User.find_by_name_and_email('someone', 'someone@test.com')

You can generate the find_by_name_and_email method on the fly: go look at the database, find a user table, check whether it has columns called name and email, and then run the query. This ability gives Rails incredible power to make life easier for the developer, as I am sure you can appreciate. A rough sketch of the idea follows.
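
Here is a minimal sketch of how such a dynamic finder could work. To be clear, this is a hypothetical illustration, not Rails' actual implementation; the RECORDS constant stands in for a database table:

class User
  # Stand-in for the rows of a users table.
  RECORDS = [
    { name: 'someone', email: 'someone@test.com' },
    { name: 'other',   email: 'other@test.com' }
  ]

  def self.method_missing(m, *args)
    if m.to_s =~ /\Afind_by_(.+)\z/
      # 'name_and_email' becomes [:name, :email]
      columns = Regexp.last_match(1).split('_and_').map(&:to_sym)
      # Match each named column against the corresponding argument.
      RECORDS.find { |row| columns.zip(args).all? { |col, val| row[col] == val } }
    else
      super
    end
  end

  def self.respond_to_missing?(m, include_private = false)
    m.to_s.start_with?('find_by_') || super
  end
end

p User.find_by_name_and_email('someone', 'someone@test.com')  # => the matching record hash

A real implementation would also define the method after the first call (with define_method, as shown earlier) so that subsequent calls skip method_missing entirely.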

Transparent build systems

Whenever you deal with a software project, you deal with a build system: the method and process by which you turn your code into a product. This may involve many steps, such as pre-processing, compiling, linking, moving files around, and signing, among others. The trouble is that on some projects this system becomes an afterthought rather than a purposeful choice. Today I want to talk briefly about the benefits of a good build system.

So why think seriously about a build system? Why not fire up Visual Studio, hit 'New Project', and just start coding? The simple answer is that you will eventually get stuck, and the last thing you want to be stuck with is a broken, limiting build system that is difficult to decipher and extend. A good system will give you:

  • A deterministic repeatable build. This will give you confidence that when you create a build it will actually work, the same way as on other machines, every time. No weird oddities or unexplainable results.
  • Complete transparency over the build, including every step, action, process, the parameters and files involved, the dependencies, and the controls by which to alter any of these.

"Why is this good? NMake and command lines are painful, should we not have more abstraction?" you may ask. To a degree, yes, modern IDEs and some build tools make things easier, however they are generally designed to handle the common case, not anything complex. For example if your build involves building with different versions of a framework (such as a product targetting v1.1 and v2.0 of the .Net Framework), requiring input and output files in specific folders (seperating source files from generated ones is very helpful for both source control, tracking, and cleaning purposes), or being able to reproduce build steps individually for debugging.

So what should you do? It's easy: think about your requirements and the complexity of your project (both now and in the future), and ensure that you have enough transparency in the build system so that you can get what you need done, as well as the ability to view/fix/extend it when necessary. In particular make sure:

  • Build settings are stored somewhere easy to view.
  • You have full control over where files are written to. The last thing you want is to have random temporary files generated all over your source tree (that you then, accidentally, check in to the depot).
  • The build is deterministic. Deleting the output and rebuilding gives you the exact same thing every time.
  • You have visibility into each step of the build, so that when something goes wrong you can see what caused it.
  • You have control over the tools and dependencies used.

Basically, get as much transparency over the process as you can. Other than that, don't be afraid of make/nmake/rake (or whichever dependency-driven build utility applies to your project) and a command line. They work great!
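
To illustrate a few of the points above, here is a minimal, hypothetical Makefile (the file names and layout are made up) that keeps every generated file under a separate build directory and leaves each step visible and individually reproducible:

# Hypothetical layout: sources in src/, every output under build/.
# (Recipe lines must be indented with tabs.)
OUTDIR := build
SRCS   := src/main.c src/util.c
OBJS   := $(SRCS:src/%.c=$(OUTDIR)/%.o)
CFLAGS := -Wall -O2

all: $(OUTDIR)/app

# The link step can be reproduced on its own: make build/app
$(OUTDIR)/app: $(OBJS)
	cc $(CFLAGS) -o $@ $^

# Each compile step can be reproduced individually: make build/main.o
$(OUTDIR)/%.o: src/%.c | $(OUTDIR)
	cc $(CFLAGS) -c -o $@ $<

$(OUTDIR):
	mkdir -p $(OUTDIR)

# Deterministic: delete the outputs and rebuild from scratch.
clean:
	rm -rf $(OUTDIR)

No temporary files ever land in the source tree, and 'make clean' followed by 'make' exercises the determinism requirement directly.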

Oleview hangs on Vista

Those who have upgraded to Vista may have many issues to discuss; however, today I just want to mention one in particular: type libraries, and specifically how to view them.

As you may know, a typelib is simply a file containing type information: data types, interfaces, object classes, member functions, and so on. All the things you need to know about to use an OLE object. Once compiled into their binary form (.tlb files) these are a little hard to read, so it's nice to be able to transform a type library back into Interface Definition Language (IDL).

There are of course many such tools; however, Oleview.exe, which comes with Visual Studio, does this for you (File->View TypeLib). This is very handy indeed. However, it seems there is an issue with running Oleview from an elevated command prompt in Vista… it hangs. There are a few suggestions on the web, but the one that worked for me came from Shri Borde: simply don't run Oleview from an elevated command prompt. I'd be curious to find out why this is the case.

The D programming language

Is there really another single-letter programming language? Yes, it's called D. The D programming language comes from Walter Bright and Digital Mars. It's been around for a while, but with the recent v1.0 release it's probably a good time to take a look.

D is a systems programming language focused on taking the performance and power elements from C/C++ and combining them with the productivity of languages such as Ruby and Python. D compiles directly to native code and looks very much like C/C++. There is a good comparison of the language here.

If you want to get started, the first thing is to download the compiler. Then write a little sample code:

import std.string;   // for toStringz

int main(char[][] args)
{
    printf("hello world\n");
    printf("args.length = %d\n", args.length);
    for (int i = 0; i < args.length; i++)
        // D arrays carry a length and are not null-terminated,
        // so convert before handing them to C's printf.
        printf("args[%d] = '%s'\n", i, toStringz(args[i]));
    return 0;
}

It looks very similar to C on first inspection, with a few twists. In addition D provides a garbage collector, contract programming, resizeable arrays, built-in strings, an inline assembler, and much more.

The built-in strings are particularly useful. D introduces a new binary operator, '~', for concatenation.

const char[5] abc = "world";
char[] str = "hello" ~ abc;

With contract programming D gives you not only asserts, but also pre- and post-contracts to validate function behaviour:

in
{
  ...contract preconditions...
}
out(result)
{
  ...contract postconditions...
}
body
{
  ...code...
}
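
To make that concrete, here is a small illustrative example (the function and its contracts are my own hypothetical code) computing an integer square root:

import std.stdio;

ulong isqrt(ulong x)
in
{
    // Precondition: keep x small enough that the postcondition
    // check below cannot overflow.
    assert(x < 1UL << 62);
}
out (result)
{
    // Postcondition: result is the largest integer whose square is <= x.
    assert(result * result <= x && (result + 1) * (result + 1) > x);
}
body
{
    ulong r = 0;
    while ((r + 1) * (r + 1) <= x)
        r++;
    return r;
}

void main()
{
    writefln("isqrt(10) = %d", isqrt(10));  // prints 3
}

The contracts are checked in debug builds and compiled away with -release, so the validation costs nothing in production.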

So contract programming, built-in strings, resizable arrays… these things are not terribly new. They can often be implemented with some disciplined programming and perhaps some extra C++ modules. So does D bring something serious to the table? I vote yes. Just as C# and the .NET Framework enable developers to code at a higher level and be more productive (in certain applications, of course), D could provide similar benefits to the lower-level systems programming area where C/C++ still dominate.

A new look with Drupal 5.0

You may have noticed a change on ArminSadeghi.com. Last night the site was upgraded to use the new Drupal 5.0. Drupal is the content management software I use to run the site. Along with the upgrade there is a whole new theme in place providing a fresh look.

The upgrade went smoothly and I had the site fully back up within an hour. If you are in a similar position I would recommend upgrading. There is a good video tour of the new features here, and upgrading info here. The most important step is, of course, to back up your site first.

Windows Live Messenger Add-Ins

When software applications do what they are advertised to do, users are (generally) happy, but what if you want to do more? That's when you start looking for some method of extending the application. Not every application allows this, but Windows Live Messenger (WLM) is one that does (at least currently, with version 8.0). How, you ask? By writing your very own Add-In with the Messenger Add-In API…

WLM only offers this one API (Application Programming Interface), so it's not like Microsoft Excel where you have a vast array to pick from (.NET APIs, C APIs, COM Automation, VBA, and XLM). You do get to pick any language that supports .NET, and then proceed with the following:

  1. Download and install the .NET 2.0 SDK. This is the minimum requirement. You can also use Visual Studio .NET 2005, or Visual Studio Express (which is a free download), both of which come with the .NET 2.0 SDK.
  2. Write your Add-In. Compile it and get the DLL. Full instructions, along with the special reg key to set and what to call your DLL to make things work, are on MSDN.
  3. Add the Add-In to Windows Live Messenger.

The code is simple enough. Below is an example of an Add-In that updates your personal status message every few seconds to show how long you have been online.

using System;
using System.Timers;
using Microsoft.Messenger;

namespace WLM
{
  public class CounterAddIn : IMessengerAddIn
  {
    static MessengerClient m_client;
    static int m_cSeconds;
    Timer m_timer;

    void IMessengerAddIn.Initialize(MessengerClient client)
    {
      m_client = client;
      m_client.AddInProperties.FriendlyName = "CounterAddIn";
      m_cSeconds = 0;

      // Fire every 3 seconds to refresh the status message.
      m_timer = new Timer(3000);
      m_timer.Elapsed += new ElapsedEventHandler(OnTimedEvent);
      m_timer.Enabled = true;
    }

    private static void OnTimedEvent(object source, ElapsedEventArgs e)
    {
      m_cSeconds += 3;
      m_client.AddInProperties.PersonalStatusMessage =
        "Online for " + m_cSeconds.ToString() + " seconds.";
    }
  }
}

You can also do things such as change your personal image or status, or even respond to users' messages with automated replies (although users always know when an Add-In is running).

One important step is setting the magic registry key so that Messenger lets you add your own custom Add-In. When the key is set correctly, Messenger provides controls on its Options page for Add-In support.

Key: HKEY_CURRENT_USER\Software\Microsoft\MSNMessenger\AddInFeatureEnabled

Value: DWORD set to "1" to enable, and "0" to disable Add-Ins.
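
If you prefer not to edit the registry by hand, a small .reg file using exactly the key and value above does the trick:

Windows Registry Editor Version 5.00

[HKEY_CURRENT_USER\Software\Microsoft\MSNMessenger]
"AddInFeatureEnabled"=dword:00000001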

Manual vs automatic memory management

Good memory management is essential for writing software applications that perform well. If an application takes too long to start, or frustrates you while it completes operations, it doesn't make for a good experience. There are many factors to consider when dealing with performance, such as response time, working set, and hardware requirements. However, memory management is a key ingredient, and deciding between manual and automatic systems can make a big difference.

This is such a large topic. Where should I start? …

Let's start with some definitions. Manual memory management is when the programmer explicitly controls the lifetime of allocated memory, allocating and freeing it in a deterministic fashion. Automatic memory management, by contrast, tries to determine which memory is no longer used and frees it automatically, instead of relying on the programmer to identify it. Automatic memory management is sometimes referred to as Garbage Collection (GC); however, "garbage" could be defined as anything, so the term is a little vague. GC usually refers to tracing garbage collection, which is just one form of automatic memory management. Reference counting is an alternative automatic method (when you Release an object it isn't necessarily freed, it all depends on the reference count, so as a consumer you do not control memory deallocation). The choice here is mostly independent of programming language. There are languages that support manual management (such as C and C++), others that support automatic management with tracing GCs (such as Java and C#), and others still that support both (like D).

So which one is better? Well, the truth is that it all depends. There are many pros and cons to each method (discussed at length on Wikipedia). In the end you have to pick the solution based on your specific requirements. However, today let's talk about performance in particular. If you have some crazy high-performance requirements (perhaps a real-time application), what do you do? … you get more control.

By using manual memory management you gain control over exactly when memory is allocated and deallocated, giving you, the developer, more say in how to deal with it. You can then be mindful of such things as memory locality, consumption in tight loops, and memory reuse, while avoiding non-deterministic deallocation (tracing garbage collectors). You can still have enough control with automatic memory management if you stick with ref counting as a means of controlling memory/object lifetime, as the sketch below shows. However, there is a cost to be paid for these advantages, mostly in development difficulty: the more control you have, the more likely you are to make mistakes (mistakes here lead to memory leaks, double frees, and use-after-free bugs). And mistakes are bugs… some bad, some really bad.
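
As a small illustration, here is a hypothetical C++ sketch contrasting deterministic deallocation, both fully manual and via reference counting (std::shared_ptr), with what a tracing collector would do:

#include <cstdio>
#include <memory>

struct Buffer
{
    Buffer()  { std::puts("allocated"); }
    ~Buffer() { std::puts("freed"); }
};

int main()
{
    // Manual: the programmer decides exactly when memory is freed.
    Buffer* raw = new Buffer();
    delete raw;                         // deterministic: freed right here

    // Reference counting: freed deterministically when the last owner lets go.
    std::shared_ptr<Buffer> a = std::make_shared<Buffer>();
    {
        std::shared_ptr<Buffer> b = a;  // ref count is now 2
    }                                   // b gone, count back to 1, nothing freed
    a.reset();                          // count hits 0, freed right here

    // A tracing GC, by contrast, would reclaim the object at some
    // unspecified later point during a collection pass.
    return 0;
}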

COM aggregation and ref counting woes

Why are we talking about the Component Object Model (COM)? Isn't that old, dead technology? Well… no. There are still so many COM objects in use today, in so many projects, that you will run into them sooner or later. As a software engineer you might even have to resolve bugs in these components. Today I want to draw attention to ref counting bugs that can creep in when using aggregation within these objects.

COM objects use reference counting to control their lifetime. This is achieved through the implementation of the IUnknown interface by each and every object. This interface contains (quick – what are the first three v-table entries?) IUnknown::QueryInterface, IUnknown::AddRef, and IUnknown::Release. After you add a reference to an object with AddRef, you are expected to call Release when you are finished. Reference counting bugs crop up in object clients when someone forgets this rule, and they are usually a real pain to find. The difficulty is compounded even further if the object itself messes up its implementation of AddRef.

The implementation of AddRef is typically simple (just increment an internal counter); however, when the object aggregates other objects in its internal implementation (another form of object reuse, as opposed to containment), it becomes more complex. In these cases you have to ensure that AddRef and Release calls operate on the correct object.

Let's take the example of a hypothetical AIRPLANE object. An AIRPLANE is implemented by aggregating WING and ENGINE objects. WING implements IWing and IUnknown; ENGINE implements IEngine and IUnknown. Now, clients of the AIRPLANE object would expect IWing::AddRef and IEngine::AddRef to both control the lifetime of the outer object (the component/object doing the reusing – in this case AIRPLANE). This is fair and reasonable, but the only way it can happen is if the inner objects (WING and ENGINE) are aware of the outer object. So when WING and ENGINE are created, AIRPLANE passes a pointer to its IUnknown implementation (called the controlling unknown) down to the inner objects. If they support aggregation, they will use this pointer to handle any IUnknown calls that come in through IWing and IEngine. If they do not support aggregation, they will return CLASS_E_NOAGGREGATION and fail creation. However, ENGINE and WING must not delegate to the controlling unknown for AddRef and Release calls that come in through IUnknown itself, as this is what the outer object uses to control their lifetime.
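
Here is a hedged C++ sketch of that plumbing for the WING half. It is illustrative only: IWing, its IID, and the class layout are hypothetical stand-ins, and real code would add error handling and thread-safety considerations.

#include <windows.h>

struct IWing : public IUnknown
{
    // Wing-specific methods would go here.
};

// Hypothetical IID for the illustrative IWing interface.
static const IID IID_IWing =
{ 0x6d39a67e, 0x1111, 0x2222, { 0x33, 0x44, 0x55, 0x66, 0x77, 0x88, 0x99, 0xaa } };

class Wing : public IWing
{
    LONG      m_cRef;    // the real reference count
    IUnknown* m_pOuter;  // the controlling unknown (AIRPLANE), not AddRef'd

    // Non-delegating IUnknown: handed back to AIRPLANE at creation time.
    // This is the interface AIRPLANE uses to control WING's actual
    // lifetime, so it must never delegate.
    class NonDelegating : public IUnknown
    {
        friend class Wing;
        Wing* m_pWing;
    public:
        STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
        {
            if (riid == IID_IUnknown)   *ppv = static_cast<IUnknown*>(this);
            else if (riid == IID_IWing) *ppv = static_cast<IWing*>(m_pWing);
            else { *ppv = 0; return E_NOINTERFACE; }
            reinterpret_cast<IUnknown*>(*ppv)->AddRef();
            return S_OK;
        }
        STDMETHODIMP_(ULONG) AddRef()  { return InterlockedIncrement(&m_pWing->m_cRef); }
        STDMETHODIMP_(ULONG) Release()
        {
            ULONG c = InterlockedDecrement(&m_pWing->m_cRef);
            if (c == 0) delete m_pWing;
            return c;
        }
    } m_inner;

public:
    explicit Wing(IUnknown* pOuter) : m_cRef(1)
    {
        m_inner.m_pWing = this;
        // When aggregated, IUnknown calls arriving through IWing are routed
        // to the controlling unknown; standalone, they go to our own inner one.
        m_pOuter = pOuter ? pOuter : &m_inner;
    }

    // Delegating IUnknown, seen by clients through IWing: AddRef and Release
    // here manage the lifetime of AIRPLANE, exactly as clients expect.
    STDMETHODIMP QueryInterface(REFIID riid, void** ppv)
    { return m_pOuter->QueryInterface(riid, ppv); }
    STDMETHODIMP_(ULONG) AddRef()  { return m_pOuter->AddRef(); }
    STDMETHODIMP_(ULONG) Release() { return m_pOuter->Release(); }
};

ENGINE would follow exactly the same pattern for IEngine.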

These are just the basic rules of aggregation, which, when applied, ensure that object lifetime is still managed correctly. Common problems arise when the inner objects forget to delegate AddRef and Release calls to the controlling unknown, or delegate them in the wrong case (i.e. when called through their own non-delegating IUnknown::AddRef). In these cases the client of AIRPLANE may see what appears to be a ref counting bug on their side, but it is actually an internal issue with the aggregation.

… no wonder people like managed code and .NET. Ref counting bugs can get tricky.