References and Error Handling

MCOP references are one of the most central concepts in MCOP programming. This section describes how references are used, and pays particular attention to failure cases (server crashes).

Basic properties of references

  • An MCOP reference is not an object, but a reference to an object: Even though the following declaration

       Arts::Synth_PLAY p;
    
    looks like a definition of an object, it only declares a reference to an object. As a C++ programmer, you might think of it as Synth_PLAY *, a kind of pointer to a Synth_PLAY object. In particular, this means that p can be the equivalent of a NULL pointer.

  • You can create a NULL reference by assigning it explicitly

       Arts::Synth_PLAY p = Arts::Synth_PLAY::null();
    
  • Invoking things on a NULL reference leads to a core dump

       Arts::Synth_PLAY p = Arts::Synth_PLAY::null();
       string s = p.toString();
    

    will lead to a core dump. Comparing this to a pointer, it is essentially the same as

       TQWidget* w = 0;
       w->show();
    
    which every C++ programmer would know to avoid.

  • Uninitialized objects try to lazy-create themselves upon first use

       Arts::Synth_PLAY p;
       string s = p.toString();
    

    is different from dereferencing a NULL pointer. You didn't tell the reference what it refers to at all, and now you try to use it. The guess here is that you want a new local instance of an Arts::Synth_PLAY object. Of course you might have wanted something else (like creating the object somewhere else, or using an existing remote object). However, it is a convenient shortcut for creating objects. Lazy creation will not work once you have assigned something else (like a null reference).

    The equivalent C++ terms would be

       TQWidget* w;
       w->show();
    
    which in C++ is undefined behavior and typically just segfaults. So MCOP behaves differently here. Lazy creation is tricky, especially because an implementation does not necessarily exist for your interface.

    For instance, consider an abstract thing like an Arts::PlayObject. There are certainly concrete PlayObjects like those for playing mp3s or wavs, but

       Arts::PlayObject po;
       po.play();
    
    will certainly fail. The problem is that although lazy creation kicks in, and tries to create a PlayObject, it fails, because there are only things like Arts::WavPlayObject and similar. Thus, use lazy creation only when you are sure that an implementation exists.

  • References may point to the same object

       Arts::SimpleSoundServer s = Arts::Reference("global:Arts_SimpleSoundServer");
       Arts::SimpleSoundServer s2 = s;
    

    creates two references referring to the same object. It doesn't copy any value, and doesn't create two objects.

  • All objects are reference counted. Once an object is no longer referred to by any reference, it gets deleted. There is no way to explicitly delete an object; however, you can use something like this

       Arts::Synth_PLAY p;
       p.start();
       [...]
       p = Arts::Synth_PLAY::null();
    
    to make the Synth_PLAY object go away in the end. In particular, it should never be necessary to use new and delete in conjunction with references.
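
The properties listed above can be modelled in plain C++ without aRts. The following is a minimal sketch, assuming a hypothetical implementation class FakeSynth and a hypothetical wrapper FakeSynthRef built on std::shared_ptr. It is not how MCOP implements references; it only mirrors the behavior described in the list: an unassigned reference creates its object on first use, copies share one object, and the object goes away when the last reference is reassigned to null.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical implementation class; stands in for a real MCOP object.
struct FakeSynth {
    static int alive;                       // number of live implementation objects
    FakeSynth()  { ++alive; }
    ~FakeSynth() { --alive; }
    std::string toString() const { return "FakeSynth"; }
};
int FakeSynth::alive = 0;

// Hypothetical reference wrapper (a model, not the MCOP implementation).
class FakeSynthRef {
    struct Pool {                           // state shared by all copies of a reference
        std::unique_ptr<FakeSynth> obj;     // the object, if one exists
        bool lazy;                          // true: create the object on first use
    };
    std::shared_ptr<Pool> pool_;

    explicit FakeSynthRef(bool lazy) : pool_(std::make_shared<Pool>()) {
        pool_->lazy = lazy;
    }

public:
    FakeSynthRef() : FakeSynthRef(true) {}  // "Arts::Synth_PLAY p;" in the text

    // Explicit null reference: lazy creation is switched off.
    static FakeSynthRef null() { return FakeSynthRef(false); }

    bool isNull() const { return !pool_->lazy && !pool_->obj; }

    std::string toString() {
        if (pool_->lazy) {                  // lazy creation on first use
            pool_->obj.reset(new FakeSynth);
            pool_->lazy = false;
        }
        // The model aborts here where the text speaks of a core dump.
        assert(pool_->obj && "method invoked on a null reference");
        return pool_->obj->toString();
    }
};
```

With this model, declaring FakeSynthRef p creates nothing; the first p.toString() creates one FakeSynth; FakeSynthRef p2 = p adds a second reference to that same object; and only after both p and p2 have been assigned FakeSynthRef::null() does FakeSynth::alive return to zero.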

The case of failure

As references can point to remote objects, the servers containing these objects can crash. What happens then?

  • A crash doesn't change whether a reference is a null reference. This means that if foo.isNull() was true before a server crash then it is also true after a server crash (which is clear). It also means that if foo.isNull() was false before a server crash (foo referred to an object) then it is also false after the server crash.

  • Invoking methods on a valid reference stays safe. Suppose the server containing the object calc has crashed. A call such as

       int k = calc.subtract(i,j);
    
    is still safe. Obviously subtract has to return something here, which it cannot do because the remote object no longer exists. In this case, k == 0 would be true. In general, when the object no longer exists, operations try to return something “neutral”, such as 0.0, a null reference for objects, or an empty string.

  • Checking error() reveals whether something worked.

    In the above case,

       int k = calc.subtract(i,j);
       if(calc.error()) {
          printf("k is not i-j!\n");
       }
    
    would print k is not i-j! whenever the remote invocation did not work. Otherwise k really is the result of the subtract operation as performed by the remote object (no server crash). However, for methods that do things like deleting a file, you cannot always know whether the operation really happened. It certainly happened if .error() is false. If .error() is true, however, there are two possibilities:

    • The file got deleted, and the server crashed just after deleting it, but before transferring the result.

    • The server crashed before being able to delete the file.

  • Using nested invocations is dangerous in crash-resistant programs

    Using something like

       window.titlebar().setTitle("foo");
    
    is not a good idea. Suppose you know that window contains a valid Window reference, and that window.titlebar() will return a Titlebar reference because the Window object is implemented properly. Even so, the above statement is not safe.

    What could happen is that the server containing the Window object has crashed. Then, regardless of how good the Window implementation is, you will get a null reference as the result of the window.titlebar() operation. Invoking setTitle on that null reference will then crash your program as well.

    So a safe variant of this would be

       Titlebar titlebar = window.titlebar();
       if(!window.error())
          titlebar.setTitle("foo");
    
    Add appropriate error handling if you like. If you don't trust the Window implementation, you might as well use

       Titlebar titlebar = window.titlebar();
       if(!titlebar.isNull())
          titlebar.setTitle("foo");
    
    Both variants are safe.
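
The failure rules above can be tried out without a running server. The following is a minimal sketch with hypothetical stand-ins (FakeServer, and simplified Window and Titlebar classes that are not the MCOP-generated ones). It assumes that a failed call returns a neutral result and that error() stays true afterwards. The helper setTitleSafely is likewise hypothetical and combines both checks from the text.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the server process hosting the remote objects.
struct FakeServer {
    bool crashed = false;
    std::string title;                      // state of the titlebar it hosts
};

// Simplified Titlebar reference; it is null when it has no server.
class Titlebar {
    FakeServer* server_;
public:
    explicit Titlebar(FakeServer* s = nullptr) : server_(s) {}
    bool isNull() const { return server_ == nullptr; }
    void setTitle(const std::string& t) {
        // The model aborts here where the text speaks of a crash.
        assert(!isNull() && "setTitle invoked on a null reference");
        if (!server_->crashed) server_->title = t;
    }
};

// Simplified Window reference that remembers whether a call failed.
class Window {
    FakeServer* server_;
    bool error_ = false;
public:
    explicit Window(FakeServer* s) : server_(s) {}
    bool error() const { return error_; }
    Titlebar titlebar() {
        if (server_->crashed) {             // remote object is gone
            error_ = true;
            return Titlebar();              // "neutral" result: a null reference
        }
        return Titlebar(server_);
    }
};

// Safe variant: store the intermediate reference and check it before use.
bool setTitleSafely(Window& window, const std::string& text) {
    Titlebar titlebar = window.titlebar();
    if (window.error() || titlebar.isNull())
        return false;                       // nothing is invoked on a null reference
    titlebar.setTitle(text);
    return true;
}
```

In this model, window.titlebar().setTitle("foo") would hit the assertion once FakeServer::crashed is set, while setTitleSafely simply returns false and leaves the title untouched.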

There are other conditions of failure, such as network disconnection (suppose you remove the cable between your server and client while your application runs). However, their effect is the same as that of a server crash.

Overall, it is of course a consideration of policy how strictly you try to trap communication errors throughout your application. You might follow the “if the server crashes, we need to debug the server until it never crashes again” method, which would mean you need not bother about all these problems.

Internals: Distributed Reference Counting

An object, to exist, must be owned by someone. If it isn't, it will cease to exist (more or less) immediately. Internally, ownership is indicated by calling _copy(), which increments a reference count, and given back by calling _release(). As soon as the reference count drops to zero, the object is deleted.

As a variation on the theme, remote usage is indicated by _useRemote(), and dissolved by _releaseRemote(). These functions keep a list of which servers have invoked them (and thus own the object). This list is used when such a server disconnects (that is, on a crash or network failure), to remove the references it still holds on the object. This is done in _disconnectRemote().
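
The bookkeeping described in the two paragraphs above can be sketched as follows. This is a simplified model, not the MCOP source: the class CountedObject, the integer connection id passed to the remote functions, the initial count of one for the creator, and the deleted() flag (used instead of actually deleting the object) are all assumptions made for illustration.

```cpp
#include <cassert>
#include <map>

// Hypothetical model of local and remote ownership bookkeeping.
class CountedObject {
    int refCount_ = 1;                      // assumption: the creator owns the object
    bool deleted_ = false;                  // stands in for the real "delete"
    std::map<int, int> remoteUsers_;        // connection id -> number of remote uses

public:
    // Local ownership.
    void _copy() { ++refCount_; }
    void _release() {
        assert(refCount_ > 0);
        if (--refCount_ == 0) deleted_ = true;   // a real object would be deleted here
    }

    // Remote ownership: remember which connection holds the reference.
    void _useRemote(int connection) {
        ++remoteUsers_[connection];
        _copy();
    }
    void _releaseRemote(int connection) {
        std::map<int, int>::iterator it = remoteUsers_.find(connection);
        assert(it != remoteUsers_.end());
        if (--it->second == 0) remoteUsers_.erase(it);
        _release();
    }

    // The connection is gone (crash, network failure): drop everything it held.
    void _disconnectRemote(int connection) {
        std::map<int, int>::iterator it = remoteUsers_.find(connection);
        if (it == remoteUsers_.end()) return;
        int uses = it->second;
        remoteUsers_.erase(it);
        for (int i = 0; i < uses; ++i) _release();
    }

    int refCount() const { return refCount_; }
    bool deleted() const { return deleted_; }
};
```

The per-connection list is what makes cleanup possible: when connection 7 disappears, exactly the references held through connection 7 are released, and the object survives as long as any other owner remains.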

Now there is one problem. Consider a return value. Usually, the returned object will no longer be owned by the called function. However, it will not be owned by the caller either until the message holding the object is received. So there is a period during which objects are “ownershipless”.

Now, when sending an object, one can be reasonably sure that as soon as it is received, it will be owned by somebody again, unless, again, the receiver dies. However, this means that special care needs to be taken of the object at least while sending, and probably also while receiving, so that it doesn't die at once.

The way MCOP does this is by “tagging” objects that are in the process of being copied across the wire. Before such a copy is started, _copyRemote() is called. This prevents the object from being freed for a while (5 seconds). Once the receiver calls _useRemote(), the tag is removed again. So all objects that are sent over the wire are tagged before transfer.

If the receiver receives an object that lives on its own server, it will of course not call _useRemote() on it. For this special case, _cancelCopyRemote() exists to remove the tag manually. Other than that, there is also timer-based tag removal for the case where tagging was done but the receiver never really got the object (due to a crash or network failure). This is done by the ReferenceClean class.
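
The tagging scheme can be illustrated with a small model. This is a sketch under stated assumptions, not the MCOP implementation: the class TaggedObject, the explicit now parameter (seconds on a caller-supplied clock), the method cleanExpiredTags (standing in for what the text attributes to the ReferenceClean class), and the accessors are hypothetical. Only the function names _copyRemote, _useRemote, _cancelCopyRemote and the 5-second lifetime come from the text.

```cpp
#include <cassert>

// Hypothetical model of "tagging" an object while it travels over the wire.
class TaggedObject {
    int refCount_ = 1;                      // assumption: the sender owns the object
    int tags_ = 0;                          // pending "in transit" tags
    long tagExpiry_ = 0;                    // when pending tags become stale
    bool deleted_ = false;                  // stands in for the real "delete"
    static const long kTagLifetime = 5;     // seconds, as stated in the text

    void drop() { if (--refCount_ == 0) deleted_ = true; }
    void removeTag() { if (tags_ > 0) { --tags_; drop(); } }

public:
    void _copy() { ++refCount_; }
    void _release() { drop(); }

    // Sender: tag the object before transfer; the tag keeps it alive.
    void _copyRemote(long now) {
        ++refCount_;
        ++tags_;
        tagExpiry_ = now + kTagLifetime;
    }

    // Remote receiver took ownership: a real reference replaces the tag.
    void _useRemote() { ++refCount_; removeTag(); }

    // Receiver lives on the same server: just remove the tag.
    void _cancelCopyRemote() { removeTag(); }

    // Timer-based cleanup for tags whose receiver never showed up.
    void cleanExpiredTags(long now) {
        if (now >= tagExpiry_)
            while (tags_ > 0) removeTag();
    }

    int refCount() const { return refCount_; }
    int pendingTags() const { return tags_; }
    bool deleted() const { return deleted_; }
};
```

In this model, a sender can call _copyRemote() and then _release() its own reference without the object dying; the object is only dropped if neither _useRemote() nor _cancelCopyRemote() arrives before the tag expires.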
