As discussed in Chapter 6, a connected call may fail due to communication failures as well as service-side errors. Similarly, a queued call can fail due to delivery failures or service-side playback errors. WCF provides dedicated error-handling mechanisms for both types of errors, and understanding them, as well as integrating your error-handling logic with them, is an intrinsic part of using queued services.
While MSMQ can guarantee delivery of a message whenever it is technically possible to do so, there are numerous cases in which delivery is simply not possible. These include, but are not limited to, the following:
As you will see shortly, each message has a timeout, and the message has to be delivered and processed within that timeout. Failing to do so will cause the delivery to fail.
If the security credentials in the message (or the chosen authentication mechanism itself) do not match up with what the service expects, the service will reject the message.
The client cannot use a local nontransactional queue while posting a message to a transactional service-side queue.
If the underlying network fails or is simply unreliable, the message may never reach the service.
The service machine may crash due to a software or hardware failure and will not be able to accept the message into its queue.
Even if the message is delivered successfully, the administrator (or any application, programmatically) can purge messages out of the queue, thus preventing the service from ever processing them.
Each queue has a quota controlling the maximum size of data it can hold. If the quota is exceeded, future messages are rejected.
After every delivery failure, the message goes back to the client's queue, where MSMQ will continuously retry delivering it. While in some cases, such as intermittent network failures or quota issues, the retries may eventually succeed, there are many cases where MSMQ will never succeed in delivering the message. In fact, in practical terms, even a large enough number of attempts may be unacceptable and may create a dangerous amount of thrashing. Delivery-failure handling deals with how MSMQ knows it should not retry forever: after how many attempts it should give up, after how long it should give up, and what it should do with the failed messages.
MsmqBindingBase offers a number of properties governing the handling of delivery failures:

public abstract class MsmqBindingBase : Binding,...
{
   public TimeSpan TimeToLive
   {get;set;}

   //DLQ settings
   public Uri CustomDeadLetterQueue
   {get;set;}
   public DeadLetterQueue DeadLetterQueue
   {get;set;}

   //More members
}
In messaging systems, after an evident failure to deliver, the message goes to a special queue called the dead-letter queue (DLQ). The DLQ is somewhat analogous to a classic dead-letter mailbox at the main post office. In the context of this discussion, failure to deliver constitutes not only failure to reach the service-side queue, but also failure to commit the playback transaction. Note that the service may still fail in processing the message and yet commit the playback transaction.

MSMQ on the client side and on the service side constantly acknowledge to each other the receiving and processing of messages. If the service-side MSMQ successfully received and retrieved the message from the service-side queue (that is, the playback transaction committed), it sends a positive acknowledgment (ACK) to the client-side MSMQ. The service-side MSMQ can also send a negative acknowledgment (NACK) to the client. When the client-side MSMQ receives a NACK, it posts the message to the DLQ. If the client-side MSMQ receives neither an ACK nor a NACK, the message is considered in-doubt.
With MSMQ 3.0 (that is, on Windows XP and Windows Server 2003), the dead-letter queue is a system-wide queue. All failed messages from any application go to this single repository. With MSMQ 4.0 (that is, on Windows Vista), you can configure an application-specific DLQ to which only messages destined to that specific service go. Application-specific dead-letter queues greatly simplify both the administrator's and the developer's work.
With MSMQ, each message carries a timestamp initialized when the message is first posted to the client-side queue. In addition, every queued WCF message has a timeout, controlled by the TimeToLive property of MsmqBindingBase. After posting a message to the client-side queue, WCF mandates that the message be delivered and processed within the configured timeout. Note that successful delivery to the service-side queue is not good enough; the call must be processed as well. The TimeToLive property is therefore somewhat analogous to the SendTimeout property of the connected bindings. The TimeToLive property is relevant only to the posting client; it has no effect on the service side, nor can the service change it. TimeToLive defaults to one day. After continuously trying and failing to deliver (and process) the message for as long as TimeToLive allows, MSMQ stops trying and moves the message to the configured DLQ.
You can configure the time-to-live value either programmatically or administratively. For example, using a config file, here is how to configure a time to live of five minutes:
<bindings>
   <netMsmqBinding>
      <binding name = "ShortTimeout"
         timeToLive = "00:05:00">
      </binding>
   </netMsmqBinding>
</bindings>
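You can also set the same value programmatically on the binding before opening the proxy or channel factory. The following is merely a sketch; the IMyContract contract and the queue address mirror the chapter's examples and are placeholders:

```csharp
using System;
using System.ServiceModel;

//Placeholder contract, mirroring the chapter's examples
[ServiceContract(SessionMode = SessionMode.NotAllowed)]
interface IMyContract
{
   [OperationContract(IsOneWay = true)]
   void MyMethod();
}

class Program
{
   static void Main()
   {
      //Configure a time to live of five minutes programmatically
      NetMsmqBinding binding = new NetMsmqBinding();
      binding.TimeToLive = TimeSpan.FromMinutes(5);

      ChannelFactory<IMyContract> factory = new ChannelFactory<IMyContract>(
         binding,
         new EndpointAddress("net.msmq://localhost/private/MyServiceQueue"));
      IMyContract proxy = factory.CreateChannel();
      proxy.MyMethod();
      factory.Close();
   }
}
```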
The main motivation for configuring a short timeout is dealing with time-sensitive calls that must be processed in a timely manner. However, time-sensitive queued calls go against the grain of disconnected queued calls in general, because the more time-sensitive the calls are, the more questionable the use of queued services is in the first place. The correct way of viewing time to live is as a last-resort heuristic used to eventually bring to the attention of the administrator the fact that the message was not delivered, not as a way to enforce business-level interpretation of the message sensitivity.
MsmqBindingBase offers the DeadLetterQueue property of the enum type DeadLetterQueue:

public enum DeadLetterQueue
{
   None,
   System,
   Custom
}

When set to DeadLetterQueue.None, WCF makes no use of a dead-letter queue. After a failure to deliver, WCF silently discards the message as if the call never happened. DeadLetterQueue.System is the default value of the property. As its name implies, it uses the system-wide DLQ: after a delivery failure, WCF moves the message from the client-side queue to the system-wide DLQ.
When set to DeadLetterQueue.Custom, the application can take advantage of a dedicated DLQ. DeadLetterQueue.Custom requires the use of MSMQ 4.0, and WCF verifies that at call time. In addition, WCF requires that the application specify the address of the custom DLQ in the CustomDeadLetterQueue property of the binding. The default value of CustomDeadLetterQueue is null, but when DeadLetterQueue.Custom is employed, CustomDeadLetterQueue cannot be null:

<netMsmqBinding>
   <binding name = "CustomDLQ"
      deadLetterQueue = "Custom"
      customDeadLetterQueue = "net.msmq://localhost/private/MyCustomDLQ">
   </binding>
</netMsmqBinding>
Conversely, when DeadLetterQueue is set to any value other than DeadLetterQueue.Custom, CustomDeadLetterQueue must be null.
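The same pairing applies when configuring the binding in code; here is a minimal sketch (the DLQ address is a placeholder):

```csharp
using System;
using System.ServiceModel;

//Programmatically opt in to a custom DLQ
//The queue address here is a placeholder
NetMsmqBinding binding = new NetMsmqBinding();
binding.DeadLetterQueue = DeadLetterQueue.Custom;
binding.CustomDeadLetterQueue =
   new Uri("net.msmq://localhost/private/MyCustomDLQ");
```

Leaving CustomDeadLetterQueue as null while using DeadLetterQueue.Custom (or setting it with any other DeadLetterQueue value) will cause WCF to reject the binding configuration.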
It is important to realize that the custom DLQ is just another MSMQ queue. It is up to the client-side developer to also deploy a DLQ service that processes its messages. All WCF does on MSMQ 4.0 is automate the act of moving the message to the DLQ once a failure is detected.
If a custom DLQ is required then, like any other queue, it is up to the client to verify at runtime, before issuing queued calls, that the custom DLQ exists and, if necessary, to create it. Following the pattern presented previously, you can automate and encapsulate this with my QueuedServiceHelper.VerifyQueue( ) method, shown in Example 9-15.
Example 9-15. Verifying a custom DLQ
public static class QueuedServiceHelper
{
public static void VerifyQueue(ServiceEndpoint endpoint)
{
if(endpoint.Binding is NetMsmqBinding)
{
string queue = GetQueueFromUri(endpoint.Address.Uri);
if(MessageQueue.Exists(queue) == false)
{
MessageQueue.Create(queue,true);
}
NetMsmqBinding binding = endpoint.Binding as NetMsmqBinding;
if(binding.DeadLetterQueue == DeadLetterQueue.Custom)
{
Debug.Assert(binding.CustomDeadLetterQueue != null);
string DLQ = GetQueueFromUri(binding.CustomDeadLetterQueue);
if(MessageQueue.Exists(DLQ) == false)
{
MessageQueue.Create(DLQ,true);
}
}
}
}
//More members
}
The client needs to somehow process the accumulated messages in the DLQ. In the case of the system-wide DLQ, the client can provide a mega-service that supports all contracts of all queued endpoints on the system, to enable it to process all failed messages. This is clearly an impractical idea, because that service could not possibly know about all queued contracts, let alone have meaningful processing for all applications. The only feasible way to make this work is to restrict the client side to at most a single queued service per system. Alternatively, you can write a custom application for direct administration and manipulation of the system DLQ using System.Messaging. That application will parse and extract the relevant messages and process them. The problem with that approach (besides the inordinate amount of work involved) is that if the messages are protected and encrypted (as they should be), the application will have a hard time dealing with and distinguishing between them. In practical terms, the only possible solution for a general client-side environment is the one offered by MSMQ 4.0; that is, a custom DLQ. When using a custom DLQ, you also provide a client-side service whose queue is the application's custom DLQ. That service will process the failed messages according to the application-specific requirements.
Implementing the DLQ service is done like any other queued service. The only requirement is that the DLQ service be polymorphic with the original service’s contract. If multiple queued endpoints are involved, you will need a DLQ per contract per endpoint. Example 9-16 shows a possible setup.
Example 9-16. DLQ service config file
////////////////// Client side /////////////////////
<system.serviceModel>
   <client>
      <endpoint
         address = "net.msmq://localhost/private/MyServiceQueue"
         binding = "netMsmqBinding"
         bindingConfiguration = "MyCustomDLQ"
         contract = "IMyContract"
      />
   </client>
   <bindings>
      <netMsmqBinding>
         <binding name = "MyCustomDLQ"
            deadLetterQueue = "Custom"
            customDeadLetterQueue = "net.msmq://localhost/private/MyCustomDLQ">
         </binding>
      </netMsmqBinding>
   </bindings>
</system.serviceModel>

////////////////// DLQ service side /////////////////////
<system.serviceModel>
   <services>
      <service name = "MyDLQService">
         <endpoint
            address = "net.msmq://localhost/private/MyCustomDLQ"
            binding = "netMsmqBinding"
            contract = "IMyContract"
         />
      </service>
   </services>
</system.serviceModel>
The client config file defines a queued endpoint with the IMyContract contract. The client uses a custom binding section to define the address of the custom DLQ. A separate queued service (potentially on a separate machine) also supports the IMyContract contract. The DLQ service uses as its address the DLQ defined by the client.
The DLQ service typically needs to know why the queued call delivery failed. For that, WCF offers the MsmqMessageProperty class, used to find out the cause of the failure and the current status of the message. MsmqMessageProperty is defined in the System.ServiceModel.Channels namespace:
public sealed class MsmqMessageProperty
{
   public const string Name = "MsmqMessageProperty";

   public int AbortCount
   {get;internal set;}
   public DeliveryFailure? DeliveryFailure
   {get;}
   public DeliveryStatus? DeliveryStatus
   {get;}
   public int MoveCount
   {get;internal set;}
   //More members
}
The DLQ service needs to obtain the MsmqMessageProperty from the operation context's incoming message properties:

public sealed class OperationContext : ...
{
   public MessageProperties IncomingMessageProperties
   {get;}
   //More members
}
public sealed class MessageProperties : IDictionary<string,object>,...
{
   public object this[string name]
   {get;set;}
   //More members
}
When a message is passed to the DLQ, WCF will add to its properties an instance of MsmqMessageProperty detailing the failure. MessageProperties is merely a collection of message properties that you can access using a string as a key. To obtain the MsmqMessageProperty, use the constant MsmqMessageProperty.Name, as shown in Example 9-17.
Example 9-17. Obtaining the MsmqMessageProperty
[ServiceContract(SessionMode = SessionMode.NotAllowed)]
interface IMyContract
{
[OperationContract(IsOneWay = true)]
void MyMethod( );
}
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
class MyDLQService : IMyContract
{
[OperationBehavior(TransactionScopeRequired = true)]
public void MyMethod( )
{
MsmqMessageProperty msmqProperty = OperationContext.Current.
   IncomingMessageProperties[MsmqMessageProperty.Name] as MsmqMessageProperty;
Debug.Assert(msmqProperty != null);
//Process msmqProperty
}
}
Note in Example 9-17 the use of the practices discussed so far regarding session mode, instance management, and transactions; the DLQ service is, after all, just another queued service.
The properties of MsmqMessageProperty detail the reasons for the failure and offer some contextual information. MoveCount is the number of attempts made to play the message to the service. AbortCount is the number of attempts made to read the message from the queue. AbortCount is less relevant to recovery attempts, because it falls under the responsibility of MSMQ and usually is of no concern. DeliveryStatus is a nullable enum of the type DeliveryStatus, defined as:
public enum DeliveryStatus
{
   InDoubt,
   NotDelivered
}
DeliveryStatus will be set to DeliveryStatus.InDoubt unless the message was positively not delivered (that is, a NACK was received). For example, expired messages are considered in-doubt because their time to live elapsed before the service could acknowledge them one way or the other.
The DeliveryFailure property is a nullable enum of the type DeliveryFailure, defined as follows (without the specific numerical values):
public enum DeliveryFailure
{
   AccessDenied,
   NotTransactionalMessage,
   Purged,
   QueueExceedMaximumSize,
   ReachQueueTimeout,
   ReceiveTimeout,
   Unknown
   //More members
}
The DLQ service cannot affect the message properties, such as extending the message's time to live. Handling of delivery failures typically involves some kind of compensating transaction: notifying the administrator; resending a new message, or resending a new request with an extended timeout; logging the error; or perhaps doing nothing at all, merely processing the failed call and returning, thus discarding the message.
Example 9-18 demonstrates one such implementation.
Example 9-18. Implementing a DLQ service
[ServiceBehavior(InstanceContextMode = InstanceContextMode.PerCall)]
class MyDLQService : IMyContract
{
   [OperationBehavior(TransactionScopeRequired = true)]
   public void MyMethod(string someValue)
   {
      MsmqMessageProperty msmqProperty = OperationContext.Current.
         IncomingMessageProperties[MsmqMessageProperty.Name] as MsmqMessageProperty;

      //If tried more than 25 times: discard message
      if(msmqProperty.MoveCount >= 25)
      {
         return;
      }
      //If timed out: try again
      if(msmqProperty.DeliveryStatus == DeliveryStatus.InDoubt)
      {
         if(msmqProperty.DeliveryFailure == DeliveryFailure.ReceiveTimeout)
         {
            MyContractClient proxy = new MyContractClient( );
            proxy.MyMethod(someValue);
            proxy.Close( );
         }
         return;
      }
      if(msmqProperty.DeliveryStatus == DeliveryStatus.InDoubt ||
         msmqProperty.DeliveryFailure == DeliveryFailure.Unknown)
      {
         NotifyAdmin( );
      }
   }
   void NotifyAdmin( )
   {...}
}
The DLQ service in Example 9-18 examines the cause of the failure. If WCF has tried more than 25 times to deliver the message, the DLQ service simply drops the message and gives up. If the cause of the failure was a timeout, the DLQ service tries again by creating a proxy to the queued service and calling it, passing the same arguments from the original call (the in-parameters to the DLQ service operation). If the message is in-doubt or an unknown failure took place, the service notifies the application administrator.