Hello,
Has anyone seen this in profiler? I have two brokers on different servers with one of them being the initiator. All messages end up sitting in the initiator's transmission queue. Profiler on the target broker's machine displays this for every attempt to send by the initiator:
A corrupt message has been received. The End of Conversation and Error flags may not be set in the first sequenced message. This occurred in the message with Conversation ID '...', Initiator: 1, and Message sequence number: 0
In case it's relevant encryption is disabled and endpoints on both servers use windows authentication.
Any hints are greatly appreciated.
Regards,
Svega
Hello,
This is weird. I've tried to reproduce the problem but I couldn't. Can you provide the steps you're doing in the initiator when it ends up like this? What I'm interested in mostly is the exact sequence in which these statements occur: BEGIN TRANSACTION, BEGIN DIALOG, SEND, END CONVERSATION, COMMIT.
Also, what SQL Server versions are the initiator and target? Are they both the November RTM or is one of them an earlier CTP release?
Thanks,
~ Remus
Hello Remus,
Thanks for replying.
Both servers are November RTM. Until yesterday I had the two databases on the same server and everything worked. Then I moved the target service broker to a different server. Both servers are 2003 Enterprise. The initiator is on an x64 machine; the target is on a 32-bit machine. I moved the database by detaching it, copying it to the other server, and attaching it there. I then created a route to the target service in the initiator's database. Sequence of statements:
1. begin tran
2. begin dialog @convh from service initiator to service 'target' on contract 'contract' with encryption = off
3. send on conversation @convh message type [...] (@body)
4. end conversation @convh
5. commit tran
When I hit the problem my first thought was that it had something to do with security. Since I don't use dialog security and use Windows authentication for the endpoints, I should not need certificates. However, I tried certificates at the adjacent broker (endpoint) level while investigating, so I created certificates in both databases and recreated the endpoints to use certificate authentication. I could no longer see the "corrupted message" event on the target side. Now Profiler on the target server does not show any broker activity at all. On the initiator side, Profiler constantly gets the "Broker:Remote Message Acknowledgement" event, and all messages remain in the transmission queue with an empty transmission status. Netstat shows an established connection between the two servers. I have to mention that I didn't initially create routes back to the initiator in the target broker database. I thought I should not need them. I then added them but it didn't change anything.
If you need any more info please let me know.
Regards,
Alex
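For reference, the forward and return routes described in this post might look roughly like the sketch below. It is only an illustration: the route names, the server names, port 4022, and the all-zero GUID are placeholders, not values from this thread. As far as I know, a target on a different instance does need a route back to the initiator service, since acknowledgements and replies travel over it.

```sql
-- Run in the initiator database.
-- RouteToTarget, targetserver, 4022 and the GUID are hypothetical placeholders.
CREATE ROUTE RouteToTarget
    WITH SERVICE_NAME    = 'target',
         -- service_broker_guid of the target database, taken from sys.databases
         BROKER_INSTANCE = '00000000-0000-0000-0000-000000000000',
         ADDRESS         = 'TCP://targetserver:4022';

-- Run in the target database.
-- Return route so that acknowledgements and replies can reach the initiator.
CREATE ROUTE RouteToInitiator
    WITH SERVICE_NAME = 'initiator',
         ADDRESS      = 'TCP://initiatorserver:4022';
```

Service names in routes are matched exactly, so they must be spelled the same way as in CREATE SERVICE.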
|||One more observation: I ran Network Monitor on the target server and I could clearly see packets with messages from the initiator coming and tcp acknowledgements being sent back to the initiator. However all acks contain Checksum = ERROR... I am not sure this is what's causing the problem as acks in other communications seem to have the same issue. It could well be Network Monitor's problem. Just thought this could help in some way.
Regards,
Alex
|||Alex,
I could reproduce the problem you describe. From what I've found, two conditions have to be met to run into this problem:
- first SEND and the END statements are in one single transaction
- the BEGIN DIALOG statement contains an explicit broker instance
To work around the problem you have three options:
- separate the first SEND and END conversation into two separate transactions
- don't use an explicit broker instance in the BEGIN DIALOG
- change the message exchange pattern from BEGIN/SEND/END into an exchange where the initiator does just BEGIN/SEND and ENDs its endpoint in response to the target sending the END from its side. In general this latter case is a better message exchange pattern, because the first pattern (BEGIN/SEND/END) is really a 'fire and forget' scenario where the initiator has no means to ever figure out whether the message was delivered successfully.
The problem is not related to the dialog security or broker endpoint security settings. It can happen only on remote delivery (between 2 SQL instances). The system architecture (x86, AMD64, IA64) doesn't matter either.
The fact that the message gets dropped on the problem scenario is a bug. You can use the SQL Product Feedback center at http://lab.msdn.microsoft.com/productfeedback to report this issue.
Thanks,
~ Remus
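The third option above (the target ends first, the initiator ends in response) could be sketched as follows. This is a minimal outline under assumptions, not a tested implementation: the queue names [target_queue] and [initiator_queue], the message type [request], and the payload are hypothetical, and each part is meant to run in its own database.

```sql
-- Part 1: initiator database. BEGIN and SEND only; no END CONVERSATION here.
DECLARE @convh UNIQUEIDENTIFIER;
DECLARE @body XML;
SET @body = N'<payload/>';                 -- hypothetical payload

BEGIN TRANSACTION;
BEGIN DIALOG CONVERSATION @convh
    FROM SERVICE [initiator]
    TO SERVICE 'target'
    ON CONTRACT [contract]
    WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @convh MESSAGE TYPE [request] (@body);
COMMIT TRANSACTION;
GO

-- Part 2: target database. Receive one message, process it, then end the dialog.
DECLARE @th UNIQUEIDENTIFIER, @ttype SYSNAME, @tbody VARBINARY(MAX);

BEGIN TRANSACTION;
WAITFOR (
    RECEIVE TOP (1)
        @th    = conversation_handle,
        @ttype = message_type_name,
        @tbody = message_body
    FROM [target_queue]
), TIMEOUT 5000;

IF @th IS NOT NULL AND @ttype = N'request'
BEGIN
    -- ... process @tbody here ...
    END CONVERSATION @th;                  -- sends EndDialog back to the initiator
END
COMMIT TRANSACTION;
GO

-- Part 3: initiator database. End the initiator endpoint when EndDialog or Error arrives.
DECLARE @ih UNIQUEIDENTIFIER, @itype SYSNAME;

BEGIN TRANSACTION;
WAITFOR (
    RECEIVE TOP (1)
        @ih    = conversation_handle,
        @itype = message_type_name
    FROM [initiator_queue]
), TIMEOUT 5000;

IF @itype IN (N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog',
              N'http://schemas.microsoft.com/SQL/ServiceBroker/Error')
    END CONVERSATION @ih;
COMMIT TRANSACTION;
GO
```

Parts 2 and 3 would normally live in activation procedures that loop until the queue is empty; the single RECEIVE here only shows the order in which each side ends the dialog.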
Remus,
Thanks a lot for the finding. Of the three options, I have to keep using a broker instance for routing to work, because I have multiple instances of the same service on the same server. I will look at how I can get around this with the remaining two. Thanks for the prompt investigation.
Regards,
Alex
|||Remus,
I tried option #1 (separating the first SEND and END CONVERSATION into different transactions) and it doesn't seem to make any difference. Messages are still sitting in the transmission queue. The only change is that Profiler now does not show any broker events on either of the servers. But in NetMon I can see that the initiator tries to deliver. I will see whether I can use #3 somehow, or at all.
Regards,
Alex
|||This is probably unrelated; I suspect a misconfiguration somewhere in the broker endpoint security or in the routes. Try to follow the steps in this post http://blogs.msdn.com/remusrusanu/archive/2005/12/20/506221.aspx to diagnose the problem.
HTH,
~ Remus
P.S. I now see you actually read those, since you posted a comment
Please make sure you select all events in the Broker category, as well as 'Security Audit/Audit Broker Login' and 'Security Audit/Audit Broker Conversation'.
Does the transmission_status column on the initiator side have any value?
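One way to check transmission_status is to query sys.transmission_queue in the initiator database. The query below is a sketch; the column selection and ordering are just a suggestion.

```sql
-- Run in the initiator database.
-- An empty transmission_status means no error has been recorded for the message so far.
SELECT conversation_handle,
       to_service_name,
       to_broker_instance,
       message_sequence_number,
       is_end_of_dialog,
       is_conversation_error,
       enqueue_time,
       transmission_status
FROM sys.transmission_queue
ORDER BY enqueue_time;
```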
|||I did select all the events. Profilers on both servers show no traces. transmission_status is empty. I've looked at option #3 and it's a little messy to change the message exchange pattern to let target decide when to end dialog. In my case upstream service doesn't really depend on whether the next hop received and processed a message. So the send and forget logic fits well. Still I will spend some more time looking at #3.
Regards,
Alex
Can you check whether the sys.dm_broker_connections view contains a row for the broker connection between the two SQL instances involved? If it does, can you see whether the total_fragments_sent/total_fragments_received values change?
When an acknowledgement for a message is not received, the sender retries that message roughly once every minute. I expect this retry to generate Profiler events each time it occurs, and the values in sys.dm_broker_connections to change, reflecting that the message was actually sent (retried).
Thanks,
~ Remus
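The connection check above could be done with a query along these lines, run on either instance about a minute apart so that the counters can be compared. It is a sketch only; reading this view requires the VIEW SERVER STATE permission.

```sql
-- Run on the initiator (or target) instance; repeat after about a minute
-- and compare total_fragments_sent / total_fragments_received between runs.
SELECT connection_id,
       state_desc,
       login_state_desc,
       principal_name,
       last_activity_time,
       total_fragments_sent,
       total_fragments_received
FROM sys.dm_broker_connections;
```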
Yes there is a row for the connection. total_fragments_sent increases about once every minute by the number of messages in the transmission queue. total_fragments_received does not change. But it's not zero (fragments sent = 7748 and increasing, fragments received = 6). Profiler does not show any activity with all broker events selected plus Audit Broker Conversation and Audit Broker Login. I tried to begin dialog without target broker id and messages still get stuck in the transmission queue. So far I have not been able to send messages to a remote service. Locally everything works fine.
Alex
|||So the initiator is sending messages, but the target is either dropping them or unable to send back acks for them. Both of these scenarios should generate plenty of noise in Profiler on the target. No activity at all in Profiler seems very suspicious to me. I know it sounds silly, but can you make sure that you are connected to the right SQL instance and that the trace is actually started?
Also, can you check if the target conversation endpoint was created? Look in sys.conversation_endpoints on the target database and see if you find the target conversation endpoint. You can use the conversation_id column to match the initiator conversation endpoint with the target.
Thanks,
~ Remus
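To match the two endpoints as described above, something like the following could be run in both the initiator and the target database. It is a sketch; the all-zero GUID is a placeholder for the conversation_id read from the initiator side.

```sql
-- Run in the initiator database first to find the conversation_id,
-- then in the target database with that value.
DECLARE @conversation_id UNIQUEIDENTIFIER;
SET @conversation_id = '00000000-0000-0000-0000-000000000000';  -- placeholder

SELECT conversation_handle,
       conversation_id,
       is_initiator,        -- 1 on the initiator endpoint, 0 on the target endpoint
       state_desc,
       far_service,
       far_broker_instance
FROM sys.conversation_endpoints
WHERE conversation_id = @conversation_id;
```

If the target database returns no row, the target endpoint was never created, which suggests the first message was not accepted there.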
I am back in business with profiler. It was my bad. After I brought up Profiler I unchecked all events that were selected by default in Security Audit, Sessions, etc... so that no event is selected. It's interesting that when you uncheck all events and then click the Show All Events checkbox it hides all columns. Unless you click the Show All Columns checkbox no column is displayed. Then even if you select events to trace without columns profiler does not show any traces. It was late yesterday... :)
I ran some more tests without specifying broker id in begin dialog. Messages get delivered to target queue but what's interesting is that the same messages also remain in transmission queue of initiator. On the target side profiler shows a bunch of Broker:Message Undeliverable with this text: "The message has been dropped because the service broker in the target database is unavailable: 'The service broker is administratively disabled'". However in sys.databases the database has is_broker_enabled=1.
I also tried to end dialogs as you suggested (target ends first) but then realized that if messages are not delivered to target then it won't have a chance to end dialog on its side.
I am now looking to get around this by directly putting messages into the target service's queue over a separate connection to the target database. I understand that this is risky as data consistency can be compromised but for now there doesn't seem to be another way.
Regards,
Alex
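Since the target database was moved by detach, copy and attach, one thing that might be worth comparing is the broker state and broker GUID of every database on both instances. This is only a diagnostic sketch and not a confirmed explanation of the "administratively disabled" message.

```sql
-- Run on both instances.
-- Compare service_broker_guid with the broker instance used in the route
-- or in BEGIN DIALOG, and check whether more than one database reports the same GUID.
SELECT name,
       is_broker_enabled,
       service_broker_guid
FROM sys.databases
ORDER BY name;
```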
|||Remus,
I also see undeliverables with this text: "This message could not be delivered because the 'receive sequenced message' action cannot be performed in the 'CLOSED' state". And then come all those with "service broker is administratively disabled".
Alex