Problems in bringing up a two-node cluster.
Hi,
I have set up a two-node cluster with Node1 and Node2, with a separate SG for each node: SG1 (named Component1SG) runs on Node1, and SG2 (named Component2SG) runs on Node2.
I have configured Component1 to provide an RMD facility; that is, it exposes one function in "clCompAppMain.c" of Component1 that any RMD client can call.
Now, whenever I try to bring up either node, the following errors appear in the log file:
***********************************************************
The following was taken while bringing up the second node (with NodeId 3), but the same problem also occurs on the first node.
***********************************************************
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00265 : CRITIC) CPM/G active got IOC/TIPC notification for node [3] --
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00266 : CRITIC) - Possible reasons for this are on node [3] :
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00267 : CRITIC) - 1. AMF crashed.
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00268 : CRITIC) - 2. AMF was killed.
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00269 : CRITIC) - 3. Critical component failed.
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00270 : CRITIC) - 4. Kernel panicked.
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00271 : CRITIC) - 5. Communication was lost.
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.AMS.00272 : CRITIC) - 6. AMF was shutdown.
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.---.00273 : WARN) Not able to find node having node ID [3], error [0xf0013]
Sat Feb 23 16:40:27 2008 (Node1I0.15933 : AMF.CPM.---.00279 : WARN) Not able to find node having node ID [3], error [0xf0013]
***************************************************************
This problem occurs on both nodes. Sometimes a node comes up and sometimes it doesn't. [Please note that the NodeId of the second node is set to 3.]
The problem is more frequent on the second node.
I thought the problem was related to TIPC.
So before starting the node again I ran some TIPC commands:
>>rmmod tipc
>>modprobe tipc
and then ran
>>./tipc-config -v start --enforce-tipc-settings
But even then the problem did not go away.

Hi Aditya,
The above logs say that the ASP on one of the nodes has gone down. Since they report that the ASP with ASP_ADDRESS 3 has gone down, these logs must have come from the other node, on which ASP is still running.
Please send me the output of "dmesg" and of "tipc-config -n".
What are the TIPC settings (i.e. netid and addr) on both nodes?
I suspect that there is one more node (other than the two you are using) on the network that uses a TIPC netid clashing with the one on your nodes.
And if possible, please attach the complete logs generated by both nodes.
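The information requested above could be gathered on each node with something like the sketch below. The helper name collect_tipc_diag and the report file layout are made up for illustration. The -netid and -addr options are assumed to be available in your tipc-config build (check "tipc-config -help" if unsure), and reading dmesg may need root.

```shell
#!/bin/sh
# Sketch only: collect TIPC diagnostics for one node into a single report file.
# collect_tipc_diag is a hypothetical helper, not part of ASP or TIPC.
collect_tipc_diag() {
    out="$1"
    {
        echo "== host: $(uname -n) =="
        if command -v tipc-config >/dev/null 2>&1; then
            # Assumed options: network id, node address, neighbouring nodes.
            for opt in -netid -addr -n; do
                echo "== tipc-config $opt =="
                tipc-config "$opt" 2>&1 || echo "(tipc-config $opt failed)"
            done
        else
            echo "tipc-config not found in PATH"
        fi
        echo "== dmesg (TIPC lines) =="
        # Keep only kernel messages that mention TIPC.
        dmesg 2>&1 | grep -i tipc || echo "(no TIPC lines, or dmesg not readable)"
    } > "$out"
}
```

Running collect_tipc_diag "tipc-diag-$(uname -n).txt" on both nodes gives two files whose netid and addr sections can be compared side by side. Both nodes should report the same netid and different addresses, and the neighbour list should not show a node you do not recognise.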
Regards,
Amit