Can't start up csa102 on payload nodes

Submitted by kaya on Thu, 2007-01-11 01:11.
I can't start up csa102 on the payload nodes.

Situation: I configured the 2.2 setup for the Evaluation System using three PCs, and added the statements below to target.conf:

IP_SCNodeI0=10.182.112.142
SLOT_SCNodeI0=1
IP_PayloadNodeI0=10.182.112.101
SLOT_PayloadNodeI0=3
IP_PayloadNodeI1=10.182.112.102
SLOT_PayloadNodeI1=4

I also commented out "sispServiceUnit name="cmSU"" in project-area_dir/SISP/models/eval/config/amfConfig.xml. Then I ran configure with --with-model-name=eval, ran make, and deployed the images to the three machines. Finally I executed "/root/sisp/etc/init.d/sisp start" on all three machines to start sisp.

On SCNodeI0, csa101 and csa104 start normally.

The /var/log/sisp from SCNodeI0:
-------------------------------------------------------------------------------
PID[16566]:File[atca.c]:Func[recvMsg]:Line[272]: Receiving Message Failed: Unspecified:ff
PID[16566]:File[atca.c]:Func[setIPMBAddress]:Line[437]: Unable to get the IPMB address
PID[16566]:File[atca.c]:Func[recvMsg]:Line[272]: Receiving Message Failed: Unspecified:ff
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_logd], PID : [16575]
GMS:[info ] New Group Created with Group id :0
GMS:[info ] Waiting for Prospective Leaders...........PostPoning Leader Election for 5 secs
GMS:[info ] LeaderElection Initial Run
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_gms], PID : [16580]
Leader = 1, deputy = -1, leadershipChanged = 1, numberOfItems = 1
Updating the ACTIVE ENTRY with compId 65536 1 Delete the old Master cpmHandleGroupInformation HaState 1 Active nodeId 1
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_event], PID : [16589]
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_txn], PID : [16593]
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_name], PID : [16603]
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_ckpt], PID : [16608]
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_cor], PID : [16611]
Initialize the CKPT, create section etc ..
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_alarm], PID : [16628]
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_fault], PID : [16631]
Cannot find module (IP-MIB): At line 0 in (none)
Cannot find module (IF-MIB): At line 0 in (none)
Cannot find module (TCP-MIB): At line 0 in (none)
Cannot find module (UDP-MIB): At line 0 in (none)
Cannot find module (HOST-RESOURCES-MIB): At line 0 in (none)
Cannot find module (SNMPv2-MIB): At line 0 in (none)
Cannot find module (SNMPv2-SMI): At line 0 in (none)
Cannot find module (NOTIFICATION-LOG-MIB): At line 0 in (none)
Cannot find module (DISMAN-EVENT-MIB): At line 0 in (none)
Cannot find module (DISMAN-SCHEDULE-MIB): At line 0 in (none)
Cannot find module (UCD-SNMP-MIB): At line 0 in (none)
Cannot find module (UCD-DEMO-MIB): At line 0 in (none)
Cannot find module (SNMP-TARGET-MIB): At line 0 in (none)
Cannot find module (NET-SNMP-AGENT-MIB): At line 0 in (none)
Cannot find module (HOST-RESOURCES-TYPES): At line 0 in (none)
Cannot find module (SNMP-VIEW-BASED-ACM-MIB): At line 0 in (none)
Cannot find module (SNMP-COMMUNITY-MIB): At line 0 in (none)
Cannot find module (IP-FORWARD-MIB): At line 0 in (none)
Cannot find module (NET-SNMP-EXTEND-MIB): At line 0 in (none)
Cannot find module (UCD-DLMOD-MIB): At line 0 in (none)
Cannot find module (SNMP-FRAMEWORK-MIB): At line 0 in (none)
Cannot find module (SNMP-MPD-MIB): At line 0 in (none)
Cannot find module (SNMP-USER-BASED-SM-MIB): At line 0 in (none)
Cannot find module (SNMP-NOTIFICATION-MIB): At line 0 in (none)
Cannot find module (SNMPv2-TM): At line 0 in (none)
NET-SNMP version 5.3.0.1 AgentX subagent connected
GMS:[info ] New Member [3] Joined the Group [0]
PID[16566]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_snmp], PID : [16634]
Leader = 1, deputy = -1, leadershipChanged = 0, numberOfItems = 1
######## before clCpmResponse ########
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(10005:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(10005:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(1000b:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(1000b:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(1000e:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(1000e:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(10011:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(10011:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(10011:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(10011:fffffffe)
*************initMMSnmpIdTranslations(): moID*******************
ClCorMOId:[Svc: 3] (10001:fffffffe).(10009:fffffffe).(10011:fffffffe)
initializing csa104MIB Objects Almost there Done
Cannot find module (IP-MIB): At line 0 in (none)
Cannot find module (IF-MIB): At line 0 in (none)
Cannot find module (TCP-MIB): At line 0 in (none)
Cannot find module (UDP-MIB): At line 0 in (none)
Cannot find module (HOST-RESOURCES-MIB): At line 0 in (none)
Cannot find module (SNMPv2-MIB): At line 0 in (none)
Cannot find module (SNMPv2-SMI): At line 0 in (none)
Cannot find module (NOTIFICATION-LOG-MIB): At line 0 in (none)
Cannot find module (DISMAN-EVENT-MIB): At line 0 in (none)
Cannot find module (DISMAN-SCHEDULE-MIB): At line 0 in (none)
Cannot find module (UCD-SNMP-MIB): At line 0 in (none)
Cannot find module (UCD-DEMO-MIB): At line 0 in (none)
Cannot find module (SNMP-TARGET-MIB): At line 0 in (none)
Cannot find module (NET-SNMP-AGENT-MIB): At line 0 in (none)
Cannot find module (HOST-RESOURCES-TYPES): At line 0 in (none)
Cannot find module (SNMP-VIEW-BASED-ACM-MIB): At line 0 in (none)
Cannot find module (SNMP-COMMUNITY-MIB): At line 0 in (none)
Cannot find module (IP-FORWARD-MIB): At line 0 in (none)
Cannot find module (NET-SNMP-EXTEND-MIB): At line 0 in (none)
Cannot find module (UCD-DLMOD-MIB): At line 0 in (none)
Cannot find module (SNMP-FRAMEWORK-MIB): At line 0 in (none)
Cannot find module (SNMP-MPD-MIB): At line 0 in (none)
Cannot find module (SNMP-USER-BASED-SM-MIB): At line 0 in (none)
Cannot find module (SNMP-NOTIFICATION-MIB): At line 0 in (none)
Cannot find module (SNMPv2-TM): At line 0 in (none)
NET-SNMP version 5.3.0.1 AgentX subagent connected
GMS:[info ] New Member [4] Joined the Group [0]
-------------------------------------------------------------------------------

The /var/log/sisp from PayloadNodeI0:
-------------------------------------------------------------------------------
PID[9332]:File[atca.c]:Func[recvMsg]:Line[272]: Receiving Message Failed: InvCmd:c1
PID[9332]:File[atca.c]:Func[setIPMBAddress]:Line[437]: Unable to get the IPMB address
PID[9332]:File[atca.c]:Func[recvMsg]:Line[272]: Receiving Message Failed: InvCmd:c1
PID[9332]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_logd], PID : [9341]
GMS:[info ] New Group Created with Group id :0
GMS:[info ] Waiting for Prospective Leaders...........PostPoning Leader Election for 5 secs
GMS:[info ] LeaderElection Initial Run
GMS:[info ] New Member [4] Joined the Group [0]
-------------------------------------------------------------------------------

The /var/log/sisp from PayloadNodeI1:
-------------------------------------------------------------------------------
PID[611]:File[atca.c]:Func[recvMsg]:Line[272]: Receiving Message Failed: InvCmd:c1
PID[611]:File[atca.c]:Func[setIPMBAddress]:Line[437]: Unable to get the IPMB address
PID[611]:File[atca.c]:Func[recvMsg]:Line[272]: Receiving Message Failed: InvCmd:c1
PID[611]:File[cpm/clCpmComponent.c]:Func[_cpmSaAwareComponentInstantiate]:Line[3275]: Image name : [sisp_logd], PID : [620]
GMS:[info ] New Group Created with Group id :0
GMS:[info ] Waiting for Prospective Leaders...........PostPoning Leader Election for 5 secs
GMS:[info ] LeaderElection Initial Run
-------------------------------------------------------------------------------

On SCNodeI0, I executed lockutil as below:

[root@SCNODE ~]# ./sisp/bin/lockutil.sh la 101
Successfully changed state of csa101SGI0 to LockAssignment
[root@SCNODE ~]# ps -ef |grep 101 |grep -v grep
root 16867 1 0 15:41 ? 00:00:00 /root/sisp/bin/csa101 -p
[root@SCNODE ~]# ./sisp/bin/lockutil.sh la 102
Successfully changed state of csa102SGI0 to LockAssignment
[root@SCNODE ~]# ssh paynode0 ps -ef |grep sisp |grep -v grep
root 9332 1 0 14:27 ? 00:00:00 ./sisp_amf -c 0 -l 3 -n PayloadNodeI0 -p BootConfig1
root 9341 1 0 14:27 ? 00:00:00 /root/sisp/bin/sisp_logd
root 9346 1 0 14:27 ? 00:00:00 /root/sisp/bin/sisp_gms gmsconfig.xml
[root@SCNODE ~]# ssh paynode1 ps -ef |grep sisp |grep -v grep
root 611 1 0 15:31 ? 00:00:00 ./sisp_amf -c 0 -l 4 -n PayloadNodeI1 -p BootConfig1
root 620 1 0 15:31 ? 00:00:00 /root/sisp/bin/sisp_logd
root 625 1 0 15:31 ? 00:00:00 /root/sisp/bin/sisp_gms gmsconfig.xml

So csa102 does not start on PayloadNodeI1 when I execute lockutil on SCNode.

On PayloadNodeI0, I executed lockutil as below:

[root@eta ~]# ./sisp/bin/lockutil.sh la 102
PID[28582]:File[clEo.c]:Func[main]:Line[499]: Could not open shared memery segment. Please check the component Name or theshared memory segment permission rc=10120

Questions:

1. Is there any serious error in /var/log/sisp?

2. Can I start a component on a payload node using lockutil, as shown above, or can I only start components on SCNode?

3. Can I start a component on a payload node using other methods? If so, how?

4. If I can start components on a payload node, can I start a component that is located on a payload node, like csa102, from SCNode? Can I start a component that is located on SCNode, like csa101, from a payload node?

5. I investigated the shared memory segment error above. The more detailed error information is as follows.
In posix.c, function cosPosixShmIdGet():
shmId = shmget ((key_t)key, size, (0666 | IPC_CREAT));
pName:lockutilServer_PayloadNodeI0 key:-797338933, size:0, ret shmId:-1 errno:22 (which indicates Invalid argument)
So it seems that the shared memory segment for "lockutilServer_PayloadNodeI0" has not been created by the sisp initialization code. Is there some error in my payload node sisp initialization?

Your help will be highly appreciated!
--
cheers,
Kaya

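Before digging into the logs, it can help to rule out a simple mistake in the node entries themselves. The sketch below is not part of SISP; it assumes target.conf holds one KEY=value statement per line (the post shows them run together), and the helper names parse_target_conf and find_problems are made up for illustration. It checks that every node has both an IP and a slot, and that no IP or slot is used twice.

```python
from collections import Counter

# Node entries quoted in the post, assumed to be one statement per line.
TARGET_CONF = """\
IP_SCNodeI0=10.182.112.142
SLOT_SCNodeI0=1
IP_PayloadNodeI0=10.182.112.101
SLOT_PayloadNodeI0=3
IP_PayloadNodeI1=10.182.112.102
SLOT_PayloadNodeI1=4
"""


def parse_target_conf(text):
    """Collect IP_<node> and SLOT_<node> entries into {node: {"IP": ..., "SLOT": ...}}."""
    nodes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        kind, _, node = key.partition("_")
        if kind in ("IP", "SLOT"):
            nodes.setdefault(node, {})[kind] = value.strip()
    return nodes


def find_problems(nodes):
    """Return human-readable problems: missing entries and duplicated IPs or slots."""
    problems = []
    for node, entry in nodes.items():
        for kind in ("IP", "SLOT"):
            if kind not in entry:
                problems.append(f"{node}: missing {kind}")
    for kind in ("IP", "SLOT"):
        counts = Counter(e[kind] for e in nodes.values() if kind in e)
        for value, n in counts.items():
            if n > 1:
                problems.append(f"{kind} {value} used by {n} nodes")
    return problems


nodes = parse_target_conf(TARGET_CONF)
problems = find_problems(nodes)
print(problems or "node entries look consistent")
```

For the values in this post the check finds nothing wrong, and the slots 3 and 4 match the "-l 3" and "-l 4" arguments visible in the sisp_amf process listings, so the node entries themselves do not appear to be the problem.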
Submitted by harikrishna_gp on Thu, 2007-01-11 23:10.

Kaya,
From the errors on the payload node, it looks like csa102 was not configured to run on the payload blade.
To get the multi-node setup running, you must follow all the steps in section 4.2 of OpenClovis_EvaluationGuide.
If you have done that and still don't see the results, please post amfConfig.xml and amfDefinitions.xml.
Here are the answers to your other questions:

1. As of now, nothing looks alarming in /var/log/sisp.

2. You can use lockutil on the payload node as well.

3. You can use the debug CLI as explained in the csa101 case. Remember that setc needs the slot id as its argument. If you provide the payload slot id, it should be fine.

4. Again, you can use the debug CLI for this purpose, providing the proper slot id.

Thanks,
Hari

Submitted by kaya on Sat, 2007-01-13 07:18.
Hi,

After investigating the code further, I found that startup blocks at exactly this point:

bmInitialize()
cpmBmStartup() -> cpmBmSetLevel(2) -> cpmBmStartNextLevel (boot level 2) -> while(gpClCpm->polling) clCpmCpmLocalRegister(gpClCpm->pCpmLocalInfo);

Because clCpmCpmLocalRegister always fails here, execution never leaves this loop, so startup is blocked.

clCpmCpmLocalRegister -> clCpmMasterAddressGet() -> clIocTransparencyLogicalToPhysicalAddrGet() -> ioctl(fd, IOC_XPARENCY_LAYER_PHYSICAL_ADDR_GET, &temp) -> iocKernelXparencyLayerPhysicalAddressGet() -> _clIocTransparencyLogicalToPhysicalAddrGet()

The root cause of the clCpmCpmLocalRegister failure is that the ioctl call to get the physical address always fails. I have studied the IOC code, but it is hard for me to understand and I still can't find the answer. Could you give me your opinion on this issue? Also, could you answer the questions in my previous reply?

Your help will be highly appreciated!
Thanks,
Kaya

Submitted by kaya on Fri, 2007-01-12 03:28.
Submitted by harikrishna_gp on Mon, 2007-01-15 19:24.

Kaya,
What chassis are you using? Who is the vendor?
During the configuration of the evaluation kit, when you execute './configure' (step 4.2.3) you should specify '--with-cm-build'. Please run help on './configure' and build with the Chassis Manager.

What version of the Linux kernel are you using? Can you post the output of 'uname -a'?

Thanks,
Hari

Submitted by kaya on Mon, 2007-01-15 21:02.

Hari,
My target environment is emulated according to Runtime Hardware Setup 2.2.
Below is a list of the hardware:
1 System Controller node (common PC)
2 Payload Nodes (common PCs)
1 switch

Can this configuration work?

output of 'uname -a':
[root@eta ~]# uname -a
Linux eta 2.6.9-42.EL #1 Wed Jul 12 23:16:43 EDT 2006 i686 i686 i386 GNU/Linux

Submitted by harikrishna_gp on Tue, 2007-01-16 21:35.

Kaya,
The setup you are using is a well-tested configuration.
The log files and the debug CLI logs suggest that the system controller is not able to communicate with the payload blades.
I am assuming that basic connectivity has been tested and that all three machines can reach each other.
Please make sure that no firewalls are enabled.
You can run 'iptables -F' as root to flush any firewall rules.

Thanks,
Hari

Submitted by kaya on Thu, 2007-01-25 00:16.

Hi Hari,

At last I found the problem: one of the nodes, the system controller, had a different broadcast address from the two payload nodes.
After I adjusted the broadcast address, everything works.
One question: even though the netmask of the system controller is still different from that of the payload nodes, everything works. Does that mean sisp is only affected by the broadcast address?

Thanks,
Kaya

Submitted by harikrishna_gp on Fri, 2007-01-26 23:21.

Hi Kaya,
Glad to hear that. The netmask will not really have any impact, as all these nodes are on the same LAN.

Thanks,
Hari
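
For anyone who hits the same symptom, the root cause reported above (nodes configured with different broadcast addresses) can be checked before starting sisp. The following is a minimal sketch using Python's standard ipaddress module. The IP addresses are the ones from the first post; the netmasks are hypothetical, since the thread never states them, and the function names are made up for illustration. It derives the default broadcast address each node would get from its IP and netmask, then groups the nodes by that address.

```python
import ipaddress


def broadcast_for(ip, netmask):
    """Default broadcast address implied by an IPv4 address and its netmask."""
    return ipaddress.ip_interface(f"{ip}/{netmask}").network.broadcast_address


def group_by_broadcast(broadcasts):
    """Map each broadcast address to the list of nodes configured with it."""
    groups = {}
    for node, bcast in broadcasts.items():
        groups.setdefault(ipaddress.ip_address(bcast), []).append(node)
    return groups


# IPs are from the thread; the netmasks below are ASSUMED for illustration only.
configured = {
    "SCNodeI0": broadcast_for("10.182.112.142", "255.255.0.0"),
    "PayloadNodeI0": broadcast_for("10.182.112.101", "255.255.255.0"),
    "PayloadNodeI1": broadcast_for("10.182.112.102", "255.255.255.0"),
}

groups = group_by_broadcast(configured)
if len(groups) > 1:
    for bcast, members in groups.items():
        print(f"{bcast}: {', '.join(members)}")
    print("Broadcast addresses differ between nodes.")
else:
    print("All nodes share one broadcast address.")
```

With these assumed netmasks the system controller ends up in a different group from the payload nodes. A broadcast address can also be set on an interface independently of the netmask, which is consistent with Kaya's observation that fixing only the broadcast address was enough; in that case, pass the addresses actually configured on each node to group_by_broadcast instead of deriving them.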