Issue


All virtual servers on a compute resource are displayed as offline in the CP interface although they are actually online.

Environment


  • All OnApp versions
  • Compute Resource - Xen, KVM - Static - Federation


Cause 


Mismatched MTU values on the NICs that connect Control Panel and the compute resource can cause communication errors between the two servers. The values should match so that packets are not corrupted or lost.
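
A quick way to confirm a suspected MTU mismatch is to ping across the link with fragmentation prohibited. The following is a minimal sketch, assuming Linux iputils ping (which provides the -M do option) and IPv4; the max_icmp_payload helper and the <HV_IP> placeholder are illustrative names, not part of OnApp. The largest payload that fits in a single packet is the MTU minus 28 bytes (a 20-byte IPv4 header without options plus an 8-byte ICMP header).

```shell
#!/bin/sh
# Hypothetical helper: print the largest ICMP payload that fits in one
# IPv4 packet for the given MTU (MTU - 20-byte IP header - 8-byte ICMP header).
max_icmp_payload() {
    echo $(( $1 - 28 ))
}

# Example usage, commented out because it needs a real address.
# Replace <HV_IP> with the compute resource IP before running.
#   ping -c 3 -M do -s "$(max_icmp_payload 1500)" <HV_IP>
#   ping -c 3 -M do -s "$(max_icmp_payload 9000)" <HV_IP>
```

If the smaller ping is answered but the larger one is not, that suggests at least one interface on the path does not pass 9000-byte frames.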


Resolution


Verify that SNMP is running properly on the compute resource and reporting its status to Control Panel.

  1. Check the /onapp/interface/log/production_snmp_stats_runner.log file to see if the compute resource is checking in.

    tail -f /onapp/interface/log/production_snmp_stats_runner.log
  2. If the output shows the following error, there is a problem with the SNMP process:

    [INFO][28784] 2014-05-29 14:14:14 +0100 L1 [HV: 1] undefined method `split' for nil:NilClass

    Connect via SSH to the compute resource and check if the SNMP process is running.

    ps aux | grep snmp
  3. Both the snmpd and snmptrapd processes should be running. If they are not, restart them:

    /etc/init.d/snmpd restart
    /etc/init.d/snmptrapd restart
  4. After restarting the daemons, go back to Control Panel and check whether the virtual servers are displayed as online. If they are still offline, check the /onapp/interface/log/onapp.err file for errors:

    tail -f /onapp/interface/log/onapp.err


    If the following message appears, there is a network issue between Control Panel and the compute resource:

    Timeout: No Response from udp:10.25.0.5:161.
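  5. Telnet from Control Panel to the compute resource on port 161 to check if the port is open. Note that telnet tests TCP only, while the timeout above refers to UDP port 161, so this check confirms basic reachability rather than SNMP itself: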

    telnet <HV_IP> 161
  6. If it connects, check the MTU setting on the Control Panel NIC that is used to connect to the compute resource:

    # ifconfig eth1
    eth1 Link encap:Ethernet HWaddr 00:16:3E:6F:F7:9E
    inet addr:10.25.0.4 Bcast:10.25.0.255 Mask:255.255.255.0
    inet6 addr: fe80::216:3eff:fe6f:f79e/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
    RX packets:3104352 errors:0 dropped:0 overruns:0 frame:0
    TX packets:3319144 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:587530312 (560.3 MiB) TX bytes:1412152936 (1.3 GiB)
    Interrupt:32

    Then log in to the compute resource (HV) and check the MTU on the NIC that SNMP uses to report:

    # ifconfig eth2
    eth2 Link encap:Ethernet HWaddr AC:16:2D:B9:21:C1
    inet addr:10.25.0.5 Bcast:10.25.0.255 Mask:255.255.255.0
    inet6 addr: fe80::ae16:2dff:feb9:21c1/64 Scope:Link
    UP BROADCAST RUNNING MULTICAST MTU:9000 Metric:1
    RX packets:19863436 errors:1 dropped:0 overruns:0 frame:0
    TX packets:19592604 errors:0 dropped:0 overruns:0 carrier:0
    collisions:0 txqueuelen:1000
    RX bytes:8089028444 (7.5 GiB) TX bytes:4331136805 (4.0 GiB)
  7. If the MTU values differ, change them so that they match. If the compute resource does not use this NIC for any other purpose, set its MTU to 1500. Otherwise, if the CP NIC supports it, change the MTU on the CP NIC to 9000.

    To change the MTU on the fly, run ifconfig eth2 mtu 1500. A change made this way does not survive a reboot.

    To keep the MTU at 1500 after a reboot, edit the /etc/sysconfig/network-scripts/ifcfg-eth2 file and set MTU=1500.
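
The comparison in steps 6 and 7 can also be scripted. The sketch below is one possible approach, not an OnApp tool: extract_mtu and compare_mtu are hypothetical helper names, the eth1 and eth2 interface names come from the example output above, and the root SSH login in the usage comment is an assumption. It expects net-tools ifconfig output, which prints the value either as MTU:1500 or as mtu 1500 depending on the version.

```shell
#!/bin/sh
# Hypothetical helper: read ifconfig output on stdin and print the MTU value.
# Handles both "MTU:1500" (older net-tools) and "mtu 1500" (newer net-tools).
extract_mtu() {
    sed -n 's/.*[Mm][Tt][Uu][: ]\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Hypothetical helper: compare two MTU values and report the result.
# Returns 0 when they match and 1 when they differ.
compare_mtu() {
    if [ "$1" = "$2" ]; then
        echo "MTU values match: $1"
    else
        echo "MTU mismatch: $1 vs $2"
        return 1
    fi
}

# Example usage, commented out because it needs real hosts.
# Run on Control Panel; replace <HV_IP> with the compute resource IP.
#   cp_mtu=$(ifconfig eth1 | extract_mtu)
#   hv_mtu=$(ssh root@<HV_IP> ifconfig eth2 | extract_mtu)
#   compare_mtu "$cp_mtu" "$hv_mtu"
```

Because compare_mtu returns a non-zero status on a mismatch, it can be used directly in a monitoring check or a larger script.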