CDN Debugger
OnApp CDN Debugger lets you carry out an initial investigation when you run into a problem with your CDN performance. It contains the following tools:
- DNS Debugger - explains the redirection logic of the CDN.
DNS Debugger allows you to check which CDN PoP a visitor's request would be redirected to. Go to https://debug.onappcdn.com and select "DNS-Debugger". You will be required to provide the following details:
- Resource ID, or Resource name - mandatory field
- IP - optional field
You can obtain the Resource ID in either of the following ways:
- In the CNAME of your CDN resource, the Resource ID is the numeric sequence that precedes your CDN resource CNAME.
- In CP, the Resource ID is the value of "CDN reference" which can be found at the "CDN Resource Details" page.
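As a rough illustration of the CNAME method, the sketch below pulls a leading numeric label out of a hostname. It assumes that the "numeric sequence" is the first dot-separated label of the CNAME; the hostname in the usage line and the function name `resource_id_from_cname` are hypothetical and not taken from OnApp.

```python
import re
from typing import Optional


def resource_id_from_cname(cname: str) -> Optional[str]:
    """Return the leading all-digit label of a CNAME, or None if there is none.

    Assumption: the Resource ID is the first dot-separated label.
    """
    first_label = cname.strip().split(".", 1)[0]
    # Accept only ASCII digits so that names such as "www" are rejected.
    if re.fullmatch(r"[0-9]+", first_label):
        return first_label
    return None


# Hypothetical CNAME, used only to show the call.
print(resource_id_from_cname("123456789.r.example-cdn.net"))
```

If the function returns None, take the "CDN reference" value from the "CDN Resource Details" page instead.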
IP (an optional field) is used to simulate the visitor's request on a CDN resource. If no IP is provided, the current connection IP will be automatically selected. You may also insert a different IP in order to simulate the DNS-Debugger result from a different region/country.
The result will display:
- The place you (acting as a visitor) have come from, with information about the country, coordinates, and closest airport.
- The CDN locations you were redirected to based on your location in the first item. All possible locations are listed, and the selected location's edge IP(s) are highlighted in green.
- The health status of each Edge location.
The OnApp CDN redirection decision is based on the ping results from the CDN PoPs to the visitor's DNS IP. If a server has multiple IPs, the IP shown in the DNS Debugger may differ from the server's "main IP" shown in other debugging tools, because customers are distributed across all of the PoP's IPs.
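The idea of choosing a PoP from ping results and health status can be sketched as follows. This is a simplified illustration, not OnApp's actual algorithm, which may weigh other factors; the function name `pick_pop`, the PoP labels, and the ping values are made up for the example.

```python
from typing import Dict, Optional


def pick_pop(ping_ms: Dict[str, float], healthy: Dict[str, bool]) -> Optional[str]:
    """Return the healthy PoP with the lowest ping to the visitor's DNS IP.

    ping_ms: ping time in milliseconds from each PoP to the DNS IP.
    healthy: health status of each PoP; a missing entry counts as unhealthy.
    """
    candidates = {pop: ms for pop, ms in ping_ms.items() if healthy.get(pop, False)}
    if not candidates:
        return None  # no healthy PoP has ping data
    return min(candidates, key=lambda pop: candidates[pop])


# Illustrative values only: "lax" has the lowest ping but is unhealthy.
pings = {"sin": 12.0, "man": 180.0, "lax": 8.0}
status = {"sin": True, "man": True, "lax": False}
print(pick_pop(pings, status))
```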
- Ping Test - provides the ping result from an IP/hostname to OnApp CDN POPs.
- Traceroute Test - provides the traceroute result from an IP/hostname to OnApp CDN POPs.
The Traceroute test gives you the traceroute result of your CDN resource to a specific OnApp CDN PoP.
Go to https://debug.onappcdn.com and select the Traceroute Test, insert an IP/hostname, select the CDN PoP that you would like to test, and click "Run Test".
- Content Compare Tool - allows you to compare the file on origin with the edge servers.
Content Compare allows you to check if the HTTP request of a CDN resource is successful on all the possible CDN PoPs where it can be accessed. Go to http://debug.onappcdn.com/contentcompare, insert your CDN hostname together with a file.
The result shows whether the HTTP request is successful on all CDN PoPs (code 200). If it is not successful, for example, if specific PoPs return an error code (400), stop using those PoPs so that users do not receive an error page, and inform CDN support.
- Munin Graphs - provides an overview of the edge servers' status.
Munin is a monitoring tool that surveys all your computers and presents the information in graphs through a web interface. Using Munin, you can easily monitor the performance of your computers, networks, SANs, applications, weather measurements, and so on. It helps you determine "what's different today" when a performance problem crops up and helps you check the capacity of any resource.
OnApp CDN provides Munin graphs for all edge servers to help operators monitor their edge servers' status. The Munin graphs are accessible through the CDN Debug page. These are the examples of good and bad graphs:
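For each graph type below, the description of the good example is given first, followed by the description of the bad example.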
CPU Usage
Y-Axis represents the CPU usage percentage.
Avoid high iowait and high steal. Also watch for unusual trends, such as system CPU usage or user CPU usage growing rapidly.
This shows CPU has low iowait and low steal.
This shows CPU has high iowait and some steal.
Possible Actions: Upgrade storage to a high-performance disk (e.g., SSD) and/or upgrade the CPU.
Disk usage in percent
Y-Axis represents disk usage percentage.
Ignore /dev, /run, /run/lock, /run/shm and /boot partitions.
Cache partitions may fill up to 90% of the disk space.
There is a lot of free space on the /, /mnt/nginx/bay-* and /var/cache/nginx-hls partitions.
There is not enough free space on the / partition. Generally, no partition should grow beyond 90% of its disk space.
Possible Actions: Requires investigation.
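The disk usage rule can be expressed as a small check. This is a minimal sketch that assumes you already have usage percentages per mount point (for example, read off the graph); the function name and the sample values are illustrative only.

```python
from typing import Dict, List

# Partitions the guidance says to ignore.
IGNORED_PARTITIONS = {"/dev", "/run", "/run/lock", "/run/shm", "/boot"}
LIMIT_PERCENT = 90.0


def partitions_over_limit(usage_percent: Dict[str, float]) -> List[str]:
    """Return the non-ignored mount points whose usage exceeds 90%, sorted by name."""
    return sorted(
        mount
        for mount, percent in usage_percent.items()
        if mount not in IGNORED_PARTITIONS and percent > LIMIT_PERCENT
    )


# Illustrative values: only "/" needs attention, since /boot is ignored.
sample = {"/": 95.0, "/boot": 99.0, "/mnt/nginx/bay-1": 88.0}
print(partitions_over_limit(sample))
```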
Utilization per device
Y-Axis represents the percentage of time the disk is busy.
The disk utilization should be below 80% on average.
This shows minimal disk utilization.
This shows high disk utilization, reaching 90%.
Possible Actions: Upgrade storage to a high-performance disk (e.g., SSD) and/or add more disks.
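Because the 80% guideline applies to the average rather than to individual peaks, a check would average the samples first. The sketch below assumes a list of utilization readings in percent; the function name and the sample values are illustrative.

```python
from statistics import mean
from typing import List

AVERAGE_LIMIT_PERCENT = 80.0


def utilization_ok(samples_percent: List[float]) -> bool:
    """Return True if average disk utilization is below 80%.

    Raises statistics.StatisticsError if the list is empty.
    """
    return mean(samples_percent) < AVERAGE_LIMIT_PERCENT


# A short peak at 95% is fine as long as the average stays below 80%.
print(utilization_ok([10.0, 20.0, 95.0]))
print(utilization_ok([85.0, 90.0, 88.0]))
```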
Disk IOs per device
Y-Axis represents IO operations per second. A positive value shows writing operations. A negative value shows reading operations. A zero value indicates no IO operations on the server.
This shows sustained reading and writing operations on the edge, which indicates the edge is working well.
This shows reading and writing at almost zero (middle line), which indicates the edge is idle and not processing any requests. The spike appeared when the edge was resumed to ACTIVE.
Possible Actions: Requires further investigation.
Disk latency per device
Y-Axis represents Average IO Wait in seconds. Shorter disk latency means better performance.
This shows the disk latency is below 10 milliseconds on average. This is reasonable disk latency for an edge with SSD storage. For an edge with HDD storage, disk latency below 50 milliseconds is acceptable.
This shows the disk latency is high (more than 10 milliseconds).
Possible Actions: Upgrade storage to a high-performance disk (e.g., SSD).
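Note that the graph axis is in seconds while the guidance is given in milliseconds. The sketch below applies the two limits with the unit conversion made explicit; the function name and the "ssd"/"hdd" labels are assumptions for the example.

```python
# Limits from the guidance, converted from milliseconds to seconds
# to match the graph's Y-Axis.
LATENCY_LIMIT_S = {
    "ssd": 10 / 1000,  # 10 ms
    "hdd": 50 / 1000,  # 50 ms
}


def latency_ok(avg_io_wait_s: float, storage: str) -> bool:
    """Return True if average IO wait (in seconds) is below the limit for the storage type.

    storage must be "ssd" or "hdd" (case-insensitive); anything else raises KeyError.
    """
    return avg_io_wait_s < LATENCY_LIMIT_S[storage.lower()]


# 20 ms is too slow for an SSD edge but acceptable for an HDD edge.
print(latency_ok(0.020, "ssd"))
print(latency_ok(0.020, "HDD"))
```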
Load average
Y-Axis represents CPU Load. Higher numbers may represent a problem or an overloaded machine.
The load is below 6 on average. Generally, a load of 6 to 8 is normal for average-specification edge servers with 4 to 8 CPU cores.
The load is high, above 10 on average. However, for a high-specification server, a higher CPU load is acceptable.
Possible Actions: Upgrade CPU to be able to handle large loads.
Memory usage
Y-Axis represents memory usage in bytes.
This shows steady memory usage.
Memory usage grows quickly beyond its capacity and leaves little unused space.
Possible Actions: Requires further investigation.
Nginx
Y-Axis represents Connections or Requests. A higher value indicates that the edge handles more connections and requests.
This shows that the edge handles a lot of user requests.
This shows that the edge handles a small number of user requests.
Possible Actions: Ensure the edge has a good specification so that DNS will redirect more requests to the edge.
Ping & Packet Loss
Y-Axis represents both Packet Loss and Ping.
A positive value represents ping time in milliseconds. A negative value represents the percentage of packet loss.
This shows a consistent connection between the edge and the OnApp CDN Monitoring servers, with no negative values.
This shows packet loss (negative values) and that the edge is unreachable from the OnApp CDN Monitoring servers.
Possible Actions: Ensure the Internet connection to the server is stable.
Throughput per device
Y-Axis represents bytes of reading and writing per second.
A positive value indicates data writing. A negative value indicates data reading. A zero value indicates no data operations on the storage devices.
This shows that the storage devices are actively handling user requests.
This shows that throughput on the storage devices is near zero, which indicates the edge might be idle and not handling user requests.
Possible Actions: Requires investigation.
Uptime
Y-Axis represents uptime in days.
Server uptime that increases steadily over time indicates no downtime on the server.
A sudden drop in server uptime indicates that the server experienced downtime recently.
Possible Actions: Ensure the server has less downtime and that the network connection is good.
External links:
- Edge Monitoring - allows you to track your edge servers' health status.
Go to https://debug.onappcdn.com/edgemon. This monitoring panel allows you to check the health of the CDN edge servers. It also shows the graph of the edge server components, like CPU, RAM, and Disk usage.
check_fs_error - This is to check the filesystem of the edge server by writing to it
check_heartbeat - This is to check if the edge responds to the monitoring
check_http - This is to check the edge server port 80 for a response
check_load - This is to check the CPU load
check_munin - This is to check if OnApp Munin monitoring can retrieve data from this edge
check_network - This is to check the outgoing speed of the server
check_ping - This is to check if the edge server is pingable from inside and outside, and to check for packet loss
check_puppet_compile - This is to check if the edge server has the latest OnApp CDN configuration
When the edge monitoring service detects that an edge server does not function as expected, the status of the edge server changes to DOWN. This information is provided in the CDN DNS redirection strategy. When the edge monitoring service detects that the edge server has recovered, the status of the edge server changes to OK, which is provided in the CDN DNS redirection strategy as well.
If the status of the edge server is DOWN for 30 consecutive days, its status changes to DEFUNCT and all monitoring activity is permanently halted for this edge server. This means that the edge server will remain DEFUNCT even if it has recovered from the service interruption. To check the status, go to your Dashboard (admin.onapp.com) > CDN > Portal > Edge Servers > Edge Servers. To enable delivery service for this edge server, contact support@onapp.com.
Additionally, you can subscribe to alerts on your edge server status at https://debug.onappcdn.com/edgealert.
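The OK, DOWN, and DEFUNCT transitions described above can be modelled as a small state function. This is a sketch under stated assumptions: it treats monitoring results as one pass/fail value per day, which is a simplification, and the function name is hypothetical.

```python
from typing import List

DEFUNCT_AFTER_DAYS = 30


def edge_status(daily_checks_ok: List[bool]) -> str:
    """Derive the edge status from daily monitoring results, oldest first.

    True means the edge worked as expected on that day. Once the edge has been
    DOWN for 30 consecutive days it becomes DEFUNCT and stays DEFUNCT, because
    monitoring is halted and later recovery is no longer observed.
    """
    consecutive_down = 0
    for ok in daily_checks_ok:
        if ok:
            consecutive_down = 0
        else:
            consecutive_down += 1
            if consecutive_down >= DEFUNCT_AFTER_DAYS:
                return "DEFUNCT"
    return "OK" if consecutive_down == 0 else "DOWN"


# Recovery on day 31 comes too late: the edge is already DEFUNCT.
print(edge_status([False] * 30 + [True]))
```

The early return reflects the point that a DEFUNCT edge does not come back on its own; restoring delivery requires contacting support.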
CDN edge monitoring servers are located in:
- Singapore - collectdproxy-sin.omega.onappcdn.com
- Manchester - collectdproxy-man.omega.onappcdn.com
- Los Angeles - collectdproxy-lax.omega.onappcdn.com
- Rotterdam - collectdproxy-rtm.omega.onappcdn.com
These monitoring servers are subject to change without notice.
- Ping IP Submission - allows you to submit DNS IPs for the Ping engine to collect the ping data and to receive the results in the redirection algorithm.
- Edge Servers IP Ranges - allows you to view all the current IP ranges of the subscribed marketplace and edge servers, for use in your firewall system(s).
- Hardware Inventory - provides you with a list of your servers and POPs.
OnApp CDN Debugger requires the same login credentials as your dashboard user account. If you experience issues using the CDN Debugger, the CDN Debug access option may not be enabled in your dashboard user account. Edit the user account permissions in the dashboard or contact OnApp Support.