10.1 Monitoring¶
About This Service¶
Overview¶
Keep-alive monitoring and resource shortage checks for monitoring targets.
Capacity management by viewing statistics for various equipment.
Logging of failure occurrences and their status.
Features¶
Available Functions¶
List of Functions¶
The following functions are available with Monitoring. Customers can configure each of them as needed.
Functions | Details |
Data Collection/Accumulation | Customers can collect the data they need from various resources and store it for a certain period. |
Custom Meter | Data values collected by a script or similar means can be accumulated in the monitoring server. |
Graph Generation | Customers can generate graphs from the data collected and stored by the Monitoring service. |
Action Settings | Customers can configure specific actions to run when a threshold they set is exceeded. |
Monitoring Item List | Customers can view the list of monitoring items they have set. |
Alert History | Customers can view the history of alerts raised when thresholds were exceeded. |
Data Download | Customers can download stored monitoring data via the API. |
API | API is available. |
Description of Functions¶
Data Collection/Accumulation¶
This function collects data from various resources for the relevant monitoring items.
The collected data is accumulated on the system and can be used by the Graph Generation function described in a later section.
The data accumulation period is 32 days (including the current day).
Virtual Server Instance
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
CPU usage rate *1 | CPU Utilization | nova.cpu.utilization.percents | percent | Collected on the hypervisor where the virtual server instance is running | Average over the monitoring interval | 5 minutes |
Disk Read Bytes | Disk Read Bytes | nova.disk.read.bytes | byte | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Disk Write Bytes | Disk Write Bytes | nova.disk.write.bytes | byte | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Disk Read Requests | Disk Read Requests | nova.disk.read.requests | request | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Disk Write Requests | Disk Write Requests | nova.disk.write.requests | request | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Incoming Traffic | Incoming Traffic | nova.network.incoming.bytes | byte | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Outgoing Traffic | Outgoing Traffic | nova.network.outgoing.bytes | byte | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
VM alive monitoring *2 | VM Status | nova.vm.status.bool | boolean | Collected on the hypervisor where the virtual server instance is running. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Hypervisor Status | Hypervisor Status | nova.hv.status.bool | boolean | Determined by the state of the network interface of each virtual server instance on the hypervisor on which the virtual server instance is running. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Note
*1 The maximum value is “100 percent x number of CPU cores”. (Example: If the virtual server instance has 16 CPUs, the value ranges from 0 to 1600 percent.)
Note
*2 “VM alive monitoring” shows the monitoring result from the infrastructure side that provides the virtual server instance. Trouble caused by problems inside the instance (such as a kernel panic or OS crash) while the virtual server instance is running cannot be detected.
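Note *1 means that a raw CPU Utilization sample can exceed 100 percent on a multi-core instance. The following is a minimal sketch of how a reader might rescale such a sample to a 0 to 100 range before comparing instances of different sizes. The function name and the validation rules are illustrative assumptions and are not part of the Monitoring service.

```python
def normalize_cpu_utilization(raw_percent: float, cpu_cores: int) -> float:
    """Rescale a multi-core utilization sample to a 0-100 range.

    raw_percent is a value such as nova.cpu.utilization.percents, whose
    maximum is 100 percent x number of CPU cores (see note *1).
    """
    if cpu_cores <= 0:
        raise ValueError("cpu_cores must be a positive integer")
    max_percent = 100.0 * cpu_cores
    if not 0.0 <= raw_percent <= max_percent:
        raise ValueError(f"raw_percent must be between 0 and {max_percent}")
    return raw_percent / cpu_cores


if __name__ == "__main__":
    # A 16-CPU instance reporting 800 percent is using half of its capacity.
    print(normalize_cpu_utilization(800.0, 16))
```

A threshold set on the raw meter should account for the core count in the same way: for the 16-CPU example, "80 percent busy" corresponds to a raw value of 1280.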
Virtual Server Volume
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Disk Read Bytes | Disk Read Bytes | cinder.disk.read.bytes | byte | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Disk Write Bytes | Disk Write Bytes | cinder.disk.write.bytes | byte | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Disk Read Requests | Disk Read Requests | cinder.disk.read.requests | request | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Disk Write Requests | Disk Write Requests | cinder.disk.write.requests | request | Collected on the hypervisor where the virtual server instance is running | Total within the monitoring interval | 5 minutes |
Baremetal Server / vSphere ESXi / Hyper-V
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Power Supply Status | Chassis Power Status | baremetal-server.chassis-power.status.bool | boolean | Collected from the resource by IPMI. Definitions: ON = false (0), OFF = true (1) | Monitoring value | 5 minutes |
Fan Status | Fan Status | baremetal-server.fan.status.bool | boolean | Determined by the presence or absence of fan failure collected by IPMI from this resource. Sends true (1) if any fan has failed; sends false (0) in the normal state | Monitoring value | 5 minutes |
Chassis Power Status | Power Supply Status | baremetal-server.power-supply.status.bool | boolean | Collected from the resource by IPMI. Sends true (1) if at least one power supply has failed; sends false (0) in the normal state | Monitoring value | 5 minutes |
CPU Status | CPU Status | baremetal-server.cpu.status.bool | boolean | Determined by the presence or absence of CPU failure collected by IPMI from this resource. Sends true (1) if any CPU has failed; sends false (0) in the normal state | Monitoring value | 5 minutes |
Memory Status | Memory Status | baremetal-server.memory.status.bool | boolean | Determined by the presence or absence of memory failure collected by IPMI from this resource. Sends true (1) if any memory has failed; sends false (0) in the normal state | Monitoring value | 5 minutes |
HDD failure | Disk Status Failures | baremetal-server.disk.status.failures | int | Determined by the presence or absence of HDD failure collected by IPMI from the target resource. Sends the number of failed disks (0 to 36) | Monitoring value | 5 minutes |
NIC failure | NIC Status | baremetal-server.nic.status.bool | boolean | Determined by the presence or absence of NIC failure collected by IPMI from the target resource. Sends true (1) if at least one NIC has failed; sends false (0) in the normal state. ※ Not displayed for Workload Optimized 2 | Monitoring value | 5 minutes |
System board failure | System Board Status | baremetal-server.system.board.status.bool | boolean | Determined by the presence or absence of system board failure collected by IPMI from the target resource. Sends true (1) if at least one system board has failed; sends false (0) in the normal state. ※ Not displayed for Workload Optimized 2 | Monitoring value | 5 minutes |
Other Statuses | Other Statuses | baremetal-server.etc.status.bool | boolean | Determined by the presence or absence of the following failures, collected from the resource by IPMI: RAID controller failure; failure to get HDD information (Workload Optimized 2 only); other statuses. Sends true (1) in any of the above cases; sends false (0) in the normal state | Monitoring value | 5 minutes |
Internet Connectivity
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Traffic IN | Incoming Traffic | internet-connectivity.traffic.in.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic | internet-connectivity.traffic.out.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Gateway Status | Gateway Status | internet-connectivity.internet_gateway.status.bool | boolean | Based on the presence or absence of the SNMP response of the network device to which the function of the resource is assigned (IF connected to the Internet GW logical network). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
GW-IF Status | Interface Status | internet-gw-interface.gw_interface.status.bool | boolean | Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned (IF connected to the Internet GW logical network). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Note
If the Internet gateway has no global IP address, no Internet connection traffic flows, so sample values for Traffic IN and Traffic OUT are not monitored.
VPN Connectivity
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Traffic IN | Incoming Traffic | vpn-interface.traffic.in.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic | vpn-interface.traffic.out.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Gateway Status | Gateway Status | vpn-connectivity.vpn_gateway.status.bool | boolean | Based on the presence or absence of the SNMP response of the network device to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
GW-IF Status | Interface Status | vpn-gw-interface.gw_interface.status.bool | boolean | Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned (IF connected to the logical network of the VPN GW). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
IF Status | Interface Status | vpn-interface.vpn_interface.status.bool | boolean | Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned (IF connected to the VPN network of the VPN GW). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Flexible InterConnect ECL2.0 Connection
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Traffic IN | Incoming Traffic | fic-interface.traffic.in.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic | fic-interface.traffic.out.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Gateway Status | Gateway Status | fic-connectivity.fic_gateway.status.bool | boolean | Determined by the status collected by SNMP from the network device to which the function of the resource is assigned. Failure if all redundant systems are in the DOWN status. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
GW-IF Status | Interface Status | fic-gw-interface.gw_interface.status.bool | boolean | Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned (IF connected to the logical network of the Flexible InterConnect GW). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
IF Status | Interface Status | fic-interface.fic_interface.status.bool | boolean | Determined by the normality of the BGP status of the network device to which the function of the resource is assigned (IF connected to the Flexible InterConnect GW network). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Logical Network
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Traffic IN | Incoming Traffic | logical-network-port.traffic.in.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic | logical-network-port.traffic.out.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
NW Status | Operational State | logical-network.network.status.bool | boolean | Based on the result of the HTTP health check for the network device to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Port Status | Operational State | logical-network-port.port.status.bool | boolean | Collected from the network equipment to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Datacenter Inter-Connectivity
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Traffic IN | Incoming Traffic | interdc-interface.traffic.in.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic | interdc-interface.traffic.out.bps | bps | Collected from the network equipment to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Gateway Status | Gateway Status | interdc-connectivity.interdc_gateway.status.bool | boolean | Based on the presence or absence of the SNMP response of the network device to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
GW-IF Status | Interface Status | interdc-gw-interface.gw_interface.status.bool | boolean | Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned (IF connected to the logical network of the Datacenter Inter-Connectivity GW). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
IF Status | Interface Status | interdc-interface.interdc_interface.status.bool | boolean | Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned (IF connected to the network side of the Datacenter Inter-Connectivity GW). Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Amazon Web Services Inter-Connectivity
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Gateway Status | Gateway Status | aws-connectivity.aws_gateway.status.bool | boolean | Based on the presence or absence of the SNMP response of the network device connected to AWS. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
GW-IF Status | Interface Status | aws-gw-interface.gw_interface.status.bool | boolean | Based on the VRRP status of the network device connected to AWS. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Traffic IN | Incoming Traffic | aws-interface.traffic.in.bps | bps | Based on the SNMP response of the network device connected to AWS | Average value over 5 minutes | 5 minutes |
Traffic OUT | Outgoing Traffic | aws-interface.traffic.out.bps | bps | Based on the SNMP response of the network device connected to AWS | Average value over 5 minutes | 5 minutes |
IF Status | Interface Status | aws-interface.aws_interface.status.bool | boolean | Based on the BGP status of the network device connected to AWS. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Google Cloud Platform Inter-Connectivity
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Gateway Status | Gateway Status | gcp-connectivity.gcp_gateway.status.bool | boolean | Determined by the presence or absence of the SNMP response of the network equipment connected to GCP. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
GW-IF Status | Interface Status | gcp-gw-interface.gw_interface.status.bool | boolean | Determined by the normality of the VRRP status of the network equipment connected to GCP. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Traffic IN | Incoming Traffic | gcp-interface.traffic.in.bps | bps | Based on the SNMP response of the network device connected to GCP | Average value over 5 minutes | 5 minutes |
Traffic OUT | Outgoing Traffic | gcp-interface.traffic.out.bps | bps | Based on the SNMP response of the network device connected to GCP | Average value over 5 minutes | 5 minutes |
IF Status | Interface Status | gcp-interface.gcp_interface.status.bool | boolean | Determined by the normality of the BGP status of the network equipment connected to GCP. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Azure Inter-Connectivity
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Gateway Status | Gateway Status | azure-connectivity.azure_gateway.status.bool | boolean | Obtained by SNMP from network equipment connected to Azure | Monitoring value | 1 minute |
GW-IF Status | Interface Status | azure-gw-interface.gw_interface.status.bool | boolean | Obtained by NETCONF from network equipment connected to Azure | Monitoring value | 1 minute |
Traffic IN | Incoming Traffic | azure-interface.traffic.in.bps | bps | Obtained by SNMP from network equipment connected to Azure | Average value over 5 minutes | 5 minutes |
Traffic OUT | Outgoing Traffic | azure-interface.traffic.out.bps | bps | Obtained by SNMP from network equipment connected to Azure | Average value over 5 minutes | 5 minutes |
IF Status | Interface Status | azure-interface.azure_interface.status.bool | boolean | Determined by the normality of the BGP status of the network equipment connected to Azure. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Firewall(Brocade 5600 vRouter)
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
CPU Utilization(User) | CPU Utilization(User) | firewall.cpu.user.percents | percent | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
CPU Utilization(System) | CPU Utilization(System) | firewall.cpu.system.percents | percent | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
CPU Utilization(Idle) | CPU Utilization(Idle) | firewall.cpu.idle.percents | percent | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Memory Total | Memory Total | firewall.memory.total.kbytes | kbyte | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Memory Available | Memory Available | firewall.memory.available.kbytes | kbyte | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
TCP Active Connections | TCP Active Connections | firewall.tcp.active.connections | connection | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Total number of connections since startup | 5 minutes |
TCP Passive Connections | TCP Passive Connections | firewall.tcp.passive.connections | connection | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Total number of connections since startup | 5 minutes |
FW Status | Firewall Status | firewall.firewall.status.bool | boolean | Based on the presence or absence of the SNMP response from the virtual server instance to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
IF Status | Interface Status | firewall-interface.firewall_interface.status.bool | boolean | Collected by SNMP from the virtual server instance to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Traffic IN | Incoming Traffic | firewall-interface.traffic.in.bps | bps | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic | firewall-interface.traffic.out.bps | bps | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Firewall(vSRX)
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
CPU Usage (RE) | CPU Utilization (RE) | vsrx.cpu.utilization-re.percents | percent | CPU usage (%) of the RE (CPU processing part) of the vSRX | Average over the monitoring interval | 5 minutes |
CPU Usage (FPC) | CPU Utilization (FPC) | vsrx.cpu.utilization-fpc.percents | percent | CPU usage (%) of the FPC (packet transfer part) of the vSRX | Average over the monitoring interval | 5 minutes |
Memory usage (RE) | Memory Utilization (RE) | vsrx.memory.utilization-re.percents | percent | Memory usage (%) of the RE (CPU processing part) of the vSRX | Average over the monitoring interval | 5 minutes |
Memory usage (FPC) | Memory Utilization (FPC) | vsrx.memory.utilization-fpc.percents | percent | Memory usage (%) of the FPC (packet transfer part) of the vSRX | Average over the monitoring interval | 5 minutes |
TCP Active Connections | TCP Active Connections | vsrx.tcp.active.connections | cps (connections per sec) | Takes the cumulative value from the vSRX and converts the difference to cps (connections per sec) | Average value over 5 minutes | 5 minutes |
TCP Passive Connections | TCP Passive Connections | vsrx.tcp.passive.connections | connection | Takes the cumulative value from the vSRX and converts the difference to cps (connections per sec) | Average value over 5 minutes | 5 minutes |
Alive monitoring 1 (sysUptime) | OS Monitoring Status | vsrx.os.monitoring.status.bool | boolean | Determined from the SNMP sysUptime acquired from the vSRX, which shows whether monitoring by SNMP is working normally. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Alive monitoring 2 (vSRX GuestOS) | OS Login Status | vsrx.os.login.status.bool | boolean | Determined by whether login processing can be executed on the vSRX using the REST API. Normal = 0 (200 OK), Failure = 1 (401 Unauthorized) | Monitoring value | 1 minute |
Alive monitoring 3 (nova VM) | VM Status | vsrx.vm.status.bool | boolean | Collected on the hypervisor where the virtual server instance is running. Normal = 0 (200 OK & Status Active), Failure = 1 (200 OK & Status Shut Off) | Monitoring value | 1 minute |
Traffic IN | Incoming Traffic for ge-00X | vsrx.ge-00X.traffic.in.bps | bps | Collects the cumulative value from ifSCX (ifHCInOctets), converts it from bytes to bps (bits per sec), and displays it. “ge-00X” is displayed for each interface | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic for ge-00X | vsrx.ge-00X.traffic.out.bps | bps | Collects the cumulative value from ifSCX (ifHCOutOctets), converts it from bytes to bps (bits per sec), and displays it. “ge-00X” is displayed for each interface | Average over the monitoring interval | 5 minutes |
Note
On the vSRX meter, interfaces are displayed for 15 ports, but interfaces ge-008 to ge-014 are not used, so no monitoring data is displayed for them.
Due to the specifications of Juniper vSRX, the steady-state memory utilization (FPC) is 71% for version 19.2R1.8 and 63% for version 20.4R2. The value therefore changes only when your memory usage exceeds the steady-state memory utilization (FPC) of the version in use.
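The vSRX traffic and TCP connection meters above are derived from cumulative counters: the difference between two samples is divided by the elapsed time to obtain bps or cps. The following is a minimal sketch of that conversion, not the service's actual implementation. The function names are illustrative, the 64-bit width matches the SNMP "HC" octet counters, and the 32-bit width used for the TCP counters is an assumption.

```python
COUNTER64_MODULUS = 2 ** 64  # width of SNMP high-capacity (HC) octet counters
COUNTER32_MODULUS = 2 ** 32  # assumed width of the TCP connection counters


def counter_delta(previous: int, current: int, modulus: int) -> int:
    """Difference between two samples of a counter that wraps at `modulus`.

    A device reboot resets the counter and looks like a wrap here, so a
    real collector would also compare sysUptime before trusting the delta.
    """
    return (current - previous) % modulus


def octets_to_bps(previous_octets: int, current_octets: int,
                  interval_seconds: float) -> float:
    """Average bits per second between two cumulative octet samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    octets = counter_delta(previous_octets, current_octets, COUNTER64_MODULUS)
    return octets * 8 / interval_seconds


def connections_to_cps(previous: int, current: int,
                       interval_seconds: float) -> float:
    """Average connections per second between two cumulative samples."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return counter_delta(previous, current, COUNTER32_MODULUS) / interval_seconds


if __name__ == "__main__":
    # 37,500,000 octets in a 5-minute (300 s) interval is 1 Mbps on average.
    print(octets_to_bps(1_000_000, 38_500_000, 300))
    # 600 new connections in 300 s is 2 connections per second.
    print(connections_to_cps(100, 700, 300))
```

Taking the difference modulo the counter width keeps the result correct across a single counter wrap between two samples.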
Load Balancer(NetScaler VPX)
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
CPU Utilization | CPU Utilization | load-balancer.cpu.usage.percents | percent | Collected by SNMP from the virtual server instance to which the function of the resource is assigned. Sum of the values of each core | Average over the monitoring interval | 5 minutes |
Memory Usage | Memory Usage | load-balancer.memory.usage.percents | percent | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
HTTP Request Connections | HTTP Request Connections | load-balancer.http.request.connections | connection | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Total number of requests since startup | 5 minutes |
TCP Client Connections | TCP Client Connections | load-balancer.tcp.client.connections | connection | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
TCP Server Connections | TCP Server Connections | load-balancer.tcp.server.connections | connection | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
LB Status | Load Balancer Status | load-balancer.load_balancer.status.bool | boolean | Collected by SNMP from the virtual server instance to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
IF Status | Interface Status | load-balancer-interface.load_balancer_interface.status.bool | boolean | Collected by SNMP from the virtual server instance to which the function of the resource is assigned. Normal = 0, Failure = 1 | Monitoring value | 1 minute |
Traffic IN | Incoming Traffic | load-balancer-interface.traffic.in.bps | bps | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic OUT | Outgoing Traffic | load-balancer-interface.traffic.out.bps | bps | Collected by SNMP from the virtual server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Managed Load Balancer
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
CPU Utilization | CPU Utilization | managed-lb.cpu.utilization.percents | percent | Collected by SNMP from the server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Memory usage | Memory Utilization | managed-lb.memory.utilization.percents | percent | Collected by SNMP from the server instance to which the function of the resource is assigned | Average over the monitoring interval | 5 minutes |
Traffic IN | Incoming Traffic on interface-x (x = 1 to 7) | managed-lb.interface-x.traffic.in.bps | bps | Collected by SNMP from the server instance to which the function of the resource is assigned | Average value for the last 300 seconds at the time of monitoring | 5 minutes |
Traffic OUT | Outgoing Traffic on interface-x (x = 1 to 7) | managed-lb.interface-x.traffic.out.bps | bps | Collected by SNMP from the server instance to which the function of the resource is assigned | Average value for the last 300 seconds at the time of monitoring | 5 minutes |
Number of simultaneous TCP connections | TCP Concurrent Connections | managed-lb.tcp.concurrent.connections | connection | Collected by SNMP from the server instance to which the function of the resource is assigned | | 5 minutes |
Number of new TCP connections | TCP New Connections | managed-lb.tcp.new.connections | connection | Collected by SNMP from the server instance to which the function of the resource is assigned | Difference between the value at the previous monitoring and the value at the current monitoring | 5 minutes |
Number of packets exceeding the upper limit | Rejected Packets | managed-lb.rejected.packets | packet | Collected by SNMP from the server instance to which the function of the resource is assigned | Difference between the value at the previous monitoring and the value at the current monitoring | 5 minutes |
Managed Load Balancer Policy
Meter | Display Name | Meter Name | Unit | Collection source and determination method | Note | Monitoring Interval |
Number of connections | Total Connections | managed-lb-policy.total.connections | connection | Collected by HTTP from the server instance to which the function of the resource is assigned | Difference between the value at the previous monitoring and the value at the current monitoring | 5 minutes |
Number of normal members | Healthy Members | managed-lb-policy.healthy.members | count | Collected by HTTP from the server instance to which the function of the resource is assigned | | 5 minutes |
Number of abnormal members | Unhealthy Members | managed-lb-policy.unhealthy.members | count | Collected by HTTP from the server instance to which the function of the resource is assigned | | 5 minutes |
Note
If you are using an HA (redundant) configuration plan, the value is acquired every minute from the resource that is active at that time, and the average of those values is displayed. The displayed value may therefore differ from the average of consecutive per-minute acquisition results within the acquisition interval (5 minutes, etc.).
Custom Resource
A custom resource is a monitoring resource that a customer can create freely using the custom meter function.
For example, it can be used to monitor servers outside Enterprise Cloud 2.0 from Enterprise Cloud 2.0.
For how to create a custom resource, refer to the Custom Meter section.
Custom Meter¶
The custom meter is a function that accumulates in the monitoring server data values that the customer collects freely by a script or similar means.
As a result, arbitrary values can be graphed and used for alerts in the same way as other meters.
Because this function uses the API, the device that calls the API must be connected to the Internet.
If you specify an arbitrary name as the Resource ID when creating a custom meter, a custom resource with that name is created.
** Mechanism of custom meter **
Create and run a script that acquires the data to register in the custom meter (the script is prepared by the customer).
Send the acquired data to the custom meter through the API; the API call can likewise be implemented in the script (see the API reference).
Use the Enterprise Cloud 2.0 customer portal or API to browse the data accumulated in Monitoring or to set alerts.
** Function of custom meter **
Custom Meter | Details |
Create custom meter (API) | Creates a data storage area for the custom meter. Note that data can be registered only after the data storage area has been created; creation and data registration cannot be done at the same time. |
Add custom meter | Data (sample values) managed by Monitoring can be registered in the created data storage area. |
Edit custom meter (API) | Customers can change each parameter of a custom meter after it has been created. |
Custom meter agent | Predefined collection items can be automatically accumulated in the monitoring server. |
** Custom meter status **
Active custom meter |
A custom meter that is actually in use and being billed. Registering a sample value to a custom meter moves it to this status.
At most 30 active custom meters can exist at the same time. Attempting to create a 31st causes an error.
|
Inactive custom meter |
If no sample value is registered for 24 hours after the latest registration, an active custom meter becomes inactive.
An inactive meter is not counted as an active custom meter, so another custom meter can be put into active use in its place.
Sample values accumulated so far are retained for the specified retention period (31 days by default, 397 days when combined with the meter retention period extension function).
If a value is registered to an inactive custom meter, it is counted as an active custom meter again.
At most 120 custom meters, active and inactive combined, can exist at the same time. Attempting to create a 121st causes an error.
|
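The status rules above can be modelled in a few lines. This is a minimal sketch for reasoning about the limits, not the service's implementation: the dictionary of meters, the function names, and the treatment of the exact 24-hour boundary are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

MAX_ACTIVE = 30                       # active custom meters allowed at once
MAX_TOTAL = 120                       # active and inactive combined
INACTIVE_AFTER = timedelta(hours=24)  # no registration for this long -> inactive


def is_active(last_registered: Optional[datetime], now: datetime) -> bool:
    """A meter counts as active if a value was registered in the last 24 hours.

    None means the meter was created but no value has been registered yet.
    """
    return last_registered is not None and now - last_registered < INACTIVE_AFTER


def remaining_slots(meters: Dict[str, Optional[datetime]],
                    now: datetime) -> Dict[str, int]:
    """Return how many more active meters and how many more meters in total fit."""
    active = sum(is_active(ts, now) for ts in meters.values())
    return {"active": MAX_ACTIVE - active, "total": MAX_TOTAL - len(meters)}


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    meters = {
        "app.queue.length": now - timedelta(hours=1),   # active
        "app.error.count": now - timedelta(hours=30),   # inactive
        "app.new.meter": None,                          # created, no data yet
    }
    print(remaining_slots(meters, now))
```

With the three example meters, only one is active, so 29 active slots and 117 total slots remain.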
Note
Custom meter agent
Meters acquired by custom meter agent
No | Categories |
Meter Name |
Details |
Unit |
Collection source and determination method |
Optional |
Monitoring Interval |
1 | CPU Utilization |
cpu.user.percents | CPU utilization (user mode) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
2 | CPU Utilization |
cpu.nice.percents | CPU utilization (low-priority user mode) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
3 | CPU Utilization |
cpu.system.percents | CPU utilization (system mode) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
4 | CPU Utilization |
cpu.idle.percents | CPU utilization (idle) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
5 | CPU Utilization |
cpu.iowait.percents | CPU utilization (waiting for I/O) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
6 | CPU Utilization |
cpu.irq.percents | CPU utilization (hardware interrupt) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
7 | CPU Utilization |
cpu.softirq.percents | CPU utilization (software interrupt) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
8 | CPU Utilization |
cpu.steal.percents | CPU utilization (time used by other OS in virtual environment) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
9 | CPU Utilization |
cpu.guest.percents | CPU utilization (virtual CPU for guest OS) |
percent | Calculated from differences from the values collected previously by referring to /proc/stat. |
Calculate the average of monitoring intervals |
Optional |
10 | Disk |
disk.{device name}.reads.completed.count |
Completed read I/Os |
count | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
11 | Disk |
disk.{device name}.reads.merged.count |
Merged read I/Os |
count | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
12 | Disk |
disk.{device name}.reads.sectors.count |
Read sectors |
count | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
13 | Disk |
disk.{device name}.reads.milliseconds |
Read milliseconds |
millisecond | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
14 | Disk |
disk.{device name}.writes.completed.count |
Completed write I/Os |
count | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
15 | Disk |
disk.{device name}.writes.merged.count |
Merged write I/Os |
count | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
16 | Disk |
disk.{device name}.writes.sectors.count |
Write sectors |
count | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
17 | Disk |
disk.{device name}.writes.milliseconds |
Write milliseconds |
millisecond | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
18 | Disk |
disk.{device name}.currently.ios.count |
I/Os in execution |
count | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Monitoring value |
Optional |
19 | Disk |
disk.{device name}.ios.milliseconds |
I/O execution milliseconds |
millisecond | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
20 | Disk |
disk.{device name}.weighted.ios.milliseconds |
Weighted I/O execution milliseconds |
millisecond | Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats. |
Calculate total value within monitoring interval |
Optional |
21 | Network |
network.{nwif name}.receive.bytes |
Received bytes |
byte | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
22 | Network |
network.{nwif name}.receive.packets.count |
Received packets |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
23 | Network |
network.{nwif name}.receive.errs.count |
Received packets with errors |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
24 | Network |
network.{nwif name}.receive.drop.count |
Received packets dropped |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
25 | Network |
network.{nwif name}.receive.fifo.count |
Received packets with FIFO errors |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
26 | Network |
network.{nwif name}.receive.frame.count |
Received packets with frame errors |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
27 | Network |
network.{nwif name}.receive.compressed.count |
Compressed received packets |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
28 | Network |
network.{nwif name}.receive.multicast.count |
Received packets transmitted by multicast |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
29 | Network |
network.{nwif name}.transmit.bytes |
Transmitted bytes |
byte | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
30 | Network |
network.{nwif name}.transmit.packets.count |
Transmitted packets |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
31 | Network |
network.{nwif name}.transmit.errs.count |
Transmitted packets with errors |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
32 | Network |
network.{nwif name}.transmit.drop.count |
Transmitted packets dropped |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
33 | Network |
network.{nwif name}.transmit.fifo.count |
Transmitted packets with FIFO errors |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
34 | Network |
network.{nwif name}.transmit.colls.count |
Transmitted packets with collisions |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
35 | Network |
network.{nwif name}.transmit.cerrier.count |
Transmitted packets with carrier loss |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
36 | Network |
network.{nwif name}.transmit.compressed.count |
Compressed transmitted packets |
count | Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface |
Calculate total value within monitoring interval |
Optional |
37 | Load average |
loadavg.1.count | Load average (1 min) |
count | Refer to /proc/loadavg |
Average value in past 1 minute during monitoring |
Optional |
38 | Load average |
loadavg.5.count | Load average (5 min) |
count | Refer to /proc/loadavg |
Average value in past 5 minutes during monitoring |
Optional |
39 | Load average |
loadavg.15.count | Load average (15 min) |
count | Refer to /proc/loadavg |
Average value in past 15 minutes during monitoring |
Optional |
40 | Memory |
memory.memtotal.kilobytes | Total memory capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
41 | Memory |
memory.memfree.kilobytes | Available memory capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
42 | Memory |
memory.buffers.kilobytes | Buffer memory capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
43 | Memory |
memory.cached.kilobytes | Cache memory capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
44 | Memory |
memory.swapcached.kilobytes | Swap cache memory capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
45 | Memory |
memory.active.kilobytes | Active memory capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
46 | Memory |
memory.inactive.kilobytes | Inactive memory capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
47 | Memory |
memory.swaptotal.kilobytes | Total swap capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
48 | Memory |
memory.swapfree.kilobytes | Available swap capacity |
kilobyte | Refer to /proc/meminfo |
Monitoring value |
Optional |
49 | Memory |
memory.usedtotal.kilobytes | Used memory capacity |
kilobyte | Calculated from MemTotal, MemFree, Buffers and Cached in /proc/meminfo |
Monitoring value |
Optional |
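As an illustration of the "calculated from differences from the values collected previously" method used for the CPU meters, the following minimal sketch derives percentages from two readings of the aggregate `cpu` line in `/proc/stat`. It is not the agent's actual code. The function names are hypothetical, the sketch assumes a kernel that reports at least nine counters on that line, and the choice to exclude guest time from the total (Linux already counts it inside user time) is an assumption about normalization.

```python
from typing import Dict

# Counter order on the aggregate "cpu" line of /proc/stat.
FIELDS = ["user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest"]


def parse_cpu_line(line: str) -> Dict[str, int]:
    """Parse the aggregate 'cpu' line of /proc/stat into named counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def cpu_percents(previous: Dict[str, int],
                 current: Dict[str, int]) -> Dict[str, float]:
    """Return each counter's share of the elapsed CPU time, in percent."""
    deltas = {name: current[name] - previous[name] for name in FIELDS}
    # Guest time is already included in user time, so keep it out of the total.
    total = sum(delta for name, delta in deltas.items() if name != "guest")
    if total <= 0:
        return {f"cpu.{name}.percents": 0.0 for name in FIELDS}
    return {f"cpu.{name}.percents": 100.0 * delta / total
            for name, delta in deltas.items()}


if __name__ == "__main__":
    # Two sample readings taken one monitoring interval apart.
    before = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0")
    after = parse_cpu_line("cpu  160 0 70 950 60 0 0 0 0 0")
    for meter, value in cpu_percents(before, after).items():
        print(f"{meter}: {value:.1f}")
```

The disk and network meters follow the same pattern with `/proc/diskstats` and `/proc/net/dev`, except that the difference itself (the total within the interval) is reported rather than a percentage.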
Note
** Other notes **
A custom meter value can be registered up to 1,500 times per day; the 1,501st registration is rejected with an error. This allows values to be registered roughly once per minute. The daily registration count is reset at 0:00 (UTC). Editing a custom meter, as opposed to registering a value, does not count toward the daily limit of 1,500.
Up to 100 actions can be included in one API request. However, “Create new custom meter” and “Register custom meter data” cannot be executed at the same time for the same custom meter.
Custom meters cannot be deleted by the customer. A custom meter is deleted automatically once the configured retention period has passed after it becomes inactive. (Inactive custom meters are not subject to billing.)
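The limits above translate into simple arithmetic: 1,500 registrations spread evenly over the 86,400 seconds of a day gives one registration every 57.6 seconds. The following minimal sketch shows that calculation together with a client-side counter that resets at 0:00 (UTC) and a helper that splits actions into requests of at most 100. The class and function names are hypothetical, and whether the limit is counted per meter or per tenant is not stated here, so the sketch simply tracks a single counter.

```python
from datetime import datetime, timezone
from typing import Iterator, List, Sequence

DAILY_REGISTRATION_LIMIT = 1500
MAX_ACTIONS_PER_REQUEST = 100
SECONDS_PER_DAY = 24 * 60 * 60


def min_even_interval_seconds() -> float:
    """Shortest interval that stays within the daily limit if spread evenly."""
    return SECONDS_PER_DAY / DAILY_REGISTRATION_LIMIT


class DailyQuota:
    """Client-side counter that resets when the UTC date changes."""

    def __init__(self, limit: int = DAILY_REGISTRATION_LIMIT) -> None:
        self.limit = limit
        self.day = None
        self.used = 0

    def try_consume(self, now: datetime) -> bool:
        """Return True and count one registration if quota remains today."""
        today = now.astimezone(timezone.utc).date()
        if today != self.day:
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True


def batches(actions: Sequence,
            size: int = MAX_ACTIONS_PER_REQUEST) -> Iterator[List]:
    """Split actions into chunks small enough for one API request each."""
    for start in range(0, len(actions), size):
        yield list(actions[start:start + size])


if __name__ == "__main__":
    print(min_even_interval_seconds())
    print([len(chunk) for chunk in batches(list(range(250)))])
```

A 60-second collection interval produces 1,440 registrations per day, which leaves 60 registrations of headroom for retries.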
Graph Generation¶
This function displays a graph based on the collected/accumulated data.
Multiple graphs can be displayed on the same screen. The screen is refreshed automatically and the refresh interval can be customized.
Action Settings¶
This function collects, stores, and analyzes monitoring data, and lets Customers change the threshold levels and actions set for monitored items.
It can send failure notifications to Customers by the methods listed below.
Notification Method
Notification via email
Notification via an API request specified by the user
Note
Customizing Notification
Notification levels (such as Urgent, Alert, and Info) can be set, and a notification destination can be assigned to each level.
Detailed Information of Failure Notification
Date and Time
Host name
Items and trigger description
Latest data
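To make the relationship between notification levels, destinations, and the notification details concrete, here is a minimal sketch. It is an illustration only, not how the service is configured: the class, the function names, the message layout, and the e-mail addresses are all hypothetical, and actual settings are made through the customer portal or the API.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass
class FailureNotification:
    """The detail fields listed above, plus the notification level."""
    occurred_at: datetime
    host_name: str
    trigger_description: str
    latest_data: str
    level: str


# Hypothetical mapping from notification level to destinations.
DESTINATIONS: Dict[str, List[str]] = {
    "Urgent": ["oncall@example.com", "ops@example.com"],
    "Alert": ["ops@example.com"],
    "Info": ["reports@example.com"],
}


def recipients_for(notification: FailureNotification) -> List[str]:
    """Return the destinations assigned to the notification's level."""
    return DESTINATIONS.get(notification.level, [])


def format_message(n: FailureNotification) -> str:
    """Render the detail fields as plain text, one field per line."""
    return "\n".join([
        f"Date and Time: {n.occurred_at.isoformat()}",
        f"Host name: {n.host_name}",
        f"Items and trigger description: {n.trigger_description}",
        f"Latest data: {n.latest_data}",
    ])


if __name__ == "__main__":
    notification = FailureNotification(
        occurred_at=datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc),
        host_name="web01",
        trigger_description="CPU Utilization above threshold",
        latest_data="95.2 percent",
        level="Urgent",
    )
    print(recipients_for(notification))
    print(format_message(notification))
```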
Adding User Comments
Note
Monitoring Item List¶
Data Download¶
Resource Name
Meter Name
Time Period (at most as far back as the accumulation period)
API¶
Menu¶
Menu / Plan¶
Basic |
Advanced |
|
Pricing |
Free of charge |
Charged |
Data Retention Period |
32 days (Including the day) |
32 days or 397 days (about 1 year and 1 month, including the day)
Up to 100
From the 101st item, a monthly fee is added for each one
Customers can set the period for each meter.
The retention period can be extended for up to 300 meters per tenant
|
Number of Alarms |
10 Alarms
Per tenant
|
Up to 100
From the 101st item, a monthly fee is added for each one
Up to 300 per tenant
|
Custom Meter |
1 piece
Per tenant
|
10 Alarms
Monthly fee is added for every 1 piece or more
Up to 30 per tenant
|
Others |
N/A |
Dashboard function, Data download function |
Note
Subscription Methods¶
Terms and Conditions¶
Conditions for Usage with Other Services¶
Minimum Use Period¶
The minimum usage period for the Advanced menu is one month.
Menu |
Billing Unit |
Advanced Plan |
per Tenant |
Advanced Plan - Additional Meter |
per Meter |
Advanced Plan - Additional Alarms |
per Alarm |
Advanced Plan - Additional Custom Meter |
per Meter |
Quality of Service¶
Support Coverage¶
The functions described in Section 2.1 are supported by this service.
Operation¶
Item |
Details |
Operation Time |
24/7 |
Failure Support Policy |
Rapid restoration performed by NTT Com. |