11.1. Monitoring

11.1.1. About This Service

11.1.1.1. Overview

This service collects and reports information (meters) on various resources, such as virtual servers, so that Customers can evaluate the health and performance of each service.
Alerts can also be configured by setting a threshold for each monitoring item, and can be sent to an email address designated by the Customer.
With the monitoring features, Customers can:
  • Perform keep-alive monitoring and resource shortage checks on monitoring targets.

  • Manage capacity by viewing statistics for various equipment.

  • Keep logs of failure occurrences and their status.

11.1.1.2. Features

Customers can use all the basic functions of the Monitoring service free of charge.
If the basic functions do not meet their needs, Customers can extend them by subscribing to paid menus, depending on their purpose and usage.

11.1.2. Available Functions

11.1.2.1. List of Functions

The following functions are available with Monitoring. Customers can configure each of them as needed.

Functions

Details

Data Collection/Accumulation

Customers can collect needed data from various resources and store them for a certain period.

Custom Meter

Data values collected by a script or similar means can be accumulated in the monitoring server.

Graph Generation

This function enables Customers to generate graphs from the data collected and stored by the Monitoring service.

Action Settings

Customers can configure specific actions to be taken when a threshold they have set is exceeded.

Monitoring Item List

Customers can confirm the list of monitoring items they set.

Alert History

Customers can view the history of alerts raised when thresholds were exceeded.

Data Download

Customers can download stored monitoring data via API.

API

API is available.


11.1.2.2. Description of Functions

Data Collection/Accumulation

  • This function collects data from various resources for the relevant monitoring items.

  • The collected data is accumulated in the system and can be used by the Graph Generation function, which is described in a later section.

  • The data accumulation period is 32 days (including the current day).
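
As a rough illustration of the accumulation period, the sketch below computes the oldest calendar day that would still be covered. It assumes that "32 days (including the day)" means the current day plus the 31 preceding days; that reading and the helper name `oldest_retained_day` are assumptions made for this example, not something this service defines.

```python
from datetime import date, timedelta

RETENTION_DAYS = 32  # accumulation period stated above, counting the current day


def oldest_retained_day(today: date, retention_days: int = RETENTION_DAYS) -> date:
    """Return the oldest calendar day still inside the accumulation period.

    Assumption: the period is the current day plus the
    (retention_days - 1) days before it.
    """
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return today - timedelta(days=retention_days - 1)


if __name__ == "__main__":
    print(oldest_retained_day(date.today()))
```

Under this reading, data collected more than 31 days before today should not be expected to remain available for graphs or download.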

Virtual Server Instance

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

CPU usage rate ※ 1

CPU Utilization nova.cpu.utilization.persents percent

Collect on hypervisor where virtual server instance is running

Calculate the average of monitoring intervals

5 minutes

Disk Read Bytes

Disk Read Bytes nova.disk.read.bytes byte

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Disk Write Bytes

Disk Write Bytes nova.disk.write.bytes byte

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Disk Read Requests

Disk Read Requests nova.disk.read.requests request

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Disk Write Requests

Disk Write Requests nova.disk.write.requests request

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Incoming Traffic

Incoming Traffic nova.network.incoming.bytes byte

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Outgoing Traffic

Outgoing Traffic nova.network.outgoing.bytes byte

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

VM life and death monitoring ※ 2

VM Status nova.vm.status.bool boolean
Collect on hypervisor where virtual server instance is running
Normal = 0
Failure = 1

Monitoring value

1 minute

Hypervisor Status

Hypervisor Status nova.hv.status.bool boolean
Determined by the state of the network interface of each virtual server instance on the hypervisor on which the virtual server instance is running
Normal = 0
Failure = 1

Monitoring value

1 minute


Note

※ 1 The maximum value is “100 percent x number of CPU cores”. (Example: If the virtual server instance has 16 CPU cores, the value ranges from 0 to 1600 percent.)
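
Because the CPU value scales with the number of cores, it can help to convert it to a 0 to 100 range before comparing instances of different sizes. The following is a minimal sketch of that arithmetic; the function names are hypothetical and are not part of the Monitoring API.

```python
def max_cpu_percent(cpu_cores: int) -> float:
    """Upper bound of the CPU meter: 100 percent x number of CPU cores."""
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    return 100.0 * cpu_cores


def normalized_cpu_percent(raw_percent: float, cpu_cores: int) -> float:
    """Scale a raw meter value (0 .. 100 x cores) down to a 0 .. 100 range."""
    upper = max_cpu_percent(cpu_cores)
    if not 0.0 <= raw_percent <= upper:
        raise ValueError("raw_percent is outside the expected range")
    return raw_percent / cpu_cores


if __name__ == "__main__":
    # A 16-core instance reporting 800 percent is half loaded overall.
    print(max_cpu_percent(16), normalized_cpu_percent(800.0, 16))
```

When setting an alert threshold on this meter, the same scaling applies: a threshold meant as "80 percent of the whole instance" on 16 cores corresponds to a raw value of 1280 percent.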

Note

※ 2 “VM life and death monitoring” shows the monitoring result from the infrastructure side that provides the virtual server instance. Trouble caused by problems inside the instance (such as a kernel panic or OS crash) while the virtual server instance is running cannot be detected.


Virtual Server Volume

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

Disk Read Bytes

Disk Read Bytes cinder.disk.read.bytes byte

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Disk Write Bytes

Disk Write Bytes cinder.disk.write.bytes byte

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Disk Read Requests

Disk Read Requests cinder.disk.read.requests request

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes

Disk Write Requests

Disk Write Requests cinder.disk.write.requests request

Collect on hypervisor where virtual server instance is running

Calculate total value within monitoring interval

5 minutes


Baremetal Server / vSphere ESXi / Hyper-V

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

Chassis Power Status

Chassis Power Status baremetal-server.chassis-power.status.bool boolean
Collected from the resource by IPMI
Definitions: ON = false (0)
OFF = true (1)

Monitoring value

5 minutes

Fan Status

Fan Status baremetal-server.fan.status.bool boolean
Determined based on presence / absence of fan failure collected by IPMI from this resource
Send true (1) in case of any fan failure
Send false (0) in normal state

Monitoring value

5 minutes

Power Supply Status

Power Supply Status baremetal-server.power-supply.status.bool boolean
Collected from the resource by IPMI
Send true (1) if there is even one power supply failure
Send false (0) in normal state

Monitoring value

1 minute

CPU Status

CPU Status baremetal-server.cpu.status.bool boolean
Determined by presence or absence of CPU failure collected by IPMI from this resource
Send true (1) in case of any CPU failure
Send false (0) in normal state

Monitoring value

1 minute

Memory Status

Memory Status baremetal-server.memory.status.bool boolean
Determined based on presence / absence of memory failure collected by IPMI from this resource
Send true (1) in case of any memory failure
Send false (0) in normal state

Monitoring value

1 minute

HDD failure

Disk Failure Number baremetal-server.disk.status.failures int
Determined based on presence / absence of disk failure collected by IPMI from this resource
Send the number of failed disks (0 to 36)

Monitoring value

1 minute

NIC failure

NIC Status baremetal-server.nic.status.bool boolean
Determined based on presence / absence of NIC failure collected by IPMI from this resource
Send true (1) if at least one NIC has failed
Send false (0) in normal state
※ It is not displayed in Workload Optimized 2

Monitoring value

1 minute

System board failure

System Board Status baremetal-server.system.board.status.bool boolean
Determined based on presence / absence of system board failure collected by IPMI from this resource
Send true (1) if at least one system board has failed
Send false (0) in normal state
※ It is not displayed in Workload Optimized 2

Monitoring value

1 minute

Other Statuses

Other Statuses baremetal-server.etc.status.bool boolean
Determined by whether any of the following failures is reported by IPMI from the resource:
RAID Controller failure
Failed to get HDD information (Workload Optimized 2 only)
Other Statuses
Send true (1) in above cases
Send false (0) in normal state

Monitoring value

1 minute
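
The Baremetal Server meters mix three value conventions: most boolean meters send 1 on failure, the chassis power meter sends 1 when power is OFF, and the disk meter sends a count. The sketch below shows one way a client script could turn a sample value into a readable state. The meter names are taken from the table above; the function `describe_sample` and its return strings are hypothetical.

```python
# Boolean meters where true (1) means a failure was detected.
BOOLEAN_FAILURE_METERS = {
    "baremetal-server.fan.status.bool",
    "baremetal-server.power-supply.status.bool",
    "baremetal-server.cpu.status.bool",
    "baremetal-server.memory.status.bool",
    "baremetal-server.nic.status.bool",
    "baremetal-server.system.board.status.bool",
    "baremetal-server.etc.status.bool",
}
CHASSIS_POWER_METER = "baremetal-server.chassis-power.status.bool"
DISK_FAILURES_METER = "baremetal-server.disk.status.failures"
MAX_FAILED_DISKS = 36  # upper bound of the disk meter listed in the table


def describe_sample(meter_name: str, value: int) -> str:
    """Translate one sample value into a short human-readable state."""
    if meter_name == DISK_FAILURES_METER:
        if not 0 <= value <= MAX_FAILED_DISKS:
            raise ValueError("disk failure count out of range")
        return "normal" if value == 0 else f"{value} failed disk(s)"
    if value not in (0, 1):
        raise ValueError("boolean meters only take 0 or 1")
    if meter_name == CHASSIS_POWER_METER:
        # Note the inverted sense: ON = false (0), OFF = true (1).
        return "power OFF" if value == 1 else "power ON"
    if meter_name in BOOLEAN_FAILURE_METERS:
        return "failure" if value == 1 else "normal"
    raise KeyError(f"unknown meter: {meter_name}")


if __name__ == "__main__":
    print(describe_sample(CHASSIS_POWER_METER, 0))
    print(describe_sample(DISK_FAILURES_METER, 2))
```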


Internet Connectivity

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

Traffic IN

Incoming Traffic internet-connectivity.traffic.in.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Traffic OUT

Outgoing Traffic internet-connectivity.traffic.out.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Gateway Status

Gateway Status internet-connectivity.internet_gateway.status.bool boolean
Based on the presence or absence of the SNMP response of the network device to which the function of the resource is assigned
IF connected to the Internet GW logical network
Normal = 0
Failure = 1

Monitoring value

1 minute

IF Status

Interface Status internet-gw-interface.gw_interface.status.bool boolean
Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned
IF connected to the Internet side of the Internet GW
Normal = 0
Failure = 1

Monitoring value

1 minute

Note

If the Internet gateway has no global IP address, sample values for Traffic IN and Traffic OUT are not monitored because there is no Internet connection traffic.


VPN Connectivity

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

Traffic IN

Incoming Traffic vpn-interface.traffic.in.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Traffic OUT

Outgoing Traffic vpn-interface.traffic.out.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Gateway Status

Gateway Status vpn-connectivity.vpn_gateway.status.bool boolean
Based on the presence or absence of the SNMP response of the network device to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute

GW-IF Status

Interface Status vpn-gw-interface.gw_interface.status.bool boolean
Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned
IF connected to the logical network of VPN GW
Normal = 0
Failure = 1

Monitoring value

1 minute

IF Status

Interface Status vpn-interface.vpn_interface.status.bool boolean
Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned
IF connected to the VPN network of VPN GW
Normal = 0
Failure = 1

Monitoring value

1 minute


Logical Network

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

Traffic IN

Incoming Traffic logical-network-port.traffic.in.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Traffic OUT

Outgoing Traffic logical-network-port.traffic.out.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

NW Status

Operational State logical-network.network.status.bool boolean
Based on the result of HTTP health check for the network device to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute

Port Status

Operational State logical-network-port.port.status.bool boolean
Collect from the network equipment to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute


Datacenter Inter-Connectivity

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

Traffic IN

Incoming Traffic interdc-interface.traffic.in.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Traffic OUT

Outgoing Traffic interdc-interface.traffic.out.bps bps

Collect from the network equipment to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Gateway Status

Gateway Status interdc-connectivity.interdc_gateway.status.bool boolean
Based on the presence or absence of the SNMP response of the network device to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute

GW-IF Status

Interface Status interdc-gw-interface.gw_interface.status.bool boolean
Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned
IF connected to the logical network of the Datacenter Inter-Connectivity GW
Normal = 0
Failure = 1

Monitoring value

1 minute

IF Status

Interface Status interdc-interface.interdc_interface.status.bool boolean
Determined by the normality of the VRRP status of the network device to which the function of the resource is assigned
IF connected to the network side of Datacenter Inter-Connectivity GW
Normal = 0
Failure = 1

Monitoring value

1 minute


Amazon Web Services Inter-Connectivity

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

Gateway Status

Gateway Status aws-connectivity.aws_gateway.status.bool boolean
Based on the presence or absence of SNMP response of the network device connected to the AWS
Normal = 0
Failure = 1

Monitoring value

1 minute

GW-IF Status

Interface Status aws-gw-interface.gw_interface.status.bool boolean
Based on the presence or absence of VRRP status of the network device connected to the AWS
Normal = 0
Failure = 1

Monitoring value

1 minute

Traffic IN

Incoming Traffic aws-interface.traffic.in.bps bps

Based on the presence or absence of SNMP response to the network device connected to the AWS

Send average value for 5 minutes

5 minutes

Traffic OUT

Outgoing Traffic aws-interface.traffic.out.bps bps

Based on the presence or absence of SNMP response to the network device connected to the AWS

Send average value for 5 minutes

5 minutes

IF Status

Interface Status aws-interface.aws_interface.status.bool boolean
Based on the presence or absence of BGP status of the network device connected to the AWS
Normal = 0
Failure = 1

Monitoring value

1 minute


Firewall

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

CPU Utilization(User)

CPU Utilization(User)

firewall.cpu.user.percents percent

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

CPU Utilization(System)

CPU Utilization(System)

firewall.cpu.system.percents percent

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

CPU Utilization(Idle)

CPU Utilization(Idle)

firewall.cpu.idle.percents percent

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Memory Total

Memory Total firewall.memory.total.kbytes kbyte

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Memory Available

Memory Available firewall.memory.available.kbytes kbyte

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

TCP Active Connections

TCP Active Connections firewall.tcp.active.connections connection

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Total number of connections since startup

5 minutes

TCP Passive Connections

TCP Passive Connections firewall.tcp.passive.connections connection

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Total number of connections since startup

5 minutes

FW Status

Firewall Status firewall.firewall.status.bool boolean
Based on the presence or absence of SNMP response from the virtual server instance to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute

IF Status

Interface Status firewall-interface.firewall_interface.status.bool boolean
Collected by SNMP from the virtual server instance to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute

Traffic IN

Incoming Traffic firewall-interface.traffic.in.bps bps

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Traffic OUT

Outgoing Traffic firewall-interface.traffic.out.bps bps

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes


Load Balancer

Meter

Display Name

Meter Name

Unit

Collection source and determination method

Note

Monitoring Interval

CPU Utilization

CPU Utilization load-balancer.cpu.usage.percents percent
Collected by SNMP from the virtual server instance to which the function of the resource is assigned
Sum of the value of each core

Calculate the average of monitoring intervals

5 minutes

Memory Usage

Memory Usage load-balancer.memory.usage.persents percent

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

HTTP Request Connections

HTTP Request Connections load-balancer.http.request.connections connection

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Total number of requests since startup

5 minutes

TCP Client Connections

TCP Client Connections load-balancer.tcp.client.connections connection

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

TCP Server Connections

TCP Server Connections load-balancer.tcp.server.connections connection

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

LB Status

Load Balancer Status load-balancer.load_balancer.status.bool boolean
Collected by SNMP from the virtual server instance to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute

IF Status

Interface Status load-balancer-interface.load_balancer_interface.status.bool boolean
Collected by SNMP from the virtual server instance to which the function of the resource is assigned
Normal = 0
Failure = 1

Monitoring value

1 minute

Traffic IN

Incoming Traffic load-balancer-interface.traffic.in.bps bps

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes

Traffic OUT

Outgoing Traffic load-balancer-interface.traffic.out.bps bps

Collected by SNMP from the virtual server instance to which the function of the resource is assigned

Calculate the average of monitoring intervals

5 minutes


Custom Resource

  • A custom resource is a monitoring resource that a customer can create arbitrarily by using the custom meter function.

  • For example, it can be used to monitor servers outside Enterprise Cloud 2.0 from Enterprise Cloud 2.0.

  • For details on creating a custom resource, please refer to the Custom Meter section.


11.1.2.3. Custom Meter

  • The custom meter is a function that accumulates, in the monitoring server, data values that the customer collects arbitrarily by a script or similar means.

  • As a result, arbitrary values can be graphed and used for alerts in the same way as other meters.

  • Since this function uses the API, the device that sends the API requests must be connected to the Internet.

  • If you specify an arbitrary name as the Resource ID when creating a custom meter, a custom resource with that name is created.


Mechanism of custom meter

To use the custom meter, the customer must prepare in advance a script (or application) that periodically acquires data and sends it to Monitoring.
By sending the data acquired by the script to the Monitoring API endpoint, following the separately defined request parameters and format, the customer can create monitoring items and accumulate or edit sample values.
In addition, target resources are not limited to servers on Enterprise Cloud 2.0, so various data can be registered with, for example, a server on Enterprise Cloud 1.0 as the monitoring target.
  • Create and run a script that acquires the data to register in the custom meter (the script is prepared by the customer).

  • Send the acquired data to the custom meter through the API; the API calls can be implemented in the same script (see the API reference).

  • Use the Enterprise Cloud 2.0 customer portal or API to browse the data accumulated in Monitoring or to set alerts.
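
The flow above can be sketched as a small script that packages one acquired value into an API request. This is only an outline under stated assumptions: the endpoint URL, the JSON field names, and the `X-Auth-Token` header are placeholders invented for illustration, and the real request parameters and format are those given in the API reference. The function builds the request but does not send it.

```python
import json
from datetime import datetime, timezone
from urllib.request import Request

# Hypothetical endpoint; the real URL and path come from the API reference.
ENDPOINT = "https://monitoring.example.invalid/custom-meters/samples"


def build_sample_request(endpoint: str, token: str, resource_id: str,
                         meter_name: str, value: float,
                         timestamp: datetime) -> Request:
    """Build (but do not send) a POST request carrying one sample value.

    The payload keys and the X-Auth-Token header are illustrative
    assumptions, not the documented request format.
    """
    payload = {
        "resource_id": resource_id,   # an arbitrary name creates a custom resource
        "meter_name": meter_name,
        "value": value,
        "timestamp": timestamp.astimezone(timezone.utc).isoformat(),
    }
    return Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
        method="POST",
    )


if __name__ == "__main__":
    request = build_sample_request(
        ENDPOINT, "example-token", "my-server", "app.queue.length",
        12, datetime.now(timezone.utc),
    )
    print(request.get_method(), request.full_url)
    print(request.data.decode("utf-8"))
```

In a real script the request would be sent with `urllib.request.urlopen(request)` and the script would be run periodically, for example from a scheduler such as cron.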


Function of custom meter

Custom Meter

Details

Create custom meter (API)

Creates a data storage area for the custom meter.
Please note that data can be registered only after the data storage area has been created; creation and data registration cannot be done at the same time.

Add custom meter

Data (sample value) managed by monitoring can be registered in the created data storage area.

Edit custom meter (API)

Customers can change each parameter of a custom meter after it has been created.

Custom meter agent

Predefined collection items can be automatically accumulated in the monitoring server.


Custom meter status

A created custom meter has one of the following two statuses, depending on its state.

Active custom meter

A custom meter that is actually in use and being billed. Registering a sample value to a custom meter moves it to this status.
At most 30 custom meters can be active at the same time. If this number is exceeded, creating the 31st causes an error.

Inactive custom meter

If no sample value is registered for 24 hours after the latest registration, an active custom meter becomes inactive.
Once in this status, the meter is no longer counted as an active custom meter, so another custom meter can become active in its place.
The sample values accumulated so far are retained for the specified retention period (31 days by default, 397 days when combined with the meter retention period extension function).
If a value is registered to an inactive custom meter, it is counted as an active custom meter again.

Note

A maximum of 120 custom meters can be created.
When custom meter values are sent regularly from a VM or similar on Enterprise Cloud 2.0, the Monitoring API endpoint may become unreachable due to maintenance or failure of the Enterprise Cloud 2.0 network or gateway, and value registration may become temporarily impossible.
When you use a custom meter in an Enterprise Cloud 2.0 environment, please pay attention to network maintenance and fault information.
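
The status rules and limits above can be modelled in a few lines, which may help when planning how many meters a script will keep active. The sketch is a local simulation only and does not call the service. It assumes that a meter with no samples yet counts as inactive and that the error for the 31st active meter is raised at the moment a sample would activate it; both points are interpretations, and the class name `CustomMeterRegistry` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict, Optional

MAX_ACTIVE = 30                       # active custom meters at the same time
MAX_CREATED = 120                     # custom meters that can be created
INACTIVE_AFTER = timedelta(hours=24)  # time without samples before inactive


class CustomMeterRegistry:
    """Local model of the active / inactive custom meter rules."""

    def __init__(self) -> None:
        # Meter name -> time of the latest sample (None if never sampled).
        self._last_sample: Dict[str, Optional[datetime]] = {}

    def create(self, name: str) -> None:
        if name in self._last_sample:
            raise ValueError(f"custom meter already exists: {name}")
        if len(self._last_sample) >= MAX_CREATED:
            raise RuntimeError("created custom meter limit reached")
        self._last_sample[name] = None

    def is_active(self, name: str, now: datetime) -> bool:
        last = self._last_sample[name]
        return last is not None and now - last < INACTIVE_AFTER

    def active_count(self, now: datetime) -> int:
        return sum(self.is_active(name, now) for name in self._last_sample)

    def register_sample(self, name: str, now: datetime) -> None:
        # Registering a sample makes (or keeps) the meter active.
        if not self.is_active(name, now) and self.active_count(now) >= MAX_ACTIVE:
            raise RuntimeError("active custom meter limit reached")
        self._last_sample[name] = now


if __name__ == "__main__":
    registry = CustomMeterRegistry()
    registry.create("app.queue.length")
    registry.register_sample("app.queue.length", datetime(2024, 1, 1))
    print(registry.is_active("app.queue.length", datetime(2024, 1, 1, 12)))
    print(registry.is_active("app.queue.length", datetime(2024, 1, 2)))
```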

Custom meter agent

The custom meter accumulates, in the monitoring server, data values collected by scripts that the customer creates.
When the custom meter agent is started on the target server, everything from acquiring commonly collected OS items (memory utilization, load average, etc.) to accumulating them on the monitoring server is executed automatically.
The source code of the custom meter agent is available on GitHub.
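
To illustrate the kind of collection such an agent performs, the sketch below derives CPU utilization percentages from the difference between two readings of the aggregate `cpu` line in /proc/stat on Linux. It is not the agent's actual code; the function names are hypothetical, and the choice to exclude guest time from the total (because the kernel already counts it within user time) is an assumption of this example. The output keys follow the `cpu.<mode>.percents` naming pattern.

```python
from typing import Dict

# Column order of the aggregate "cpu" line in /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal", "guest")


def parse_cpu_line(stat_text: str) -> Dict[str, int]:
    """Extract the cumulative counters from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percents(prev: Dict[str, int], curr: Dict[str, int]) -> Dict[str, float]:
    """Convert counter differences between two readings into percentages."""
    deltas = {k: curr.get(k, 0) - prev.get(k, 0) for k in FIELDS}
    # Guest time is already included in user time, so leave it out of the total.
    total = sum(v for k, v in deltas.items() if k != "guest")
    if total <= 0:
        return {f"cpu.{k}.percents": 0.0 for k in FIELDS}
    return {f"cpu.{k}.percents": 100.0 * v / total for k, v in deltas.items()}


if __name__ == "__main__":
    # Two literal readings stand in for reads of /proc/stat taken one
    # monitoring interval apart.
    first = parse_cpu_line("cpu  100 0 50 800 50 0 0 0 0 0\n")
    second = parse_cpu_line("cpu  150 0 75 900 75 0 0 0 0 0\n")
    for name, value in cpu_percents(first, second).items():
        print(name, value)
```

On a live Linux host, the two readings would come from opening /proc/stat at the start and end of the monitoring interval.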

Meters acquired by custom meter agent

No

Categories

Meter Name

Details

Unit

Collection source and determination method

Note

Monitoring Interval

1

CPU Utilization

cpu.user.percents

CPU utilization (user mode)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

2

CPU Utilization

cpu.nice.percents

CPU utilization (low-priority user mode)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

3

CPU Utilization

cpu.system.percents

CPU utilization (system mode)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

4

CPU Utilization

cpu.idle.percents

CPU utilization (idle, waiting for tasks)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

5

CPU Utilization

cpu.iowait.percents

CPU utilization (waiting for I/O)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

6

CPU Utilization

cpu.irq.percents

CPU utilization (hardware interrupt)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

7

CPU Utilization

cpu.softirq.percents

CPU utilization (software interrupt)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

8

CPU Utilization

cpu.steal.percents

CPU utilization (time used by other OS in virtual environment)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

9

CPU Utilization

cpu.guest.percents

CPU utilization (virtual CPU for guest OS)

percent

Calculated from differences from the values collected previously by referring to /proc/stat.

Calculate the average of monitoring intervals

Optional

10

Disk

disk.{device name}.reads.completed.count

Completed read I/Os

count

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

11

Disk

disk.{device name}.reads.merged.count

Merged read I/Os

count

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

12

Disk

disk.{device name}.reads.sectors.count

Read sectors

count

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

13

Disk

disk.{device name}.reads.milliseconds

Read milliseconds

millisecond

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

14

Disk

disk.{device name}.writes.completed.count

Completed write I/Os

count

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

15

Disk

disk.{device name}.writes.merged.count

Merged write I/Os

count

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

16

Disk

disk.{device name}.writes.sectors.count

Write sectors

count

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

17

Disk

disk.{device name}.writes.milliseconds

Write milliseconds

millisecond

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

18

Disk

disk.{device name}.currently.ios.count

I/Os in execution

count

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Monitoring value

Optional

19

Disk

disk.{device name}.ios.milliseconds

I/O execution milliseconds

millisecond

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

20

Disk

disk.{device name}.weighted.ios.milliseconds

Weighted I/O execution milliseconds

millisecond

Calculated from differences from the values collected previously by checking devices of the monitoring target and by referring to corresponding lines in /proc/diskstats.

Calculate total value within monitoring interval

Optional

21

Network

network.{nwif name}.receive.bytes

Received bytes

byte

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

22

Network

network.{nwif name}.receive.packets.count

Received packets

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

23

Network

network.{nwif name}.receive.errs.count

Received packets with errors

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

24

Network

network.{nwif name}.receive.drop.count

Received packets dropped

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

25

Network

network.{nwif name}.receive.fifo.count

Received packets with FIFO errors

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

26

Network

network.{nwif name}.receive.frame.count

Received packets with frame errors

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

27

Network

network.{nwif name}.receive.compressed.count

Compressed received packets

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

28

Network

network.{nwif name}.receive.multicast.count

Received packets transmitted by multicast

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

29

Network

network.{nwif name}.transmit.bytes

Transmitted bytes

byte

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

30

Network

network.{nwif name}.transmit.packets.count

Transmitted packets

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

31

Network

network.{nwif name}.transmit.errs.count

Transmitted packets with errors

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

32

Network

network.{nwif name}.transmit.drop.count

Transmitted packets dropped

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

33

Network

network.{nwif name}.transmit.fifo.count

Transmitted packets with FIFO errors

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

34

Network

network.{nwif name}.transmit.colls.count

Transmitted packets with collisions

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

35

Network

network.{nwif name}.transmit.cerrier.count

Transmitted packets with carrier loss

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

36

Network

network.{nwif name}.transmit.compressed.count

Compressed transmitted packets

count

Calculated from differences from the values collected previously by referring to /proc/net/dev. Calculate values per NW interface

Calculate total value within monitoring interval

Optional

37

Load average

loadavg.1.count

Load average (1 min)

count

Refer to /proc/loadavg

Average value in past 1 minute during monitoring

Optional

38

Load average

loadavg.5.count

Load average (5 min)

count

Refer to /proc/loadavg

Average value in past 5 minutes during monitoring

Optional

39

Load average

loadavg.15.count

Load average (15 min)

count

Refer to /proc/loadavg

Average value in past 15 minutes during monitoring

Optional

40

Memory

memory.memtotal.kilobytes

Total memory capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

41

Memory

memory.memfree.kilobytes

Available memory capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

42

Memory

memory.buffers.kilobytes

Buffer memory capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

43

Memory

memory.cached.kilobytes

Cache memory capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

44

Memory

memory.swapcached.killobytes

Swap cache memory capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

45

Memory

memory.active.killobytes

Active memory capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

46

Memory

memory.inactive.killobytes

Inactive memory capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

47

Memory

memory.swaptotal.killobytes

Total swap capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

48

Memory

memory.swapfree.killobytes

Available swap capacity

kilobyte

Refer to /proc/meminfo

Monitoring value

Optional

49

Memory

memory.usedtotal.killobytes

Used memory capacity

kilobyte

Calculated from MemTotal, MemFree, Buffers and Cached in /proc/meminfo

Monitoring value

Optional
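
The collection methods listed in the table can be sketched as follows. This is a minimal illustration, not the custom meter agent's actual code: the function names are hypothetical, counter wrap-around is ignored, and the formula for used memory (MemTotal minus MemFree, Buffers and Cached) is an assumption based on the fields the table names.

```python
# Field order of /proc/net/dev after the interface name (8 receive, 8 transmit).
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame", "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls", "carrier", "compressed"]
KEYS = ["receive." + f for f in RX_FIELDS] + ["transmit." + f for f in TX_FIELDS]


def parse_net_dev(text):
    """Parse /proc/net/dev content into {interface: {counter name: value}}."""
    counters = {}
    for line in text.splitlines()[2:]:  # the first two lines are headers
        if ":" not in line:
            continue
        name, values = line.split(":", 1)
        counters[name.strip()] = dict(zip(KEYS, (int(v) for v in values.split())))
    return counters


def interval_totals(previous, current):
    """Per-interface difference between two snapshots (total within the interval)."""
    totals = {}
    for nwif, curr in current.items():
        prev = previous.get(nwif)
        if prev is None:
            continue  # interface appeared during the interval; no baseline yet
        totals[nwif] = {key: curr[key] - prev[key] for key in curr}
    return totals


def used_memory_kb(meminfo_text):
    """Assumed formula: MemTotal - MemFree - Buffers - Cached (all in kB)."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key.strip()] = int(parts[0])
    return fields["MemTotal"] - fields["MemFree"] - fields["Buffers"] - fields["Cached"]
```

On a Linux server, the snapshots would be taken by reading /proc/net/dev once per monitoring interval and passing the previous and current contents to interval_totals.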

Note

To run the custom meter agent, the server on which the custom meters run needs access to the Monitoring API endpoint via the Internet.
If the instance running the custom meter agent stops, or if communication with the Internet is interrupted, values collected by the custom meters will not be accumulated on the monitoring server.
The custom meter agent is intended to run on the following operating systems.
Linux: CentOS 7.1, Ubuntu 16.04 LTS (The agent also works in other Linux environments, but they are not supported.)
The custom meter agent is intended for use on virtual servers of Enterprise Cloud 2.0, and the right to use it is granted for that purpose.
In addition, customers can ask the following kinds of questions within the basic support offered by NTTCom.
1) Questions on custom meter agent specifications and how to run the agent
2) Questions/consultations on investigating causes and finding workarounds when a custom meter agent does not work normally
If NTTCom judges that it would help solve a problem, a revised version of the custom meter agent will be developed and offered to customers. The lead time for delivering the revised version, however, is determined by NTTCom.
The availability and traffic capacity of the repository server are provided on a best-effort basis.

Other notes

  • Custom meter values can be registered up to 1,500 times per day (the 1,501st registration results in an error), which allows values to be registered about once per minute. Registrations beyond 1,500 within one day are not accepted. The daily count is reset at 0:00 (UTC). Operations on custom meters other than value registration, such as editing, do not count toward the daily limit of 1,500.

  • Up to 100 actions can be included in a single API request. However, “Create new custom meter” and “Register custom meter data” cannot be executed for the same custom meter in the same request.

  • Customers cannot delete custom meters themselves. A custom meter is deleted automatically once the configured retention period has passed after it becomes inactive. (Inactive custom meters are not subject to billing.)
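
The limits above can be tracked on the client side before requests are sent. The following is a minimal sketch under stated assumptions: the class and function names are hypothetical, each registered value is assumed to count once toward the daily limit, and the authoritative count is kept by the Monitoring service, not by this code.

```python
from datetime import datetime, timezone

DAILY_LIMIT = 1500              # registrations accepted per UTC day
MAX_ACTIONS_PER_REQUEST = 100   # actions allowed in one API request

# Evenly spaced registrations that just fit the daily limit:
# 86,400 seconds / 1,500 = 57.6 seconds, i.e. about once per minute.
MIN_INTERVAL_SECONDS = 24 * 60 * 60 / DAILY_LIMIT


class DailyRegistrationQuota:
    """Client-side counter that resets at 0:00 (UTC)."""

    def __init__(self, limit=DAILY_LIMIT):
        self.limit = limit
        self._day = None
        self._used = 0

    def try_consume(self, now=None):
        """Return True if one more registration fits into today's quota."""
        now = now or datetime.now(timezone.utc)
        today = now.astimezone(timezone.utc).date()
        if today != self._day:          # a new UTC day started: reset the count
            self._day, self._used = today, 0
        if self._used >= self.limit:
            return False                # this would be the 1,501st registration
        self._used += 1
        return True


def split_into_requests(actions, size=MAX_ACTIONS_PER_REQUEST):
    """Split a list of actions into chunks of at most `size` per API request."""
    return [actions[i:i + size] for i in range(0, len(actions), size)]
```

A sender would call try_consume() before each registration and skip or defer the value when it returns False.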


11.1.2.4. Graph Generation

  • This function displays a graph based on the collected/accumulated data.

  • Multiple graphs can be displayed within the same screen. The screen is refreshed automatically and the refresh interval can be customized.

(Figure: graph creation screen)

11.1.2.5. Action Settings

  • This function collects, stores, and analyzes monitoring data, and lets Customers change the threshold levels and actions for monitored items.

  • This function can send failure notifications to Customers by various methods.


Notification Method

Customers can receive failure notifications by the following methods:
  • Notification via email

  • Notification by sending an API request specified by the user


Note

Notifications for past alarms may arrive late because of network delay or server load.
When an alarm notification is received, check the latest status on the portal.

Customizing Notification

Customers can further customize failure notifications depending on the conditions.
Notification methods can be selected according to designated time slots and failure levels.
  • Notification levels (Urgent, Alert, Info, etc.) can be set, and a notification destination can be assigned to each level.


Detailed Information of Failure Notification

Notification messages include the failure date and time, asset information, settings and the latest data. The following items can be inserted into notification messages by using macros:
  • Date and Time

  • Host name

  • Items and trigger description

  • Latest data

  • User comments


Note

For notifications about resources in the JP1 or JP2 region, alarm emails are sent in both Japanese and English.
Alarm emails for regions other than JP1 and JP2 are sent in English only.

11.1.2.6. Monitoring Item List

Customers can see the list of monitoring items they have set.
They can also see the details of each configured threshold and whether it is currently exceeded.
Additionally, they can select and delete multiple alarms displayed in the list.
(Figure: alarm list)

11.1.2.7. Alert History

Customers can see the history of actions previously executed.
(Figure: alarm history)

11.1.2.8. Data Download

Performance data collected from various resources can be downloaded via the GUI or the API.
The following items can be specified for a download.
  • Resource Name

  • Meter Name

  • Time Period (going back no further than the data retention period)
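
The time period restriction can be checked before a download is requested. The sketch below is illustrative only: the function name is hypothetical, and it assumes that the retention period counts calendar days in UTC including the current day (32 days by default, or 397 days where the extended period applies).

```python
from datetime import datetime, timedelta, timezone


def clamp_download_period(start, end, now, retention_days=32):
    """Trim the requested [start, end] period to the retained data.

    All arguments are timezone-aware datetimes. Raises ValueError when the
    requested period lies entirely outside the retention window.
    """
    # Assumption: the window covers `retention_days` UTC days, counting today.
    today_start = now.astimezone(timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    earliest = today_start - timedelta(days=retention_days - 1)
    start = max(start, earliest)
    end = min(end, now)
    if start >= end:
        raise ValueError("requested period lies outside the retained data")
    return start, end
```

For example, a request that starts several months back would be shortened so that it begins on the first day still inside the retention window.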


11.1.2.9. API

This service provides an API.
Customers can obtain monitoring information by sending API requests.
By configuring API parameters, they can specify the details of the data they would like to get, as well as the time period.
API requests for monitoring are limited to 4 calls/sec per source IP address and 240 calls/min per user.
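
A client can stay within both limits by throttling its own calls. The following is a minimal sketch of a sliding-window limiter, not part of the service: the class and function names are hypothetical, and it assumes a single client process that sends all requests from one source IP address for one user. Note that 240 calls/min corresponds to an average of 4 calls/sec, so the two limits describe the same sustained rate.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `window_seconds` span."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window = window_seconds
        self._stamps = deque()

    def wait_time(self, now):
        """Seconds to wait before another call is allowed at time `now`."""
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()      # forget calls that left the window
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self._stamps[0] + self.window - now

    def record(self, now):
        self._stamps.append(now)


def call_with_limits(func, limiters, clock=time.monotonic, sleep=time.sleep):
    """Run `func` once every limiter permits it, sleeping if necessary."""
    while True:
        now = clock()
        delay = max(limiter.wait_time(now) for limiter in limiters)
        if delay <= 0:
            break
        sleep(delay)
    for limiter in limiters:
        limiter.record(now)
    return func()


# The two documented limits: 4 calls/sec and 240 calls/min.
MONITORING_LIMITERS = [SlidingWindowLimiter(4, 1.0), SlidingWindowLimiter(240, 60.0)]
```

Each API call would then be wrapped as call_with_limits(send_request, MONITORING_LIMITERS), where send_request stands for whatever function the client uses to issue the request.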

11.1.3. Menu

11.1.3.1. Menu / Plan

The service menu consists of the following two types.
Customers can select either of them as needed.
 

Basic

Advanced

Pricing

Free of charge

Charged

Data Retention Period

32 days (including the current day)

32 days or 397 days (about 1 year and 1 month, including the current day)
Up to 100
From the 101st item onward, a monthly fee is added for each one
Customers can set the period for each meter.
The retention period can be extended for up to 300 meters per tenant

Number of Alarms

10 Alarms
Per tenant

Up to 100
From the 101st item onward, a monthly fee is added for each one
Up to 300 per tenant

Custom Meter

1 piece
Per tenant

10 pieces
A monthly fee is added for each additional piece
Up to 30 per tenant

Others

N/A

Dashboard function, Data download function
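
The Advanced allowances in the table can be read as an included quantity plus separately billed additional items, up to a per-tenant maximum. The sketch below illustrates this reading for alarms (100 included, 300 maximum); the function name is hypothetical, and no actual prices are implied.

```python
def billable_additional_items(configured, included=100, maximum=300):
    """Number of items billed on top of the included allowance.

    Defaults follow the Advanced alarm figures in the table: 100 alarms are
    included, and each alarm from the 101st onward is billed separately.
    """
    if configured < 0:
        raise ValueError("configured must not be negative")
    if configured > maximum:
        # Beyond the per-tenant maximum, a separate arrangement is required.
        raise ValueError("exceeds the per-tenant maximum")
    return max(0, configured - included)
```

With these defaults, 100 configured alarms incur no additional items, while 101 alarms incur one.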

Note

Customers on the Advanced Plan who find the per-tenant maximums for “Data Retention Period”, “Alarms” and “Custom Meters” insufficient are asked to contact us separately.

11.1.3.2. Subscription Methods

All Customers with an Enterprise Cloud 2.0 contract can subscribe to the Basic menu of Monitoring.
To subscribe to the Advanced menu, Customers place an order via the Customer Portal as needed.

11.1.4. Terms and Conditions

11.1.4.1. Conditions for Usage with Other Services

The meters available for monitoring settings are limited to those of the menus to which Customers subscribe.

11.1.4.2. Minimum Use Period

For the Advanced menu, the minimum usage period is one month.

Menu

Billing Unit

Advanced Plan

per Tenant

Advanced Plan - Additional Meter

per Meter

Advanced Plan - Additional Alarms

per Alarm

Advanced Plan - Additional Custom Meter

per Meter


11.1.5. Pricing

11.1.5.1. Initial Fee

There is no initial fee.

11.1.5.2. Monthly Fee

The Basic menu is free. A fixed monthly fee applies to the Advanced menu.

11.1.6. Quality of Service

11.1.6.1. Support Coverage

The functions described in Section 11.1.2.1 (List of Functions) are supported by this service.

11.1.6.2. Operation

This service is provided with the following operational quality.

Item

Details

Operation Time

24/7

Failure Support Policy

Rapid restoration performed by NTT Com.


11.1.6.3. SLA

This service does not provide an SLA.


11.1.7. Restrictions

Please note that if a resource exists for less time than the monitoring interval, no meter is output for it.
API requests for monitoring are limited to 4 calls/sec per source IP address and 240 calls/min per user.