Approaches to assessing sizing

Table of contents

Minimum requirements
Performance table
Approaches to calculating resources utilized

Minimum requirements

Minimum server parameters for platform operation in single-server mode:

4 cores,
8 GB RAM,
100 GB HDD.

Recommended server parameters for platform operation in single-server mode with database and file storage on the server:

8+ cores,
32+ GB RAM,
250+ GB SSD with a TBW rating of 3500+ TB,
1+ TB HDD/SSD for the database,
1+ TB HDD/SSD for file storage,
RAID.

Performance table

Load testing was performed on different processors in single-server and dual-server configurations. Three tests were run, each emphasizing one metric: calls per second (CPS), the number of simultaneous conversations, or the number of simultaneous IVR sessions. In each test the other two metrics were non-zero but well below their peak achievable values.

The results for the different processors were summarized and normalized to a single core.

Average performance per 2.5 GHz core:

Test emphasis	Per core (2.5 GHz)	12-core server (2.5 GHz)
CPS	CPS 6, calls 25, IVR 25	CPS 72, calls 300, IVR 300
Calls	CPS 0.5, calls 180, IVR 0	CPS 6, calls 2000, IVR 0
IVR	CPS 0.5, calls 125, IVR 125	CPS 6, calls 1500, IVR 1500

Approaches to calculating resources utilized

Three approaches are distinguished:

Expansion. Load the system gradually and evaluate actual resource utilization; if resources run short, add more.
Load extrapolation. Measure the resource costs of a small volume of processes in a fully functioning system, for example with 10 operators, then extrapolate the result to the expected number of simultaneously executing processes, for example 1000 operators.
Expert judgment. Starting from the estimated figures for telephony, apply a factor of 3-6 depending on the type of redundancy and the size of the system.

Approach 1. Expansion

We install the platform on one server and configure it for full production operation in the single-server version. As the load grows, we add servers and distribute the configuration across them.

We always start with 1, 2, or 4 servers, depending on the type of redundancy and the roughly estimated load. When the load approaches 60-70%, we identify the costly services and move them to separate machines, or reconfigure the system for the increased number of servers.

The components most often moved out to separate machines are:

databases;
microservices serving media traffic;
microservices serving SIP signaling;
microservices serving IVR and other scenarios;
microservices serving the data model;
web server microservices;
product layer microservices;
change subscription microservices.

Then observe how the remaining microservices use server resources and identify the heaviest consumers.

Approach 2. Load extrapolation

The system is configured for full functionality on a number of servers that is known to be necessary (a number that will certainly not have to be reduced later, possibly a single server). All scenarios are created, and all processes, rules, connections, and integrations are configured.
A portion of the planned processes or call flows is routed to the system, in a proportion that can be quantified relative to the planned peak load, for example 10%: 100 operators are connected instead of 1000, the balancer sends only 10% of the call flow, and so on.
The actual resource utilization of the system is measured while this portion is being served.
The measured value is extrapolated to the planned value according to the share served.
A redundancy factor of 1.5 to 2 is applied.
A peak factor of 1.5 is applied to leave headroom (base CPU utilization on servers should not exceed 70%, excluding possible short peaks).
The required number of servers of each type is calculated.
The database, archived data, and file storage are considered separately.
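The arithmetic of these steps can be sketched as follows. This is only an illustration under stated assumptions: the function name, its parameters, and the sample measurement (3 busy cores at a 10% share) are hypothetical, and the extrapolation is taken to be linear in the share served.

```python
import math

def servers_by_extrapolation(measured_cores, share, cores_per_server,
                             redundancy=1.5, peak_factor=1.5):
    """Scale resource usage measured at `share` of peak load to a server count.

    measured_cores   -- cores actually busy while the partial load was served
    share            -- fraction of the planned peak load served (0 < share <= 1)
    redundancy       -- redundancy factor, 1.5 to 2 per the steps above
    peak_factor      -- headroom factor, 1.5 per the steps above
    """
    if not 0 < share <= 1:
        raise ValueError("share must be in (0, 1]")
    full_load_cores = measured_cores / share   # linear extrapolation (assumption)
    sized_cores = full_load_cores * redundancy * peak_factor
    return math.ceil(sized_cores / cores_per_server)

# Hypothetical measurement: 3 cores busy at 10% of the call flow, 12-core servers.
print(servers_by_extrapolation(3.0, 0.10, 12))                  # 6
print(servers_by_extrapolation(3.0, 0.10, 12, redundancy=2.0))  # 8
```

Database and file storage are deliberately left out of this sketch, since they are considered separately.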

It is important to estimate accurately what share of the total processes the measured part represents. Every process that runs in the fully loaded system must also run, and be represented, in the measured part.

The load in the measured part must not shrink quadratically (or otherwise nonlinearly) relative to the full load. For example, shrinking the collections by a factor of 10 shrinks their cross product by a factor of 100, so a query with a JOIN is processed correspondingly faster than it would be at full scale.
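A small numeric illustration of this effect, assuming a naive JOIN that has to consider every pair of rows (real query planners with indexes behave better, so this is a worst-case sketch; the row counts are hypothetical):

```python
def join_pairs(left_rows, right_rows):
    """Number of row pairs an unindexed JOIN has to consider."""
    return left_rows * right_rows

full_scale = join_pairs(100_000, 100_000)   # planned production size
test_scale = join_pairs(10_000, 10_000)     # both collections shrunk 10x

# The test workload is 100x lighter, not 10x, so multiplying the
# measured cost by 10 would underestimate the real cost tenfold.
print(full_scale // test_scale)  # 100
```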

Approach 3. Expert judgment

An estimate for telephony can be built from the metrics given in the performance table.

Roughly estimate the expected load on the system at peak:
- How many simultaneous conversations and which ones.
- How many subscribers are waiting in queues (probably no more than the number of external trunks).
- How many telephone devices will be registered at one time and at what frequency re-registration will take place.
- Which data-processing jobs will run, how often, and under what load. This can be expressed as the number of components executed per second.
- How many users will be connected to the system at the same time.
- Which reports and dashboards will be built concurrently with the main job.
- What share of conversations will be recorded and how long the recordings will be stored.
- How many domains will be in the system.
Define external services.
- Whether external file storage will be connected.
- Whether the database will be hosted on external servers.
Determine the type of redundancy based on the objectives.
- Whether the installation must survive the loss of one server. The calculated factor is (N+1)/N, but in practice it is 1.5 to 2 depending on the configuration.
- Whether there will be data center redundancy (a dual data center setup). Factor 2.
- Whether zones will be formed that can be taken out of service. The factor depends on the number and size of the zones; roughly 2.
Estimate the number of cores required.
- Using the performance table above, estimate the number of cores from the expected peak load.
- From the number of cores, estimate the number of servers.
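As a rough illustration of this step, the per-core figures from the performance table can be turned into a core count. The sketch below assumes, which the source does not state, that the CPS, conversation, and IVR demands are additive and that each can be divided by its best per-core figure, with IVR sessions counted separately from ordinary conversations. The function names and the example load are hypothetical, and the result is a baseline before any redundancy or headroom factors.

```python
import math

# Best per-core figures from the performance table (one 2.5 GHz core).
CPS_PER_CORE = 6
CALLS_PER_CORE = 180
IVR_PER_CORE = 125

def estimate_cores(peak_cps, peak_calls, peak_ivr):
    """Baseline core count; summing the three demands is an assumption."""
    demand = (peak_cps / CPS_PER_CORE
              + peak_calls / CALLS_PER_CORE
              + peak_ivr / IVR_PER_CORE)
    return math.ceil(demand)

def estimate_servers(cores, cores_per_server=12):
    """Whole servers needed to supply the given number of cores."""
    return math.ceil(cores / cores_per_server)

# Hypothetical peak: 12 CPS, 900 conversations, 250 IVR sessions.
cores = estimate_cores(12, 900, 250)
print(cores, estimate_servers(cores))  # 9 1
```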
Estimate the disk subsystem required.
- The disk subsystem is discussed in Disk load.
Estimate the RAM required.
- A base of 32 GB per server covers the standard telephony processes at any scale and a call center of up to 100 people with the database hosted on the same server.
- RAM requirements increase with:
  - large databases that retain several years of history (on the servers where the database runs);
  - reports built over long date ranges (on the servers where the database runs);
  - large cache-enabled collections in the product layer data model.
We add a margin to reduce load at peaks.
- Average CPU load should not exceed 70%, which gives a factor of 1.5 (base CPU utilization on servers should not exceed 70%, excluding possible short peaks).
- SSD/HDD disks should not be more than 50% full (bytes and inodes).
- A read/write queue on an HDD should not persist for more than 1 second at a time, and in total should not exceed 1% (not relevant for SSDs).
We add a reserve according to the redundancy scheme. The calculated factor is (N+1)/N, where N is the calculated number of servers; in practice it is 1.5 to 2.
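The difference between the calculated and the practical reserve can be seen in a short sketch. The function names are hypothetical; the only inputs taken from the text are the (N+1)/N formula and the practical range of 1.5 to 2.

```python
import math

def redundancy_factor(n_servers):
    """Calculated N+1 factor: one spare server on top of N."""
    return (n_servers + 1) / n_servers

def servers_with_reserve(n_servers, practical_factor=None):
    """Server count after adding the redundancy reserve.

    Without a practical factor, the calculated N+1 scheme is used;
    otherwise N is multiplied by the factor (1.5 to 2 in practice).
    """
    if practical_factor is None:
        return n_servers + 1
    return math.ceil(n_servers * practical_factor)

print(redundancy_factor(4))           # 1.25
print(servers_with_reserve(4))        # 5
print(servers_with_reserve(4, 1.5))   # 6
```

The calculated factor approaches 1 as N grows, which is one reason a fixed practical factor gives a noticeably larger reserve on big installations.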
When the database is installed on the cluster servers, we add resources for database operation (from 2 cores and 100 GB up to 24+ cores and 1+ TB with RAID).
- For sizing, allow approximately 150 MB per month of storage for a domain with 100 users and 30 thousand connections per month.
If local file storage is used, we add disk resources (SSD/HDD) for it.
- For sizing, 1 minute of stereo MP3 takes approximately 180 KB.
- Other collections with attachments also need to be evaluated:
  - whether the scripts save anything;
  - whether mailboxes are set up and how long messages are stored;
  - whether there is any other historical data with attachments;
  - whether the database is backed up by the system's own means (even if to external storage).
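The two reference sizes quoted above (150 MB per domain per month for the database, 180 KB per minute of stereo MP3) can be combined into a quick storage estimate. This is a sketch only: the function names and the example volumes are hypothetical, and it assumes every domain resembles the reference domain of 100 users and 30 thousand connections per month.

```python
DB_MB_PER_DOMAIN_MONTH = 150   # reference domain: 100 users, 30k connections/month
MP3_KB_PER_MINUTE = 180        # 1 minute of stereo MP3

def db_size_gb(domains, months):
    """Database growth for domains similar to the reference domain."""
    return domains * months * DB_MB_PER_DOMAIN_MONTH / 1024

def recordings_size_gb(minutes_per_month, months, recorded_share=1.0):
    """Recording storage; recorded_share is the fraction of talk time recorded."""
    kb = minutes_per_month * months * recorded_share * MP3_KB_PER_MINUTE
    return kb / 1024 / 1024

# Hypothetical: 5 domains kept for 3 years; 100,000 recorded minutes a month kept for 1 year.
print(round(db_size_gb(5, 36), 2))                # 26.37
print(round(recordings_size_gb(100_000, 12), 2))  # 205.99
```

Attachments, mailboxes, and backups from the list above are not covered and would have to be added on top.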
We apply a factor of 1.1 to 2 for the product layer operating on the project data model.
- Factor 1.1: the product layer is installed in a single domain and is used only to view the call log.
- Factor 1.5: the product layer is installed in all domains, and there are more than 20 of them.
- Factor 2.0: users work heavily in applications, and non-voice service processes are executed.
Video conferencing (VCS). Examples:
- One server with 8 cores and a 1 Gbit/s link can serve, for example:
  - 1 room for 200 people where everyone has their camera on and there are up to 10 speakers;
  - 1 room for 500 people where one speaker broadcasts video/screen and sound and the rest have cameras and microphones off;
  - 10 rooms of 300 people where only audio is broadcast;
  - 100 rooms of 3-5 participants where everyone has video on and everyone is a speaker.
- Mixing is deferred and uses spare server resources.