Step 8. Which role instance is active
We have come close to this question several times in the previous steps, but only now are enough concepts in place to formulate it and answer it fully.
For example, in the case study from step 1, WS still has to discover the right DC instance, even though the DC role is reserved in Active-Passive mode and split into groups by domain zones.
Several principles address this problem, and each of them is applied in its own place.
The most obvious approach is brute force: given the configuration file, simply poll every server on the site that could in principle host the desired role. This is not the most efficient way for several reasons, and the loss of performance and speed is not even the main one; the main objections lie in the architecture and style of the Incoplax platform itself. At a minimum, the search could be improved by adding caching and by encapsulating it in a separate service. Almost everything that follows in this step is, in one way or another, a combination of these two improvements.
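A minimal sketch of what such a brute-force search might look like, just to make the later improvements concrete. The configuration layout, addresses, and the TCP probe are all assumptions for illustration, not part of the Incoplax API:

```python
import socket

# Hypothetical site configuration: servers that could, in principle, host the role.
CONFIG = {
    "dc": ["10.0.0.11:7001", "10.0.0.12:7001", "10.0.0.13:7001"],
}

def probe(address: str, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to the candidate server succeeds."""
    host, port = address.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False

def find_role_instance(role: str) -> str | None:
    """Brute force: walk every candidate from the config until one answers."""
    for address in CONFIG.get(role, []):
        if probe(address):
            return address
    return None

print(find_role_instance("dc"))
```

Every lookup pays the full cost of scanning the site, which is exactly what caching and encapsulation are meant to remove.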
Principle 1. Named microservices
To the other services in the system, the DC role is visible as a list of named microservices, one per domain. These microservices are started by the active instance of the role and are not started by the backup instances, and it is these microservices that are split across role groups according to the domains they serve. For WS it is therefore enough to formulate a DC-oriented request and specify the domain of interest; the rest is handled by the global-name subsystem. WS does not care whether the MDC or the SDC variant of the DC runs on its site, how the domain zones are divided among the DC role groups, or which domains are served on the current site and which are proxied. All of that is implemented outside of it: in the global name registration system and in the DC role itself.
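A hedged sketch of the idea from the WS side: all WS needs is a way to turn "role + domain" into a global name and hand the request to whatever answers to it. The `dc:<domain>` naming scheme and both callbacks are invented here for illustration:

```python
def global_name(role: str, domain: str) -> str:
    """Illustrative naming convention: one named microservice per role per domain."""
    return f"{role}:{domain}"

def ws_send_to_dc(domain: str, resolve, send):
    """WS side: formulate a DC-oriented request; resolution is someone else's job.

    `resolve` maps a global name to an address (the global-name subsystem),
    `send` delivers the request; both are stand-ins here."""
    address = resolve(global_name("dc", domain))
    if address is None:
        raise LookupError(f"no active DC serves {domain}")
    return send(address, domain)

# Usage with stand-in resolution and delivery:
print(ws_send_to_dc("example.com",
                    resolve=lambda name: {"dc:example.com": "10.0.0.11:7001"}.get(name),
                    send=lambda addr, domain: f"sent request for {domain} to {addr}"))
```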
The global name registration system is the primary purpose of the RPCI service role. It registers mappings from global names to microservice addresses on the current site, monitors the registered microservices so that zombie registrations are neither stored nor used, and, on request from other subsystems, returns the address of a live microservice responsible for a given global name.
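As an illustration only (not the actual RPCI interface), here is a minimal in-memory registry with the three responsibilities just listed: registration, liveness monitoring so that zombie records are dropped, and lookup by global name. The TTL-based expiry is an assumption standing in for whatever monitoring mechanism RPCI really uses:

```python
import time

class GlobalNameRegistry:
    """Toy model of a global-name registry: name -> (address, last_heartbeat)."""

    def __init__(self, ttl: float = 10.0):
        self.ttl = ttl                # seconds a registration stays valid without a heartbeat
        self._records: dict[str, tuple[str, float]] = {}

    def register(self, name: str, address: str) -> None:
        self._records[name] = (address, time.monotonic())

    def heartbeat(self, name: str) -> None:
        if name in self._records:
            address, _ = self._records[name]
            self._records[name] = (address, time.monotonic())

    def lookup(self, name: str) -> str | None:
        """Return the address for a global name, dropping zombie registrations."""
        record = self._records.get(name)
        if record is None:
            return None
        address, seen = record
        if time.monotonic() - seen > self.ttl:
            del self._records[name]   # zombie: no heartbeat within the TTL
            return None
        return address

registry = GlobalNameRegistry()
registry.register("dc:example.com", "10.0.0.11:7001")
print(registry.lookup("dc:example.com"))   # -> "10.0.0.11:7001"
print(registry.lookup("dc:other.org"))     # -> None
```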
The RPCI role is reserved in Active-Active mode and has a client module that is active on every node and available to any service running there. In effect, the client module is a functional facade over RPCI.
This makes it a simultaneous implementation of both the caching tactic and the encapsulation principle. Each role that owns globally named microservices is still responsible for registering its microservices with RPCI, for switching global names when the active instance migrates (failover), and for following the naming conventions (names are used on both sides: when a microservice is registered and when it is looked up before a request is sent). RPCI, in turn, is responsible for continuously providing up-to-date information.
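A sketch of what a client-side facade over such a registry might look like. The cache-with-failure-invalidation behaviour is the point being illustrated; the class name, the `lookup_fn` callback, and the fake registry are all assumptions, not the real client module:

```python
class NameClient:
    """Client-side facade: caches resolved names and drops entries reported as bad."""

    def __init__(self, lookup_fn):
        self._lookup = lookup_fn        # stand-in for a call to the RPCI server
        self._cache: dict[str, str] = {}

    def resolve(self, name: str) -> str | None:
        if name in self._cache:
            return self._cache[name]    # answered locally, no round trip to RPCI
        address = self._lookup(name)    # cache miss: ask the registry
        if address is not None:
            self._cache[name] = address
        return address

    def report_failure(self, name: str) -> None:
        """Caller could not reach the cached address: forget it so the next
        resolve() goes back to the registry."""
        self._cache.pop(name, None)

# Usage with a trivial fake registry:
fake_registry = {"dc:example.com": "10.0.0.11:7001"}
client = NameClient(fake_registry.get)
print(client.resolve("dc:example.com"))   # miss -> registry -> cached
print(client.resolve("dc:example.com"))   # hit  -> served from the cache
client.report_failure("dc:example.com")   # next resolve() will re-query
```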
When considering the service interaction of roles in the diagram, it does not matter how they are distributed across servers; all that matters is that they are located on the same site.
Principle 2. Client module for encapsulation
The first principle above mentions the client module of the RPCI role, which is available on every node. It locates the RPCI role's service on its own, keeps track of instances that turn out to be unreachable, and periodically rechecks them. The same principle is used for other service roles; in particular, this is how ServerShell finds IC.
This approach implements the encapsulation principle. Whether there is caching depends on the specific implementation of the client component of the role.
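One possible shape of such a client module, purely illustrative: it keeps a blacklist of instances that failed, skips them for a quarantine period, and retries them afterwards. The instance list, the probe callback, and the timings are made up:

```python
import time

class RoleLocator:
    """Remembers which candidate instances were unreachable and when to retry them."""

    def __init__(self, candidates: list[str], quarantine: float = 30.0):
        self._candidates = candidates
        self._quarantine = quarantine          # seconds before an unreachable instance is rechecked
        self._bad: dict[str, float] = {}       # address -> time it was marked unreachable

    def pick(self, probe) -> str | None:
        """Return the first candidate that answers the probe, skipping quarantined ones."""
        now = time.monotonic()
        for address in self._candidates:
            marked = self._bad.get(address)
            if marked is not None and now - marked < self._quarantine:
                continue                       # still in quarantine, do not probe again yet
            if probe(address):
                self._bad.pop(address, None)   # reachable again: clear the mark
                return address
            self._bad[address] = now           # unreachable: quarantine it
        return None

locator = RoleLocator(["10.0.0.21:7100", "10.0.0.22:7100"])
print(locator.pick(lambda addr: addr.endswith("22:7100")))   # pretend only the second answers
```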
Principle 3. Local monitoring of accessibility
A number of roles are reserved and scaled in Active-Active mode. So far only WS has been considered among such roles, and it is not a perfect example, since only external actors use it. Nevertheless, imagine that some internal service needs a request to be processed by one of the instances of such a role. If a specific instance is required, it is addressed directly, and there is no problem (this is what external actors do with WS, for example). If it is enough for the request to be handled by any instance of the role, then an available one must be selected; if it suddenly fails, the next one can be taken and the request executed again. How exactly this is done depends on the context and may involve transactions.
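A minimal sketch of that select-and-retry behaviour, assuming a hypothetical `send` callback that either returns a response or raises on an unreachable instance; retrying after a partial side effect would need the transaction handling mentioned above, which is omitted here:

```python
import random

def call_any(instances: list[str], send) -> tuple[str, object] | None:
    """Try instances of an Active-Active role in random order until one answers."""
    for address in random.sample(instances, k=len(instances)):
        try:
            return address, send(address)       # first instance that handles the request wins
        except ConnectionError:
            continue                            # take the next instance and repeat the request
    return None

# Usage with a fake transport: only one of the instances "answers".
def fake_send(address: str) -> str:
    if address != "10.0.0.32:8000":
        raise ConnectionError(address)
    return "ok"

print(call_any(["10.0.0.31:8000", "10.0.0.32:8000", "10.0.0.33:8000"], fake_send))
```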
To avoid contacting the RPCI server on every call, the local client module, after receiving a response from the server, monitors the process it was given. The process data is then provided locally and quickly until that process terminates.
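An illustrative sketch of that idea, with a hypothetical `is_alive` callback standing in for a real process monitor and `resolve_remote` standing in for a request to the RPCI server:

```python
class LocalProcessCache:
    """After RPCI returns a process, keep it locally and reuse it while it is alive."""

    def __init__(self, resolve_remote, is_alive):
        self._resolve_remote = resolve_remote   # stand-in for a request to the RPCI server
        self._is_alive = is_alive               # stand-in for a process monitor
        self._known: dict[str, str] = {}        # global name -> process id / address

    def get(self, name: str) -> str | None:
        pid = self._known.get(name)
        if pid is not None and self._is_alive(pid):
            return pid                          # answered locally, no call to RPCI
        pid = self._resolve_remote(name)        # cached process died (or unknown): ask again
        if pid is not None:
            self._known[name] = pid
        return pid

# Usage with fake resolution and liveness:
alive = {"proc-1"}
cache = LocalProcessCache(resolve_remote=lambda name: "proc-1",
                          is_alive=lambda pid: pid in alive)
print(cache.get("dc:example.com"))   # resolved remotely, then monitored locally
print(cache.get("dc:example.com"))   # served locally while proc-1 is alive
```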
Principle 4. Ping-pong and mass distribution
Caching is not always appropriate: some operations cannot tolerate even five seconds of unnecessary delay. Such a delay can occur when the cache keeps returning data about a server that is already unreachable, and the request initiator sends a request and waits for a response for a while before declaring failure and moving on to the next instance. The solution is a system-level ping-pong before sending the request: the ping either gets through quickly, or the server/service is definitely unavailable. The advantage of the approach is a cheap and fast relevance check. Internal services do not abuse pings in tasks that are not time-critical and where request delivery is already reliable, so the total share of pings in the traffic stays low.
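A hedged sketch of the ping-before-send pattern. A plain TCP connect with a short timeout stands in for the platform's system ping; the addresses and the `send` callback are assumptions:

```python
import socket

def ping(address: str, timeout: float = 0.2) -> bool:
    """Cheap reachability check before sending an expensive request."""
    host, port = address.rsplit(":", 1)
    try:
        with socket.create_connection((host, int(port)), timeout=timeout):
            return True
    except OSError:
        return False

def send_with_ping(instances: list[str], send):
    """Ping first: either the instance answers quickly, or it is skipped at once."""
    for address in instances:
        if not ping(address):
            continue                 # definitely unavailable, no long wait on the request itself
        return send(address)
    raise RuntimeError("no reachable instance")

# Usage: the only candidate here is unreachable, so the failure is detected quickly.
try:
    send_with_ping(["10.0.0.41:9000"], send=lambda addr: "ok")
except RuntimeError as err:
    print(err)   # -> no reachable instance
```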
Mass distribution of requests can be used where processing a request takes significantly longer than processing a packet. In that case, mass distribution leaves it to the receiving role to determine which instance will fulfill the request. Alternatively, the answers of every instance may be needed: for example, to "collect all external connections of the system" one would have to poll all WS role instances and merge their answers into a single list. Naturally, mass distribution is performed in parallel (requests are sent to all recipients first, and only then are the responses collected).
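A minimal parallel fan-out sketch for the "poll all WS instances and merge the answers" case. The thread pool, the `request` callback, and the fake per-instance data are illustrative assumptions:

```python
from concurrent.futures import ThreadPoolExecutor

def broadcast(instances: list[str], request) -> list:
    """Send the request to all role instances in parallel, then collect every answer."""
    with ThreadPoolExecutor(max_workers=max(1, len(instances))) as pool:
        futures = [pool.submit(request, addr) for addr in instances]
        results = []
        for future in futures:
            try:
                results.append(future.result())
            except Exception:
                pass                 # an unreachable instance simply contributes nothing
        return results

# Usage: "collect all external connections of the system" across all WS instances.
fake_connections = {
    "ws-1": ["conn-a", "conn-b"],
    "ws-2": ["conn-c"],
}
merged = []
for part in broadcast(list(fake_connections), fake_connections.__getitem__):
    merged.extend(part)
print(merged)   # -> ['conn-a', 'conn-b', 'conn-c']
```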
- Next step: Step 9. Cross-site communication
- Previous step: Step 7. Ensuring scalability