3.15.2024

EventStoreDB with AWS: how the cluster works and how to configure the launch

Technologies don't stand still. Our previous article about the EventStoreDB Cluster with AWS has undergone a significant update to reflect the ever-evolving world of technology.

So, how does the cluster work, and how do you configure the launch of EventStoreDB on AWS? Of course, you can start by studying the documentation. While the documentation has all the information, working through it can be daunting for beginners. Additionally, only a few online publications feature examples of how to use this tool.

We are thrilled to share a new article from our Back-end engineer, Oleg Kowal. It is helpful for beginners looking to explore a cool tool and get started without spending too much time reading documentation.

Key principles of EventStoreDB operation as part of a group

EventStoreDB is a database that stores information as an event log. It is currently one of the most popular tools for implementing the Event Sourcing pattern. In this article, I will not only elaborate on the fundamental principles of using EventStoreDB in a group (cluster) setting but also demonstrate the various stages of installation and configuration using the EC2 cloud service from AWS.

I'll use free tools for the examples: the AWS free tier and the open-source EventStoreDB OSS build. Thus, anyone can experiment with them.

As part of the group, EventStoreDB functions according to the principle of shared-nothing architecture. Every machine or virtual machine that runs the database software is called a group participant (a cluster node). Each of them uses its own processors, RAM, and disks independently. Any coordination between nodes is done at the software level over the network.

This approach provides better fault tolerance since participants do not use shared resources. In case of problems with one of the participants, the group continues its function. Furthermore, it is feasible to distribute data across multiple geographic regions, which can effectively minimize latency for users.

Maintenance is made easy with EventStoreDB's fault tolerance, allowing you to install updates and make changes for each participant individually. The tool uses a preset group size for a predefined number of participants. If you need to add a new participant, you must first change the group size setting on all current ones.

The disadvantage of this approach is redundancy because each group participant keeps a complete copy of the data. It creates additional load since data needs to be transmitted over the network to each participant and then stored on a disk. Participants must also agree on the values to be stored.

Data consistency and availability are well-known problems in distributed systems, and the trade-off between them is described by the CAP theorem. Under normal conditions, when the network is working correctly, a distributed system can ensure both data consistency and availability. However, when a network partition occurs, you must choose between data consistency and availability.

EventStoreDB achieves data consistency by using a quorum. In this model, a majority of the participants in the group must confirm that they have committed a write to disk before the write is acknowledged to the client. Thus, to tolerate the failure of n nodes, the cluster's size must be (2n + 1). For example, a three-node cluster can continue accepting writes if one node is unavailable. If more participants fail, the group loses the ability to write new information and only allows reading.
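The quorum arithmetic can be sketched in a few lines of Python. This is only an illustration of the majority rule described above, not EventStoreDB code, and the function names are hypothetical.

```python
def quorum_size(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    if cluster_size < 1:
        raise ValueError("cluster_size must be at least 1")
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes may fail while a majority is still available."""
    return cluster_size - quorum_size(cluster_size)


def can_accept_writes(cluster_size: int, alive_nodes: int) -> bool:
    """Writes need confirmations from a majority of the configured size."""
    return alive_nodes >= quorum_size(cluster_size)


if __name__ == "__main__":
    # Print cluster size, quorum size, and tolerated failures side by side.
    for size in (1, 3, 5, 7):
        print(size, quorum_size(size), tolerated_failures(size))
```

Note that an even-sized cluster tolerates no more failures than the odd size just below it, which is why odd cluster sizes of the form 2n + 1 are the usual choice.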

Under normal conditions, one group participant assumes the leader's primary role while the others are designated as followers.

Leader

The group appoints a leader through an election process. The node in this role is responsible for reconciling and saving data to disk before sending a confirmation message to the client. Only one participant can assume the leader's role at a time. If the system detects two nodes with the Leader role within the group, it will initiate a new vote. The participant holding less data will be removed from the group so that it can rejoin it.
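To make the "two leaders" rule concrete, here is a deliberately simplified Python sketch. It is not how EventStoreDB implements elections: the `Node` class and the `last_position` field are hypothetical, and the real election takes more criteria into account than a single number. It only shows the idea that the leader holding less data steps aside.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str           # "Leader" or "Follower"
    last_position: int  # hypothetical measure of how much data the node holds


def resolve_duplicate_leaders(nodes: list[Node]) -> list[str]:
    """Keep the leader with the most data; return the names of nodes that must rejoin."""
    leaders = [node for node in nodes if node.role == "Leader"]
    if len(leaders) <= 1:
        return []  # nothing to resolve
    keeper = max(leaders, key=lambda node: node.last_position)
    demoted = [node for node in leaders if node is not keeper]
    for node in demoted:
        # Simplification: here the node just becomes a follower,
        # whereas the article describes it leaving and rejoining the group.
        node.role = "Follower"
    return [node.name for node in demoted]
```

For example, with two leaders at positions 100 and 80, the node at position 80 is the one returned for rejoining.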

Follower

The follower role in a group is assigned based on a voting process. The group uses one or more participants with the follower role to form a quorum, that is, the majority of nodes needed to confirm that a record is saved.

To determine the leader and follower roles, the EventStoreDB group uses a consensus algorithm based on voting. Here is a visualization of an algorithm. Consensus is a fundamental problem in fault-tolerant distributed systems, which involves multiple participants agreeing on values to make a final decision.

Typically, consensus algorithms make progress as long as most participants are available, meaning a group of 5 participants can continue working even if two fail. However, if most participants fail, the algorithm stops making progress rather than returning an incorrect result.

Group participants use the Gossip protocol to discover each other's status and select a group leader. This protocol works by having each participant send some data about other participants to its peers, allowing them to build a global map from limited local interactions. Here is a good visualization that demonstrates how protocols of this kind operate.

These protocols have demonstrated their effectiveness in detecting failures in large distributed systems asynchronously. Moreover, they have no limitations associated with reliable multicasting for group communication.
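A toy simulation can show how a global map emerges from local exchanges. The sketch below is a generic push-pull gossip round in Python, not EventStoreDB's implementation. The heartbeat counter, the random peer choice, and all function names are assumptions made for illustration.

```python
import random


def merge(view_a: dict, view_b: dict) -> dict:
    """Combine two views, keeping the highest heartbeat seen for each node."""
    merged = dict(view_a)
    for node, heartbeat in view_b.items():
        merged[node] = max(merged.get(node, -1), heartbeat)
    return merged


def gossip_until_converged(nodes, seed=0, max_rounds=1000):
    """Return (views, rounds) once every node knows about every other node."""
    if len(nodes) < 2:
        raise ValueError("gossip needs at least two nodes")
    rng = random.Random(seed)
    # At the start each node only knows about itself.
    views = {node: {node: 0} for node in nodes}
    for round_no in range(1, max_rounds + 1):
        for node in nodes:
            views[node][node] = round_no  # bump own heartbeat
            peer = rng.choice([n for n in nodes if n != node])
            shared = merge(views[node], views[peer])
            # Push-pull: both sides end up with the same merged view.
            views[node] = dict(shared)
            views[peer] = dict(shared)
        if all(set(view) == set(nodes) for view in views.values()):
            return views, round_no
    raise RuntimeError("views did not converge")


if __name__ == "__main__":
    views, rounds = gossip_until_converged(["node-1", "node-2", "node-3"])
    print(f"converged after {rounds} round(s)")
```

Each node only ever talks to one peer per round, yet after a few rounds every view contains all nodes.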

Now that you understand what an EventStoreDB group is and how it works, it's time to install it on EC2.

Installing EventStoreDB on EC2

To get started, open the AWS Management Console. Go to the EC2 section and click on Instances. Then, navigate to a beautiful orange Launch instances button.

Enter the name.

Choose the Ubuntu 20.04 operating system.

Choose the t2.micro type in the “Instance type” section, and select “Proceed without a key pair” in the Key pair (login) section.

Click on the “Edit” button in the network settings, and do not modify any existing fields. Then use the “Add security group rule” button to add two new rules.

The existing rule allows us to access the instances over port 22 via SSH to configure them. The two new rules open port 2113, which is required to access the EventStoreDB web interface, and port 1112, which is used for communication between group participants.

I left the instances open to all IP addresses to simplify this example. However, AWS (and I personally) recommend configuring your security group rules to allow access only from known IP addresses.
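Once the instances are running, you may want to confirm that the security group rules behave as intended. The following Python sketch simply tries to open a TCP connection to each port. The helper names are hypothetical. A port also shows as closed when nothing is listening on it yet, for example 2113 and 1112 before EventStoreDB is started.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host: str, ports=(22, 2113, 1112)) -> dict:
    """Map each port from the walkthrough to whether it accepted a connection."""
    return {port: port_is_open(host, port) for port in ports}
```

Calling `check_ports` with the public IP of an instance returns a dictionary such as `{22: ..., 2113: ..., 1112: ...}` with a boolean per port.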

You can leave the “Configure storage” section unchanged. In the “Advanced details” section, add a set of commands to install EventStoreDB on each instance in the User data field.

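#!/bin/bash
# Add the EventStore package repository
curl -s https://packagecloud.io/install/repositories/EventStore/EventStore-OSS/script.deb.sh | sudo bash
# Install a pinned version; -y answers the confirmation prompt, since user data runs non-interactively
sudo apt-get install -y eventstore-oss=22.10.0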

In the “Summary” section, change the number of instances to 3.

Click “Launch instance.” After the instances launch successfully, continue configuring the EventStoreDB group.

You should prepare a list of private IPs of your instances. In our example, I referenced them in the config files as private-ip-node-1, private-ip-node-2, and private-ip-node-3.

Click on the instance and open the Details tab to view the instance’s private IP.

Next, we'll need to create a configuration file for each instance. To streamline this process, I've included a template (note the three dashes at the beginning of the file and the spacing when copying, as it uses YAML format).

This template is tailored for the first node, which is linked to the second and third nodes via GossipSeed. The same applies to all other nodes: each one uses its own private IP for IntIp and ExtIp and lists the other two nodes in GossipSeed.

Here's a sample configuration file with a simplified security setup. If you're using EventStoreDB for work, it's crucial to configure security features properly to prevent unauthorized access to sensitive data. For more in-depth information on security settings, please refer to the official EventStoreDB documentation.

---
# Paths
Db: /var/lib/eventstore
Index: /var/lib/eventstore/index
Log: /var/log/eventstore

# Insecure mode
# When running with protocol security
# disabled, everything is sent unencrypted
# over the wire
Insecure: true

# Network configuration
IntIp: private-ip-node-1
ExtIp: private-ip-node-1
HttpPort: 2113
IntTcpPort: 1112
EnableExternalTcp: false
EnableAtomPubOverHTTP: true

# Cluster gossip
ClusterSize: 3
DiscoverViaDns: false
GossipSeed: private-ip-node-2:2113,private-ip-node-3:2113

# Projections configuration
RunProjections: All

# Timeouts and intervals
GossipIntervalMs: 2000
GossipTimeoutMs: 3000
IntTcpHeartbeatInterval: 5000
IntTcpHeartbeatTimeout: 1000
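Only a few settings differ between the three files: IntIp, ExtIp, and GossipSeed. As a rough sketch, the node-specific values could be derived from the list of private IPs like this. The helper names and the example IP addresses in the usage note are hypothetical, while the setting names and port 2113 come from the template above.

```python
def node_specific_settings(node_ip: str, all_ips: list, http_port: int = 2113) -> dict:
    """Build the settings that change from node to node."""
    if node_ip not in all_ips:
        raise ValueError(f"{node_ip} is not in the list of cluster IPs")
    # Every node seeds gossip with all the *other* nodes.
    seeds = [f"{ip}:{http_port}" for ip in all_ips if ip != node_ip]
    return {
        "IntIp": node_ip,
        "ExtIp": node_ip,
        "ClusterSize": len(all_ips),
        "GossipSeed": ",".join(seeds),
    }


def render_lines(settings: dict) -> str:
    """Render settings as 'Key: value' lines to paste into the config file."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())
```

For example, `render_lines(node_specific_settings("10.0.0.2", ["10.0.0.1", "10.0.0.2", "10.0.0.3"]))` produces the lines for the second node, with the first and third nodes listed in GossipSeed.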

EventStoreDB also has an online configurator that guides you through all the necessary steps to create these files.

Next, connect to the instance and execute the following commands (*repeat for each group member).

Select the instance and click the “Connect” button.

After connecting to the instance, run the following commands:

Create a configuration file:

$ sudo touch /etc/eventstore/eventstore.conf

Open the file in any convenient way and paste the configuration that corresponds to this member:

$ sudo nano /etc/eventstore/eventstore.conf

Start EventStoreDB. After installation, the service does not start automatically. This lets you modify the configuration in /etc/eventstore/eventstore.conf first and avoid creating database and index files in their default location.

$ sudo systemctl start eventstore

You can check that your cluster is ready for use at http://public-ip-node-1:2113/web/index.html#/clusterstatus. (The public IP address is specified directly next to the private IP address. Be sure it's http, not https).

At this point, you should have all participants in the Alive status. Moreover, the group must have one Leader and two Followers.
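The same check can be scripted. The sketch below assumes that the node exposes a gossip document at `/gossip` on the HTTP port and that it contains a `members` list whose entries carry `state` and `isAlive` fields. Treat the endpoint and field names as assumptions to verify against your EventStoreDB version. The function names are hypothetical.

```python
import json
from collections import Counter
from urllib.request import urlopen


def fetch_gossip(host: str, port: int = 2113, timeout: float = 5.0) -> dict:
    """Fetch the gossip document over plain HTTP (the sample config is insecure)."""
    with urlopen(f"http://{host}:{port}/gossip", timeout=timeout) as response:
        return json.load(response)


def cluster_is_healthy(gossip: dict, expected_size: int = 3) -> bool:
    """True when all members are alive, with one Leader and the rest Followers."""
    members = gossip.get("members", [])
    # Count roles among the members that report themselves as alive.
    states = Counter(m.get("state") for m in members if m.get("isAlive"))
    return (
        len(members) == expected_size
        and sum(states.values()) == expected_size
        and states["Leader"] == 1
        and states["Follower"] == expected_size - 1
    )
```

A typical use would be `cluster_is_healthy(fetch_gossip("public-ip-node-1"))`, with the placeholder replaced by the public IP of one of your instances.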

This is a basic sample for launching an EventStoreDB cluster on AWS. However, kindly note that it is not a production-ready setup, just a reference to start with.

Additionally, if you prefer using Docker instead of AWS, the EventStoreDB documentation provides excellent examples.

Conclusions

Hopefully, this article has helped you to take initial steps towards developing distributed systems with EventStoreDB. I also hope that this article has made it easier for you to work with the official documentation in the future.