Thursday, November 19, 2020

AWS Network Firewall: More Than Just Layer 4


Up until very recently, network-based prevention has been quite limited in Amazon Web Services (AWS). Customers were left with the following options:
  • Create Security Groups to limit various types of layer 3 and 4 traffic to/from Elastic Compute Cloud (EC2) instances
  • Create Network Access Control Lists (NACL) to limit layer 3 and 4 traffic to/from entire Virtual Private Cloud (VPC) subnets
  • Route traffic through a network appliance running as an EC2 instance (not as "cloud-friendly" as this is often less scalable and sized to handle peak traffic)
To add more network protection options, AWS just released an awesome new capability in select regions called AWS Network Firewall. Protections that are afforded here are:
  • Allow or deny based on source IP and/or port, destination IP and/or port, and protocol (also known as 5-tuple)
  • Allow or deny based upon domain names
  • Allow or deny based upon Suricata-compatible IPS rules
Wait... we can make forwarding decisions based on PAYLOADS?! Now we're talking!

Similar to physical or virtual firewalls, some thought must be given to which assets to protect, which types of traffic to home in on, and other major considerations prior to deployment. You may have already done this if you previously deployed an EC2 instance-based firewall.

To deploy one or more of these, changes to the existing VPC architecture are required (more on this in a bit).

Some Assembly Required

A best practice outlined by AWS is to architect your VPC to support this new firewall. It is not as simple as turning on the service and being on your merry way. To support the upcoming case study, I'll keep the re-architected VPC (starting from the default VPC) to the following:
  • VPC: Create or use a default VPC
  • Subnets: Use two of the existing subnets created by the default VPC
    • Subnet 1: Rename to FirewallSubnet. This will be used to contain the Network Firewall endpoint.
    • Subnet 2 through X: Rename to ProtectedSubnetX. These will be used to house the instances being protected by the Network Firewall
  • EC2 instance(s): Deploy or re-home into the ProtectedSubnet(s)
  • Route Tables: The default Route Table will no longer work, but more on this after the Network Firewall is deployed

Network Firewall Rule Groups

The first component to build out the AWS Network Firewall (and last on their list in the VPC service... WHY AMAZON?!) is the Network firewall rule group. This is where you decide what to allow or deny based on the previous list (5-tuple, domain names, or IPS rules).

This is very simple to set up:
  • Determine if this rule group will be stateless (inspect individual packets on their own) or stateful (inspect packets within the context of the traffic flow)
  • Give the rule group a name and optional description
  • Assign a capacity (see the AWS documentation for more detail)
  • If a stateful rule group, follow the rest of the wizard to define your rule(s) for the traffic you wish to allow or deny

Firewall Policies

The next piece of the puzzle is a Firewall policy. These policies simply consist of one or more Firewall rule groups, so stepping through the wizard is very straightforward (give it a name and description, add the rule groups, choose a default action, and tag the policy if you wish).


Firewalls

The Firewall configuration is slightly more complex, but not by much. Here is where the re-architecting decisions begin to make more sense.

To begin configuring this Firewall, yet again, give it a name and description. Selecting the appropriate VPC is the next step (easy if you only have one... otherwise, ensure you are choosing the correct one!). After this, choose the FirewallSubnet (or whatever you named it) subnet as this is the subnet in which the Firewall endpoint will be placed.

The only other required step is to select the Firewall policy created earlier.

Some (More) Assembly Required

Your Firewall will deploy in a few minutes, but... it won't actually be doing anything until you force traffic through it. To do so, some VPC Route Table adjustments need to be made. Following AWS's guidance, creating a few Route Tables is the easiest-to-understand option. Here is an example of establishing three Route Tables to force traffic through the Network Firewall:

  • ProtectedRouteTable: Used to send any traffic from any instances in the ProtectedSubnet to any destination outside the subnet to the Network Firewall
  • FirewallRouteTable:  Used to send any traffic to the Internet Gateway
  • IGWRouteTable: Used to send any traffic destined to the ProtectedSubnet from outside the subnet to the Network Firewall
Sounds complex (and it is!) but the upcoming case study will show how this is set up in more detail. But first...


Logging

To appease your security analysts (or to at least see the fruits of your labor), you should be logging Network Firewall alerts at the very least. This is achieved by navigating to the Firewall details tab to set up logging, as this service does not log AT ALL unless you configure it to. The usual suspects are available as log destinations: an S3 bucket, CloudWatch Logs, or a Kinesis data firehose.

Case Study: Bring Your Own Suricata Rules (BYOSR?)

VPC Setup

And now a very detailed walkthrough. To explore this service, I began by using a "clean" and unused region in AWS so as not to affect any production workloads. I chose us-west-2 as it is one of two US regions that offer this service at the time of this writing (the other being us-east-1).

Since the default VPC is already there, time to make a few changes. I left the subnet configuration as it was (even though I'm only using two of the four subnets for this example) other than changing two names: ProtectedSubnet for one and FirewallSubnet for another.

EC2 Target Instance

Next up was to deploy an instance I'd like to protect. I simply chose an Amazon Linux 2 AMI, logged in, and added the following to support my upcoming "attack":

sudo yum update -y
sudo yum install httpd -y
echo "Hello Visiter" | sudo tee /var/www/html/Hello
sudo systemctl enable httpd
sudo systemctl start httpd

Network Firewall Rule Group

The major capability that really piqued my interest is the fact that you can not only block traffic based upon known bad IPs, ports, or domains, but also by ingesting Suricata IPS rules! To test this out, I created a stateful rule group that simply blocks any time it sees this very nasty traffic or... traffic that contains the text "Hello". Here is what this Suricata IPS-compliant rule looks like:

Network Policy

The policy, as stated earlier, is very simple. I just have the policy incorporate my lone rule.

Network Firewall

Again, my setup is very simple as it's ultimately a single rule in a rule group, a single rule group in a policy, and then the policy is used in the Network Firewall configuration.


To ensure that my rule is actually doing anything, I'll set up a CloudWatch log group called FirewallAlerts. I'll dig into this log group in a bit once some data is generated.

Back to the Network Firewall configuration under Firewall details, I chose the FirewallAlerts CloudWatch log group as my destination, but was faced with a few options on which types of data to send there:

Another opportunity to enable Flow logging! To save some money, though, and to keep my CloudWatch log group focused on alerts, I just chose Alerts here.

Route Tables

By far, the most complex part of this setup is the Route Table configuration, so I'll keep it as brief as possible. I followed the previous guidance of creating three Route Tables like so:


The Target of vpce-0698c483f02db5617 represents the Network Firewall. Also, the ProtectedSubnet was associated with this Route Table (for obvious reasons). With this, any traffic sent from the instance to a destination outside the subnet will be forwarded to the Network Firewall.


Here, the target of igw-0dd503989e7a965b9 represents the Internet Gateway which was already created when setting up the default VPC. The FirewallSubnet was associated with this Route Table and, combined with these rules, will force any traffic meant for the outside world to be routed to the Internet Gateway.


This Route Table is a bit different as it looks at the traffic as it is returning to the VPC. Since I wish to have the return traffic traverse the firewall, I have any traffic destined for the ProtectedSubnet IP space ( in this case) configured to be routed to the Network Firewall. Since this Route Table is to be assigned, not to a subnet, but to an Internet Gateway, an Edge association is configured.


And now to trigger the Firewall block. If you remember, the Suricata rule was simply looking for the text "Hello" in any TCP traffic:

drop tcp any any -> any any (msg: "No Hellos"; content: "Hello"; sid: 4000000; rev:1;)

To attempt to trigger this rule, I opened up a web-browser session and navigated to the IP address plus /Hello.

What caused the issue here? Was it the Network Firewall? Let's find out! Going back to the CloudWatch log group, I found an interesting log:

    "firewall_name": "NoHellos",
    "availability_zone": "us-west-2c",
    "event_timestamp": "1605798915",
    "event": {
        "timestamp": "2020-11-19T15:15:15.242314+0000",
        "flow_id": 967950538327967,
        "event_type": "alert",
        "src_ip": "XXX.XXX.XXX.XXX",
        "src_port": 49882,
        "dest_ip": "",
        "dest_port": 80,
        "proto": "TCP",
        "alert": {
            "action": "blocked",
            "signature_id": 4000000,
            "rev": 1,
            "signature": "No Hellos",
            "category": "",
            "severity": 3
        "app_proto": "http"

Looks like it worked!
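To make these alerts easier to act on, the events can be sifted with a few lines of Python. The sketch below assumes the JSON shape shown in the log above; the IP addresses are placeholders, not values from my log.

```python
import json

# One alert event in the shape AWS Network Firewall writes to CloudWatch
# (field names taken from the sample log above; addresses are placeholders).
sample = '''
{
    "firewall_name": "NoHellos",
    "availability_zone": "us-west-2c",
    "event": {
        "event_type": "alert",
        "src_ip": "198.51.100.10",
        "src_port": 49882,
        "dest_port": 80,
        "proto": "TCP",
        "alert": {
            "action": "blocked",
            "signature_id": 4000000,
            "signature": "No Hellos"
        }
    }
}
'''

def blocked_signatures(raw_events):
    """Return (signature, src_ip) pairs for events whose action is 'blocked'."""
    pairs = []
    for raw in raw_events:
        event = json.loads(raw)["event"]
        if event.get("alert", {}).get("action") == "blocked":
            pairs.append((event["alert"]["signature"], event["src_ip"]))
    return pairs

print(blocked_signatures([sample]))  # -> [('No Hellos', '198.51.100.10')]
```

From here, the same loop could feed a ticketing system or a metrics counter per signature.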


Thus far, I'm really impressed with what can be done using this new service. Yes, it is quite the level of effort if it involves re-architecting an existing deployment, and it can be costly (well over $200/month if my implementation were to run 24/7), but I'm sure that as the service takes off, more capabilities will be added to enhance your organization's defense in depth even further.

Monday, February 3, 2020

Impossible Traveler

Use Case

An impossible traveler, at least in the case of this article, is defined as the occurrence of a user interacting with the same resource from two different locations when, given the time delta and distance between the sources, the user could not possibly have made that trip in a reasonable amount of time.

For instance, let's assume that we have a user log in to a web resource from an IP address in San Francisco, CA at 20:00 UTC and then submit a POST to the site from an IP address located in New York, NY at 20:30 UTC. There is no method of transportation (at least until Elon Musk figures this out) that a human can take to travel over 4,100km in 30 minutes.
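As a quick sanity check on those numbers:

```python
# 4,100 km in 30 minutes implies a sustained speed far beyond any airliner.
distance_km = 4100
delta_hours = 0.5                 # 20:00 UTC to 20:30 UTC
required_kmph = distance_km / delta_hours
print(required_kmph)              # -> 8200.0 (vs ~900 kmph at cruise)
```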


Caveats

We are basing the location of the user on a source IP address. As you probably know, Virtual Private Network (VPN) connections can easily allow a user to log in to the resource from one location, connect their VPN client to a VPN endpoint in another country within seconds, and log into (or simply interact with) the resource again.

This will give the appearance of an impossible traveler. This could still warrant investigation depending on your corporate security policy. For example, if VPNs are required, they broke policy by connecting to the resource without using the VPN. Or, if VPN clients are not allowed, this would help highlight that violation as well.

Existing Tools

There are plenty of tools, most specifically supporting cloud services, that can pinpoint impossible traveler situations. Some vendors may also categorize this activity as a part of user anomaly detection. Some vendors I'm familiar with (there are many more):

These services do work pretty well, but what if we are generating our own logs that need to be analyzed? Don't worry, I wrote a Python script for that...

Located on my GitHub, there is a script that I... mostly wrote myself (thanks, Stack Overflow) to which you give a few parameters:

$ python -h
usage: [-h] [-k KMPH] firstIP secondIP

positional arguments:
  firstIP
  secondIP

optional arguments:
  -h, --help            show this help message and exit
  -k KMPH, --kmph KMPH  Speed (in kmph)

What the script essentially does is take your two IP addresses, do a GeoIP lookup to get their coordinates, then determine, based on the average speed you give it, how long it would take to get from the first IP to the second. By default, I chose the speed to default to 900kmph since, according to Wikipedia, the average cruising speed of a commercial airliner is between 880 and 926kmph.
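The core of that computation (minus the GeoIP lookup) can be sketched as below; the haversine formula and the San Francisco/New York coordinates are my illustration, not code lifted from the script.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinate pairs, in kilometers."""
    r = 6371.0  # mean Earth radius, km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def travel_time_hours(lat1, lon1, lat2, lon2, kmph=900.0):
    """Hours needed to cover the great-circle distance at the given speed."""
    return haversine_km(lat1, lon1, lat2, lon2) / kmph

# San Francisco to New York (approximate coordinates, for illustration)
hours = travel_time_hours(37.77, -122.42, 40.71, -74.01)
print(round(hours, 2))  # roughly 4.6 hours at 900 kmph
```

Compare that figure against the delta between the two log events to decide whether the travel was possible.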

To use this script, simply add the first IP in question, then the second IP, to get a rough estimate of how long it would take a commercial flight, at average cruising speed, to get from the first source to the second. If the time difference between the two events is less than that, you have an anomaly. Let's take a look at the following log:

2020-02-03T20:02:17Z "Login successful for ryanic"
2020-02-03T21:13:11Z "Login successful for ryanic"

If we were to do a GeoIP lookup (which the script is essentially doing), we would find that the first IP is located in Norwalk, CT (IBM) and the second is located in Houston, TX (HP). We would also find that these two locations are roughly 1,821km apart*.

* Assuming the GeoIP records are correct
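The delta between the two events can be confirmed with a couple of lines (timestamps taken from the log above):

```python
from datetime import datetime

# Timestamps from the two log lines above
first = datetime.fromisoformat("2020-02-03T20:02:17+00:00")
second = datetime.fromisoformat("2020-02-03T21:13:11+00:00")
delta = second - first
print(delta)  # -> 1:10:54
```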

In this situation, the user logged in just 1 hour, 10 minutes, and 54 seconds apart from these two locations. Let's use our script to see if this is possible given today's air travel:

$ python
Time to travel 1821.0 kilometers (Hours:Minutes)
-> 1:57

There you have it: looks like that's a no. If you've traveled via commercial flights, you know we did not account for taxiing to/from gates or getting up to cruising speed. This flight would more realistically take 3+ hours gate-to-gate.

This script is also flexible in that you can give it a different speed for commuters. Let's say the average speed in your local commuting area is 60kmph. You can give the -k (or --kmph) flag to the script and indicate the speed of your choice.

So, if you find your vendor does not support all "impossible traveler" situations, feel free to use my script!

Thursday, January 31, 2019

Logstash AWS Quick Deploy

For anyone looking to play around with Logstash, here's a super quick way to get it up and running in AWS (yes, it even works in the free tier if you choose a t2.micro).

Sunday, November 4, 2018

Highly-Available and Load-Balanced Logstash

The Challenge

When using the Elastic Stack, I've found that Elasticsearch and Beats are great at load balancing, but Logstash... not so much, as it does not support clustering. Issues arise when you have end devices that do not support installing Beats agents, which can send to two or more Logstash servers. To get around this, you would typically:

  • Set up any one of the Logstash servers as the syslog/event destination
    • Pro: Only one copy of the data to maintain
    • Con: What if that server or Logstash input goes down?
  • Set up multiple Logstash servers as the syslog/event destinations
    • Pro: More likely to receive the logs during a Logstash server or input outage
    • Con: Duplicate copies of the logs to deal with
A third option that I've developed and laid out below contains all of the pros and none of the cons of the above options to provide a highly-available and load-balanced Logstash implementation. This solution is highly scalable as well. Let's get started.


To begin creating this proof-of-concept solution, I began with a very minimal configuration:
  • Two virtual machines within the same layer 2 domain (inside VMware Fusion)
    • CentOS 7 64-bit
    • Logstash 6.4.2
    • Java
    • Keepalived
    • IP Virtual Server (ipvsadm)
  • Host machine to generate some traffic (which will generate sample logs)
    • Mac OSX
    • nc

Log Server Configuration

OS install

For this, I simply created a small VMware Fusion virtual machine using the CentOS 7 Minimal ISO as my installation source (this one in particular). The rest of the machine creation is pretty straightforward. (Note: I did change from NAT to Wi-Fi networking as I was having very strange issues with NAT networking)

After starting the virtual machine, the install process will begin. This is where you can just do a basic install, but I chose a few options that hit close to home with my day job:
  • Partition disk manually if intending to use a security policy (this would otherwise cause a security policy violation that will keep us from proceeding)

  • Configure static addressing (my Wi-Fi network within Fusion is with a gateway)
  • Apply the DISA STIG for CentOS Linux 7 because... security.
  • Don't forget to set the root password and create an administrative user. Without this, you'll have a hard time logging in (especially via SSH... given this security policy)

Application Install

From here, let the machine reboot and SSH in (it's a much better experience than using the console via Fusion, in my opinion). Some packages can now be added.
  • First, the Logstash and Load Balancing pre-requisite applications:
    • sudo yum -y install java tcpdump ipvsadm keepalived
  • Next, install Logstash per Elastic's best practices:
    • sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
    • sudo vi /etc/yum.repos.d/logstash.repo

      [logstash-6.x]
      name=Elastic repository for 6.x packages
      baseurl=https://artifacts.elastic.co/packages/6.x/yum
      gpgcheck=1
      gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
      enabled=1
      autorefresh=1
      type=rpm-md
    • sudo yum -y install logstash

    Logstash configuration

    There would be no way to show off all of the possible Logstash configurations (that's some research for you :) ), so I'll just set up a simple one for testing our highly-available Logstash cluster:
    • This is a bit different, but the API will need to be exposed outside the localhost:
      • sudo vi /etc/logstash/logstash.yml
        • Uncomment http.host and set it to the server's IP address
        • Uncomment http.port and set it to JUST 9600
    • The input and output configuration for Logstash is next (you can change the filename to something else... unless you agree). For this testing, I'm just setting up a raw UDP listener on port 5514 and writing to a file in /tmp.
      • sudo vi /etc/logstash/conf.d/ryanisawesome.conf
        input {
          udp {
            host => "" # server's IP
            port => 5514
            id => "udp-5514"
          }
        }

        output {
          file {
            path => "/tmp/itworked.txt"
            codec => json_lines
          }
        }

    SELinux Tweaks

    There are a few settings that need to be changed to allow keepalived and ipvsadm to work properly.
    • Set the nis_enabled SELinux boolean to allow keepalived to call scripts which will access the network
      • sudo setsebool -P nis_enabled=1
    • Allow IP forwarding and binding to a nonlocal IP address
      • sudo vi /etc/sysctl.conf
        net.ipv4.ip_forward = 1
        net.ipv4.ip_nonlocal_bind = 1
        • If you chose the DISA STIG Policy during the VM build, comment out "net.ipv4.ip_forward = 0" (yes... this is a finding if this system is not a router. But once ipvsadm is running it IS a router. So we're all good ;) )
      • sudo sysctl -p


    Keepalived

    Here's where the real bread-and-butter of this setup lies: keepalived. This application is typically used to provide a virtual IP between two or more servers. If the primary server were to go down, the second (slave) would pick up the IP to avoid any substantial downtime. This is not a bad solution in regards to high-availability, but that means only one server will be online at a given time to process our logs. We can do better.

    Another feature of keepalived is virtual_servers. With this, you can configure a listening port for our virtual IP and, when data is received, will forward to a pool of servers via a load-balancing method of your choosing. The configuration would look something like this:
    • sudo vi /etc/keepalived/keepalived.conf
      # Global Configuration
      global_defs {
        notification_email {
        }
        smtp_server localhost
        smtp_connect_timeout 30
        router_id LVS_MASTER
      }

      # describe virtual service ip
      vrrp_instance VI_1 {
        # initial state
        state MASTER
        interface ens33
        # arbitrary unique number 0..255
        # used to differentiate multiple instances of vrrpd
        virtual_router_id 1
        # for electing MASTER, highest priority wins.
        # to be MASTER, make 50 more than other machines.
        priority 100
        authentication {
          auth_type PASS
          auth_pass secret42
        }
        virtual_ipaddress {
        }
      }

      # describe virtual Logstash server
      virtual_server 5514 {
        delay_loop 5
        lb_algo rr
        lb_kind NAT
        protocol UDP

        real_server 5514 {
          MISC_CHECK {
            misc_path "/bin/python /etc/keepalived/ udp-5514"
          }
        }
        real_server 5514 {
          MISC_CHECK {
            misc_path "/bin/python /etc/keepalived/ udp-5514"
          }
        }
      }

    Logstash Health Checks

    You'll probably notice a reference to a health-check script in the above configuration. Keepalived will need to run an external script to determine whether or not the configured "real server" is eligible to receive the data. This is typically pretty easy to do with TCP... if a SYN, SYN/ACK, ACK is successful, we can assume the service is listening. This is not an option with a Logstash UDP input as nothing is sent back to confirm that the service is listening. What can be used instead is the API. The following script simply makes an API call to list the node's stats, parses the resulting list of inputs, and, if the input we're looking for is up, exits normally.

    • sudo vi /etc/keepalived/
      #!/bin/python
      import sys
      import urllib2
      import json

      if len(sys.argv) != 3:
          print "This script needs 3 arguments!: IP input-id"
          sys.exit(1)

      res = urllib2.urlopen('http://' + sys.argv[1] + ':9600/_node/stats').read()
      inputs = json.loads(res)['pipelines']['main']['plugins']['inputs']

      match = False

      for input in inputs:
          if sys.argv[2] == input['id']:
              match = True

      if match == True:
          sys.exit(0)
      else:
          sys.exit(1)
    Keepalived will add this server to the list of real servers if the exit code of our script is 0 and remove it from the list if it is anything except 0. The aforementioned keepalived configuration is set up to check this script every 5 seconds for minimal log loss if one goes down. Adjust as you see fit here (i.e., how much loss can you acceptably handle).

    Of course, you would have to create several of these if you have Logstash listening on multiple ports, but cut and paste is easy. Just look at /var/log/messages to ensure that these scripts are exiting properly. If you see a line like "Oct 30 09:44:58 stash1 Keepalived_healthcheckers[16141]: pid 16925 exited with status 1", either the script failed or a particular input is not up. Since this error message isn't the most descriptive, you'll have to manually test or view each input on each host to see which one it is. You can manually test the Logstash inputs (once that service is running) by issuing:

    • /bin/python /etc/keepalived/ <IP> <input-id>
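On hosts with Python 3, the same health check might look like the sketch below; the function names and sample invocation are my own, but the stats path (pipelines, main, plugins, inputs) and the exit-code contract match the script above. The parsing is split into its own function so it can be exercised without a live Logstash node.

```python
#!/usr/bin/env python3
# Python 3 sketch of the same keepalived health check (names illustrative).
# Exit code 0 means the input is up; anything else pulls the real server.
import json
import sys
import urllib.request

def input_is_up(stats, input_id):
    """Return True if an input with the given id appears in the node stats."""
    inputs = stats["pipelines"]["main"]["plugins"]["inputs"]
    return any(i.get("id") == input_id for i in inputs)

def check(ip, input_id):
    """Query the Logstash API and exit with the keepalived-friendly code."""
    url = "http://" + ip + ":9600/_node/stats"
    stats = json.load(urllib.request.urlopen(url))
    sys.exit(0 if input_is_up(stats, input_id) else 1)

# Invoked from keepalived as, e.g.: check("192.0.2.11", "udp-5514")
```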

    Firewall Rules

    Sure, we could just disable firewalld... but we did just expose our API to anything that can reach this machine, so we need to lock this down a bit better. Don't worry, the rules are pretty straight-forward. (Note: replace '' with your host which is sending logs to Logstash and '', '', and '' with the two Logstash servers and virtual IP address, in that order).
    • sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address= protocol value=vrrp accept' 
    • sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address= protocol value=vrrp accept'
    • sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address= destination address= port port=9600 protocol=tcp accept'
    • sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address= destination address= port port=9600 protocol=tcp accept'
    • sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address= destination address= port port=5514 protocol=udp accept'
    • sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address= destination address= port port=5514 protocol=udp accept'
    • sudo firewall-cmd --permanent --add-rich-rule='rule family=ipv4 source address= destination address= port port=5514 protocol=udp accept'
    • sudo firewall-cmd --reload

    The Second Logstash server

    Shut down the Logstash server virtual machine since it's much easier to just clone this one and make a few configuration changes instead of stepping through this process all over again.

    Now that it's shut down...

    Boot the second one up (leaving the first powered off for now) and make the following changes in the VM console:
    • Set hostname
      • sudo hostnamectl set-hostname stash2
    • Set IP address
      • sudo vi /etc/sysconfig/network-scripts/ifcfg-<interface>
        • Change IPADDR to appropriate IP address
      • sudo systemctl restart network
    • Change Logstash listening IPs
      • sudo vi /etc/logstash/logstash.yml
        • Change http.host to stash2's IP address
      • sudo vi /etc/logstash/conf.d/ryanisawesome.conf
        • Change host to stash2's IP address
    • Swap the unicast_src_ip and unicast_peer IP addresses 
      • sudo vi /etc/keepalived/keepalived.conf
    • Reboot
      • sudo reboot now
    Now, you should be able to start the original virtual machine (in my case, Stash1)

    Putting It All Together

    We've finally reached the point to fire up all the services and test out the HA Logstash configuration. On each Logstash VM:

    • sudo systemctl enable logstash
    • sudo systemctl start logstash
    • sudo systemctl enable keepalived
    • sudo systemctl start keepalived
    You can monitor that Logstash is up by viewing the output of:
    • sudo ss -nltp | grep 9600
    If you have no output, it's not up yet. If it doesn't come up after a few minutes, check out /var/log/logstash/logstash-plain.log for any error messages. Personally, I like to "tail -f" this file right after starting Logstash to ensure everything is working properly (plus it looks cool to those that look over your shoulder as all that nerdy text flies by).

    On each machine, you can now check that ipvsadm and keepalived are configured properly and playing nicely together. You should be able to run the following commands and get similar output (your IPs may be different, but you should see TWO real servers):
    • ip a
      • Only ONE of the two servers should have the virtual IP assigned (by default, the one with the higher IP address since the priority is the same and this is the tie-breaker when using VRRP)
    • sudo ipvsadm -ln
      IP Virtual Server version 1.2.1 (size=4096)
      Prot LocalAddress:Port Scheduler Flags
        -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
      UDP rr ops
        ->           Masq    1      0          0       
        ->           Masq    1      0          0 
    To test that load balancing is happening, the sample log source (in my case, my host operating system) will need to send some data over UDP 5514 to the virtual IP address. To do this, I'm going to use netcat (but really anything that can send data manually over UDP will work... including PowerShell). 
    • for i in $(seq 1 4); do echo "testing..." | nc -u -w 1; done
    What I just did was send four test messages to the virtual IP. If everything worked properly, the virtual server will have received the messages and load-balanced, in a round-robin fashion, to each server's /tmp/itworked.txt file. On each server, let's check it out.
    • cat /tmp/itworked.txt
    Success! Both servers received two messages!
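If nc isn't available on the sending host, the same smoke test can be run from Python; the VIP in the example comment is a placeholder for your virtual IP.

```python
import socket

def send_test_messages(vip, port, count=4, text="testing..."):
    """Fire-and-forget UDP datagrams at the Logstash virtual IP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sent = 0
        for _ in range(count):
            sent += sock.sendto(text.encode(), (vip, port))
        return sent  # total bytes handed off; UDP gives no delivery guarantee
    finally:
        sock.close()

# Example (placeholder VIP): send_test_messages("192.0.2.10", 5514)
```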

Thursday, August 9, 2018

Ryan's CTF Has Come to an End...

Thanks everyone!

My Google Cloud Platform trial is very low on funds, so it's time to end the CTF. I hope everyone had a great time. Here are the results:

372 teams!

Only 17% got the NINJA challenge... All but 8 were solved AFTER John Hammond's video walkthrough.

3142 flag submissions!

First 10 with a perfect score of 1000!

True CTF NINJAs with perfect scores!

To all that played and provided feedback, THANK YOU! There will be more of this to come!

Monday, July 16, 2018

Free CTF is Online!

While my free Google Cloud Platform account is still active (until ~30 Nov 2018), feel free to try out my Capture the Flag at! Have fun, red teamers! Ground rules: please try not to hack the platform itself. That ruins the fun for others.

Monday, April 2, 2018

Resolving REST over HTTP Man-in-the-Middle with IPSEC

So... what's the problem?

I was recently working with some Elastic Stack clustering when I realized that, if two or more nodes traverse a security boundary, they may be subject to tampering if an evil man (or woman) in the middle were to intercept and modify the data sent between them. Yes, there is a solution from Elastic, called X-Pack Security, that can provide SSL between the nodes, but that's one of the rare things that Elastic charges for. The reason I was even pondering the use of Elastic was to replace a certain, well-known data aggregation solution that is eating up quite a bit of budget (and even more memory).

Below is the traffic between two nodes ( and This data sent over TCP port 9300 is used for Elasticsearch cluster communication and is simply REST over HTTP. What could possibly go wrong? This communication could be things such as node-to-node conversations, replication of data, or other important cluster information. If this data were to be captured, attackers could get some valuable intelligence regarding the systems supported by this log aggregation service. Worse yet, if this communication were to be poisoned via a man-in-the-middle attack, the entire log aggregation service could be deemed useless.

What was the solution?

After a ton of Googling, I stumbled upon a solution that has been very well known in the Red Hat community for some time now and that, honestly, I should have been aware of - LibreSwan. LibreSwan is a free, open-source package that allows for host-to-host, host-to-network, and network-to-network IPSEC tunneling. This would be perfect! After some trial and error, I decided to use the Pre-Shared Key implementation to test it out. Here's the very simple setup and configuration of LibreSwan:

1) Install Libreswan on each node:

# yum -y install libreswan

2) Configure a tunnel (in this case, it is named mytunnel and the config file is located at /etc/ipsec.d/my_host-to-host.conf):

# vi /etc/ipsec.d/my_host-to-host.conf

conn mytunnel

3) Create a random, base64-encoded key of 48 bytes (you only need to do this on one machine, as the keys MUST match on both sides; doing this twice will not yield the same key):

# openssl rand -base64 48
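If openssl isn't handy, equivalent key material can be generated from Python; this is my own sketch, not part of the LibreSwan setup. Note that 48 raw bytes base64-encode to the 64-character string used below.

```python
import base64
import os

def make_psk(nbytes=48):
    """Base64-encode nbytes of OS randomness for use as a pre-shared key."""
    return base64.b64encode(os.urandom(nbytes)).decode("ascii")

psk = make_psk()
print(len(psk))  # -> 64: 48 raw bytes grow by 4/3 under base64
```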

    4) Create a "secrets" file with the above command output as the pre-shared key (Note: this is one, continuous line):

    # vi /etc/ipsec.d/es.secrets : PSK "n8+ef4PA4VAtqd1iX7QzC3sLmxlLi30LzOTgg7JBmNXQ7Wsi8SnweO+hjlXNK/rE"

    5) Enable and start the Libreswan service:

    # systemctl enable ipsec
    # systemctl start ipsec

    As you can see below, the Wireshark output is now showing ESP communication between the two endpoints! As long as that pre-shared key is kept secret, we're good (although it may be a good idea to rotate this key occasionally so offline brute force attacks would be less successful).

What's next?

I should probably move away from the pre-shared key implementation and get this working with public/private keys so I don't need to worry about secure transmission of the pre-shared key in production. Other than that, this solution seems pretty solid, as I've been sending quite a bit of data at my Elastic Stack implementation and it's replicating across the cluster rather seamlessly!