Network Access Control Lists vs Security Groups

Both are used to protect networks and resources, but there is often confusion about the difference between Network Access Control Lists (NACLs) and Security Groups, and when each should be used.

This post aims to demystify the two concepts.

The differences that we will cover are:

  • Stateful vs Stateless
  • Inbound vs Outbound
  • Allow vs Deny
  • Rule Order

A future post will then look at how to use this knowledge to apply both NACLs and Security Groups, and how to troubleshoot connectivity issues when both are in place.

If you want to see how this forms part of a Well-Architected network, take a look at my post on creating a VPC.


How to represent network traffic

So the first thing to understand is how data travels over a network.

I like to think of an IT network as a road network, and in a lot of ways the two are very similar.

Photo by Alex Kalinin / Unsplash

Long-distance connections, such as the international bearers, are like large motorways/interstates/expressways. Junctions on these are the large network centers and carrier hotels with huge routers. These junctions connect with other large bearers but also to smaller "local" connections.

Local connections, such as your broadband or AWS Direct Connect, are like the highways and main roads. Junctions on these are handled by smaller routers, such as the one in your home, as they carry less traffic. These routers connect the local connection to the final network, such as your home Wi-Fi.

Within the local network, connections such as small subnets/VLANs are like the town/estate roads. Junctions between each of these are managed by switches and hubs.

With this in mind we can think of NACLs as gates or barriers on these junctions that determine if you are allowed through based on the combination of the source (IP and Port), destination (IP and Port), and protocol (TCP/UDP/ICMP etc.).
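To make the "gate" idea concrete, here is a minimal Python sketch of the fields a gate gets to look at. The `Packet` class and the `gate_allows` function are names I have made up for illustration, and the addresses are documentation and private ranges, not anything from a real VPC.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Packet:
    """One packet, described by the fields a gate can inspect."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str  # "tcp", "udp", "icmp", ...


def gate_allows(packet: Packet, allowed_cidr: str, allowed_port: int, protocol: str) -> bool:
    """Return True if the packet may pass this hypothetical gate."""
    return (
        ip_address(packet.src_ip) in ip_network(allowed_cidr)
        and packet.dst_port == allowed_port
        and packet.protocol == protocol
    )


# A visitor heading for HTTPS on a host inside the network.
visitor = Packet("203.0.113.10", 51000, "10.0.1.20", 443, "tcp")
print(gate_allows(visitor, "0.0.0.0/0", 443, "tcp"))  # True
print(gate_allows(visitor, "0.0.0.0/0", 22, "tcp"))   # False
```

The gate never asks who you are or why you are travelling. It only compares addresses, ports and protocol against what it has been told to let through.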

Security Groups, on the other hand, are like the locks on your doors. A NACL might allow traffic onto the network, but the resource doesn't have to accept it. Just as a community or office building might have a gate with a guard, you don't have to accept someone into your house or individual office. And vice versa: just because your neighborhood allows people to drive on the roads, you can still lock the garage to prevent your car being driven out.


Stateful vs Stateless

So we can see a difference in where NACLs and Security Groups are applied, network vs resource level, but there is also another major difference.

NACLs are stateless, whereas Security Groups are stateful. These terms are applied to other firewall functions too, and you will see them in the documentation for AWS Network Firewall and other firewall providers. So what do the two terms mean?

Stateful

This means that devices examine, and track, the packets that flow through them. The device records the state of each flow, so the examination is said to be stateful: the processing is aware of that state. As a result, outbound (reply) traffic is inherently allowed once the inbound part of the flow has been allowed. So, if we take our house as an example, if I am allowed into your house I am also allowed to leave.
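As a rough illustration of what "tracking" means, here is a toy connection tracker in Python. It is only a sketch: the `StatefulFilter` class is my own invention, it has no outbound rules of its own, and a real Security Group tracks far more than three fields per flow.

```python
class StatefulFilter:
    """Toy connection tracker: allows replies to flows it has already admitted."""

    def __init__(self, allowed_inbound_ports):
        self.allowed_inbound_ports = set(allowed_inbound_ports)
        self.tracked_flows = set()  # (client_ip, client_port, server_port)

    def inbound(self, client_ip, client_port, server_port):
        """A packet arriving at the resource from a client."""
        if server_port in self.allowed_inbound_ports:
            # Remember the flow so the reply can be recognised later.
            self.tracked_flows.add((client_ip, client_port, server_port))
            return True
        return False

    def outbound_reply(self, client_ip, client_port, server_port):
        """A reply leaving the resource; allowed only if the flow is tracked."""
        return (client_ip, client_port, server_port) in self.tracked_flows


house = StatefulFilter(allowed_inbound_ports=[443])
print(house.inbound("203.0.113.10", 51000, 443))         # True: let in on 443
print(house.outbound_reply("203.0.113.10", 51000, 443))  # True: allowed to leave
print(house.outbound_reply("203.0.113.10", 51000, 22))   # False: no such flow was admitted
```

Notice that nobody wrote a rule for the reply. It is allowed purely because the filter remembers letting the request in.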

Stateless

This means that traffic flows are not examined and no state is maintained. As a result, each packet is judged on its own merits against a set of rules. Let's think about our network as a car park. At the entrance there are barriers to restrict width and height, and there might be a barrier to control entry. When you enter, you take a ticket and the barrier rises. At the exit, you have to provide a paid ticket for the barrier to rise. The barrier doesn't know anything about you and applies the same exit criteria to all cars.
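The car park can be written down as two small functions that share nothing. This is just a sketch of the analogy; the function names and the 2.0 m height limit are made up.

```python
def entry_barrier(vehicle_height_m: float, max_height_m: float = 2.0) -> bool:
    """Entry rule: only the height of this vehicle matters."""
    return vehicle_height_m <= max_height_m


def exit_barrier(ticket_paid: bool) -> bool:
    """Exit rule: only the ticket presented right now matters."""
    return ticket_paid


# The exit barrier keeps no record of who came in, so a car that was admitted
# a minute ago is still refused if its ticket has not been paid.
print(entry_barrier(1.6))   # True
print(exit_barrier(False))  # False
print(exit_barrier(True))   # True
```

Each barrier makes its decision from what is in front of it and nothing else, which is exactly how a stateless filter treats each packet.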


Inbound vs Outbound

I think this is where things get confusing, especially as both NACLs and Security Groups on AWS label everything as Inbound and Outbound Rules.

Let's take a typical web flow, HTTPS on port 443, to demonstrate the components as they apply to NACLs. We have a single NACL covering the subnets in the VPC and a Security Group on the EC2 instance running a web server. We've put the instance behind a Network Load Balancer to restrict access and allow scaling.

NACLs

In terms of the connection, this is an inbound flow from an external user to the VPC. However, NACLs, which we now know are stateless, need a rule for every packet that enters or leaves the subnet they are associated with. As such, the direction of the connection doesn't matter, just the direction of the packet.

So in this example we need an inbound rule in the NACL that allows anyone in, as long as they are going to something running on port 443.

We also then need an outbound rule to allow the return traffic. In this instance we have to set the destination port range, which could be any of the high/ephemeral ports on the user's device.

What you will see is that in a NACL rule we can only specify four things: source or destination IP, protocol (TCP/UDP/ICMP etc.), port/type, and action (Allow or Deny). This is where it starts to get complicated and we start to see gaps in the security.

We are saying that anyone (0.0.0.0/0) can enter the VPC as long as the destination is accessed on port 443, and we also have to remember to allow the return traffic, which is easily forgotten. We are also saying that you can reach not just the web server but any other server that might be listening on the same port.
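Here is how that pair of rules might look if we model it in plain Python. Treat it as a sketch under assumptions: the dictionary layout and the `nacl_allows` helper are my own, and I have assumed 1024-65535 for the ephemeral range so that it covers most client operating systems.

```python
from ipaddress import ip_address, ip_network

# Hypothetical rule sets for the HTTPS example. Each rule only knows a CIDR,
# a protocol and a port range; it has no idea which server we had in mind.
INBOUND_RULES = [
    {"cidr": "0.0.0.0/0", "protocol": "tcp", "ports": (443, 443), "action": "allow"},
]
OUTBOUND_RULES = [
    {"cidr": "0.0.0.0/0", "protocol": "tcp", "ports": (1024, 65535), "action": "allow"},
]


def nacl_allows(rules, remote_ip, protocol, port):
    """Check one packet against one direction's rules; anything unmatched is denied."""
    for rule in rules:
        low, high = rule["ports"]
        if (
            ip_address(remote_ip) in ip_network(rule["cidr"])
            and protocol == rule["protocol"]
            and low <= port <= high
        ):
            return rule["action"] == "allow"
    return False


user_ip, user_port = "203.0.113.10", 51000

# Request: user -> anything in the subnet on 443 (inbound rule, destination port 443).
print(nacl_allows(INBOUND_RULES, user_ip, "tcp", 443))         # True
# Reply: server -> user's ephemeral port (outbound rule, destination port 51000).
print(nacl_allows(OUTBOUND_RULES, user_ip, "tcp", user_port))  # True
# Forget the outbound rule and the reply never leaves.
print(nacl_allows([], user_ip, "tcp", user_port))              # False
```

The inbound check never looks at which host inside the subnet the packet is going to. That is the gap: the rule admits traffic to every listener on 443, not just the web server.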

Security Groups

In terms of the connection, as with the NACL, this is an inbound flow from an external user to the EC2 instance.

So in this example we only need an inbound rule in the Security Group to allow anyone in when accessing port 443. Because Security Groups are stateful, the return traffic is allowed automatically and needs no outbound rule.


Allow vs Deny

Another big difference between NACLs and Security Groups is the fact that NACLs can have deny rules. You will see there is a default deny at the bottom of the NACL that says if you are not allowed further up then you are denied. As well as this, you can create other deny rules.

Security Groups, on the other hand, only have allow rules. If traffic is not allowed, it is blocked.

The reason for this is where they are applied. As Security Groups sit at the resource level, there is nowhere further for the packet to go. If it is not allowed, the resource just drops it.

On a NACL, as it is applied at the subnet level, we want to explicitly tell the network device (router/switch) to drop the packet. This means that no further processing will happen.

This has the benefit of reducing traffic on resources further in the network as traffic will never reach them.


Rule Order

One of the other differences between NACLs and Security Groups is how they apply the rules.

NACLs

NACLs apply their rules in strict order based on the rule number, lowest first, and the first rule that matches decides the outcome. This means that if something is allowed higher up the rule list it cannot be denied later, and vice versa: if something is denied it cannot then be allowed. Again, this can cause confusion, but it has benefits in managing networks.

A useful example of this is denying all traffic from specific IP ranges. By having these denies as the first rules, there is no further processing of those packets. This reduces the load on the network devices and allows other packets to be processed faster.

Another example would be known good IPs. For example, I have a rule at the top of some of my NACLs that allows my home IP (static from my provider) to access any port. This means that no matter what I build in the VPC I can always access it, subject to Security Groups, without modifying NACLs.
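The ordering behaviour is easy to see in a small simulation. This is a sketch rather than how AWS implements it: the tuple layout and the `evaluate_nacl` function are hypothetical, and 203.0.113.10 is a documentation address standing in for a trusted static IP.

```python
from ipaddress import ip_address, ip_network

# Hypothetical NACL: (rule number, source CIDR, port range, action).
# Deliberately listed out of order to show that only the rule number matters.
NACL_RULES = [
    (200, "0.0.0.0/0", (443, 443), "allow"),
    (100, "198.51.100.0/24", (0, 65535), "deny"),   # a range we never want to hear from
    (110, "203.0.113.10/32", (0, 65535), "allow"),  # stand-in for a trusted static IP
]


def evaluate_nacl(rules, source_ip, port):
    """Walk the rules in ascending rule-number order; the first match decides."""
    for number, cidr, (low, high), action in sorted(rules):
        if ip_address(source_ip) in ip_network(cidr) and low <= port <= high:
            return number, action
    return "*", "deny"  # the default deny at the bottom of the list


print(evaluate_nacl(NACL_RULES, "198.51.100.7", 443))  # (100, 'deny')
print(evaluate_nacl(NACL_RULES, "203.0.113.10", 22))   # (110, 'allow')
print(evaluate_nacl(NACL_RULES, "192.0.2.50", 443))    # (200, 'allow')
print(evaluate_nacl(NACL_RULES, "192.0.2.50", 22))     # ('*', 'deny')
```

The first call shows why the order matters: the blocked range is asking for port 443, which rule 200 would allow, but rule 100 matches first and the packet never gets that far.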

Security Groups

As Security Groups only have allow rules, and are applied on resources, all rules are evaluated together. This means that if any one of the rules grants me access, I am permitted access. One disadvantage of this is that all the rules have to be applied to every packet, and large rule sets slow down the traffic. It is also why there is a limit on the number of rules that can be configured on a resource (rules per Security Group times Security Groups per network interface cannot exceed 1,000). It is also why some instances have problems when there are not enough resources to track the state of packets against large rule sets.
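By contrast, a Security Group check can be sketched as a simple "does anything match?" test. Again this is only an illustration with made-up names (`SG_INBOUND`, `security_group_allows`), and it models inbound rules only.

```python
from ipaddress import ip_address, ip_network

# Hypothetical inbound rules for one Security Group: (source CIDR, port).
# There is no rule number and no deny action to model.
SG_INBOUND = [
    ("0.0.0.0/0", 443),
    ("203.0.113.10/32", 22),
]


def security_group_allows(rules, source_ip, port):
    """Allowed if any rule matches; order is irrelevant."""
    return any(
        ip_address(source_ip) in ip_network(cidr) and port == rule_port
        for cidr, rule_port in rules
    )


print(security_group_allows(SG_INBOUND, "192.0.2.50", 443))         # True
print(security_group_allows(SG_INBOUND, "192.0.2.50", 22))          # False
print(security_group_allows(SG_INBOUND[::-1], "203.0.113.10", 22))  # True, same answer in any order
```

With no deny action there is nothing for one rule to override in another, so reversing the list cannot change the result.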


I hope you've found this useful and have a better understanding of the two concepts.

In my next post I'll look at how to use them to your advantage to ensure the highest security in your AWS network.