This post is meant to be the sequel to the one I wrote a month ago about CentOS as a router, transparent proxy, and much more.
A big chunk of the previous article covers how to configure squid and squidGuard to act as a transparent proxy with URL filtering capabilities.
But there’s a problem with that: nowadays many sites (f4c3b00k.c0m just to name the most annoying one) are served over HTTPS.
With HTTP one can really easily intercept a packet and read the payload (which contains the URL), but with HTTPS this is not possible anymore since the payload is encrypted.
The only way to read the payload of an HTTPS packet is a man-in-the-middle attack with a fake certificate, but that’s not advisable and you really don’t wanna do it.
If, like in my case, we are not interested in what the users are doing and just want to keep them from reaching some sites/services/whatever, ipset combined with iptables is the right tool for the job.
iptables is a pretty powerful tool; the only real issue is that it doesn’t scale well when the number of rules is very large, since every packet is checked against the rules one after another, and this is not a good thing since we probably want to blacklist thousands of IPs.
And here comes ipset: it stores the addresses in a hash set that a single iptables rule can match against, so it’s possible to manage huge blacklists without iptables slowing down.

The system is the usual CentOS 6.4 x86_64 plus the EPEL repos; to install ipset just type:

[root@CentOS ~]# yum install ipset ipset-service && reboot

First of all, create an ipset blacklist:

[root@CentOS ~]# ipset create blacklist_name hash:ip

Once created, we have to populate it with some IPs.
This can be done in two ways:
1. providing ipset a list of IPs;
2. providing ipset a list of hostnames, which it’ll resolve to IP addresses and then add to the previously created blacklist (this one may take a while to complete; note that when a hostname resolves to several addresses, ipset only uses the first one).
In both cases the command to add an IP/hostname to the blacklist is the following:

[root@CentOS ~]# ipset add blacklist_name "IP or hostname"

If, like in my case, we are interested in blocking a large number of IPs/hostnames, we can automate the procedure with a small shell script like this one:

#!/bin/sh
# Add every non-comment, non-empty line of the blacklist file to the set
grep -h -v -E '^#|^$' /path/to/blacklist_file | while read -r ipaddress; do
   ipset add blacklist_name "$ipaddress"
done
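
Calling ipset add once per entry starts a new process for every address, which can get slow with very long lists. As a possible alternative, the sketch below turns the same blacklist file into the line format that ipset restore reads, so the whole list can be loaded in one call. The helper name gen_restore is my own placeholder and not part of the original setup; the set name and the file path are the same placeholders used above.

```shell
#!/bin/sh
# Sketch: convert a blacklist file into input for `ipset restore`.
# gen_restore is a hypothetical helper name, not an ipset command.
gen_restore() {
    setname=$1
    listfile=$2
    # first line creates the set, same type as in the manual example
    echo "create $setname hash:ip"
    # skip comments and empty lines, emit one "add" line per entry
    grep -v -E '^#|^$' "$listfile" | while read -r entry; do
        echo "add $setname $entry"
    done
}

# Usage (as root); -exist tells ipset to ignore sets/entries that already exist:
#   gen_restore blacklist_name /path/to/blacklist_file | ipset restore -exist
```

The function only prints text, so its output can be inspected or saved to a file before anything is handed to ipset.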

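To dump the blacklist so that it can be made permanent, type in the following command (it prints the set definitions to standard output, in the format that ipset restore reads back):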

[root@CentOS ~]# ipset save

The last thing to do is tell iptables to drop every packet coming from one of the blacklisted IPs; this takes just one simple iptables rule:

[root@CentOS ~]# iptables -I INPUT -m set --match-set blacklist_name src -j DROP

[root@CentOS ~]# service iptables save

[root@CentOS ~]# service iptables restart

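Now every packet arriving from any IP in the blacklist should be dropped, which makes the remote host unreachable.
Keep in mind that this rule sits in the INPUT chain and matches the source address, so it only covers packets addressed to the box itself; traffic routed through the box for other hosts goes through the FORWARD chain and needs its own rule.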
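
Since the rule shown earlier lives in the INPUT chain and matches src, covering routed or locally generated traffic takes additional rules. As a rough sketch of what those could look like, the helper below prints (without running them) one candidate rule per chain and direction, so they can be reviewed first. The name print_block_rules is hypothetical, and which of these rules are actually needed depends on how the box is used.

```shell
#!/bin/sh
# Sketch: print (not run) one candidate DROP rule per chain and direction.
# print_block_rules is a hypothetical helper; review its output before use.
print_block_rules() {
    setname=$1
    # packets addressed to this box that come from a blacklisted IP
    echo "iptables -I INPUT -m set --match-set $setname src -j DROP"
    # packets generated on this box (e.g. by squid) towards a blacklisted IP
    echo "iptables -I OUTPUT -m set --match-set $setname dst -j DROP"
    # packets routed through this box from or to a blacklisted IP
    echo "iptables -I FORWARD -m set --match-set $setname src -j DROP"
    echo "iptables -I FORWARD -m set --match-set $setname dst -j DROP"
}

# Usage: print_block_rules blacklist_name
```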

.:. Building an IP blacklist

Instead of delegating hostname resolution to ipset, it can be done manually with a very small shell script:

[root@CentOS scripts]# cat url2ip.sh 
#!/bin/sh

counter=0

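# resolve each hostname and append every IPv4 address found to ipsblacklist
for hostname in $(grep -h -v -E "^#|^$" ./input_hostnames); do
   host "$hostname" | grep -oE '((1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9][0-9]?|2[0-4][0-9]|25[0-5])' >> ipsblacklist
   # print a running counter as a progress indicator
   echo $counter
   counter=$((counter + 1))
done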

Given a file input_hostnames, the script will extract the corresponding IP addresses and write them to the output file ipsblacklist, which can later be loaded using the short script posted above.
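
A hostname can resolve to several addresses, and different hostnames often share one, so the output file tends to collect duplicates. Below is a minimal sketch of a cleanup step, assuming the usual "has address" lines printed by host; extract_ipv4 is a hypothetical helper name of mine, not part of the original script.

```shell
#!/bin/sh
# Sketch: read `host` output on stdin, print each IPv4 address once.
# extract_ipv4 is a hypothetical helper name.
extract_ipv4() {
    # keep only lines such as "example.com has address 192.0.2.10";
    # IPv6, alias and mail lines do not contain " has address "
    awk '/ has address / { print $NF }' | sort -u
}

# Usage:
#   host example.com | extract_ipv4 >> ipsblacklist
```

Matching on the "has address" lines rather than on anything that looks like an IP keeps IPv6 and mail records out of the list, at the cost of depending on the wording of the host output.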

.:. Enable ipset at boot

Edit the ipset configuration file: