Getting Cloudflare IPs with Python – Quick recipe

This is a very quick article showing how to fetch Cloudflare IPs over HTTPS with Python, for use in your Python build scripts. This is handy if, for instance, you are protecting a website with Cloudflare and want the webserver to accept only Cloudflare IPs.

It is not as easy as you might think.

import urllib.request

def whitelistCloudflareIps():
    # Fetch the IPv4 ranges (one CIDR per line)
    req = urllib.request.Request('https://www.cloudflare.com/ips-v4')
    with urllib.request.urlopen(req) as response:
        ips = response.read().decode('utf-8')
        # split() copes with a missing or extra trailing newline
        ips_v4 = ",".join(ips.split())

    # Fetch the IPv6 ranges the same way
    req = urllib.request.Request('https://www.cloudflare.com/ips-v6')
    with urllib.request.urlopen(req) as response:
        ips = response.read().decode('utf-8')
        ips_v6 = ",".join(ips.split())

    return ips_v4 + "," + ips_v6

You might think that the code above will work. However, it doesn’t. For some reason Cloudflare blocks Python libraries from scraping its website, even pages that you would expect a script to call. It is probably some aggressive DDoS protection.

So if you attempt to run the code above, you will get HTTP 403 Forbidden, even though there is nothing wrong with the request logic as far as I can tell. I also tried additional libraries like pycurl and pywget.
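To see the status code for yourself rather than a traceback, a minimal sketch like the one below may help. The function name fetch_status and its opener parameter are my own illustration (the parameter only exists so the function can be exercised without a network call); the urllib calls themselves are standard library.

```python
import urllib.error
import urllib.request

def fetch_status(url, opener=urllib.request.urlopen):
    """Return the HTTP status code for url, including error codes such as 403."""
    req = urllib.request.Request(url)
    try:
        with opener(req) as response:
            return response.status
    except urllib.error.HTTPError as e:
        # HTTPError carries the status code the server sent back
        return e.code
```

Calling fetch_status('https://www.cloudflare.com/ips-v4') from a build script shows whether the request is being rejected before you start debugging the parsing.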

Because our build scripts are all written in Python, I needed a snippet of code that I could call from Python. Nearly every environment has either curl or wget available, and calling wget or curl from a script does work. So below is the snippet of code I include in my builds:

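import os

def whitelistCloudflareIps():
    tmp_file_name = "/tmp/ips_cf.txt"
    # -s silences curl's progress meter; the echo guarantees a line break
    # between the two lists even if the first one lacks a trailing newline
    os.system(f"curl -s https://www.cloudflare.com/ips-v4 > {tmp_file_name}")
    os.system(f"echo >> {tmp_file_name}")
    os.system(f"curl -s https://www.cloudflare.com/ips-v6 >> {tmp_file_name}")
    with open(tmp_file_name, "r") as f:
        ips = f.read()

    # Collapse the whitespace-separated ranges into one comma-separated string
    ips = ",".join(ips.split())

    return ips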
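If you would rather not go through a temporary file, the same idea can be sketched with subprocess, capturing curl's output directly. This is only a sketch under assumptions: the helper names run_and_capture, to_comma_list and whitelist_cloudflare_ips_curl are hypothetical, and it assumes curl is on the PATH.

```python
import subprocess

def run_and_capture(command):
    """Run a command given as a list of arguments and return its stdout as text."""
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout

def to_comma_list(text):
    """Turn newline-separated entries into one comma-separated string."""
    return ",".join(line.strip() for line in text.splitlines() if line.strip())

def whitelist_cloudflare_ips_curl():
    # Assumes curl is installed; check=True raises if curl exits with an error
    v4 = run_and_capture(["curl", "-s", "https://www.cloudflare.com/ips-v4"])
    v6 = run_and_capture(["curl", "-s", "https://www.cloudflare.com/ips-v6"])
    # The explicit newline keeps the two lists apart even if the first
    # response does not end with one
    return to_comma_list(v4 + "\n" + v6)
```

Passing the command as a list avoids invoking a shell, and check=True turns a failed curl call into an exception instead of a silently empty result.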

But it doesn’t end here. After I created the above solution, I reached out on Reddit to find out whether I was missing something. It turns out that I was: Cloudflare doesn’t like requests without a custom User-Agent set (by default urllib identifies itself as Python-urllib). So if I want the more Pythonic code to work, I can use the following solution:

import urllib.error
import urllib.request

def whitelistCloudflareIps():

    headers = {
        'User-Agent': 'My User Agent 1.0',
        'From': '[email protected]'  # This is another valid field
    }

    ips_v4 = []
    ips_v6 = []

    try:
        req = urllib.request.Request('https://www.cloudflare.com/ips-v4', headers=headers)
        with urllib.request.urlopen(req) as f:
            # One CIDR range per line; split() ignores blank lines
            ips_v4 = f.read().decode('utf-8').split()
    except urllib.error.URLError as e:
        print(e.reason)

    try:
        req = urllib.request.Request('https://www.cloudflare.com/ips-v6', headers=headers)
        with urllib.request.urlopen(req) as f:
            ips_v6 = f.read().decode('utf-8').split()
    except urllib.error.URLError as e:
        print(e.reason)

    # Join both lists into one comma-separated string
    return ",".join(ips_v4 + ips_v6)

This simply sets a User-Agent header on the request, and it works quite well.
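Since the result ends up in a webserver configuration, it may be worth checking that every entry really is a network range before using it. Here is a minimal sketch using the standard ipaddress module; the function name parse_networks is my own, and the sample ranges in the usage note are reserved documentation ranges, not Cloudflare addresses.

```python
import ipaddress

def parse_networks(comma_separated):
    """Parse a comma-separated string of CIDR ranges into network objects.

    Raises ValueError if any entry is not a valid IPv4 or IPv6 network.
    """
    networks = []
    for entry in comma_separated.split(","):
        entry = entry.strip()
        if not entry:
            # Tolerate a stray leading or trailing comma
            continue
        networks.append(ipaddress.ip_network(entry))
    return networks
```

For example, parse_networks("192.0.2.0/24,2001:db8::/32") returns two network objects, while a string containing an HTML error page would raise ValueError instead of silently ending up in the build.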

Now there is another way. The www.cloudflare.com domain is not the best one to make this call to. If I use the subdomain api.cloudflare.com, then I don’t even have to specify a User-Agent. So the following code will also work:

import json
import urllib.error
import urllib.request

def whitelistCloudflareIps():

    ips = []

    try:
        # Note: "v4" in this path is the API version, not IPv4. This one
        # endpoint returns both address families as JSON, so no separate
        # IPv6 request is needed.
        req = urllib.request.Request('https://api.cloudflare.com/client/v4/ips')
        with urllib.request.urlopen(req) as f:
            result = json.loads(f.read().decode('utf-8'))['result']
        ips = result['ipv4_cidrs'] + result['ipv6_cidrs']
    except urllib.error.URLError as e:
        print(e.reason)

    # Same comma-separated format as the earlier versions
    return ",".join(ips)
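To close the loop on the original goal (a webserver that only accepts Cloudflare IPs), here is a rough sketch of turning the comma-separated string into access rules. It assumes nginx, which this article does not otherwise cover, and its allow and deny directives; the function name render_nginx_allowlist is hypothetical, so adapt the output format to whatever server you actually run.

```python
def render_nginx_allowlist(comma_separated):
    """Render one nginx 'allow' line per CIDR range, followed by 'deny all;'."""
    lines = [
        f"allow {cidr.strip()};"
        for cidr in comma_separated.split(",")
        if cidr.strip()
    ]
    # Everything not explicitly allowed above is rejected
    lines.append("deny all;")
    return "\n".join(lines)
```

A build script could write render_nginx_allowlist(whitelistCloudflareIps()) to a file that the server configuration includes.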