How to fix “403 Forbidden” errors when calling APIs using Python requests?

Bypassing 403 Forbidden Errors in Web Scraping: A Step-by-Step Guide Without Selenium

OuneLog
3 min read · Jan 26, 2025

Introduction

Web scraping can feel like navigating a minefield when servers block your requests with 403 Forbidden errors. These errors often occur because websites detect non-browser traffic (like scripts) through mechanisms like TLS fingerprinting, header validation, or IP blocking. While tools like Selenium mimic browsers, they’re resource-heavy. In this guide, I’ll share multiple proven techniques to bypass 403 errors using Python, including a hidden gem: curl_cffi.

The Problem: 403 Forbidden Hell

While trying to scrape some data from a website, my Python script using the popular requests library kept hitting a brick wall:

import requests
response = requests.get(url, headers=perfect_headers) # Always returns 403!

Despite:

  • Perfectly replicated headers (via MITMproxy)
  • Matching cookies
  • Correct user-agent
  • Proper TLS configuration

The server kept rejecting my requests with 403 Forbidden errors. Why?

Why 403 Errors Happen

  1. Missing/Invalid Headers: Servers check for browser-like headers (e.g., sec-ch-ua, user-agent).
  2. TLS/JA3 Fingerprinting: Servers detect non-browser TLS handshakes.
  3. IP Rate Limiting: Too many requests from the same IP.
  4. Path/Protocol Validation: URLs or HTTP versions may trigger suspicion.

In my case, the culprit was #2: TLS fingerprinting.

Modern websites don’t just check headers — they analyze your TLS handshake fingerprint (JA3). Libraries like requests and urllib have distinct fingerprints that scream "BOT!" to servers.
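
To see what servers see, you can query a public TLS fingerprint echo service. The sketch below uses one such endpoint as an example; the exact response fields vary by provider:

import requests

# Ask a TLS fingerprint echo service what this client looks like.
# A plain requests call reports a JA3 fingerprint that no real browser produces.
response = requests.get("https://tls.browserleaks.com/json", timeout=10)
print(response.json())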

The Solution

Use curl_cffi to Impersonate Browser TLS Fingerprints

The curl_cffi library combines cURL’s power with browser-like TLS fingerprints, letting your requests pass JA3-based detection.

1. Installation

pip install curl_cffi

2. The Magic Code

from curl_cffi import requests

headers = {
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...",
    "accept": "*/*",
    "referer": "https://example.com"
}

response = requests.get(
    "https://example.com/",
    headers=headers,
    impersonate="chrome110"  # Mimic Chrome 110's TLS fingerprint
)

Impersonation Targets:

# A few available options (the full list depends on your curl_cffi version)
impersonate="chrome110"
impersonate="chrome120"
impersonate="safari15_5"

Key Differentiators

  • The impersonate parameter selects which browser profile to mimic (Chrome 110 above)
  • No need to disable SSL verification
  • Automatic handling of HTTP/2 and Brotli encoding

Why This Works

  • Spoofs Chrome’s TLS fingerprint, making the request appear browser-like.
  • Avoids the need for Selenium or headless browsers.

Tips (combined in the sketch below):

  • Add random delays between requests
  • Rotate user-agent strings
  • Use proxy rotation
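
Here is a rough sketch combining the first two tips with curl_cffi (proxy rotation is shown in section 5 below); the user-agent strings, URLs, and delay range are placeholders:

import random
import time
from curl_cffi import requests

# Placeholder pool of user-agent strings to rotate through
user_agents = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36...",
]

session = requests.Session()  # reuse cookies and connections across requests

for url in ["https://example.com/page1", "https://example.com/page2"]:
    headers = {"user-agent": random.choice(user_agents)}  # rotate user-agents
    response = session.get(url, headers=headers, impersonate="chrome110")
    print(url, response.status_code)
    time.sleep(random.uniform(2, 5))  # random delay between requests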

Other Solutions to Try

1. Refine Headers to Match Browser Requests

Capture headers from a real browser (using Chrome DevTools or mitmproxy) and include all critical headers like:

  • sec-ch-ua, sec-fetch-*, referer, origin

Example:

headers = {
    "sec-ch-ua": '"Google Chrome";v="131", "Chromium";v="131", "Not-A Brand";v="24"',
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": '"Windows"',
    "sec-fetch-site": "same-origin",
    "sec-fetch-mode": "cors",
    "referer": "https://example.com/",
    "priority": "u=1, i"
}

Tip: Simplify headers if they conflict (e.g., use accept: */* instead of complex values).

2. Use Sessions and Rotate User-Agents

Persist cookies and rotate headers with requests.Session:

# Install: pip install fake-useragent
import requests
from fake_useragent import UserAgent

session = requests.Session()
ua = UserAgent()
headers = {
    "user-agent": ua.chrome,  # a random Chrome user-agent string
    "accept-language": "en-US,en;q=0.9"
}
session.headers.update(headers)
response = session.get("https://example.com/")

3. Use HTTP/2 with httpx

Some sites expect browser-like HTTP/2 traffic, and requests only speaks HTTP/1.1. Use httpx, which can negotiate HTTP/2:

# Install: pip install "httpx[http2]"
import httpx

with httpx.Client(http2=True, headers=headers) as client:
    response = client.get("https://example.com/")
    print(response.http_version)  # "HTTP/2" if the server negotiated it

4. Bypass Path Validation

Modify the URL to trick path-based filters:

url = "https://example.com//"  # Add trailing slashes
# OR
url = "https://example.com/?cache=1" # Add dummy params

5. Route Through Proxies

Rotate IPs to avoid blocks:

proxies = {
    "http": "http://user:pass@proxy_ip:port",
    "https": "http://user:pass@proxy_ip:port"
}
response = requests.get(url, headers=headers, proxies=proxies)

Free Proxies: Use services like FreeProxyList, but expect instability.
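
A minimal rotation sketch, assuming you already have a pool of working proxies (the addresses below are placeholders):

import random
import requests

# Placeholder proxy addresses - replace with proxies you control or rent
proxy_pool = [
    "http://user:pass@proxy1_ip:port",
    "http://user:pass@proxy2_ip:port",
    "http://user:pass@proxy3_ip:port",
]

proxy = random.choice(proxy_pool)  # pick a different proxy for each request
proxies = {"http": proxy, "https": proxy}
response = requests.get("https://example.com/", proxies=proxies, timeout=15)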

6. Disable SSL Verification (Last Resort)

If the site blocks non-browser SSL handshakes:

response = requests.get(url, headers=headers, verify=False)  # Use with caution!
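
If you do disable verification, requests emits an InsecureRequestWarning on every call; you can silence it explicitly, but remember that you lose protection against man-in-the-middle interception:

import urllib3

# Suppress the InsecureRequestWarning triggered by verify=False
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)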

Conclusion

Bypassing 403 errors requires mimicking browsers at multiple levels: headers, TLS fingerprints, and request patterns. While curl_cffi is a game-changer, combining it with header refinement, HTTP/2, and proxies ensures robust scraping. Always respect robots.txt and avoid overloading servers.

Got your own 403 horror story? Share your experiences in the comments!

⚠️ Disclaimer: This article is for educational purposes only. Always obtain proper authorization before scraping any website.
