How to bypass Cloudflare bot protection

kentonv · on April 4, 2021

Hi, I'm the tech lead of Cloudflare Workers.

This article contains several misunderstandings.

If a Cloudflare customer has configured their origin server to respond only to Cloudflare IPs, then they MUST also verify that the "Host" header on any request actually matches their domain name. If they do not verify the Host header, then anyone can sign up for Cloudflare and simply configure their DNS to point to the victim's origin IP address, and requests will be routed there -- but will have the attacker's domain in the Host header. Workers is not needed for such an attack. This attack has always been possible and is common to basically all CDNs. Fundamentally, the CDN has no way of knowing if the origin server that a user has configured really belongs to them -- the CDN can only tell the origin (via the Host header) what customer it thinks it is serving, and expect the origin not to accept requests that were on behalf of a different customer.
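To make the IP + Host check concrete, here is a minimal sketch of what an origin could do before serving a request. This is not Cloudflare code. The two networks are only an illustrative subset of Cloudflare's published ranges, and `ALLOWED_HOSTS`, `normalize_host`, and `should_serve` are hypothetical names chosen for the example.

```python
import ipaddress

# Illustrative subset of Cloudflare's published IPv4 ranges. Assumption: a real
# deployment would load the current, complete list published by Cloudflare.
CLOUDFLARE_NETWORKS = [
    ipaddress.ip_network("173.245.48.0/20"),
    ipaddress.ip_network("103.21.244.0/22"),
]

# Hypothetical set of hostnames this origin is supposed to serve.
ALLOWED_HOSTS = {"example.com", "www.example.com"}


def is_from_cloudflare(peer_ip: str) -> bool:
    """True if the TCP peer address falls inside one of the listed networks."""
    addr = ipaddress.ip_address(peer_ip)
    return any(addr in net for net in CLOUDFLARE_NETWORKS)


def normalize_host(host_header: str) -> str:
    """Lowercase the Host header and drop any port suffix."""
    host = host_header.strip().lower()
    if host.startswith("["):  # bracketed IPv6 literal, e.g. "[::1]:8080"
        return host.split("]")[0] + "]"
    return host.split(":")[0]


def should_serve(peer_ip: str, host_header: str) -> bool:
    """Serve only when BOTH checks pass: Cloudflare peer and our own hostname."""
    return is_from_cloudflare(peer_ip) and normalize_host(host_header) in ALLOWED_HOSTS
```

The point of the second check is that a request relayed by Cloudflare on behalf of some other customer's domain still arrives from a Cloudflare IP, so the IP filter alone cannot tell the two apart.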

Instead of IP-based authentication, we strongly recommend using mTLS-based authenticated origin pulls (with a zone-specific key pair) or Argo Tunnel, as these methods are much more secure.

As long as the origin server is verifying via one of the above means (IP+host header, AOP, or Argo Tunnel) that the request was processed by Cloudflare on behalf of the customer's zone, then the attack described in the article doesn't accomplish anything. If a Worker makes a request to a hostname that is on Cloudflare, then all of the target host's Cloudflare security settings will apply to that request the same as if the request came from an external client.

Additionally, as mentioned in the article, any requests coming from a Worker will have the CF-Worker header identifying the zone which sent the request. If a customer suspects abusive requests coming from Workers, they should report it to us.

~~The article claims that the X-Forwarded-For header can be forged, but this is not true if the target host is itself a Cloudflare customer.~~

EDIT: What I said about X-Forwarded-For seems to be incorrect. CF-Connecting-IP has always been the header we use to identify the client IP, and it cannot be forged. X-Forwarded-For seems to have more complicated and subtle behavior that I'm not familiar with. I'm investigating.

kentonv · on April 4, 2021

A couple other notes:

* The IP address 2a06:98c0:3600::103, mentioned in the article as appearing in the CF-Connecting-IP header, is a special IP address that is used for all cross-zone requests that come from Workers. This is not the actual address of any Cloudflare machine. Workers fundamentally don't have distinct IP addresses. Instead, the CF-Worker header needs to be used to identify which worker sent the request. We intentionally do not identify the original client's IP here because the Worker itself could be working against the client, and could have maliciously modified their request to be something entirely different. Hence, the request can't be considered to have come from the original client.

* Any cloud service that lets you make HTTP requests can, obviously, be used to set up a proxy that hides your IP address. Workers is actually less useful for this compared to many other cloud hosts since we tag all outgoing HTTP requests from a Worker with the sender's domain name (in the CF-Worker header), which makes it relatively easier to track, report, and block abuse.
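As a rough illustration of using the CF-Worker header for tracking and blocking, an origin could classify requests along these lines. This is only a sketch: the blocklist contents and the function names are hypothetical, and it assumes the origin has already verified that the request really arrived through Cloudflare (otherwise the header could be set by anyone).

```python
from typing import Mapping, Optional

# Hypothetical blocklist of zones whose Workers have sent abusive traffic.
BLOCKED_WORKER_ZONES = {"abusive.example"}


def worker_zone(headers: Mapping[str, str]) -> Optional[str]:
    """Return the zone named in the CF-Worker header, or None if it is absent."""
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("cf-worker")
    return value.strip().lower() if value else None


def classify(headers: Mapping[str, str]) -> str:
    """Label a request as coming from a client, a Worker, or a blocked Worker."""
    zone = worker_zone(headers)
    if zone is None:
        return "client"
    if zone in BLOCKED_WORKER_ZONES:
        return "blocked-worker"
    return "worker"
```

Keying on the zone in CF-Worker rather than on an IP address matches the comment above: Workers don't have distinct IP addresses, so the header is the identifier to act on.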

orf · on April 4, 2021

> If a Cloudflare customer has configured their origin server to respond only to Cloudflare IPs, then they MUST also verify that the "Host" header on any request actually matches their domain name.

Out of interest do you provide any automatic checks for this? It seems like it would be trivial to add some kind of verification step that simulates a request with a bogus host header and checks the status code of the response.

kentonv · on April 4, 2021

I don't think we do, but that's a good idea! I think it could be automatic. We'd have to first detect that the origin server appears to be blocking non-Cloudflare IPs, by seeing if it responds to requests from some random non-CF IP address. Then we'd verify that it refuses to respond to a request for some dummy domain. If it serves the site's normal content when the host name is from the wrong domain, then we could alert the customer that they seem to have an insecure configuration. I will suggest this internally...

orf · on April 4, 2021

The “normal content” bit is possibly quite hard to detect. You could do some kind of distance calculation between “safe” responses and “unsafe responses”, but it seems like there would be a bunch of edge cases given the variety of sites you host.

Anyway, thanks for hearing my suggestion!

kentonv · on April 4, 2021

Yeah, I think I'd just look for HTTP status code.

EB66 · on April 4, 2021

What would CloudFlare automatically check the Host header against? The Host header would be set to the attacker's domain (which is in CloudFlare) and, per OP, CloudFlare does not verify ownership of the origin server IP.

It would be a nice feature if CloudFlare customers could opt-in for origin server IP verification. Once the origin IP is verified, it would prevent other CloudFlare users from using the IP as their own origin. However, I understand why CloudFlare doesn't verify origin IPs -- it would require a verification process that doesn't rely solely on uploading a verification file to a web accessible directory. It'd be a bit difficult to keep the verification process simple for end-users if they aren't running a web server. Or maybe I'm wrong and non-HTTP use cases don't need to be considered and a verification file uploaded to a web accessible folder would work fine.

orf · on April 4, 2021

Somewhere in the interface you’d have a button that when pressed would send a request to the specified origin server with a Host header set to “foobar.com”. If the status code is 200 you’d display a warning saying “perhaps you should re-read our documentation about verifying host headers”.
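Something like the following is what I have in mind, as a rough sketch using only the Python standard library. The function names and the warning text are made up for illustration, and plain HTTP on port 80 is an assumption; the `fetch` parameter exists only so the logic can be exercised without a live origin.

```python
import http.client
from typing import Callable, Optional


def fetch_status(origin_ip: str, host_header: str, port: int = 80,
                 timeout: float = 5.0) -> int:
    """Send GET / to the origin with an explicit Host header; return the status."""
    conn = http.client.HTTPConnection(origin_ip, port, timeout=timeout)
    try:
        # Supplying our own Host header stops http.client from adding one.
        conn.request("GET", "/", headers={"Host": host_header})
        return conn.getresponse().status
    finally:
        conn.close()


def host_check_warning(origin_ip: str, bogus_host: str = "foobar.com",
                       fetch: Callable[[str, str], int] = fetch_status) -> Optional[str]:
    """Return a warning string if the origin answers 200 for a bogus Host."""
    status = fetch(origin_ip, bogus_host)
    if status == 200:
        return ("Origin returned 200 for Host '%s'; perhaps re-read the "
                "documentation about verifying Host headers." % bogus_host)
    return None
```

Looking only at the status code sidesteps the "normal content" comparison, at the cost of the corner cases mentioned, such as origins that return 200 with an error page.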

I can think of a whole bunch of corner cases to this, but that’s the general gist.

EB66 · on April 4, 2021

But in order to safely enable that check for all requests that go from CloudFlare to the origin IP, you would first need to verify ownership of the origin IP.

You wouldn't want users who don't own the origin IP to be able to turn on that sort of check and potentially cause a service disruption. It could be the case that the legitimate owner already has a Host header mismatch in their HTTP requests/responses.

floatingatoll · on April 4, 2021

Presenting a warning about possible misconfiguration to the customer need not create a service disruption.

EB66 · on April 4, 2021

We're talking about a situation where an attacker would create the service disruption by enabling Host header checks on an origin server that the attacker doesn't own. So a warning wouldn't help because that warning would be displayed to the attacker.

The logic is tricky, but in the end origin server ownership verification is what's required to safely support Host header checks.

floatingatoll · on April 4, 2021

If the attacker is logged into the Cloudflare control panel, where presumably the warning would be displayed — where else could it be displayed? certainly not on the content served to end users — then, yes, the attacker could clear the warning. They could also change the origin servers, or do countless other things to disrupt service, that do not require modifying an origin server. I consider that an acceptable failure case for the warning.

I'm not arguing for or against ownership verification, but there is opportunity to improve here that does not depend on the question of ownership verification.

EE84M3i · on April 4, 2021

I could understand verifying domain names, but verifying IP addresses seems like it would be an extremely complex feature for very little benefit.

You'd have to implement it in-line with the DNS resolution in the server before talking to the origin, right? There's no other time you could reasonably do it (DNS changes), and it would have to be checked on all customers' requests, not just the ones that opted into the feature (because those other customers could be the "attacker"!). Of course, you would cache it and use an efficient lookup scheme, but in general cloud services really don't like creating features that have non-localized effects like this.

Not to mention this sounds like a support nightmare foot gun for enterprise customers.

r1ch · on April 4, 2021

Assuming you're using HTTPS (and if not, why?!), this is checked automatically as part of the TLS handshake. The only downside is Cloudflare makes it easy to accidentally disable this check as "Full SSL" bypasses certificate verification.

meowface · on April 4, 2021

Yeah, I do feel like there should be more visible encouragement to use "Full (Strict)" when someone adds a new domain for the first time. Especially now that it's so easy to provision free SSL certificates via Let's Encrypt etc.

maxgashkov · on April 5, 2021

It seems to me that the author of the article implies that Cloudflare treats Workers' IPs as having a much higher reputation (threat score in CF's terms) than any other IP on the Internet by default. This allows a Worker to bypass bot protection, since the protection either does not trigger at all or triggers only when the zone is configured to challenge all requests (Under Attack mode).

Could you comment on that?

kentonv · on April 5, 2021

To be honest, I don't know anything about Cloudflare's bot management product. That isn't my department.

I would speculate that proxying through any cloud service (Workers, AWS, Digital Ocean, whatever) can indeed be a tool to hide bot activity, and that any bots product needs to think about how to combat that. I would also speculate that originating IP address is not the only signal the product uses to determine threat. Moreover, I would speculate that this is all an arms race, there will be a constant stream of new techniques developed by bot authors and new techniques deployed to thwart them.

You may be right that the blog author intended to report a problem specifically in bot management, and I have no idea whether that specific problem is real. However, the blog post is not very clear, and many readers have misinterpreted it as reporting a way to totally bypass all Cloudflare protections, such as WAF, geo firewall, rate limiting, etc. That's the misunderstanding I meant to address. Sorry, I could probably have been clearer about that.

lilyball · on April 5, 2021

It looks to me like it’s just about how you can use Cloudflare as a reverse proxy, to hide the fact that your origin is Tor. And the bit about being able to change X-FORWARDED-FOR seems to just be about defeating IP-based rate limiting on sites that trust Cloudflare sufficiently to respect the X-FORWARDED-FOR header.

meowface · on April 4, 2021

>If a Cloudflare customer has configured their origin server to respond only to Cloudflare IPs, then they MUST also verify that the "Host" header on any request actually matches their domain name. If they do not verify the Host header, then anyone can sign up for Cloudflare and simply configure their DNS to point to the victim's origin IP address, and requests will be routed there -- but will have the attacker's domain in the Host header.

What exactly could such a misconfiguration enable? Just that the attacker could configure lower levels of Cloudflare security settings for their domain, to bypass the origin's intended Cloudflare security settings? Would this also bypass I'm Under Attack Mode?

abiro · on April 4, 2021

> If a Cloudflare customer has configured their origin server to respond only to Cloudflare IPs, then they MUST also verify that the "Host" header on any request actually matches their domain name.

Where is this documented?

toredash · on April 4, 2021

I'm 100% sure it isn't. Cloudflare docs are not great, and as an Enterprise customer I spend more time with support than needed as a result.

aeyes · on April 4, 2021

> If a Cloudflare customer has configured their origin server to respond only to Cloudflare IPs, then they MUST also verify that the "Host" header on any request actually matches their domain name.

The worker code sends the correct host name.

https://github.com/jychp/cloudflare-bypass/blob/dd7c1cfc2e6b...

kentonv · on April 4, 2021

Right. If the hostname requested by the worker is a host that is itself protected by Cloudflare, then all of that host's usual security settings apply. Nothing is bypassed.

The author of the blog post seems to have been confused by the fact that different behavior happens if the target host is on Cloudflare vs. not on Cloudflare.

btown · on April 5, 2021

> any requests coming from a Worker will have the CF-Worker header identifying the zone which sent the request. If a customer suspects abusive requests coming from Workers, they should report it to us.

What happens if you receive a report of abusive behavior from a Worker? Would you rate limit the specific Worker, or would there be any impact on the larger Cloudflare account of the Worker's owner?

For instance, if we have a Worker that connects to an arbitrary endpoint based on user-generated parameters, or some vulnerability is found (in the JS code for the Worker) that allows this to happen, and it ends up being used for abusive behavior, would we have recourse to fix the vulnerability, or would action be taken immediately? Would our larger account, or our unrelated Workers, be at risk of being taken offline?

kentonv · on April 5, 2021

These issues are handled on a case-by-case basis. I can't make any hard promises here (it's not my department), but generally, if your worker is not obviously malicious, we will talk to you before taking any drastic action.

r1ch · on April 4, 2021

> If a Cloudflare customer has configured their origin server to respond only to Cloudflare IPs, then they MUST also verify that the "Host" header on any request actually matches their domain name.

This could be improved considerably if Cloudflare removed support for insecure TLS options. "Full" SSL bypasses certificate hostname verification on the origin server which could prevent such an attack at the TLS level. Every time I've added a new zone to Cloudflare, Full or Flexible (ie, no SSL) has been the default. The only safe option is Full (Strict) but I'd bet good money that the majority of Cloudflare customers aren't using that.

thesis · on April 4, 2021

This seems like the same kind of reason we're not using Cloudflare Access to protect some internal resources.

We'd love to use Argo Tunnel but unfortunately it breaks with an OPTIONS request for us. We've tried opening tickets with Cloudflare.

sillywabbit · on April 4, 2021

Once upon a time: [link redacted]

kentonv · on April 4, 2021

Yep, I worked on that one. Thank you for reporting it.

Ironically, the purported problem in this case is the opposite: Because we don't attribute the request as coming from the original client (in order to avoid the security problem you reported), a client that has bad reputation associated with their IP can proxy through a worker to hide that. The answer is to assign reputation to the Worker itself.

sillywabbit · on April 4, 2021

Yeah, I don't envy you. Finding loopholes is more fun than trying to account for everything upfront.

ec109685 · on April 5, 2021

You see the request on “both sides” of the worker, so shouldn’t you be able to evaluate the “connecting client to the worker’s reputation” when allowing or denying the upstream request?

kentonv · on April 5, 2021

You mean by somehow evaluating whether the incoming request and the outgoing one are sufficiently "the same", in which case the outgoing request can be attributed to the original client?

This seems hard. Let's say the request is the very simplest "GET /" request. The worker does nothing except rewrite the hostname, in order to proxy it to a different server. Is it "the same" request?

Well, imagine the new destination host is a company-internal dashboard that authenticates users based on IP address. Now this "GET /" request which was proxied will receive a response that contains secrets. The worker that did the proxying can intercept the response and learn the secrets.

So clearly if the request's host is rewritten, it cannot be "the same" request and it cannot be considered to have come from the original client.

But the only case where we care about any of this is when a worker makes a request to a different domain. If a request is to the same domain, we trust it, because in that case the attacker can only attack themselves anyway. So the only case that even matters here is when the host has been rewritten, and in that case, we've established that the new request definitely cannot be attributed to the original client.

So it seems to me there's nothing interesting we can do here. A request coming out of a worker (going to a different domain) must always be attributed to the worker itself, not to the client that triggered the worker.
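Put as a toy rule, the reasoning above comes out roughly like this. This is purely an illustration of the argument, not how Cloudflare implements anything; the function names and the suffix-based "same zone" test are simplifying assumptions.

```python
from typing import Tuple


def same_zone(zone: str, host: str) -> bool:
    """Simplified check: the host is the zone itself or a subdomain of it."""
    zone = zone.lower().rstrip(".")
    host = host.lower().rstrip(".")
    return host == zone or host.endswith("." + zone)


def attributed_to(worker_zone: str, target_host: str, client_ip: str) -> Tuple[str, str]:
    """Decide who an outgoing Worker subrequest should be attributed to."""
    if same_zone(worker_zone, target_host):
        # Same domain: the Worker's owner can only attack themselves.
        return ("client", client_ip)
    # Different domain: the host was rewritten, so the request is the Worker's.
    return ("worker", worker_zone)
```

There is no third branch that compares the incoming and outgoing requests, which is the point: once the host differs, similarity of the rest of the request doesn't matter.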

MuffinFlavored · on April 4, 2021

From the article:

> X-FORWARDED-FOR: 1.2.3.4 (yes, we can override this header with whatever we want !)

Why aren't you guys disallowing that field from having any external value?

kentonv · on April 4, 2021

So we're supposed to be appending the client's IP to whatever the client sent for X-Forwarded-For, documented here:

https://support.cloudflare.com/hc/en-us/articles/200170986-H...

For better or worse, this is how X-Forwarded-For works, by convention: each forwarder is supposed to append a value. This means that, even in a correct implementation, an attacker can specify an arbitrary prefix of values. The consumer must be very careful about which part of the chain they trust.
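For consumers who do need to read X-Forwarded-For, the usual approach is to walk the chain from the right and stop at the first hop that is not one of your own trusted proxies. A minimal sketch, assuming you know your trusted proxy networks (the function name and parameters are hypothetical):

```python
import ipaddress
from typing import Iterable


def client_ip_from_xff(xff: str, peer_ip: str, trusted: Iterable[str]) -> str:
    """Return the right-most hop in the chain that is not a trusted proxy."""
    trusted_nets = [ipaddress.ip_network(net) for net in trusted]

    def is_trusted(hop: str) -> bool:
        try:
            addr = ipaddress.ip_address(hop)
        except ValueError:
            return False  # unparseable entries are never trusted
        return any(addr in net for net in trusted_nets)

    # Full chain, oldest hop first: header entries, then the direct TCP peer.
    chain = [part.strip() for part in xff.split(",") if part.strip()] + [peer_ip]
    for hop in reversed(chain):
        if not is_trusted(hop):
            return hop
    # Every hop is one of our own proxies; fall back to the oldest entry.
    return chain[0]
```

With `xff="1.2.3.4, 198.51.100.7"`, a peer of `10.0.0.5`, and `10.0.0.0/8` trusted, the attacker-supplied `1.2.3.4` prefix is ignored because the walk stops at `198.51.100.7`, the value appended by the first proxy you trust.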

But the report here seems to be suggesting that we're mishandling this appending in some cases. We're investigating and if true will fix ASAP.

Because X-Forwarded-For is surprisingly complicated, we recommend using CF-Connecting-IP instead. As the docs say: "To restore original visitor IP addresses at your origin web server, Cloudflare recommends your logs or applications look at CF-Connecting-IP or True-Client-IP instead of X-Forwarded-For since CF-Connecting-IP and True-Client-IP have a consistent format containing only one IP."

modoc · on April 4, 2021

Because it's legitimate to have values before the request gets to CloudFlare, for instance if it was routed out through a corporate proxy, etc.

ec109685 · on April 5, 2021

Only possible if non-TLS or the company is man in the middling TLS (and certificate pinning isn’t enabled).

allending · on April 5, 2021

So... completely possible then.

MuffinFlavored · on April 4, 2021

Ouch. I wonder if it's time to cryptographically sign those values or add a checksum or something.

acdha · on April 4, 2021

Basically this is why you want to go full zero trust: if you use mutual authentication over TLS, this becomes much less of a concern since you’re not trying to craft policies based on IP addresses with multiple intermediaries.

zamalek · on April 4, 2021

> This attack has always been possible and is common to basically all CDNs.

I strongly recommend you play around with Azure CDN, because this is not entirely true.

jeroenhd · on April 4, 2021

That's a nice trick, with some automation you should be able to register new domain names and Cloudflare accounts to get past the 100k limit as well.

Cloudflare's anti bot tooling is terribly annoying while using Tor. I understand why they do it, though. Using Tor is a good way to show how terrifyingly much control Cloudflare has over the internet.

jchw · on April 4, 2021

Not to detract from the point but I find Cloudflare to be on the more reasonable side when it comes to blocking Tor. A lot of sites end up effectively banning all Tor IPs, Cloudflare merely requires CAPTCHA. And they do offer Privacy Pass as a sort of approach to make it a little less annoying.

capableweb · on April 4, 2021

> Cloudflare merely requires CAPTCHA

I do agree that Privacy Pass is in theory a good idea, although to say that CloudFlare "merely" requires captchas to be filled out is a bit disingenuous. You're often required to complete 5-10 captchas, and sometimes even after that, you get denied. Had that happen to me multiple times.

jchw · on April 4, 2021

I don’t use Tor for all of my traffic, but sometimes I do use it just because I can. And my experience lately is that I usually get hCaptcha and I usually pass it once and get in. It’s Recaptcha that is a serious problem.

akalsz · on April 4, 2021

Yeah after years of suffering from reCAPTCHA I'm somewhat thankful for hCaptcha. It is still annoying (especially Cloudflare's integration which seems barely compatible with the Tor Browser's cookie and circuit management), but at least I don't have to switch exit nodes twenty times just to have a chance of my solution being accepted (like with reCAPTCHA).

croutonwagon · on April 4, 2021

Google owned domains, namely Youtube, are even worse. They often redirect to different domains for blocks so just changing the circuit does nothing.

mikequinlan · on April 4, 2021

If you can't pass a CAPTCHA, you need to stop and ask yourself -- are you really a human being, or have you just been programmed to believe that you are?

iaml · on April 4, 2021

Not a valid question when google literally has a patent on unsolvable captcha. https://patents.google.com/patent/US9407661

kazinator · on April 4, 2021

If you have been trying to solve the same captcha for several hours, you need to stop and ask yourself -- are you really a human being, or have you just been programmed to believe that you are?

arbol · on April 4, 2021

This patent has expired due to non payment of fees

Google234 · on April 5, 2021

That’s also not google lol

iaml · on April 14, 2021

My bad, I was referring to this comment [0] and didn't notice it was not actually owned by google.

[0] https://news.ycombinator.com/item?id=16157480

capableweb · on April 4, 2021

Actually a good point as I never even considered that. Maybe I'm the one who's in the wrong here?

Seemingly I am passing the captchas, they are not giving me any errors. Simply seems to accept my input, reload the page and ask me to do more.

colinmhayes · on April 4, 2021

There are some really crazy captchas out there. I've had some where I'd need a microscope and enhance button to get.

0df8dkdf · on April 4, 2021

> You're often required to complete 5-10 captchas, and sometimes even after that, you get denied. Had that happen to me multiple times.

Glad you mentioned it. This has also happened to me many times. It is unfortunate that botnets and spam have given IP addresses bad names.

dotancohen · on April 4, 2021

  > And they do offer Privacy Pass as a sort of
  > approach to make it a little less annoying.

The Privacy Pass extension requires, as it mentions when installing it, access to all the data on all websites that you visit. So in order to stop getting so many Captchas, one has to give this company unfettered access to all sites that one visits? Why even bother with an HTTPS connection when the browser is potentially leaking the information away with explicit user consent?

And I don't even use Tor. But every few weeks I'll have a two or three day spate where every webpage that I visit requires at least two captchas to be solved. Maybe another user from my ISP is scraping, but I'm a good citizen on the 'net.

t0astbread · on April 4, 2021

As far as I know, Cloudflare transparently redirects some Tor users to their own hidden services and then uses the circuit ID for rate limiting: https://blog.cloudflare.com/cloudflare-onion-service/

"Some" because they seem to have internal heuristics for detecting Tor Browser users before they enable that feature for a connection, so it doesn't apply to all connections made through Tor. YMMV I guess but I haven't seen a Cloudflare CAPTCHA over Tor in a long time.

akalsz · on April 4, 2021

I still wonder how that detection thing works. My custom Firefox setup with requests proxied through Tor passes as the TBB, but copying the same request as a curl command somehow doesn't.

t0astbread · on April 5, 2021

Yeah, it's pretty weird and apparently it's being updated as well. Like, a year ago or so I wrote a browser extension to trick that detection mechanism when I'm on stock Firefox. Just when writing that comment I discovered something has changed and that's not needed anymore.

deanc · on April 4, 2021

Try signing up to GitLab and then signing in. It’s completely broken with Tor because of Cloudflare’s crappy redirect and nobody has done anything about it. The captcha also breaks after you spend about thirty minutes trying to find a relay which doesn’t trigger the redirect page protection loop.

grishka · on April 4, 2021

Discriminating IP addresses is a terrible idea to begin with. You're breaking the internet by doing it. Please treat all IPs equally.

catblast01 · on April 4, 2021

It’s just silly to suggest this when quite simply not all IPs behave equally and the differences are not subtle. It’s annoying when “bad” IPs get reassigned and change, but that is a relatively infrequent event - and certainly CloudFlare is at least adaptive. It’s just not a principle aligned with reality.

grishka · on April 4, 2021

The problem that no one seems to acknowledge: one IP doesn't necessarily mean one person or entity behind it. Residential ISPs often use a single public IP as an exit point for many subscribers simultaneously. So if one, just one, of these subscribers does something that our internet overlord Cloudflare considers nefarious, everyone else sharing this IP gets punished for doing nothing wrong.

jbluepolarbear · on April 4, 2021

This specific problem happens all the time. That’s the ISP’s problem, not Cloudflare’s. The ISP has a user negatively impacting other users; the best course of action would be to remove that user and notify Cloudflare afterwards.

grishka · on April 4, 2021

Do ISPs actually do any of this? Does any ISP ever do any of this? I don't think so. Also, no one knows what was the particular thing that pissed off Cloudflare, because it's a black box.

I'll say it again: centralizing the internet around a single company is a terrible idea. Especially so when said company terminates TLS and thus has access to unencrypted traffic.

jbluepolarbear · on April 4, 2021

ISPs don’t because they don’t care. Until there’s a legal or monetary reason to do so, ISPs will do the bare minimum. My spouse worked for a large American ISP and would hear all the time that apartment complexes were not able to access certain services because one user did something that got the IP blocked, and nobody cared because it wasn’t their problem.

grishka · on April 4, 2021

ISPs don't care so they don't do anything. Website operators don't care so they offload the responsibility to Cloudflare. Cloudflare doesn't care so it bans and/or humiliates unrelated people however it sees fit. Nice.

simple_phrases · on April 5, 2021

It's Cloudflare's problem too if my customers can't access my site because I made the poor choice of trusting Cloudflare's antibot protection.

BlueTemplar · on April 4, 2021

This specific issue is going to be solved by IPv6 though.

grishka · on April 4, 2021

In 50 years when half the internet users will finally have access to IPv6. Or, I'm pretty sure some people, or Cloudflare, will start banning entire /32 or even bigger subnets because why not. Oh you're from Russia? Here, have a 403 for no apparent reason.

BlueTemplar · on April 4, 2021

Considering the current adoption rates, I'm eyeballing 5 years :

https://www.google.com/intl/en/ipv6/statistics.html

Bender · on April 5, 2021

It may be important to tease out the mobile operators and VPS providers like AWS/Azure/GCP first. They have skewed the numbers a bit. Almost all VPS providers and cell providers are IPv6 now. Residential is hit-or-miss. Maybe that will change as more people use Starlink and other newer tech.

BlueTemplar · on April 6, 2021

Yeah, in no small part because cellular operators tend to be newer than landline ISPs...

jbluepolarbear · on April 4, 2021

100% of the bad traffic to my servers comes from TOR. Not all IPs are equal, some aren’t playing nice. Cloudflare makes it so I don’t have to worry about it.

zelphirkalt · on April 4, 2021

That's how responsibility is shifted always to the next person or institution.

You make your life simple by saying "Cloudflare makes that decision." Cloudflare says "We are merely offering a service here! You don't have to use it, if you think it's wrong." And ISPs say (or should say): "We are here to sell Internet access, not to watch what everyone does on the Internet." and may by law not be allowed to look into the traffic.

In the end the user suffers and is discriminated against and no one wants to be responsible.

I would take the position, that each person, who uses Cloudflare, has to live with being responsible for whatever Cloudflare imposes upon users / visitors. The mass of people using Cloudflare services makes the difference. Each person using it adds a little bit to it. It is a collective responsibility.

jbluepolarbear · on April 4, 2021

Then what’s the alternative? If I don’t use a service like Cloudflare I have to invest a lot of time and money into making sure my servers accept the traffic I want and reject the traffic I don’t. This isn’t an ethics dilemma. Bad actors wreck a lot of innocent people’s lives all the time; that was never going to stop on the internet.

bombcar · on April 4, 2021

The alternative is simple - ban tor ips entirely as it’s not worth the hassle.

pepemon · on April 4, 2021

No, we won't. Many of those IPs are the origins for spam, botnets, crawlers, DDoS, exploitation engines, etc. The only feasible solution even with all modern heuristics and ML is still good ol' IP scoring and/or banning.

Tepix · on April 4, 2021

You're saying RBLs don't work? In my experience they work rather well.

ffpip · on April 4, 2021

> Cloudflare merely requires CAPTCHA.

Cloudflare doesn't "require" captchas for Tor users. By default, they treat Tor IPs like any other IP. However, since a lot of traffic comes out of each Tor IP, it looks like bot traffic.

Sebb767 · on April 4, 2021

They at least monitor TOR IPs. See this blog post [0] from 2016.

[0] https://blog.cloudflare.com/the-trouble-with-tor/

rattray · on April 4, 2021

I hadn't heard of "Privacy Pass" before.

https://blog.cloudflare.com/cloudflare-supports-privacy-pass...

Sounds like it allows you to browse without tracking in a way that more reliably signals you're a human, which seems very useful.

Why would a human browsing the internet with Tor not want to use Privacy Pass?

m-p-3 · on April 4, 2021

And I found hCaptcha to be way less annoying than ReCaptcha when using Tor.

hedora · on April 4, 2021

> I understand why they do it

Care to elaborate? Why block access to static pages by default? Tor’s aggregate bandwidth is tiny compared to CF’s aggregate bandwidth, so DOS attacks surely aren’t the reason.

jchw · on April 4, 2021

It’s not bandwidth/DoS attacks, but it is malicious traffic. Even for websites that don’t explicitly block Tor exit nodes, they quickly end up banning most Tor exit nodes just by virtue of blocking malicious traffic that crops up.

Malicious traffic takes many forms. As an attacker trying to attack a website, you are probably not going to come in from a residential IP address, unless it’s one you have access to illegally. It is much more likely you will use Tor.

And because of the nature of Tor, you literally, full stop, cannot ban individual Tor users to any meaningful degree. So if you offer services not requiring sign up, or services where the sign up is unobtrusive, users abusing the service via Tor are difficult to stop.

Of course there are counter points... such as [1]. But let’s be honest, from the perspective of a tired sysadmin trying to stop ongoing vandalism, your options are limited, especially if you don’t want an intrusive sign up option.

As a counterpoint to the counterpoint, if Tor users were treated like any other user, malicious parties might use the Tor network even more often than they already do.

Edit: also, though it is not necessarily malicious per-se, there are some types of traffic like scraping that may be undesirable from the PoV of the website. I am not going to make a value judgment regarding the merit of blocking scrapers, but it is nonetheless a goal where you can see the reasoning behind blanket banning Tor.

[1]: https://blog.torproject.org/the-value-of-anonymous-contribut...

BlueTemplar · on April 4, 2021

Search engines use scraping. I would guess that most websites want to be featured on the search engines?

jchw · on April 4, 2021

Not all websites want to be scraped even by search engines, and some only want to be scraped by search engines. Unwanted scrapers may refuse to follow robots.txt or do things with the scraped data that is either undesired to their business model or possibly even illegal, though that is complex. The point is, for better or worse, as far as I know, websites have the right to block scrapers while allowing search engines, and as long as it remains viable to prevent scraping, it remains likely that sites intending to do this will end up blocking or obstructing Tor traffic. Is it a good thing? I would argue it’s too abstract to judge in such a generalized fashion. Whether or not it is a good thing or a bad thing, though, is not important. It is the world we live in right now.

colinmhayes · on April 4, 2021

I wrote a search engine in school. Since I was naive/lazy/wanted a large corpus it didn't follow robots.txt. It ended up DOSing multiple sites, including Duke's law school during class sign up. I don't think those sites wanted to be featured on my search engine.

flas9sd · on April 4, 2021

Not only Tor, but also other network resources that are legitimate yet open to abuse because they cost nothing, like Hurricane Electric's IPv6-in-IPv4 tunnel broker. Not sure how I feel about Cloudflare's role as arbiter. I think that instead of relying on a service, a good nginx module with easily surfaced rate-limiting abilities and some grasp of ASN origin could satisfy the needs of most sites out there.
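To make the rate-limiting-by-ASN idea concrete, here is a minimal sketch in JavaScript rather than an actual nginx module. It assumes a fixed-window counter is good enough, and `createAsnRateLimiter` and the `lookupAsn` callback are hypothetical names (the lookup would have to be backed by some IP-to-ASN database, which is not shown).

```javascript
// Hypothetical fixed-window rate limiter keyed by origin ASN.
// `lookupAsn` is an assumed callback supplied by the caller; it is not
// part of any real library.
function createAsnRateLimiter({ lookupAsn, limit, windowMs }) {
  const windows = new Map(); // asn -> { start, count }

  return function allow(ip, now = Date.now()) {
    // Addresses with no known ASN share one bucket.
    const asn = lookupAsn(ip) ?? 'unknown';
    const current = windows.get(asn);

    // Start a fresh window if none exists or the old one has expired.
    if (!current || now - current.start >= windowMs) {
      windows.set(asn, { start: now, count: 1 });
      return true;
    }

    current.count += 1;
    return current.count <= limit;
  };
}
```

Because the counter is shared across the whole ASN, one abusive host uses up the budget for its neighbours too, which is the same trade-off discussed above for Tor exit nodes.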

xwdv · on April 4, 2021

Also worth noting it’s a good time to buy Cloudflare stock, it’s down from recent highs and may be on its way back up in the next month. This is a good long term hold.

oefrha · on April 4, 2021

> As you can see, X-FORWARDED-FOR can be set to an arbitrary value, so you can bypass IP limitation requests during scrapping or more dangerous IP verification during a login procedure … The origin IP is not forwarded to the website, so the only way to block this kind of request on your server is to filter on CF-WORKER header.

X-Forwarded-For should always be treated as forged at least for security purposes (except the hops set by reverse proxies under your control). If you do IP whitelisting based on untrusted X-Forwarded-For you’re doing it terribly wrong.
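A rough sketch of what "trust only the hops set by your own reverse proxies" can look like. The function name `clientIpFromXff` and the exact-match list of trusted proxy addresses are assumptions for illustration; a real setup would more likely match on CIDR ranges.

```javascript
// Hypothetical helper: pick the client IP from X-Forwarded-For by walking
// right to left and skipping hops added by proxies we operate ourselves.
function clientIpFromXff(xffHeader, remoteAddr, trustedProxies) {
  const trusted = new Set(trustedProxies);

  // The directly connected peer is the only address we observe first-hand.
  // If it is not one of our proxies, the header is entirely untrusted.
  if (!trusted.has(remoteAddr)) return remoteAddr;

  const hops = (xffHeader || '')
    .split(',')
    .map((hop) => hop.trim())
    .filter((hop) => hop.length > 0);

  // Entries on the right were appended by our own proxies. The first
  // untrusted entry from the right is the best available client address;
  // anything to its left is client-supplied and may be forged.
  for (let i = hops.length - 1; i >= 0; i--) {
    if (!trusted.has(hops[i])) return hops[i];
  }

  // Every hop was one of our proxies, or the header was empty.
  return remoteAddr;
}
```

With a header of `1.2.3.4, 198.51.100.7, 10.0.0.2` and trusted proxies `10.0.0.1` and `10.0.0.2`, this returns `198.51.100.7` and ignores the leftmost `1.2.3.4`, which the client could have typed in itself.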

daurnimator · on April 4, 2021

A lot of people just verify that X-Forwarded-For matches a Cloudflare-owned IP.

As this post demonstrates, you need to verify against the cloudflare origin pull CA (see https://support.cloudflare.com/hc/en-us/articles/204899617-A... )

tgsovlerkhgsel · on April 4, 2021

I suspect that the idea here is that XFF is supposed to be trusted, because it is supposed to come from Cloudflare.

userbinator · on April 4, 2021

For a very long time, XFF-spoofing was a trick to get past journal subscriptions and other paywalls. I don't feel so guilty about disclosing this here now, as a lot of them have (unfortunately) blocked that route. No doubt a few popular bookz/journalz sites owe at least some of their content to this technique. But it's something to try and might still work on a few of them...

RiceAndWaters · on April 7, 2021

Hey, I saw one of your past comments where you mentioned being (briefly) active in the bookz scene in the past.

I really want to know more about this scene. If it's not too much work for you, please contact me at Bibliohoarding@tutanota.com

Thank you!

minitech · on April 4, 2021

This just sounds like a way to make a proxy with Cloudflare workers, which aren’t considered by Cloudflare’s filters to be as suspicious as Tor yet. You could do the same with any new VPS/cloud function provider. Am I missing some actual bypass?

If the X-Forwarded-For value from the request is forwarded by Cloudflare’s proxies unaltered, giving special privileges to Cloudflare workers, that’s a problem, but it’s not clear whether “a site who display your headers” is protected by Cloudflare and displaying an unaltered X-Forwarded-For. I don’t expect Cloudflare to have made this oversight, and the e-mail response from them matches: “All workers subrequests go through Cloudflare again, and therefore you won’t be able to bypass any restrictions directly.”

bassdropvroom · on April 4, 2021

> You could do the same with any new VPS/cloud function provider.

I'm not so sure this is true. Not long ago Cloudflare's CEO was making fun of people trying (and failing) to bypass Cloudflare's anti-bot measures. This approach gets past them because, as far as Cloudflare is concerned, the requests are coming from a trusted source. Setting up a proxy with a provider that isn't Cloudflare means the requests aren't coming from a trusted source any more.

judge2020 · on April 4, 2021

The thing is that it isn't being seen as a trusted source. The same global rate limits apply, and the `Cf-Connecting-Ip` (which is the one you should use for security) request header still can't be forged and is passed to the end website correctly.

tyingq · on April 4, 2021

I have seen workers code that appears to remove Cf-Connecting-Ip prior to a fetch(), though no idea if that works. Maybe it just gets replaced with the workers ip? Like:

  router.route('/proxy/:url', '*', async (event) => {
    // Attempt to strip the client-IP headers before proxying the request.
    event.request.headers.delete('cf-connecting-ip');
    event.request.headers.delete('x-real-ip');
    return await fetch(event.parameters.url, event.request);
  });

Edit: Tried several variations of this. It doesn't work. The cf-connecting-ip header is always in the fetch request with the right data. It's a bit confusing because some of the Cloudflare tools don't show you all the headers.

jsnell · on April 4, 2021

> Cloudflare's CEO was making fun of people trying (and failing) to bypass Cloudflare's anti-bot measures

Do you happen to have a link?

bassdropvroom · on April 4, 2021

https://mobile.twitter.com/eastdakota/status/127327783910265...

jsnell · on April 9, 2021

Thanks!

spzb · on April 4, 2021

> Am I missing some actual bypass?

No, you're not. Which is why Cloudflare were correct to say this isn't a security issue.

SergeAx · on April 4, 2021

VPS providers are not free, and neither is the traffic a VPS produces. On the other hand, 100k req/day is a generous limit, easily expandable with additional accounts.

minitech · on April 4, 2021

GitHub Actions and other CI, Heroku, AWS, GCP, Azure, Fly.io, etc. have free tiers. This is another one of those that, when misused enough, will develop a bad reputation too. (Except in the case of Cloudflare workers to Cloudflare bot blocking, they’re in potentially an even better position to distinguish legitimate traffic because they have information from both ends.)

freedomben · on April 4, 2021

I'm glad I'm not the only one that can't type Cloudflare reliably. Couldflare is an important part of my architecture. Same with Cloudlfare. No CloudFalre or Cloudfalre yet but there's still time ;-)

luckylion · on April 4, 2021

As mentioned in the article, it's easy to mitigate by checking not only that the request is arriving at your endpoint from an official CF IP, but also that it does not contain a CF-Worker header (or contains one with your domain).
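As a sketch, that policy might look like the following. `shouldAccept`, the `arrivedFromCloudflare` flag, and the plain lowercase-keyed `headers` object are all hypothetical; the IP check itself is assumed to happen elsewhere.

```javascript
// Hypothetical origin-side policy: the request must arrive from Cloudflare,
// and any CF-Worker header must name our own zone.
function shouldAccept(request, ownZone) {
  // `arrivedFromCloudflare` is assumed to be computed elsewhere, e.g. by
  // comparing the socket's peer address with Cloudflare's published ranges.
  if (!request.arrivedFromCloudflare) return false;

  // Header names are case-insensitive; this assumes they were lowercased
  // when the request was parsed.
  const workerZone = request.headers['cf-worker'];

  // No CF-Worker header: the request was not sent by a Worker.
  if (workerZone === undefined) return true;

  // Otherwise only accept Workers that belong to our own zone.
  return workerZone.toLowerCase() === ownZone.toLowerCase();
}
```

Note that this rejects every third-party Worker, including legitimate ones, so it only fits sites that never expect such traffic.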

The limits on workers' sub requests (e.g. requests are done by host name only, can't use IPs with a different host name) make it somewhat cumbersome to set up reverse proxy workers (which was my use case), but the tech itself is just great, works very reliably and the delay, while noticeable, is tiny and never bothered us.

userbinator · on April 4, 2021

More precisely, "bot and minority browser/system" protection... when I come across a site that has it, I will just go somewhere else, or use Google's cache if absolutely necessary.

foolmeonce · on April 4, 2021

I'd wondered about the openness of Workers to connect to unrelated resources myself.

But I agree with CF, they don't inherently have a security problem. They have the normal situation of a proxy again. If a lot of new domains show up to bot proxy then the Bayesian for an unknown worker domain is going to become bot.

winrid · on April 4, 2021

Their bot protection is so annoying for integrations, I may just use this...

casefields · on April 4, 2021

At first I didn't mind, but it seems to be popping up more and more. Like a little vampire sucking up to 5 seconds of life each time, that I'll never get back.

winrid · on April 5, 2021

Even worse: it takes a lot more than 5 mins to get whitelisting etc. set up. The person in the org setting up the integration probably doesn't have Cloudflare account access, so it becomes a source of customers dropping off.

MaheshC · on April 4, 2021

If somebody from Cloudflare is reading this: this is great. Please do not fix it, or if you do, provide a way for legitimate limited bypassing when, for instance, using puppeteer from an AWS instance.

SergeAx · on April 4, 2021

> At this point you can request any site using Cloudflare

I don't see why I can't request any site on the internet whatsoever, having a legit Cloudflare IP address. Is there some limitation of Cloudflare workers?

judge2020 · on April 4, 2021

1. They limit subrequests to 50 per workers request

2. They have abuse alarms for obvious "spray requests by utilizing Workers" behavior (although we don’t know how well these might work)

3. The Cf-Connecting-Ip header is passed with the correct originating request IP, so if a website at least partially trusts CF (without directly using CF's proxy) they can match based on this header if the REMOTE_ADDR is Cloudflare. They could also block requests with the header Cf-Worker[0].

4. You can't forge the HOST header on CF, so if a website does use CF you can't bypass the actual CF proxy firewall by fetching the IP and setting the HOST header yourself (you can do it within your zone though)[1].

0: https://community.cloudflare.com/t/is-the-cf-worker-header-f...

1: https://developers.cloudflare.com/workers/runtime-apis/reque... (resolveOverride)
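The "match on Cf-Connecting-Ip only if REMOTE_ADDR is Cloudflare" rule could be sketched roughly like this. It is IPv4-only, and the ranges below are RFC 5737 documentation placeholders, not Cloudflare's real ranges; `resolveClientIp`, `inCidr` and `ASSUMED_PROXY_RANGES` are hypothetical names.

```javascript
// Placeholder ranges from the RFC 5737 documentation blocks. A real
// deployment would load Cloudflare's published IP list instead.
const ASSUMED_PROXY_RANGES = ['203.0.113.0/24', '198.51.100.0/24'];

// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer,
// or return null if it is not a valid IPv4 address.
function ipv4ToInt(ip) {
  const parts = String(ip).split('.').map(Number);
  const valid = parts.length === 4 &&
    parts.every((p) => Number.isInteger(p) && p >= 0 && p <= 255);
  if (!valid) return null;
  return ((parts[0] << 24) | (parts[1] << 16) | (parts[2] << 8) | parts[3]) >>> 0;
}

// Return true if `ip` falls inside the IPv4 CIDR block `cidr`.
function inCidr(ip, cidr) {
  const [base, bitsText] = cidr.split('/');
  const bits = Number(bitsText);
  const ipInt = ipv4ToInt(ip);
  const baseInt = ipv4ToInt(base);
  if (ipInt === null || baseInt === null) return false;
  // /0 is handled separately because a 32-bit shift is a no-op in JS.
  const mask = bits === 0 ? 0 : (0xffffffff << (32 - bits)) >>> 0;
  return ((ipInt & mask) >>> 0) === ((baseInt & mask) >>> 0);
}

// Believe Cf-Connecting-Ip only when the TCP peer is inside a proxy range;
// otherwise anyone could send the header directly to the origin.
function resolveClientIp(remoteAddr, headers, proxyRanges = ASSUMED_PROXY_RANGES) {
  const viaProxy = proxyRanges.some((cidr) => inCidr(remoteAddr, cidr));
  const claimed = headers['cf-connecting-ip'];
  return viaProxy && claimed ? claimed : remoteAddr;
}
```

As pointed out elsewhere in the thread, an IP check on its own is the weakest option; it says nothing about which Cloudflare customer the request was proxied for.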

toredash · on April 4, 2021

I believe you can with Enterprise plan, override the Host header that is.

CFA178B · on April 4, 2021

Disclaimer: Throwaway account.

There is a way to bypass CF bot protection, which mostly uses basic HTTP features, nothing fancy (and especially not a security issue).

is_true · on April 7, 2021

don't leave us hanging, man

CFA178B · on April 8, 2021

Don't want to, but I am getting a slim benefit from the situation, and I would not like to give that up.

xPaw · on April 4, 2021

So is there any reason not to block any requests on your server if they contain the CF-WORKER header?

minitech · on April 4, 2021

Yep – any legitimate tool that makes requests on behalf of a user could do so using Cloudflare workers. (A webpage translation service, for example.) It’s the same as any other provider. You could block based on the value of the CF-Worker header if you find it’s being abused.

oefrha · on April 4, 2021

I’ve built legit CORS proxies for web-based community tools using CF Workers when the first party doesn’t set CORS headers.

klaudius · on April 4, 2021

They say there are multiple ways to do IP spoofing. Does anyone know some other ways to do it?

Tepix · on April 4, 2021

Find a provider that doesn't prevent you from doing it.

p2detar · on April 4, 2021

Cool. I’d be interested in any hints about bypassing Akamai’s protection bot. They use some annoying js machinery to detect browser sensors and stuff.

Stumbled on that some days ago. Quite annoying and there doesn’t seem to be an easy hack.

dcow · on April 4, 2021

TL;DR: somebody learned how proxies work and is naively mad at Cloudflare because they offer a cloud worker product that lets you proxy requests and their “bug” report is actually a feature.