In my opinion, the pat answers about security are incomplete. I'd like to see a detailed writeup of specifically why a raw UDP API cannot be made as secure as current HTTPS.
My sense is that the game they are playing is blame management and plausible deniability.
Without https, it becomes plausible for banks and other websites where security is paramount to blame the web standards for lacking a way to secure connections when they leak sensitive user data.
So what the committees and browser vendors really want is a way for the browsers to easily know that all connections with this site are "secured". Now, if information leaks, the blame is solely on the site operators.
Currently they can do this if the site uses https.
If you introduce UDP to the mix, and tell them "I will encrypt the packets myself", then the browser has no way to tell whether the connection is secure or not, so they will default to telling the user that this website uses an insecure connection.
This would not be so problematic, except I think they want to eventually deprecate non-secure connections.
Efficiency and simplicity are the last things they care about. They will only care about them when someone demonstrates the existence of a clearly superior web application that cannot be implemented without a certain feature. I think this is why wasm got standardized.
One dimension being glossed over here is that the more "simple" and "efficient" a protocol is, the more it tends to put all its eggs into one fused basket. This has knock-on effects, such as making it harder to deploy, scale and maintain the resulting infrastructure.
Not to mention that some people's threat model includes e.g. state level actors eavesdropping on internal data center links, which means even those should be encrypted in sensitive situations.
Create your own client app. This is very much trying to fit a square peg into a round hole.
If you want to, you can even give your client app an address bar, and let others use your app for their servers. Then you won't even need to touch HTML, CSS, or JavaScript.
>The owners of ddosfuntimes.com can go ahead and set the IP address in their DNS records to point target.ddosfuntimes.com at any server they want, and they will receive all the XHR traffic from every browser that visits the page. And to the best of my knowledge, there isn’t a damn thing the target can do about that.
The attacks people are concerned about are about more than just DDoS. It's e.g. about impersonation and theft-of-service. You shouldn't be able to mimic an existing site and trick people into using it. You also shouldn't be able to use resources from a server you don't own. This is web security 101.
This is not possible in your outlined scenario: you can't make an HTTPS connection to `target.ddosfuntimes.com` because the hostname wouldn't match. So the only thing you'd be able to do is make a lot of failed HTTPS connections, which are aborted early. This would also be preceded by a DNS request for a hostname that can be flagged as suspicious (i.e. potentially blockable at the ISP level). So it is in fact possible to detect and filter out early, somewhat. This is exactly what services like Cloudflare are for.
>Furthermore, since the browser already allows the page to send as much HTTPS data as it wants back to the originating site, one could optionally allow any site to send UDP packets back to its own (exact) originating IP without asking the user.
The HTTP origin policy is actually about more than just IP; it involves the port too. You can call this paranoid, but nevertheless, that's the rule. So you'd have to either require that the TCP and UDP ports match, relax the restriction, or introduce some port negotiation scheme.
>So unless I’m missing something, XHR already allows you to target any website you wish with unwanted traffic from anyone who visits your site. So why the concern about UDP?
Not only are there security concerns beyond DDoS, but there are also practical concerns due to e.g. NAT and crappy airport/hotel wi-fi, which people generally expect to work. There are also privacy concerns for VPN users (and users who live under undemocratic regimes), e.g. the ability to discover somebody's true IP address.
In fact, if you want a UDP-like protocol that satisfies all the above, you pretty much end up with WebRTC, which has semi-viable answers for these issues, runs on DTLS, and has been used to actually run complex video/audio conferencing in the wild.
I should add, you made a few blunders in your questioning on Twitter, and continue to do so here, so you are not doing yourself any favors by adopting a demanding attitude because you don't get the detailed security research you want _right now_. Like it or not, these are the kind of things professionals are generally paid to work on. They take a lot of time, experimentation, verification, and validation. So if you get low-level explanations on Twitter, without the detail you want, that's because you're sounding like somebody who needs to have the basics of the field explained to them.
Btw:
>ridiculous things in it like arbitrary text parsing, which no one in their right mind would ever put into a “secure” protocol
I'm not sure exactly which part you find so ridiculous (feel free to elaborate), but there's plenty of precedent of horrid binary encodings which left destruction in their wake (e.g. ASN.1). They are harder to work with, harder to parse, harder to extend (see: IPv4), and harder for the humans working with them to verify. And let's be fair: the main reason "arbitrary text parsing" is considered a minefield is because people were stupid enough to keep doing it with hand-rolled code calling the C standard library, instead of verifiable tools like grammars and parser generators, because of a misplaced sense of "I can handle these footguns".
The trend you see these days is binary encodings with a canonical 1-to-1 text representation, which seems like the best compromise between accessibility and efficiency.
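To make that concrete, here's a deliberately trivial sketch (not any real wire format; the layout and text rendering are made up for illustration) of a binary record paired with a canonical text form, so each record has exactly one text rendering and vice versa:

```js
// Toy format: u8 version | u16 big-endian length | UTF-8 payload.
function encode(version, text) {
  const payload = new TextEncoder().encode(text);
  const buf = new Uint8Array(3 + payload.length);
  const view = new DataView(buf.buffer);
  view.setUint8(0, version);
  view.setUint16(1, payload.length); // big-endian by default
  buf.set(payload, 3);
  return buf;
}

// Canonical text form: fixed field order, fixed separators, no optional
// whitespace, so re-serializing the text reproduces the exact same bytes.
function toText(buf) {
  const view = new DataView(buf.buffer, buf.byteOffset, buf.byteLength);
  const payload = new TextDecoder().decode(buf.slice(3));
  return `v${view.getUint8(0)}(${JSON.stringify(payload)})`;
}

console.log(toText(encode(1, "hi"))); // v1("hi")
```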
Cloudflare blocking mismatched hostnames helps address the XHR issue, reducing the concern raised in the blog about XHR's potential for DDoS attacks (since that has a solution). Browsers need to be able to verify security easily, which is why HTTPS is required. If raw UDP were allowed, browsers couldn't ensure a connection is secure, so they'd mark it as insecure. This is likely why raw UDP won't be permitted. The web will always lag behind native clients in this respect, but using UDP in custom apps carries the same DDoS and security risks. It's frustrating for technical users; I'd love to see raw UDP allowed too.
Never stop writing. I love to read your stuff!
The last thing I want is for a random web page or iframe/ad in my browser to have the ability to spoof packets coming out of my PC to look like something unexpected. Giving out access to UDP is a footgun, except someone else will be very deliberately aiming the gun at your feet and everyone else's.
Any custom encryption method will be impossible for the browser to verify. You could use some protocol to claim that you've encrypted your UDP traffic, but the browser wouldn't know whether that is true, or whether your implementation is reasonable.
This means no padlock icon on sites using UDP, and probably "WARNING: This connection may be insecure!" too.
Sites care about broadcasting their security to users. Which site would ever choose to use a feature which is going to warn the user that their site may be insecure?
Browsers care about warning their users in situations which are potentially dangerous. Why would they show a reassuring padlock that they can't verify, or implement a feature which gets users into the habit of clicking "continue anyway" on warnings?
It's not a technical hurdle to a secure implementation, but it's a hurdle to a secure implementation with a reasonable user experience. Nudging users towards secure behaviour is just as important as technical security. The existing padlock system isn't perfect, but it's a lot better than nothing.
Was a hypothetical WebUDP API even considered and/or rejected by WHATWG/W3C/etc?
The only *technical* objection to handing out raw UDP to web apps that I can think of is that it partially bypasses same-origin. It's important to note here that both the port and protocol constitute part of the origin; so for example http://example.com:80 and https://example.com:443 would be separate origins that cannot talk to one another. In fact, if a website were to be using custom ports, then http://example.com:8080 and https://example.com:8080 would also be separate origins.
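You can check this directly with the URL API; a quick sketch:

```js
// The browser's URL API exposes the origin tuple (scheme, host, port).
const a = new URL("http://example.com:80/");
const b = new URL("https://example.com:443/");
const c = new URL("http://example.com:8080/");

console.log(a.origin); // "http://example.com"  (default port elided)
console.log(b.origin); // "https://example.com"
console.log(c.origin); // "http://example.com:8080"

// Different scheme or port => different origin: no shared cookies, no requests.
console.log(a.origin === b.origin); // false
console.log(a.origin === c.origin); // false
```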
These setups are all entirely crazy, but they are also historical legacy. Some badly-written webapp *somewhere* relies on the fact that browsers will isolate different servers on the same machine from each other's client traffic.
In fact, digging deeper into the historical legacy of the TCP/UDP spec, ports below 1024 are supposed to be admin-privileged, while anything higher is available for all users. So if same-origin didn't consider ports, then any user on the same machine as the server (a common arrangement in the early days of the web) could set up example.com:1337, and anyone linked there from example.com:80 would be sending all their :80 login cookies to :1337.
*Modern* servers don't look like this, of course. You'll have example.com:443, 0db8.net:443, ietf.org:443, and all sorts of other machines all being hosted out of one set of anycasted Fastflare or Cloudly IPs. This means that we have to tell the machine at the other end what domain we expect it to pretend to be, with a Host: header or TLS SNI, so that requests get routed to the right vhost.
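For illustration, here's roughly what that routing looks like server-side; a minimal sketch using Node's built-in http module (hostnames reused from above; a real deployment would sit behind TLS, where SNI plays the same role):

```js
const http = require("node:http");

http.createServer((req, res) => {
  // Many hostnames resolve to this one IP; the Host header tells us
  // which site the client actually meant.
  switch (req.headers.host) {
    case "example.com":
      res.end("example.com content");
      break;
    case "ietf.org":
      res.end("ietf.org content");
      break;
    default:
      res.writeHead(421); // Misdirected Request: unknown vhost
      res.end();
  }
}).listen(8080);
```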
If we granted raw UDP to webapps, then both the old-style "services and shell accounts" and the modern "vhosts all the way down" approaches to configuring web servers would have new security concerns, because we'd be relaxing same-origin. All the existing Web-specific socket protocols are there to maintain the "protocol, domain, port" origin rules by bootstrapping sockets off of the already well-understood HTTP and HTTPS protocol headers so that they *don't have to change those rules*.
Now, would *relaxing* these rules for raw sockets hurt things? I'm not sure. But the benefits are really vague. Our security model is more rational, yes, but we made it make sense by weakening it. We can support really old legacy protocols that nobody should be using now - e.g. Ruffle could faithfully emulate Flash's RTMP and XML sockets using your raw sockets proposal. And you could theoretically roll your own HTTP stack and crypto for whatever reason.
On the other hand, every time the same-origin policy is relaxed even a little, bad things happen without a *lot* of foresight. To beat up on Flash again, it handled cross-origin requests using a special crossdomain.xml file at the root of a website. Because it was "in-band", so to speak, webapps with upload forms that didn't know about these files could be tricked into hosting them, for starters. Worse, it did not send any identifying header that would tell you if a request was cross-domain, so permissive cross-domain policies would leak sensitive data. Compare it to HTTP CORS: the browser pre-flights every request and the server responds if it's allowed. The actual cross-domain request itself is also properly marked with the origin, so you don't even need to restrict any cross-domain access. You can just allow all requests, check if each request is cross-domain and redact sensitive information if so.
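As a rough sketch of that flow (hypothetical origins, using Node's built-in http module; real servers would use a framework, but the shape is the same):

```js
const http = require("node:http");

const ALLOWED = new Set(["https://app.example.com"]);

http.createServer((req, res) => {
  const origin = req.headers.origin; // set by the browser, not the page

  // Preflight: the browser asks permission before sending the real request.
  if (req.method === "OPTIONS") {
    if (origin && ALLOWED.has(origin)) {
      res.writeHead(204, {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "GET, POST",
      });
    } else {
      res.writeHead(403);
    }
    return res.end();
  }

  // Actual request: cross-origin calls are labeled with Origin, so a server
  // could also allow everything and just redact sensitive fields when the
  // origin is foreign.
  if (origin && ALLOWED.has(origin)) {
    res.setHeader("Access-Control-Allow-Origin", origin);
  }
  res.end(JSON.stringify({ ok: true }));
}).listen(8080);
```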
Raw sockets feel less like HTTP CORS and more like crossdomain.xml. There's no mechanism in TCP or UDP for the browser to say, "hey this request came from a script on example.com:80"; and without that information non-HTTP services will get confused and get hacked.
Example:
Let's say Cloudflare has a secret management service on UDP port 4000 that lets their operations team debug broken production servers. Of course, this is dangerous, so it's all firewalled off to a VPN that operations only connects to when they are debugging machines.
You are a hacker. You set up notevil.net:443, put it behind Flarely, and buy the premium service package with all the extra support stuff in it. Then, after a week or two you ask support to look at your website. The website is "down", and because you're a high-paying customer one of the ops team is dispatched to take a look at the site.
Oops. Raw sockets means that notevil.net:443 can now access that secret management port on :4000 *from within the FastCloud operations VPN*. We've already broken through their corporate firewall with just normal web standards. It does not matter if the hosting provider "shouldn't" have this kind of management setup; it does not matter that our security model is uneven and prohibits this tier of attack but allows another. The same-origin restriction is a strict promise that we wouldn't allow this to happen and existing security systems are built around it.
> But at least it exists, and perhaps if adoption continues to grow, eventually it will be possible to require HTTP/3 without losing a significant number of users. For now, it’s only something you can do on the side - you still have to have a traditional HTTPS fallback.
It's estimated that 75% of browsers (well... I guess user agents) on the net support HTTP/3, according to Wikipedia.
> Which brings us to the third item on the list, and the real sticking point. As far as I’m aware, no current or planned future Web API ever lets you do number three. There are many new web “technologies” swarming around the custom packet idea (WebRTC, WebSockets, WebTransport), but to the best of my knowledge, all of them require an HTTPS connection to be made first, so your “custom packet” servers still need to implement all of HTTPS anyway.
Right, WebTransport over HTTP/3 is an option.
Your theoretical "I am writing a UDP library but want to isolate my HTTPS stuff somewhere else" situation... I mean, beyond the fact that there are many options for stripping away the abstraction, if you're writing a custom UDP protocol but are afraid of HTTPS or HTTP/3, that is a bit odd, to say the least!
Now for technical concerns:
TCP is, of course, TCP: ordered, etc. And so if you're sitting there with HTTP over TCP, well... firewalls etc. are very straightforward. There is a very limited set of things going on, so it is easy to control. UDP is completely the opposite, as you really can only do destination-based filtering.
So you only have source and destination as an option. One option is to say "ok you can't do cross origin UDP". In that case there's the reality that you control the client and the server, so to speak, so... just use WebTransport! You have your option there.
Another scenario: you say "we'll allow all cross-origin UDP; users can accept/deny". Some random site decides they don't like you and adds some JS to DDoS you. They ask users to accept (and users do it, because it's one of those websites). Users end up DDoSing you. The "approve/deny" flow works well when you are the potential victim, but in the DDoS model you are not the one "consenting".
You might say "well, they can do that from their machines by downloading software", and you are right.
How is this different from the image tag? The magic of HTTP makes actually filtering stuff out really straightforward. Handling XHR traffic is different from handling black-box UDP: it's actually easier, and because there is a protocol, it's much easier to cut things off rather than wait around receiving garbage. Middleware can also handle it on their end; enterprises have networks to run. This is how Cloudflare exists and rejects connections so easily. Raw, black-boxed UDP? All of that is off the table.
You could make a protocol off of UDP that does a bunch of stuff to make this easier. But now... what? Are you just making TCP? You're gonna make your own magical protocol to avoid implementing WebTransport?
So in that case... you would probably want to establish some handshake for the connection. You might want some origin-based tokens similar to how people set up CSRF tokens for POST. And there's all these other concerns.
And now you have ended up adding some layer over UDP anyways!
Your claim is that you can't just go raw UDP mode in browsers. This is true. But WebTransport exists, and you can use it to get all the advantages of UDP mode while also taking advantage of the security benefits of the HTTP protocol.
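For instance, the browser-side API already looks like this (a sketch; the endpoint URL is made up, and the server has to speak HTTP/3 with a valid certificate):

```js
// Connect: TLS + HTTP/3 handshake, with the page's origin sent along.
const transport = new WebTransport("https://game.example.com:4433/session");
await transport.ready;

// Unreliable, unordered datagrams -- the UDP-like part.
const writer = transport.datagrams.writable.getWriter();
await writer.write(new TextEncoder().encode("player input"));

// Receive datagrams from the server.
const reader = transport.datagrams.readable.getReader();
const { value } = await reader.read();
console.log(new TextDecoder().decode(value));
```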
So serious people will look at WebTransport, correctly identify that it fits within models they are used to working in, and get all the advantages of UDP speed.
So your ask here isn't "let me have access to UDP", it's "let's add a UDP-based API that will make it easier to distribute DDoS-y software, generally create more unauditable traffic, and make network QoS harder". Remember, you have access to UDP if you deal with WebTransport! And you might think that's not important. Many people disagree.
The technical claim is that a raw UDP connection that could carry any content is harder to automatically filter or otherwise manage, but the wrapping cost of WebTransport over HTTP/3 is small enough that browsers can offer UDP to users, so long as they are using that.
And, of course, you can give yourself raw UDP in JavaScript if you're using something like Node. You're asking for a raw UDP API inside of browsers.
The answer, as always, is "corporate users". Every single big enterprise I've interacted with has some form of HTTPS filtering and/or MITM in place to prevent data leaks (or so they claim). Nobody who does that wants to deal with CaseyCrypt(tm) over raw UDP/TCP.
So if I understand you correctly, the answer might not be a lack of security, but rather that UDP might actually be secure, and they would not want that? If that were the answer, though, it doesn't seem like a reason to keep it out of the spec. It might be a reason why CaseyCo should not choose to use raw UDP if they want enterprise customers, but is it a valid reason to leave it out of the spec entirely?
I'm pretty sure the answers to all of your questions have nothing to do with "technical ability/merits" and are pretty much "political". As well you know, HTTPS is not about "security" in the sense that "nobody can read my traffic", it's about "security" in the sense that "I can be fairly certain the guy on the other end of my connection is who he says he is". You will notice that most "hardening" around HTTPS tends to be about strengthening confidence in the certificate chain as opposed to making MITM impossible. I would argue that at this point it's pretty clear that the fact that 3rd parties can "read" HTTPS is pretty much a "feature" of the protocol (even if it's not in the spec).
Another (somewhat) related point to consider: it's pretty interesting how all the major browser vendors (Mozilla excepted maybe?) also happen to be cloud vendors. I'm pretty sure all the data collection legislation around cloud stuff also plays a part in this.
And that's the sad truth: these aren't technical decisions.
They are bureaucratic decisions made to entrench and buttress the financialization of the web end user (recall only tech and drug dealers call customers "users").
The technical justification for a bureaucratic decision is found post hoc, then paraded around and treated as doctrine.
When an engineer (such as @Casey, @Notorious, maybe myself) wants to implement something logically coherent, with much lower overhead (i.e. inherently faster), we hit the wall of bad technical decisions made by that same bureaucracy.
TL;DR: death by committee, after bureaucrats and corporations own the stage
At least one problem with raw UDP/TCP I see here is that it becomes possible to leak data from private networks/local services (these days, CORS prevents most kinds of client requests unless the server is configured in a specific way that allows them).
I believe I covered that in the article. I am unable to find any plausible reason why you cannot implement either the same, or even a more restrictive, security policy for outbound UDP packets as you can for outbound XHR. If you have one, I would love to hear it!
You can of course implement it, as proven by HTTP/3 existing. You can, of course, do everything!
If you acknowledge the premise that browsers aim to allow only secure connections, and that network managers prefer established protocols on their networks for QoS reasons (even ignoring DDoS questions)... you might still need UDP for performance reasons, and so you have it. It's in WebTransport. All the perf advantages of UDP, with the security concerns addressed.
Implementation concerns exist, but hey you're not writing your own crypto either.
> cannot implement either the same, or even a more restrictive, security policy for outbound UDP packets as you can for outbound XHR
I think you can; it is just not going to be as simple as we would all like. As I mentioned above, you will need some way to restrict websites from sending traffic to internal/private network addresses. You cannot really trust the user here, I think (e.g. imagine some employee inside a company VPN clicking "approve connection" without even knowing what all of that TCP/UDP/IP nonsense is), so the only other option you have is for the servers to approve/reject incoming connections.
Currently we use CORS preflight requests to ask servers which origins are allowed to make requests. Origin (protocol + domain name + port) is the key word here. Since we can host multiple websites on the same IP, or the same website on multiple IPs, we need a different kind of identifier than IP addresses, which is what the origin provides. Now let's say you are trying to establish a raw UDP connection with some server, and we only want users of our website to access it. The browser will need to somehow ask the server whether it can do that. You can definitely have some kind of standard handshake procedure for that (see the sketch below), but at that point your connection isn't really raw UDP anymore.
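To illustrate, here's a toy sketch of what such a server-side handshake might look like, using Node's dgram module (the "HELLO &lt;origin&gt;" protocol is entirely made up, and a real design would need nonces and signatures, since UDP sources can be spoofed):

```js
const dgram = require("node:dgram");

const ALLOWED = new Set(["https://app.example.com"]);
const approved = new Set(); // peers ("ip:port") that completed the hello

const server = dgram.createSocket("udp4");
server.on("message", (msg, rinfo) => {
  const peer = `${rinfo.address}:${rinfo.port}`;
  const text = msg.toString();

  if (text.startsWith("HELLO ")) {
    // Hypothetical origin announcement, stamped by the browser.
    if (ALLOWED.has(text.slice(6).trim())) {
      approved.add(peer);
      server.send("WELCOME", rinfo.port, rinfo.address);
    }
    return;
  }
  if (!approved.has(peer)) return; // no handshake, no service

  // ...handle application datagrams here...
});
server.bind(4000);
```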
I don't know that extra security of this kind would be particularly difficult to provide. If the problem is just that you can't figure out how to do a CORS check on a UDP packet, then just add a second DNS requirement, this time on the receiving side (e.g., the target domain name must have a CNAME or something that says they allow inbound UDP from a particular source domain name).
Furthermore, regarding user approval, they can also agree to download an executable using a single click, and apparently that is not an issue? That executable will be able to send as much raw UDP as it wants.
So again, I'm not sure I'm seeing the actual difference. Certainly any enterprise can just decide not to allow any raw UDP at all (which is fine), just like they can decide not to allow their users to install EXEs on their machines.
I think that your DNS method could work. All IP addresses can have a domain name associated with them via a reverse lookup (requesting a PTR record from a domain like 1.0.168.192.in-addr.arpa). So the equivalent of a CORS request could be to look up the domain name associated with the destination IP, then find a TXT record that says "allow UDP from xyz.com to port 123". It doesn't have perfect security, but I would think it would be as good as CORS, which isn't perfect either.
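A sketch of how that check could be performed, using Node's dns module (the TXT record format here is invented for illustration):

```js
const dns = require("node:dns").promises;

// Hypothetical policy record: "udp-allow from=xyz.com port=123"
async function udpAllowed(targetIp, sourceDomain, port) {
  try {
    const [ptrName] = await dns.reverse(targetIp); // PTR lookup
    const records = await dns.resolveTxt(ptrName); // string[][]
    return records
      .map((chunks) => chunks.join(""))
      .some((r) =>
        r.startsWith("udp-allow") &&
        r.includes(`from=${sourceDomain}`) &&
        r.includes(`port=${port}`)
      );
  } catch {
    return false; // no PTR or TXT record: deny by default
  }
}

// udpAllowed("203.0.113.7", "xyz.com", 123).then(console.log);
```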
Thank you for the suggestion - I plan on doing a follow-up where I look at what the best scheme will be, and how secure it could be made!
The difference is that DNS servers are generally slow to update, subject to enormous caching delays, and need to return the same result for all clients within a region. CORS is an application-level protocol serving individual responses, running in the same environment as the servers handling the request, which can easily interface with session tokens, databases, etc.
Your blind spot IMO is that it is wrong to treat the web as a mere collection of protocols which you just need to tweak. The web is like a city, with built infrastructure, where people work, live and play, and who get upset if you break things or demand they do work just to keep up. If you want to stay sane in such an environment, while creating new things, you need a radically different mindset.
The only thing that works in practice to evolve it is:
- Take an existing protocol P which you can use as a starting point.
- Identify a proper subset S which you can layer all your requirements on to.
- Design a secure and bidirectional translation between your subset S and the original P.
- This will be "inefficient" and "slow" and will make native developers laugh at you. Ignore them.
- Publicize it and get other people with the same requirements to adopt the same approach.
- When the opportunity and momentum is there, figure out a new protocol S' that eliminates all the P-legacy, which everyone who is already used to S can trivially switch to.
- S' becomes a new standard, P is deprecated and sunset.
You can either shake your fist at the sky, or you can deal with the messy reality. Those are the 2 options.
Well there's definitely a third option you didn't enumerate.
> I don't know that extra security of this kind would be particularly difficult to provide
I've been trying to think of other methods, but I couldn't really come up with an easy one :D
> If the problem is just that you can't figure out how to do a CORS check on a UDP packet, then just add a second DNS requirement, this time on the receiving side (eg., the target domain name must have a CNAME or something that says they allow inbound UDP from a particular source domain name).
This doesn't really work, since you can just point your domain name at any IP you want to connect to. For example: you have a 10-year-old vulnerable router on 192.168.0.1 with a UDP control panel. You point a.evil.com at 192.168.0.1 and trick the user into loading a webpage on b.evil.com that is authorized to access a.evil.com.
CORS is nice from a security standpoint since it asks exactly the server that is going to serve your regular request for permission.
> they can also agree to download an executable using a single click, and apparently that is not an issue? That executable will be able to send as much raw UDP as it wants.
Fair point, but for better or worse, the web is thought of as a more or less safe garden. You don't really expect to be hacked by visiting some website.
> Certainly any enterprise can just decide not to allow any raw UDP at all (which is fine), just like they can decide not to allow their users to install EXEs on their machines.
Most people have some kind of unprotected device on their home network though (e.g. router, NAS, TV, internet of shit), not only enterprises. (Obviously we can go ahead and add a lot of scary warnings and antiviruses, but I don't think that is a good solution, personally.)
Thanks for the thorough breakdown. I will give this some more thought and perhaps do another post dealing with just this specific problem, since so far it's the only thing I've seen mentioned that is a real technical issue.
So it looks like there is _yet another_ Web standard that is already tackling all this, and in fact they already did the threat assessment: https://github.com/WICG/direct-sockets/blob/main/docs/explainer.md
Their threat assessment is very _interesting_.
> Threat
> Attackers may use the API to by-pass third parties' CORS policies.
> Mitigation
> We could forbid the API from being used for TCP with the well known HTTPS port, whenever the destination host supports CORS.
The whole point of CORS is that you cannot leak any information by default. By allowing connection to hosts that don't support CORS you are effectively doing the opposite.