409 points by linolevan 22 hours ago | 146 comments
steve197721 hours ago
I don't find the wording in the RFC to be that ambiguous actually.
> The answer to the query, possibly preface by one or more CNAME RRs that specify aliases encountered on the way to an answer.
The "possibly preface" (sic!) to me is obviously to be understood as "if there are any CNAME RRs, the answer to the query is to be prefaced by those CNAME RRs" and not "you can preface the query with the CNAME RRs or you can place them wherever you want".
mrmattyboy19 hours ago
I agree this doesn't seem too ambiguous - it's "you may do this.." and they said "or we may do the reverse". If I say you could prefix something.. the alternative isn't that you can suffix it.
But also.. the programmers working on the software running one of the most important (end-user) DNS servers in the world:
1. Changes logic in how CNAME responses are formed
2. I assume some tests at least broke that meant they needed to be "fixed up" (y'know - "when a CNAME is queried, I expect this response")
3. No one saw these changes in test behavior and thought "I wonder if this order is important". Or "We should research more into this", or "Are other DNS servers changing order", or "This should be flagged for a very gradual release".
4. Ends up in test environment for, what, a month.. nothing using getaddrinfo from glibc is being used to test this environment or anyone noticed that it was broken
Cloudflare seem to be getting into the swing of breaking things and then being transparent. But this really reads as a fun "did you know", not a "we broke things again - please still use us".
There's no real RCA except to blame an RFC - but honestly, for a large-scale operation like theirs this seems like a very big thing to slip through the cracks.
I would make a joke about South Park's oil "I'm sorry".. but they don't even seem to be sorry.
We used to say at work that the best way to get promoted was to be the programmer that introduced the bug into production and then fix it. Crazy if true here...
black3r16 hours ago
> 4. Ends up in test environment for, what, a month.. nothing using getaddrinfo from glibc is being used to test this environment or anyone noticed that it was broken
"Testing environment" sounds to me like a real network real user devices are used with (like the network used inside CloudFlare offices). That's what I would do if I was developing a DNS server anyway, other than unit tests (which obviously wouldn't catch this unless they were explicitly written for this case) and maybe integration/end-to-end tests, which might be running in Alpine Linux containers and as such using musl. If that's indeed the case, I can easily imagine how noone noticed anything was broken. First look at this line:
> Most DNS clients don’t have this issue. For example, systemd-resolved first parses the records into an ordered set:
Now think about what real end user devices are using: Windows/macOS/iOS obviously aren't using glibc and Android also has its own C library even though it's Linux-based, and they all probably fall under the "Most DNS clients don't have this issue.".
That leaves GNU/Linux, where we could reasonably expect most software to use glibc for resolving queries, so presumably anyone using Linux on their laptop would catch this right? Except most distributions started using systemd-resolved (the most notable exception is Debian, but not many people use that on desktops/laptops), which is a local caching DNS resolver, and as such acts as a middleman between glibc software and the network-configured DNS server, so it would resolve 1.1.1.1 queries correctly, and then return the results from its cache ordered by its own ordering algorithm.
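As an aside, a quick way to check which camp a given Linux box falls into (a rough sketch of my own, not from the article) is to see whether /etc/resolv.conf points at the systemd-resolved stub or straight at the configured DNS server:

    # Rough check: does glibc on this machine go through the systemd-resolved
    # stub (127.0.0.53), which re-orders records locally, or straight to the
    # network's DNS server? Assumes a conventional /etc/resolv.conf layout.
    with open("/etc/resolv.conf") as f:
        nameservers = [line.split()[1] for line in f
                       if line.strip().startswith("nameserver")]
    if "127.0.0.53" in nameservers:
        print("glibc queries go via systemd-resolved")
    else:
        print("glibc talks directly to:", nameservers)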
account427 hours ago
> other than unit tests (which obviously wouldn't catch this unless they were explicitly written for this case)
They absolutely should have unit tests that detect any change in output and manually review those changes for an operation of this size.
skywhopper3 hours ago
For the output of Cloudflare’s DNS server, which serves a huge chunk of the Internet, they absolutely should have a comprehensive byte-by-byte test suite, especially for one of the most common query/result patterns.
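For what it's worth, a golden test along those lines doesn't need to be elaborate. A rough sketch using dnspython (the resolver address, query name, and golden-file path are all made up, and it assumes the test zone serves deterministic TTLs so a byte comparison is meaningful):

    import dns.message
    import dns.query

    RESOLVER_UNDER_TEST = "192.0.2.53"   # hypothetical address of the build under test

    def wire_answer(name, rdtype="A"):
        query = dns.message.make_query(name, rdtype)
        response = dns.query.udp(query, RESOLVER_UNDER_TEST, timeout=2)
        response.id = 0                  # transaction ID differs per query; mask it out
        return response.to_wire()

    def test_cname_chain_matches_golden():
        # Golden bytes captured from a known-good build; any wire-level change,
        # including CNAME re-ordering, forces a human to look at the diff.
        with open("golden/cname_chain.bin", "rb") as f:
            golden = f.read()
        assert wire_answer("alias.test.example") == golden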
jrochkind116 hours ago
> I assume some tests at least broke that meant they needed to be "fixed up"
OP said:
"However, we did not have any tests asserting the behavior remains consistent due to the ambiguous language in the RFC."
One could guess it's something like -- back when we wrote the tests, years ago, whoever did it missed that this was required, not helped by the fact that the spec preceded RFC 2119 standardizing the all-caps "MUST" "SHOULD" etc. language, which would have helped us translate specs to tests more completely.
ibejoeb6 hours ago
Even if there weren't tests for the return order, I would have bet that there were tests of backbone resolvers like getaddrinfo. Is it really possible that the first time anyone noticed that it crashed, or that Ciscos boot-looped, was on a live query?
account427 hours ago
You'd think that something this widely used would have golden tests that detect any output change to trigger manual review but apparently they don't.
jrochkind1just now
Oh, they explain, if I understand right, they did the output change intentionally, for performance reasons. Based on the inaccurate assumption that order did not matter in DNS responses -- because there are OTHER aspects of DNS responses in which, by spec, order does not matter, and because there were no tests saying order mattered for this component.
> "The order of RRs in a set is not significant, and need not be preserved by name servers, resolvers, or other parts of the DNS." [from RFC]
> However, RFC 1034 doesn’t clearly specify how message sections relate to RRsets.
The developer(s) were assuming order didn't matter in general, because the RFC said it didn't for one aspect, and intentionally made a change to order for performance reasons. But it turned out the order did matter.
Mistakes of this kind seem unavoidable; this one doesn't necessarily say to me the developers made a mistake I never could have made or something.
I think the real conclusion is they probably need tests using actual live network stacks with common components, and why didn't they have those? Not just unit tests or tests with mocks, but tests that would have actually used the real getaddrinfo function in glibc and shown it failing?
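And exercising the real glibc path isn't exotic: on CPython, socket.getaddrinfo is a thin wrapper around the libc call, so something like the sketch below (the name is a placeholder), run inside a glibc-based container whose /etc/resolv.conf points at the resolver build under test, would have tripped on this:

    import socket

    def test_cname_chain_via_glibc():
        # Assumes this runs on a glibc system whose resolv.conf points at the
        # resolver under test, and that the placeholder name below has a CNAME
        # chain behind it. glibc's getanswer_r rejecting the re-ordered answer
        # shows up here as socket.gaierror, which fails the test.
        results = socket.getaddrinfo("alias.test.example", 443,
                                     proto=socket.IPPROTO_TCP)
        assert results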
bpt319 hours ago
> Ends up in test environment for, what, a month.. nothing using getaddrinfo from glibc is being used to test this environment or anyone noticed that it was broken
This is the part that is shocking to me. How is getaddrinfo not called in any unit or system tests?
I would hazard a guess that their test environment has both the systemd variant and the Unbound variant (Unbound technically does not arrange them, but instead reconstructs the chain according to RFC "CNAME restart" logic because it is a recursive resolver in itself), but not plain directly-piped resolv.conf (presumably because who would run that in this day and age. This is sadly just a half-joke, because only a few people would fall into this category.)
SAI_Peregrinus18 hours ago
Probably Alpine containers, so musl's version instead of glibc's.
laixintao13 hours ago
Yes, at least they should test the glibc case.
inopinatus20 hours ago
The article makes it very clear that the ambiguity arises in another phrase: “difference in ordering of the RRs in the answer section is not significant”, which is applied to an example; the problem with examples being that they are illustrative, viz. generalisable, and thus may permit reordering everywhere, and in any case, whether they should or shouldn’t becomes a matter of pragmatic context.
Which goes to show, one person’s “obvious understanding” is another’s “did they even read the entire document”.
All of which also serves to highlight the value of normative language, but that came later.
PunchyHamster15 hours ago
it wouldn't be a problem if they tested it properly... especially WHEN stuff is ambiguous
nraynaud8 hours ago
They may not have realized their interpretation is ambiguous until after the incident, that’s the kind of stuff you realize after you find a bug and do a deep dive in the literature for a post mortem. They probably worked with the certitude that record order is irrelevant until that point.
the_mitsuhiko18 hours ago
> I don't find the wording in the RFC to be that ambiguous actually.
I agree with you, and I also think that their interpretation of example 6.2.1 in the RFC is somewhat nonsensical. It states that “The difference in ordering of the RRs in the answer section is not significant.” But from the RFC, very clearly this comment is relevant only to that particular example; it is comparing two responses and saying that in this case, the different ordering has no semantic effect.
And perhaps this is somewhat pedantic, but they also write that “RFC 1034 section 3.6 defines Resource Record Sets (RRsets) as collections of records with the same name, type, and class.” But looking at the RFC, it never defines such a term; it does say that within a “set” of RRs “associated with a particular name” the order doesn’t matter. But even if the RFC had said “associated with a particular combination of name, type, and class”, I don’t see how that could have introduced ambiguity. It specifies an exception to a general rule, so obviously if the exception doesn’t apply, then the general rule must be followed.
Anyway, Cloudflare probably know their DNS better than I do, but I did not find the article especially persuasive; I think the ambiguity is actually just a misreading, and that the RFC does require a particular ordering of CNAME records.
(ETA:) Although admittedly, while the RFC does say that CNAMEs must come before As in the answer, I don’t necessarily see any clear rule about how CNAME chains must be ordered; the RFC just says “Domain names in RRs which point at another name should always point at the primary name and not the alias ... Of course, by the robustness principle, domain software should not fail when presented with CNAME chains or loops; CNAME chains should be followed”. So actually I guess I do agree that there is some ambiguity about the responses containing CNAME chains.
taeric20 hours ago
Isn't this literally noted in the article? The article even points out that the RFC is from before normative words were standardized for hard requirements.
devman019 hours ago
Even if 'possibly preface' is interpreted to mean CNAME RRSets should appear first there is still a broken reliance by some resolvers on the order of CNAME RRsets if there is more than one CNAME in the chain. This expectation of ordering is not promised by the relevant RFCs.
paulddraper21 hours ago
100%
I just commented the same.
It's pretty clear that the "possibly" refers to the presence of the CNAME RRs, not the ordering.
Dylan1680719 hours ago
The context makes it less clear, but even if we pretend that part is crystal, a comment that stops there is missing the point of the article. All CNAMEs at the start isn't enough. The order of the CNAMEs can cause problems despite perfect RFC compliance.
andrewshadura18 hours ago
To me, this reads exactly the opposite.
patrickmay21 hours ago
A great example of Hyrum's Law:
"With a sufficient number of users of an API,
it does not matter what you promise in the contract:
all observable behaviors of your system
will be depended on by somebody."
combined with failure to follow Postel's Law:
"Be conservative in what you send, be liberal in what you accept."
mmastrac21 hours ago
Postel's law is considered more and more harmful as the industry evolved.
CodesInChaos20 hours ago
That depends on how Postel's law is interpreted.
What's reasonable is: "Set reserved fields to 0 when writing and ignore them when reading." (I heard that was the original example). Or "Ignore unknown JSON keys" as a modern equivalent.
What's harmful is: Accept an ill defined superset of the valid syntax and interpret it in undocumented ways.
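A toy illustration of the difference (JSON rather than DNS, and the field names are made up): the benign reading is parsing strictly and then ignoring keys you don't know; the harmful one is accepting input that isn't valid at all and guessing.

    import json

    raw = '{"name": "example", "ttl": 300, "some_future_field": true}'

    # Benign leniency: the input must be valid JSON, but unknown keys are ignored.
    record = json.loads(raw)
    name, ttl = record["name"], record.get("ttl", 0)

    # Harmful leniency would be accepting something that isn't JSON at all
    # (trailing comma here) and guessing what the sender meant. Reject it instead.
    try:
        json.loads('{"name": "example",}')
    except json.JSONDecodeError:
        pass  # strict parsers correctly refuse this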
treve19 hours ago
Good modern protocols will explicitly define extension points, so 'ingoring unknown JSON keys' is in-spec rather than assumed that an implementer will do.
tuetuopay19 hours ago
Funny I never read the original example. And in my book, it is harmful, and even worse in JSON, since it's the best way to have a typo somewhere go unnoticed for a long time.
sweetjuly16 hours ago
The original example is very common in ISAs at least. Both ARMv8 and RISC-V (likely others too but I don't have as much experience with them) have the idea of requiring software to treat reserved bits as if they were zero for both reading and writing. ARMv8 calls this RES0 and a hardware implementation is constrained to either being write-ignore for the field (e.g. reads are hardwired to zero) or returning the last successful write.
This is useful as it allows the ISA to remain compatible with code which is unaware of future extensions which define new functionality for these bits so long as the zero value means "keep the old behavior". For example, a system register may have an EnableNewFeature bit, and older software will end up just writing zero to that field (which preserves the old functionality). This avoids needing to define a new system register for every new feature.
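A toy version of that discipline (register layout invented for the example): older software writes only the fields it knows about and leaves everything else, including RES0 bits, at zero, so a later revision that assigns meaning to one of those bits sees its "keep the old behaviour" value.

    KNOWN_FIELDS_MASK = 0x0000_0007    # fields this (older) software knows about
    ENABLE_NEW_FEATURE = 1 << 3        # hypothetical bit a later revision defines

    def register_write_value(fields):
        # Mask to the known fields; reserved bits are written back as zero,
        # so bit 3 stays 0 and the hypothetical new feature remains disabled.
        return fields & KNOWN_FIELDS_MASK

    assert register_write_value(0b1101) & ENABLE_NEW_FEATURE == 0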
yxhuvud19 hours ago
I disagree. I find accepting extra random bytes in places to be just as harmful. I prefer APIs that push back and tell me what I did wrong when I mess up.
n2d421 hours ago
Very much so. A better law would be to be conservative in both sending and accepting, as it turns out that if you are liberal in what you accept, senders will choose to disobey Postel's law and be liberal in what they send, too.
mikestorrent18 hours ago
It's an oscillation. It goes in cycles. Things formalize upward until you've reinvented XML, SOAP and WSDLs; then a new younger generation comes in and says "all that stuff is boring and tedious, here's this generation's version of duck typing", followed by another ten years of tacking strong types onto that.
MCP seems to be a new round of the cycle beginning again.
Ericson231413 hours ago
No they won't do that, because vibe coding boring tedious shit is easy and looks good to your manager.
I'm dead serious, we should be in a golden age of "programming in the large" formal protocols.
Gigachad17 hours ago
The modern view seems to be you should just immediately abort if the spec isn't being complied with since it's possibly someone trying to exploit the system with malformed data.
esafak21 hours ago
I think it is okay to accept liberally as long as you combine it with warnings for a while to give offenders a chance to fix it.
wolrah1 hour ago
Warnings only work if the person receiving them is either capable of and motivated to do something about it, or capable of motivating the person/people capable of doing something about it.
A weak warning that's just an entry in a scrolling console means nothing to end users and can be ignored by devs. A strong warning that comes out as a modal dialog can still be ignored by devs and then just annoys users. See the early era of Windows UAC for possibly the most widespread example of a strong warning added after the fact.
hdjrudni21 hours ago
"Warnings" are like the most difficult thing to 'send' though. If an app or service doesn't outright fail, warnings can be ignored. Even if not ignored... how do you properly inform? A compiler can spit out warnings to your terminal, sure. Test-runners can log warnings. An RPC service? There's no standard I'm aware of. And DNS! Probably even worse. "Yeah, your RRs are out of order but I sorted them for you." where would you put that?
esafak20 hours ago
> how do you properly inform?
Through the appropriate channels; in-band and out-of-band.
immibis18 hours ago
a content-less tautology
diarrhea19 hours ago
Randomly fail or (increasingly) delay a random subset of all requests.
Melonai19 hours ago
That sounds awful and will send administrators on a wild goose chase throughout their stack to find the issue without many clues except this thing is failing at seemingly random times. (I myself would suspect something related to network connectivity, maybe requests are timing out? This idea would lead me in the completely wrong direction.)
It also does not give any way to actually see a warning message, where would we even put it? I know for a fact that if my glibc DNS resolver started spitting out errors into /var/log/god_knows_what I would take days to find it, at best the resolver could return some kind of errno with perror giving us a message like "The DNS response has not been correctly formatted", and then hope that the message is caught and forwarded through whatever is wrapping the C library, hopefully into our stderr. And there's so many ways even that could fail.
SahAssar17 hours ago
So we arrive at the logical conclusion: You send errors in morse code, encoded as seconds/minutes of failures/successes. Any reasonable person would be able to recognize morse when seeing the patterns on a observability graph.
Start with milliseconds, move on to seconds and so on as the unwanted behavior continues.
dotancohen21 hours ago
The Python community was famously divided on that matter, wrt Python 3. Now that it is over, most people on the "accept liberally" side of the fence have jumped sides.
psnehanshu20 hours ago
Warnings are ignored. It's much better to fail fast.
ajross20 hours ago
That's true, but sort of misses the spirit of Hyrum's law (which is that the world is filled with obscure edge cases).
In this case the broken resolver was the one in the GNU C Library, hardly an obscure situation!
The news here is sort of buried in the story. Basically Cloudflare just didn't test this. Literally every datacenter in the world was going to fail on this change, probably including their own.
black3r16 hours ago
> Literally every datacenter in the world was going to fail on this change
I would expect most datacenters to use their own local recursive caching DNS servers instead of relying on 1.1.1.1 to minimize latency.
stevefan199914 hours ago
that means you ended up with a leaky abstraction indirectly, but at the people level
I am very petty about this one bug and have a very old axe to grind that this reminded me of! Way back in 2011 CloudFlare launched an incredibly poorly researched feature to just return CNAME records at a domain apex ... RFCs be damned.
The problem? CNAMEs are name level aliases, not record level, so this "feature" would break the caching of NS, MX, and SOA records that exist at domain apexes. Many of us warned them at the time that this would result in a non-deterministic issue. At EC2 and Route 53 we weren't supporting this just to be mean! If a user's DNS resolver got an MX query before an A query, things might work ... but the other way around, they might not. An absolute nightmare to deal with. But move fast and break things, so hey :)
In earnest though ... it's great to see how Cloudflare are now handling CNAME chains and A record ordering issues in this kind of detail. I never would have thought of this implicit contract they've discovered, and it makes sense!
ycombiredd11 hours ago
You just caused flashbacks of error messages from BIND of the sort "cannot have CNAME and other data", from this proximate cause, and having to explain the problem many, many times. Confusion and ambiguity of understanding have also existed forever among people creating domain RRs (editing files) or using the automated or more machined equivalents.
Related, the phrase "CNAME chains" causes vague memories of confusion surrounding the concepts of "CNAME" and casual usage of the term "alias". Without re-reading RFC1034 today, I recall that my understanding back in the day was that the "C" was for "canonical", and that the host record the CNAME itself resolved to must itself have an A record, and not be another CNAME, and I acknowledge the already discussed topic that my "must" is doing a lot of lifting there, since the RFC in question predates a normative language standard RFC itself.
So, I don't remember exactly the initial point I was trying to get at with my second paragraph; maybe there have always been various failure modes due to varying interpretations, which have only compounded with age, new blood, non-standard language being used in self-serve DNS interfaces by providers, etc., which I suppose only strengthens the "ambiguity" claim. That doesn't excuse such a large critical service provider though, at all.
Dylan168078 hours ago
Is a deliberate violation of a spec really a bug? And I don't think their choice was "move fast and break things" at all.
It is a nightmare, but the spec is the source of the nightmare.
NelsonMinar21 hours ago
It's remarkable that the ordinary DNS lookup function in glibc doesn't work if the records aren't in the right order. It's amazing to me we went 20+ years without that causing more problems. My guess is most people publishing DNS records just sort of knew that the order mattered in practice, maybe figuring it out in early testing.
pixl9721 hours ago
I think it's more that this is server-side ordering: there were not that many DNS servers out there, and the ones that didn't keep it in order quickly changed the behavior because of interop.
It's more likely because the internet runs on a very small number of authoritative server implementations which all implement this ordering quirk.
immibis18 hours ago
This is a recursive resolver quirk
zinekeller16 hours ago
... that was perpetuated by BIND.
(Yes, there are other recursive resolver implementations, but they look at BIND as the reference implementation and absent any contravention to the RFC or intentional design-level decisions, they would follow BIND's mechanism.)
account424 hours ago
It's also the most natural way to structure the answer:
Hey, where can I find A.
Answer: A is actually B
Answer: Also B can be found at 42
jeroenhd6 hours ago
People probably ran into this all the time, but no single party large enough to have it gain attention produced the failure state.
If a small business or cloud app can't resolve a domain because the domain is doing something different, it's much easier to blame DNS, use another DNS server, and move on. Or maybe just go "some Linuxes can't reach my website, oh well, sucks for the 1-3%".
Cloudflare is large enough that they caused issues for millions of devices all at once, so they had to investigate.
What's unclear to me is if they bothered to send patches to broken open-source DNS resolvers to fix this issue in the future.
iainmerrick6 hours ago
No, because they're not really broken. I think this is fairly clear:
> Based on what we have learned during this incident, we have reverted the CNAME re-ordering and do not intend to change the order in the future.
> To prevent any future incidents or confusion, we have written a proposal in the form of an Internet-Draft to be discussed at the IETF.
That is, explicitly documenting the "broken" behaviour as permitted.
fweimer16 hours ago
The last time this came up, people said that it was important to filter out unrelated address records in the answer section (with names to which the CNAME chain starting at the question name does not lead). Without the ordering constraint (or a rather low limit on the number of CNAMEs in a response), this needs a robust data structure for looking up DNS names. Most in-process stub resolvers (including the glibc one) do not implement a DNS cache, so they presently do not have a need to implement such a data structure. This is why eliminating the ordering constraint while preserving record filtering is not a simple code change.
Dylan168078 hours ago
Doesn't it need to go through the CNAME chain no matter what? If it's doing that, isn't filtering at most tracking all the records that matched? That requires a trivial data structure.
Parsing the answer section in a single pass requires more finesse, but does it need fancier data structures than a string to string map? And failing that you can loop upon CNAME. I wouldn't call a depth limit like 20 "a rather low limit on the number of CNAMEs in a response", and max 20 passes through a max 64KB answer section is plenty fast.
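Something like this sketch is what I mean (record types and names invented for the example): one pass to build a CNAME map and an address map, then a bounded walk from the question name, after which the order the server sent things in stops mattering.

    def resolve_from_answer(question, answer_rrs, max_chain=20):
        # answer_rrs: (owner_name, rtype, rdata) tuples in whatever order the
        # server sent them; returns the addresses for the question name.
        cname = {}    # owner -> canonical target
        addrs = {}    # owner -> list of addresses
        for owner, rtype, rdata in answer_rrs:
            if rtype == "CNAME":
                cname[owner] = rdata
            elif rtype in ("A", "AAAA"):
                addrs.setdefault(owner, []).append(rdata)

        name = question
        for _ in range(max_chain):          # depth limit guards against CNAME loops
            if name in addrs:
                return addrs[name]
            if name not in cname:
                break
            name = cname[name]
        return []

    # Works even when the CNAMEs arrive after the address record:
    rrs = [("c.example", "A", "192.0.2.1"),
           ("a.example", "CNAME", "b.example"),
           ("b.example", "CNAME", "c.example")]
    assert resolve_from_answer("a.example", rrs) == ["192.0.2.1"]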
skywhopper3 hours ago
It’s not remarkable, because it’s the way all DNS servers work. Order is important in DNS results. It’s why results with multiple A records are returned in shuffled orders: because that impacts how the client interprets the results. Anyone who works with DNS regularly beyond just reading the RFCs ought to recognize this intuitively.
linsomniac17 hours ago
>While in our interpretation the RFCs do not require CNAMEs to appear in any particular order
That seems like some doubling-down BS to me, since they earlier say "It's ambiguous because it doesn't use MUST or SHOULD, which was introduced a decade after the DNS RFC." The RFC says:
>The answer to the query, possibly preface by one or more CNAME RRs that specify aliases encountered on the way to an answer.
How do you get to interpreting that, in the face of "MUST" being defined a decade later, as "I guess I can append the CNAME to the answer"?
Holding onto "we still think the RFC allows it" is a problem. The world is a lot better if you can just admit to your mistakes and move on. I try to model this at home and at work, because trying to "language lawyer" your way out of being wrong makes the world a worse place.
skywhopper3 hours ago
The RFC is also 39 years old! At this point, DNS is what existing software expects it to be, not what someone proposed in the mid-eighties. The fact that they did not have any testing to match exact byte-by-byte responses with existing behavior and other DNS resolvers for this layer of service is massively irresponsible.
bwblabs18 hours ago
I will hijack this post to point out that CloudFlare really doesn't understand RFC 1034: their DNS authoritative interface only blocks A and AAAA if there is a CNAME defined, e.g. see this:
$ echo "A AAAA CAA CNAME DS HTTPS LOC MX NS TXT" | sed -r 's/ /\n/g' | sed -r 's/^/rfc1034.wlbd.nl /g' | xargs dig +norec +noall +question +answer +authority @coco.ns.cloudflare.com
;rfc1034.wlbd.nl. IN A
rfc1034.wlbd.nl. 300 IN CNAME www.example.org.
;rfc1034.wlbd.nl. IN AAAA
rfc1034.wlbd.nl. 300 IN CNAME www.example.org.
;rfc1034.wlbd.nl. IN CAA
rfc1034.wlbd.nl. 300 IN CAA 0 issue "really"
;rfc1034.wlbd.nl. IN CNAME
rfc1034.wlbd.nl. 300 IN CNAME www.example.org.
;rfc1034.wlbd.nl. IN DS
rfc1034.wlbd.nl. 300 IN DS 0 13 2 21A21D53B97D44AD49676B9476F312BA3CEDB11DDC3EC8D9C7AC6BAC A84271AE
;rfc1034.wlbd.nl. IN HTTPS
rfc1034.wlbd.nl. 300 IN HTTPS 1 . alpn="h3"
;rfc1034.wlbd.nl. IN LOC
rfc1034.wlbd.nl. 300 IN LOC 0 0 0.000 N 0 0 0.000 E 0.00m 0.00m 0.00m 0.00m
;rfc1034.wlbd.nl. IN MX
rfc1034.wlbd.nl. 300 IN MX 0 .
;rfc1034.wlbd.nl. IN NS
rfc1034.wlbd.nl. 300 IN NS rfc1034.wlbd.nl.
;rfc1034.wlbd.nl. IN TXT
rfc1034.wlbd.nl. 300 IN TXT "Check my cool label serving TXT and a CNAME, in violation with RFC1034"
The result is DNS resolvers (including CloudFlare Public DNS) will have a cache dependent result if you query e.g. a TXT record (depending if it has the CNAME cached).
At internet.nl (https://github.com/internetstandards/) we found out because some people claimed to have some TXT DMARC record, while also CNAMEing this record (which results in cache-dependent results, and since internet.nl uses RFC 9156 QName Minimisation, it first resolves A, and therefore caches the CNAME and will never see the TXT). People configure things similar to https://mxtoolbox.com/dmarc/dmarc-setup-cname instructions (which I find in conflict with RFC 1034).
I don't think they're advising anyone to create both a CNAME and TXT at the same label - but it certainly looks like that from the weird screenshot at step 5 (which doesn't match the text).
I think it's mistakenly a mish-mash of two different guides, one for 'how to use a CNAME to point to a third party DMARC service entirely' and one for 'how to host the DMARC record yourself' (irrespective of where the RUA goes).
bwblabs17 hours ago
I'm not sure, but we're seeing this specifically with _dmarc CNAMEing to '.hosted.dmarc-report.com' together with a TXT record type, also see this discussion users asking for this at deSEC: https://talk.desec.io/t/cannot-create-cname-and-txt-record-f...
My main point was however that it's really not okay that CloudFlare allows setting up other record types (e.g. TXT, but basically any) next to a CNAME.
ycombiredd10 hours ago
Yes. This type of behavior was what I was referring to in an earlier comment mentioning flashbacks to seeing logs from named filled with "cannot have cname and other data", and slapping my forehead asking "who keeps doing this?", in the days when editing files by hand was the norm. And then, of course having repeats of this feeling as tools were built, automations became increasingly common, and large service providers "standardized" interfaces (ostensibly to ensure correctness) allowing or even encouraging creation of bad zone configurations.
The more things change, the more things stay the same. :-)
mgaunardjust now
It would make sure that any graph is provided in topological order.
forinti21 hours ago
> While in our interpretation the RFCs do not require CNAMEs to appear in any particular order, it’s clear that at least some widely-deployed DNS clients rely on it. As some systems using these clients might be updated infrequently, or never updated at all, we believe it’s best to require CNAME records to appear in-order before any other records.
That's the only reasonable conclusion, really.
hdjrudni20 hours ago
And I'm glad they came to it. Even if everyone else is wrong (I'm not saying they are) sometimes you just have to play along.
seiferteric20 hours ago
Now that I have seemingly taken on managing DNS at my current company I have seen several inadequacies of DNS that I was not aware of before. The main one being that if an upstream DNS server returns SERVFAIL, there is really no distinction between whether the server you are querying has failed or the actual authoritative server upstream is broken (I am aware of EDEs but they don't really solve this). So clients querying a broken domain will retry each of their configured DNS servers, and our caching layer (Unbound) will also retry each of its upstreams etc... Results in a bunch of pointless upstream queries like an amplification attack. I also have an issue with the search path doing stupid queries that return NXDOMAIN, like badname.company.com, badname.company.othername.com... etc.
indigodaddy19 hours ago
re: your SERVFAIL observation, oh man did I run into this exact issue about a year or so ago when this came up for a particular zone. All I was doing was troubleshooting it on the caching server. Took me a day or two to actually look at the auth server and find out that the issue actually originated there.
simoncion12 hours ago
> So clients querying a broken domain will retry each of their configured DNS servers, our caching layer (Unbound) will also retry each of their upstreams etc...
I expect this is why BIND 9 has the 'servfail-ttl' option. [0]
Turns out that there's a standards-track RFC from 1998 that explicitly permits caching SERVFAIL responses. [1] Section 8 of that document suggests that this behavior was permitted by RFC 1034 (published back in 1987).
I would expect that DNS servers like 1.1.1.1 at this scale have integration tests running real resolvers, like the one in glibc. How come this issue was discovered only in production?
t0mas8818 hours ago
This case would only happen if a CNAME chain first expired from the cache in the wrong order and then subsequently was queried via glibc. Their tests may test both that glibc resolving works and that re-querying expired records works, but not the combination of the two.
wolttam18 hours ago
My take is quite cynical on this.. This post reads to me like a post-justification of some strange newly introduced behaviour.
Please order the answer in the order the resolutions were performed to arrive at the final answer (regardless of cache timings). Anything else makes little sense, especially not in the name of some micro-optimization (which could likely be approached in other ways that don’t alter behaviour).
Gigachad17 hours ago
The DNS specification should be updated to say CNAMES _must_ be ordered at the top rather than "possibly". Cloudflare was complying with the specification. Cisco was relying on unspecified behavior that happened to be common.
alexey-salmin14 hours ago
The only reasonable interpretation of "possibly prefaced" is that the CNAMEs either come first or not at all (hence "possibly"). Nowhere does the RFC suggest that they may come in the middle.
Something has been broken at Cloudflare for a couple of years now. It takes a very specific engineering culture to run the internet and it's just not there anymore.
Dylan168078 hours ago
Except that "first or not at all" doesn't prevent this bug from triggering.
Nowhere does the RFC suggest that multiple CNAMEs need to be in a specific order.
hdgvhicv8 hours ago
I’m no fan of the centralised intenet cloudflare heralds, but blaming anyone but Cisco for this reboot behaviour is wrong.
skywhopper3 hours ago
Cloudflare broke clients all over the world. What the 40 year old RFC says is not the de facto “specification” at this point.
alt227just now
Cloudflare broke 'Cisco' clients all over the world. Not CF's problem that the biggest router vendor in the world programmed their routers wrongly.
tuetuopay20 hours ago
Many rightfully interpret the RFC as "CNAMEs have to be before A", but the issue persists in between CNAMEs in the chain as noted in the article. If a record in the middle of the chain expires, glibc would still fail if the "middle" record were to be inserted between the other CNAMEs and the A records.
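Roughly, with hypothetical names, the difference looks like this - the second form is the kind of answer that trips up glibc's in-order parser, because the refreshed middle CNAME no longer sits where the chain needs it:

    ;; in-order answer (fine):
    a.example.  300  IN  CNAME  b.example.
    b.example.  300  IN  CNAME  c.example.
    c.example.  300  IN  A      192.0.2.1

    ;; re-ordered answer (the refreshed middle CNAME ends up last):
    a.example.  300  IN  CNAME  b.example.
    c.example.  300  IN  A      192.0.2.1
    b.example.   17  IN  CNAME  c.example.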
It’s always DNS.
m304719 hours ago
DNS is a wire protocol, payload specification, and application protocol. For all of that, I personally wonder whether its enduring success isn't that it's remarkably underspecified when you get to the corner cases.
There's also so much of it, and it mostly works, most of the time. This creates a hysteresis loop in human judgement of efficacy: even a blind chicken gets corn if it's standing in it. Cisco bought cisco., but (a decade ago, when I had access to the firehose) on any given day belkin. would be in the top 10 TLDs if you looked at the NXDOMAIN traffic. Clients don't opportunistically try TCP (which they shouldn't, according to the specification...), but we have DoT (...but should in practice). My ISPs reverse DNS implementation is so bad that qname minimization breaks... but "nobody should be using qname minimization for reverse DNS", and "Spamhaus is breaking the law by casting shades at qname minimization".
"4096 ought to be enough for anybody" (no, frags are bad. see TCP above). There is only ever one request in a TCP connection... hey, what are these two bytes which are in front of the payload in my TCP connection? People who want to believe that their proprietary headers will be preserved if they forward an application protocol through an arbitrary number of intermediate proxy / forwarders (because that's way easier than running real DNS at the segment edge and logging client information at the application level).
Tangential, but: "But there's more to it, because people doing these things typically describe how it works for them (not how it doesn't work) and onlookers who don't pay close attention conclude "it works"." http://consulting.m3047.net/dubai-letters/dnstap-vs-pcap.htm...
sebastianmestre21 hours ago
I kind of wish they'd start sending records in randomized order to take out all the broken implementations that depend on such a fragile property
0xbadcafebee13 hours ago
That won't cause implementations to be fixed. The implementations in question are in devices that are old (DNS is over 40 years old) and will never be upgraded. Affected users will just choose a different DNS resolver. Pretty soon word will get around that "if you don't want a broken device, don't use CloudFlare for DNS". It's less hassle for CloudFlare to just maintain the existing de-facto standard.
wolttam18 hours ago
Is the property of an answer being ordered in the order that resolutions were performed to construct it /that/ fragile?
Randomization within the final answer RRSet is fine (and maybe even preferred in a lot of cases)
t0mas8818 hours ago
Well, Cisco had their switches get into a boot loop; that sounds very broken...
hdgvhicv8 hours ago
Yes it’s a well known behaviour from these Cisco switches, not just reliant on name ordering. If SBS fails they reboot.
We thought it as just the default ntp servers abut had some reboot during this event because www.cisco.com was unavailable.
m304712 hours ago
That would be a Flag Day initiative. ;-)
Honestly, it shouldn't matter. Anybody who's using a stub resolver where this matters, where /anything/ matters really, should be running their own local caching / recursing resolver. These oftentimes have options for e.g. ordering things for various reasons.
teddyh16 hours ago
Cloudflare is well known for breaking DNS standards, and also then writing a new RFC to justify their broken behavior, and getting IETF to approve it. (The existence of RFC 8482 is a disgrace to everyone involved.)
> To prevent any future incidents or confusion, we have written a proposal in the form of an Internet-Draft to be discussed at the IETF
Of course.
alt227just now
This really depends on what side of the fence you are on.
As a website host/maintainer, I am happy that the DNS 'ANY' query has been deprecated.
I am sure if you are a network engineer or ISP, then it probably annoys you no end.
netfortius3 hours ago
Why couldn't a "code specialized" LLM/AI be added to the change flow in the Cloudflare process, and asked to check against all known implementations of name resolution stubs, DNS clients, etc., etc.? If not in such cases, then when?
peanut-walrus9 hours ago
I've always found it weird that CNAMEs get resolved and lumped into the answer section in the first place. While helpful, this is not what you asked for and it makes much more sense to me to stick that in the additional section instead.
As an aside, I am super annoyed at Cloudflare for calling their proxy records "CNAME" in their UI. Those are nothing like CNAMEs and have caused endless confusion.
danepowell20 hours ago
Doesn't the precipitating change optimize memory on the DNS server at the expense of additional memory usage across millions of clients that now need to parse an unordered response?
Dylan1680719 hours ago
The memory involved is a kilobyte. The optimization isn't important anywhere. The fragility is what's important.
Also no, the client doesn't need more memory to parse the out-of-order response, it can take multiple passes through the kilobyte.
fweimer17 hours ago
For most client interfaces, it's possible to just grab the addresses and ignore the CNAMEs altogether because the names do not matter, or only the name on the address record.
Of course, if the server sends unrelated address records in the answer section, that will result in incorrect data. (A simple counter can detect the end of the answer section, so it's not necessary to chase CNAMEs for section separation.)
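For illustration (not glibc's actual code, just the shape of the idea, sketched with dnspython doing the wire parsing): walk the answer section once, take every A/AAAA rdata, and never chase a CNAME - with the caveat, as noted above, that unrelated address records in the answer section would then be trusted.

    import dns.message
    import dns.rdatatype

    def addresses_from_wire(wire):
        # The parse is ANCOUNT-driven, so it naturally stops at the end of the
        # answer section; CNAMEs are simply skipped.
        msg = dns.message.from_wire(wire)
        addrs = []
        for rrset in msg.answer:
            if rrset.rdtype in (dns.rdatatype.A, dns.rdatatype.AAAA):
                addrs.extend(rr.address for rr in rrset)
        return addrs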
mintflow15 hours ago
After reading the article, I am wondering: is there really no test case covering the behavior change that modified the CNAME order in the response? I think it should be simple to run a fleet of various OS/DNS client combinations to test the behavior.
And I am also shocked that a Cisco switch goes into a reboot loop over this DNS ordering issue.
mcfedr9 hours ago
everything about this reads like an excuse from a team that doesn't want to admit they screwed up
nitpicking at the RFCs when everyone knows DNS is a big old thing with lots going on
how do they not have basic integration tests to check how clients resolve
it seems very unlike the cloudflare of old that was much more up front - there is no talk of the need to improve process, just blaming other people
kayson21 hours ago
> However, we did not have any tests asserting the behavior remains consistent due to the ambiguous language in the RFC.
Maybe I'm being overly-cynical but I have a hard time believing that they deliberately omitted a test specifically because they reviewed the RFC and found the ambiguous language. I would've expected to see some dialog with IETF beforehand if that were the case. Or some review of the behavior of common DNS clients.
It seems like an oversight, and that's totally fine.
bombcar21 hours ago
I took it as being "we wrote the tests to the standard" and then built the code, and whoever was writing the tests didn't read that line as a testable aspect.
kayson21 hours ago
Fair enough.
supriyo-biswas21 hours ago
My reading of that statement is their test, assuming they had one, looked something like this:
    rrs = resolver.resolve('www.example.test')
    assert Record("cname1.example.test", type="CNAME") in rrs
    assert Record("192.168.0.1", type="A") in rrs
Which wouldn't have caught the ordering problem.
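For contrast, an order-sensitive variant (same hypothetical resolver API as above) is only a couple of lines more, and would have flagged the change:

    rrs = resolver.resolve('www.example.test')
    types = [r.type for r in rrs]
    first_address = types.index("A")
    # every record before the first address record must be a CNAME
    assert all(t == "CNAME" for t in types[:first_address])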
hdjrudni20 hours ago
It's implied that they intentionally tested it that way, without any assertions on the order. Not by oversight or incompetence, but because they didn't want to bake the requirement in due to uncertainty.
account424 hours ago
That approach only makes sense if tests are immutable though. If you are unsure if the order matters you should still test for it so you get a reminder to re-check your assumptions when the order changes.
skywhopperjust now
It would be silly to stick that tightly to a 40-year-old standard. They can easily observe the behavior of every other public DNS resolver (they are Cloudflare, so gathering data on such a scale should be easy) and see how they return results.
Honestly, though, I’d be surprised if they actually even considered it. Everything about the article says to me that the engineer(s) who caused this problem are desperately trying to deflect blame for not having a comprehensive test suite. Sorry, but you don’t go tweaking order of results for such a long-standing, high volume, and crucial protocol just because the 40 year old spec isn’t clear about it.
mcfedr9 hours ago
it's pretty concerning that such a large organisation doesn't do any integration tests with their DNS infrastructure
ShroudedNight21 hours ago
I'm not an IETF process expert. Would this be worth filing errata against the original RFC in addition to their new proposed update?
Also, what's the right mental framework behind deciding when to release a patch RFC vs obsoleting the old standard for a comprehensive update?
hdjrudni20 hours ago
I don't know the official process, but as a human that sometimes reads and implements IETF RFCs, I'd appreciate updates to the original doc rather than replacing it with something brand new. Probably with some dated version history.
Otherwise I might go to consult my favorite RFC and not even know it's been superseded. And if it has been superseded with a brand new doc, now I have to start from scratch again instead of reading the diff or patch notes to figure out what needs updating.
And if we must supersede, I humbly request a warning be put at the top, linking the new standard.
ShroudedNight20 hours ago
At one point I could have sworn they were sticking obsoletion notices in the header, but now I can only find them in the right side-bar:
CloudFlare is a terrorist organization destroying the web.
0xbadcafebee13 hours ago
It's kind of weird that they didn't expect this. DNS resolvers are famously inconsistent, with changes sometimes working or not working, breaking or not breaking. Virtually any change you make to what DNS serves or how will cause inconsistent behavior somewhere. (DNS encompasses hundreds of RFCs)
runningmike19 hours ago
The end of this blog is... "To learn more about our mission to help build a better Internet,"
It's a pity they have to make an entirely new RFC, rather than amend the old RFC. Having independent RFCs and not a single unified "internet standard" under version control is a bit of a bummer in this manner.
paulddraper21 hours ago
> RFC 1034, published in 1987, defines much of the behavior of the DNS protocol, and should give us an answer on whether the order of CNAME records matters. Section 4.3.1 contains the following text:
> If recursive service is requested and available, the recursive response to a query will be one of the following:
> - The answer to the query, possibly preface by one or more CNAME RRs that specify aliases encountered on the way to an answer.
> While "possibly preface" can be interpreted as a requirement for CNAME records to appear before everything else, it does not use normative key words, such as MUST and SHOULD that modern RFCs use to express requirements. This isn’t a flaw in RFC 1034, but simply a result of its age. RFC 2119, which standardized these key words, was published in 1997, 10 years after RFC 1034.
It's pretty clear that CNAME is at the beginning.
The "possibly" does not refer to the order but rather to the presence.
If they are present, they are first.
kiwijamo17 hours ago
Some people (myself included) read that as "would ideally come first, but it is not necessary that it comes first". The language is not clear IMHO and could be worded better.
afiori7 hours ago
In my native language the literal translation of "possibly" has a distinct "preferably" meaning, but I feel that in English it does not.
It might be a victim of the polite/ironic/sarcastic influences on language that turn innocuous words into contronyms.
urbandw311er18 hours ago
The whole world knows this except Cloudflare who actually did know it but are now trying to pretend that they didn’t.
purwantoroa732 hours ago
Have you guys used Vercel + Cloudflare?
albert_e11 hours ago
The kind of "optimization" that Cloudflare is attempting to do here ... doesn't that transfer the burden of more expensive parsing downstream to all the DNS clients instead?
Sounds low-key selfish / inconsiderate to me
... to push such a change without adequate thought or informed buy-in by consumers of that service.
kunley5 hours ago
Yeah, but you know, they needed to save extra bytes in the Rust implementation of their services, so wherever Rust pops up it apparently justifies any such action. ;)
urbandw311er18 hours ago
I feel like they fucked it up then, when writing the post-mortem, went hunting for facts to retrospectively justify their previous decisions.
skywhopper3 hours ago
This all reads like an embarrassed engineer who can’t admit they neglected to have a comprehensive to-the-byte test suite for their second-most-important-on-the-Internet DNS server, overcompensating by blaming a 40-year-old standard that (1) they probably hadn’t consulted, and (2) no one else seems to have issues with; and proposing to update core Internet standards, rather than just accept that they made a mistake when they assumed they could just append to what any regular user of DNS expects to be a meaningfully-ordered list.
frumplestlatz21 hours ago
Given my years of experience with Cisco "quality", I'm not surprised by this:
> Another notable affected implementation was the DNSC process in three models of Cisco ethernet switches. In the case where switches had been configured to use 1.1.1.1 these switches experienced spontaneous reboot loops when they received a response containing the reordered CNAMEs.
... but I am surprised by this:
> One such implementation that broke is the getaddrinfo function in glibc, which is commonly used on Linux for DNS resolution.
Not that glibc did anything wrong -- I'm just surprised that anyone is implementing an internet-scale caching resolver without a comprehensive test suite that includes one of the most common client implementations on the planet.
therein21 hours ago
After the release got reverted, it took 1hr 28min for the deployment to propagate. You'd think that would be a very long time for CloudFlare infrastructure.
rhplus20 hours ago
We should probably all be glad that CloudFlare doesn't have the ability to update its entire global fleet any faster than 1h 28m, even if it’s a rollback operation.
Any change to a global service like that, even a rollback (or data deployment or config change), should be released to a subset of the fleet first, monitored, and then rolled out progressively.
tuetuopay20 hours ago
Given the seriousness of outages they make with instant worldwide deploys, I’m glad they took it calmly.
steve197720 hours ago
They had to update all the down detectors first.
renewiltord21 hours ago
Nice analysis. Boy I can’t imagine having to work at Cloudflare on this stuff. A month to get your “small in code” change out only to find some bums somewhere have written code that will make it not work.
stackskipton21 hours ago
Or, when working on massive infrastructure like this, you write plenty of tests that would have saved you a month's worth of work.
They write the reordering, push it, the glibc tester fires and fails, and you quickly discover "Crap, tests are failing and the dependency (glibc) doesn't work the way I thought it would."
jeroenhd6 hours ago
With glibc falling over but systemd-resolved working as intended, I suspect their Linux tests may have accidentally passed. Most desktop Linux installs and a whole lot of cloud Linux installs would've accidentally been saved from glibc's bug by systemd-resolved.
renewiltord21 hours ago
I suspect that if you could save them this time, they'd gladly pay you for it. It'll be a bit of a sell, but they seem like a fairly sensible org.
rjh2919 hours ago
It was glibc's resolver that failed - not exactly obscure. It wasn't properly tested or rolled out, plain and simple.
urbandw311er18 hours ago
Or — hot take — to find out that you made some silly misinterpretation of the RFC that you then felt the need to retrospectively justify.
dudeinjapan12 hours ago
Philosophers have agonized over this question since time immemorial.
inkyoto14 hours ago
This could be a great fit for Prolog, in fact, as it excels at the search.
Each resolved record would be asserted as a fact, and a tiny search implementation would run after all assertions have been made to resolve the IP address irrespective of the order in which the RRsets have arrived.
A micro Prolog implementation could be rolled into glibc's resolver (or a DNS resolver in general) to solve the problem once and for all.
PunchyHamster15 hours ago
TL;DR everyone implemented the RFC properly (if missing some defensive coding), Cloudflare decided it was optional and then learned that everyone did implement the RFC properly; just some also did additional work to make sure servers that got it wrong were still supported
torstenvl16 hours ago
EDIT: Why the drive-by downvotes? If someone thinks I'm wrong, I'm happy to hear why.
> One such implementation that broke is the getaddrinfo function in glibc, which is commonly used on Linux for DNS resolution.
> Most DNS clients don’t have this issue.
The most widespread implementation on the most widespread server operating system has the issue. I'm skeptical of what the author means by "Most DNS clients."
Also, what is the point of deploying to test if you aren't going to test against extremely common scenarios (like getaddrinfo)?
> To prevent any future incidents or confusion, we have written a proposal in the form of an Internet-Draft to be discussed at the IETF. If consensus is reached...
Pretty sure both Hyrum's Law and Postel's Law have reached the point of consensus.
Being conservative in what you emit means following the spec's most conservative interpretation, even if you think the way it's worded gives you some wiggle room. And the fact that your previous implementation did it that way for a decade means people have come to rely on it.
1vuio0pswjnm718 hours ago
"One such implementation that broke is the getaddrinfo function in glibc, which is commonly used on Linux for DNS resolution. When looking at its getanswer_r implementation, we can indeed see it expects to find the CNAME records before any answers:"
Wherever possible I compile with gethostbyname instead of getaddrinfo. I use musl instead of glibc
Nothing against IPv6 but I do not use it on the computers and networks I control
1vuio0pswjnm716 hours ago
NB. This is not code that belongs to me
When compiling software written by others, sometimes there are compile-time options that allow not using getaddrinfo or IPv6
For example,
links (--without-getaddrinfo)
haproxy (USE_GETADDRINFO="")
tnftp (--disable-ipv6)
elinks (--disable-ipv6)
wolfssl (ipv6 disabled by default)
stunnel (--disable-ipv6)
socat (--disable-ipv6)
and many more
Together with localhost TLS forward proxy I also use lots of older software that only used gethostbyname, e.g., original netcat, ucspi-tcp, libwww, original links, etc.
Generally I avoid mobile OS (corporate OS for data collection, surveillance and ad services)
Mobile data is disabled. I almost never use cellular networks for internet
Mobile sucks for internet IMHO; I have zero expectation re: speed and I cannot control what ISPs choose to do
For me, non-corporate UNIX-like OS are smaller, faster, easier to control, more interesting
immibis18 hours ago
Your code runs slower on mobile devices, since (as a rule of thumb) mobile networks are ipv6-only and ipv4 traffic has to pass through a few layers of tunneling.
1vuio0pswjnm712 hours ago
O5QXGIBLGIXC4LQK
charcircuit22 hours ago
Random DNS servers and clients being broken in weird ways is such a common problem and will probably never go away unless DNS is abandoned altogether.
It's surprising how something so simple can be so broken.
> The answer to the query, possibly preface by one or more CNAME RRs that specify aliases encountered on the way to an answer.
The "possibly preface" (sic!) to me is obviously to be understood as "if there are any CNAME RRs, the answer to the query is to be prefaced by those CNAME RRs" and not "you can preface the query with the CNAME RRs or you can place them wherever you want".
But also.. the programmers working on the software running one of the most important (end-user) DNS servers in the world:
1. Changes logic in how CNAME responses are formed
2. I assume some tests at least broke that meant they needed to be "fixed up" (y'know - "when a CNAME is queried, I expect this response")
3. No one saw these changes in test behavoir and thought "I wonder if this order is important". Or "We should research more into this", Or "Are other DNS servers changing order", Or "This should be flagged for a very gradual release".
4. Ends up in test environment for, what, a month.. nothing using getaddrinfo from glibc is being used to test this environment or anyone noticed that it was broken
Cloudflare seem to be getting into thr swing of breaking things and then being transparent. But this really reads as a fun "did you know", not a "we broke things again - please still use us".
There's no real RCA except to blame an RFC - but honestly, for a large-scale operation like there's this seems very big to slip through the cracks.
I would make a joke about South Park's oil "I'm sorry".. but they don't even seem to be
We used to say at work that the best way to get promoted was to be the programmer that introduced the bug into production and then fix it. Crazy if true here...
"Testing environment" sounds to me like a real network real user devices are used with (like the network used inside CloudFlare offices). That's what I would do if I was developing a DNS server anyway, other than unit tests (which obviously wouldn't catch this unless they were explicitly written for this case) and maybe integration/end-to-end tests, which might be running in Alpine Linux containers and as such using musl. If that's indeed the case, I can easily imagine how noone noticed anything was broken. First look at this line:
> Most DNS clients don’t have this issue. For example, systemd-resolved first parses the records into an ordered set:
Now think about what real end user devices are using: Windows/macOS/iOS obviously aren't using glibc and Android also has its own C library even though it's Linux-based, and they all probably fall under the "Most DNS clients don't have this issue.".
That leaves GNU/Linux, where we could reasonably expect most software to use glibc for resolving queries, so presumably anyone using Linux on their laptop would catch this right? Except most distributions started using systemd-resolved (most notable exception is Debian, but not many people use that on desktops/laptops), which is a locally-cached recursive DNS server, and as such acts as a middleman between glibc software and the network configured DNS server, so it would resolve 1.1.1.1 queries correctly, and then return the results from its cache ordered by its own ordering algorithm.
They absolutely should have unit tests that detect any change in output and manually review those changes for an operation of this size.
OP said:
"However, we did not have any tests asserting the behavior remains consistent due to the ambiguous language in the RFC."
One could guess it's something like -- back when we wrote the tests, years ago, whoever did it missed that this was required, not helped by the fact that the spec preceded RFC 2119, which standardized the all-caps "MUST"/"SHOULD" language that would have helped translate specs to tests more completely.
> "The order of RRs in a set is not significant, and need not be preserved by name servers, resolvers, or other parts of the DNS." [from RFC]
> However, RFC 1034 doesn’t clearly specify how message sections relate to RRsets.
The developer(s) were assuming order didn't matter in general, because the RFC said it didn't for one aspect, and intentionally made a change to the order for performance reasons. But it turned out that the order did matter.
Mistakes of this kind seem unavoidable; this one doesn't necessarily say to me that the developers made a mistake I never could have, or something.
I think the real conclusion is they probably need tests using actual live network stacks with common components, and why didn't they have those? Not just unit tests or tests with mocks, but tests that would have actually used the real getaddrinfo function in glibc and shown it failing.
This is the part that is shocking to me. How is getaddrinfo not called in any unit or system tests?
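For what it's worth, wiring that into a test is only a few lines of C. A minimal sketch, assuming a test record sitting behind a CNAME chain (the hostname below is a made-up placeholder) and resolv.conf pointed at the resolver build under test:

    /* Minimal end-to-end check through the system stub resolver.
       "cname-chain.test.example" is a hypothetical name behind a
       CNAME chain on the resolver being tested. */
    #include <stdio.h>
    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>

    int main(void)
    {
        struct addrinfo hints, *res;
        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;     /* ask for both A and AAAA */
        hints.ai_socktype = SOCK_STREAM;

        int rc = getaddrinfo("cname-chain.test.example", NULL, &hints, &res);
        if (rc != 0) {
            fprintf(stderr, "FAIL: getaddrinfo: %s\n", gai_strerror(rc));
            return 1;    /* the breakage in the article shows up here */
        }
        freeaddrinfo(res);
        puts("OK: glibc stub resolver handled the response");
        return 0;
    }

Run it inside a container or VM whose /etc/resolv.conf points at the resolver under test; pass/fail is just the exit code.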
I would hazard a guess that their test environment has both the systemd variant and the Unbound variant (Unbound technically does not reorder the records, but reconstructs the chain according to the RFC's "CNAME restart" logic, because it is a recursive resolver in itself), but not a plain, directly-piped resolv.conf (presumably because who would run that in this day and age? Sadly only half a joke, because only a few people fall into that category).
Which goes to show, one person’s “obvious understanding” is another’s “did they even read the entire document”.
All of which also serves to highlight the value of normative language, but that came later.
You might not find it ambiguous, but it is ambiguous, and there have been attempts to fix it. You can find a rather heated discussion about this topic here: https://mailarchive.ietf.org/arch/msg/dnsop/2USkYvbnSIQ8s2vf...
And perhaps this is somewhat pedantic, but they also write that “RFC 1034 section 3.6 defines Resource Record Sets (RRsets) as collections of records with the same name, type, and class.” But looking at the RFC, it never defines such a term; it does say that within a “set” of RRs “associated with a particular name” the order doesn’t matter. But even if the RFC had said “associated with a particular combination of name, type, and class”, I don’t see how that could have introduced ambiguity. It specifies an exception to a general rule, so obviously if the exception doesn’t apply, then the general rule must be followed.
Anyway, Cloudflare probably know their DNS better than I do, but I did not find the article especially persuasive; I think the ambiguity is actually just a misreading, and that the RFC does require a particular ordering of CNAME records.
(ETA:) Although admittedly, while the RFC does say that CNAMEs must come before As in the answer, I don’t necessarily see any clear rule about how CNAME chains must be ordered; the RFC just says “Domain names in RRs which point at another name should always point at the primary name and not the alias ... Of course, by the robustness principle, domain software should not fail when presented with CNAME chains or loops; CNAME chains should be followed”. So actually I guess I do agree that there is some ambiguity about the responses containing CNAME chains.
I just commented the same.
It's pretty clear that the "possibly" refers to the presence of the CNAME RRs, not the ordering.
"With a sufficient number of users of an API, it does not matter what you promise in the contract: all observable behaviors of your system will be depended on by somebody."
combined with failure to follow Postel's Law:
"Be conservative in what you send, be liberal in what you accept."
What's reasonable is: "Set reserved fields to 0 when writing and ignore them when reading." (I heard that was the original example). Or "Ignore unknown JSON keys" as a modern equivalent.
What's harmful is: Accept an ill defined superset of the valid syntax and interpret it in undocumented ways.
This is useful because it allows the ISA to remain compatible with code that is unaware of future extensions defining new functionality for these bits, so long as the zero value means "keep the old behavior". For example, a system register may have an EnableNewFeature bit, and older software will end up just writing zero to that field (which preserves the old functionality). This avoids needing to define a new system register for every new feature.
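To make the convention concrete, a tiny hypothetical sketch (the register address and field layout are invented for illustration, not taken from any real ISA):

    #include <stdint.h>

    /* Hypothetical memory-mapped 32-bit system register; the address and
       field layout here are invented purely for illustration. */
    #define SYSREG            (*(volatile uint32_t *)0x40000000u)
    #define SYSREG_MODE_MASK  0x3u   /* the only field this software knows about */

    void configure(uint32_t mode)
    {
        /* Everything outside the known field is written as zero, so a
           future EnableNewFeature bit defaults to "keep the old behavior". */
        SYSREG = (mode & SYSREG_MODE_MASK);
    }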
MCP seems to be a new round of the cycle beginning again.
I'm dead serious, we should be in a golden age of "programming in the large" formal protocols.
A weak warning that's just an entry in a scrolling console means nothing to end users and can be ignored by devs. A strong warning that comes out as a modal dialog can still be ignored by devs and then just annoys users. See the early era of Windows UAC for possibly the most widespread example of a strong warning added after the fact.
Through the appropriate channels; in-band and out-of-band.
It also does not give any way to actually see a warning message. Where would we even put it? I know for a fact that if my glibc DNS resolver started spitting out errors into /var/log/god_knows_what, it would take me days to find them. At best the resolver could return some kind of errno, with perror giving us a message like "The DNS response has not been correctly formatted", and then hope that the message is caught and forwarded through whatever is wrapping the C library, hopefully into our stderr. And there are so many ways even that could fail.
Start with milliseconds, move on to seconds and so on as the unwanted behavior continues.
In this case the broken resolver was the one in the GNU C Library, hardly an obscure situation!
The news here is sort of buried in the story. Basically Cloudflare just didn't test this. Literally every datacenter in the world was going to fail on this change, probably including their own.
I would expect most datacenters to use their own local recursive caching DNS servers instead of relying on 1.1.1.1 to minimize latency.
https://blog.cloudflare.com/zone-apex-naked-domain-root-doma... , and I quote directly ... "Never one to let a RFC stand in the way of a solution to a real problem, we're happy to announce that CloudFlare allows you to set your zone apex to a CNAME."
The problem? CNAMEs are name level aliases, not record level, so this "feature" would break the caching of NS, MX, and SOA records that exist at domain apexes. Many of us warned them at the time that this would result in a non-deterministic issue. At EC2 and Route 53 we weren't supporting this just to be mean! If a user's DNS resolver got an MX query before an A query, things might work ... but the other way around, they might not. An absolute nightmare to deal with. But move fast and break things, so hey :)
In earnest though ... it's great to see how Cloudflare are now handling CNAME chains and A record ordering issues in this kind of detail. I never would have thought of this implicit contract they've discovered, and it makes sense!
Related, the phrase "CNAME chains" brings back vague memories of confusion surrounding the concepts of "CNAME" and the casual usage of the term "alias". Without re-reading RFC 1034 today, I recall that my understanding back in the day was that the "C" was for "canonical", and that the name the CNAME resolved to must itself have an A record and not be another CNAME. I acknowledge the already discussed point that my "must" is doing a lot of lifting there, since the RFC in question predates the normative-language RFC itself.
So, I don't remember exactly the initial point I was trying to get at with my second paragraph; maybe there have always been various failure modes due to varying interpretations, which have only compounded with age, new blood, non-standard language in providers' self-serve DNS interfaces, etc., which I suppose only strengthens the "ambiguity" claim. That doesn't excuse such a large critical service provider, though, at all.
It is a nightmare, but the spec is the source of the nightmare.
CNAMES are a huge pain in the ass (as noted by DJB https://cr.yp.to/djbdns/notes.html)
(Yes, there are other recursive resolver implementations, but they look at BIND as the reference implementation and absent any contravention to the RFC or intentional design-level decisions, they would follow BIND's mechanism.)
Hey, where can I find A?
Answer: A is actually B
Answer: Also B can be found at 42
If a small business or cloud app can't resolve a domain because the domain is doing something different, it's much easier to blame DNS, use another DNS server, and move on. Or maybe just go "some Linuxes can't reach my website, oh well, sucks for the 1-3%".
Cloudflare is large enough that they caused issues for millions of devices all at once, so they had to investigate.
What's unclear to me is if they bothered to send patches to broken open-source DNS resolvers to fix this issue in the future.
> Based on what we have learned during this incident, we have reverted the CNAME re-ordering and do not intend to change the order in the future.
> To prevent any future incidents or confusion, we have written a proposal in the form of an Internet-Draft to be discussed at the IETF.
That is, explicitly documenting the "broken" behaviour as permitted.
Parsing the answer section in a single pass requires more finesse, but does it need fancier data structures than a string to string map? And failing that you can loop upon CNAME. I wouldn't call a depth limit like 20 "a rather low limit on the number of CNAMEs in a response", and max 20 passes through a max 64KB answer section is plenty fast.
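As a rough sketch of that loop-upon-CNAME idea (the record struct below is an invented stand-in for an already-parsed answer section, not glibc's actual data structures, and a plain linear scan stands in for the string-to-string map):

    #include <stdio.h>
    #include <strings.h>

    /* Simplified view of a parsed answer section: each record maps an
       owner name to either a CNAME target or an address string. */
    struct rr { const char *name; int is_cname; const char *data; };

    /* Chase CNAMEs from qname until an address record turns up.
       Wire order of the records does not matter; worst case is one
       extra pass per CNAME, bounded by the depth limit. */
    static const char *chase(const char *qname, const struct rr *ans, int n)
    {
        const char *cur = qname;
        for (int depth = 0; depth < 20; depth++) {
            for (int i = 0; i < n; i++) {
                if (strcasecmp(ans[i].name, cur) != 0)
                    continue;
                if (!ans[i].is_cname)
                    return ans[i].data;   /* found the final answer */
                cur = ans[i].data;        /* restart at the alias target */
                break;
            }
        }
        return NULL;
    }

    int main(void)
    {
        /* Toy answer with the CNAME placed after the address it points to. */
        struct rr ans[] = {
            { "b.example", 0, "192.0.2.42" },
            { "a.example", 1, "b.example"  },
        };
        const char *addr = chase("a.example", ans, 2);
        printf("%s\n", addr ? addr : "not found");
        return 0;
    }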
That seems like some doubling-down BS to me, since they earlier say "It's ambiguous because it doesn't use MUST or SHOULD, which was introduced a decade after the DNS RFC." The RFC says:
>The answer to the query, possibly preface by one or more CNAME RRs that specify aliases encountered on the way to an answer.
How do you get to interpreting that, in the face of "MUST" being defined a decade later, as "I guess I can append the CNAME to the answer"?
Holding onto "we still think the RFC allows it" is a problem. The world is a lot better if you can just admit to your mistakes and move on. I try to model this at home and at work, because trying to "language lawyer" your way out of being wrong makes the world a worse place.
I don't think they're advising anyone create both a CNAME and TXT at the same label - but it certainly looks like that from the weird screenshot at step 5 (which doesn't match the text).
I think it's mistakenly a mish-mash of two different guides, one for 'how to use a CNAME to point to a third party DMARC service entirely' and one for 'how to host the DMARC record yourself' (irrespective of where the RUA goes).
My main point was however that it's really not okay that CloudFlare allows setting up other record types (e.g. TXT, but basically any) next to a CNAME.
The more things change, the more things stay the same. :-)
That's the only reasonable conclusion, really.
I expect this is why BIND 9 has the 'servfail-ttl' option. [0]
Turns out that there's a standards-track RFC from 1998 that explicitly permits caching SERVFAIL responses. [1] Section 8 of that document suggests that this behavior was permitted by RFC 1034 (published back in 1987).
[0] <https://bind9.readthedocs.io/en/v9.18.42/reference.html#name...>
[1] <https://www.rfc-editor.org/rfc/rfc2308#section-7.1>
Please order the answer in the order the resolutions were performed to arrive at the final answer (regardless of cache timings). Anything else makes little sense, especially not in the name of some micro-optimization (which could likely be approached in other ways that don’t alter behaviour).
Something has been broken at Cloudflare for a couple of years now. It takes a very specific engineering culture to run the internet, and it's just not there anymore.
Nowhere does the RFC suggest that multiple CNAMEs need to be in a specific order.
It’s always DNS.
There's also so much of it, and it mostly works, most of the time. This creates a hysteresis loop in human judgement of efficacy: even a blind chicken gets corn if it's standing in it. Cisco bought cisco., but (a decade ago, when I had access to the firehose) on any given day belkin. would be in the top 10 TLDs if you looked at the NXDOMAIN traffic. Clients don't opportunistically try TCP (which they shouldn't, according to the specification...), but we have DoT (...but should in practice). My ISP's reverse DNS implementation is so bad that qname minimization breaks... but "nobody should be using qname minimization for reverse DNS", and "Spamhaus is breaking the law by casting shades at qname minimization".
"4096 ought to be enough for anybody" (no, frags are bad. see TCP above). There is only ever one request in a TCP connection... hey, what are these two bytes which are in front of the payload in my TCP connection? People who want to believe that their proprietary headers will be preserved if they forward an application protocol through an arbitrary number of intermediate proxy / forwarders (because that's way easier than running real DNS at the segment edge and logging client information at the application level).
Tangential, but: "But there's more to it, because people doing these things typically describe how it works for them (not how it doesn't work) and onlookers who don't pay close attention conclude "it works"." http://consulting.m3047.net/dubai-letters/dnstap-vs-pcap.htm...
Randomization within the final answer RRSet is fine (and maybe even preferred in a lot of cases)
We thought it was just the default NTP servers, but we had some reboots during this event because www.cisco.com was unavailable.
Honestly, it shouldn't matter. Anybody who's using a stub resolver where this matters, where /anything/ matters really, should be running their own local caching / recursing resolver. These oftentimes have options for e.g. ordering things for various reasons.
> To prevent any future incidents or confusion, we have written a proposal in the form of an Internet-Draft to be discussed at the IETF
Of course.
As a website host/maintainer, I am happy that the DNS 'ANY' query has been deprecated.
I am sure if you are a network engineer or ISP, then it probably annoys you no end.
As an aside, I am super annoyed at Cloudflare for calling their proxy records "CNAME" in their UI. Those are nothing like CNAMEs and have caused endless confusion.
Also no, the client doesn't need more memory to parse the out-of-order response, it can take multiple passes through the kilobyte.
Of course, if the server sends unrelated address records in the answer section, that will result in incorrect data. (A simple counter can detect the end of the answer section, so it's not necessary to chase CNAMEs for section separation.)
And I am also shocked that a Cisco switch goes into a reboot loop over this DNS ordering issue.
nitpicking at the RFCs when everyone knows DNS is a big old thing with lots going on
how do they not have basic integration tests to check how clients resolve
it seems very unlike cloudflare of old that was much more up front - there is no talk of the need to improve process, just blaming other people
Maybe I'm being overly-cynical but I have a hard time believing that they deliberately omitted a test specifically because they reviewed the RFC and found the ambiguous language. I would've expected to see some dialog with IETF beforehand if that were the case. Or some review of the behavior of common DNS clients.
It seems like an oversight, and that's totally fine.
Honestly, though, I’d be surprised if they actually even considered it. Everything about the article says to me that the engineer(s) who caused this problem are desperately trying to deflect blame for not having a comprehensive test suite. Sorry, but you don’t go tweaking order of results for such a long-standing, high volume, and crucial protocol just because the 40 year old spec isn’t clear about it.
Also, what's the right mental framework behind deciding when to release a patch RFC vs obsoleting the old standard for a comprehensive update?
Otherwise I might go to consult my favorite RFC and not even know it's been superseded. And if it has been superseded with a brand new doc, now I have to start from scratch again instead of reading the diff or patch notes to figure out what needs updating.
And if we must supersede, I humbly request a warning be put at the top, linking the new standard.
https://datatracker.ietf.org/doc/html/rfc5245
I agree, that it would be much more helpful if made obvious in the document itself.
It's not obvious that "updated by" notices are treated in any more of a helpful manner than "obsoletes"
Reminds me of https://news.ycombinator.com/item?id=37962674 or see https://tech.tiq.cc/2016/01/why-you-shouldnt-use-cloudflare/
> If recursive service is requested and available, the recursive response to a query will be one of the following:
> - The answer to the query, possibly preface by one or more CNAME RRs that specify aliases encountered on the way to an answer.
> While "possibly preface" can be interpreted as a requirement for CNAME records to appear before everything else, it does not use normative key words, such as MUST and SHOULD that modern RFCs use to express requirements. This isn’t a flaw in RFC 1034, but simply a result of its age. RFC 2119, which standardized these key words, was published in 1997, 10 years after RFC 1034.
It's pretty clear that CNAME is at the beginning.
The "possibly" does not refer to the order but rather to the presence.
If they are present, they come first.
It might be a victim of polite/ironic/sarcastic influences on language that turn innocuous words into contronyms.
Sounds low key selfish / inconsiderate to me
... to push such a change without adequate thought or informed buy in by consumers of that service.
> Another notable affected implementation was the DNSC process in three models of Cisco ethernet switches. In the case where switches had been configured to use 1.1.1.1 these switches experienced spontaneous reboot loops when they received a response containing the reordered CNAMEs.
... but I am surprised by this:
> One such implementation that broke is the getaddrinfo function in glibc, which is commonly used on Linux for DNS resolution.
Not that glibc did anything wrong -- I'm just surprised that anyone is implementing an internet-scale caching resolver without a comprehensive test suite that includes one of the most common client implementations on the planet.
Any change to a global service like that, even a rollback (or data deployment or config change), should be released to a subset of the fleet first, monitored, and then rolled out progressively.
They write the reordering, push it, and the glibc test fires and fails, and you quickly discover: "Crap, tests are failing and the dependency (glibc) doesn't work the way I thought it would."
Each resolved record would be asserted as a fact, and a tiny search implementation would run after all assertions have been made to resolve the IP address irrespective of the order in which the RRsets have arrived.
A micro Prolog implementation could be rolled into glibc's resolver (or a DNS resolver in general) to solve the problem once and for all.
> One such implementation that broke is the getaddrinfo function in glibc, which is commonly used on Linux for DNS resolution.
> Most DNS clients don’t have this issue.
The most widespread implementation on the most widespread server operating system has the issue. I'm skeptical of what the author means by "Most DNS clients."
Also, what is the point of deploying to test if you aren't going to test against extremely common scenarios (like getaddrinfo)?
> To prevent any future incidents or confusion, we have written a proposal in the form of an Internet-Draft to be discussed at the IETF. If consensus is reached...
Pretty sure both Hyrum's Law and Postel's Law have reached the point of consensus.
Being conservative in what you emit means following the spec's most conservative interpretation, even if you think the way it's worded gives you some wiggle room. And the fact that your previous implementation did it that way for a decade means people have come to rely on it.
Wherever possible I compile with gethostbyname instead of getaddrinfo. I use musl instead of glibc
Nothing against IPv6 but I do not use it on the computers and networks I control
When compiling software written by others, sometimes there are compile-time options that allow not using getaddrinfo or IPv6
For example,
links (--without-getaddrinfo)
haproxy (USE_GETADDRINFO="")
tnftp (--disable-ipv6)
elinks (--disable-ipv6)
wolfssl (ipv6 disabled by default)
stunnel (--disable-ipv6)
socat (--disable-ipv6)
and many more
Together with localhost TLS forward proxy I also use lots of older software that only used gethostbyname, e.g., original netcat, ucspi-tcp, libwww, original links, etc.
Generally I avoid mobile OS (corporate OS for data collection, surveillance and ad services)
Mobile data is disabled. I almost never use cellular networks for internet
Mobile sucks for internet IMHO; I have zero expectation re: speed and I cannot control what ISPs choose to do
For me, non-corporate UNIX-like OS are smaller, faster, easier to control, more interesting