Commit 439781aa authored by Bob Halley's avatar Bob Halley
Browse files

add

parent 10a05eab
Network Working Group P. Vixie
Request for Comments: 2671 ISC
Category: Standards Track August 1999
Extension Mechanisms for DNS (EDNS0)
Status of this Memo
This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved.
Abstract
The Domain Name System's wire protocol includes a number of fixed
fields whose range has been or soon will be exhausted and does not
allow clients to advertise their capabilities to servers. This
document describes backward compatible mechanisms for allowing the
protocol to grow.
1 - Rationale and Scope
1.1. DNS (see [RFC1035]) specifies a Message Format and within such
messages there are standard formats for encoding options, errors,
and name compression. The maximum allowable size of a DNS Message
is fixed. Many of DNS's protocol limits are too small for uses
which are or which are desired to become common. There is no way
for implementations to advertise their capabilities.
1.2. Existing clients will not know how to interpret the protocol
extensions detailed here. In practice, these clients will be
upgraded when they have need of a new feature, and only new
features will make use of the extensions. We must however take
account of client behaviour in the face of extra fields, and design
a fallback scheme for interoperability with these clients.
Vixie Standards Track [Page 1]
RFC 2671 Extension Mechanisms for DNS (EDNS0) August 1999
2 - Affected Protocol Elements
2.1. The DNS Message Header's (see [RFC1035 4.1.1]) second full 16-bit
word is divided into a 4-bit OPCODE, a 4-bit RCODE, and a number of
1-bit flags. The original reserved Z bits have been allocated to
various purposes, and most of the RCODE values are now in use.
More flags and more possible RCODEs are needed.
2.2. The first two bits of a wire format domain label are used to denote
the type of the label. [RFC1035 4.1.4] allocates two of the four
possible types and reserves the other two. Proposals for use of
the remaining types far outnumber those available. More label
types are needed.
2.3. DNS Messages are limited to 512 octets in size when sent over UDP.
While the minimum maximum reassembly buffer size still allows a
limit of 512 octets of UDP payload, most of the hosts now connected
to the Internet are able to reassemble larger datagrams. Some
mechanism must be created to allow requestors to advertise larger
buffer sizes to responders.
3 - Extended Label Types
3.1. The "0 1" label type will now indicate an extended label type,
whose value is encoded in the lower six bits of the first octet of
a label. All subsequently developed label types should be encoded
using an extended label type.
3.2. The "1 1 1 1 1 1" extended label type will be reserved for future
expansion of the extended label type code space.
4 - OPT pseudo-RR
4.1. One OPT pseudo-RR can be added to the additional data section of
either a request or a response. An OPT is called a pseudo-RR
because it pertains to a particular transport level message and not
to any actual DNS data. OPT RRs shall never be cached, forwarded,
or stored in or loaded from master files. The quantity of OPT
pseudo-RRs per message shall be either zero or one, but not
greater.
4.2. An OPT RR has a fixed part and a variable set of options expressed
as {attribute, value} pairs. The fixed part holds some DNS meta
data and also a small collection of new protocol elements which we
expect to be so popular that it would be a waste of wire space to
encode them as {attribute, value} pairs.
Vixie Standards Track [Page 2]
RFC 2671 Extension Mechanisms for DNS (EDNS0) August 1999
4.3. The fixed part of an OPT RR is structured as follows:
Field Name Field Type Description
------------------------------------------------------
NAME domain name empty (root domain)
TYPE u_int16_t OPT
CLASS u_int16_t sender's UDP payload size
TTL u_int32_t extended RCODE and flags
RDLEN u_int16_t describes RDATA
RDATA octet stream {attribute,value} pairs
4.4. The variable part of an OPT RR is encoded in its RDATA and is
structured as zero or more of the following:
+0 (MSB) +1 (LSB)
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0: | OPTION-CODE |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
2: | OPTION-LENGTH |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
4: | |
/ OPTION-DATA /
/ /
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
OPTION-CODE (Assigned by IANA.)
OPTION-LENGTH Size (in octets) of OPTION-DATA.
OPTION-DATA Varies per OPTION-CODE.
4.5. The sender's UDP payload size (which OPT stores in the RR CLASS
field) is the number of octets of the largest UDP payload that can
be reassembled and delivered in the sender's network stack. Note
that path MTU, with or without fragmentation, may be smaller than
this.
4.5.1. Note that a 512-octet UDP payload requires a 576-octet IP
reassembly buffer. Choosing 1280 on an Ethernet connected
requestor would be reasonable. The consequence of choosing too
large a value may be an ICMP message from an intermediate
gateway, or even a silent drop of the response message.
4.5.2. Both requestors and responders are advised to take account of the
path's discovered MTU (if already known) when considering message
sizes.
Vixie Standards Track [Page 3]
RFC 2671 Extension Mechanisms for DNS (EDNS0) August 1999
4.5.3. The requestor's maximum payload size can change over time, and
should therefore not be cached for use beyond the transaction in
which it is advertised.
4.5.4. The responder's maximum payload size can change over time, but
can be reasonably expected to remain constant between two
sequential transactions; for example, a meaningless QUERY to
discover a responder's maximum UDP payload size, followed
immediately by an UPDATE which takes advantage of this size.
(This is considered preferrable to the outright use of TCP for
oversized requests, if there is any reason to suspect that the
responder implements EDNS, and if a request will not fit in the
default 512 payload size limit.)
4.5.5. Due to transaction overhead, it is unwise to advertise an
architectural limit as a maximum UDP payload size. Just because
your stack can reassemble 64KB datagrams, don't assume that you
want to spend more than about 4KB of state memory per ongoing
transaction.
4.6. The extended RCODE and flags (which OPT stores in the RR TTL field)
are structured as follows:
+0 (MSB) +1 (LSB)
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
0: | EXTENDED-RCODE | VERSION |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
2: | Z |
+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+
EXTENDED-RCODE Forms upper 8 bits of extended 12-bit RCODE. Note
that EXTENDED-RCODE value "0" indicates that an
unextended RCODE is in use (values "0" through "15").
VERSION Indicates the implementation level of whoever sets
it. Full conformance with this specification is
indicated by version "0." Requestors are encouraged
to set this to the lowest implemented level capable
of expressing a transaction, to minimize the
responder and network load of discovering the
greatest common implementation level between
requestor and responder. A requestor's version
numbering strategy should ideally be a run time
configuration option.
If a responder does not implement the VERSION level
of the request, then it answers with RCODE=BADVERS.
All responses will be limited in format to the
Vixie Standards Track [Page 4]
RFC 2671 Extension Mechanisms for DNS (EDNS0) August 1999
VERSION level of the request, but the VERSION of each
response will be the highest implementation level of
the responder. In this way a requestor will learn
the implementation level of a responder as a side
effect of every response, including error responses,
including RCODE=BADVERS.
Z Set to zero by senders and ignored by receivers,
unless modified in a subsequent specification.
5 - Transport Considerations
5.1. The presence of an OPT pseudo-RR in a request should be taken as an
indication that the requestor fully implements the given version of
EDNS, and can correctly understand any response that conforms to
that feature's specification.
5.2. Lack of use of these features in a request must be taken as an
indication that the requestor does not implement any part of this
specification and that the responder may make no use of any
protocol extension described here in its response.
5.3. Responders who do not understand these protocol extensions are
expected to send a response with RCODE NOTIMPL, FORMERR, or
SERVFAIL. Therefore use of extensions should be "probed" such that
a responder who isn't known to support them be allowed a retry with
no extensions if it responds with such an RCODE. If a responder's
capability level is cached by a requestor, a new probe should be
sent periodically to test for changes to responder capability.
6 - Security Considerations
Requestor-side specification of the maximum buffer size may open a
new DNS denial of service attack if responders can be made to send
messages which are too large for intermediate gateways to forward,
thus leading to potential ICMP storms between gateways and
responders.
7 - IANA Considerations
The IANA has assigned RR type code 41 for OPT.
It is the recommendation of this document and its working group
that IANA create a registry for EDNS Extended Label Types, for EDNS
Option Codes, and for EDNS Version Numbers.
This document assigns label type 0b01xxxxxx as "EDNS Extended Label
Type." We request that IANA record this assignment.
Vixie Standards Track [Page 5]
RFC 2671 Extension Mechanisms for DNS (EDNS0) August 1999
This document assigns extended label type 0bxx111111 as "Reserved
for future extended label types." We request that IANA record this
assignment.
This document assigns option code 65535 to "Reserved for future
expansion."
This document expands the RCODE space from 4 bits to 12 bits. This
will allow IANA to assign more than the 16 distinct RCODE values
allowed in [RFC1035].
This document assigns EDNS Extended RCODE "16" to "BADVERS".
IESG approval should be required to create new entries in the EDNS
Extended Label Type or EDNS Version Number registries, while any
published RFC (including Informational, Experimental, or BCP)
should be grounds for allocation of an EDNS Option Code.
8 - Acknowledgements
Paul Mockapetris, Mark Andrews, Robert Elz, Don Lewis, Bob Halley,
Donald Eastlake, Rob Austein, Matt Crawford, Randy Bush, and Thomas
Narten were each instrumental in creating and refining this
specification.
9 - References
[RFC1035] Mockapetris, P., "Domain Names - Implementation and
Specification", STD 13, RFC 1035, November 1987.
10 - Author's Address
Paul Vixie
Internet Software Consortium
950 Charter Street
Redwood City, CA 94063
Phone: +1 650 779 7001
EMail: vixie@isc.org
Vixie Standards Track [Page 6]
RFC 2671 Extension Mechanisms for DNS (EDNS0) August 1999
11 - Full Copyright Statement
Copyright (C) The Internet Society (1999). All Rights Reserved.
This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph are
included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.
The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.
This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Acknowledgement
Funding for the RFC Editor function is currently provided by the
Internet Society.
Vixie Standards Track [Page 7]
Network Working Group M. Crawford
Request for Comments: 2672 Fermilab
Category: Standards Track August 1999
Non-Terminal DNS Name Redirection
Status of this Memo
This document specifies an Internet standards track protocol for the
Internet community, and requests discussion and suggestions for
improvements. Please refer to the current edition of the "Internet
Official Protocol Standards" (STD 1) for the standardization state
and status of this protocol. Distribution of this memo is unlimited.
Copyright Notice
Copyright (C) The Internet Society (1999). All Rights Reserved.
1. Introduction
This document defines a new DNS Resource Record called "DNAME", which
provides the capability to map an entire subtree of the DNS name
space to another domain. It differs from the CNAME record which maps
a single node of the name space.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [KWORD].
2. Motivation
This Resource Record and its processing rules were conceived as a
solution to the problem of maintaining address-to-name mappings in a
context of network renumbering. Without the DNAME mechanism, an
authoritative DNS server for the address-to-name mappings of some
network must be reconfigured when that network is renumbered. With
DNAME, the zone can be constructed so that it needs no modification
when renumbered. DNAME can also be useful in other situations, such
as when an organizational unit is renamed.
3. The DNAME Resource Record
The DNAME RR has mnemonic DNAME and type code 39 (decimal).
Crawford Standards Track [Page 1]
RFC 2672 Non-Terminal DNS Name Redirection August 1999
DNAME has the following format:
<owner> <ttl> <class> DNAME <target>
The format is not class-sensitive. All fields are required. The
RDATA field <target> is a <domain-name> [DNSIS].
The DNAME RR causes type NS additional section processing.
The effect of the DNAME record is the substitution of the record's
<target> for its <owner> as a suffix of a domain name. A "no-
descendants" limitation governs the use of DNAMEs in a zone file:
If a DNAME RR is present at a node N, there may be other data at N
(except a CNAME or another DNAME), but there MUST be no data at
any descendant of N. This restriction applies only to records of
the same class as the DNAME record.
This rule assures predictable results when a DNAME record is cached
by a server which is not authoritative for the record's zone. It
MUST be enforced when authoritative zone data is loaded. Together
with the rules for DNS zone authority [DNSCLR] it implies that DNAME
and NS records can only coexist at the top of a zone which has only
one node.
The compression scheme of [DNSIS] MUST NOT be applied to the RDATA
portion of a DNAME record unless the sending server has some way of
knowing that the receiver understands the DNAME record format.
Signalling such understanding is expected to be the subject of future
DNS Extensions.
Naming loops can be created with DNAME records or a combination of
DNAME and CNAME records, just as they can with CNAME records alone.
Resolvers, including resolvers embedded in DNS servers, MUST limit
the resources they devote to any query. Implementors should note,
however, that fairly lengthy chains of DNAME records may be valid.
4. Query Processing
To exploit the DNAME mechanism the name resolution algorithms [DNSCF]
must be modified slightly for both servers and resolvers.
Both modified algorithms incorporate the operation of making a
substitution on a name (either QNAME or SNAME) under control of a
DNAME record. This operation will be referred to as "the DNAME
substitution".
Crawford Standards Track [Page 2]
RFC 2672 Non-Terminal DNS Name Redirection August 1999
4.1. Processing by Servers
For a server performing non-recursive service steps 3.c and 4 of
section 4.3.2 [DNSCF] are changed to check for a DNAME record before
checking for a wildcard ("*") label, and to return certain DNAME
records from zone data and the cache.
DNS clients sending Extended DNS [EDNS0] queries with Version 0 or
non-extended queries are presumed not to understand the semantics of
the DNAME record, so a server which implements this specification,
when answering a non-extended query, SHOULD synthesize a CNAME record
for each DNAME record encountered during query processing to help the
client reach the correct DNS data. The behavior of clients and
servers under Extended DNS versions greater than 0 will be specified
when those versions are defined.
The synthesized CNAME RR, if provided, MUST have
The same CLASS as the QCLASS of the query,
TTL equal to zero,
An <owner> equal to the QNAME in effect at the moment the DNAME RR
was encountered, and
An RDATA field containing the new QNAME formed by the action of
the DNAME substitution.
If the server has the appropriate key on-line [DNSSEC, SECDYN], it
MAY generate and return a SIG RR for the synthesized CNAME RR.
The revised server algorithm is:
1. Set or clear the value of recursion available in the response
depending on whether the name server is willing to provide
recursive service. If recursive service is available and
requested via the RD bit in the query, go to step 5, otherwise
step 2.
2. Search the available zones for the zone which is the nearest
ancestor to QNAME. If such a zone is found, go to step 3,
otherwise step 4.
3. Start matching down, label by label, in the zone. The matching
process can terminate several ways:
Crawford Standards Track [Page 3]
RFC 2672 Non-Terminal DNS Name Redirection August 1999
a. If the whole of QNAME is matched, we have found the node.
If the data at the node is a CNAME, and QTYPE doesn't match
CNAME, copy the CNAME RR into the answer section of the
response, change QNAME to the canonical name in the CNAME RR,
and go back to step 1.
Otherwise, copy all RRs which match QTYPE into the answer
section and go to step 6.
b. If a match would take us out of the authoritative data, we have
a referral. This happens when we encounter a node with NS RRs
marking cuts along the bottom of a zone.
Copy the NS RRs for the subzone into the authority section of
the reply. Put whatever addresses are available into the
additional section, using glue RRs if the addresses are not
available from authoritative data or the cache. Go to step 4.
c. If at some label, a match is impossible (i.e., the
corresponding label does not exist), look to see whether the
last label matched has a DNAME record.
If a DNAME record exists at that point, copy that record into
the answer section. If substitution of its <target> for its
<owner> in QNAME would overflow the legal size for a <domain-
name>, set RCODE to YXDOMAIN [DNSUPD] and exit; otherwise
perform the substitution and continue. If the query was not
extended [EDNS0] with a Version indicating understanding of the
DNAME record, the server SHOULD synthesize a CNAME record as
described above and include it in the answer section. Go back
to step 1.
If there was no DNAME record, look to see if the "*" label
exists.
If the "*" label does not exist, check whether the name we are
looking for is the original QNAME in the query or a name we
have followed due to a CNAME. If the name is original, set an
authoritative name error in the response and exit. Otherwise
just exit.
If the "*" label does exist, match RRs at that node against
QTYPE. If any match, copy them into the answer section, but
set the owner of the RR to be QNAME, and not the node with the
"*" label. Go to step 6.
Crawford Standards Track [Page 4]
RFC 2672 Non-Terminal DNS Name Redirection August 1999
4. Start matching down in the cache. If QNAME is found in the cache,
copy all RRs attached to it that match QTYPE into the answer
section. If QNAME is not found in the cache but a DNAME record is
present at an ancestor of QNAME, copy that DNAME record into the
answer section. If there was no delegation from authoritative
data, look for the best one from the cache, and put it in the
authority section. Go to step 6.
5. Use the local resolver or a copy of its algorithm (see resolver
section of this memo) to answer the query. Store the results,
including any intermediate CNAMEs and DNAMEs, in the answer
section of the response.
6. Using local data only, attempt to add other RRs which may be
useful to the additional section of the query. Exit.
Note that there will be at most one ancestor with a DNAME as
described in step 4 unless some zone's data is in violation of the
no-descendants limitation in section 3. An implementation might take
advantage of this limitation by stopping the search of step 3c or
step 4 when a DNAME record is encountered.
4.2. Processing by Resolvers
A resolver or a server providing recursive service must be modified
to treat a DNAME as somewhat analogous to a CNAME. The resolver
algorithm of [DNSCF] section 5.3.3 is modified to renumber step 4.d
as 4.e and insert a new 4.d. The complete algorithm becomes:
1. See if the answer is in local information, and if so return it to
the client.
2. Find the best servers to ask.