Research and Development 1/^Archief/2009-2010/14/Server-to-server and Dialback specification


Research and Development 1

Patrick van Bommel
Sjaak Smetsers



Specification server to server contact in XMPP

Introduction

This specification was written by our group in order to clarify XMPP server to server communication. Even though all group members are Dutch, this specification is in English in the hope that it will be useful for more people.

There is NO WARRANTY that the information provided here is correct. We are NOT responsible for any negative impact this information might have, including but not limited to explosion of servers. This information comes from various sources, most notably [XMPP] and [XMPPDB]. Other information comes from our own experience talking to other servers.

By writing this down, we hope to achieve several goals. First and foremost, we hope to bring order to our knowledge, in order to improve our production tempo for the rest of our project. Second, we hope this document will become a good reference for us and others to look at when we have questions. Third, we hope this document will help others with the same problems.

In this document, we will assume basic knowledge of DNS and XML.

Authors

  • Rob ten Berge <r.berge@student.ru.nl>
  • Hans Harmannij <j.harmannij@student.ru.nl>
  • Sjors Gielen <s.gielen@student.ru.nl>

Setting up a server to server connection

Servers will often have to establish connections to other servers. A timer might go off, maybe the server pings another server on startup, or another peer on the network is taking some action that requires our server to talk to another server.

In this case, we assume a network on which all servers are unknown to each other. Consider, for example, the SMTP network. There are millions of SMTP servers around the world and only a handful actually know each other, for example because they are owned by the same company. On other networks, like IRC, servers might all know each other and have a form of authentication, "logging in" to each other, for example using shared secrets.

It's important, here, to know the difference between "identification" and "authentication". "Identification" is what two equal peers on a network do: they tell each other who they are. "Authentication" is a separate process that confirms that identification. XMPP requires peers on a network to identify and authenticate each other. Between servers otherwise unknown to each other, a common but somewhat insecure option for this is XMPP Dialback, described in [XMPPDB] and later in this document.

This document will start by describing exactly what steps are required to connect to another server. Usually, servers should also be able to connect to us. This is even required when we implement Dialback, described later in this document. A lot of behaviour we should assume when servers connect to us can be derived from this first text, but this document will also describe explicitly how to behave when a server connects to us, later on.

Finding the server and port to connect to

XMPP makes use of DNS SRV records as specified by [DNSSRV]. Similar to MX records (as in [SMTP]), they allow for automatic service discovery for a domain. However, MX records only specify the hostname providing the SMTP service for a domain, while SRV records allow specifying a hostname and a port number providing any service for a domain. The service and transport protocol we need are encoded in the name we request.

An example is the Google Wave Sandbox server. Google Wave is a collaboration protocol built upon XMPP. The Sandbox server is one of the official Wave servers set up by Google. If we want to connect to it, we first have to find its hostname and port. To do this, we ask DNS for the SRV record of the XMPP server-to-server service of that domain. XMPP servers use TCP, so the exact name to request is "_xmpp-server._tcp.wavesandbox.com". When we ask the command-line tool 'dig' to request this information for us, this is a possible reply:

; <<>> DiG 9.5.1-P3 <<>> srv _xmpp-server._tcp.wavesandbox.com
;; global options:  printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 42380
;; flags: qr rd ra; QUERY: 1, ANSWER: 5, AUTHORITY: 13, ADDITIONAL: 0

;; QUESTION SECTION:
;_xmpp-server._tcp.wavesandbox.com. IN	SRV

;; ANSWER SECTION:
_xmpp-server._tcp.wavesandbox.com. 86389 IN SRV	20 0 5268 xmpp-server2.l.google.com.
_xmpp-server._tcp.wavesandbox.com. 86389 IN SRV	20 0 5268 xmpp-server3.l.google.com.
_xmpp-server._tcp.wavesandbox.com. 86389 IN SRV	20 0 5268 xmpp-server4.l.google.com.
_xmpp-server._tcp.wavesandbox.com. 86389 IN SRV	5 0 5268 xmpp-server.l.google.com.
_xmpp-server._tcp.wavesandbox.com. 86389 IN SRV	20 0 5268 xmpp-server1.l.google.com.

The SRV records returned have four parameters: priority, weight, port and target. The records with the lowest priority value should be tried first. Among records with the same priority, one is picked at random, where a higher weight gives a record a proportionally higher chance of being picked; if connecting to it fails, the next one should be taken. In this case, we use "5 0 5268 xmpp-server.l.google.com.", the only record with priority 5. The two remaining parameters are the TCP port, 5268, and the hostname, xmpp-server.l.google.com. This hostname should then be resolved through its A or AAAA records to get the IP address to connect to. Usually, this will be done automatically by the system libraries.
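
The selection rule can be sketched in Python as follows. This is a minimal illustration, not a full [DNSSRV] implementation: the names SrvRecord, srv_query_name and order_srv_records are our own, the records are typed in by hand from the dig output above instead of being fetched from DNS, and adding 1 to every weight is a simplification so that records with weight 0 can still be picked.

```python
import random
from typing import List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    priority: int
    weight: int
    port: int
    target: str


def srv_query_name(domain: str) -> str:
    """Name to query for the server-to-server service of a domain."""
    return "_xmpp-server._tcp." + domain


def order_srv_records(records: List[SrvRecord],
                      rng: Optional[random.Random] = None) -> List[SrvRecord]:
    """Return the records in the order in which they should be tried.

    Lower priority values come first. Within one priority, records are
    drawn at random, with a chance roughly proportional to their weight.
    """
    rng = rng or random.Random()
    ordered = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        while group:
            # Simplification: weight + 1 keeps weight-0 records selectable.
            weights = [r.weight + 1 for r in group]
            pick = rng.choices(range(len(group)), weights=weights)[0]
            ordered.append(group.pop(pick))
    return ordered


# Records copied from the dig answer section above.
records = [
    SrvRecord(20, 0, 5268, "xmpp-server2.l.google.com."),
    SrvRecord(20, 0, 5268, "xmpp-server3.l.google.com."),
    SrvRecord(20, 0, 5268, "xmpp-server4.l.google.com."),
    SrvRecord(5, 0, 5268, "xmpp-server.l.google.com."),
    SrvRecord(20, 0, 5268, "xmpp-server1.l.google.com."),
]

attempts = order_srv_records(records)
print(srv_query_name("wavesandbox.com"))
print(attempts[0].target, attempts[0].port)  # xmpp-server.l.google.com. 5268
```

The first attempt is always the priority 5 record; the four priority 20 records follow in random order and are only needed when the first connection fails.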

Connecting to the server

Now, a TCP connection should be established to xmpp-server.l.google.com, port 5268. This is done by the system libraries, so it is outside the scope of this document. Once the connection is established, we should send the start of an XML stream.

XMPP consists of XML streams. In a TCP connection, each side starts an XML stream over which stanzas are sent. Stanzas, in XMPP, are complete XML elements, children of the XML stream element. At first, one TCP connection is enough for two servers to communicate. Everything one server sends to the other is part of, or forms, a complete XML stanza within the stream. When a server receives a complete XML stanza, it may respond with a complete XML stanza, or it may send an XML stanza for other reasons. When a stream eventually ends, a server sends the close tag, the other server also sends the close tag, and then the XML document is complete.

So right after we connect to a server, we start our stream. Only after we have started ours, the remote server will start their stream. This is where various server implementations differ. To date, there are only two versions of XMPP: pre-1.0 and 1.0. In this document, we will speak XMPP 1.0 as in [XMPP], but we must be prepared to service other servers too. This is described later in this document.

The <stream:stream> tag has a few required attributes, like the XML namespaces to use, the XMPP version, and our own domain name. A server must have a domain name it services. For example, for Jabber @gmail.com contacts, the domain is "gmail.com", and for MyFirstXmppService @xmppservice.myfirst.org, the domain is "xmppservice.myfirst.org". In this example, we will assume our domain name is "ruwave.org", and the domain we are connecting to is "acmewave.com". You might notice this is Google Wave-related, but that doesn't matter, as at this point XMPP servers act the same no matter the service running on top of XMPP.

We send our stream start tag. It contains a general XML namespace to indicate we are a server, and another general XML namespace to indicate we're speaking using a standard XML stream. We also send an XML namespace for "dialback", a service required by many XMPP servers, described later in this document.

<stream:stream	xmlns='jabber:server'
		xmlns:db='jabber:server:dialback'
		xmlns:stream='http://etherx.jabber.org/streams'
		version='1.0'
		from='ruwave.org'>
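
As a rough sketch, an opening tag like this one could be produced in Python as shown below. The function name stream_header is our own, and "ruwave.org" is the example domain assumed in the text. Note that the tag is deliberately left open: it is only closed when the stream ends.

```python
from xml.sax.saxutils import quoteattr


def stream_header(our_domain: str) -> str:
    """Build the opening <stream:stream> tag for a server-to-server stream."""
    attrs = [
        ("xmlns", "jabber:server"),
        ("xmlns:db", "jabber:server:dialback"),
        ("xmlns:stream", "http://etherx.jabber.org/streams"),
        ("version", "1.0"),
        ("from", our_domain),
    ]
    # quoteattr escapes the value and adds the surrounding quotes.
    rendered = " ".join(f"{name}={quoteattr(value)}" for name, value in attrs)
    return "<stream:stream " + rendered + ">"


header = stream_header("ruwave.org")
print(header)
```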

Promptly, the other server responds with a similar stream start.

<stream:stream	xmlns='jabber:server'
		xmlns:db='jabber:server:dialback'
		xmlns:stream='http://etherx.jabber.org/streams'
		version='1.0'
		from='acmewave.com'
		id='abcdefg'>

As you see, this server also speaks XMPP 1.0. Its identifier for this conversation is "abcdefg". This is not important at this point. Because this server speaks XMPP 1.0, it will also send a features tag right after this. Had the server been speaking XMPP pre-1.0, there would have been no 'version' attribute on the stream:stream tag, and it would not send this 'features' tag.

<stream:features>
	<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
	<dialback xmlns='urn:xmpp:features:dialback'><optional/></dialback>
</stream:features>

This server apparently supports both TLS and Dialback, the latter of which it deems optional.
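
Reading the stream start and the features can be sketched with an incremental XML parser, since the stream element stays open for the whole conversation. The example below uses Python's xml.etree.ElementTree.XMLPullParser; the received text is the example from above joined into one string, and the keys of the info dictionary are our own naming. Treating only the exact value "1.0" as XMPP 1.0 is a simplification.

```python
import xml.etree.ElementTree as ET

STREAM_NS = "http://etherx.jabber.org/streams"
TLS_NS = "urn:ietf:params:xml:ns:xmpp-tls"
DIALBACK_FEATURE_NS = "urn:xmpp:features:dialback"

received = (
    "<stream:stream xmlns='jabber:server' "
    "xmlns:db='jabber:server:dialback' "
    "xmlns:stream='http://etherx.jabber.org/streams' "
    "version='1.0' from='acmewave.com' id='abcdefg'>"
    "<stream:features>"
    "<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>"
    "<dialback xmlns='urn:xmpp:features:dialback'><optional/></dialback>"
    "</stream:features>"
)

parser = ET.XMLPullParser(events=("start", "end"))
parser.feed(received)  # the stream element is still open, which is fine

info = {}
for event, elem in parser.read_events():
    if event == "start" and elem.tag == f"{{{STREAM_NS}}}stream":
        info["peer"] = elem.get("from")
        info["stream_id"] = elem.get("id")
        # A pre-1.0 peer sends no 'version' attribute and no features.
        info["xmpp_1_0"] = elem.get("version") == "1.0"
    elif event == "end" and elem.tag == f"{{{STREAM_NS}}}features":
        info["starttls"] = elem.find(f"{{{TLS_NS}}}starttls") is not None
        dialback = elem.find(f"{{{DIALBACK_FEATURE_NS}}}dialback")
        info["dialback"] = dialback is not None
        info["dialback_optional"] = (
            dialback is not None
            and dialback.find(f"{{{DIALBACK_FEATURE_NS}}}optional") is not None
        )

print(info)
```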

Securing, identifying, authenticating

[XMPP] says the following (5.1):

Support for STARTTLS is REQUIRED in XMPP client and server implementations. An administrator of a given deployment MAY necessitate the use of TLS for client-to-server communication, server-to-server communication, or both. A deployed client SHOULD use TLS to secure its stream with a server prior to attempting the completion of SASL negotiation (Section 6), and deployed servers SHOULD use TLS between two domains for the purpose of securing server-to-server communication.

In our experience, some servers did not support TLS. Either way, this one does, so we ask to initiate TLS by sending the following stanza:

<starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>

Before doing anything, we will wait for the other side to confirm:

<proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>

We should now make a library call to enable TLS. For example, in Qt, we would now call the 'startClientEncryption' method on our socket, which should be a 'QSslSocket'. How you should do this in your case is outside the scope of this document. It is important to know, however, that many servers require the TLS certificate we present to be trusted, which means that it must not be a self-signed certificate. Other servers don't care and allow you to send any certificate.
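
For readers not using Qt, the same step could look roughly like this with Python's standard ssl module. This is only a sketch under our own assumptions: make_tls_context and upgrade_to_tls are hypothetical helper names, and the socket passed in is assumed to be the plain TCP socket on which <proceed/> has just been read.

```python
import socket
import ssl


def make_tls_context(require_trusted_cert: bool = True) -> ssl.SSLContext:
    """Create a client-side TLS context.

    By default the peer certificate must be trusted and match the peer's
    hostname. Passing False accepts any certificate, including
    self-signed ones.
    """
    context = ssl.create_default_context()
    if not require_trusted_cert:
        # check_hostname has to be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    # Our own certificate, if the peer asks for one, could be loaded with
    # context.load_cert_chain(certfile, keyfile) at this point.
    return context


def upgrade_to_tls(sock: socket.socket, peer_domain: str,
                   context: ssl.SSLContext) -> ssl.SSLSocket:
    """Wrap the plain socket in TLS; call only after <proceed/> arrived."""
    return context.wrap_socket(sock, server_hostname=peer_domain)


strict = make_tls_context()
lenient = make_tls_context(require_trusted_cert=False)
print(strict.verify_mode, lenient.verify_mode)
```

All further reads and writes go through the socket returned by upgrade_to_tls, starting with a fresh stream start tag.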

We will assume you have successfully set up a TLS connection. Inside the TLS connection, the old stream is forgotten and a new one is started. The process for this is no different from the earlier process, except that "starttls" should not be listed in <stream:features> this time around. For example, now the server sends:

<stream:features>
<dialback xmlns='urn:xmpp:features:dialback'><optional/></dialback>
</stream:features>

What happens here depends on the other server's policy. If a server's policy is to accept only trusted certificates in the TLS handshake, seeing a trusted certificate for a domain could be enough for that server to trust the identity of the other side. However, in our experience, servers almost always also require Dialback at this point.

TODO

Dialback contact in XMPP as per [XMPPDB]

Server Dialback is a feature of XMPP that provides weak identity verification using DNS and a secret string shared among the XMPP servers within a single domain. It can only be used to confirm that a server is part of a certain domain, and is therefore unsuitable for authentication. The order of events during Dialback in one direction is as follows:

  1. The Connecting Server makes a dialback key and sends it to the Receiving Server.
  2. The Receiving Server connects to an Authoritative Server on the domain the Connecting Server claims to be from.
  3. The Receiving Server sends the key to the Authoritative Server.
  4. The Authoritative Server determines if the key is from an XMPP server on its domain.
  5. The Authoritative Server sends back the result to the Receiving Server.
  6. The Receiving Server sends the dialback result to the Connecting Server.

If the key is deemed valid, this sequence is repeated with the roles of the Connecting Server and the Receiving Server switched around. Once the dialback result is valid in both directions, the servers have weakly identified themselves and the exchange of other XMPP stanzas may begin.

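This diagram from [XMPPDB] shows the first half of the sequence quite well. It calls the Connecting Server the Originating Server, and its four numbered steps correspond to steps 1, 3, 5 and 6 of the list above: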

Originating               Receiving
  Server                    Server
-----------               ---------
    |                          |
    |  [if necessary,          |
    |   perform DNS lookup     |
    |   on Target Domain,      |
    |   open TCP connection,   |
    |   and establish stream]  |
    | -----------------------> |
    |                          |                   Authoritative
    |   send dialback key      |                       Server
    | -------(STEP 1)--------> |                   -------------
    |                          |                          |
    |                          |  [if necessary,          |
    |                          |   perform DNS lookup,    |
    |                          |   on Sender Domain,      |
    |                          |   open TCP connection,   |
    |                          |   and establish stream]  |
    |                          | -----------------------> |
    |                          |                          |
    |                          |   send verify request    |
    |                          | -------(STEP 2)--------> |
    |                          |                          |
    |                          |   send verify response   |
    |                          | <------(STEP 3)--------- |
    |                          |
    |  report dialback result  |
    | <-------(STEP 4)-------- |
    |                          |
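
How the key is made and checked is not spelled out above. One possible scheme, modelled on the HMAC-based recommendation in later revisions of the dialback specification, is sketched below in Python. The function names, the example secret and the choice of inputs (both domain names plus the stream id) are assumptions for illustration; any scheme works as long as every server of the domain can recompute the key from the shared secret.

```python
import hashlib
import hmac


def dialback_key(secret: str, receiving_domain: str,
                 originating_domain: str, stream_id: str) -> str:
    """Derive a dialback key from the domain-wide shared secret."""
    # The secret itself is never used directly, only its SHA-256 digest.
    hashed_secret = hashlib.sha256(secret.encode("utf-8")).hexdigest()
    message = f"{receiving_domain} {originating_domain} {stream_id}"
    return hmac.new(hashed_secret.encode("ascii"),
                    message.encode("utf-8"),
                    hashlib.sha256).hexdigest()


def verify_dialback_key(secret: str, receiving_domain: str,
                        originating_domain: str, stream_id: str,
                        presented_key: str) -> bool:
    """What an Authoritative Server does: recompute the key and compare."""
    expected = dialback_key(secret, receiving_domain,
                            originating_domain, stream_id)
    # Constant-time comparison; surrounding whitespace is ignored.
    return hmac.compare_digest(expected.encode("ascii"),
                               presented_key.strip().encode("utf-8"))


# Hypothetical secret shared by all ruwave.org servers.
secret = "not-a-real-secret"
key = dialback_key(secret, "acmewave.com", "ruwave.org", "abcdefg")
print(key)
print(verify_dialback_key(secret, "acmewave.com", "ruwave.org", "abcdefg", key))
```

Because the key depends only on the secret and on values the Authoritative Server is told about, the Authoritative Server does not have to be the same machine as the Connecting Server.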

Example

The Connecting Server sends a dialback key to the Receiving Server:

<db:result
    from='ruwave.org'
    to='acmewave.com'>
    56vz1vb8zdr4bt8b4r7gz6rds5v1dbt7hkjykjuo654dez7se
</db:result>

The Receiving Server opens an XMPP connection to the Authoritative Server of ruwave.org (not shown) and sends:

<db:verify
     from='acmewave.com'
     to='ruwave.org'
     id='db01'>
    56vz1vb8zdr4bt8b4r7gz6rds5v1dbt7hkjykjuo654dez7se
</db:verify>

The Receiving Server is then told by ruwave.org's Authoritative Server whether the key is valid or invalid (valid here):

<db:verify
     from='ruwave.org'
     to='acmewave.com'
     id='db01'
     type='valid'/>

The Receiving Server then returns the result to the Connecting Server:

<db:result
    from='acmewave.com'
    to='ruwave.org'
    type='valid'/>

The identity of ruwave.org has now been weakly verified, and it's acmewave.com's turn to verify itself.

The Receiving Server sends a dialback key to the Connecting Server:

<db:result
    from='acmewave.com'
    to='ruwave.org'>
    1bxt65ju4khcl64hdzh4j7uk9ck4j6s4fs
</db:result>

The Connecting Server opens an XMPP connection to the Authoritative Server of acmewave.com (not shown) and sends:

<db:verify
     from='ruwave.org'
     to='acmewave.com'
     id='db001'>
    1bxt65ju4khcl64hdzh4j7uk9ck4j6s4fs
</db:verify>

The Connecting Server is then told by acmewave.com's Authoritative Server whether the key is valid or invalid (valid here again):

<db:verify
     from='acmewave.com'
     to='ruwave.org'
     id='db001'
     type='valid'/>

The Connecting Server then returns the result to the Receiving Server to finalize dialback for a single stream:

<db:result
    from='ruwave.org'
    to='acmewave.com'
    type='valid'/>

In this example both dialback results came back valid, thus this stream is now ready for the actual exchange of XMPP stanzas. If either of the results had been invalid, an error stanza would've been sent and the TCP connection terminated.
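
The 'from' and 'to' attributes in this exchange are easy to get backwards, so the following Python sketch generates the four elements of one direction from the two domain names. The helper names db_element and dialback_exchange are our own; the key and the id are taken from the first half of the example.

```python
from xml.sax.saxutils import escape, quoteattr


def db_element(name, sender, recipient, key=None, verify_id=None, result=None):
    """Build a <db:result/> or <db:verify/> element as text."""
    attrs = [("from", sender), ("to", recipient)]
    if verify_id is not None:
        attrs.append(("id", verify_id))
    if result is not None:
        attrs.append(("type", result))
    head = f"<db:{name} " + " ".join(f"{k}={quoteattr(v)}" for k, v in attrs)
    if key is None:
        return head + "/>"
    return f"{head}>{escape(key)}</db:{name}>"


def dialback_exchange(connecting, receiving, key, verify_id, result="valid"):
    """The four elements of dialback in one direction, in sending order."""
    return [
        # 1. Connecting Server to Receiving Server: here is my key.
        db_element("result", connecting, receiving, key=key),
        # 2. Receiving Server to the connecting domain's Authoritative Server.
        db_element("verify", receiving, connecting, key=key, verify_id=verify_id),
        # 3. Authoritative Server to Receiving Server: the verdict.
        db_element("verify", connecting, receiving, verify_id=verify_id, result=result),
        # 4. Receiving Server to Connecting Server: the verdict, passed on.
        db_element("result", receiving, connecting, result=result),
    ]


steps = dialback_exchange(
    "ruwave.org", "acmewave.com",
    "56vz1vb8zdr4bt8b4r7gz6rds5v1dbt7hkjykjuo654dez7se", "db01")
for step in steps:
    print(step)
```

Calling dialback_exchange with the two domain names swapped yields the second half of the exchange.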

SASL

SASL is used for authentication.

 TODO

Example

S: <stream:features>
     <mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>
       <mechanism>PLAIN</mechanism>
       <required/>
     </mechanisms>
   </stream:features>
C: <auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl'
            mechanism='PLAIN'>AHVzZXJvbmUAbWF5ZW50ZXI=</auth>
S: <success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'/>

Example where a client logs in successfully with username 'userone' and password 'mayenter'.
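
The text inside the <auth/> element is the base64 encoding of an empty authorization identity, the username and the password, separated by NUL bytes. A small Python sketch (the function names are our own) reproduces the value used in the example:

```python
import base64


def sasl_plain_response(username: str, password: str, authzid: str = "") -> str:
    """Encode the initial response for the SASL PLAIN mechanism."""
    raw = b"\x00".join(
        part.encode("utf-8") for part in (authzid, username, password))
    return base64.b64encode(raw).decode("ascii")


def decode_sasl_plain(response: str):
    """Server side: split a PLAIN response into (authzid, username, password)."""
    authzid, username, password = base64.b64decode(response).split(b"\x00")
    return authzid.decode("utf-8"), username.decode("utf-8"), password.decode("utf-8")


encoded = sasl_plain_response("userone", "mayenter")
print(encoded)                     # AHVzZXJvbmUAbWF5ZW50ZXI=
print(decode_sasl_plain(encoded))  # ('', 'userone', 'mayenter')
```

Since base64 is an encoding and not encryption, PLAIN should only be used inside a TLS-protected stream.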

 TODO

Resource Binding

As per [XMPP], a client should do resource binding after successfully authenticating using SASL.

Resource binding is a method to create a unique identifier for a stream between a server and a client. It combines the client's identifier (usually the username given during SASL negotiation), the domain name of the server the client has an account at, and the resource id chosen during the resource binding step into a full jabber id. The resulting jabber id must be unique among the streams currently open on the server. Resource binding must be advertised in the stream features by the server after a successful SASL negotiation with a client:

<stream:features>
        <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/>
</stream:features>

After this the client has two options: ask the server to generate the resource id, or generate the resource id itself. In the first case the client sends an iq stanza with an empty bind element:

<iq id='foodiebar' type='set'>
    <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/>
</iq>

The server then gives the resource id by returning the full jabber id:

<iq id='foodiebar' type='result'>
    <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
        <jid>
             foobar@dazjorz.com/0dbe485be953a
        </jid>
    </bind>
</iq>

In the second case the client generates a resource id and sends that to the server:

<iq id='foodiebar' type='set'>
    <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
        <resource>iliekcookiez</resource>
    </bind>
</iq>

The server then returns the full jabber id, provided there was no error and no policy forbids client-generated resource ids:

<iq id='foodiebar' type='result'>
    <bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
        <jid>
             foobar@dazjorz.com/iliekcookiez
        </jid>
    </bind>
</iq>
 TODO
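
On the server side, both cases come down to picking a resource id and checking that the resulting full jabber id is not in use yet. A minimal Python sketch, assuming the server keeps the full jabber ids of all open streams in a set (bind_resource and active_jids are hypothetical names):

```python
import secrets
from typing import Optional, Set


def bind_resource(active_jids: Set[str], user: str, domain: str,
                  requested: Optional[str] = None) -> str:
    """Return the full jabber id for a new stream and mark it as in use."""
    bare = f"{user}@{domain}"
    if requested:
        # Second case: the client proposed a resource id.
        full = f"{bare}/{requested}"
        if full in active_jids:
            raise ValueError("resource already in use: " + full)
    else:
        # First case: the server generates a random resource id.
        full = f"{bare}/{secrets.token_hex(8)}"
        while full in active_jids:
            full = f"{bare}/{secrets.token_hex(8)}"
    active_jids.add(full)
    return full


active = set()
print(bind_resource(active, "foobar", "dazjorz.com", "iliekcookiez"))
print(bind_resource(active, "foobar", "dazjorz.com"))
```

A real server would answer a conflict with an error stanza instead of raising an exception; which error applies is not covered here.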

Session establishment

XMPP 3921bis ([XMPPIM]) says the following:

     Note: [RFC3921] specified one additional precondition: formal
     establishment of an instant messaging and presence session.
     Implementation and deployment experience has shown that this
     additional step is unnecessary.  However, for backward
     compatibility an implementation SHOULD still offer that feature
     and note in the stream feature that negotiation of the feature is
     discretionary (via the <optional/> child element).  This enables
     older software to connect while saving newer software to skip a
     round trip.

Therefore, adding session support to a server implementing RFC 3921bis is as simple as implementing these two steps:

  • 1. Advertise optional session support in the stream:features. When a client is authenticated using SASL, append this to your stream:features:
 <session xmlns="urn:ietf:params:xml:ns:xmpp-session"><optional/></session>
  • 2. Respond to the session set IQ request. This can be pretty much a direct reply using only two variables, our own hostname and the IQ request ID.
 C: <iq id="my_iq_id" to="example.org" type="set"><session xmlns="urn:ietf:params:xml:ns:xmpp-session"/></iq>
 S: <iq from='example.org' type='result' id='my_iq_id' />

After this, the session has been established and even older clients will have initialized the connection completely.
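
Both steps fit in a few lines of Python. This is a sketch only: SESSION_FEATURE and session_result are our own names, and parsing the incoming IQ to obtain its id is left out.

```python
from xml.sax.saxutils import quoteattr

# Step 1: appended to <stream:features> once SASL has succeeded.
SESSION_FEATURE = (
    '<session xmlns="urn:ietf:params:xml:ns:xmpp-session">'
    '<optional/></session>'
)


def session_result(our_host: str, request_id: str) -> str:
    """Step 2: the reply to a session 'set' IQ, echoing the request id."""
    return f'<iq from={quoteattr(our_host)} type="result" id={quoteattr(request_id)}/>'


reply = session_result("example.org", "my_iq_id")
print(reply)  # <iq from="example.org" type="result" id="my_iq_id"/>
```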

Sources