josuah.net

Choosing the destination MX

How does qmail choose its destination mail server?

In qmail-remote.c, a variable prefme is compared with the .pref field from each item of an ipalloc struct (an array of struct { ip; pref; }).

qmail-remote.c
...
333   static ipalloc ip = {0};
...
396   for (i = 0;i < ip.len;++i)
397     if (ipme_is(&ip.ix[i].ip))
398       if (ip.ix[i].pref < prefme)
399         prefme = ip.ix[i].pref;
...

What is .pref and where does it get its data from?

In dns.c, there is a function dns_mxip() (still with K&R declaration):

dns.c
...
312 int dns_mxip(ia,sa,random)
313 ipalloc *ia;
314 stralloc *sa;
315 unsigned long random;
316 {
...

Inside, we have a call to findmx(), iterating on all the MX records found from DNS.

dns.c
...
350  while ((r = findmx(T_MX)) != 2)
351   {
352    if (r == DNS_SOFT) { alloc_free(mx); return DNS_SOFT; }
353    if (r == 1)
...

The position, preference and IP of the MX record are passed through static global variables to dns_mxip() (why not... no threads here).

dns.c
...
 25 static union { HEADER hdr; unsigned char buf[PACKETSZ]; } response;
 26 static int responselen;
 27 static unsigned char *responseend;
 28 static unsigned char *responsepos;
 29
 30 static int numanswers;
 31 static char name[MAXDNAME];
 32 static struct ip_address ip;
 33 unsigned short pref;
...

For each result, ia is filled with the IPs one by one through static int dns_ipplus(ipalloc *ia, stralloc *sa, int pref):

dns.c
...
369  while (nummx > 0)
370   {
...
389    switch(dns_ipplus(ia,&mx[i].sa,mx[i].p))
390     {
391      case DNS_MEM: case DNS_SOFT:
392        flagsoft = 1; break;
393     }
...
397   }

And that is where qmail-remote gets its list of preferences: from DNS only, with no configuration affecting ip.ix[].pref at all.

What is the effect of comparing prefme with each ip.ix[].pref from DNS?

Just below if (flagallaliases) prefme = 500000;, it looks for the first ip.ix[i] entry whose pref is lower than the prefme threshold.

qmail-remote.c
...
404   for (i = 0;i < ip.len;++i)
405     if (ip.ix[i].pref < prefme)
406       break;
...

Just above if (flagallaliases) prefme = 500000;, the loop finds the lowest pref value among the IPs matching those of the server itself (collected in ipme.c through SIOCGIFCONF) and stores it into prefme.

qmail-remote.c
...
395   prefme = 100000;
396   for (i = 0;i < ip.len;++i)
397     if (ipme_is(&ip.ix[i].ip))
398       if (ip.ix[i].pref < prefme)
399         prefme = ip.ix[i].pref;
...

ipme.c
...
 42   struct ip_mx ix;
...
 97       byte_copy(&ix.ip,4,&sin->sin_addr);
 98       if (ioctl(s,SIOCGIFFLAGS,x) == 0)
 99         if (ifr->ifr_flags & IFF_UP)
100           if (!ipalloc_append(&ipme,&ix)) { close(s); return 0; }
...

That filters out every MX entry whose preference value is not lower than that of qmail's own IPs, that is, every entry that is equally or less preferred. The effect looks like catching the case where qmail-remote has to send a mail out, but the server it sends it to has the same IP (in other words: sending a mail to myself through IP).

It is still possible to send a mail to another mail server among the MX entries, even if an IP of a local interface is in the pool of MX responses: that server needs a lower MX preference value than the one that matches our IPs.

This looks like a safeguard against misconfiguration: a mail for an IP that qmail itself listens on needs to be sent through qmail-local, not through qmail-remote!
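
The two loops can be condensed into a small standalone sketch. The types below are simplified stand-ins for qmail's struct ip_mx and ipalloc, and is_local() is a hypothetical replacement for ipme_is() that compares against one hard-coded address instead of asking the kernel for the interface list. The function names find_prefme() and first_candidate() are made up for this example.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for qmail's struct ip_mx: an IPv4 address
 * and the MX preference it was found under. */
struct mx_entry {
    unsigned char ip[4];
    int pref;
};

/* Hypothetical replacement for ipme_is(): the "local" address is fixed here. */
static const unsigned char local_ip[4] = { 192, 0, 2, 1 };

static int is_local(const unsigned char ip[4])
{
    return memcmp(ip, local_ip, 4) == 0;
}

/* First loop: lowest preference among the entries that are our own IPs,
 * or 100000 when none of the entries is ours. */
static int find_prefme(const struct mx_entry *mx, size_t len)
{
    int prefme = 100000;
    size_t i;

    for (i = 0; i < len; ++i)
        if (is_local(mx[i].ip) && mx[i].pref < prefme)
            prefme = mx[i].pref;
    return prefme;
}

/* Second loop: index of the first entry strictly below prefme,
 * or -1 when there is none. */
static int first_candidate(const struct mx_entry *mx, size_t len, int prefme)
{
    size_t i;

    for (i = 0; i < len; ++i)
        if (mx[i].pref < prefme)
            return (int)i;
    return -1;
}
```

With three entries at preferences 10, 20 and 30, where the one at 20 carries the local address, find_prefme() returns 20: the entry at 10 remains a candidate, while the entries at 20 and 30 are never considered.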

Why a high value for prefme in some cases?

qmail-remote.c
...
401   if (relayhost) prefme = 300000;
402   if (flagallaliases) prefme = 500000;
...

When the mail server needs to relay everything to one host (relayhost), or in these special addrmangle cases (flagallaliases), qmail-remote bypasses this mechanism: it allows all IPs regardless of the context and simply does what it has to do, which is forward to the first of the IPs found.
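
A minimal sketch of why these two assignments disable the filter, assuming the three assignments run in the order quoted above. The function and parameter names (effective_prefme(), own_pref, relayhost_set, passes()) are hypothetical and only serve the illustration; they do not exist in qmail.

```c
/* Mirrors the order of the three assignments quoted from qmail-remote.c.
 * own_pref is the lowest preference found for one of our own IPs,
 * or a negative value when none of the MX IPs is ours. */
static int effective_prefme(int own_pref, int relayhost_set, int flagallaliases)
{
    int prefme = 100000;

    if (own_pref >= 0 && own_pref < prefme) prefme = own_pref;
    if (relayhost_set) prefme = 300000;
    if (flagallaliases) prefme = 500000;
    return prefme;
}

/* An entry is kept when its preference is strictly below the threshold. */
static int passes(int pref, int prefme)
{
    return pref < prefme;
}
```

With own_pref at 20, an entry at preference 30 is rejected in the normal case, but passes as soon as either flag is set, because the threshold is then far above anything an MX record can carry.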

Why 500000, why not 5725 or 42?

qmail-remote.c
...
395   prefme = 100000;
...
401   if (relayhost) prefme = 300000;
402   if (flagallaliases) prefme = 500000;
...

Debugging purposes? RFC 1035 says the preference of an MX resource record is an unsigned 16-bit integer, so it maxes out at 65535, and a value that high cannot come from DNS.

A printf() after the two loops would show whether the mechanism was triggered or not.
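
To make the range argument concrete, here is a sketch of how a 16-bit big-endian field such as the MX preference decodes. decode_pref() is a hypothetical helper written for this example, not qmail's own code.

```c
/* Decode a 16-bit value stored in network byte order (most significant
 * byte first), as used for the MX preference field on the wire. */
static unsigned int decode_pref(const unsigned char b[2])
{
    return ((unsigned int)b[0] << 8) | b[1];
}
```

Even with both bytes set to 0xff the result is 65535, which stays below the smallest of the three constants, 100000.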
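
Such a debugging aid could look like the sketch below. It assumes the three constants are used exactly as quoted above; describe_prefme() and debug_prefme() are hypothetical names, and the output stream is left to the caller.

```c
#include <stdio.h>

/* Map the final value of prefme back to the situation that produced it. */
static const char *describe_prefme(int prefme)
{
    if (prefme == 500000) return "flagallaliases set";
    if (prefme == 300000) return "relayhost set";
    if (prefme == 100000) return "no local IP among the MX entries";
    return "local IP found among the MX entries";
}

/* Print the value and its interpretation on the given stream. */
static void debug_prefme(FILE *out, int prefme)
{
    fprintf(out, "prefme=%d (%s)\n", prefme, describe_prefme(prefme));
}
```

Any value other than the three constants can only have come from an MX record matching one of the local addresses, which is the case where the filtering actually kicked in.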