Overview

This document covers the selection process for a 1.11B domain.

Summary

Gen.XYZ offers domain names that meet specific criteria for $1.00 per year. There is nothing special about these other than their price. The registrar has decided that, because they are numeric-only, they have limited use for real humans, and it offers some suggested use cases.

Sample Use Cases

Suggested Use                                    Example
App Testing                                      0000001.xyz
VoIP number                                      9998422.xyz
Dates                                            06022017.xyz
Sequential blocks to pair with serial numbers    12300000.xyz – 12399999.xyz

What do we know so far?

  • We can get access to a cheap domain name.
  • The domain name might not be human friendly.
  • The domain name will contain only digits.
  • The number of characters in the domain will be ≥ 6 and ≤ 9.
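To make those constraints concrete, here is a minimal sketch of a check for whether a label qualifies, plus a count of how many such labels exist. The names is_candidate_label and TOTAL_LABELS are mine, and the sketch assumes leading zeros are allowed, as the 0000001.xyz example suggests.

```python
def is_candidate_label(label):
    """Return True if label is 6 to 9 ASCII digits (leading zeros allowed)."""
    # isascii() guards against non-ASCII characters that isdigit() accepts.
    return label.isascii() and label.isdigit() and 6 <= len(label) <= 9


# Every digit string of length k is a distinct label, so there are 10**k of them.
TOTAL_LABELS = sum(10**k for k in range(6, 10))

if __name__ == "__main__":
    print(is_candidate_label("0000001"))  # seven digits, all numeric
    print(f"{TOTAL_LABELS:,}")
```

The total works out to 1,111,000,000 labels, which is presumably where the name of the domain class comes from.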

What to look for?

We can look into interesting numbers that are less than 10 digits long. What are some interesting things to look at?

  • Is the number prime?
  • Is the number a palindrome?
  • Are all the digits in the number unique?

How do we do it?

Here is some Python code that will answer these questions. Two things to note: this code starts at the lowest numbers and works upward, and it isn’t fast.

import logging
log = logging.getLogger(__name__)
logging.basicConfig(
        format='%(asctime)s %(levelname)-8s %(message)s',
        filename='run.log',
        level=logging.DEBUG
        )

def is_prime(n):
    if n <= 1:
        return False
    for i in range(2, int(n**0.5) + 1):
        if n % i == 0:
            return False
    return True

def is_palindrome(num):
    num = str(num)
    return num == num[::-1]

def has_unique_digits(i):
    # A set drops repeated characters, so equal lengths mean no digit repeats.
    digits = str(i)
    return len(set(digits)) == len(digits)

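def main():
    lower =           1
    upper = 999_999_999
    # range() excludes its end value, so add 1 to include `upper` itself.
    for i in range(lower, upper + 1):
        # Pad to the six-character minimum length with leading zeros.
        z = str(i).zfill(6)
        if i % 10_000 == 0:
            p = (i / upper) * 100
            log.debug(f"i: {z} / {upper} -- {p:.02f}")
        # Note: both checks run on the unpadded number i, not on z.
        if is_prime(i) and has_unique_digits(i):
            log.info(z + ".xyz")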

if __name__ == "__main__":
    main()

After running this you’ll end up with a run.log that contains the interesting domains.
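Since run.log mixes DEBUG progress lines with the INFO lines that carry the domains, something has to pull the domains back out. Here is a rough sketch of one way to do it; domains_from_log is a name I made up, and it assumes the log format configured above, where the default asctime contributes a date token and a time token.

```python
def domains_from_log(lines):
    """Collect the .xyz names from INFO lines in the run.log format."""
    domains = []
    for line in lines:
        parts = line.split()
        # Expected tokens: date, time, level, message.
        # DEBUG progress lines have more tokens and a different level.
        if len(parts) == 4 and parts[2] == "INFO" and parts[3].endswith(".xyz"):
            domains.append(parts[3])
    return domains
```

To use it, open run.log and pass the file object in: it is iterated line by line, so the whole log never has to fit in memory at once.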

After spot-checking a few of these, it becomes clear that some are already registered. We should exclude those.

We can do better.

Possible solutions

How do we know if a domain is registered? We have a few options.

  • Run whois [1] against the domain.
  • Check for an SOA DNS resource record.

The registrar has published Guidance for use of the Whois Service. As part of this guidance they mention rate limits. The key takeaway is: “If the query rate from an ‘untrusted’ source exceeds 9,000 queries per hour, the source is permanently blocked.” I’d rather not get permanently blocked while doing research, and staying under 9,000 queries per hour makes this project infeasible.

We’re going to use a DNS query for the SOA record for the domain. We have our own internal resolver. This should spread the load more: we’re talking to N authoritative name servers instead of one WHOIS server.
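To show what an SOA lookup involves on the wire, here is a minimal standard-library sketch. It is not the code from the repository. The function names are hypothetical, resolver_ip stands in for whatever resolver you have, and it treats an NXDOMAIN response code as "probably unregistered". That is only a heuristic: a registered domain with no name servers delegated can also come back as NXDOMAIN.

```python
import random
import socket
import struct

SOA = 6       # QTYPE for SOA records (RFC 1035)
IN = 1        # QCLASS for the Internet
NXDOMAIN = 3  # RCODE meaning the name does not exist


def build_soa_query(name, txid):
    """Build a DNS query packet asking for the SOA record of name."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", txid, 0x0100, 1, 0, 0, 0)
    # QNAME is each label prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", SOA, IN)


def looks_unregistered(response):
    """Return True if the response code in the DNS header is NXDOMAIN."""
    if len(response) < 12:
        raise ValueError("short DNS response")
    flags = struct.unpack("!H", response[2:4])[0]
    return (flags & 0x000F) == NXDOMAIN


def soa_says_unregistered(name, resolver_ip, timeout=2.0):
    """Send one UDP query to resolver_ip (an assumption) and inspect the reply."""
    txid = random.randrange(0x10000)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_soa_query(name, txid), (resolver_ip, 53))
        data, _addr = sock.recvfrom(512)
    if data[:2] != struct.pack("!H", txid):
        raise ValueError("response does not match the query")
    return looks_unregistered(data)
```

A real run would more likely lean on a DNS library than hand-built packets, but the response code in the header is the piece of information this approach relies on either way.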

Before we go further I should point out a few things.

  • Math is hard.
  • Reading is hard.
  • This project was for fun.
  • There are probably better ways to do this.

We’re going to update the code to use asyncio. This will get us more throughput. There are some things we need to think about.

  • How do we limit concurrency?
  • How can we resume this if something goes wrong?
  • How can we track progress?

We can use asyncio.Semaphore to limit the number of DNS queries in flight. Every so often we can write the current number we’re checking to a file, and use the contents of this file as the starting number on the next run. This buys us checkpointing. Every N numbers we can log a line that shows what number we’re processing. We’ll get some additional metadata like a timestamp.
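Here is a rough sketch of how those three pieces could fit together. It is not the code from the repository: scan, bounded_check, and load_checkpoint are names I invented, check is any coroutine you supply that returns True for an available domain, and candidates is assumed to be a list of integers in ascending order.

```python
import asyncio
import logging
from pathlib import Path

log = logging.getLogger(__name__)


def load_checkpoint(path, default):
    """Return the number saved in the checkpoint file, or default if unusable."""
    try:
        return int(Path(path).read_text().strip())
    except (FileNotFoundError, ValueError):
        return default


async def bounded_check(n, check, sem, results):
    # The semaphore caps how many checks run at the same time.
    async with sem:
        if await check(n):
            results.append(n)


async def scan(candidates, check, checkpoint_path, limit=50, batch=1000):
    """Check candidates in batches, checkpointing after each finished batch."""
    sem = asyncio.Semaphore(limit)
    results = []
    for start in range(0, len(candidates), batch):
        chunk = candidates[start:start + batch]
        await asyncio.gather(
            *(bounded_check(n, check, sem, results) for n in chunk)
        )
        # Only written once the whole batch is done, so resuming from this
        # number never skips an unchecked candidate.
        next_number = chunk[-1] + 1
        Path(checkpoint_path).write_text(str(next_number))
        log.debug("checkpoint: next number is %d", next_number)
    return results
```

On restart, read the file with load_checkpoint and drop every candidate below that number before calling scan again. Writing the checkpoint per batch rather than per number keeps file writes cheap, at the cost of redoing at most one batch after a crash.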

Updated code on GitHub

Results?

Through the magic of time we now know that it takes ~20 hours to process all of the possible numbers. After processing we’re left with 280290 possible domain names to choose from for our next project.

The final code and a complete list of available domains that match the aforementioned criteria can be found in the GitHub repository.

Out of the frying pan into the fire.

We now have a strong case of choice paralysis.


  [1] See the WHOIS Protocol Specification for more details.