Digit Sum

loopspace

2025-03-22

Home
1. Solving the Puzzle
2. Whither Polynomials?
3. A Source of Perturbation
4. Least Value
5. Surprising Sequences
6. Conclusion

1 Solving the Puzzle

A little while ago, Jo Morgan posted a puzzle on twitter that she'd found in one of her journeys through different maths texts:

[S]uppose a positive two-digit whole number is divided by the sum of its digits, how can you find the largest and smallest possible answers without searching for all possible answers on your calculator?

I gave it a miss, as it ticks all the boxes of puzzles that I don't like. A bit later, Ben Orlin wrote a short post on his blog about it. I skimmed through it, saw which puzzle it was, and was ready to give it a miss again when I read the last paragraph which seemed to address me directly. So I thought a little bit about it, and ended up with the following solution.

Theorem 1

Let $p (x) \in ℝ [x]$ be a polynomial with only positive coefficients1. Let $a > b > 0$ be real numbers and let $d = \deg (p)$ . Then

1At the risk of starting a flame war, zero is a positive number.

\frac{a^{d}}{b^{d}} \geq \frac{p (a)}{p (b)}

Proof

Let $p (x) = c_{0} + c_{1} x + c_{2} x^{2} + \dots + c_{d} x^{d}$ . By assumption, $c_{j} \geq 0$ . Then

\begin{aligned} \frac{a^{d}}{b^{d}} p (b) & = \frac{a^{d}}{b^{d}} (c_{0} + c_{1} b + c_{2} b^{2} + \dots + c_{d} b^{d}) \\ = c_{0} (\frac{a}{b})^{d} + c_{1} (\frac{a}{b})^{d - 1} a + c_{2} (\frac{a}{b})^{d - 2} a^{2} + \dots + c_{d} a^{d} \\ \geq c_{0} + c_{1} a + c_{2} a^{2} + \dots + c_{d} a^{d} \\ = p (a) \end{aligned}

That last inequality comes from the fact that since $\frac{a}{b} > 1$ and each $c_{j} \geq 0$ , then $c_{j} (\frac{a}{b})^{d - j} \geq c_{j}$ .

Since $b > 0$ and each $c_{j} \geq 0$ , $p (b) \geq 0$ and so rearranging yields

\frac{a^{d}}{b^{d}} \geq \frac{p (a)}{p (b)}

as required.

Corollary 2

Fix a base $k$ and length of representation $d$ . Let $n$ be a number whose representation in base $k$ has length $d$ . Let $σ_{k} (n)$ be the digit sum of $n$ in this representation and consider the quotient:

\frac{n}{σ_{k} (n)}

Then this achieves its maximum of $k^{d - 1}$ when $n$ has representation $a 0 \dots 0_{k}$ (with $d$ digits in total).

Proof

Clearly, if $n$ has representation $a 0 \dots 0_{k}$ , then:

\frac{n}{σ_{k} (n)} = \frac{a 0 \dots 0_{k}}{a} = 10 \dots 0_{k} = k^{d - 1}

Now let $n$ have representation $c_{d - 1} c_{d - 2} \dots c_{1} c_{0 k}$ . Define $p (x)$ to be the polynomial $c_{0} + c_{1} x + \dots + c_{d - 1} x^{d - 1}$ . Then the digit sum of $n$ is $p (1)$ and $n$ itself is $p (k)$ . Hence

\frac{n}{σ_{k} (n)} = \frac{p (k)}{p (1)}

and by the above Theorem, this is bounded above by $k^{d - 1}$ .

Note that this establishes the largest value. The original problem asked for both the largest and smallest values, but originally I overlooked that it asked for both and focussed completely on finding the largest value. After explaining where the above came from, I'll return to the problem of the smallest values as it turns out to be a little more intricate.

2 Whither Polynomials?

The idea of using polynomials to study this problem was sparked by a conjunction of a few things:

I've studied what are called integral polynomials in the past, and there'd been another problem posted on twitter recently that recalled that to mind.
Ben Orlin's post suggests looking at the original problem for other bases, which means needing some flexibility.
James Tanton's Exploding Dots is quite prevalent on twitter, and the mathematical underpinnings of the main idea is that there is a close relationship between polynomials and representations of an integer in a given base.

These all meant that when thinking of how to study the original problem, the idea of passing to polynomials wasn't an unnatural one. It was as if it was already at the back of my mind, and just needed a nudge to move to the front.

The final nudge was that the question is about maximising something. This suggests using calculus, but applying calculus to actual numbers is slightly problematic. On the other hand, applying it to polynomials is very natural.

But it is not the polynomials themselves that I'm applying the techniques of calculus to. I don't intend to differentiate a polynomial. Rather, it is the space of polynomials. I intend to differentiate a curve of polynomials.

Let me back-track slightly. The advantage of polynomials over representations of numbers in the Exploding Dots saga is that one doesn't have to worry about carries when doing arithmetic. Or rather, one deals with the arithmetic first and the carries second, rather than interleaving them. So to subtract $97$ from $134$ , you would subtract in columns to get $1 (3 - 9) (4 - 7) = 1 (- 6) (- 3)$ and then recognise this as $100 - 60 - 3 = 37$ . The initial stage is analogous to arithmetic of polynomials.

Now throw in the concept of calculus, and specifically of small variation. With polynomials, it is possible to have things like $. 5 x + . 25$ . We're not, though, used to seeing $(. 5) (. 25)$ as the representation of a number in base $10$ (it would be $. 5 \times 10 + . 25 = 5.25$ ). But if we're trying to use the techniques of calculus to study this situation, we need to be able to consider small variations of a thing. So we'd better put that thing in a place where small variations are possible. Instead of studying, say, $67$ , we study $6 x + 7$ so that we can also consider $6.1 x + 7.1$ .

In actual fact, this took me down a slight blind alley. I did consider the map $ℝ [x] \to ℝ$ defined by:

p (x) \mapsto \frac{p (k)}{p (1)}

for $k \in ℝ$ , and looked at what happens to this when $p$ is perturbed to $p + h q$ for small $h$ . The idea is quite simple: if $p$ maximises this quotient then perturbing it in any direction should mean that the value of the quotient goes down and so studying its behaviour under small change will help find maxima. This can be made rigorous, but at this stage I wasn't concerned with that.

Unfortunately, there's a big flaw in the argument. The map above is not defined for the whole of $ℝ [x]$ because we can have $p (1) = 0$ . And as we approach a polynomial with $p (1) = 0$ , then the quotient can get arbitrarily large. So maximising the quotient over the whole of $ℝ [x]$ (or at least where the quotient is defined) is not a viable line of enquiry.

But nothing should be discarded completely, even if it itself leads nowhere. And although the direct application of calculus didn't work, the idea of considering perturbations did.

3 A Source of Perturbation

The techniques of calculus aren't just about finding maxima and minima. Differential calculus hinges on the idea of "If I just change the input a little bit, what will happen to the output?". The derivative tells us how to answer that question.

Now we're really interested in $ℕ [x]$ rather than $ℝ [x]$ , but we can nevertheless take the idea of perturbing the input and seeing what happens to the output. The other thing that calculus teaches us is that, providing the function is "nice" in some technical fashion, we don't need to consider every possible perturbation but just "enough".

So let's start with a polynomial $p \in ℝ [x]$ 2 of a fixed degree, say $d$ , and consider perturbing it in the simplest fashion. That would be adding $1$ to just one of its coefficients. Let us define $e_{j} \in ℝ [x]$ to be the polynomial $e_{j} (x) = x^{j}$ , then we're considering what happens to the quotient $p (a) / p (b)$ when we replace $p$ by $p + e_{j}$ with $j \leq d$ . A little algebraic manipulation leads to:

2we'll specialise a little as we go through, but it can be instructive to see where the restrictions are needed

\begin{aligned} \frac{(p + e_{j}) (a)}{(p + e_{j}) (b)} & = \frac{p (a) + a^{j}}{p (b) + b^{j}} \\ = \frac{p (a)}{p (b)} \times \frac{1 + \frac{a^{j}}{p (a)}}{1 + \frac{b^{j}}{p (b)}} \end{aligned}

So the quotient increases when we replace $p$ by $p + e_{j}$ if that second term in the product is greater than $1$ .

Here's where we make our assumptions. We want everything to be positive so that we don't have to worry about the inequality when multiplying and dividing. The simplest way to achieve that is to assume that $a, b \geq 0$ and all the coefficients of $p$ are positive. We can therefore manipulate the condition that the second term is greater than one as follows.

\begin{aligned} \frac{1 + \frac{a^{j}}{p (a)}}{1 + \frac{b^{j}}{p (b)}} & > 1 \\ 1 + \frac{a^{j}}{p (a)} & > 1 + \frac{b^{j}}{p (b)} \\ \frac{a^{j}}{p (a)} & > \frac{b^{j}}{p (b)} \end{aligned}

Lastly, we obtain:

\frac{a^{j}}{b^{j}} > \frac{p (a)}{p (b)}

So by considering what happens when we perturb the polynomial $p$ slightly, we end up with the above inequality to consider. Let us pause to consider what that means: if $j$ is such that the inequality in (1) holds, then the quotient for $p + e_{j}$ is larger than that for $p$ and so we want to "move" in that direction.

Now if we assume that $a > b$ , then if the inequality in (1) is satisfied for a particular $j$ , it is also satisfied for any larger $j$ , such as the degree of $p$ itself. So an interesting question is to determine, for a given polynomial $p$ , the minimum $j$ such that the inequality in (1) holds.

In telling the story of my approach I should take a moment to say that because the original problem was posed for two-digit numbers, and I originally only considered it for two-digit numbers, I didn't realise at first the significance of the inequality in (1). With only two directions to consider, the general form in (1) is too simple to see the full structure. It was only when I considered the general case that I saw how this inequality is not just a piece of the solution but is, in fact, the solution in its entirety.

The required realisation was that (1) can be written as:

\frac{e_{j} (a)}{e_{j} (b)} > \frac{p (a)}{p (b)}

and that $e_{j}$ is a perfectly valid polynomial in its own right. So if we can show that for a given polynomial $p$ , this inequality holds for some $j \leq d$ , then it must hold for $d$ , the degree of $p$ . And if we can show that, then we have established that

\frac{e_{d} (a)}{e_{d} (b)} > \frac{p (a)}{p (b)}

and so the quotient is maximised among polynomials of degree at most $d$ when we take a monomial of degree $d$ .

This then led me to hypothesise and prove Theorem 1, whose proof (and implication for the original problem) is given at the outset.

Because of the role that the inequality in (1) will come to play in looking for the minimum value of the quotient, I feel it worth pointing out that although it gave me the idea behind Theorem 1, the more general form isn't actually used. So if I were to write this more in the style of an academic paper, Theorem 1 would appear with very little evidence of where it came from.

4 Least Value

In my original working on this problem I completely ignored the part of the question asking about the minimum value of the quotient. This wasn't deliberate – once I started thinking about the largest value then I forgot about the original setting altogether. It was only much later (and indeed after an earlier version of this was posted on my website) that I realised I'd missed out half the problem. Having used polynomials to such success on the first part, I wanted to see if I could do so for the second.

Obviously, my first strategy was to try to mirror the work of the search for the maximum value. A similar analysis leads to wanting to have the inequality:

\frac{a^{j}}{b^{j}} < \frac{p (a)}{p (b)}

As for the maximum, if this holds for some $j$ then it holds for all smaller $j$ . Mirroring the argument for the maximum, we'd then expect to want to work with monomials of minimum degree. However, we can't do this because we have fixed the degree of $p$ . It does, though, suggest that we will get the minimum value for the quotient if we push the coefficients of $p$ into lower degree.

We can make this more precise. Let $p$ be a polynomial of degree $d$ with positive coefficients $c_{j}$ . Let $\hat{p}$ be $p$ without its term of degree $d$ . Let $a > b > 0$ . Define $q$ to be the polynomial $c_{d} x^{d} + \hat{p} (b)$ . Then $q (b) = p (b)$ and

\begin{aligned} q (a) & = c_{d} a^{d} + \hat{p} (b) \\ = c_{d} a^{d} + \hat{p} (a) + \hat{p} (b) - \hat{p} (a) \\ = p (a) + \hat{p} (b) - \hat{p} (a) \end{aligned}

As $p$ has positive coefficients and $a > b > 0$ , $\hat{p} (a) \geq \hat{p} (b)$ so $q (a) \leq p (a)$ . Thus

\frac{q (a)}{q (b)} \leq \frac{p (a)}{p (b)}

Thus any suitable polynomial $p$ can be replaced by one of the form $c_{d} x^{d} + c_{0}$ with a smaller quotient. This suggests that we only need to look at polynomials of this form. A little more work shows that actually we only need to look at polynomials of the form $x^{d} + c$ and that the quotient gets smaller as $c$ gets bigger.

This looks promising, but there is a tiny snag. To apply the work to the original problem we need the coefficients of the polynomial bounded above by some value (in fact, we'll have $c_{j} < a$ ). So we can't simply keep increasing the constant term.

What this analysis does tell us is that finding minima in the space of all polynomials, or even all polynomials with positive coefficients, is unlikely to be a fruitful strategy. So we need new ideas.

At this point I have a confession to make that will not surprise anyone who has read A Tale of Two Puzzles. My next move was to write a computer program to find the minimum value of the quotient for a variety of different conditions. In decimal notation, what I found was that the minimum for two digits was at $19$ , at three for $109$ , and then at $109 \dots 9$ until we get to $15$ digits, at which point there are two zeros until around $115$ digits. My rather simple program reaches the limit of the precision of my computer at this juncture. With base $2$ , the results of the calculations are smaller and it can be seen that the number of terms of the form $10 \dots 01 \dots 1$ with a given number of zeros which minimises the quotient roughly doubles (in fact, it appears to be $2^{n} + 1$ which seems reasonable).

Intrigued by this, I returned to the inequality of 2 and thought a bit more about it. The problem with Inequality 2 is that it is a wish-list rather than a fact. If that inequality is satisfied then increasing the $j$ th coefficient of $p$ will reduce the quotient, but that doesn't help with figuring out when that inequality will be satisfied.

It took me several blind alleys and more time than I'm willing to admit to, to realise that it was better to rearrange it into the following (using the fact that everything is positive so multiplying up doesn't change the inequality sign):

a^{j} p (b) - b^{j} p (a) < 0

The crucial insight of arranging it like this is that the term that corresponds to the $x^{j}$ th term in $p (x)$ vanishes. So this is unaffected by changing the $j$ th coefficient within $p$ and means that if the inequality in 3 is satisfied, then increasing the coefficient of the $x^{j}$ th term to as high as it can be reduces the value of the quotient. Conversely, if the inequality in 3 goes the other way then we want to reduce the coefficient of the $x^{j}$ th term as much as possible.

The form of the inequality in 2 shows that there is some $j_{0} < d$ such that 2 holds for all $j \leq j_{0}$ and doesn't hold for all $j > j_{0}$ . This means that for $j > j_{0}$ we reduce the coefficient as far as possible, meaning that we set it to $0$ for $j \neq d$ and $1$ for $j = d$ , while for $j \leq j_{0}$ we set it as high as we're allowed, which recalling the origin of the problem means that we set it to $a - 1$ .

So our minimising polynomial is:

p (x) = (a - 1) (1 + x + x^{2} + \dots + x^{j_{0}}) + x^{d}

and $j_{0}$ is the maximum power such that the inequality in 3 is satisfied.

Writing $1 + x + x^{2} + \dots + x^{j_{0}}$ as $\frac{x^{j_{0} + 1} - 1}{x - 1}$ , we can rewrite our minimising polynomial as:

p (x) = x^{d} + (a - 1) \frac{x^{j_{0} + 1} - 1}{x - 1}

Substituting in $x = a$ yields:

p (a) = a^{d} + (a - 1) \frac{a^{j_{0} + 1} - 1}{a - 1} = a^{d} + a^{j_{0} + 1} - 1

Although we might be interested in general $b$ , we're most interested in $b = 1$ . Substituting this into the original expression yields:

p (1) = (a - 1) (j_{0} + 1) + 1 = j_{0} a + a - j_{0}

Putting this into the inequality we see that $j_{0}$ is the largest integer such that:

\begin{aligned} a^{j_{0}} (j_{0} a + a - j_{0}) - a^{d} - a^{j_{0} + 1} + 1 & < 0 \\ j_{0} a^{j_{0} + 1} + a^{j_{0} + 1} - j_{0} a^{j_{0}} - a^{d} - a^{j_{0} + 1} + 1 & < 0 \\ j_{0} a^{j_{0} + 1} - j_{0} a^{j_{0}} - a^{d} + 1 & < 0 \\ j_{0} a^{j_{0}} (a - 1) & < a^{d} - 1 \\ j_{0} a^{j_{0}} & < \frac{a^{d} - 1}{a - 1} \end{aligned}

While this isn't quite a formula for $j_{0}$ , it is a criterion that can be easily used to find it.

In conclusion, then, the solution for the minimum value of the quotient for $d$ digits in base $10$ is achieved for the number:

10 \dots 09 \dots 9

where the number of $9$ s is the largest number $j_{0}$ satisfying:

j_{0} 10^{j_{0}} < \frac{10^{d} - 1}{9}

This feels a little more complicated than the situation for the maximum value.

5 Surprising Sequences

Using new symbols, the relation:

\max {k : k b^{k} < \frac{b^{n} - 1}{b - 1}}

leads to several interesting sequences.

First there is simply the sequence whose $n$ th term is the above maximum. This sequence is largely dull. With base $b = 10$ , it starts out:

0, 1, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 11, 12, 13, 14, 15, 16, 17, 18, 19

The difficulty here is that the sequence only does something interesting at increasingly infrequent intervals. The next hiccough occurs with the value $111$ , and after that at $1111$ , and so on.

Rearranging the defining inequality shows a bit more precisely what is going on. Multiplying up by $b - 1$ gives:

(b - 1) k b^{k} < b^{n} - 1

Now $b^{n} - 1$ is the largest number with $n$ digits in base $b$ . So this says that $(b - 1) k b^{k}$ must have at most $n$ digits in base $b$ . Normally, increasing $n$ by $1$ means that $k$ also increases by $1$ because the dominating factor in the $(b - 1) k b^{k}$ is the $b^{k}$ . But every so often, increasing $k$ by $1$ also adds to the number of digits and so going from $(b - 1) (k - 1) b^{k - 1}$ to $(b - 1) k b^{k}$ increases the number of digits by $2$ . This causes the pauses in the sequence.

The presence of the $b - 1$ factor is what means that this happens when $k$ has representation all $1$ s in base $b$ .

What is more interesting is to record the $n$ s at which these hiccoughs occur. In base $10$ , we get:

1, 3, 14, 115, 1116, 11117, \dots

We get an equivalent pattern in other bases, for example in base $3$ the sequence starts:

1, 3, 7, 17, 45, 127, 371, 1101, 3289, \dots

and note that:

\begin{aligned} 3 + 1 + 3 & = 7 \\ 9 + 3 + 1 + 4 & = 17 \\ 27 + 9 + 3 + 1 + 5 & = 45 \\ 81 + 27 + 9 + 3 + 1 + 6 & = 127 \end{aligned}

The general formula is:

a_{n} = b^{n - 2} + b^{n - 3} + \dots + b + 1 + n = \frac{b^{n - 1} - 1}{b - 1} + n

Interestingly, the OEIS has an entry for this sequence in base $2$ (A006127) and for base $3$ (A233656) but not for any other base that I've searched, including base $10$ ; maybe I should send that one in3.

3One problem, I guess, is that there are an infinite number of these sequences.

6 Conclusion

Although the original approach of using calculus to study the problem proved to be a red herring, it did put me on the right path to discover an approach that worked. Having taught and researched in calculus for many years, I've come to the conclusion that calculus is such an extremely successful concept that Mathematicians have continually sought to push it into other areas of Mathematics, whether it wants to go there or not. Although I didn't ultimately use any of the techniques of calculus, using the ideas of it proved successful. So, sometimes, it's worth trying something strange to attack a problem because you never know what ideas that might spark.

Using calculus here might seem like the classic sledgehammer to crack a walnut, but in Mathematics we are in the unique position of being able to crack the walnut and then reassemble it to try different ways to crack it. By watching carefully how it cracks under the sledgehammer, we can sometimes see just the right place to tap it gently to make it simply fall apart.

Lastly, calculus is awesome. Someone should write a book about it.

Contents