Digit Sum

loopspace

2019-11-26

# 1 Solving the Puzzle

A little while ago, Jo Morgan posted a puzzle on twitter that she'd found in one of her journeys through different maths texts. I gave it a miss, as it ticks all the boxes of puzzles that I don't like. A bit later, Ben Orlin wrote a short post on his blog about it. I skimmed through it, saw which puzzle it was, and was ready to give it a miss again when I read the last paragraph which seemed to address me directly. So I thought a little bit about it, and ended up with the following solution.

Theorem 1

Let $p\left(x\right)\in ℝ\left[x\right]$ be a polynomial with only positive coefficients1. Let $a>b>0$ be real numbers and let $d=\mathrm{deg}\left(p\right)$. Then

1At the risk of starting a flame war, zero is a positive number.

 $\frac{{a}^{d}}{{b}^{d}}\ge \frac{p\left(a\right)}{p\left(b\right)}$

Proof

Let $p\left(x\right)={c}_{0}+{c}_{1}x+{c}_{2}{x}^{2}+\dots +{c}_{d}{x}^{d}$. By assumption, ${c}_{j}\ge 0$. Then

 $\begin{array}{rl}\frac{{a}^{d}}{{b}^{d}}p\left(b\right)& =\frac{{a}^{d}}{{b}^{d}}\left({c}_{0}+{c}_{1}b+{c}_{2}{b}^{2}+\dots +{c}_{d}{b}^{d}\right)\\ & ={c}_{0}\left(\frac{a}{b}{\right)}^{d}+{c}_{1}\left(\frac{a}{b}{\right)}^{d-1}a+{c}_{2}\left(\frac{a}{b}{\right)}^{d-2}{a}^{2}+\dots +{c}_{d}{a}^{d}\\ & \ge {c}_{0}+{c}_{1}a+{c}_{2}{a}^{2}+\dots +{c}_{d}{a}^{d}\\ & =p\left(a\right)\end{array}$

That last inequality comes from the fact that since $\frac{a}{b}>1$ and each ${c}_{j}\ge 0$, then ${c}_{j}\left(\frac{a}{b}{\right)}^{d-j}\ge {c}_{j}$.

Since $b>0$ and each ${c}_{j}\ge 0$, $p\left(b\right)\ge 0$ and so rearranging yields

 $\frac{{a}^{d}}{{b}^{d}}\ge \frac{p\left(a\right)}{p\left(b\right)}$

as required.

Corollary 2

Fix a base $k$ and length of representation $d$. Let $n$ be a number whose representation in base $k$ has length $d$. Let ${\sigma }_{k}\left(n\right)$ be the digit sum of $n$ in this representation and consider the quotient:

 $\frac{n}{{\sigma }_{k}\left(n\right)}$

Then this achieves its maximum of ${k}^{d-1}$ when $n$ has representation $a0\dots {0}_{k}$ (with $d$ digits in total).

Proof

Clearly, if $n$ has representation $a0\dots {0}_{k}$, then:

 $\frac{n}{{\sigma }_{k}\left(n\right)}=\frac{a0\dots {0}_{k}}{a}=10\dots {0}_{k}={k}^{d-1}$

Now let $n$ have representation ${c}_{d-1}{c}_{d-2}\dots {c}_{1}{c}_{0}$. Define $p\left(x\right)$ to be the polynomial ${c}_{0}+{c}_{1}x+\dots +{c}_{d-1}{x}^{d-1}$. Then the digit sum of $n$ is $p\left(1\right)$ and $n$ itself is $p\left(k\right)$. Hence

 $\frac{n}{{\sigma }_{k}\left(n\right)}=\frac{p\left(k\right)}{p\left(1\right)}$

and by the above Theorem, this is bounded above by ${k}^{d-1}$.

In the rest of this, I'll explain where that came from.

# 2 Whither Polynomials?

The idea of using polynomials to study this problem was sparked by a conjunction of a few things:

1. I've studied what are called integral polynomials in the past, and there'd been another problem posted on twitter recently that recalled that to mind.

2. Ben Orlin's post suggests looking at the original problem for other bases, which means needing some flexibility.

3. James Tanton's Exploding Dots is quite prevalent on twitter, and the mathematical underpinnings of the main idea is that there is a close relationship between polynomials and representations of an integer in a given base.

These all meant that when thinking of how to study the original problem, the idea of passing to polynomials wasn't an unnatural one. It was as if it was already at the back of my mind, and just needed a nudge to move to the front.

The final nudge was that the question is about maximising something. This suggests using calculus, but applying calculus to actual numbers is slightly problematic. On the other hand, applying it to polynomials is very natural.

But it is not the polynomials themselves that I'm applying the techniques of calculus to. I don't intend to differentiate a polynomial. Rather, it is the space of polynomials. I intend to differentiate a curve of polynomials.

Let me back-track slightly. The advantage of polynomials over representations of numbers in the Exploding Dots saga is that one doesn't have to worry about carries when doing arithmetic. Or rather, one deals with the arithmetic first and the carries second, rather than interleaving them. So to subtract $97$ from $134$, you would subtract in columns to get $1\left(3-9\right)\left(4-7\right)=1\left(-6\right)\left(-3\right)$ and then recognise this as $100-60-3=37$. The initial stage is analogous to arithmetic of polynomials.

Now throw in the concept of calculus, and specifically of small variation. With polynomials, it is possible to have things like $.5x+.25$. We're not, though, used to seeing $\left(.5\right)\left(.25\right)$ as the representation of a number in base $10$ (it would be $.5×10+.25=5.25$). But if we're trying to use the techniques of calculus to study this situation, we need to be able to consider small variations of a thing. So we'd better put that thing in a place where small variations are possible. Instead of studying, say, $67$, we study $6x+7$ so that we can also consider $6.1x+7.1$.

In actual fact, this took me down a slight blind alley. I did consider the map $ℝ\left[x\right]\to ℝ$ defined by:

 $p\left(x\right)↦\frac{p\left(k\right)}{p\left(1\right)}$

for $k\in ℝ$, and looked at what happens to this when $p$ is perturbed to $p+hq$ for small $h$. The idea is quite simple: if $p$ maximises this quotient then perturbing it in any direction should mean that the value of the quotient goes down and so studying its behaviour under small change will help find maxima. This can be made rigorous, but at this stage I wasn't concerned with that.

Unfortunately, there's a big flaw in the argument. The map above is not defined for the whole of $ℝ\left[x\right]$ because we can have $p\left(1\right)=0$. And as we approach a polynomial with $p\left(1\right)=0$, then the quotient can get arbitrarily large. So maximising the quotient over the whole of $ℝ\left[x\right]$ (or at least where the quotient is defined) is not a viable line of enquiry.

But nothing should be discarded completely, even if it itself leads nowhere. And although the direct application of calculus didn't work, the idea of considering perturbations did.

# 3 A Source of Perturbation

The techniques of calculus aren't just about finding maxima and minima. Differential calculus hinges on the idea of "If I just change the input a little bit, what will happen to the output?". The derivative tells us how to answer that question.

Now we're really interested in $ℕ\left[x\right]$ rather than $ℝ\left[x\right]$, but we can nevertheless take the idea of perturbing the input and seeing what happens to the output. The other thing that calculus teaches us is that, providing the function is "nice" in some technical fashion, we don't need to consider every possible perturbation but just "enough".

So let's start with a polynomial $p\in ℝ\left[x\right]$2 of a fixed degree, say $d$, and consider perturbing it in the simplest fashion. That would be adding $1$ to just one of its coefficients. Let us define ${e}_{j}\in ℝ\left[x\right]$ to be the polynomial ${e}_{j}\left(x\right)={x}^{j}$, then we're considering what happens to the quotient $p\left(a\right)/p\left(b\right)$ when we replace $p$ by $p+{e}_{j}$ with $j\le d$. A little algebraic manipulation leads to:

2we'll specialise a little as we go through, but it can be instructive to see where the restrictions are needed

 $\begin{array}{rl}\frac{\left(p+{e}_{j}\right)\left(a\right)}{\left(p+{e}_{j}\right)\left(b\right)}& =\frac{p\left(a\right)+{a}^{j}}{p\left(b\right)+{b}^{j}}\\ & =\frac{p\left(a\right)}{p\left(b\right)}×\frac{1+\frac{{a}^{j}}{p\left(a\right)}}{1+\frac{{b}^{j}}{p\left(b\right)}}\end{array}$

So the quotient increases when we replace $p$ by $p+{e}_{j}$ if that second term in the product is greater than $1$.

Here's where we make our assumptions. We want everything to be positive so that we don't have to worry about the inequality when multiplying and dividing. The simplest way to achieve that is to assume that $a,b\ge 0$ and all the coefficients of $p$ are positive. We can therefore manipulate the condition that the second term is greater than one as follows.

 $\begin{array}{rl}\frac{1+\frac{{a}^{j}}{p\left(a\right)}}{1+\frac{{b}^{j}}{p\left(b\right)}}& >1\\ 1+\frac{{a}^{j}}{p\left(a\right)}& >1+\frac{{b}^{j}}{p\left(b\right)}\\ \frac{{a}^{j}}{p\left(a\right)}& >\frac{{b}^{j}}{p\left(b\right)}\end{array}$

Lastly, we obtain:

$\frac{{a}^{j}}{{b}^{j}}&>\frac{p\left(a\right)}{p\left(b\right)}$
 1

So by considering what happens when we perturb the polynomial $p$ slightly, we end up with the above inequality to consider. Let us pause to consider what that means: if $j$ is such that the inequality in (1) holds, then the quotient for $p+{e}_{j}$ is larger than that for $p$ and so we want to "move" in that direction.

Now if we assume that $a>b$, then if the inequality in (1) is satisfied for a particular $j$, it is also satisfied for any larger $j$, such as the degree of $p$ itself. So an interesting question is to determine, for a given polynomial $p$, the minimum $j$ such that the inequality in (1) holds.

In telling the story of my approach I should take a moment to say that because the original problem was posed for two-digit numbers, and I originally only considered it for two-digit numbers, I didn't realise at first the significance of the inequality in (1). With only two directions to consider, the general form in (1) is too simple to see the full structure. It was only when I considered the general case that I saw how this inequality is not just a piece of the solution but is, in fact, the solution in its entirety.

The required realisation was that (1) can be written as:

 $\frac{{e}_{j}\left(a\right)}{{e}_{j}\left(b\right)}>\frac{p\left(a\right)}{p\left(b\right)}$

and that ${e}_{j}$ is a perfectly valid polynomial in its own right. So if we can show that for a given polynomial $p$, this inequality holds for some $j\le d$, then it must hold for $d$, the degree of $p$. And if we can show that, then we have established that

 $\frac{{e}_{d}\left(a\right)}{{e}_{d}\left(b\right)}>\frac{p\left(a\right)}{p\left(b\right)}$

and so the quotient is maximised among polynomials of degree at most $d$ when we take a monomial of degree $d$. And this is precisely what Theorem 1 shows is true.

# 4 Conclusion

Although the original approach of using calculus to study the problem proved to be a red herring, it did put me on the right path to discover an approach that worked. Having taught and researched in calculus for many years, I've come to the conclusion that calculus is such an extremely successful concept that Mathematicians have continually sought to push it into other areas of Mathematics, whether it wants to go there or not. Although I didn't ultimately use any of the techniques of calculus, using the ideas of it proved successful. So, sometimes, it's worth trying something strange to attack a problem because you never know what ideas that might spark.

Using calculus here might seem like the classic sledgehammer to crack a walnut, but in Mathematics we are in the unique position of being able to crack the walnut and then reassemble it to try different ways to crack it. By watching carefully how it cracks under the sledgehammer, we can sometimes see just the right place to tap it gently to make it simply fall apart.

Lastly, calculus is awesome. Someone should write a book about it.