Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berücksichtigung der Anwendungsgebiete
Volume 148

Edited by J. L. Doob · E. Heinz · F. Hirzebruch · E. Hopf · H. Hopf · W. Maak · S. MacLane · W. Magnus · D. Mumford · M. M. Postnikov · F. K. Schmidt · D. S. Scott · K. Stein

Managing editors: Prof. Dr. B. Eckmann, Eidgenössische Technische Hochschule Zürich, and Prof. Dr. B. L. van der Waerden, Mathematisches Institut der Universität Zürich

K. Chandrasekharan
Introduction to Analytic Number Theory
Springer-Verlag New York Inc. 1968

Prof. Dr. K. Chandrasekharan, Eidgenössische Technische Hochschule Zürich

ISBN-13: 978-3-642-46126-2
e-ISBN-13: 978-3-642-46124-8
DOI: 10.1007/978-3-642-46124-8

All rights reserved. No part of this book may be translated or reproduced in any form without the written permission of Springer-Verlag. © by Springer-Verlag Berlin · Heidelberg 1968. Softcover reprint of the hardcover 1st edition 1968. Library of Congress Catalog Card Number 68-21990. Title No. 5131.

Preface

This book has grown out of a course of lectures I have given at the Eidgenössische Technische Hochschule, Zürich. Notes of those lectures, prepared for the most part by assistants, have appeared in German. This book follows the same general plan as those notes, though in style, and in text (for instance, Chapters III, V, VIII), and in attention to detail, it is rather different. Its purpose is to introduce the non-specialist to some of the fundamental results in the theory of numbers, to show how analytical methods of proof fit into the theory, and to prepare the ground for a subsequent inquiry into deeper questions. It is published in this series because of the interest evinced by Professor Beno Eckmann.
I have to acknowledge my indebtedness to Professor Carl Ludwig Siegel, who has read the book, both in manuscript and in print, and made a number of valuable criticisms and suggestions. Professor Raghavan Narasimhan has helped me, time and again, with illuminating comments. Dr. Harold Diamond has read the proofs, and helped me to remove obscurities. I have to thank them all.

August 1968
K.C.

Contents

Chapter I. The unique factorization theorem
§ 1. Primes
§ 2. The unique factorization theorem
§ 3. A second proof of Theorem 2
§ 4. Greatest common divisor and least common multiple
§ 5. Farey sequences
§ 6. The infinitude of primes

Chapter II. Congruences
§ 1. Residue classes
§ 2. Theorems of Euler and of Fermat
§ 3. The number of solutions of a congruence

Chapter III. Rational approximation of irrationals and Hurwitz's theorem
§ 1. Approximation of irrationals
§ 2. Sums of two squares
§ 3. Primes of the form $4k \pm 1$
§ 4. Hurwitz's theorem

Chapter IV. Quadratic residues and the representation of a number as a sum of four squares
§ 1. The Legendre symbol
§ 2. Wilson's theorem and Euler's criterion
§ 3. Sums of two squares
§ 4. Sums of four squares

Chapter V. The law of quadratic reciprocity
§ 1. Quadratic reciprocity
§ 2. Reciprocity for generalized Gaussian sums
§ 3. Proof of quadratic reciprocity
§ 4. Some applications

Chapter VI. Arithmetical functions and lattice points
§ 1. Generalities
§ 2. The lattice point function $r(n)$
§ 3. The divisor function $d(n)$
§ 4. The function $\sigma(n)$
§ 5. The Möbius function $\mu(n)$
§ 6. Euler's function $\varphi(n)$

Chapter VII. Chebyshev's theorem on the distribution of prime numbers
§ 1. The Chebyshev functions
§ 2. Chebyshev's theorem
§ 3. Bertrand's postulate
§ 4. Euler's identity
§ 5. Some formulae of Mertens

Chapter VIII. Weyl's theorems on uniform distribution and Kronecker's theorem
§ 1. Introduction
§ 2. Uniform distribution in the unit interval
§ 3. Uniform distribution modulo 1
§ 4. Weyl's theorems
§ 5. Kronecker's theorem

Chapter IX. Minkowski's theorem on lattice points in convex sets
§ 1. Convex sets
§ 2. Minkowski's theorem
§ 3. Applications

Chapter X. Dirichlet's theorem on primes in an arithmetical progression
§ 1. Introduction
§ 2. Characters
§ 3. Sums of characters, orthogonality relations
§ 4. Dirichlet series, Landau's theorem
§ 5. Dirichlet's theorem

Chapter XI. The prime number theorem
§ 1. The non-vanishing of $\zeta(1 + it)$
§ 2. The Wiener-Ikehara theorem
§ 3. The prime number theorem

A list of books
Notes
Subject index

Chapter I
The unique factorization theorem

§ 1. Primes. We assume as known the positive integers $1, 2, 3, \dots$, the negative integers $-1, -2, -3, \dots$, and zero, which we reckon as an integer. By the non-negative integers we mean the positive integers together with zero. We assume as known the elementary arithmetical operations on integers.

An integer $a$ is said to be divisible by an integer $b \neq 0$, if there exists an integer $c$, such that $a = bc$. We then say that $b$ divides $a$, or $b$ is a divisor of $a$, and indicate this by writing $b \mid a$. We also say that $a$ is an integral multiple or just a multiple of $b$. We write $b \nmid a$ to indicate that $b$ does not divide $a$.

The following propositions are easily verified: if $b \mid a$, and $a > 0$, and $b > 0$, then $1 \le b \le a$; if $b \mid a$, and $c \mid b$, then $c \mid a$; if $b \mid a$, and $c \neq 0$, then $bc \mid ac$; if $c \mid a$, and $c \mid b$, then $c \mid (ma + nb)$, for all integers $m$ and $n$.
Given two integers $a$ and $b$, $b \neq 0$, there exist unique integers $q$ and $r$, such that $a = bq + r$, where $0 \le r < |b|$. We call $q$ the quotient, and $r$ the remainder in the division of $a$ by $b$. If $b \mid a$, then $r = 0$.

An integer $p$, where $p > 1$, is a prime number, or a prime, if its only positive divisors are $1$ and $p$. An integer greater than 1, which is not a prime, is called composite.

In this chapter we shall prove that every integer greater than 1 can be represented as a product of primes, and that such a representation as a product is unique, except for the order of the factors. We shall also prove that there exist infinitely many primes.

§ 2. The unique factorization theorem. We begin with the following simple

THEOREM 1. If $n$ is an integer greater than 1, then $n$ is a product of primes.

PROOF. Either $n$ is a prime, or it is composite. In the former case, there is nothing more to prove. If $n$ is composite, then, by definition, there exist integers $d$, such that $1 < d < n$, and $d \mid n$. Let $m$ be the least of such divisors. Then $m$ must be a prime, for otherwise there exists an integer $k$, such that $1 < k < m$, and $k \mid m$. That would imply that $k \mid n$, and $1 < k < m$, which contradicts the definition of $m$. Thus $m$ is a prime $p_1$, say. We then write $n = p_1 \cdot r$, where $1 < r < n$, and repeat the same process with $r$, to obtain $n = p_1 \cdot p_2 \cdot s$, where $p_2 \ge p_1$, and $1 \le s < r < n$. This process clearly breaks off after a finite number of steps, since there are only finitely many integers between $1$ and $n$. We therefore obtain

$$n = p_1 \cdot p_2 \cdots p_l, \tag{1}$$

which concludes the proof.

We note, in passing, that if $n = ab$, then $a$ and $b$ cannot both be greater than $\sqrt{n}$. It follows that any composite integer $n$ has a prime factor $p$, such that $p \le \sqrt{n}$.

By grouping together the equal primes in the representation (1), and changing the indices, if necessary, we can rewrite (1) as

$$n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k}, \tag{2}$$

where $p_1 < p_2 < \dots < p_k$, and $a_i > 0$, for $i = 1, 2, \dots, k$. This is called the standard form of $n$.
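The proof of Theorem 1 is constructive, and it can be sketched in a few lines of Python. This is an illustration only, not part of the original text: the function name `standard_form` and the list-of-pairs return value are assumptions made for the example. The sketch repeatedly splits off the least divisor greater than 1 (a prime, as in the proof), and it stops searching once the candidate exceeds the square root of what remains, in line with the remark that a composite $n$ has a prime factor $p \le \sqrt{n}$.

```python
def standard_form(n: int) -> list[tuple[int, int]]:
    """Return [(p_1, a_1), ..., (p_k, a_k)] with p_1 < ... < p_k
    and n equal to the product of the p_i ** a_i (hypothetical helper)."""
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            # d is the least divisor > 1 of the current n, hence a prime.
            exponent = 0
            while n % d == 0:
                n //= d
                exponent += 1
            factors.append((d, exponent))
        d += 1
    if n > 1:
        # The remaining n has no divisor d with 1 < d <= sqrt(n), so it is prime.
        factors.append((n, 1))
    return factors
```

For instance, `standard_form(360)` returns `[(2, 3), (3, 2), (5, 1)]`, that is, $360 = 2^3 \cdot 3^2 \cdot 5$.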
We are now in a position to prove the unique factorization theorem, which is also known as the fundamental theorem of arithmetic (Theorem 2).

THEOREM 2. The standard form of an integer $n$, which is greater than 1, is unique.

We shall give three proofs of this theorem. The first proof uses only Theorem 1. The second is connected with the solution of linear equations in integers, while the third makes use of the theory of Farey sequences.

FIRST PROOF OF THEOREM 2. The standard form of a prime is clearly unique. Suppose, if possible, that some positive integers $> 1$ have two different standard forms. Let $N$ be the smallest such integer, with

$$N = p_1 p_2 \cdots p_k = q_1 q_2 \cdots q_m.$$

Every $p$ is distinct from every $q$, since any prime common to both the representations would divide $N$ to yield an integer $N' < N$ with the same property as $N$, which is impossible by the definition of $N$. We may assume that

$$p_1 \le p_2 \le \dots \le p_k, \qquad q_1 \le q_2 \le \dots \le q_m.$$

Now $p_1 \neq q_1$. Let us suppose, as we may, that $p_1 < q_1$. We define the number

$$P = p_1 q_2 \cdots q_m.$$

Since $p_1 \mid P$, and $p_1 \mid N$, it follows that $p_1 \mid (N - P)$, where $N - P = (q_1 - p_1) q_2 \cdots q_m > 1$. Therefore we can write

$$N - P = p_1 t_1 t_2 \cdots t_h, \tag{3}$$

where the $t_i$ are primes for $i = 1, 2, \dots, h$. We can also write $q_1 - p_1$ as a product of primes, say

$$q_1 - p_1 = r_1 r_2 \cdots r_s,$$

if $q_1 - p_1 > 1$. Then we get

$$N - P = r_1 r_2 \cdots r_s \, q_2 \cdots q_m \tag{4}$$

as another representation of $N - P$ as a product of primes. We have seen that none of the $p$'s is equal to a $q$. In particular, $p_1$ is not equal to any $q$. Nor is $p_1$ equal to any $r$, for it is clear that $p_1 \nmid (q_1 - p_1)$, so that no factorization of $q_1 - p_1$ can contain $p_1$. Thus the integer $N - P$ has two factorizations, namely (3) and (4), which are distinct, since only one of them contains $p_1$. This is the case even if $q_1 - p_1 = 1$. But $1 < N - P < N$, which contradicts the minimality of $N$. Hence there exists no integer $n > 1$ with more than one standard form.

§ 3. A second proof of Theorem 2. This is based on the solution of certain linear equations in integers. We need some preparation. Let $a$ and $b$ denote integers, not both zero.
Their greatest common divisor, denoted by $(a, b)$, is defined to be the largest positive integer which divides both $a$ and $b$. If $(a, b) = 1$, we say that $a$ is prime to $b$, or that $a$ and $b$ are relatively prime.

We shall see that if $(a, b) = d$, the equation $ax + by = d$ has a solution in integers $x, y$. It follows from this that if $p$ is a prime, and $p \mid ab$, then $p \mid a$ or $p \mid b$, and this, in turn, implies the unique factorization theorem.

A non-empty set of integers $S$ with the property

$$m \in S \text{ and } n \in S \Rightarrow m - n \in S,$$

is called a module. It follows from the definition that if $m, n \in S$, then

$$0 = m - m \in S, \qquad -n = 0 - n \in S, \qquad m + n = m - (-n) \in S.$$

More generally, if $a \in S$, $b \in S$, then $ax + by \in S$, where $x$ and $y$ are integers. If a module contains only $0$, we call it the trivial module. A non-trivial module obviously contains infinitely many positive, and negative, integers. We can say a little more.

THEOREM 3. Every non-trivial module $S$ consists of all integral multiples of a positive integer.

PROOF. Since $S$ is not the trivial module, it contains some positive integers. Let $d$ be the smallest such integer. Then $S$ contains all integral multiples of $d$. In order to show that these are the only elements of $S$, take any $n \in S$. We can write $n = dk + c$, where $k$ and $c$ are integers, and $0 \le c < d$. Since $d \in S$, it follows that $dk \in S$. Since $n \in S$, we have $n - dk \in S$, that is $c \in S$. But $c < d$, and $d$ is the smallest positive integer in $S$. Hence $c = 0$. Therefore $n$ is an integral multiple of $d$.

From this we deduce

THEOREM 4. If $a$ and $b$ are given integers, the module $S = \{ax + by\}$, where $x$ and $y$ are integers, is the set of all integral multiples of $d = (a, b)$.

PROOF. It is easy to see that the set $S$ is a module. By Theorem 3 we know that $S$ is the set of all integral multiples of some positive integer $e$. Therefore $e$ divides all elements of $S$; in particular, $e \mid a$, and $e \mid b$. Since $d$ is the greatest common divisor of $a$ and $b$, we must have $e \le d$. On the other hand, $d \mid (ax + by)$ for all integers $x, y$, so that $d$ divides every element of $S$.
In particular, $d \mid e$. Hence $d \le e$. Thus $e = d$, and the result follows.

It is now clear that the following theorem holds:

THEOREM 5. The equation $ax + by = n$ is soluble in integers $x$ and $y$ if and only if $(a, b) \mid n$.

COROLLARY 1. If $(a, b) = d$, then $ax + by = d$ is soluble in integers $x$ and $y$. In other words, the greatest common divisor of $a$ and $b$ is a linear combination of these integers with integer coefficients.

COROLLARY 2. Any common divisor of $a$ and $b$ divides $(a, b)$.

These results lead to

THEOREM 6 (EUCLID). If $a \mid bc$, and $(a, b) = 1$, then $a \mid c$.

PROOF. Since $(a, b) = 1$, there exist integers $x$ and $y$, such that $ax + by = 1$. If we multiply by $c$, we get $acx + bcy = c$, and since $a \mid bc$, it follows that $a \mid (acx + bcy)$, or $a \mid c$.

COROLLARY. If $p$ is a prime, and $p \mid \prod_{i=1}^{r} p_i$, where $p_i$ is a prime for $i = 1, 2, \dots, r$, then $p = p_i$ for at least one $i$.

We are now in a position to give

A SECOND PROOF OF THEOREM 2. Suppose that an integer $N$ has two different standard forms,

$$N = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} = q_1^{b_1} q_2^{b_2} \cdots q_r^{b_r}.$$

Then $p_1 \mid q_1^{b_1} q_2^{b_2} \cdots q_r^{b_r}$, hence, by the Corollary of Theorem 6, $p_1 = q_i$ for some $i$, $1 \le i \le r$. In the same way we see that every $p$ equals some $q$, and every $q$ equals some $p$. Therefore $k = r$, and since both forms are arranged in ascending order, we have

$$N = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k} = p_1^{b_1} p_2^{b_2} \cdots p_k^{b_k},$$

with $p_1 < p_2 < \dots < p_k$. We shall see that $a_i = b_i$ for $i = 1, 2, \dots, k$. For if $a_i > b_i$ for some $i$, we can divide both sides by $p_i^{b_i}$ and obtain

$$p_1^{a_1} \cdots p_i^{a_i - b_i} \cdots p_k^{a_k} = p_1^{b_1} \cdots p_{i-1}^{b_{i-1}} \, p_{i+1}^{b_{i+1}} \cdots p_k^{b_k},$$

where $p_i$ divides the left-hand side, but not the right-hand side, which is impossible. Similarly it is impossible that $a_i < b_i$. Hence $a_i = b_i$ for all $i$, and the standard form is unique.

§ 4. Greatest common divisor and least common multiple. Related to the greatest common divisor of two integers $a$ and $b$, defined in § 3, is the least common multiple.

DEFINITION. The least common multiple $\{a, b\}$ of two integers $a$ and $b$, where $ab \neq 0$, is the smallest positive integer which is divisible by both $a$ and $b$.

The relationship between $(a, b)$ and $\{a, b\}$, where $ab > 0$, is expressed by the identity

$$ab = (a, b) \cdot \{a, b\}. \tag{5}$$

To prove this, consider the integer $\mu = ab/(a, b)$. Since $(a, b) \mid b$, $\mu$ is an integral multiple of $a$. Similarly $\mu$ is an integral multiple of $b$. Thus $\mu$ is a common multiple of $a$ and $b$. Let $\nu$ be an integer which is some other common integral multiple of $a$ and $b$, and consider the number

$$\frac{\nu}{\mu} = \frac{\nu \cdot (a, b)}{ab}.$$

We know that $(a, b) = ax + by$ for some integers $x$ and $y$. Hence

$$\frac{\nu}{\mu} = \frac{\nu \cdot (ax + by)}{ab} = \frac{\nu x}{b} + \frac{\nu y}{a}.$$

But $\nu/a$ and $\nu/b$ are integers, hence $\nu/\mu$ is an integer. Thus any common integral multiple of $a$ and $b$ is an integral multiple of $\mu$. Hence $\mu$ is their least common multiple, and

$$\mu = \frac{ab}{(a, b)} = \{a, b\}.$$

Incidentally we have shown that the least common multiple of $a$ and $b$ divides any common multiple of $a$ and $b$.

If $a$ is a positive integer, we can write

$$a = \prod_{p} p^{\alpha}, \qquad \alpha \ge 0,$$

where the product extends over all primes $p$, and $\alpha$ is a non-negative integer which is zero except for finitely many $p$. If a prime $p$ does not divide $a$, then the corresponding exponent $\alpha$ is zero. Similarly we have

$$b = \prod_{p} p^{\beta}, \qquad \beta \ge 0.$$

It is easy to see that

$$(a, b) = \prod_{p} p^{\min[\alpha, \beta]}, \qquad \{a, b\} = \prod_{p} p^{\max[\alpha, \beta]}. \tag{6}$$

§ 5. Farey sequences. If $h$ and $k$ are integers, and $k > 0$, we call $h/k$ a fraction, with numerator $h$, and with denominator $k$. A fraction $h/k$ is called irreducible, or reduced, if $(h, k) = 1$. A fraction $h/k$ is called proper, if $0 \le h/k \le 1$. A Farey sequence of order $n$, where $n$ is a positive integer, is the sequence $F_n$ of all irreducible, proper fractions $h/k$, with $1 \le k \le n$, arranged in non-decreasing order. For example, $F_5$ is the sequence

$$\frac{0}{1}, \frac{1}{5}, \frac{1}{4}, \frac{1}{3}, \frac{2}{5}, \frac{1}{2}, \frac{3}{5}, \frac{2}{3}, \frac{3}{4}, \frac{4}{5}, \frac{1}{1}.$$

A Farey fraction is a term in a Farey sequence of some order. We note that every rational number $m/n$, such that $0 \le m/n \le 1$, is equal to a Farey fraction.

It follows from the unique factorization theorem (Theorem 2) that a reduced fraction is unique. In other words, two reduced fractions which are equal must be identical. Since we do not wish to use Theorem 2, however, we have to allow for the possibility that two Farey fractions may be equal without being identical.
In that case, we arrange them in increasing order of their numerators. The following theorem rules out such a possibility in fact, and prepares the ground for a third proof of Theorem 2.

THEOREM 7 (FAREY-CAUCHY). If $l/m$ is the immediate successor of $h/k$ in the Farey sequence $F_N$, then $kl - hm = 1$.

PROOF. The result is seen to be true, by actual verification, for $F_N$, $1 \le N \le 5$. We shall assume it true for $F_N$, and prove it for $F_{N+1}$.

Let $a/b$ be a reduced proper fraction which does not belong to $F_N$. Then $b \ge N + 1$, and $a/b$ must lie between some two consecutive fractions $h/k$ and $l/m$ of $F_N$, say

$$\frac{h}{k} \le \frac{a}{b} \le \frac{l}{m},$$

equality being allowed, since the uniqueness of reduction of a fraction is not assumed. Define the integers $\lambda$ and $\mu$ as follows:

$$\lambda = ka - hb, \qquad \mu = bl - am.$$

Then $\lambda \ge 0$, $\mu \ge 0$, and $\lambda + \mu > 0$, since we have assumed the theorem to be true for $F_N$, to which $h/k$ and $l/m$ belong. Further

$$\lambda l + \mu h = kal - ham = a(kl - hm) = a,$$

since $kl - hm = 1$ by the induction hypothesis on $F_N$. Similarly

$$\lambda m + \mu k = b, \tag{7}$$

and $(\lambda, \mu) = 1$, since $(a, b) = 1$. Thus, if

$$\frac{a}{b} = \frac{\lambda l + \mu h}{\lambda m + \mu k}, \quad \lambda \ge 0, \quad \mu \ge 0, \quad \lambda + \mu > 0, \quad \frac{h}{k} \le \frac{a}{b} \le \frac{l}{m}, \quad (a, b) = 1,$$

then $(\lambda, \mu) = 1$.

Conversely, if $\lambda$ and $\mu$ are integers, such that $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu > 0$, $(\lambda, \mu) = 1$, and we define $a, b$ by $a = \lambda l + \mu h$, $b = \lambda m + \mu k$, then uniquely $\lambda = ka - hb$, $\mu = bl - am$, and $(a, b) = 1$, so the fraction $a/b$ is reduced, and $h/k \le a/b \le l/m$, since $kl - hm = 1$. Thus $a/b$ belongs to $F_M$, for some $M$. Since $k > 0$, $m > 0$, $(\lambda, \mu) = 1$, we also see that $b \le m + k$ exactly in the three cases $\lambda, \mu = 0, 1$; $1, 1$; $1, 0$; giving $a, b = h, k$; $l + h, m + k$; $l, m$.

Now $\lambda \neq 0$, for if $\lambda = 0$, then $a/b = (\mu h)/(\mu k)$, which is not reduced unless $\mu = 1$, in which case $b = k$ by (7), and that contradicts the assumption that $b \ge N + 1 > k$. Similarly $\mu \neq 0$. Hence $b \le m + k$ only if $\lambda = \mu = 1$.

Now $b \ge N + 1$, and if $(a/b) \in F_{N+1}$, then $b = N + 1$. Further $m + k \ge N + 1$, since

$$\frac{h}{k} < \frac{h + l}{k + m} < \frac{l}{m},$$

$h/k$ and $l/m$ being consecutive terms in $F_N$. It follows that if $b = N + 1$, then $\lambda = 1$ and $\mu = 1$.
Hence

$$\frac{a}{b} \in F_{N+1} \Rightarrow a = h + l, \quad b = k + m, \quad \frac{a}{b} = \frac{h + l}{k + m},$$

and this fraction $a/b$ clearly satisfies the theorem with respect to its neighbours $h/k$ and $l/m$, since $kl - hm = 1$, by the induction hypothesis on $F_N$. Thus the theorem holds for $F_{N+1}$ if it holds for $F_N$. Since we know that it does hold for $F_1$, it holds for all $F_n$.

It follows from Theorem 7 that a reduced fraction is unique.

DEFINITION. The fraction $(h + l)/(k + m)$ is called the mediant of the fractions $h/k$ and $l/m$.

Implicit in the proof of Theorem 7 is the result that the mediant of two Farey fractions is a Farey fraction, as well as

THEOREM 8. The fractions which belong to $F_{N+1}$ but not to $F_N$ are mediants of the neighbouring fractions in $F_N$.

A consequence of Theorem 7 is

THEOREM 9. If $h/k$, $h''/k''$, $h'/k'$ are successive fractions belonging to the same Farey sequence, then $h''/k'' = (h + h')/(k + k')$.

PROOF. By Theorem 7, we have $kh'' - hk'' = 1$, and $k''h' - h''k' = 1$, and by subtraction we get the required relation.

THEOREM 10. If $h/k$ and $l/m$ are two successive fractions in a Farey sequence $F_N$, then $k + m \ge N + 1$.

PROOF. Since

$$\frac{h}{k} < \frac{h + l}{k + m} < \frac{l}{m},$$

the mediant of $h/k$ and $l/m$ does not belong to $F_N$, hence $k + m > N$.

Finally we prove

THEOREM 11. If $N > 1$, no two successive fractions in $F_N$ have the same denominator.

PROOF. Let $k > 1$. If $h'/k$ is the immediate successor of $h/k$ in $F_N$, then $h + 1 \le h' < k$, and we would have

$$\frac{h}{k} < \frac{h}{k - 1} < \frac{h + 1}{k} \le \frac{h'}{k}.$$

Thus $h/(k - 1)$ would lie between $h/k$ and $h'/k$ in $F_N$, which contradicts our assumption about $h/k$ and $h'/k$.

THIRD PROOF OF THEOREM 2. We can now apply our knowledge of Farey sequences to prove that the equation $ax + by = 1$, where $(a, b) = 1$, is soluble in integers $x, y$. This implies, as we have already seen, Theorem 2.

Since the conclusion is trivially true when $ab = 0$, or when $a = b$, we shall suppose that $b > a > 0$, and $(a, b) = 1$. Consider the fraction $a/b$. It occurs as a term in a Farey sequence, for example in $F_b$. Let $h/k$ be its immediate predecessor in that sequence.
Then by Theorem 7 we have $ka - hb = 1$, so that $x = k$ and $y = -h$ give a solution of our equation.

§ 6. The infinitude of primes. We have obtained three different proofs of the unique factorization theorem. We shall now show that there are infinitely many primes.

THEOREM 12 (EUCLID). The number of primes is infinite.

We shall give two different proofs of this theorem, the first by Euclid, and the second by G. Pólya. A third proof, due to Euler, is given in Chapter VII, § 1.

FIRST PROOF OF THEOREM 12 (EUCLID). Let $2, 3, 5, \dots, p$ be the set of all primes up to $p$, and consider the integer

$$q = (2 \cdot 3 \cdot 5 \cdots p) + 1.$$

It is not divisible by any of the primes up to $p$. Since $q > 1$, either $q$ is itself a prime greater than $p$, or is divisible by a prime greater than $p$. In either case, there exists a prime greater than $p$. Hence the number of primes is infinite.

If $p_n$ denotes the $n$th prime, it follows from this argument that

$$p_m \,\Big|\, \prod_{i=1}^{n} p_i + 1$$

for an $m > n$. Hence

$$p_{n+1} \le p_m \le \prod_{i=1}^{n} p_i + 1 < p_n^{\,n} + 1 \quad \text{for } n > 1.$$

Actually the argument can be made to yield a little more. One can prove that

$$p_n \le 2^{2^{n-1}}, \qquad n \ge 1,$$

with $p_n < 2^{2^{n-1}}$ for $n > 1$. For suppose that $p_1 \le 2$, $p_2 \le 2^2$, $p_3 \le 2^4$, $\dots$, $p_n \le 2^{2^{n-1}}$. Then

$$p_{n+1} \le p_1 p_2 \cdots p_n + 1 \le 2^{1 + 2 + 4 + \dots + 2^{n-1}} + 1 = 2^{2^n - 1} + 1 < 2^{2^n},$$

and we have the required result by induction.

Pólya's proof of Theorem 12 uses a property of Fermat numbers. A Fermat number $f_n$ is an integer of the form $f_n = 2^{2^n} + 1$, $n \ge 1$. We shall see that Theorem 12 is a consequence of

THEOREM 13 (PÓLYA). Any two different Fermat numbers are relatively prime.

PROOF. Let $f_n$ and $f_{n+k}$ $(k > 0)$ be any two Fermat numbers. Suppose that $m$ is a positive integer, such that $m \mid f_n$, and $m \mid f_{n+k}$. Setting $x = 2^{2^n}$, we have

$$\frac{f_{n+k} - 2}{f_n} = \frac{x^{2^k} - 1}{x + 1} = x^{2^k - 1} - x^{2^k - 2} + \dots - 1,$$

so that $f_n \mid (f_{n+k} - 2)$. It follows that $m \mid (f_{n+k} - 2)$. Since $m$ also divides $f_{n+k}$, this implies that $m \mid 2$. But Fermat numbers are odd. Therefore $m = 1$, which proves Theorem 13.

SECOND PROOF OF THEOREM 12 (PÓLYA).
It follows from Theorem 13 that each of the Fermat numbers $f_1, f_2, \dots, f_n$ is divisible by an odd prime which does not divide the others. Hence there are at least $n$ odd primes not exceeding $f_n$. Consequently there are infinitely many primes.

Further, if we allow $n = 0$, with $f_0 = 3$, then since $p_1 = 2$, and there are at least $n$ odd primes not exceeding $f_n$ for $n \ge 1$, we obtain $p_{n+2} \le f_n$, where $p_n$ denotes the $n$th prime. That is $p_{n+2} \le 2^{2^n} + 1$, which is better than the previous estimate.

Fermat observed that $f_1 = 5$, $f_2 = 17$, $f_3 = 257$, $f_4 = 65537$ are all primes, and conjectured that all $f_n$ are primes. This was disproved, however, by Euler, who showed that $f_5$ is divisible by 641. A simple proof, due to G. T. Bennett, runs as follows:

$$f_5 = 2^{2^5} + 1 = 2^{32} + 1 = (2 \cdot 2^7)^4 + 1.$$

Set $2^7 = a$, and $5 = b$. Then $f_5 = (2a)^4 + 1 = 2^4 a^4 + 1$. Now $2^4 = 1 + 3b$, or $2^4 = 1 + b(a - b^3)$. Hence

$$f_5 = (1 + ab - b^4) a^4 + 1 = (1 + ab)\bigl[a^4 + (1 - ab)(1 + a^2 b^2)\bigr],$$

which implies that $1 + ab$ $(= 641)$ divides $f_5$. It does not seem to be known whether any Fermat numbers, other than the first four, are primes.

Chapter II
Congruences

§ 1. Residue classes. Let $a$, $b$, and $m$ be integers, and $m > 0$. We say that $a$ is congruent to $b$ modulo $m$, if $m \mid (a - b)$. We express this in symbols as: $a \equiv b \pmod{m}$, and call it a congruence. If $m \nmid (a - b)$, we say that $a$ is incongruent to $b$ modulo $m$, and write $a \not\equiv b \pmod{m}$.

The congruence relation is an equivalence relation, for it is reflexive, since $a \equiv a \pmod{m}$; symmetric, since $a \equiv b \pmod{m}$ implies $b \equiv a \pmod{m}$; and transitive, since $a \equiv b \pmod{m}$ and $b \equiv c \pmod{m}$ imply $a \equiv c \pmod{m}$. Thus the relation "$\equiv \pmod{m}$" partitions the integers into disjoint equivalence classes $A, B, C, \dots$, such that two integers are congruent modulo $m$ if and only if they lie in the same class. These classes are called residue classes modulo $m$. Clearly the integers $0, 1, \dots, m - 1$ all lie in different residue classes. Since any integer $n$ can be written as $n = qm + r$, $0 \le r \le m - 1$, every integer is congruent modulo $m$ to one of the integers $0, 1, \dots, m - 1$. Therefore there are exactly $m$ residue classes modulo $m$, and the integers $0, 1, \dots, m - 1$ form a set of representatives of these classes.

Congruences can be added, subtracted, or multiplied, like ordinary equalities. If $a \equiv b \pmod{m}$, and $c \equiv d \pmod{m}$, then $a + c \equiv b + d \pmod{m}$, $a - c \equiv b - d \pmod{m}$, and $ac \equiv bd \pmod{m}$. For, if $m \mid (a - b)$, and $m \mid (c - d)$, then $m \mid \{(a - b) \pm (c - d)\}$; further $m \mid (a - b)c$, so that $ac \equiv bc \pmod{m}$; and $m \mid (c - d)b$, so that $bc \equiv bd \pmod{m}$; and since the congruence relation is transitive, we have $ac \equiv bd \pmod{m}$.

In general one cannot divide congruences. We have $2 \equiv 12 \pmod{10}$, but $1 \not\equiv 6 \pmod{10}$.

Let $A$ and $B$ be two residue classes. Then, according to the above rules, if $a$ is an arbitrary element of $A$, and $b$ of $B$, then $a + b$ always lies in the same residue class, which we call the sum $A + B$. Likewise we use the notations $A - B$ and $A \cdot B$, and speak of the difference, or product, of two residue classes.

It is easy to see that the residue classes modulo $m$ form an abelian group with respect to addition. The zero element of this group is the class which contains all integral multiples of $m$, and the inverse of a class $A$ is the class $A'$ which contains the negatives of all members of $A$.

The congruence

$$ax \equiv c \pmod{m}$$

is equivalent to the linear equation $ax - my = c$, and, by Theorem 5 of Chapter I, we see that it has a solution in integers $x, y$ if $(a, m) = 1$. The solution is unique, up to congruence, for if $ax_1 \equiv c \pmod{m}$, and $ax_2 \equiv c \pmod{m}$, then $a(x_1 - x_2) \equiv 0 \pmod{m}$, or $m \mid a(x_1 - x_2)$. But since $(a, m) = 1$, this implies that $m \mid (x_1 - x_2)$, or $x_1 \equiv x_2 \pmod{m}$. Therefore, if $x_0, y_0$ is a particular solution of the linear equation $ax + by = n$, $(a, b) = 1$, the general solution is given by

$$x = x_0 - bt, \qquad y = y_0 + at,$$

where $t$ is an integer.

We can also express the result which we have just obtained for congruences by saying that if $A$, $C$ and $X$ are residue classes modulo $m$, the equation $AX = C$ has a single solution $X$, if the elements of $A$ are prime to $m$.
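The solubility of $ax \equiv c \pmod{m}$ for $(a, m) = 1$ can be made concrete with a short Python sketch. It is an illustration under stated assumptions rather than the method of the text: Theorem 5 of Chapter I only asserts that integers $x, y$ with $ax + my = 1$ exist, and the sketch finds them with the extended Euclidean algorithm, a technique swapped in here for computation. The names `extended_gcd` and `solve_linear_congruence` are hypothetical.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = gcd(a, b) >= 0 and a*x + b*y == d."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Invariant: a*old_x + b*old_y == old_r and a*x + b*y == r.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def solve_linear_congruence(a: int, c: int, m: int) -> int:
    """Return the unique x with 0 <= x < m and a*x congruent to c (mod m),
    assuming m > 0 and (a, m) = 1."""
    d, x, _ = extended_gcd(a, m)
    if d != 1:
        raise ValueError("a must be prime to m")
    # a*x + m*y == 1, so a*(x*c) is congruent to c modulo m.
    return (x * c) % m
```

For example, `solve_linear_congruence(3, 1, 7)` returns `5`, since $3 \cdot 5 = 15 \equiv 1 \pmod{7}$; every other solution is congruent to 5 modulo 7.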
Those residue classes modulo $m$ whose elements are prime to $m$ are called prime residue classes. They form an abelian group with respect to multiplication, the unit class being the one which contains the integer 1. Each prime residue class has an inverse, for if $(a, m) = 1$, there exists an integer $a'$ such that $aa' \equiv 1 \pmod{m}$.

Let us consider the additive group of all residue classes modulo a prime $p$. With the exception of the zero class, they are all prime residue classes, hence form also a multiplicative abelian group. The distributive law $A(B + C) = AB + AC$ is a simple consequence of the distributive law for integers. We therefore have

THEOREM 1. The residue classes modulo a prime $p$ form a field of $p$ elements.

RESIDUE SYSTEMS. We have distinguished the prime residue classes modulo $m$ from among all the $m$ residue classes modulo $m$. A complete residue system modulo $m$ consists of one representative of each residue class. Thus a set of $m$ integers is a complete residue system modulo $m$ only if its members are pairwise incongruent modulo $m$. On the other hand, a complete prime residue system modulo $m$ consists of one representative of each prime residue class modulo $m$. For example, the integers $0, 1, \dots, 7$ form a complete residue system (mod 8), while $1, 3, 5$ and $7$ form a complete prime residue system (mod 8).

EULER'S FUNCTION $\varphi$. Euler's function $\varphi$ is defined for all positive integers $n$ by the relation: $\varphi(n)$ equals the number of integers among $1, 2, \dots, n$ which are prime to $n$. It follows from the definition that $\varphi(n)$ is also the number of prime residue classes modulo $n$.

§ 2. Theorems of Euler and of Fermat. If $a_1, a_2, \dots, a_m$ is a complete residue system modulo $m$, and if $k$ is an integer prime to $m$, then the set $ka_1, ka_2, \dots, ka_m$ is also a complete residue system modulo $m$, for these $m$ integers are easily seen to be pairwise incongruent modulo $m$. More generally, if $(k, m) = 1$, and $h$ is some integer, the set $ka_i + h$ $(i = 1, 2, \dots, m)$ is also a complete residue system modulo $m$.

On the other hand, if $r_1, r_2, \dots, r_{\varphi(m)}$ is a complete prime residue system modulo $m$, and if $(a, m) = 1$, then the integers $ar_1, ar_2, \dots, ar_{\varphi(m)}$ also form a complete prime residue system. Hence

$$\prod_{i=1}^{\varphi(m)} (a r_i) \equiv \prod_{i=1}^{\varphi(m)} r_i \pmod{m},$$

or

$$a^{\varphi(m)} \prod_{i=1}^{\varphi(m)} r_i \equiv \prod_{i=1}^{\varphi(m)} r_i \pmod{m}.$$

Since $r_1, r_2, \dots, r_{\varphi(m)}$ are prime to $m$, we have

THEOREM 2 (EULER). If $(a, m) = 1$, then $a^{\varphi(m)} \equiv 1 \pmod{m}$.

A particular case of this theorem, where $m$ is a prime, was discovered by Fermat.

THEOREM 3 (FERMAT). If $p$ is a prime, and $(a, p) = 1$, then $a^{p-1} \equiv 1 \pmod{p}$.

To prove an important property of Euler's function, we need

THEOREM 4. Let $(m, m') = 1$. If $a$ runs through a complete residue system (mod $m$), and $a'$ through a complete residue system (mod $m'$), then $am' + a'm$ runs through a complete residue system (mod $mm'$).

PROOF. There are $mm'$ integers $am' + a'm$, and every two of them are incongruent (mod $mm'$), for if

$$a_1' m + a_1 m' \equiv a_2' m + a_2 m' \pmod{mm'},$$

then

$$a_1 m' \equiv a_2 m' \pmod{m},$$

from which it follows, since $(m, m') = 1$, that $a_1 \equiv a_2 \pmod{m}$. Similarly $a_1' \equiv a_2' \pmod{m'}$.

DEFINITION. An arithmetical function is a complex-valued function defined on the set of positive integers. An arithmetical function $f$ is multiplicative, if (i) $f$ is not identically zero, and (ii) $(m, n) = 1$ implies that $f(mn) = f(m) f(n)$.

Theorem 4 can be used to prove

THEOREM 5. Euler's function $\varphi$ is multiplicative.

PROOF. Since $\varphi(1) = 1$, $\varphi$ is not identically zero. Let $(m, m') = 1$, and let $a$ and $a'$ run through complete residue systems modulo $m$, and modulo $m'$, respectively. Then, by Theorem 4, $am' + a'm$ runs through a complete residue system (mod $mm'$). Therefore $\varphi(mm')$ is the number of integers $am' + a'm$ which satisfy the condition $(am' + a'm, mm') = 1$. But this is equivalent to the two conditions

$$(am' + a'm, m) = 1, \quad \text{and} \quad (am' + a'm, m') = 1,$$

or to

$$(am', m) = 1, \quad \text{and} \quad (a'm, m') = 1,$$

or to

$$(a, m) = 1, \quad \text{and} \quad (a', m') = 1.$$

Since there are $\varphi(m)$ values of $a$ for which $(a, m) = 1$, and $\varphi(m')$ values of $a'$ for which $(a', m') = 1$, there are $\varphi(m) \cdot \varphi(m')$ values of $am' + a'm$ which are prime to $mm'$.
Hence $\varphi(mm') = \varphi(m) \cdot \varphi(m')$.

This proof leads also to the following

THEOREM 5'. If $(m, m') = 1$, and if $a$ runs through a complete prime residue system (mod $m$), and $a'$ through a complete prime residue system (mod $m'$), then $am' + a'm$ runs through a complete prime residue system (mod $mm'$).

Theorem 5 can be used to calculate $\varphi(n)$. Every integer $n > 1$ can be written in the standard form

$$n = \prod_{i=1}^{r} p_i^{a_i},$$

so that

$$\varphi(n) = \prod_{i=1}^{r} \varphi(p_i^{a_i}),$$

and $\varphi(n)$ is known if we know $\varphi(p^a)$ for a prime $p$. We have obviously $\varphi(p) = p - 1$. If $a > 1$, consider the complete residue system modulo $p^a$, namely $1, 2, \dots, p^a$. Exactly $p^{a-1}$ of these integers are not prime to $p^a$, namely the multiples $p, 2p, 3p, \dots, p^a$ of $p$. Therefore

$$\varphi(p^a) = p^a - p^{a-1} = p^a \left(1 - \frac{1}{p}\right).$$

Thus

$$\varphi(n) = \prod_{i=1}^{r} \varphi(p_i^{a_i}) = \prod_{i=1}^{r} p_i^{a_i} \left(1 - \frac{1}{p_i}\right),$$

or

$$\varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right). \tag{1}$$

Another important property of $\varphi$ is given by

THEOREM 6. $\sum_{d \mid m} \varphi(d) = m$.

PROOF. Let $m = \prod_{i=1}^{r} p_i^{a_i}$. The divisors of $m$ are then of the form $\prod_{i=1}^{r} p_i^{b_i}$, where $0 \le b_i \le a_i$. Hence

$$\sum_{d \mid m} \varphi(d) = \sum_{\substack{(b_1, \dots, b_r) \\ 0 \le b_i \le a_i}} \varphi\Bigl(\prod_{i=1}^{r} p_i^{b_i}\Bigr) = \sum_{\substack{(b_1, \dots, b_r) \\ 0 \le b_i \le a_i}} \prod_{i=1}^{r} \varphi(p_i^{b_i}),$$

by Theorem 5. By writing out the terms and rearranging, we obtain

$$\sum_{d \mid m} \varphi(d) = \prod_{i=1}^{r} \sum_{b_i = 0}^{a_i} \varphi(p_i^{b_i}) = \prod_{i=1}^{r} \bigl[\varphi(1) + \varphi(p_i) + \dots + \varphi(p_i^{a_i})\bigr] = \prod_{i=1}^{r} \bigl[1 + (p_i - 1) + p_i (p_i - 1) + \dots + p_i^{a_i - 1}(p_i - 1)\bigr] = \prod_{i=1}^{r} p_i^{a_i} = m.$$

§ 3. The number of solutions of a congruence. We have seen earlier in this chapter that if $(a, m) = 1$, the linear congruence $ax \equiv c \pmod{m}$ is soluble, and has, up to congruence, but one solution. We now raise the question of the number of solutions of a polynomial congruence

$$a_0 x^n + a_1 x^{n-1} + \dots + a_n \equiv 0 \pmod{p},$$

where $a_0, a_1, \dots, a_n$ are integers, $n > 1$, and $p$ is a prime.

If $x$ is a solution of this congruence, so is any integer congruent to $x$ (mod $p$). For this reason, when we speak of the number of solutions of a congruence, we mean the number of residue classes whose elements satisfy the congruence.
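This counting convention can be sketched in Python as follows. The sketch is illustrative only: the function name `count_solutions` and the coefficient-list convention `[a_0, a_1, ..., a_n]` are assumptions made for the example, and the primality of `p` is assumed rather than checked. One representative of each residue class, namely $0, 1, \dots, p - 1$, is tested, and the polynomial is evaluated modulo $p$ by Horner's rule.

```python
def count_solutions(coeffs: list[int], p: int) -> int:
    """Count residue classes x (mod p) with
    a_0*x^n + a_1*x^(n-1) + ... + a_n congruent to 0 (mod p),
    where coeffs = [a_0, a_1, ..., a_n] and p is assumed to be prime."""

    def value_mod_p(x: int) -> int:
        acc = 0
        for a in coeffs:
            # Horner's rule, reducing modulo p at every step.
            acc = (acc * x + a) % p
        return acc

    # One representative per residue class: 0, 1, ..., p - 1.
    return sum(1 for x in range(p) if value_mod_p(x) == 0)
```

For example, `count_solutions([1, 0, -1], 7)` returns `2`, the two classes being those of 1 and 6, which satisfy $x^2 \equiv 1 \pmod{7}$.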
The number of solutions is therefore equal to the number of representatives of a complete residue system (mod $p$) which satisfy the congruence.

Such congruences may have solutions or not. For example $x^2 \equiv 3 \pmod{7}$ has no solution. On the other hand, we know by Fermat's theorem (Theorem 3) that the congruence $x^{p-1} \equiv 1 \pmod{p}$ has the $p - 1$ solutions $x = 1, 2, \dots, p - 1$. Since $x^{p-1} \equiv 1 \pmod{p}$ if $p \nmid x$, we have $x^p \equiv x \pmod{p}$ for all $x$, and $x^{p+1} \equiv x^2 \pmod{p}$, and so on; any power greater than $p - 1$ can be reduced, so that we may assume the degree $n < p$. Further we shall suppose that $(a_0, p) = 1$, to ensure that the congruence is really of degree $n$.

The answer to the question raised at the beginning of this section is given by

THEOREM 7 (LAGRANGE). The congruence

$$a_0 x^n + a_1 x^{n-1} + \dots + a_n \equiv 0 \pmod{p}, \qquad (a_0, p) = 1 \tag{2}$$

has at most $n$ solutions.

PROOF. We use induction. The theorem is true for $n = 1$, since $(a_0, p) = 1$. Now suppose the theorem true with $n - 1$ in place of $n$. It is trivially true for the degree $n$, if the congruence (2) has no solution. If it does have a solution, say $x_1$, then

$$a_0 x_1^n + a_1 x_1^{n-1} + \dots + a_n \equiv 0 \pmod{p}. \tag{3}$$

If we subtract this from (2), we get

$$a_0 (x^n - x_1^n) + a_1 (x^{n-1} - x_1^{n-1}) + \dots + a_{n-1} (x - x_1) \equiv 0 \pmod{p}, \tag{4}$$

which is obviously satisfied by any solution of (2). But (4) can be written as

$$(x - x_1)(a_0 x^{n-1} + b_1 x^{n-2} + \dots + b_{n-1}) \equiv 0 \pmod{p},$$

where $b_1, b_2, \dots, b_{n-1}$ are integers which depend on $x_1$ and on the integers $a_0, \dots, a_{n-1}$. Therefore every solution of (2) must satisfy either the congruence

$$(x - x_1) \equiv 0 \pmod{p},$$

which yields the original solution $x = x_1$, or

$$a_0 x^{n-1} + b_1 x^{n-2} + \dots + b_{n-1} \equiv 0 \pmod{p}, \qquad (a_0, p) = 1,$$

which is of degree $n - 1$, and has, by the induction hypothesis, at most $(n - 1)$ solutions. In either case (2) can have at most $n$ solutions, as claimed.

Chapter III
Rational approximation of irrationals and Hurwitz's theorem

§ 1. Approximation of irrationals. Let $\xi$ be a real number which is irrational.
Then given $\varepsilon > 0$, we know that there exists a rational number $h/k$, such that $|\xi - h/k| < \varepsilon$, since the set of rational numbers is dense in the space of real numbers. The problem we now wish to consider is the size of the difference $|\xi - h/k|$ as a function of $k$. Unless there is a statement to the contrary, we shall assume that $0 < \xi < 1$, and that $h/k$ is irreducible, and $k > 0$.

THEOREM 1. If $\xi$ is irrational, and $N$ a positive integer, then there exists a rational number $h/k$, with denominator $k \le N$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{kN}.$$

PROOF. For any real number $x$, let $[x]$ denote the integral part of $x$, that is the integer $m$, such that $m \le x < m+1$. We then have $0 < n\xi - [n\xi] < 1$, the first inequality being strict since $\xi$ is irrational.

As $n$ takes the values $1, 2, \dots, N$, we get $N$ different numbers $n\xi - [n\xi]$, all of which lie in the open interval $(0,1)$. Consider the $N$ sub-intervals $(0, 1/N), (1/N, 2/N), \dots, ((N-1)/N, 1)$. Either each of these sub-intervals contains in its interior exactly one of the numbers $n\xi - [n\xi]$, or there exists a sub-interval which contains more than one of them. In the first case, the interval $(0, 1/N)$ contains one of the numbers, and therefore
$$0 < m\xi - [m\xi] < 1/N,$$
for an integer $m$ such that $1 \le m \le N$. That is, $0 < \xi - [m\xi]/m < 1/(mN)$, and we have thus found a rational number $h/k$ with the desired property.

If the sub-interval $(0, 1/N)$ contains none of the numbers $n\xi - [n\xi]$, $1 \le n \le N$, then there exists another sub-interval which contains at least two such numbers, say $n\xi - [n\xi]$ and $m\xi - [m\xi]$. We then have two integers $m$ and $n$, with $0 < m < n \le N$, such that
$$\bigl|(n\xi - [n\xi]) - (m\xi - [m\xi])\bigr| < \frac{1}{N},$$
or
$$\bigl|(n-m)\xi - ([n\xi] - [m\xi])\bigr| < \frac{1}{N}.$$
If we set $n - m = k$, and $[n\xi] - [m\xi] = h$, then we have again
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{kN},$$
with $k < N$.

A slightly stronger result than Theorem 1 is

THEOREM 2. If $\xi$ is irrational, and $N$ a positive integer, then there exists a rational number $h/k$, with $k \le N$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{k(N+1)}.$$

PROOF.
This can be proved in the same way as Theorem 1. Let $x_0 = 0, x_1, x_2, \dots, x_N, x_{N+1} = 1$ be the $N+2$ different numbers $0$, $1$, and $n\xi - [n\xi]$, $n = 1, 2, \dots, N$, in ascending order. Then $1$ is the sum of the $N+1$ positive and irrational differences $x_{n+1} - x_n$, $n = 0, 1, \dots, N$, hence $x_{n+1} - x_n < 1/(N+1)$ for at least one value of $n$. This implies, as in the proof of Theorem 1, that there exists a rational number $h/k$ such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{k(N+1)},$$
where $k \le N$.

Another proof of Theorem 2 uses Farey sequences. If $F_N$ denotes the Farey sequence of order $N$, then, since $\xi$ is irrational, $\xi \notin F_N$ for any $N$. But $\xi$ lies between some two consecutive fractions $a/b$ and $c/d$ belonging to $F_N$. Let $a/b < \xi < c/d$. Consider the mediant $(a+c)/(b+d)$. From Chapter I we know that $a/b < (a+c)/(b+d) < c/d$. Hence either $a/b < \xi < (a+c)/(b+d)$, or $(a+c)/(b+d) < \xi < c/d$. But $(a+c)/(b+d) \notin F_N$, since $a/b$ and $c/d$ are consecutive terms in $F_N$. Hence $b + d \ge N+1$. Therefore we have either
$$0 < \xi - \frac{a}{b} < \frac{a+c}{b+d} - \frac{a}{b} = \frac{bc - ad}{b(b+d)} = \frac{1}{b(b+d)} \le \frac{1}{b(N+1)},$$
or
$$0 < \frac{c}{d} - \xi < \frac{c}{d} - \frac{a+c}{b+d} = \frac{bc - ad}{d(b+d)} = \frac{1}{d(b+d)} \le \frac{1}{d(N+1)}.$$
Since $a/b$ and $c/d$ belong to $F_N$, they are irreducible, and $b \le N$, $d \le N$. We have therefore obtained the required approximant $h/k$ (equal to $a/b$ or $c/d$).

We can consider the validity of Theorem 2 when $\xi$ is rational, say $\xi = l/m$, with $(l,m) = 1$, and $m > N$. Then $\xi \notin F_N$, and we can follow the same proof as above, except that we may now have $\xi = (a+c)/(b+d)$, which would not allow us to claim the strict inequality of Theorem 2. Thus we have

THEOREM 3. If $\xi$ is rational, and $N$ a positive integer, and $\xi = l/m$, $(l,m) = 1$, where $m > N$, then there exists an irreducible fraction $h/k$ with denominator $k \le N$, such that
$$\left|\xi - \frac{h}{k}\right| \le \frac{1}{k(N+1)}.$$

Theorem 1 implies, since $N \ge k$, the following

THEOREM 4. If $\xi$ is irrational, then there exist infinitely many rationals $h/k$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{k^2}.$$

This is sometimes expressed by saying that an irrational $\xi$ can be approximated to within $1/k^2$ by a rational $h/k$.
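The approximation theorems of this section can be illustrated by a direct search. The sketch below is an illustration only, under stated assumptions: the name `best_approximation` is hypothetical, $\xi$ is represented by a floating-point number (so the check is numerical, not exact), and the search simply picks the denominator $k \le N$ for which $k\xi$ is closest to an integer. Since Theorem 2 guarantees that some $k \le N$ gives $|k\xi - h| < 1/(N+1)$, the minimiser must satisfy the same bound.

```python
import math


def best_approximation(xi, N):
    """Return (h, k), 1 <= k <= N, minimising |k*xi - h| (floating-point sketch)."""
    best = None
    for k in range(1, N + 1):
        h = round(k * xi)            # nearest integer to k*xi
        err = abs(k * xi - h)
        if best is None or err < best[0]:
            best = (err, h, k)
    _, h, k = best
    return h, k


if __name__ == "__main__":
    xi = math.sqrt(2) - 1            # an irrational number in (0, 1)
    for N in (5, 10, 50):
        h, k = best_approximation(xi, N)
        bound = 1 / (k * (N + 1))
        print(N, (h, k), abs(xi - h / k) < bound)
```

The fraction returned need not be in lowest terms; reducing it only makes the denominator smaller and keeps the same value of $h/k$.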
Since $\xi - h/k$ can be written as $(\xi + n) - (h + kn)/k$, where $n$ is an integer, Theorems 1, 2, 3, and 4 are true without the assumption $0 < \xi < 1$.

§ 2. Sums of two squares. Theorem 3 can be used to show that certain integers are representable as sums of two squares.

THEOREM 5. If $n$ and $A$ are positive integers, such that $n \mid (A^2 + 1)$, then there exist integers $s$ and $t$, such that $n = s^2 + t^2$.

PROOF. The case $n = 1$ is trivial. We assume therefore that $n \ge 2$, and define $N = [\sqrt{n}]$. Then $n > N$ for $n \ge 2$. Since $n \mid (A^2 + 1)$, it follows that $(n, A) = 1$. Hence $A/n$ is a reduced fraction with denominator $n > N$, and by Theorem 3 there exists a reduced fraction $r/s$, such that
$$\left|\frac{A}{n} - \frac{r}{s}\right| \le \frac{1}{(N+1)s}, \qquad 0 < s \le N.$$
That is
$$|As - rn| \le \frac{n}{N+1} = \frac{n}{[\sqrt{n}] + 1} < \sqrt{n}.$$
Let $As - rn = t$. Then $t$ is an integer, and
$$s^2 + t^2 = s^2 + (As - rn)^2 = s^2(A^2 + 1) - 2Asrn + r^2 n^2.$$
Since $n$ divides the right-hand side of the equation, we must have $n \mid (s^2 + t^2)$. But $0 < s \le N \le \sqrt{n}$, and $|t| < \sqrt{n}$, which together imply that $0 < s^2 + t^2 < 2n$. Since $s^2 + t^2$ is a multiple of $n$, we must have $s^2 + t^2 = n$, so that $n$ is representable as a sum of two squares.

It is easy to see, besides, that $(s,t) = 1$. For $(s,t) = (s, As - rn) = (s, rn)$. However, $r/s$ is irreducible, hence $(r,s) = 1$. Thus $(s,t) = (s,n)$. But $n = s^2 + t^2$, hence
$$1 = \frac{s^2(A^2 + 1)}{n} - 2Asr + r^2 n.$$
Since, by hypothesis, $(A^2 + 1)/n$ is an integer, it follows that any common divisor of $s$ and $n$ must divide $1$, hence $(s,n) = 1 = (s,t)$.

COROLLARY. If $n$ is a positive integer, and $n \mid (A^2 + B^2)$, where $(A,B) = 1$, then there exist integers $s$ and $t$, such that $n = s^2 + t^2$.

PROOF. We use the identity
$$(A^2 + B^2)(C^2 + D^2) = (AD + BC)^2 + (AC - BD)^2.$$
Since $(A,B) = 1$, we know from Chapter I that there exist integers $C$ and $D$, such that $AC - BD = 1$. We then have $(A^2 + B^2)(C^2 + D^2) = (AD + BC)^2 + 1$, so that if $n \mid (A^2 + B^2)$, then $n \mid \{(AD + BC)^2 + 1\}$. This, by Theorem 5, implies that $n = s^2 + t^2$.

§ 3. Primes of the form $4k \pm 1$. Euclid's proof of the existence of infinitely many primes was given in Chapter I.
Every prime number other than 2 is odd, and an odd number is either of the form $4k-1$ or $4k+1$, where $k$ is an integer. We shall show, by arguments similar to Euclid's, that both these sequences contain infinitely many primes.

THEOREM 6. There exist infinitely many primes of the form $4k-1$.

PROOF. Let $q_1, q_2, \dots, q_r$ be the first $r$ primes of the form $4k-1$. Define $N = 4q_1 q_2 \cdots q_r - 1$. Then $N$ is an odd number. Therefore all its divisors are of the form $4k-1$ or $4k+1$. But $N$ cannot have only divisors of the form $4k+1$, since the product of two integers of that form is again of the same form, whereas $N$ is of the form $4k-1$. Hence $N$ has a prime divisor of the form $4k-1$. But $N$ is not divisible by $q_1, \dots, q_r$. Therefore there exists a prime of the form $4k-1$, which is greater than $q_r$.

THEOREM 7. There exist infinitely many primes of the form $4k+1$.

PROOF. Suppose, if possible, that $5, 13, \dots, p$ are the only primes of the form $4k+1$, of which $p$ is the largest. Define the integer
$$N = (2 \cdot 5 \cdot 13 \cdots p)^2 + 1.$$
Since $N$ is odd, all its divisors must be odd. By Theorem 5, every prime divisor $q$ of $N$ is of the form $q = s^2 + t^2$. For $q$ to be odd, one of the two integers $s$ and $t$ must be odd, and the other even. Then $q = s^2 + t^2 \equiv 1 \pmod{4}$. That is, every prime divisor of $N$ is of the form $4k+1$. This leads, however, to a contradiction, since $N > 1$, and is not divisible by any of the primes $5, 13, \dots, p$, which, according to our hypothesis, were the only primes of the form $4k+1$.

§ 4. Hurwitz's theorem. We begin by sharpening Theorem 4.

THEOREM 8. If $\xi$ is irrational, there exist infinitely many irreducible fractions $h/k$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{2k^2}.$$

PROOF. Let $F_N$ be the Farey sequence of order $N > 1$. Then $\xi$ lies between some two consecutive fractions $a/b$, $c/d$ belonging to $F_N$, so that
$$\frac{a}{b} < \xi < \frac{c}{d}.$$
We shall prove the theorem by showing that one at least of the inequalities
$$\xi - \frac{a}{b} < \frac{1}{2b^2}, \qquad \frac{c}{d} - \xi < \frac{1}{2d^2} \tag{1}$$
holds.
For, if this were false, we should have, since $\xi$ is irrational,
$$\xi - \frac{a}{b} > \frac{1}{2b^2}, \qquad \frac{c}{d} - \xi > \frac{1}{2d^2}, \tag{2}$$
which imply, since $bc - ad = 1$, that $(b-d)^2 < 0$. Hence we have either
$$\xi - \frac{a}{b} < \frac{1}{2b^2}, \quad\text{or}\quad \frac{c}{d} - \xi < \frac{1}{2d^2}.$$
Thus there exists a fraction $h/k$ in $F_N$ (equal to $a/b$ or $c/d$), such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{2k^2}.$$
Since $(c/d) - (a/b) = 1/(bd)$, and because of the choice of $h/k$, we have
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{bd} \le \frac{1}{b+d-1} \le \frac{1}{N},$$
if we note that $b + d \ge N+1$ because of Theorem 10 of Chapter I. There exist infinitely many such fractions $h/k$, since $N$ is at our disposal, which proves Theorem 8.

In Theorem 4 we showed that any irrational $\xi$ can be approximated to within $1/k^2$ by an infinity of rationals $h/k$. In Theorem 8 that approximation was improved to $1/2k^2$. The question arises whether this result can be further improved. Does there exist a number $c > 2$, such that $\xi$ can be approximated to within $1/ck^2$ by an infinity of rationals $h/k$? The answer to this question is given by Hurwitz's theorem, which follows.

THEOREM 9 (HURWITZ). If $\xi$ is an irrational number, and $c$ any positive real number, such that $c \le \sqrt{5}$, then there exist infinitely many rational numbers $h/k$, such that
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{ck^2}.$$
If $c > \sqrt{5}$, then there exist irrationals $\xi$ for which the above approximation holds only for finitely many rationals $h/k$.

PROOF (KHINCHIN). Let $F_N$ be a Farey sequence of order $N > 1$, and $h/k$, $h'/k'$ two successive terms in it such that $h/k < \xi < h'/k'$. We may suppose that either
$$k' > \frac{(\sqrt{5}+1)k}{2}, \quad\text{or}\quad k' < \frac{(\sqrt{5}-1)k}{2}.$$
For if
$$\frac{(\sqrt{5}-1)k}{2} < k' < \frac{(\sqrt{5}+1)k}{2},$$
then
$$k + k' > \frac{\sqrt{5}+1}{2}\,\max(k, k'),$$
and we can replace $F_N$ by $F_M$, $M = k + k'$, and one of $h/k$, $h'/k'$ by their mediant $(h+h')/(k+k')$, since $k(h+h') - h(k+k') = (k+k')h' - (h+h')k' = 1$ (cf. Theorem 7, Chapter I). Thus, if $k'/k = \omega$, then $\omega > (\sqrt{5}+1)/2$, or $\omega < (\sqrt{5}-1)/2$. In either case we have $1 + \omega^{-2} > \sqrt{5}\,\omega^{-1}$, since
$$\frac{1}{\sqrt{5}}\left(1 + \frac{1}{\omega^2}\right) - \frac{1}{\omega} = \frac{1}{\sqrt{5}\,\omega^2}\left(\omega - \frac{\sqrt{5}+1}{2}\right)\left(\omega - \frac{\sqrt{5}-1}{2}\right) > 0.$$
Hence
$$\frac{1}{\sqrt{5}}\left(\frac{1}{k^2} + \frac{1}{k'^2}\right) = \frac{1}{\sqrt{5}\,k^2}\left(1 + \frac{1}{\omega^2}\right) > \frac{1}{\omega k^2},$$
so that
$$\frac{h'}{k'} - \frac{h}{k} = \frac{1}{kk'} = \frac{1}{k^2\omega} < \frac{1}{\sqrt{5}}\left(\frac{1}{k^2} + \frac{1}{k'^2}\right),$$
which implies that
$$\frac{h}{k} + \frac{1}{\sqrt{5}\,k^2} > \frac{h'}{k'} - \frac{1}{\sqrt{5}\,k'^2}.$$
Hence one of the open intervals
$$\left(\frac{h}{k},\ \frac{h}{k} + \frac{1}{\sqrt{5}\,k^2}\right), \qquad \left(\frac{h'}{k'} - \frac{1}{\sqrt{5}\,k'^2},\ \frac{h'}{k'}\right)$$
contains $\xi$. Reasoning as before, we see that there exist an infinity of such approximations, which proves the first part of the theorem.

To prove the second part, we assume that $c > \sqrt{5}$, and consider the irrational number $\xi = \tfrac{1}{2}(1 + \sqrt{5})$. We shall see that $\xi$ has only finitely many rational approximants $h/k$ satisfying the inequality
$$\left|\xi - \frac{h}{k}\right| < \frac{1}{ck^2}. \tag{3}$$
Let $c = \sqrt{5}/\alpha$, where $0 < \alpha < 1$, and suppose that
$$\left|\frac{h}{k} - \frac{1+\sqrt{5}}{2}\right| < \frac{\alpha}{\sqrt{5}\,k^2}.$$
This can be written as $|\delta| < \alpha$, if we set
$$\frac{h}{k} - \frac{1+\sqrt{5}}{2} = \frac{\delta}{\sqrt{5}\,k^2},$$
in which case
$$h - \frac{k}{2} = \frac{\sqrt{5}\,k}{2} + \frac{\delta}{\sqrt{5}\,k},$$
or
$$h^2 - hk - k^2 = \delta + \frac{\delta^2}{5k^2}.$$
Since $h$ and $k$ are integers, it follows that $h^2 - hk - k^2$ cannot be zero unless $h = k = 0$. But it is impossible that $k = 0$, hence $|h^2 - hk - k^2| \ge 1$. Since $|\delta| < \alpha < 1$, we have
$$1 \le \left|\delta + \frac{\delta^2}{5k^2}\right| < \alpha + \frac{\alpha^2}{5k^2},$$
or
$$k^2 < \frac{\alpha^2}{5(1-\alpha)}. \tag{4}$$
Thus the denominator $k$ of a fraction $h/k$ which satisfies (3) must satisfy (4). Since $\alpha$ is given, $k$ can have only finitely many integral values, and because of (3), $h$ can have only finitely many integral values. Thus, if $c > \sqrt{5}$, inequality (3) can hold only for finitely many fractions $h/k$, which completes the proof of Theorem 9.

Following the remark at the end of Theorem 4, Theorems 8 and 9 hold without the assumption that $0 < \xi < 1$.

Chapter IV
Quadratic residues and the representation of a number as a sum of four squares

§ 1. The Legendre symbol. The theory of quadratic residues is a fundamental part of the theory of numbers. It can, for instance, be applied to prove such elegant results as Euler's theorem that every prime number of the form $4k+1$ is a sum of two squares, and Lagrange's theorem that every positive integer is a sum of four squares.

Let $p$ be an odd prime, and $a$ an integer such that $(a,p) = 1$.
If there exists an integer $x$ such that $x^2 \equiv a \pmod{p}$, then $a$ is called a quadratic residue modulo $p$. If there exists no such $x$, then $a$ is called a quadratic non-residue modulo $p$. We shall sometimes write $a\,R\,p$ to indicate that $a$ is a quadratic residue modulo $p$, and $a\,N\,p$ to indicate that it is a quadratic non-residue modulo $p$.

In order to find out how many of the integers $1, 2, 3, \dots, p-1$ are quadratic residues modulo $p$, we should know how many of the congruences
$$x^2 \equiv a \pmod{p} \tag{1}$$
are soluble when $a$ runs through the integers $1, 2, 3, \dots, p-1$. Let us consider the integers $1^2, 2^2, 3^2, \dots, \left(\frac{p-1}{2}\right)^2$. They are all mutually incongruent $(\mathrm{mod}\ p)$. For if we take any two of them, say $r^2$ and $s^2$, $r \ne s$, then $r^2 \equiv s^2 \pmod{p}$ would imply that $r \equiv s \pmod{p}$, or $r \equiv -s \pmod{p}$, and both alternatives are excluded, since $1 \le r, s \le (p-1)/2$. Further, $r^2 \equiv (p-r)^2 \pmod{p}$. It follows from these two remarks that the integer $a$ in (1) assumes $\tfrac{1}{2}(p-1)$ different values, when $x$ runs through the set $1, 2, 3, \dots, p-1$. Hence there are exactly $\tfrac{1}{2}(p-1)$ quadratic residues modulo $p$, and $\tfrac{1}{2}(p-1)$ quadratic non-residues.

THE LEGENDRE SYMBOL. Let $p$ be an odd prime, and $m$ an integer such that $(m,p) = 1$. We define the Legendre symbol $\left(\frac{m}{p}\right)$ by the relations
$$\left(\frac{m}{p}\right) = \begin{cases} +1, & \text{if } m\,R\,p, \\ -1, & \text{if } m\,N\,p. \end{cases} \tag{2}$$
It is convenient to extend Legendre's definition by defining $\left(\frac{m}{p}\right) = 0$, if $p \mid m$. Since there are as many quadratic residues as non-residues $(\mathrm{mod}\ p)$, it follows that
$$\sum_{m=1}^{p-1} \left(\frac{m}{p}\right) = 0.$$

§ 2. Wilson's theorem and Euler's criterion. The following result, known as Wilson's theorem, but first proved by Lagrange, expresses a characteristic property of primes.

THEOREM 1. If $p$ is a prime, then $(p-1)! \equiv -1 \pmod{p}$.

PROOF. If $p = 2$, the conclusion is obvious. Therefore let $p > 2$. From the discussion in § 1 of Chapter II it follows that to any $x$ in the set $1, 2, \dots, (p-1)$, there corresponds one and only one $x'$ in the same set, such that
$$xx' \equiv 1 \pmod{p}. \tag{3}$$
Further $x = x'$ if and only if $x = 1$ or $p-1$.
For the congruence $x^2 \equiv 1 \pmod{p}$ is equivalent to $(x-1)(x+1) \equiv 0 \pmod{p}$, so that either $x \equiv 1 \pmod{p}$, which implies that $x = 1$, or $x \equiv -1 \pmod{p}$, which implies that $x = p-1$. From (3) it follows that
$$2 \cdot 3 \cdots (p-2) \equiv 1 \pmod{p}.$$
If we multiply this, in turn, by the congruence
$$1 \cdot (p-1) \equiv -1 \pmod{p},$$
we get
$$1 \cdot 2 \cdot 3 \cdots (p-1) \equiv -1 \pmod{p}, \tag{4}$$
which is Wilson's theorem.

Note that if $p$ is composite, then it can be factorized as $p = qr$, $1 < q < p$. Hence $q$ occurs as a factor in the product $1 \cdot 2 \cdot 3 \cdots (p-1)$, and the congruence
$$(p-1)! + 1 \equiv 0 \pmod{q}$$
is impossible; so also the congruence
$$(p-1)! + 1 \equiv 0 \pmod{p}.$$
Thus Wilson's theorem states a property characteristic of the primes.

Now let $p$ be an odd prime, and $(a,p) = 1$. We shall see that if $a$ is a quadratic residue modulo $p$, then
$$a^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}.$$
For the congruence $x^2 \equiv a \pmod{p}$ is then soluble, and $(x,p) = 1$, since $(a,p) = 1$. If we raise this congruence to the power $\tfrac{1}{2}(p-1)$, which is an integer since $p$ is odd, we get
$$x^{p-1} \equiv a^{\frac{1}{2}(p-1)} \pmod{p}.$$
But by Fermat's theorem (Theorem 3, Chapter II), we know that
$$x^{p-1} \equiv 1 \pmod{p}.$$
Hence
$$a^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}.$$
On the other hand, the congruence
$$x^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}$$
has at most $\tfrac{1}{2}(p-1)$ solutions, because of Lagrange's theorem (Theorem 7, Chapter II). And we know from § 1 that there are exactly $\tfrac{1}{2}(p-1)$ quadratic residues. Each of them, as we have just seen, satisfies it, hence there are no other solutions. Thus we obtain

THEOREM 2 (Euler's criterion). Suppose $p$ is an odd prime, and $a$ is any integer. Then
$$a^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p},$$
if and only if $a$ is a quadratic residue modulo $p$.

Now if $p$ is an odd prime, and $(x,p) = 1$, then by Fermat's theorem
$$x^{p-1} - 1 = \left(x^{\frac{p-1}{2}} - 1\right)\left(x^{\frac{p-1}{2}} + 1\right) \equiv 0 \pmod{p}.$$
Hence either
$$x^{\frac{1}{2}(p-1)} \equiv 1 \pmod{p}, \tag{5}$$
or
$$x^{\frac{1}{2}(p-1)} \equiv -1 \pmod{p}. \tag{6}$$
Since, by Theorem 2, a quadratic non-residue does not satisfy (5), it must satisfy (6).
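Euler's criterion, together with the alternative (5) or (6), gives a direct way to decide whether a number is a quadratic residue. The following is a minimal sketch under stated assumptions: the names `legendre` and `quadratic_residues` are illustrative, $p$ is assumed (not checked) to be an odd prime, and the built-in three-argument `pow` is used for modular exponentiation.

```python
def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, computed as m^((p-1)/2) mod p."""
    r = pow(m, (p - 1) // 2, p)      # r is 0, 1 or p-1 when p is an odd prime
    return -1 if r == p - 1 else r


def quadratic_residues(p):
    """The quadratic residues modulo p among 1, ..., p-1, found by squaring."""
    return sorted({(x * x) % p for x in range(1, p)})


if __name__ == "__main__":
    p = 13
    print(quadratic_residues(p))
    print([legendre(a, p) for a in range(1, p)])
```

Comparing the two outputs for a small prime shows the criterion and the definition by squares picking out the same set, and that exactly half of the classes $1, \dots, p-1$ are residues.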
Combining this observation with the definition of the Legendre symbol, we obtain

THEOREM 3. If $p$ is an odd prime, then
$$m^{\frac{1}{2}(p-1)} \equiv \left(\frac{m}{p}\right) \pmod{p}.$$

COROLLARY. We have
$$\left(\frac{m}{p}\right)\left(\frac{n}{p}\right) = \left(\frac{mn}{p}\right),$$
which means that the product of two quadratic residues, or non-residues, modulo $p$, is again a quadratic residue; but the product of a quadratic residue with a quadratic non-residue, modulo $p$, is again a quadratic non-residue.

§ 3. Sums of two squares. Let $p$ be an odd prime, and set $m = p-1$ in Theorem 3. Since $p - 1 \equiv -1 \pmod{p}$, we get
$$\left(\frac{-1}{p}\right) \equiv (-1)^{\frac{1}{2}(p-1)} \pmod{p}.$$
But $\left(\frac{-1}{p}\right) = \pm 1$, and $(-1)^{\frac{1}{2}(p-1)} = \pm 1$, and $p \ge 3$. Hence
$$\left(\frac{-1}{p}\right) = (-1)^{\frac{1}{2}(p-1)},$$
from which it follows that $-1$ is a quadratic residue $(\mathrm{mod}\ p)$ of all primes $p \equiv 1 \pmod{4}$, and a quadratic non-residue $(\mathrm{mod}\ p)$ of all primes $p \equiv 3 \pmod{4}$. This leads us to

THEOREM 4 (EULER). Every prime of the form $4k+1$ is representable as a sum of two squares.

PROOF. If $p$ is a prime of the form $4k+1$, then $-1$ is a quadratic residue of $p$. That is, the congruence $x^2 \equiv -1 \pmod{p}$ has a solution. Therefore there exists an integer $A$, such that $p \mid (A^2 + 1)$. This implies, by Theorem 5 of Chapter III, that $p$ is a sum of two squares.

The result that if $p$ is a prime of the form $4k+1$, then $p \mid (A^2 + 1)$, for some integer $A$, can be sharpened as follows.

THEOREM 5. If $p$ is a prime, such that $p \equiv 1 \pmod{4}$, then there exists an integer $x$, such that
$$x^2 + 1 = mp,$$
where $0 < m < p$.

PROOF. Since $-1$ is a quadratic residue of $p$, there exists an integer $x$ of the set $1, 2, 3, \dots, \tfrac{1}{2}(p-1)$, which satisfies the congruence $x^2 \equiv -1 \pmod{p}$. That is, $x^2 + 1 = mp$ for some integer $m$. But $x < p/2$, therefore $x^2 + 1 < (p/2)^2 + 1 < p^2$. Hence $x^2 + 1 = mp$, with $0 < m < p$.

A result similar to Theorem 5 is the following

THEOREM 6. If $p$ is an odd prime, there exist integers $x$ and $y$ such that
$$1 + x^2 + y^2 = mp,$$
where $0 < m < p$.

PROOF.
The integers $x^2$, $0 \le x \le \tfrac{1}{2}(p-1)$, are pairwise incongruent $(\mathrm{mod}\ p)$; so are the integers $-1 - y^2$, $0 \le y \le \tfrac{1}{2}(p-1)$. But these two sets together contain $p+1$ integers, and since there are only $p$ residue classes $(\mathrm{mod}\ p)$, some member $x^2$ of the first set must be congruent to some member $-1 - y^2$ of the second set. Thus $x^2 \equiv -1 - y^2 \pmod{p}$, or $1 + x^2 + y^2 = mp$. But $0 \le x, y \le \tfrac{1}{2}(p-1)$. Therefore
$$1 + x^2 + y^2 \le 1 + 2\left(\frac{p-1}{2}\right)^2 < p^2,$$
hence $1 + x^2 + y^2 = mp$, $0 < m < p$, as claimed in the theorem.

We have seen that every prime $p$ such that $p \equiv 1 \pmod{4}$ is representable as a sum of two squares. But other integers also have that property. For instance $10 = 1^2 + 3^2$. The following theorem gives a necessary and sufficient condition for a positive integer to be representable as a sum of two squares.

THEOREM 7. A positive integer $n$ is a sum of two squares if and only if all its prime factors of the form $4k+3$ have even exponents in the standard form of $n$.

For the proof of Theorem 7 we need two lemmas. We call a representation $n = x^2 + y^2$ primitive if $(x,y) = 1$, and imprimitive otherwise.

LEMMA 1. If $n$ is divisible by a prime $p$, where $p \equiv 3 \pmod{4}$, then $n$ has no primitive representations.

PROOF. If $n$ has a primitive representation, say $n = x^2 + y^2$, $(x,y) = 1$, then $p \mid (x^2 + y^2)$, but $p \nmid x$, $p \nmid y$. And since $(p,x) = 1$, the equation $mx - tp = c$ is soluble in integers $m$ and $t$, for all integral $c$, and in particular for $c = y$. Hence there exists an integer $m$ such that
$$mx \equiv y \pmod{p},$$
which implies that
$$x^2 + (mx)^2 \equiv x^2 + y^2 \equiv 0 \pmod{p}.$$
Therefore $p \mid x^2(m^2 + 1)$, and since $p \nmid x$, it follows that $p \mid (m^2 + 1)$. That is, $m^2 \equiv -1 \pmod{p}$. In other words, $-1$ is a quadratic residue modulo a prime $p$ of the form $4k+3$, which is impossible, as we have seen at the beginning of § 3. Thus the lemma is proved.

LEMMA 2. If $p$ is a prime, $p \equiv 3 \pmod{4}$, and $c$ is an odd integer, such that $p^c \mid n$ but $p^{c+1} \nmid n$, then $n$ cannot be represented as a sum of two squares.

PROOF. Suppose, if possible, that $n = x^2 + y^2$, where $(x,y) = d$.
Then we have $x = dX$, $y = dY$, with $(X,Y) = 1$, and $n = d^2(X^2 + Y^2) = d^2 N$, say. Let $p^r$ be the highest power of $p$ which divides $d$. Then $p^{c-2r}$ is the highest power of $p$ which divides $N$. And $c - 2r > 0$, since $c$ is odd. Thus we have an integer $N$, such that $N = X^2 + Y^2$, $(X,Y) = 1$, and $p \mid N$, where $p \equiv 3 \pmod{4}$. This contradicts Lemma 1, hence Lemma 2 is proved.

PROOF OF THEOREM 7. The condition is necessary, for Lemma 2 implies that if $n$ is a sum of two squares, then every prime factor of $n$, of the form $4k+3$, has an even exponent in the standard form of $n$.

The condition is also sufficient, for if $n$ is a positive integer such that every prime factor of the form $4k+3$ which occurs in its standard form has an even exponent, then $n$ can be written as $n = n_1^2 n_2$, where $n_2$ has no prime factors of the form $4k+3$. Therefore the only prime factors of $n_2$ are either the number 2 or odd primes of the form $4k+1$. Now 2 is representable as a sum of two squares $1^2 + 1^2$, and every odd prime of the form $4k+1$ can be represented as a sum of two squares. Further the identity
$$(x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1 x_2 + y_1 y_2)^2 + (x_1 y_2 - x_2 y_1)^2$$
shows that the product of two numbers each of which is representable as a sum of two squares is likewise representable. Hence $n_2 = a^2 + b^2$, which implies that $n = (n_1 a)^2 + (n_1 b)^2$.

§ 4. Sums of four squares. We conclude this chapter with a result which is as famous as it is elegant.

THEOREM 8 (LAGRANGE). Every positive integer $n$ is a sum of four squares.

PROOF. Since $1 = 1^2 + 0^2 + 0^2 + 0^2$, we suppose in what follows that $n > 1$. The identity
$$(x_1^2 + x_2^2 + x_3^2 + x_4^2)(y_1^2 + y_2^2 + y_3^2 + y_4^2) = z_1^2 + z_2^2 + z_3^2 + z_4^2, \tag{7}$$
where
$$\begin{aligned}
z_1 &= x_1 y_1 + x_2 y_2 + x_3 y_3 + x_4 y_4, \\
z_2 &= x_1 y_2 - x_2 y_1 + x_3 y_4 - x_4 y_3, \\
z_3 &= x_1 y_3 - x_3 y_1 + x_4 y_2 - x_2 y_4, \\
z_4 &= x_1 y_4 - x_4 y_1 + x_2 y_3 - x_3 y_2,
\end{aligned}$$
shows that a product of two integers, each of which is representable as a sum of four squares, is likewise representable.
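Identity (7) is purely algebraic, so it can be spot-checked on integers. The sketch below is an illustration only; the function name `four_square_product` is hypothetical, and a finite check of this kind is a sanity test of the formulas for $z_1, \dots, z_4$, not a proof of the identity.

```python
def four_square_product(x, y):
    """Return (z1, z2, z3, z4) built from x = (x1..x4), y = (y1..y4) as in identity (7)."""
    x1, x2, x3, x4 = x
    y1, y2, y3, y4 = y
    z1 = x1 * y1 + x2 * y2 + x3 * y3 + x4 * y4
    z2 = x1 * y2 - x2 * y1 + x3 * y4 - x4 * y3
    z3 = x1 * y3 - x3 * y1 + x4 * y2 - x2 * y4
    z4 = x1 * y4 - x4 * y1 + x2 * y3 - x3 * y2
    return z1, z2, z3, z4


def norm(v):
    """Sum of the squares of the entries of v."""
    return sum(t * t for t in v)


if __name__ == "__main__":
    x, y = (1, 2, 3, 4), (5, 6, 7, 8)
    z = four_square_product(x, y)
    print(z, norm(z) == norm(x) * norm(y))
```

For example, $x = (1,2,3,4)$ and $y = (5,6,7,8)$ have sums of squares 30 and 174, and the resulting $z$ has sum of squares $30 \cdot 174 = 5220$.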
Every integer $n > 1$ is a product of primes, and $2 = 1^2 + 1^2 + 0^2 + 0^2$. It suffices therefore to show that every odd prime is representable as a sum of four squares.

It follows from Theorem 6 that if $p$ is an odd prime, then there exists an integer $m < p$, such that
$$mp = x_1^2 + x_2^2 + x_3^2 + x_4^2,$$
where $x_1, x_2, x_3, x_4$ are not all divisible by $p$.

Given any odd prime $p$, let $m_0$ denote the smallest positive integer such that
$$m_0 p = x_1^2 + x_2^2 + x_3^2 + x_4^2. \tag{8}$$
If $m_0 = 1$, there is nothing more to prove. Suppose that $m_0 > 1$. We shall first show that $m_0$ must be odd. For if $m_0$ is even, then $x_1, x_2, x_3, x_4$ are either all even, or all odd, or two even and two odd (for instance $x_1, x_2$ even, and $x_3, x_4$ odd). Since
$$\tfrac{1}{2} m_0 p = \left(\frac{x_1 + x_2}{2}\right)^2 + \left(\frac{x_1 - x_2}{2}\right)^2 + \left(\frac{x_3 + x_4}{2}\right)^2 + \left(\frac{x_3 - x_4}{2}\right)^2,$$
we see that $\tfrac{1}{2} m_0 p$ is a sum of four integral squares, not all of which are divisible by $p$. But this contradicts the minimality of $m_0$. Hence $m_0 \ge 3$, and we can write
$$x_i = b_i m_0 + y_i, \qquad (i = 1, 2, 3, 4), \tag{9}$$
where the integer $b_i$ can be so chosen that $|y_i| < \tfrac{1}{2} m_0$. For if the division of $x_i$ by the odd number $m_0$ gives $x_i = b_i' m_0 + y_i'$, where $y_i' > \tfrac{1}{2} m_0$, then we can write $x_i = (b_i' + 1) m_0 + (y_i' - m_0) = b_i m_0 + y_i$, where $-\tfrac{1}{2} m_0 < y_i < 0$.

Now $x_1, x_2, x_3, x_4$ are not all divisible by $m_0$, for that would imply that $m_0 \mid p$, which is impossible, since $1 < m_0 < p$. Therefore $y_1^2 + y_2^2 + y_3^2 + y_4^2 > 0$. Thus we have
$$0 < y_1^2 + y_2^2 + y_3^2 + y_4^2 < 4\left(\tfrac{1}{2} m_0\right)^2 = m_0^2.$$
But it follows from (8) and (9) that
$$y_1^2 + y_2^2 + y_3^2 + y_4^2 \equiv 0 \pmod{m_0}.$$
Thus we have integers $x_i, y_i$ $(i = 1, 2, 3, 4)$, such that
$$x_1^2 + x_2^2 + x_3^2 + x_4^2 = m_0 p, \quad\text{and}\quad m_0 < p,$$
$$y_1^2 + y_2^2 + y_3^2 + y_4^2 = m_0 m_1, \quad 0 < m_1 < m_0.$$
Identity (7) therefore gives us four integers $z_1, z_2, z_3, z_4$, such that
$$z_1^2 + z_2^2 + z_3^2 + z_4^2 = m_0^2 m_1 p. \tag{10}$$
But
$$z_1 = \sum_{i=1}^{4} x_i y_i = \sum_{i=1}^{4} x_i (x_i - b_i m_0) \equiv \sum_{i=1}^{4} x_i^2 \equiv 0 \pmod{m_0}.$$
Similarly $z_2 \equiv z_3 \equiv z_4 \equiv 0 \pmod{m_0}$. Hence $z_i = m_0 t_i$, where $t_i$ is an integer for $i = 1, 2, 3, 4$. On substituting these values in (10), we get
$$m_1 p = t_1^2 + t_2^2 + t_3^2 + t_4^2,$$
with $0 < m_1 < m_0$. But this contradicts the minimality of $m_0$. Hence $m_0 = 1$, and Theorem 8 is proved.

Chapter V
The law of quadratic reciprocity

§ 1. Quadratic reciprocity.
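Let $p$ and $q$ be two distinct odd primes. Then the Legendre symbols $\left(\frac{p}{q}\right)$ and $\left(\frac{q}{p}\right)$ are defined. Can $\left(\frac{p}{q}\right)$ be determined if $\left(\frac{q}{p}\right)$ is known? Gauss's law of quadratic reciprocity shows that that is indeed possible.

THEOREM 1 (GAUSS). If $p$ and $q$ are distinct odd primes, then
$$\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}}.$$

Since $\tfrac{1}{2}(p-1) \cdot \tfrac{1}{2}(q-1)$ is odd if and only if $p \equiv q \equiv 3 \pmod{4}$, Theorem 1 can be restated as follows:
$$\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right), \quad\text{if } p \equiv q \equiv 3 \pmod{4},$$
and
$$\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right), \quad\text{otherwise}.$$
We shall deduce the law of quadratic reciprocity from a reciprocity formula for certain exponential sums.

§ 2. Reciprocity for generalized Gaussian sums. Let $m$ and $n$ be two non-zero integers. Then a generalized Gaussian sum is defined as
$$g(m,n) = \sum_{k=1}^{|n|} e^{\pi i \frac{m}{n} k^2 + \pi i m k}. \tag{1}$$
When $m$ is even, this reduces to a Gaussian sum. Theorem 1 can be deduced from a formula connecting $g(m,n)$ and $g(-n,m)$, which we state as

THEOREM 2. If $m$ and $n$ are non-zero integers, then
$$\frac{1}{\sqrt{|n|}}\, g(m,n) = e^{\frac{\pi i}{4}(1 - |mn|)\,\mathrm{sgn}(mn)}\, \frac{1}{\sqrt{|m|}}\, g(-n,m), \tag{2}$$
where $\mathrm{sgn}\, r = r/|r|$ if $r \ne 0$, and $\mathrm{sgn}\, r = 0$ if $r = 0$.

PROOF. We shall use complex integration for the proof. Consider the integral
$$f(X) = f(X,\tau) = \int_C \Phi(u)\, du, \tag{3}$$
where
$$\Phi(u) = \Phi(u,X) = \Phi(u,X,\tau) = \frac{e^{\pi i \tau u^2 + 2\pi i X u}}{e^{2\pi i u} - 1}. \tag{4}$$
Here $u$ is a complex variable, $X$ an arbitrary complex number, $\tau$ a complex number with positive real part, and $C$ a line in the complex $u$-plane through the point $u = \tfrac{1}{2}$ which is inclined at an angle $\pi/4$ to the positive real axis.

We shall first show that the integral converges. For this we shall estimate the function $\Phi$ in any strip (of finite width), which is bounded by two lines parallel to $C$. If we set
$$u = c + r e^{\pi i/4},$$
where $c$ and $r$ are real, $c$ bounded and $r$ variable, and $\tau = \mathrm{Re}\,\tau + i\,\mathrm{Im}\,\tau$, then
$$u^2 = c^2 + 2 e^{\pi i/4} c r + i r^2,$$
and
$$\tau u^2 + 2Xu = i\tau r^2 + 2 e^{\pi i/4}(\tau c + X) r + (\tau c + 2X) c,$$
so that the real part of $\pi i(\tau u^2 + 2Xu)$ is $-\pi r^2\, \mathrm{Re}\,\tau$ together with terms of at most the first degree in $r$. Hence
$$\left| e^{\pi i \tau u^2 + 2\pi i X u} \right| \le A\, e^{-\pi r^2 \mathrm{Re}\,\tau + B|r|}, \tag{5}$$
where $A$ and $B$ are constants independent of $r$. Further
$$\left| e^{2\pi i u} - 1 \right| \ge \bigl| 1 - |e^{2\pi i u}| \bigr| = \left| 1 - e^{-\sqrt{2}\,\pi r} \right|.$$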
Now $r \to \pm\infty$ as $|u| \to \infty$ in the strip, so that if $|u|$ is large enough, then
$$\left| e^{2\pi i u} - 1 \right| \ge \tfrac{1}{2}. \tag{6}$$
Combining (5) and (6) we have
$$|\Phi(u)| \le A_1\, e^{-\pi r^2 \mathrm{Re}\,\tau + B|r|}, \tag{7}$$
in the strip chosen, if $|u|$ is large enough. Hence the integral $\int_C \Phi(u)\, du$ converges.

We shall next show that $g(m,n)$, for $n > 0$, is the value of the integral $\int_\gamma \Phi(u)\, du$ for a suitably chosen contour $\gamma$. Let $\gamma$ be the parallelogram formed by the line $C$, the line $C_n$ parallel to $C$ which cuts the real axis at the point $n + \tfrac{1}{2}$, $n > 0$, and the lines $L_1$ and $L_2$, in the upper and lower half-planes respectively, which are parallel to the real axis and at a positive distance from it (Fig. 1).

Fig. 1

Now $\Phi(u)$ is a meromorphic function of $u$, and if $\gamma$ is taken in the positive sense, then by Cauchy's theorem of residues, we have
$$\int_\gamma \Phi(u)\, du = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k}. \tag{8}$$
Because of (7), $\Phi(u) \to 0$ uniformly as $|u| \to \infty$ in the strip, while the two sides of $\gamma$ parallel to the real axis are of constant length. Hence the integrals along these sides tend to zero, when $L_1$ and $L_2$ go to infinity, away from the real axis. Thus we are left with
$$\int_{C_n} \Phi(u)\, du - \int_C \Phi(u)\, du = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k}. \tag{9}$$
From (4), however, we have
$$\Phi(u + n, X) = e^{\pi i \tau n^2 + 2\pi i X n}\, \Phi(u, X + \tau n),$$
so that
$$\int_{C_n} \Phi(u)\, du = e^{\pi i \tau n^2 + 2\pi i X n} f(X + \tau n),$$
where $f$ is defined as in (3). Hence (9) becomes
$$e^{\pi i \tau n^2 + 2\pi i X n} f(X + \tau n) - f(X) = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k}, \tag{10}$$
which is a relation between $f(X)$ and $f(X + \tau n)$. We shall now seek another such relation and compare the two. For this purpose, we start with the identity
$$f(X+1) - f(X) = \int_C \frac{e^{\pi i \tau u^2}\left\{ e^{2\pi i (X+1) u} - e^{2\pi i X u} \right\}}{e^{2\pi i u} - 1}\, du = \int_C e^{\pi i \tau u^2 + 2\pi i X u}\, du = e^{-\pi i \frac{X^2}{\tau}} \int_C e^{\pi i \tau \left(u + \frac{X}{\tau}\right)^2} du.$$
Now let $C'$ be the line parallel to $C$, obtained by the translation $u \to u + X/\tau$. Then
$$f(X+1) - f(X) = e^{-\pi i \frac{X^2}{\tau}} \int_{C'} e^{\pi i \tau u^2}\, du.$$
That this integral converges is clear from the estimate (5).
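By integrating again around a parallelogram, as before, and using the estimate (5) with $X = 0$, it can be seen that
$$\int_{C'} e^{\pi i \tau u^2}\, du = \int_{C_0} e^{\pi i \tau u^2}\, du,$$
where $C_0$ is the line parallel to $C'$ through the origin. On $C_0$ we have $u = r e^{\pi i/4}$, with $r$ real. Therefore
$$\int_{C_0} e^{\pi i \tau u^2}\, du = e^{\frac{\pi i}{4}} \int_{-\infty}^{\infty} e^{-\pi \tau r^2}\, dr = e^{\frac{\pi i}{4}} I_\tau,$$
say. Hence
$$f(X+1) - f(X) = I_\tau\, e^{\pi i \left(\frac{1}{4} - \frac{X^2}{\tau}\right)}.$$
By iteration of this formula $m$ times, we get
$$f(X+m) - f(X) = I_\tau \sum_{\nu=0}^{m-1} e^{\pi i \left(\frac{1}{4} - \frac{(X+\nu)^2}{\tau}\right)},$$
where $m$ is a positive integer. If we replace $X$ by $X + \tau n - m$, we get the second relation we are seeking, namely
$$f(X + \tau n) - f(X + \tau n - m) = I_\tau \sum_{\nu=0}^{m-1} e^{\pi i \left[\frac{1}{4} - \frac{(X + \tau n - m + \nu)^2}{\tau}\right]}. \tag{11}$$
From (11) and (10) we obtain
$$e^{\pi i \tau n^2 + 2\pi i X n} f(X + \tau n - m) - f(X) = \sum_{k=1}^{n} e^{\pi i \tau k^2 + 2\pi i X k} - I_\tau\, e^{\pi i \tau n^2 + 2\pi i X n} \sum_{\nu=1}^{m} e^{\pi i \left[\frac{1}{4} - \frac{(X + \tau n - \nu)^2}{\tau}\right]}. \tag{12}$$
If in this we put $X = m/2$, and $\tau = m/n$, $m > 0$, $n > 0$, we have
$$\sum_{k=1}^{n} e^{\pi i k^2 \frac{m}{n} + \pi i m k} = I_{m/n}\, e^{\frac{\pi i}{4}(1 - mn)} \sum_{\nu=1}^{m} e^{\pi i \nu n - \nu^2 \pi i \frac{n}{m}}. \tag{12'}$$
Here if we set $m = n = 1$, we get $I_1 = 1$, that is
$$\int_{-\infty}^{\infty} e^{-\pi t^2}\, dt = 1.$$
If we now make the substitution $t \to t\sqrt{\tau}$, where $\tau$ is real and positive, we get
$$I_\tau = \int_{-\infty}^{\infty} e^{-\pi \tau t^2}\, dt = \frac{1}{\sqrt{\tau}}. \tag{13}$$
If we use formula (13), with $\tau = m/n$, in formula (12'), then we get
$$\frac{1}{\sqrt{n}} \sum_{k=1}^{n} e^{\pi i k^2 \frac{m}{n} + \pi i m k} = \frac{1}{\sqrt{m}}\, e^{\frac{\pi i}{4}(1 - mn)} \sum_{\nu=1}^{m} e^{\pi i \nu n - \nu^2 \pi i \frac{n}{m}} = \frac{1}{\sqrt{m}}\, e^{\frac{\pi i}{4}(1 - mn)} \sum_{\nu=1}^{m} e^{-\pi i \nu n - \nu^2 \pi i \frac{n}{m}},$$
and this, by the definition of $g(m,n)$, leads to
$$\frac{1}{\sqrt{n}}\, g(m,n) = e^{\frac{\pi i}{4}(1 - mn)}\, \frac{1}{\sqrt{m}}\, g(-n,m), \tag{14}$$
which proves the theorem for $m > 0$, $n > 0$.

If $m > 0$, and $n < 0$, then $-n$, $m > 0$, and (14) gives
$$\frac{1}{\sqrt{m}}\, g(-n,m) = e^{\frac{\pi i}{4}(1 + mn)}\, \frac{1}{\sqrt{-n}}\, g(-m,-n),$$
or
$$\frac{1}{\sqrt{|n|}}\, g(-m,-n) = e^{-\frac{\pi i}{4}(1 - |mn|)}\, \frac{1}{\sqrt{m}}\, g(-n,m).$$
But by definition, $g(-m,-n) = g(m,n)$, hence
$$\frac{1}{\sqrt{|n|}}\, g(m,n) = e^{\frac{\pi i}{4}(1 - |mn|)\,\mathrm{sgn}(mn)}\, \frac{1}{\sqrt{|m|}}\, g(-n,m),$$
as claimed.

If $m < 0$, with $n$ of either sign, the reciprocity formula (2) remains valid, since $g(-m,-n) = g(m,n)$, $g(n,-m) = g(-n,m)$, and $(1 - |mn|)\,\mathrm{sgn}(mn)$ remains unchanged if $m$ and $n$ are replaced by $-m$, $-n$ respectively. This concludes the proof of Theorem 2.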
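The reciprocity formula (2) lends itself to a quick numerical sanity check. The following is a minimal sketch, not part of the original argument: the helper names `g`, `sgn` and `reciprocity_gap` are chosen for illustration, the sums are evaluated in floating-point complex arithmetic, and agreement is therefore only expected up to rounding error.

```python
import cmath
import math


def g(m, n):
    """Generalized Gaussian sum (1): sum over k = 1..|n| of exp(pi*i*(m/n)*k^2 + pi*i*m*k)."""
    return sum(
        cmath.exp(1j * math.pi * (m * k * k / n + m * k))
        for k in range(1, abs(n) + 1)
    )


def sgn(r):
    """Sign of a real number: +1, -1 or 0."""
    return (r > 0) - (r < 0)


def reciprocity_gap(m, n):
    """Absolute difference between the two sides of formula (2)."""
    lhs = g(m, n) / math.sqrt(abs(n))
    phase = cmath.exp(1j * math.pi / 4 * (1 - abs(m * n)) * sgn(m * n))
    rhs = phase * g(-n, m) / math.sqrt(abs(m))
    return abs(lhs - rhs)


if __name__ == "__main__":
    for m, n in [(1, 2), (2, 3), (3, -5), (-4, 7)]:
        print(m, n, reciprocity_gap(m, n))
```

For small $m$ and $n$ of either sign the printed gaps should be of the order of machine precision; a check like this supports the formula but of course does not replace the proof.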
It may be noted that this proof does not assume the result
$$\int_{-\infty}^{\infty} e^{-\pi t^2}\, dt = 1,$$
but obtains it as a byproduct.

§ 3. Proof of quadratic reciprocity. The law of quadratic reciprocity, stated in Theorem 1, can be elegantly deduced from the reciprocity formula for generalized Gaussian sums proved in Theorem 2.

Since $k^2 \equiv k \pmod{2}$, we can replace $k$ by $k^2$ in the definition of $g(m,n)$ given in (1), and write
$$g(m,n) = \sum_{k=1}^{|n|} e^{\pi i k^2 \frac{m}{n}(n+1)}.$$
Now let $n$ be an odd prime, and $m$ some integer prime to $n$. We then have
$$g(m,n) = 1 + \sum_{k=1}^{n-1} e^{\pi i k^2 \frac{m}{n}(n+1)}.$$
If $k^2 \equiv \rho \pmod{n}$, then it is easy to see that
$$e^{\pi i k^2 \frac{m}{n}(n+1)} = e^{\pi i \rho \frac{m}{n}(n+1)}.$$
But if $k^2 \equiv \rho \pmod{n}$, and $1 \le k \le n-1$, then $\rho$ is a quadratic residue modulo $n$, and $(n-k)^2 \equiv k^2 \equiv \rho \pmod{n}$. Thus if $k$ runs through the integers $1, 2, \dots, n-1$, then $k^2$ (taken modulo $n$) runs twice through the set of quadratic residues modulo $n$. Hence
$$g(m,n) = 1 + 2\sum_{\rho} e^{\pi i \rho \frac{m}{n}(n+1)}, \tag{15}$$
where $\rho$ runs through the set of quadratic residues modulo an odd prime $n$.

Now consider the sum
$$\sum_{\nu} e^{\pi i \nu \frac{m}{n}(n+1)},$$
where $\nu$ runs through the quadratic non-residues modulo $n$. We obviously have
$$1 + \sum_{\rho} e^{\pi i \rho \frac{m}{n}(n+1)} + \sum_{\nu} e^{\pi i \nu \frac{m}{n}(n+1)} = \sum_{k=0}^{n-1} e^{\pi i k \frac{m}{n}(n+1)}.$$
But $n+1$ is even, and therefore $e^{\pi i k \frac{m}{n}(n+1)}$ is the $k$th power of an $n$th root of unity, say $\eta$, and $\eta \ne 1$ since $n \nmid m$, so that the sum on the right-hand side vanishes. Thus
$$g(m,n) = \sum_{\rho} e^{\pi i \rho \frac{m}{n}(n+1)} - \sum_{\nu} e^{\pi i \nu \frac{m}{n}(n+1)}. \tag{17}$$
We now consider the two possibilities $\left(\frac{m}{n}\right) = +1$, and $\left(\frac{m}{n}\right) = -1$.

(a) If $m$ is a quadratic residue modulo $n$, and $\rho$ runs through all quadratic residues modulo $n$, then by the Corollary of Theorem 3 of Chapter IV, $\rho m$ likewise runs through all the quadratic residues. And if $\nu$ runs through all the non-residues, so does $\nu m$. Hence
$$g(m,n) = \sum_{\rho} e^{\pi i \rho \frac{n+1}{n}} - \sum_{\nu} e^{\pi i \nu \frac{n+1}{n}} \quad (\text{by (17)})$$
$$= g(1,n) = \left(\frac{m}{n}\right) g(1,n).$$

(b) If $m$ is a quadratic non-residue modulo $n$, then by reasoning again as in case (a), we have
$$g(m,n) = \sum_{\nu} e^{\pi i \nu \frac{n+1}{n}} - \sum_{\rho} e^{\pi i \rho \frac{n+1}{n}} = -g(1,n) = \left(\frac{m}{n}\right) g(1,n).$$

We have therefore shown that if $n$ is an odd prime, and $(m,n) = 1$, then
$$g(m,n) = \left(\frac{m}{n}\right) g(1,n). \tag{18}$$
On the other hand, from Theorem 2 it follows that
$$\frac{1}{\sqrt{n}}\, g(1,n) = e^{\frac{\pi i}{4}(1-n)}\, g(-n,1),$$
and since, by definition, $g(-n,1) = 1$, we have
$$g(1,n) = \sqrt{n}\; e^{\frac{\pi i}{4}(1-n)}. \tag{19}$$
From (18) and (19) we get the important formula
$$\left(\frac{m}{n}\right) = \frac{1}{\sqrt{n}}\, e^{\frac{\pi i}{4}(n-1)}\, g(m,n), \tag{20}$$
where $n$ is an odd prime, and $m$ is an integer such that $(m,n) = 1$. If $m = -1$, this gives
$$\left(\frac{-1}{n}\right) = \frac{1}{\sqrt{n}}\, e^{\frac{\pi i}{4}(n-1)}\, g(-1,n).$$
But by (2), we have
$$g(-1,n) = e^{\frac{\pi i}{4}(n-1)} \sqrt{n}\; g(-n,-1) = e^{\frac{\pi i}{4}(n-1)} \sqrt{n},$$
since $g(-n,-1) = 1$. Hence
$$\left(\frac{-1}{n}\right) = e^{\frac{\pi i}{2}(n-1)} = (-1)^{\frac{n-1}{2}}. \tag{21}$$
Here $n$ is an odd prime. Let us now assume that $m$ is also an odd prime. Then it follows from (20) and (2) that
$$\left(\frac{m}{n}\right) = e^{\frac{\pi i}{4}(n-1)} \cdot e^{\frac{\pi i}{4}(1-mn)} \cdot \frac{1}{\sqrt{m}}\, g(-n,m).$$
If we use (20) once again, we get
$$\left(\frac{m}{n}\right) = e^{\frac{\pi i}{4}(n-1)} \cdot e^{\frac{\pi i}{4}(1-mn)} \cdot e^{-\frac{\pi i}{4}(m-1)} \left(\frac{-n}{m}\right).$$
But
$$\left(\frac{-n}{m}\right) = \left(\frac{-1}{m}\right)\left(\frac{n}{m}\right) = e^{\frac{\pi i}{2}(m-1)} \left(\frac{n}{m}\right),$$
because of (21). Hence
$$\left(\frac{m}{n}\right) = e^{-\frac{\pi i}{4}(n-1)(m-1)} \left(\frac{n}{m}\right) = (-1)^{\frac{n-1}{2}\cdot\frac{m-1}{2}} \left(\frac{n}{m}\right).$$
Since $\left(\frac{n}{m}\right)^2 = 1$, it follows that
$$\left(\frac{m}{n}\right)\left(\frac{n}{m}\right) = (-1)^{\frac{m-1}{2}\cdot\frac{n-1}{2}},$$
which proves Theorem 1.

§ 4. Some applications. Theorem 1 was concerned with the value of $\left(\frac{p}{q}\right)$, when $p$ and $q$ are distinct odd primes. In order to determine whether or not a given even integer is a quadratic residue modulo an odd prime, we have to evaluate the Legendre symbol $\left(\frac{2}{p}\right)$. This can be done by an application of (2) and (20).

THEOREM 3. If $p$ is an odd prime, then
$$\left(\frac{2}{p}\right) = (-1)^{\frac{p^2-1}{8}}. \tag{22}$$
In other words,
$$\left(\frac{2}{p}\right) = \begin{cases} +1, & \text{if } p \equiv \pm 1 \pmod{8}, \\ -1, & \text{if } p \equiv \pm 3 \pmod{8}. \end{cases} \tag{23}$$

PROOF. From (20) we have
$$\left(\frac{2}{p}\right) = \frac{1}{\sqrt{p}}\, e^{\frac{\pi i}{4}(p-1)}\, g(2,p),$$
and from (2) we have
$$\frac{1}{\sqrt{p}}\, g(2,p) = e^{\frac{\pi i}{4}(1-2p)}\, \frac{1}{\sqrt{2}}\, g(-p,2),$$
while from the definition of $g(m,n)$ we have
$$g(-p,2) = 1 + e^{\frac{\pi i p}{2}}.$$
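Thus
\[ \left(\frac{2}{p}\right) = \frac{e^{-\frac{\pi i p}{4}}}{\sqrt 2}\left(1 + e^{\frac{\pi i p}{2}}\right) = \frac{1}{\sqrt 2}\left(e^{-\frac{\pi i p}{4}} + e^{\frac{\pi i p}{4}}\right) = (-1)^{\frac{p^2-1}{8}}. \]

AN EXAMPLE. Let us use Theorems 1 and 3 to evaluate $\left(\frac{12703}{16361}\right)$. Here both 12703 and 16361 are primes. By Theorem 1 we have
\[ \left(\frac{12703}{16361}\right) = \left(\frac{16361}{12703}\right), \]
and since $16361 \equiv 3658 \pmod{12703}$, we have
\[ \left(\frac{16361}{12703}\right) = \left(\frac{3658}{12703}\right). \]
Since $\left(\frac{mn}{p}\right) = \left(\frac{m}{p}\right)\left(\frac{n}{p}\right)$, and $3658 = 2 \cdot 31 \cdot 59$, we have
\[
\begin{aligned}
\left(\frac{3658}{12703}\right) &= \left(\frac{2}{12703}\right)\left(\frac{31}{12703}\right)\left(\frac{59}{12703}\right) \\
&= \left(\frac{31}{12703}\right)\left(\frac{59}{12703}\right) && \text{(by Theorem 3)} \\
&= \left\{-\left(\frac{12703}{31}\right)\right\}\left\{-\left(\frac{12703}{59}\right)\right\} && \text{(by Theorem 1)} \\
&= \left(\frac{24}{31}\right)\left(\frac{18}{59}\right) = \left(\frac{4}{31}\right)\left(\frac{6}{31}\right)\left(\frac{9}{59}\right)\left(\frac{2}{59}\right).
\end{aligned}
\]
Since $\left(\frac{4}{31}\right) = \left(\frac{2}{31}\right)^2 = 1$, and similarly $\left(\frac{9}{59}\right) = 1$, we get finally
\[ \left(\frac{12703}{16361}\right) = \left(\frac{2}{31}\right)\left(\frac{3}{31}\right)\left(\frac{2}{59}\right) = 1 \cdot \left(\frac{3}{31}\right) \cdot (-1) = \left(\frac{31}{3}\right) = \left(\frac{1}{3}\right) = 1. \]

A REMARK. As seen in Chapter IV, if $p$ is a given odd prime, then $\left(\frac{m}{p}\right) = \left(\frac{m'}{p}\right)$ for all integers $m' \equiv m \pmod p$. On the other hand, from Theorem 3 we know that $\left(\frac{2}{p}\right)$ has the same value for all odd primes $p$ which lie in the arithmetical progressions $8m \pm 1$, or in $8m \pm 3$. Theorem 1 can be used to show, more generally, that if $q$ is a fixed odd prime, then
\[ \left(\frac{q}{p}\right) = \left(\frac{q}{p'}\right), \tag{24} \]
where $p'$ is a prime such that $p' \equiv p \pmod{4q}$. For if $p' \equiv p \pmod{4q}$, then $p' \equiv p \pmod 4$, so that $\tfrac12(p'-1) \equiv \tfrac12(p-1) \pmod 2$. By Theorem 1 we have
\[ \left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2}\cdot\frac{q-1}{2}} = (-1)^{\frac{p'-1}{2}\cdot\frac{q-1}{2}} = \left(\frac{q}{p'}\right)\left(\frac{p'}{q}\right). \]
Further, since $p' \equiv p \pmod{4q}$, we have $p' \equiv p \pmod q$, hence $\left(\frac{p'}{q}\right) = \left(\frac{p}{q}\right)$. Thus $\left(\frac{q}{p'}\right) = \left(\frac{q}{p}\right)$, as claimed in (24).

Chapter VI
Arithmetical functions and lattice points

§ 1. Generalities. We recall that an arithmetical function is a complex-valued function defined on the set of positive integers. Many of the arithmetical functions we shall consider are integer-valued. An arithmetical function $f$ is multiplicative, if (i) $f$ is not identically zero, and (ii) $f(mn) = f(m) \cdot f(n)$, if $(m,n) = 1$. Condition (i) may be given an alternative form, namely $f(1) = 1$. Euler's function $\varphi$, introduced in Chapter II, is an example. We have proved that it is multiplicative, and that $\varphi(p^a) = p^a(1 - 1/p)$, for every prime $p$, and positive integer $a$.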
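The two conditions defining a multiplicative function can be checked numerically on an initial range. The following is a minimal sketch in Python, not part of the original text: the helper names `phi` and `is_multiplicative` are our own, and a finite check of this kind illustrates the definition but proves nothing.

```python
from math import gcd


def phi(n):
    """Euler's function, computed directly as the number of k with
    1 <= k <= n and (k, n) = 1 (hypothetical helper, brute force)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


def is_multiplicative(f, limit):
    """Check f(1) = 1 and f(mn) = f(m) f(n) for all coprime m, n
    with mn <= limit. This is a finite test only."""
    if f(1) != 1:
        return False
    for m in range(1, limit + 1):
        for n in range(1, limit // m + 1):
            if gcd(m, n) == 1 and f(m * n) != f(m) * f(n):
                return False
    return True


# phi passes the finite test, and phi(p^a) = p^a (1 - 1/p) = p^a - p^(a-1).
print(is_multiplicative(phi, 200))
print(all(phi(p**a) == p**a - p**(a - 1)
          for p in (2, 3, 5, 7) for a in (1, 2, 3)))
```

The brute-force count is chosen only because it follows the definition of $\varphi$ literally; it is not meant as an efficient method.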
Many arithmetical functions behave irregularly, and it is often more interesting to study the summatory function of an arithmetical function $f$, namely
\[ F(N) = \sum_{n=1}^{N} f(n), \]
than $f$ itself.

Some of the arithmetical functions in which we are interested have a simple geometrical interpretation. They count the number of lattice points in certain regions. A lattice point is a point in $n$-dimensional Euclidean space, $n \ge 1$, with integer co-ordinates.

§ 2. The lattice point function r(n). The arithmetical function $r(n)$ gives the number of representations of an integer $n \ge 1$ as a sum of two integral squares; in other words, the number of solutions of the equation $x^2 + y^2 = n$, in integers $x, y$. Solutions which differ only in sign, or order, are counted as distinct. Thus $r(1) = 4$, since $1 = (\pm 1)^2 + 0^2 = 0^2 + (\pm 1)^2$. It follows that $r(n)$ is not multiplicative.

We have seen in Chapter IV, Theorem 7, that $r(n) = 0$, if $n$ is a prime of the form $4k+3$. On the other hand, we have seen in Chapter III, Theorem 6, that there are infinitely many such primes. Hence $r(n) = 0$ for infinitely many values of $n$, and since $r(n) \ge 0$, it follows that
\[ \liminf_{n \to \infty} r(n) = 0. \]
One can seek to estimate the order of magnitude of $r(n)$, and prove that $r(n) = O(n^{\varepsilon})$, for every $\varepsilon > 0$. That is, $|r(n)|\, n^{-\varepsilon} < K$, where $K$ is a constant independent of $n$. It is more interesting, however, to study the (modified) summatory function
\[ R(N) = \sum_{n=0}^{N} r(n), \qquad r(0) = 1. \]
Geometrically speaking, $R(N)$ is the number of lattice points inside and on the circumference of the circle $x^2 + y^2 = N$. It is easy to see that the magnitude of $R(N)$ is approximately equal to the area of the circle.

THEOREM 1 (GAUSS). $R(N) = \pi N + O(\sqrt N)$.

PROOF. The lattice points in the plane are the vertices of squares each of which is of unit area. To each lattice point inside or on the circle $x^2 + y^2 = N$, we can associate a square, of which it is, for instance, the "south-west" corner.
Then $R(N)$ is equal to the sum of the areas of these squares. Some squares are not entirely inside the circle; on the other hand, some parts of the circle are not covered by the squares (Fig. 2).

[Fig. 2]

However, since the diagonal of each square is $\sqrt 2$, all the squares are contained inside the circle $x^2 + y^2 = (\sqrt N + \sqrt 2)^2$, so that
\[ R(N) < \pi(\sqrt N + \sqrt 2)^2. \]
Similarly the squares completely cover the smaller circle of radius $\sqrt N - \sqrt 2$, so that
\[ R(N) > \pi(\sqrt N - \sqrt 2)^2, \qquad N \ge 2. \]
We thus have
\[ \pi(N - 2\sqrt{2N} + 2) < R(N) < \pi(N + 2\sqrt{2N} + 2), \]
and hence $R(N) = \pi N + O(\sqrt N)$.

§ 3. The divisor function d(n). The arithmetical function $d(n)$ gives the number of positive divisors of the positive integer $n$.

THEOREM 2. The divisor function $d(n)$ is multiplicative.

PROOF. We have obviously $d(1) = 1$. And if $(m,n) = 1$, then every divisor of $mn$ can be uniquely represented as the product of a divisor of $m$, and of a divisor of $n$. Conversely, every such product is a divisor of $mn$. Hence $d(mn) = d(m) \cdot d(n)$.

THEOREM 3. If $n > 1$, with the standard form $n = \prod_{i=1}^{r} p_i^{a_i}$, then
\[ d(n) = \prod_{i=1}^{r} (a_i + 1). \]

PROOF. Since $d(n)$ is multiplicative, we have
\[ d(n) = \prod_{i=1}^{r} d(p_i^{a_i}). \]
The only positive divisors of $p_i^{a_i}$ are the $(a_i + 1)$ integers $1, p_i, p_i^2, \ldots, p_i^{a_i}$. Hence
\[ d(n) = \prod_{i=1}^{r} (a_i + 1). \]

The divisor function can be interpreted geometrically. The number of positive divisors of $n$ is equal to the number of solutions of $xy = n$ in positive integers $x, y$. Therefore $d(n)$ is the number of lattice points $(x,y)$ in the "upper right quadrant" of the $(x,y)$-plane, which lie on the hyperbola $xy = n$.

THE ORDER OF d(n). It follows from Theorem 3 that $d(n)$ can be made as large as we please. But $d(n) = 2$, if $n$ is a prime. Therefore
\[ \liminf_{n \to \infty} d(n) = 2. \]

THEOREM 4. For every $\Delta > 0$, there exists a sequence of integers $n_i$ for which
\[ \frac{d(n_i)}{(\log n_i)^{\Delta}} \to \infty, \quad \text{as } i \to \infty. \tag{1} \]

PROOF.
If $\Delta > 0$, let $k$ be the integer defined by $k \le \Delta < k+1$. Let $p_{k+1}$ be the $(k+1)$th prime, and let $n = (2 \cdot 3 \cdot 5 \cdots p_{k+1})^m$, where $m$ is a positive integer. By Theorem 3, we have
\[ d(n) = (m+1)^{k+1} > m^{k+1}. \]
But
\[ m^{k+1} = \left\{ \frac{\log n}{\log(2 \cdot 3 \cdot 5 \cdots p_{k+1})} \right\}^{k+1} > c\,(\log n)^{k+1}, \tag{2} \]
where $c$ is a constant independent of $n$. If we now take $m = 1, 2, 3, \ldots$, we get an infinite sequence of positive integers $n$, for which $d(n) > c(\log n)^{k+1}$, and if we set $k + 1 = \Delta + \delta$ $(\delta > 0)$, then for that sequence, we have
\[ \frac{d(n)}{(\log n)^{\Delta}} > c\,(\log n)^{\delta} \to \infty, \quad \text{as } n \to \infty, \]
so that the theorem is proved.

On the other hand, we have

THEOREM 5. $d(n) = o(n^{\delta})$, for every $\delta > 0$. In other words, $d(n)/n^{\delta} \to 0$, as $n \to \infty$.

For the proof of this theorem we require

THEOREM 6. If $f$ is a multiplicative, arithmetical function, and $f(p^m) \to 0$, as $p^m \to \infty$, where $p$ is a prime, and $m$ a positive integer (that is, $f(n) \to 0$, as $n$ runs through the set of prime powers), then $f(n) \to 0$, as $n \to \infty$.

PROOF. Since $f(p^m) \to 0$, as $p^m \to \infty$, $f$ satisfies the following conditions:
(i) there exists a positive constant $A$, such that $|f(p^m)| < A$ for all $p$ and $m$;
(ii) there exists a constant $B$, such that if $p^m > B$, then $|f(p^m)| < 1$; and
(iii) given $\varepsilon > 0$, there exists an $N(\varepsilon)$, such that if $p^m > N(\varepsilon)$, then $|f(p^m)| < \varepsilon$.

Clearly $A$ and $B$ are independent of $\varepsilon$, $p$ and $m$, and $N(\varepsilon)$ depends only on $\varepsilon$. Let $n > 1$, with the standard form
\[ n = p_1^{a_1} p_2^{a_2} \cdots p_r^{a_r}. \tag{3} \]
Since $f$ is multiplicative, we have
\[ f(n) = f(p_1^{a_1}) \cdot f(p_2^{a_2}) \cdots f(p_r^{a_r}). \tag{4} \]
Consider all prime powers $p^a$, and let $C$ be the number of those prime powers which do not exceed $B$. Then $C$ is independent of $n$ and $\varepsilon$. For the corresponding factors $f(p_i^{a_i})$ in (4) we can apply inequality (i); their product, in absolute value, is therefore less than $A^C$. The remaining factors of $f(n)$ are, in absolute value, all less than 1, by (ii).

Again there are only finitely many integers of the form $p^a$ which do not exceed $N(\varepsilon)$. Therefore there are only finitely many integers whose standard form contains only factors of the form $p^a$ with $p^a \le N(\varepsilon)$.
Let $P(\varepsilon)$ be the upper bound of all such integers. If we now choose $n > P(\varepsilon)$, then the standard form of $n$ must contain at least one factor $p^a > N(\varepsilon)$, and we can therefore apply (iii), namely $|f(p^a)| < \varepsilon$. Hence, if $n > P(\varepsilon)$, then we have $|f(n)| < A^C \cdot \varepsilon$, so that $f(n) \to 0$ as $n \to \infty$.

PROOF OF THEOREM 5. The function $f(n) = d(n)/n^{\delta}$ is multiplicative, and
\[ f(p^m) = \frac{m+1}{p^{m\delta}} \le \frac{2m}{p^{m\delta}} = \frac{2}{\log p} \cdot \frac{\log p^m}{p^{m\delta}}. \]
Since $\log p \ge \log 2$, it follows that for every $\delta > 0$, we have
\[ f(p^m) \le \frac{2}{\log 2} \cdot \frac{\log p^m}{(p^m)^{\delta}} \to 0, \quad \text{as } p^m \to \infty. \]
Hence, by Theorem 6, we have
\[ \frac{d(n)}{n^{\delta}} \to 0, \quad \text{as } n \to \infty, \]
for every $\delta > 0$, as claimed.

It can be shown that given $\varepsilon > 0$, there exists a number $N(\varepsilon)$, such that
\[ d(n) < 2^{(1+\varepsilon)\frac{\log n}{\log\log n}}, \quad \text{for } n > N(\varepsilon), \]
and that, for infinitely many integers $n$, we have
\[ d(n) > 2^{(1-\varepsilon)\frac{\log n}{\log\log n}}. \]

THE AVERAGE ORDER OF d(n). Let us consider the summatory function
\[ D(N) = \sum_{n=1}^{N} d(n). \]
Since $d(n) = \sum_{t \mid n} 1 = \sum_{xy = n} 1$, we have
\[ D(N) = \sum_{n=1}^{N} d(n) = \sum_{1 \le n \le N}\ \sum_{xy = n} 1, \]
or
\[ D(N) = \sum_{1 \le xy \le N} 1. \]
Clearly $D(N)$ is the number of lattice points in the "first" quadrant (that is, upper right), which lie on or below the hyperbola $xy = N$, the points on the axes being excluded since $xy = 0$ for them. To estimate the order of magnitude of $D(N)$, we need

THEOREM 7. If $g$ is a monotone decreasing function of the real variable $t$, defined for $t \ge 1$, with $g(t) > 0$ for $t \ge 1$, then
\[ \sum_{1 \le n \le X} g(n) = \int_1^X g(t)\,dt + A + O(g(X)), \]
where $n$ is a positive integer, $X \ge 1$, and $A$ is a constant depending only on $g$.

PROOF. Consider the closed interval $[n, n+1]$. Since $g$ is decreasing, we have
\[ g(n+1) \le \int_n^{n+1} g(t)\,dt \le g(n). \]
Therefore
\[ 0 \le A_n = g(n) - \int_n^{n+1} g(t)\,dt \le g(n) - g(n+1). \]
If $M$ and $N$ are arbitrary positive integers, with $M < N$, then
\[ \sum_{n=M}^{N} A_n \le \sum_{n=M}^{N} \{g(n) - g(n+1)\} = g(M) - g(N+1), \]
and since $g(t) > 0$ for $t \ge 1$, it follows that
\[ \sum_{n=M}^{N} A_n \le g(M), \quad \text{for all } N > M. \tag{5} \]
In particular, $\sum_{n=1}^{\infty} A_n \le g(1)$, since $g$ is defined at 1, so that the series $\sum_{n=1}^{\infty} A_n$ converges.
Set
\[ A = \sum_{n=1}^{\infty} A_n. \]
Then, by (5), we have
\[ A = \sum_{n=1}^{N} A_n + \sum_{n=N+1}^{\infty} A_n = \sum_{n=1}^{N} A_n + O(g(N+1)), \]
or
\[ A = \sum_{n=1}^{N} \left\{ g(n) - \int_n^{n+1} g(t)\,dt \right\} + O(g(N+1)), \]
from which it follows that
\[ \sum_{n=1}^{N} g(n) = \int_1^{N+1} g(t)\,dt + A + O(g(N+1)). \]
If we set $N = [X]$, then this takes the form
\[ \sum_{1 \le n \le X} g(n) = \int_1^{[X]+1} g(t)\,dt + A + O(g([X]+1)), \]
where $n$ runs through integer values only. But $g$ is positive and decreasing, so that
\[ \int_X^{[X]+1} g(t)\,dt \le g(X), \qquad 0 < g([X]+1) \le g(X), \]
hence
\[ \sum_{1 \le n \le X} g(n) = \int_1^X g(t)\,dt + A + O(g(X)), \]
as claimed.

COROLLARY 1. There exists a constant $\gamma$ (Euler's constant), such that
\[ \sum_{1 \le n \le X} \frac{1}{n} = \log X + \gamma + O\!\left(\frac{1}{X}\right). \]

COROLLARY 2. Since
\[ \int_2^X \frac{dt}{t \log t} = \log\log X - \log\log 2, \]
we have
\[ \sum_{2 \le n \le X} \frac{1}{n \log n} = \log\log X + B + O\!\left(\frac{1}{X \log X}\right), \]
where $B$ is a constant.

We are now in a position to prove

THEOREM 8. $D(N) = N \log N + O(N)$.

PROOF. As already mentioned, $D(N)$ is the number of lattice points in the upper right quadrant of the $(x,y)$ plane, which lie on or below the hyperbola $xy = N$, but not on the axes. Clearly these points lie to the left of the line $x = N$, and below the line $y = N$ (Fig. 3).

[Fig. 3]

We count them by considering the lattice points on each vertical line with an integral abscissa. The number of lattice points on an ordinate of length $N/x$ is $[N/x]$, so that
\[ D(N) = \sum_{x=1}^{N} \left[\frac{N}{x}\right]. \]
If we set $[N/x] = N/x - \theta_x$, $0 \le \theta_x < 1$, then
\[ D(N) = N \sum_{x=1}^{N} \frac{1}{x} - \sum_{x=1}^{N} \theta_x = N \sum_{x=1}^{N} \frac{1}{x} + O(N), \]
since $\sum_{x=1}^{N} \theta_x < N$. From Corollary 1 of Theorem 7 it follows that
\[ D(N) = N \log N + O(N), \]
as claimed.

Theorem 8 can be considerably sharpened. As a first step we prove

THEOREM 9 (DIRICHLET). $D(N) = N \log N + (2\gamma - 1)N + O(\sqrt N)$, where $\gamma$ is Euler's constant.

PROOF. The hyperbola $xy = N$ is symmetric relative to the line $x = y$. Therefore the regions ABGEO and CDOFG (in Fig. 4) contain the same number of lattice points.

[Fig. 4]

The total number of lattice points, in the "first" quadrant, which are on or below the hyperbola (but not on the axes) is therefore equal to twice the number of lattice points in ABGEO, minus the number of lattice points in the square OFGE. Thus
\[ D(N) = 2 \sum_{\substack{1 \le x \le \sqrt N \\ 1 \le xy \le N}} 1 - [\sqrt N]^2 = 2 \sum_{1 \le x \le \sqrt N}\ \sum_{1 \le y \le N/x} 1 - [\sqrt N]^2 = 2 \sum_{1 \le x \le \sqrt N} \left[\frac{N}{x}\right] - [\sqrt N]^2. \]
If we set $[N/x] = N/x - \theta_x$, $0 \le \theta_x < 1$, and $[\sqrt N] = \sqrt N - \theta$, $0 \le \theta < 1$, then we get
\[ D(N) = 2N \sum_{1 \le x \le \sqrt N} \frac{1}{x} - 2 \sum_{1 \le x \le \sqrt N} \theta_x - N + 2\theta\sqrt N - \theta^2. \]
But
\[ \sum_{1 \le x \le \sqrt N} \theta_x = O(\sqrt N), \qquad 2\theta\sqrt N = O(\sqrt N), \qquad \theta^2 = O(1), \]
hence
\[ D(N) = 2N \sum_{1 \le x \le \sqrt N} \frac{1}{x} - N + O(\sqrt N). \]
An application of Corollary 1 of Theorem 7 now gives the result claimed.

The error term $O(\sqrt N)$ was improved by G. VORONOI to $O(N^{1/3} \log N)$. It is conjectured that the correct error term is $O(N^{1/4 + \varepsilon})$, with an arbitrary $\varepsilon > 0$. On the other hand, it is known that the error term is not $O(N^{1/4})$.

§ 4. The function σ(n). Associated with the function $d(n)$ is the arithmetical function $\sigma(n)$ which gives the sum of the positive divisors of $n$. More generally, one can define
\[ \sigma_k(n) = \sum_{d \mid n} d^k, \qquad k = 0, 1, 2, \ldots, \]
so that $\sigma_0(n) = d(n)$ and $\sigma(n) = \sigma_1(n)$.

THEOREM 10. The arithmetical function $\sigma_k(n)$ is multiplicative.

PROOF. The same considerations as in Theorem 2 imply that if $(m,n) = 1$, then
\[ \sum_{d \mid m} d \cdot \sum_{d' \mid n} d' = \sum_{d^* \mid mn} d^*, \]
which shows that $\sigma(n)$ is multiplicative, and similarly also $\sigma_k(n)$.

THEOREM 11. If $n > 1$, with the standard form $n = \prod_{i=1}^{r} p_i^{a_i}$, then
\[ \sigma_k(n) = \prod_{i=1}^{r} \frac{p_i^{k(a_i+1)} - 1}{p_i^{k} - 1}, \qquad k \ge 1. \tag{6} \]

PROOF. Since $\sigma_k$ is multiplicative, we have
\[ \sigma_k(n) = \prod_{i=1}^{r} \sigma_k(p_i^{a_i}) = \prod_{i=1}^{r} \left(1 + p_i^{k} + p_i^{2k} + \cdots + p_i^{a_i k}\right). \]
In particular, if $k = 1$, then
\[ \sigma(n) = \prod_{i=1}^{r} \frac{p_i^{a_i+1} - 1}{p_i - 1}. \tag{7} \]

An old problem concerning the function $\sigma(n)$ is that of perfect numbers. A positive integer $N$ is called perfect if $\sigma(N) = 2N$. That is, $N$ equals the sum of all its positive divisors which are smaller than $N$. For example, 6 and 28 are perfect numbers.

A Mersenne number is an integer of the form $2^n - 1$; if it is a prime, it is called a Mersenne prime. Mersenne primes and perfect numbers are connected by the following

THEOREM 12. If $2^{n+1} - 1$ is a prime, then $2^n(2^{n+1} - 1)$ is a perfect number.

PROOF.
Let $N = 2^n(2^{n+1} - 1) = 2^n p$, where $p$ is a prime. Then, by (7),
\[ \sigma(N) = (2^{n+1} - 1)(p + 1) = (2^{n+1} - 1)\,2^{n+1} = 2N. \]
Hence $N$ is a perfect number.

Euler observed that this result has a partial converse, namely

THEOREM 13 (EULER). Every even perfect number is of the form $2^n p$, where $p = 2^{n+1} - 1$ is a Mersenne prime.

PROOF. Let $N = 2^n N'$ be perfect, $n \ge 1$, and $N'$ odd. Then $\sigma(N) = 2N = 2^{n+1} N'$. Since $\sigma$ is multiplicative, we have $\sigma(N) = \sigma(2^n)\,\sigma(N')$, and since $\sigma(2^n) = 2^{n+1} - 1$ by (7), we have
\[ (2^{n+1} - 1)\,\sigma(N') = 2^{n+1} N'. \]
Hence $(2^{n+1} - 1) \mid N'$. If we set $N' = (2^{n+1} - 1)N''$, then $\sigma(N') = 2^{n+1} N''$, and $N'' < N'$. But
\[ N' + N'' = 2^{n+1} N'' = \sigma(N'). \]
Now both $N'$ and $N''$ divide $N'$, and their sum is $\sigma(N')$. Hence $N'$ has no other divisors, and therefore is a prime. But $N' = (2^{n+1} - 1)N''$. Therefore $N'' = 1$, and $N' = 2^{n+1} - 1$, which proves Theorem 13.

It is not known whether there exist infinitely many even perfect numbers (that is, infinitely many primes of the form $2^n - 1$). Nor is it known whether there exist odd perfect numbers.

Mersenne primes are primes of the form $2^n - 1$. It is simple to see that if $n > 1$, and $a$ is a positive integer, and $a^n - 1$ is a prime, then $a = 2$ and $n$ is a prime. For if $a > 2$, then $(a-1) \mid (a^n - 1)$; and if $a = 2$, and $n = kl$, $1 < k \le l$, then $(2^k - 1) \mid (2^n - 1)$.

§ 5. The Möbius function μ(n). The Möbius function $\mu$ is an arithmetical function defined by the following three properties:
(i) $\mu(1) = 1$;
(ii) $\mu(n) = (-1)^k$, if $n$ is a product of $k$ different primes;
(iii) $\mu(n) = 0$, otherwise; that is, if $n$ is divisible by a square different from 1.

An immediate consequence of the definition is

THEOREM 14. The Möbius function $\mu$ is multiplicative.

THEOREM 15. We have
\[ \sum_{d \mid n} \mu(d) = \begin{cases} 1, & \text{if } n = 1, \\ 0, & \text{if } n > 1. \end{cases} \]

PROOF. Let $n > 1$, with the standard form $n = \prod_{i=1}^{m} p_i^{a_i}$. The only divisors $d$ of $n$, for which $\mu(d) \ne 0$, are:
\[ 1;\quad p_1, p_2, \ldots, p_m;\quad p_1 p_2, p_1 p_3, \ldots;\quad \ldots;\quad p_1 p_2 \cdots p_m. \]
Thus
\[ \sum_{d \mid n} \mu(d) = \mu(1) + \sum_{i} \mu(p_i) + \sum_{i < j} \mu(p_i p_j) + \cdots + \mu(p_1 p_2 \cdots p_m), \]
hence
\[ \sum_{d \mid n} \mu(d) = 1 - \binom{m}{1} + \binom{m}{2} - \binom{m}{3} + \cdots = (1 - 1)^m = 0. \]

One can alternatively define the Möbius function by Theorem 15, and deduce properties (i), (ii), (iii) from it.
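Properties (i), (ii), (iii) translate almost word for word into a procedure for computing $\mu(n)$, and Theorem 15 can then be observed numerically. The following Python sketch is ours, not the book's; the names `mobius`, `divisors` and `mobius_divisor_sum` are hypothetical, and trial division is used only for simplicity.

```python
def mobius(n):
    """mu(n) computed from the three defining properties (i)-(iii),
    using trial division."""
    if n == 1:
        return 1                      # property (i)
    k = 0                             # distinct prime factors found so far
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:            # p^2 divides the original n
                return 0              # property (iii)
            k += 1
        p += 1
    if n > 1:                         # one prime factor is left over
        k += 1
    return (-1) ** k                  # property (ii)


def divisors(n):
    """All positive divisors of n, by brute force."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius_divisor_sum(n):
    """The sum of mu(d) over the divisors d of n (left side of Theorem 15)."""
    return sum(mobius(d) for d in divisors(n))


print([mobius(n) for n in range(1, 11)])
print([mobius_divisor_sum(n) for n in range(1, 11)])
```

For instance, $n = 12$ has the divisors 1, 2, 3, 4, 6, 12, with $\mu$-values 1, -1, -1, 0, 1, 0, whose sum is 0, in agreement with Theorem 15.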
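The most important applications of this function stem from the so-called Möbius inversion formulae.

THEOREM 16. (The first Möbius inversion formula). If $f$ is an arithmetical function, and
\[ g(n) = \sum_{d \mid n} f(d), \]
then
\[ f(n) = \sum_{d \mid n} \mu(d)\, g\!\left(\frac{n}{d}\right). \]

PROOF.
\[ \sum_{d \mid n} \mu(d)\, g\!\left(\frac{n}{d}\right) = \sum_{d \mid n} \mu(d) \sum_{d' \mid \frac{n}{d}} f(d') = \sum_{dd' \mid n} \mu(d) f(d') = \sum_{d' \mid n} f(d') \sum_{d \mid \frac{n}{d'}} \mu(d) = f(n) \quad \text{(by Theorem 15)}. \]

Theorem 16 has a converse given by

THEOREM 17. If
\[ h(n) = \sum_{d \mid n} \mu(d)\, f\!\left(\frac{n}{d}\right) = \sum_{d \mid n} \mu\!\left(\frac{n}{d}\right) f(d), \]
then
\[ f(n) = \sum_{d \mid n} h(d). \]

PROOF. When $d$ runs through the divisors of $n$, so does $n/d$. Hence
\[ \sum_{d \mid n} h(d) = \sum_{d \mid n} h\!\left(\frac{n}{d}\right) = \sum_{d \mid n}\ \sum_{d' \mid \frac{n}{d}} \mu\!\left(\frac{n}{dd'}\right) f(d') = \sum_{dd' \mid n} \mu\!\left(\frac{n}{dd'}\right) f(d') = \sum_{d' \mid n} f(d') \sum_{d \mid \frac{n}{d'}} \mu\!\left(\frac{n}{dd'}\right) = f(n) \quad \text{(by Theorem 15)}. \]

As an application of Theorem 16, let us consider the relation
\[ \sum_{d \mid n} \varphi(d) = n, \]
which was proved in Chapter II, Theorem 6. From Theorem 16 it follows that
\[ \varphi(n) = \sum_{d \mid n} \mu(d)\, \frac{n}{d} = n \sum_{d \mid n} \frac{\mu(d)}{d}. \tag{8} \]

As another application, we can consider the von Mangoldt function $\Lambda$, defined by
\[ \Lambda(n) = \begin{cases} \log p, & \text{if } n \text{ is a prime power } p^m,\ m > 0, \\ 0, & \text{otherwise.} \end{cases} \]

THEOREM 18. $\displaystyle \sum_{d \mid n} \Lambda(d) = \log n$.

PROOF. Let $n > 1$, and have the standard form $n = \prod_{i=1}^{r} p_i^{a_i}$. Then, by the definition of $\Lambda$, we have
\[ \sum_{d \mid n} \Lambda(d) = \sum_{i=1}^{r} \sum_{a=1}^{a_i} \Lambda(p_i^{a}) = \sum_{i=1}^{r} a_i \log p_i = \log n, \]
which proves the theorem.

In conjunction with the first Möbius inversion formula, Theorem 18 gives
\[ \Lambda(n) = \sum_{d \mid n} \mu(d) \log \frac{n}{d}. \]
Since $\sum_{d \mid n} \mu(d) = 0$, if $n > 1$, by Theorem 15, and $\log 1 = 0$, it follows that
\[ \Lambda(n) = -\sum_{d \mid n} \mu(d) \log d. \tag{9} \]

THEOREM 19. (The second Möbius inversion formula). If $f$ is a function defined for $x \ge 1$, and
\[ g(x) = \sum_{n \le x} f\!\left(\frac{x}{n}\right), \]
then
\[ f(x) = \sum_{n \le x} \mu(n)\, g\!\left(\frac{x}{n}\right), \quad \text{for } x \ge 1, \]
and conversely. The sum $\sum_{n \le x}$ is interpreted as $\sum_{n=1}^{[x]}$, and a sum without terms is 0.

PROOF. From the definition of $g$ we have, if $x \ge 1$,
\[ \sum_{n \le x} \mu(n)\, g\!\left(\frac{x}{n}\right) = \sum_{n \le x} \mu(n) \sum_{m \le \frac{x}{n}} f\!\left(\frac{x}{mn}\right) = \sum_{\substack{m,n \\ 1 \le mn \le x}} \mu(n)\, f\!\left(\frac{x}{mn}\right). \]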
mn If we rearrange this last sum by grouping together terms for which mn=r, 1 :(r:(x, we get L m,n l1(n)f l~mn:::::;x (~) = mn L 1 ~r~x f (~) LI1(n) = f(x) , r n/r and the first part of Theorem 19 is proved. To prove the converse, let f(x) = L f m~x (~) = m L m~x L l1(n)g ~x n-...::;:;; f: l1(n)g (~), x ~ 1. n n",x (~) = mn L m,n 1 ~mn~x and, as above, this last sum can be written as 1 ;;~x g (~) ~fl(n) = g(X). l1(n)g Then (~), mn 59 Euler's function cp(n) §6 § 6. Euler's function cp(n). We return to Euler's function cp. We know that cp(n) < n, if n> 1. On the other hand, if n = pm, where p is a prime, m~ 1, and p> l/e, O<e< 1, then [cf. Chapter II] cp(n) =n (l-D > n(l- e). From these inequalities we obtain lim cp(n) = 1. THEOREM 20. n n~oo Another result on the order of magnitude of cp(n) is THEOREM 21. For every b > 0, we have cp(n) as ~~OO, n PROOF. The result is trivial if (j n~oo. > 1. If (j:::; 1, we set nl-a f(n)=-. cp(n) Then f is multiplicative, and because of Theorem 6, it is sufficient to prove that f(pm)~o as pm ~oo. In fact, we have, for every (j > 0, 1 (pm) _ _ - _CP_ _ - ma f(pm)-pm(l-a)-P (1) 1 1-- >- _ ma~oo c:;.--2 P p It follows from Theorem 20, or Theorem 21, that the assertion cp(n) = O(nA) is false for every Ll < 1. THE AVERAGE ORDER OF <p(n). Let us consider the behaviour of the summatory function of cp, namely <P(t) = L cp(n); <P(N) is the number of terms in the Farey sequence of order N. THEOREM 22 (MERTENS). <P(t) = 3t2 -2 n + O(tlogt). PROOF. Since <P(t) = ) ' )' 1 = )' 1, 1 ~~t 1 !;;:~n 1 ~m~n~t (m,n)=l (m,n)=l 60 Arithmetical functions and lattice points VI we see that tP(t) is equal to the number of lattice points with relatively prime co-ordinates, which lie inside or on a right-angled triangle: O<y~x~t. We consider the square 0 < x ~ t, 0 < Y ~ t. The line x = y divides it into two right-angled triangles, each of which contains the same number of lattice points, with relatively prime co-ordinates. 
One of them is given by $0 < y \le x \le t$. The only lattice point with relatively prime co-ordinates on the line $x = y$ is the point $x = y = 1$. If $\Psi(t)$ denotes the number of lattice points with relatively prime co-ordinates in the above mentioned square, then
\[ \Psi(t) = 2\Phi(t) - 1, \tag{10} \]
for the point $x = y = 1$ is counted in both the triangles.

The total number of lattice points in the square $0 < x \le t$, $0 < y \le t$, is $[t]^2$, so that
\[ [t]^2 = \sum_{\substack{0 < m \le t \\ 0 < n \le t}} 1. \]
If we arrange them according to the size of the greatest common divisor of their co-ordinates $m$ and $n$, we have
\[ [t]^2 = \sum_{1 \le d \le t}\ \sum_{\substack{0 < m \le t \\ 0 < n \le t \\ (m,n) = d}} 1. \tag{11} \]
Since $(m,n) = d$, if and only if $(m/d, n/d) = 1$, there exists a one-one correspondence between the lattice points with the co-ordinates $m, n$ such that
\[ 0 < m \le t, \quad 0 < n \le t, \quad (m,n) = d, \]
and the pairs of integers $m', n'$, such that
\[ 0 < m' \le \frac{t}{d}, \quad 0 < n' \le \frac{t}{d}, \quad (m',n') = 1. \]
But by the definition of $\Psi$, there are exactly $\Psi(t/d)$ such pairs $m', n'$. Hence (11) can be written as
\[ [t]^2 = \sum_{1 \le d \le t} \Psi\!\left(\frac{t}{d}\right). \tag{12} \]
Since it is $\Psi(t)$ which we want, we apply the second Möbius inversion formula to (12), and obtain
\[ \Psi(t) = \sum_{1 \le d \le t} \mu(d) \left[\frac{t}{d}\right]^2, \qquad t \ge 1. \]
Now $t/d = [t/d] + \theta$, with $0 \le \theta < 1$, so that
\[ \Psi(t) = \sum_{1 \le d \le t} \mu(d) \left\{\frac{t}{d} + O(1)\right\}^2 = t^2 \sum_{1 \le d \le t} \frac{\mu(d)}{d^2} + 2t \cdot O\!\left(\sum_{1 \le d \le t} \frac{1}{d}\right) + O\!\left(\sum_{1 \le d \le t} 1\right), \]
since $|\mu(n)| \le 1$. From Corollary 1 of Theorem 7, we know that
\[ 2t \cdot O\!\left(\sum_{1 \le d \le t} \frac{1}{d}\right) = 2t \cdot O\!\left(\log t + \gamma + O\!\left(\frac{1}{t}\right)\right) = O(t \log t), \]
and $O\!\left(\sum_{1 \le d \le t} 1\right) = O(t)$. Hence
\[ \Psi(t) = t^2 \sum_{1 \le d \le t} \frac{\mu(d)}{d^2} + O(t \log t). \tag{13} \]
To estimate the sum in (13), we observe that
\[ \sum_{1 \le d \le t} \frac{\mu(d)}{d^2} = \sum_{d=1}^{\infty} \frac{\mu(d)}{d^2} - \sum_{d=[t]+1}^{\infty} \frac{\mu(d)}{d^2}, \]
and
\[ \left| \sum_{d=[t]+1}^{\infty} \frac{\mu(d)}{d^2} \right| < \sum_{d=[t]+1}^{\infty} \frac{1}{d^2} < \int_{[t]}^{\infty} \frac{du}{u^2} = \frac{1}{[t]} = O\!\left(\frac{1}{t}\right). \]
Thus (13) gives
\[ \Psi(t) = t^2 \sum_{d=1}^{\infty} \frac{\mu(d)}{d^2} + O(t \log t). \tag{14} \]
Here the series $\sum_{d=1}^{\infty} \mu(d)/d^2$ can be evaluated as follows. Since $\sum_{n=1}^{\infty} n^{-2}$, and $\sum_{m=1}^{\infty} \mu(m)\, m^{-2}$ are both absolutely convergent, we can multiply them out, and get
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} \cdot \sum_{m=1}^{\infty} \frac{\mu(m)}{m^2} = \sum_{\nu=1}^{\infty} \frac{c_\nu}{\nu^2}, \quad \text{where } c_\nu = \sum_{k \mid \nu} \mu(k). \]
Since $c_1 = 1$, and $c_n = 0$ for $n > 1$, by Theorem 15, and $\sum_{n=1}^{\infty} n^{-2} = \pi^2/6$, we have
\[ \sum_{n=1}^{\infty} \frac{\mu(n)}{n^2} = \left(\sum_{n=1}^{\infty} \frac{1}{n^2}\right)^{-1} = \frac{6}{\pi^2}. \]
If we substitute this in (14), we get
\[ \Psi(t) = \frac{6t^2}{\pi^2} + O(t \log t), \]
which, together with (10), gives
\[ \Phi(t) = \frac{3t^2}{\pi^2} + O(t \log t), \tag{15} \]
as claimed.

RELATION BETWEEN φ AND σ. It is interesting to note that the results on the order of $\varphi$ lead to results on the order of $\sigma$, and vice versa. This follows from

THEOREM 23. There exists a positive constant $C$, such that
\[ C < \frac{\sigma(n)\,\varphi(n)}{n^2} < 1, \quad \text{for } n \ge 2. \tag{16} \]

PROOF. If $n = \prod_{p \mid n} p^{a}$, then we know from (7), with the obvious change in notation,
\[ \sigma(n) = \prod_{p \mid n} \frac{p^{a+1} - 1}{p - 1} = n \prod_{p \mid n} \frac{1 - p^{-a-1}}{1 - p^{-1}}. \]
Since
\[ \varphi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right), \]
we have
\[ \frac{\sigma(n)\,\varphi(n)}{n^2} = \prod_{p \mid n} \left(1 - \frac{1}{p^{a+1}}\right) < 1, \]
which proves the second inequality in (16). On the other hand,
\[ \prod_{p \mid n} \left(1 - \frac{1}{p^{a+1}}\right) \ge \prod_{p \mid n} \left(1 - \frac{1}{p^2}\right) > \prod_{p} \left(1 - \frac{1}{p^2}\right), \]
since $1 - 1/p^2 < 1$, and the product on the right extends over all the primes $p$. This gives the first inequality in (16).

Chapter VII
Chebyshev's theorem on the distribution of prime numbers

§ 1. The Chebyshev functions. We have seen in Chapter I that there are infinitely many prime numbers. If we denote by $\pi(x)$ the number of primes not exceeding $x$, it follows that $\pi(x) \to \infty$ as $x \to \infty$. The prime number theorem, which we shall prove in Chapter XI, tells us much more, namely that
\[ \lim_{x \to \infty} \frac{\pi(x)}{x/\log x} = 1. \]
There are several intermediate results of interest, which we shall prove in this chapter. We begin with a result, due to Euler, that the sum $\sum 1/p$, extended over all the primes, diverges, from which it follows that the number of primes is infinite.

THEOREM 1 (EULER). The sum $\sum 1/p$, and the product $\prod (1 - 1/p)^{-1}$, are both divergent, as $p$ runs through all the prime numbers.

PROOF. We shall first show that the product diverges, and then deduce that the series also does.
Let
\[ P(x) = \prod_{p \le x} \left(1 - \frac{1}{p}\right)^{-1}, \qquad S(x) = \sum_{p \le x} \frac{1}{p}, \qquad x \ge 2. \]
If $u$ is a real number, $0 < u < 1$, and $m$ a positive integer, we have
\[ \frac{1}{1-u} > \frac{1 - u^{m+1}}{1-u} = 1 + u + \cdots + u^m. \]
We can set $u = 1/p$, where $p$ is a prime. If we do this for all primes $p \le x$, and multiply the resulting inequalities, we get
\[ P(x) > \prod_{p \le x} \left(1 + \frac{1}{p} + \cdots + \frac{1}{p^m}\right). \]
We now choose $m$, such that $2^m \ge x$. Then
\[ \prod_{p \le x} \left(1 + \frac{1}{p} + \cdots + \frac{1}{p^m}\right) \ge \sum_{n=1}^{[x]} \frac{1}{n}, \]
for the integers $n$, such that $1 < n \le [x]$, have as prime factors only those primes $p \le x$, and the inequality $2^m \ge x$ ensures that every term in the sum on the right-hand side comes from the product on the left. Hence
\[ P(x) > \sum_{n=1}^{[x]} \frac{1}{n} > \int_1^{[x]+1} \frac{du}{u} > \log x. \]
Hence the product $\prod (1 - 1/p)^{-1}$ diverges.

To prove the divergence of the series, we consider the expansion
\[ \log\left(\frac{1}{1-u}\right) = u + \frac{u^2}{2} + \frac{u^3}{3} + \cdots, \qquad -1 \le u < 1. \]
If $u > 0$, we have
\[ \log\left(\frac{1}{1-u}\right) - u < \frac{u^2}{2}\left(1 + u + u^2 + \cdots\right). \]
The geometric series on the right converges for $0 < |u| < 1$, so that we get
\[ \log\left(\frac{1}{1-u}\right) - u < \frac{u^2}{2(1-u)}, \qquad 0 < u < 1. \]
Setting $u = 1/p$, for all $p \le x$, and adding together the resulting inequalities, we obtain
\[ \log P(x) - S(x) < \frac{1}{2} \sum_{p \le x} \frac{1}{p(p-1)} < \frac{1}{2} \sum_{n=2}^{\infty} \frac{1}{n(n-1)} = \frac{1}{2}, \]
so that
\[ S(x) > \log P(x) - \tfrac{1}{2} > \log\log x - \tfrac{1}{2}. \]
Hence $\sum 1/p$ diverges, which completes the proof of Theorem 1.

THE FUNCTIONS ϑ AND ψ. Chebyshev's functions $\vartheta$ and $\psi$ are defined as follows:
\[ \vartheta(x) = \sum_{p \le x} \log p, \qquad x > 0,\ p \text{ a prime}, \tag{1} \]
and
\[ \psi(x) = \sum_{p^m \le x} \log p, \qquad x > 0. \tag{2} \]
The sum in (2) extends over pairs $p, m$, where $p$ is a prime, and $m$ is a positive integer, such that $p^m \le x$. This means that if $p^m$ is the highest power of $p$ not exceeding $x$, then $\log p$ is counted exactly $m$ times in the sum. For example,
\[ \psi(10) = 3\log 2 + 2\log 3 + \log 5 + \log 7. \]
In Chapter VI, § 5, we introduced the von Mangoldt function
\[ \Lambda(n) = \begin{cases} \log p, & \text{if } n = p^m,\ m \text{ a positive integer}, \\ 0, & \text{otherwise.} \end{cases} \]
From (2) it is immediate that
\[ \psi(x) = \sum_{n \le x} \Lambda(n). \tag{3} \]
Further it follows from (1) and (2) that $e^{\vartheta(x)}$ equals the product of all primes $p \le x$; and, for $x \ge 1$, $e^{\psi(x)}$ is the least common multiple of all positive integers $\le x$.

If $p^m \le x$, then $p \le x^{1/m}$, and conversely. Hence (2) leads to the relation
\[ \psi(x) = \vartheta(x) + \vartheta(x^{1/2}) + \vartheta(x^{1/3}) + \cdots, \tag{4} \]
the series being finite, since $\vartheta(x) = 0$ for $x < 2$.

If $p^m \le x < p^{m+1}$, $x \ge 1$, then $\log p$ occurs exactly $m$ times in $\psi(x)$, and $m = [\log x/\log p]$. Hence we have a fourth expression for $\psi(x)$, namely
\[ \psi(x) = \sum_{p \le x} \left[\frac{\log x}{\log p}\right] \cdot \log p. \tag{5} \]
We shall now establish a connexion between the functions
\[ \frac{\pi(x)}{x/\log x}, \qquad \frac{\vartheta(x)}{x}, \qquad \frac{\psi(x)}{x}. \]

THEOREM 2. Let
\[ l_1 = \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x}, \qquad L_1 = \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x}, \]
\[ l_2 = \liminf_{x \to \infty} \frac{\vartheta(x)}{x}, \qquad L_2 = \limsup_{x \to \infty} \frac{\vartheta(x)}{x}, \]
\[ l_3 = \liminf_{x \to \infty} \frac{\psi(x)}{x}, \qquad L_3 = \limsup_{x \to \infty} \frac{\psi(x)}{x}. \]
Then $l_1 = l_2 = l_3$, and $L_1 = L_2 = L_3$.

PROOF. It follows from (4) that $\vartheta(x) \le \psi(x)$, and from (5) that
\[ \psi(x) \le \sum_{p \le x} \frac{\log x}{\log p} \cdot \log p = \log x \sum_{p \le x} 1, \]
that is $\psi(x) \le \pi(x) \log x$. Hence
\[ \vartheta(x) \le \psi(x) \le \pi(x) \log x. \]
If we divide throughout by $x$, and let $x \to \infty$, we get
\[ L_2 \le L_3 \le L_1. \tag{6} \]
Let us choose a real number $\alpha$, $0 < \alpha < 1$, and keep it fixed. Let $x > 1$. Then
\[ \vartheta(x) \ge \sum_{x^{\alpha} < p \le x} \log p, \]
and since we have $\log p > \log x^{\alpha}$,
\[ \vartheta(x) \ge \alpha \log x \sum_{x^{\alpha} < p \le x} 1, \]
which implies that
\[ \vartheta(x) \ge \alpha \log x\, \{\pi(x) - \pi(x^{\alpha})\}. \]
But $\pi(x^{\alpha}) < x^{\alpha}$ trivially, so that
\[ \vartheta(x) > \alpha\, \pi(x) \log x - \alpha\, x^{\alpha} \log x, \]
or
\[ \frac{\vartheta(x)}{x} > \alpha\, \frac{\pi(x) \log x}{x} - \alpha\, \frac{\log x}{x^{1-\alpha}}. \]
Since $0 < \alpha < 1$, it follows that $(\log x / x^{1-\alpha}) \to 0$, as $x \to \infty$. Hence
\[ L_2 \ge \alpha L_1, \]
for every real $\alpha$, such that $0 < \alpha < 1$. Hence $L_2 \ge L_1$. On combining this with (6), we get $L_1 = L_2 = L_3$.

The proof that $l_1 = l_2 = l_3$ runs along similar lines.

It follows from Theorem 2 that if one of the three functions
\[ \frac{\pi(x)}{x/\log x}, \qquad \frac{\vartheta(x)}{x}, \qquad \frac{\psi(x)}{x} \]
tends to a limit as $x \to \infty$, then so do the others, and all three limits are the same. Thus in order to prove the prime number theorem, it is sufficient to show that $\lim_{x \to \infty} \psi(x)/x = 1$.
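Definitions (1) and (2), and the chain $\vartheta(x) \le \psi(x) \le \pi(x)\log x$ used in the proof of Theorem 2, are easy to tabulate for small $x$. The Python sketch below is our own illustration, with hypothetical helper names `primes_upto`, `prime_count`, `theta` and `psi`; it works in floating point, so the printed ratios are approximations only.

```python
from math import isqrt, log


def primes_upto(x):
    """All primes p <= x, by the sieve of Eratosthenes."""
    n = int(x)
    if n < 2:
        return []
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, isqrt(n) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i in range(n + 1) if sieve[i]]


def prime_count(x):
    """pi(x), the number of primes not exceeding x."""
    return len(primes_upto(x))


def theta(x):
    """Chebyshev's theta(x): the sum of log p over primes p <= x."""
    return sum(log(p) for p in primes_upto(x))


def psi(x):
    """Chebyshev's psi(x): log p is counted once for every power p^m <= x."""
    total = 0.0
    for p in primes_upto(x):
        q = p
        while q <= x:
            total += log(p)
            q *= p
    return total


# The three quotients compared in Theorem 2, for a few values of x.
for x in (10, 100, 1000, 10000):
    print(x, theta(x) / x, psi(x) / x, prime_count(x) * log(x) / x)
```

At each $x$ the three printed columns appear in increasing order, as the inequality $\vartheta(x) \le \psi(x) \le \pi(x)\log x$ requires.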
§ 2. Chebyshev's theorem. We shall use Theorem 2 to prove the following

THEOREM 3 (CHEBYSHEV). There exist constants $a$ and $A$, $0 < a < A$, such that if $x$ is sufficiently large, we have
\[ a\, \frac{x}{\log x} < \pi(x) < A\, \frac{x}{\log x}. \]

PROOF. Let
\[ l = \liminf_{x \to \infty} \frac{\pi(x)}{x/\log x}, \qquad L = \limsup_{x \to \infty} \frac{\pi(x)}{x/\log x}. \]
We shall prove Theorem 3 by showing that $L \le 4\log 2$, and $l \ge \log 2$. By Theorem 2 these two inequalities are, however, equivalent to
\[ L = \limsup_{x \to \infty} \frac{\vartheta(x)}{x} \le 4\log 2, \tag{7} \]
\[ l = \liminf_{x \to \infty} \frac{\psi(x)}{x} \ge \log 2. \tag{8} \]

PROOF OF (7). The binomial coefficient
\[ N = \binom{2n}{n} = \frac{(n+1)(n+2) \cdots (2n)}{1 \cdot 2 \cdot 3 \cdots n} \]
has the following properties:
(i) $N$ is an integer, which occurs as the largest term in the binomial expansion of $(1+1)^{2n}$, which has $(2n+1)$ positive terms, so that
\[ \frac{2^{2n}}{2n+1} < N < 2^{2n}; \tag{9} \]
(ii) $N$ is divisible by the product of all primes $p$, such that $n < p \le 2n$, for every such prime appears in the numerator of $N$, while its denominator is not divisible by any prime $p > n$.

Because of (ii), we have $N \ge \prod_{n < p \le 2n} p$, hence
\[ \log N \ge \sum_{n < p \le 2n} \log p = \vartheta(2n) - \vartheta(n). \]
But from (9) we get $\log N < 2n \log 2$. Hence
\[ \vartheta(2n) - \vartheta(n) < 2n \log 2. \tag{10} \]
If we set $n = 1, 2, 2^2, \ldots, 2^{m-1}$ in (10), and add the resulting inequalities, we get
\[ \vartheta(2^m) - \vartheta(1) < \log 2 \sum_{r=1}^{m} 2^r < 2^{m+1} \log 2, \]
or
\[ \vartheta(2^m) < 2^{m+1} \log 2, \tag{11} \]
since $\vartheta(1) = 0$. Now let $x > 1$, and $m$ a positive integer, such that $2^{m-1} \le x < 2^m$. Since the function $\vartheta$ is non-decreasing, (11) gives
\[ \vartheta(x) \le \vartheta(2^m) < 2^{m+1} \log 2 \le 4x \log 2. \]
Hence
\[ \frac{\vartheta(x)}{x} < 4\log 2, \]
which implies that
\[ L = \limsup_{x \to \infty} \frac{\vartheta(x)}{x} \le 4\log 2, \]
as claimed in (7).

PROOF OF (8). The second part of Chebyshev's theorem is proved differently. It uses an important formula for the number of times a given prime divides $m!$. We say that a prime $p$ divides the integer $n$ exactly $k$ times, if $p^k \mid n$, and $p^{k+1} \nmid n$.

LEMMA. The number of times a prime $p$ exactly divides $m!$ is equal to
\[ \left[\frac{m}{p}\right] + \left[\frac{m}{p^2}\right] + \left[\frac{m}{p^3}\right] + \cdots, \]
the series being finite since $[x] = 0$ for $0 < x < 1$.

Among the integers $1, 2, \ldots, m$, there are exactly $[m/p]$ which are divisible by $p$, namely
\[ p,\ 2p,\ \ldots,\ \left[\frac{m}{p}\right] p. \tag{12} \]
The integers between 1 and $m$ which are divisible by $p^2$ (a subset of the set (12)) are
\[ p^2,\ 2p^2,\ \ldots,\ \left[\frac{m}{p^2}\right] p^2, \tag{13} \]
which are $[m/p^2]$ in number, and so on.

The number of integers between 1 and $m$, which are divisible by $p^r$ but not by $p^{r+1}$ is exactly $[m/p^r] - [m/p^{r+1}]$. Hence $p$ divides $m!$ exactly
\[ \sum_{r \ge 1} r \left( \left[\frac{m}{p^r}\right] - \left[\frac{m}{p^{r+1}}\right] \right) = \sum_{r \ge 1} \left[\frac{m}{p^r}\right] \tag{14} \]
times, which proves the lemma.

In order to prove (8), we consider the integer
\[ N = \binom{2n}{n} = \frac{(2n)!}{(n!)^2}. \]
Let $p$ be any prime, such that $p \le 2n$. Then the numerator of $N$ is divisible by $p$ exactly
\[ \sum_{r \ge 1} \left[\frac{2n}{p^r}\right] \]
times, and $n!$ is divisible by $p$ exactly
\[ \sum_{r \ge 1} \left[\frac{n}{p^r}\right] \]
times, so that the denominator of $N$ is divisible by $p$ exactly
\[ 2 \sum_{r \ge 1} \left[\frac{n}{p^r}\right] \]
times. Hence $N$ is divisible by $p$ exactly $\nu_p$ times, where
\[ \nu_p = \sum_{r \ge 1} \left( \left[\frac{2n}{p^r}\right] - 2\left[\frac{n}{p^r}\right] \right). \]
Therefore
\[ N = \prod_{p \le 2n} p^{\nu_p}. \]
Since $\left[\frac{2n}{p^r}\right] = \left[\frac{n}{p^r}\right] = 0$ when $p^r > 2n$, that is when
\[ r > \left[\frac{\log 2n}{\log p}\right], \]
we have
\[ \nu_p = \sum_{r=1}^{M_p} \left( \left[\frac{2n}{p^r}\right] - 2\left[\frac{n}{p^r}\right] \right), \qquad M_p = \left[\frac{\log 2n}{\log p}\right]. \tag{15} \]
However, for any real $y$, we have $[y] \le y < [y] + 1$, or $2[y] \le 2y < 2[y] + 2$, and $[2y] \le 2y < [2y] + 1$, from which it follows that $-1 < [2y] - 2[y] < 2$, hence
\[ [2y] - 2[y] = 0, \text{ or } 1. \tag{16} \]
On using this in (15), we get
\[ \nu_p \le M_p, \]
hence
\[ N = \prod_{p \le 2n} p^{\nu_p} \le \prod_{p \le 2n} p^{M_p}. \tag{17} \]
On the other hand, (5) and (15) give
\[ \psi(2n) = \sum_{p \le 2n} \left[\frac{\log 2n}{\log p}\right] \log p = \sum_{p \le 2n} M_p \log p, \]
so that
\[ e^{\psi(2n)} = \prod_{p \le 2n} p^{M_p}, \]
hence by (17), $\log N \le \psi(2n)$. From (9) we have
\[ \log N > 2n \log 2 - \log(2n+1). \]
Hence for every positive integer $n$, we have
\[ \psi(2n) > 2n \log 2 - \log(2n+1). \tag{18} \]
Let $x$ be now a real number, $x > 2$, and let $n = [x/2] \ge 1$. Then $n > (x/2) - 1$, and $2n \le x$. From (18), therefore, we get
\[ \psi(x) \ge \psi(2n) > (x-2)\log 2 - \log(x+1), \]
or
\[ \frac{\psi(x)}{x} > \frac{x-2}{x} \log 2 - \frac{\log(x+1)}{x}, \]
hence
\[ l = \liminf_{x \to \infty} \frac{\psi(x)}{x} \ge \log 2, \]
which proves Theorem 3.

It follows from Theorem 3 that the number of primes is infinite, and, in fact, that the series $\sum 1/p$, extended over all the primes, diverges. Let $p_n$ be the $n$th prime.
Then $\pi(p_n) = n$, and since we have
\[ \pi(x) > a \cdot \frac{x}{\log x}, \qquad a > 0, \]
for sufficiently large $x$, it follows that
\[ n = \pi(p_n) > a \cdot \frac{p_n}{\log p_n} > \sqrt{p_n}, \]
if $n$ is sufficiently large. Hence $\log p_n < 2\log n$, so that
\[ a\, p_n < n \log p_n < 2n \log n, \]
for sufficiently large $n$. It follows that the series $\sum_{n=1}^{\infty} 1/p_n$ diverges, in comparison with the divergent series $\sum_{n=2}^{\infty} 1/(n \log n)$.

§ 3. Bertrand's postulate. The following theorem was conjectured by BERTRAND but first proved by CHEBYSHEV.

THEOREM 4 (Bertrand's postulate). If $n$ is a positive integer, there exists a prime $p$ such that $n < p \le 2n$.

Chebyshev's proof of this is based on ideas similar to those used in the proof of Theorem 3. The result is first proved for large values of $n$, and then verified for smaller values with the aid of a table of primes. We shall give here a proof due to S. S. PILLAI, which is simpler, in as much as it avoids the use of Stirling's formula for $\Gamma(n)$, and reduces the number of verifications to a minimum.

In proving Chebyshev's theorem, we applied inequality (9), namely
\[ \frac{2^{2n}}{2n+1} < N < 2^{2n}, \]
for the binomial coefficient $N = \binom{2n}{n}$, and deduced (11) from it, namely
\[ \vartheta(2^m) < 2^{m+1} \log 2. \tag{11} \]
We shall now require the sharper estimate
\[ \frac{2^{2n}}{2\sqrt n} < N < \frac{2^{2n}}{\sqrt{2n}}, \qquad n \ge 2, \tag{19} \]
in order to prove that (11) holds not only for powers of 2, but for all positive integers $n$, that is
\[ \vartheta(n) < 2n \log 2, \qquad n \ge 1. \tag{20} \]

PROOF OF (19). Define the number
\[ P = \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)}. \]
Since
\[ P = \frac{1 \cdot 3 \cdot 5 \cdots (2n-1)}{2 \cdot 4 \cdot 6 \cdots (2n)} \cdot \frac{2 \cdot 4 \cdot 6 \cdots (2n)}{2 \cdot 4 \cdot 6 \cdots (2n)} = \frac{(2n)!}{2^{2n}(n!)^2}, \]
we have $2^{2n} P = N$. It is obvious that
\[ 1 > \left(1 - \frac{1}{2^2}\right)\left(1 - \frac{1}{4^2}\right)\left(1 - \frac{1}{6^2}\right) \cdots \left(1 - \frac{1}{(2n)^2}\right), \]
which can also be written as
\[ 1 > \left(\frac{1 \cdot 3}{2^2}\right)\left(\frac{3 \cdot 5}{4^2}\right)\left(\frac{5 \cdot 7}{6^2}\right) \cdots \left(\frac{(2n-1)(2n+1)}{(2n)^2}\right), \]
or
\[ 1 > (2n+1)P^2 > 2nP^2 = 2n \cdot 2^{-4n} N^2, \]
which gives the second inequality in (19). Similarly we have
\[ 1 > \left(1 - \frac{1}{3^2}\right)\left(1 - \frac{1}{5^2}\right)\left(1 - \frac{1}{7^2}\right) \cdots \left(1 - \frac{1}{(2n-1)^2}\right), \]
which can be written as
\[ 1 > \left(\frac{2 \cdot 4}{3^2}\right)\left(\frac{4 \cdot 6}{5^2}\right)\left(\frac{6 \cdot 8}{7^2}\right) \cdots \left(\frac{(2n-2)\,2n}{(2n-1)^2}\right), \]
or
\[ 1 > \frac{1}{4nP^2} = \frac{2^{4n}}{4nN^2}, \]
which gives the first inequality in (19). Thus (19) is proved.

PROOF OF (20). This is trivial for $n = 1$ and $n = 2$. Assuming that it is true for some $n \ge 2$, we shall deduce that
\[ \vartheta(2n-1) < 2(2n-1)\log 2, \]
which would imply that $\vartheta(2n) = \vartheta(2n-1) < 4n \log 2$. Consider the integer
\[ \frac{N}{2} = \frac{1}{2}\binom{2n}{n} = \frac{(2n)!}{(n!)^2} \cdot \frac{n}{2n} = \frac{(2n-1)!}{n!\,(n-1)!} = \binom{2n-1}{n-1}. \]
This is divisible by all primes $p$, such that $n < p \le 2n-1$, and therefore also by their product. Hence
\[ \frac{N}{2} \ge \prod_{n < p \le 2n-1} p. \]
On taking logarithms, we get
\[ \log \frac{N}{2} \ge \vartheta(2n-1) - \vartheta(n). \]
But from (19) we have
\[ \log N < 2n \log 2 - \tfrac{1}{2} \log 2n. \]
On combining these two inequalities, we get
\[ \vartheta(2n-1) - \vartheta(n) < (2n-1)\log 2 - \tfrac{1}{2} \log 2n. \]
But, by hypothesis, we have $\vartheta(n) < 2n \log 2$, hence
\[ \vartheta(2n-1) < 2n \log 2 + (2n-1)\log 2 - \tfrac{1}{2} \log 2n, \]
which implies, since $n \ge 2$, that
\[ \vartheta(2n-1) < 2(2n-1)\log 2, \]
which is the sought inequality. Thus if (20) is proved for a certain positive integer $n \ge 2$, then it also holds for the integer $2n-1$, and hence for $2n$. If $\vartheta(n) < 2n \log 2$, for every $n$ in an interval of the form $2^{r-1} < n \le 2^r$, $r \ge 1$, then it is true also for every $n$ in the interval $2^r < n \le 2^{r+1}$. It follows by induction that (20) is true for $n \ge 1$.

We shall need (19) and (20) for Pillai's proof of Theorem 4.

PROOF OF THEOREM 4 (S. S. PILLAI). In order to prove Theorem 4, we shall prove that $\vartheta(2n) - \vartheta(n) > 0$ for $n \ge 2^6$, and verify the inequality directly for $1 \le n < 2^6$.

We consider once again the binomial coefficient (cf. (17))
\[ N = \binom{2n}{n} = \frac{(2n)!}{(n!)^2} = \prod_{p \le 2n} p^{\nu_p}, \]
where
\[ \nu_p = \sum_{r=1}^{M_p} \left( \left[\frac{2n}{p^r}\right] - 2\left[\frac{n}{p^r}\right] \right), \qquad M_p = \left[\frac{\log 2n}{\log p}\right]. \]
Then
\[ \log N = \sum_{p \le 2n} \nu_p \log p. \tag{21} \]
We split this sum into four parts $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ and $\Sigma_4$, corresponding to the following four different ranges of values of the prime $p$, namely
(i) $n < p \le 2n$;
(ii) $\frac{2n}{3} < p \le n$;
(iii) $\sqrt{2n} < p \le \frac{2n}{3}$, $n \ge 5$;
(iv) $p \le \sqrt{2n}$.

In $\Sigma_1$ we have $n/p < 1$, so that $[n/p] = 0$; and $1 \le 2n/p < 2$, so that $[2n/p] = 1$, and $[2n/p^2] = 0$.
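Hence $\nu_p=1$, and we obtain
$$\Sigma_1=\sum_{n<p\le 2n}\nu_p\log p=\sum_{n<p\le 2n}\log p=\vartheta(2n)-\vartheta(n). \tag{22}$$
In $\Sigma_2$ we have $1\le n/p<\tfrac32$, so that $[n/p]=1$ and $[2n/p]=2$. Further, if $n\ge 3$, then $[2n/p^2]=0$. Hence
$$\Sigma_2=0,\qquad\text{for } n\ge 3. \tag{23}$$
In $\Sigma_3$ we have $n\ge 5$, and $n/p^2<2n/p^2<1$, so that $\nu_p=[2n/p]-2[n/p]=0$, or $1$ (cf. (16)). Hence
$$\Sigma_3\le\sum_{\sqrt{2n}<p\le 2n/3}\log p=\vartheta\!\left(\frac{2n}{3}\right)-\vartheta\!\left(\sqrt{2n}\right).$$
But
$$\vartheta\!\left(\sqrt{2n}\right)=\sum_{p\le\sqrt{2n}}\log p\ge\log 2\sum_{p\le\sqrt{2n}}1=\pi\!\left(\sqrt{2n}\right)\log 2.$$
Hence
$$\Sigma_3\le\vartheta\!\left(\frac{2n}{3}\right)-\pi\!\left(\sqrt{2n}\right)\log 2. \tag{24}$$
In $\Sigma_4$ we apply Chebyshev's inequality (cf. (17))
$$\nu_p\le M_p=\left[\frac{\log 2n}{\log p}\right],$$
and get
$$\Sigma_4\le\sum_{p\le\sqrt{2n}}M_p\log p\le\sum_{p\le\sqrt{2n}}\frac{\log 2n}{\log p}\cdot\log p=\log 2n\sum_{p\le\sqrt{2n}}1,$$
that is
$$\Sigma_4\le\pi\!\left(\sqrt{2n}\right)\log 2n. \tag{25}$$
By combining (21), (22), (23), (24) and (25), we obtain, for $n\ge 5$,
$$\log N\le\vartheta(2n)-\vartheta(n)+\vartheta\!\left(\frac{2n}{3}\right)-\pi\!\left(\sqrt{2n}\right)(\log 2-\log 2n),$$
which can be written as
$$\vartheta(2n)-\vartheta(n)\ge\log N-\vartheta\!\left(\frac{2n}{3}\right)-\pi\!\left(\sqrt{2n}\right)\log n. \tag{26}$$
From this we shall deduce that $\vartheta(2n)-\vartheta(n)>0$, for sufficiently large $n$. For this purpose we need three inequalities:

(a) $\log N>2n\log 2-\log(2\sqrt{n})$, which is a consequence of the first inequality in (19);

(b) $\vartheta\!\left(\dfrac{2n}{3}\right)=\vartheta\!\left(\left[\dfrac{2n}{3}\right]\right)<2\left[\dfrac{2n}{3}\right]\log 2$, if $n\ge 2$, because of (20); and

(c) $\pi(n)\le\dfrac{n}{2}$, if $n\ge 8$, because every even integer greater than 2 is composite.

On using (a), (b) and (c) in (26), we get, for $n\ge 32$,
$$\vartheta(2n)-\vartheta(n)>2n\log 2-\log(2\sqrt{n})-\frac{4n}{3}\log 2-\frac{\sqrt{2n}}{2}\log n,$$
which can also be written as
$$\vartheta(2n)-\vartheta(n)>\left(\frac{2n}{3}-1\right)\log 2-\left(\frac{\sqrt{2n}+1}{2}\right)\log n. \tag{27}$$
It remains for us to show that
$$\left(\frac{2n}{3}-1\right)\log 2-\left(\frac{\sqrt{2n}+1}{2}\right)\log n>0, \tag{28}$$
for sufficiently large $n$. It is easy to see that (28) holds for $n=2^6$. We shall prove that it holds also for $n>2^6$. For this purpose we write (28) in the form
$$\sqrt{2n}-\frac32\,\frac{\log n}{\log 2}-\frac{3\sqrt2}{\log 2}\cdot\frac{\log(2\sqrt{n})}{2\sqrt{n}}>0. \tag{29}$$
If we replace $n$ by a real variable $x$, and observe that both the functions
$$\sqrt{2x}-\frac32\,\frac{\log x}{\log 2},\qquad\text{and}\qquad -\frac{3\sqrt2}{\log 2}\cdot\frac{\log(2\sqrt{x})}{2\sqrt{x}}$$
have a positive derivative for $x\ge 2^6$, so that they are increasing in that range, while their sum is positive for $x=2^6$, it follows that the sum remains positive for $x>2^6$. Hence
$$\vartheta(2n)-\vartheta(n)>0,\qquad n\ge 2^6. \tag{30}$$
That is, Bertrand's postulate is true for $n\ge 2^6=64$.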
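The two numerical facts used at this point, that (28) holds from $n=2^6$ on and that the cases $1\le n<2^6$ can be verified directly, are easy to check by machine. The following Python sketch is only an illustration and plays no part in the proof; the names `lhs_28`, `is_prime` and `has_prime_between` are ours and not the text's, and the finite range on which (28) is tested is an arbitrary choice.

```python
import math


def lhs_28(n: int) -> float:
    """Left-hand side of (28): (2n/3 - 1) log 2 - ((sqrt(2n) + 1) / 2) log n."""
    return (2 * n / 3 - 1) * math.log(2) - (math.sqrt(2 * n) + 1) / 2 * math.log(n)


def is_prime(k: int) -> bool:
    """Trial division; adequate for the small integers checked here."""
    if k < 2:
        return False
    return all(k % d != 0 for d in range(2, math.isqrt(k) + 1))


def has_prime_between(n: int) -> bool:
    """True if some prime p satisfies n < p <= 2n."""
    return any(is_prime(p) for p in range(n + 1, 2 * n + 1))


# (28) at n = 2^6 and on a finite range beyond it (an illustration, not a proof).
inequality_28_ok = all(lhs_28(n) > 0 for n in range(64, 10_000))

# Direct verification of Theorem 4 for 1 <= n < 2^6.
small_cases_ok = all(has_prime_between(n) for n in range(1, 64))

print(inequality_28_ok, small_cases_ok)
```

The monotonicity argument in the text is what covers all $n>2^6$; the loop over a finite range merely agrees with it.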
Now every prime, but the first, in the sequence
$$2,\ 3,\ 5,\ 7,\ 13,\ 23,\ 43,\ 67 \tag{31}$$
is smaller than twice its predecessor. Hence to each positive integer $n\le 66$, there corresponds at least one prime $p$, such that $n<p\le 2n$. This completes the proof of Theorem 4.

§ 4. Euler's identity. The identity
$$\sum_{n=1}^{\infty}\frac{1}{n^s}=\prod_p\left(1-p^{-s}\right)^{-1},\qquad s\ \text{real},\ s>1, \tag{32}$$
where $p$ runs through all the primes, is a special case of the following

THEOREM 5. Let $f$ be a multiplicative arithmetical function, and let the series $\sum_{n=1}^{\infty}f(n)$ be absolutely convergent. Then we have the identity
$$\sum_{n=1}^{\infty}f(n)=\prod_p\left(1+f(p)+f(p^2)+\cdots\right), \tag{33}$$
where the product on the right-hand side is absolutely convergent. If $f$ is completely multiplicative, that is $f(mn)=f(m)f(n)$, for all positive integers $m$, $n$, then
$$\sum_{n=1}^{\infty}f(n)=\prod_p\left(1-f(p)\right)^{-1}. \tag{34}$$

PROOF. Since $f$ is multiplicative, $f(1)=1$. Let
$$P(x)=\prod_{p\le x}\left(1+f(p)+f(p^2)+\cdots\right).$$
Since $P(x)$ is the product of finitely many absolutely convergent series, we can multiply them out and get
$$P(x)=\sum f(n'),$$
where $n'$ runs through all positive integers which have no prime factor greater than $x$. If we set
$$S=\sum_{n=1}^{\infty}f(n),$$
then
$$P(x)-S=-\sum f(n''),$$
where $n''$ runs through all positive integers which have at least one prime factor greater than $x$. Obviously $n''>x$, so that
$$|P(x)-S|\le\sum|f(n'')|\le\sum_{n>x}|f(n)|.$$
If we let $x\to\infty$, then $\sum_{n>x}|f(n)|\to 0$, since $\sum_{n=1}^{\infty}|f(n)|$ is, by hypothesis, convergent. Hence $\lim_{x\to\infty}P(x)=S$, as claimed in (33).

The product on the right-hand side of (33) converges absolutely, since
$$\sum_{p\le x}\left|f(p)+f(p^2)+\cdots\right|\le\sum_{p\le x}\left(|f(p)|+|f(p^2)|+\cdots\right)\le\sum_{n=2}^{\infty}|f(n)|<\infty. \tag{35}$$
We now consider the case in which $f$ is completely multiplicative. We see from (35) that the series
$$\sum_p\left(|f(p)|+|f(p^2)|+\cdots\right),$$
extended over all the primes, is convergent. But now $f(p^n)=(f(p))^n$, hence
$$\sum_p\left(|f(p)|+|f(p)|^2+\cdots\right)$$
is convergent.
Since each term in this sum is a geometric series, it follows that $|f(p)|<1$. Hence
$$\sum_{n=1}^{\infty}f(n)=\prod_p\left(1+f(p)+f(p)^2+\cdots\right)=\prod_p\left(1-f(p)\right)^{-1},$$
which completes the proof of Theorem 5.

Euler's identity results from (34), if we set $f(n)=n^{-s}$, $s>1$. Let
$$\zeta(s)=\sum_{n=1}^{\infty}\frac{1}{n^s}=\prod_p\left(1-p^{-s}\right)^{-1}\qquad(s\ \text{real},\ s>1).$$
Then
$$\log\zeta(s)=-\sum_p\log\left(1-p^{-s}\right)=\sum_p\sum_m\frac{1}{m\,p^{ms}},$$
where $p$ runs through all the primes, and $m$ through all positive integers. Differentiating term by term, we get
$$\frac{\zeta'(s)}{\zeta(s)}=-\sum_p\frac{p^{-s}\log p}{1-p^{-s}}=-\sum_p\sum_m\frac{\log p}{p^{ms}},$$
hence
$$-\frac{\zeta'(s)}{\zeta(s)}=\sum_{n=1}^{\infty}\frac{\Lambda(n)}{n^s}\qquad(s\ \text{real},\ s>1), \tag{36}$$
where $\Lambda$ is the von Mangoldt function defined in Chapter VI, § 5. The term-wise differentiation is permissible, because both the series $\sum_p\log(1-p^{-s})$, and $\sum_p\dfrac{p^{-s}\log p}{1-p^{-s}}$ converge uniformly for $s\ge 1+\delta>1$.

The right-hand side of (36) is a Dirichlet series of the form $\sum_{n=1}^{\infty}a_n n^{-s}$, whose coefficients $a_n$ are given by the von Mangoldt function $\Lambda(n)$. With the help of (36) we shall show that if any of the functions
$$\frac{\pi(x)}{x/\log x},\qquad\frac{\vartheta(x)}{x},\qquad\frac{\psi(x)}{x}$$
tends to a limit as $x\to\infty$, that limit must be equal to 1. We know already from Theorem 2 that if any of these three functions tends to a limit, so do the others, and all three limits are the same. We shall work with the function $\psi(x)/x$, and use the relation
$$\psi(x)=\sum_{n\le x}\Lambda(n).$$
We shall need the identity
$$-\frac{\zeta'(s)}{\zeta(s)}=s\int_1^{\infty}\frac{\psi(x)}{x^{s+1}}\,dx\qquad(s\ \text{real},\ s>1).$$
This can be obtained from Abel's summation formula.

THEOREM 6 (ABEL). Let $0\le\lambda_1\le\lambda_2\le\cdots$ be a sequence of real numbers, such that $\lambda_n\to\infty$ as $n\to\infty$, and let $(a_n)$ be a sequence of complex numbers. Let $A(x)=\sum_{\lambda_n\le x}a_n$, and $\varphi(x)$ a complex-valued function defined for $x\ge 0$. Then
$$\sum_{n=1}^{k}a_n\varphi(\lambda_n)=A(\lambda_k)\varphi(\lambda_k)-\sum_{n=1}^{k-1}A(\lambda_n)\left(\varphi(\lambda_{n+1})-\varphi(\lambda_n)\right). \tag{37}$$
If $\varphi$ has a continuous derivative in $(0,\infty)$, and $x\ge\lambda_1$, then (37) can be written as
$$\sum_{\lambda_n\le x}a_n\varphi(\lambda_n)=A(x)\varphi(x)-\int_{\lambda_1}^{x}A(t)\varphi'(t)\,dt. \tag{38}$$
If, in addition, $A(x)\varphi(x)\to 0$ as $x\to\infty$, then
$$\sum_{n=1}^{\infty}a_n\varphi(\lambda_n)=-\int_{\lambda_1}^{\infty}A(t)\varphi'(t)\,dt, \tag{39}$$
provided that either side is convergent.

PROOF.
If we define $A(\lambda_0)=0$, then we have
$$\sum_{n=1}^{k}a_n\varphi(\lambda_n)=\sum_{n=1}^{k}\left(A(\lambda_n)-A(\lambda_{n-1})\right)\varphi(\lambda_n)=A(\lambda_k)\varphi(\lambda_k)-\sum_{n=1}^{k-1}A(\lambda_n)\left(\varphi(\lambda_{n+1})-\varphi(\lambda_n)\right),$$
which proves (37). To prove (38), let $k$ be the largest integer, such that $\lambda_k\le x$. Then, since $\varphi$ has a continuous derivative $\varphi'$, the sum on the right-hand side of (37) equals
$$\sum_{n=1}^{k-1}A(\lambda_n)\int_{\lambda_n}^{\lambda_{n+1}}\varphi'(t)\,dt,$$
while the first term on the right-hand side of (37) equals
$$A(\lambda_k)\varphi(\lambda_k)=A(x)\varphi(x)-\int_{\lambda_k}^{x}A(t)\varphi'(t)\,dt,$$
since $A(t)$ is a step function which is constant in the interval $\lambda_k\le t<\lambda_{k+1}$. Thus (38) follows from (37), and (39) from (38) if we let $x\to\infty$. This completes the proof of Theorem 6.

If we set $\lambda_n=n$, $a_n=\Lambda(n)$, and $\varphi(x)=x^{-s}$ ($s$ real, $s>1$), then $A(x)=\psi(x)$, and $A(x)\varphi(x)\to 0$ as $x\to\infty$, since $\psi(x)\le\pi(x)\log x<x\log x$ (cf. Proof of Theorem 2), so that $A(x)\varphi(x)=O(x^{1-s}\log x)=o(1)$. Thus from (36) and (39) we obtain
$$-\frac{\zeta'(s)}{\zeta(s)}=s\int_1^{\infty}\frac{\psi(x)}{x^{s+1}}\,dx\qquad(s\ \text{real},\ s>1). \tag{40}$$
We are now in a position to prove

THEOREM 7.
$$\liminf_{x\to\infty}\frac{\pi(x)}{x/\log x}\le 1\le\limsup_{x\to\infty}\frac{\pi(x)}{x/\log x}.$$

PROOF. We shall prove that
$$\liminf_{x\to\infty}\frac{\psi(x)}{x}\le 1\le\limsup_{x\to\infty}\frac{\psi(x)}{x},$$
and apply Theorem 2. Let $f(s)=-\zeta'(s)/\zeta(s)$, for every real $s>1$, and let
$$l=\liminf_{x\to\infty}\frac{\psi(x)}{x},\qquad L=\limsup_{x\to\infty}\frac{\psi(x)}{x},\qquad l'=\liminf_{s\to 1+0}(s-1)f(s),\qquad L'=\limsup_{s\to 1+0}(s-1)f(s).$$
Obviously we have $l\le L$, and $l'\le L'$. We shall first show that $l\le l'\le L'\le L$, and then that $l'=L'=1$. Together they give Theorem 7.

If $B>L$, then $\psi(x)/x<B$ for $x\ge x_0=x_0(B)$, and we may assume that $x_0>1$. From (40) we have, for $s>1$,
$$f(s)=s\int_1^{\infty}\frac{\psi(x)}{x^{s+1}}\,dx<s\int_1^{x_0}\frac{\psi(x)}{x^{s+1}}\,dx+s\int_{x_0}^{\infty}\frac{B}{x^{s}}\,dx,$$
so that
$$f(s)<s\int_1^{x_0}\frac{\psi(x)}{x^{s+1}}\,dx+s\int_1^{\infty}\frac{B}{x^{s}}\,dx<s\int_1^{x_0}\frac{\psi(x)}{x^{2}}\,dx+\frac{sB}{s-1},$$
which can be written in the form
$$(s-1)f(s)<s(s-1)K+sB,\qquad\text{where}\quad K=\int_1^{x_0}\frac{\psi(x)}{x^2}\,dx.$$
If $s\to 1+0$, we obtain $L'\le B$. Since this holds for every $B>L$, we must have $L'\le L$. Similarly we prove that $l\le l'$, so that $l\le l'\le L'\le L$.
To show that $l'=L'=1$, we shall show that
$$\lim_{s\to 1+0}-(s-1)^2\zeta'(s)=1,\qquad\text{and}\qquad\lim_{s\to 1+0}(s-1)\zeta(s)=1.$$
Together they imply that $(s-1)f(s)\to 1$, as $s\to 1+0$.

For $s>1$, the function $x^{-s}$ is a decreasing function of $x$, so that
$$\int_1^{\infty}\frac{dx}{x^s}<\sum_{n=1}^{\infty}\frac{1}{n^s}<1+\int_1^{\infty}\frac{dx}{x^s},$$
that is
$$\frac{1}{s-1}<\zeta(s)<\frac{s}{s-1},$$
which implies that $(s-1)\zeta(s)\to 1$ as $s\to 1+0$.

On the other hand, for $s>1$, and $x\ge e$, the function $x^{-s}\log x$ is decreasing, so that
$$-\zeta'(s)=\sum_{n=1}^{\infty}\frac{\log n}{n^s}=\int_1^{\infty}\frac{\log x}{x^s}\,dx+O(1),$$
and on substituting $x^{s-1}=e^{y}$, we get
$$-\zeta'(s)=\frac{1}{(s-1)^2}\int_0^{\infty}y\,e^{-y}\,dy+O(1)=\frac{1}{(s-1)^2}+O(1).$$
Thus
$$(s-1)f(s)=-\frac{(s-1)^2\zeta'(s)}{(s-1)\zeta(s)}\to 1,$$
as $s\to 1+0$. Hence $l'=L'=1$, which implies that $l\le 1\le L$. Taken together with Theorem 2, this proves Theorem 7.

It follows that if $\dfrac{\pi(x)}{x/\log x}$ tends to a limit as $x\to\infty$, then that limit must be equal to 1.

§ 5. Some formulae of Mertens.

THEOREM 8. As $x\to\infty$ we have
$$\sum_{n\le x}\frac{\Lambda(n)}{n}=\log x+O(1);\qquad\sum_{p\le x}\frac{\log p}{p}=\log x+O(1), \tag{41}$$
$$\int_1^{x}\frac{\psi(t)}{t^2}\,dt=\log x+O(1), \tag{42}$$
$$\sum_{p\le x}\frac{1}{p}=\log\log x+C+O\!\left(\frac{1}{\log x}\right), \tag{43}$$
where $C$ is a constant.

PROOF. We use a weak form of Stirling's formula, namely
$$\log(m!)=m\log m+O(m),\qquad\text{as } m\to\infty. \tag{44}$$
We know from Theorems 2 and 3 that
$$\psi(m)=O(m),\qquad\text{as } m\to\infty. \tag{45}$$
By the Lemma proved in the course of Theorem 3, we have
$$m!=\prod_{p\le m}p^{\left[\frac{m}{p}\right]+\left[\frac{m}{p^2}\right]+\cdots},$$
or
$$\log(m!)=\sum_{p^r\le m}\left[\frac{m}{p^r}\right]\log p=\sum_{n\le m}\left[\frac{m}{n}\right]\Lambda(n), \tag{46}$$
where $\Lambda$ is the von Mangoldt function [cf. (3)].

To prove (41), we put $\dfrac{m}{n}=\left[\dfrac{m}{n}\right]+\delta_n$, where $0\le\delta_n<1$, in (46), so that
$$\log(m!)=m\sum_{n\le m}\frac{\Lambda(n)}{n}+O(m),$$
on using (45). If we divide by $m$, and apply (44), we get
$$\sum_{n\le m}\frac{\Lambda(n)}{n}=\log m+O(1).$$
Replacing the integer $m$ by the real variable $x$, we get the first formula in (41). The second formula in (41) follows from the inequality
$$\left|\sum_{n\le x}\frac{\Lambda(n)}{n}-\sum_{p\le x}\frac{\log p}{p}\right|\le\sum_{p\le x}\left(\frac{1}{p^2}+\frac{1}{p^3}+\cdots\right)\log p<\sum_p\frac{\log p}{p(p-1)}<\infty.$$
We can deduce (42) from (41) by using (45). For $\psi(t)=\sum_{n\le t}\Lambda(n)$, and for $x\ge 1$, we have
$$\int_1^{x}\frac{\psi(t)}{t^2}\,dt=\int_1^{x}\sum_{n\le t}\Lambda(n)\,\frac{dt}{t^2}=\sum_{n\le x}\Lambda(n)\int_n^{x}\frac{dt}{t^2}=\sum_{n\le x}\Lambda(n)\left(\frac1n-\frac1x\right)=\sum_{n\le x}\frac{\Lambda(n)}{n}-\frac{\psi(x)}{x}.$$
Formula (43) can be proved by using (41) together with Abel's summation formula. Let $(p_n)$ be the sequence of primes in natural order, and
$$A(x)=\sum_{p_n\le x}a_n,\qquad\text{and}\qquad B(x)=\sum_{p_n\le x}b_n,\qquad\text{where}\quad a_n=\frac{\log p_n}{p_n},\quad b_n=\frac{1}{p_n}.$$
If $x\ge 2$, then, by Theorem 6, we have
$$B(x)=\sum_{p_n\le x}\frac{a_n}{\log p_n}=\frac{A(x)}{\log x}+\int_2^{x}\frac{A(u)\,du}{u(\log u)^2}.$$
From the second formula in (41), we have $A(x)=\log x+E(x)$, where $|E(x)|<K$, for all $x\ge 2$, $K$ being a constant. Hence
$$B(x)=1+\frac{E(x)}{\log x}+\int_2^{x}\frac{du}{u\log u}+\int_2^{x}\frac{E(u)\,du}{u(\log u)^2}=1+\frac{E(x)}{\log x}+(\log\log x-\log\log 2)+\int_2^{x}\frac{E(u)\,du}{u(\log u)^2}.$$
Since $|E(x)|<K$, the integral $\displaystyle\int_2^{\infty}\frac{E(u)\,du}{u(\log u)^2}$ converges, and
$$B(x)=\log\log x+\left(1-\log\log 2+\int_2^{\infty}\frac{E(u)\,du}{u(\log u)^2}\right)+E^*(x),$$
where
$$E^*(x)=\frac{E(x)}{\log x}-\int_x^{\infty}\frac{E(u)\,du}{u(\log u)^2},\qquad\text{so that}\qquad|E^*(x)|<\frac{2K}{\log x},\quad\text{for } x\ge 2.$$
This proves (43).

Chapter VIII

Weyl's theorems on uniform distribution and Kronecker's theorem

§ 1. Introduction. We have seen in Chapter III that to any given irrational number $\xi$, there correspond infinitely many rational numbers $p/q$, such that $|\xi-p/q|<1/q^2$. From this follows Dirichlet's theorem that corresponding to any given irrational number $\xi$, there exist infinitely many pairs of integers $p$ and $q$, such that $q\xi$ differs from $p$ by as little as we please. For given $\varepsilon$, $0<\varepsilon<1$, we consider the integer $1+[1/\varepsilon]$. Since there exist infinitely many rationals $p/q$, such that $|q\xi-p|<1/q$, it follows that there exist infinitely many fractions $p/q$, with denominator $q\ge 1+[1/\varepsilon]$, for which we have $|q\xi-p|<1/q<\varepsilon$.

Dirichlet's theorem can be generalized as follows.
Given any irrational number $\theta$, an arbitrary real number $\alpha$, and positive real numbers $N$ and $\varepsilon$, there exist integers $n$ and $p$, such that $n>N$, and $|n\theta-p-\alpha|<\varepsilon$. If $\alpha=0$, this reduces to the above-mentioned theorem of DIRICHLET. If $0<\alpha<1$, and $\varepsilon$ is an arbitrarily small positive number, it follows that the fractional part of $n\theta$, namely $\{n\theta\}=n\theta-[n\theta]$, is arbitrarily close to $\alpha$. In other words, the numbers $(\{n\theta\})$, $n=1,2,3,\ldots$, are everywhere dense in the interval $[0,1)$.

This generalization of Dirichlet's theorem is itself a special case of a deeper result due to HERMANN WEYL on the uniform distribution of numbers, which we shall prove in this chapter.

If we are concerned with the fractional parts of real numbers, it is of advantage to introduce a new notion. Two real numbers $x_1$, $x_2$ are said to be congruent modulo 1, if they differ by an integer. The relation of being congruent modulo 1 is clearly an equivalence relation, which partitions all real numbers into equivalence classes, each equivalence class consisting of all real numbers with the same fractional part. The map $x\to e^{2\pi i x}$ induces a one-one correspondence between these equivalence classes and the points of the unit circle.

§ 2. Uniform distribution in the unit interval. Let $S$ be a finite set of real numbers $\alpha_1,\alpha_2,\ldots,\alpha_Q$ contained in the interval $[0,1)$, that is $0\le\alpha_j<1$, $1\le j\le Q$. Given any pair of real numbers $a$, $b$, such that $0\le a<b\le 1$, we define an interval function $\varphi(a,b)$ by the requirement that $\varphi(a,b)$ equals the number of $\alpha$'s which are contained in the interval $[a,b)$, that is those numbers $\alpha_j$ for which we have $a\le\alpha_j<b$, $1\le j\le Q$. We define the discrepancy of the set $S$ to be the number $D$, where
$$D=\sup_{a,b}\left|\frac{\varphi(a,b)}{Q}-(b-a)\right|. \tag{1}$$
Clearly $0<D\le 1$. If we denote the interval $[a,b)$ by $I$, and its length by $|I|$, and write $\varphi(I)$ for $\varphi(a,b)$, then (1) takes the form
$$D=\sup_{I\subset[0,1)}\left|\frac{\varphi(I)}{Q}-|I|\right|. \tag{1$'$}$$
Given an infinite sequence of real numbers $\alpha_1,\alpha_2,\ldots$, in the interval $[0,1)$, we denote by $D_n$ the discrepancy of the first $n$ terms of the sequence. We say that the sequence $(\alpha_i)$ is uniformly distributed, if $D_n\to 0$ as $n\to\infty$.

Let $\varphi_n(a,b)=\varphi_n(I)$ be the number of $\alpha_j$'s with $a\le\alpha_j<b$ and $1\le j\le n$. It follows from the definition that if the sequence $(\alpha_i)$ is uniformly distributed in $[0,1)$, then clearly
$$\frac{\varphi_n(a,b)}{n}\to(b-a), \tag{2}$$
as $n\to\infty$, for each pair of real numbers $a$, $b$, such that $0\le a<b\le 1$. But the converse is also true: if (2) holds for each such interval $[a,b)$, then the sequence $(\alpha_i)$ is uniformly distributed. For the interval $[0,1)$ can be split up into a finite number of subintervals $(I_k)$, say, each of length $\delta$, $0<\delta<1$. Now given any interval $[c,d)$, where $0\le c<d\le 1$, let $r$ denote the number of intervals $(I_k)$, each of length $\delta$, which lie in the interior of $[c,d)$. Their total length is $r\delta$, and we have $r\delta>(d-c)-2\delta$. If $r'$ denotes the number of intervals $I_k$ which intersect $[c,d)$, then $r'\delta<(d-c)+2\delta$. Since (2) holds for each interval $[a,b)$, it holds, in particular, for an interval $I_k$ of length $\delta$. Thus given $\varepsilon>0$, there exists a number $N(\varepsilon)$, such that
$$\delta-\varepsilon\le\frac{\varphi_n(I_k)}{n}\le\delta+\varepsilon,$$
for all $n>N(\varepsilon)$, and all $k$. If we choose $\varepsilon=\delta^2$, we get
$$(1-\delta)\delta\le\frac{\varphi_n(I_k)}{n}\le(1+\delta)\delta,$$
for all $n>N'(\delta)$, which implies that
$$\left((d-c)-2\delta\right)(1-\delta)\le\frac{\varphi_n(c,d)}{n}\le\left((d-c)+2\delta\right)(1+\delta),$$
and since $d-c\le 1$, it follows that
$$\left|\frac{\varphi_n(c,d)}{n}-(d-c)\right|\le 3\delta+2\delta^2,$$
for $n>N'(\delta)$, for any interval $[c,d)\subset[0,1)$, with $\delta$ independent of the interval. This implies that $D_n\to 0$ as $n\to\infty$. Thus we have proved

THEOREM 1. An infinite sequence of real numbers $(\alpha_i)$, $i=1,2,\ldots$, such that $0\le\alpha_i<1$, is uniformly distributed, if and only if
$$\frac{\varphi_n(a,b)}{n}\to(b-a),$$
as $n\to\infty$, for each pair of real numbers $a$ and $b$, such that $0\le a<b\le 1$. Here $\varphi_n(a,b)$ equals the number of $\alpha_j$, such that $a\le\alpha_j<b$, and $1\le j\le n$.
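The discrepancy in (1) can be computed exactly for a finite set, and doing so gives a concrete feeling for the condition $D_n\to 0$. The Python sketch below is an illustration under assumptions of our own: the names `discrepancy` and `fractional_parts` are hypothetical, and the reduction of the supremum to finitely many candidate intervals (end-points taken among $0$, the points themselves, and $1$) is our observation, not a step of the text. An interval carrying too many points is shrunk to $[\alpha_i,b)$ with $b$ just above $\alpha_j$; one carrying too few is widened to fill the gap between two consecutive candidates.

```python
import math


def discrepancy(points):
    """Discrepancy D of a finite set of points in [0, 1), as in definition (1)."""
    xs = sorted(points)
    q = len(xs)
    ys = [0.0] + xs + [1.0]
    best = 0.0
    # Too many points: [x_i, b) with b slightly above x_j holds x_i, ..., x_j.
    for i in range(q):
        for j in range(i, q):
            best = max(best, (j - i + 1) / q - (xs[j] - xs[i]))
    # Too few points: an interval filling the gap between ys[i] and ys[j].
    for i in range(q + 1):
        for j in range(i + 1, q + 2):
            best = max(best, (ys[j] - ys[i]) - (j - i - 1) / q)
    return best


def fractional_parts(theta, n):
    """The numbers {k * theta} for k = 1, ..., n."""
    return [(k * theta) % 1.0 for k in range(1, n + 1)]


d_10 = discrepancy(fractional_parts(math.sqrt(2), 10))
d_1000 = discrepancy(fractional_parts(math.sqrt(2), 1000))
print(d_10, d_1000)
```

For the sequence $(\{n\sqrt2\})$ the value for $n=1000$ comes out well below the value for $n=10$, in line with the uniform distribution of $(n\xi)$ for irrational $\xi$ proved later in this chapter. The double loop takes of the order of $Q^2$ steps, which is adequate for an illustration of this size.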
We remark that a uniformly distributed sequence $(\alpha_i)$ is everywhere dense in the unit interval $[0,1)$.

§ 3. Uniform distribution modulo 1. An infinite sequence of real numbers $(\alpha_i)$, not necessarily contained in the unit interval, is said to be uniformly distributed modulo 1, if the corresponding sequence of fractional parts $(\{\alpha_i\})$ is uniformly distributed in the sense already defined in § 2. Thus, if $D_n$ is the discrepancy, as defined in § 2, of the first $n$ terms of the sequence $(\{\alpha_i\})$, then $D_n\to 0$ as $n\to\infty$. We shall see that this condition has an alternative, but equivalent, formulation in terms of a new notion of discrepancy modulo 1.

Given a set $S$ of real numbers $\alpha_1,\alpha_2,\ldots,\alpha_Q$, let $T$ denote the set of real numbers $(\alpha_k+t)$, where $1\le k\le Q$, and $t$ runs through all integers. Given any pair of real numbers $a$ and $b$, such that $b\ge a$, let $\varphi^*(a,b)$ denote the number of elements of $T$, which are contained in the interval $[a,b)$. Then
$$\varphi^*(a+t,b+t)=\varphi^*(a,b) \tag{3}$$
for any integer $t$. Further
$$\varphi^*(a,b)=\varphi(a,b),\qquad\text{if } 0\le a<b\le 1, \tag{4}$$
where $\varphi(a,b)$ is defined, as in § 2, for $(\{\alpha_k\})$, $1\le k\le Q$. The discrepancy modulo 1 of the set $S$ is defined to be $D^*$, where
$$D^*=\sup_{0\le b-a\le 1}\left|\frac{\varphi^*(a,b)}{Q}-(b-a)\right|. \tag{5}$$
Here $a$ runs through all real numbers, but in view of (3), we may assume that $0\le a<1$.

If $D$ is the discrepancy of the fractional parts of the numbers in $S$, we have trivially $D\le D^*$, because of (1), (4) and (5). On the other hand, we also have $D^*\le 2D$, since any interval $[a,b)$, where $0\le a<1$, and $b-a\le 1$, is the disjoint union of at most two intervals each of which is of the form $[a',b')$, where either $0\le a'<b'\le 1$, or $1\le a'<b'\le 2$. Thus
$$\varphi^*(a,b)=\sum\varphi^*(a',b'),\qquad b-a=\sum(b'-a'),$$
where the sum $\sum$ extends over at most two terms. Hence
$$\left|\frac{\varphi^*(a,b)}{Q}-(b-a)\right|\le\sum\left|\frac{\varphi^*(a',b')}{Q}-(b'-a')\right|\le 2D,$$
because of (1), (3) and (4), and of the fact that there are at most two terms in $\sum$. Therefore $D^*\le 2D$.
Thus, given a set $S$ of real numbers $(\alpha_j)$, $1\le j\le Q$, we have defined first the discrepancy $D$ of their fractional parts, and secondly $D^*$, their discrepancy modulo 1, and the two are connected by the inequalities
$$D\le D^*\le 2D. \tag{6}$$
If $(\alpha_j)$ is an infinite sequence of real numbers, not necessarily contained in the unit interval, let $D_n$ denote the discrepancy of the first $n$ terms of the corresponding sequence of fractional parts $(\{\alpha_j\})$, while $D_n^*$ denotes their discrepancy modulo 1. It follows from (6) that if $D_n\to 0$ as $n\to\infty$, then $D_n^*\to 0$ as $n\to\infty$, and conversely. Thus we have proved

THEOREM 2. An infinite sequence of real numbers $(\alpha_j)$ is uniformly distributed modulo 1, if and only if $D_n^*\to 0$ as $n\to\infty$, where $D_n^*$ is the discrepancy modulo 1 of the first $n$ terms of the sequence $(\alpha_j)$.

§ 4. Weyl's theorems.

THEOREM 3. If $(\alpha_j)$ is an infinite sequence of real numbers, such that $0\le\alpha_j<1$, for $j=1,2,\ldots$, a necessary and sufficient condition for $(\alpha_j)$ to be uniformly distributed is that
$$\lim_{n\to\infty}\frac1n\sum_{h=1}^{n}f(\alpha_h)=\int_0^1 f(x)\,dx, \tag{7}$$
for every function $f$ which is Riemann integrable in $0\le x\le 1$.

PROOF. We may assume $f$ to be real-valued, for otherwise we can consider the real and imaginary parts separately.

The sufficiency of condition (7) for the sequence $(\alpha_j)$ to be uniformly distributed is easy to prove. Given any interval $[a,b)$, such that $0\le a<b\le 1$, we take $f$ to be the characteristic function of $[a,b)$: $f(x)=1$ if $a\le x<b$, while $f(x)=0$ otherwise. Then
$$\frac1n\sum_{h=1}^{n}f(\alpha_h)=\frac{\varphi_n(a,b)}{n}, \tag{8}$$
while $\int_0^1 f(x)\,dx=b-a$. Condition (7) therefore implies that
$$\lim_{n\to\infty}\frac{\varphi_n(a,b)}{n}=b-a, \tag{9}$$
which, by Theorem 1, implies that the sequence $(\alpha_j)$ is uniformly distributed.
Conversely, if $(\alpha_j)$ is uniformly distributed, then (9) holds, so that (7) holds for the characteristic function $f$ of any interval $[a,b)$ contained in $[0,1]$, and because of linearity, (7) holds also for any step function in $[0,1]$. If $f$ is Riemann integrable in $[0,1]$, then given $\varepsilon>0$, one can find two step functions $f_1$, $f_2$ such that $f_1\le f\le f_2$, and $\int_0^1\left(f_2(x)-f_1(x)\right)dx<\varepsilon$. Since (7) holds for $f_1$, we have
$$\lim_{n\to\infty}\frac1n\sum_{h=1}^{n}f_1(\alpha_h)=\int_0^1 f_1(x)\,dx>\int_0^1 f(x)\,dx-\varepsilon,$$
so that, if $n$ is sufficiently large,
$$\frac1n\sum_{h=1}^{n}f_1(\alpha_h)>\int_0^1 f(x)\,dx-2\varepsilon.$$
Since $f\ge f_1$, it follows that
$$\frac1n\sum_{h=1}^{n}f(\alpha_h)>\int_0^1 f(x)\,dx-2\varepsilon,$$
for sufficiently large $n$. Similarly we get
$$\frac1n\sum_{h=1}^{n}f(\alpha_h)<\int_0^1 f(x)\,dx+2\varepsilon,$$
for sufficiently large $n$. Thus
$$\left|\frac1n\sum_{h=1}^{n}f(\alpha_h)-\int_0^1 f(x)\,dx\right|<2\varepsilon,$$
for sufficiently large $n$, which proves (7) for every Riemann integrable function in $[0,1]$.

THEOREM 4. If $(\beta_j)$ is an infinite sequence of real numbers, not necessarily contained in the unit interval, a necessary and sufficient condition for $(\beta_j)$ to be uniformly distributed modulo 1 is that
$$\lim_{n\to\infty}\frac1n\sum_{h=1}^{n}e^{2\pi i m\beta_h}=0, \tag{10}$$
for every integer $m\ne 0$, where $i^2=-1$.

PROOF. Let $(\beta_j)$ be uniformly distributed modulo 1, and let $\alpha_j$ denote the fractional part of $\beta_j$. Then $(\alpha_j)$ is uniformly distributed in the unit interval. If in Theorem 3 we take $f(x)=e^{2\pi i m x}$, where $m$ is an integer, and $m\ne 0$, it follows that
$$\lim_{n\to\infty}\frac1n\sum_{h=1}^{n}e^{2\pi i m\alpha_h}=\int_0^1 e^{2\pi i m x}\,dx=0,$$
which is the same as (10), since $\alpha_h$ differs from $\beta_h$ by an integer.

Conversely, if (10) holds for every integer $m\ne 0$, we have
$$\lim_{n\to\infty}\frac1n\sum_{h=1}^{n}e^{2\pi i m\alpha_h}=0,\qquad m\ne 0,$$
and we shall show that condition (7) is satisfied for every Riemann integrable function in $[0,1]$. Obviously (7) holds for $f(x)=1$, and it holds, by our hypothesis, for $f(x)=e^{2\pi i m x}$, where $m$ is an integer different from zero. Hence it holds also for any trigonometric polynomial of the form
$$a_0+(a_1\cos 2\pi x+b_1\sin 2\pi x)+\cdots+(a_m\cos 2\pi m x+b_m\sin 2\pi m x),$$
where the $a$'s and $b$'s are constants.
Now any continuous periodic function $f$, of period 1, can be approximated by a trigonometric polynomial of that kind. That is, given $\varepsilon>0$, there exists a trigonometric polynomial $f_\varepsilon$, such that $|f-f_\varepsilon|<\varepsilon$. Set $f_1=f_\varepsilon-\varepsilon$, and $f_2=f_\varepsilon+\varepsilon$, so that $f_1\le f\le f_2$, and $\int_0^1\left(f_2(x)-f_1(x)\right)dx=2\varepsilon$. As in the proof of Theorem 3, it follows that (7) holds for any continuous periodic function of period 1. Confining attention to the basic interval $[0,1]$, for any step function $f$ in $[0,1]$ we can find two continuous periodic functions $f_1$ and $f_2$, such that $f_1\le f\le f_2$, and $\int_0^1\left(f_2(x)-f_1(x)\right)dx<\varepsilon$. Hence (7) holds for a step function $f$ in $[0,1]$, which implies, as before, that it holds for any Riemann integrable function in $[0,1]$. This completes the proof of Theorem 4.

As an application of Theorem 4, we have

THEOREM 5. If $\xi$ is any irrational number, then the infinite sequence $(n\xi)$, $n=1,2,\ldots$, is uniformly distributed modulo 1.

PROOF. Let $m$ be an integer different from zero. Set $m\xi=\eta$. We wish to show that
$$\lim_{n\to\infty}\frac1n\sum_{h=1}^{n}e^{2\pi i h\eta}=0.$$
As $\eta$ is real, but not integral, since $\xi$ is irrational, we have
$$\sum_{h=1}^{n}e^{2\pi i h\eta}=e^{2\pi i\eta}\,\frac{e^{2\pi i n\eta}-1}{e^{2\pi i\eta}-1},$$
so that
$$\left|\frac1n\sum_{h=1}^{n}e^{2\pi i h\eta}\right|\le\frac1n\cdot\frac{2}{\left|e^{2\pi i\eta}-1\right|}\to 0,\qquad\text{as } n\to\infty,$$
and Theorem 4 then gives the result.

COROLLARY. If $\xi$ is an irrational number, then the sequence of fractional parts $(\{n\xi\})$, $n=1,2,3,\ldots$, is everywhere dense in the unit interval.

The concept of uniform distribution can be generalized to spaces of dimension greater than one. Let $(P^{(j)})$ be an infinite sequence of points in a $p$-dimensional Euclidean space, where $p\ge 1$, and let the coordinates of the point $P^{(j)}$ be given by $(x_{j1},x_{j2},\ldots,x_{jp})$. Let $\alpha_{jr}$ denote the fractional part of $x_{jr}$, namely $\{x_{jr}\}$, so that $0\le\alpha_{jr}<1$, for $1\le r\le p$. If we denote by $\{P^{(j)}\}$ the vector of fractional parts $(\{x_{j1}\},\{x_{j2}\},\ldots,\{x_{jp}\})$, then the point $\{P^{(j)}\}$ lies in the unit cube defined by $0\le x_j<1$, $1\le j\le p$.
Let $V$ denote a rectangle, that is the cartesian product of $p$ intervals, contained in the unit cube, and let $|V|$ denote its (Lebesgue) measure, which is the product of the lengths of the corresponding intervals. We say that the infinite sequence $(P^{(j)})$ is uniformly distributed modulo 1, if and only if the corresponding sequence $(\{P^{(j)}\})$ is uniformly distributed in the unit cube; that is, if and only if
$$\lim_{n\to\infty}\frac{\varphi_n(V)}{n}=|V|,$$
for every rectangle $V$ contained in the unit cube, where $\varphi_n(V)$ denotes the number of points among the first $n$ terms of the sequence $(\{P^{(j)}\})$ which are contained in $V$. As in the one-dimensional case, this is equivalent to the statement
$$\sup_V\left|\frac{\varphi_n(V)}{n}-|V|\right|\to 0,\qquad\text{as } n\to\infty.$$

THEOREM 5$'$. The sequence $(\{P^{(j)}\})$ is uniformly distributed in the unit cube if and only if
$$\lim_{n\to\infty}\frac1n\sum_{h=1}^{n}e^{2\pi i\left[m_1\alpha_{h1}+m_2\alpha_{h2}+\cdots+m_p\alpha_{hp}\right]}=0,$$
for every set of integers $(m_1,m_2,\ldots,m_p)\ne(0,0,\ldots,0)$.

The proof here runs along the same lines as in the case of one variable. We have only to observe that a 'step function' can be approximated, for example, by twice continuously differentiable functions, which have uniformly convergent Fourier series.

The following generalization of Theorem 5 is a consequence.

THEOREM 6. If $\xi_1,\xi_2,\ldots,\xi_p$ are real numbers, such that $\xi_1,\xi_2,\ldots,\xi_p,1$ are linearly independent over the integers (that is, there exists no linear relation of the form $\sum_{j=1}^{p}l_j\xi_j=l$, where $l$ and $l_j$ are integers, and $(l_1,l_2,\ldots,l_p,l)\ne(0,0,\ldots,0,0)$), then the sequence $(n\xi)$, where $n\xi=(n\xi_1,n\xi_2,\ldots,n\xi_p)$, $n=1,2,\ldots$, is uniformly distributed modulo 1.

§ 5. Kronecker's theorem. Theorem 6 implies that the sequence $(\{n\xi\})$, where $\{n\xi\}=(\{n\xi_1\},\{n\xi_2\},\ldots,\{n\xi_p\})$, is everywhere dense in the unit cube. This is known as Kronecker's theorem, and is a generalization to higher dimensions of the theorem mentioned in § 1. We state it as

THEOREM 7. If $\theta_1,\theta_2,\ldots,\theta_k,1$ are real numbers linearly independent over the integers, $\alpha_1,\alpha_2,\ldots,\alpha_k$ are arbitrary real numbers, and $N$ and $\varepsilon$ are positive real numbers, then there exist integers $n$ and $p_1,p_2,\ldots,p_k$, such that
$$n>N,\qquad\text{and}\qquad|n\theta_m-p_m-\alpha_m|<\varepsilon,$$
for $m=1,2,\ldots,k$.

We shall give another version of this theorem, namely

THEOREM 8. If $\theta_1,\theta_2,\ldots,\theta_k$ are real numbers which are linearly independent over the integers, $\alpha_1,\alpha_2,\ldots,\alpha_k$ are arbitrary real numbers, and $T$ and $\varepsilon$ are positive real numbers, then there exist a real number $t$, and integers $p_1,p_2,\ldots,p_k$, such that
$$t>T,\qquad\text{and}\qquad|t\theta_m-p_m-\alpha_m|<\varepsilon,$$
for $m=1,2,\ldots,k$.

We shall see that Theorem 7 is equivalent to Theorem 8. Let us first assume Theorem 8, and show that Theorem 7 follows from it. To prove Theorem 7 in the form given, it suffices to prove it with $0<\theta_m\le 1$ for $1\le m\le k$. For if $1,\theta_1,\ldots,\theta_k$ are linearly independent over the integers, so are $1,\theta_1',\ldots,\theta_k'$, where $\theta_j'=\theta_j-q_j$, and $(q_j)$ are suitable integers; and the inequality $|n\theta_m'-p_m'-\alpha_m|<\varepsilon$, for an integer $p_m'$, implies that $|n\theta_m-p_m-\alpha_m|<\varepsilon$, where $p_m=p_m'+nq_m$. Let us therefore assume that $0<\theta_m\le 1$ for $1\le m\le k$, and $0<\varepsilon<1$, and that $\theta_1,\theta_2,\ldots,\theta_k,1$ are linearly independent over the integers. Then by Theorem 8, with $k+1$ instead of $k$, $N+1$ instead of $T$, and $\tfrac12\varepsilon$ instead of $\varepsilon$, applied to the set
$$\theta_1,\theta_2,\ldots,\theta_k,1;\qquad\alpha_1,\alpha_2,\ldots,\alpha_k,0,$$
there exist integers $p_1,p_2,\ldots,p_{k+1}$, and a real $t$, with $t>N+1$, such that
$$|t\theta_m-p_m-\alpha_m|<\tfrac12\varepsilon,\quad m=1,2,\ldots,k,\qquad\text{and}\qquad|t-p_{k+1}|<\tfrac12\varepsilon.$$
It follows that $p_{k+1}>t-\tfrac12\varepsilon>N$, since $t>N+1$, and $\varepsilon<1$. And, since $0<\theta_m\le 1$, we have
$$|p_{k+1}\theta_m-p_m-\alpha_m|\le|t\theta_m-p_m-\alpha_m|+|(p_{k+1}-t)\theta_m|\le|t\theta_m-p_m-\alpha_m|+|p_{k+1}-t|<\varepsilon,$$
for $m=1,2,\ldots,k$. Thus Theorem 7 is proved with $n=p_{k+1}$.

Conversely, let us assume Theorem 7, and prove Theorem 8. If $k=1$, Theorem 8 is trivial, so that we assume $k>1$. It is sufficient to prove the theorem for $\theta_m>0$, $m=1,\ldots,k$. Let $\theta_1,\theta_2,\ldots,\theta_k$ be linearly independent over the integers. Then the numbers
$$\frac{\theta_1}{\theta_k},\ \frac{\theta_2}{\theta_k},\ \ldots,\ \frac{\theta_{k-1}}{\theta_k},\ 1$$
are also linearly independent.
If we apply Theorem 7, with $N=T\theta_k$, to the set
$$\frac{\theta_1}{\theta_k},\ \frac{\theta_2}{\theta_k},\ \ldots,\ \frac{\theta_{k-1}}{\theta_k};\qquad\alpha_1,\alpha_2,\ldots,\alpha_{k-1},$$
it follows that there exist integers $p_1,p_2,\ldots,p_{k-1}$, and $n$ with $n>N$, such that
$$\left|n\,\frac{\theta_m}{\theta_k}-p_m-\alpha_m\right|<\varepsilon,\qquad m=1,2,\ldots,(k-1).$$
If we set $t=n/\theta_k$, then $t>T$, and
$$|t\theta_m-p_m-\alpha_m|<\varepsilon,\qquad m=1,2,\ldots,(k-1),$$
while trivially
$$|t\theta_k-n|=0<\varepsilon,$$
so that we have the conclusion of Theorem 8 for the set
$$\theta_1,\theta_2,\ldots,\theta_k;\qquad\alpha_1,\alpha_2,\ldots,\alpha_{k-1},0.$$
Similarly one can prove Theorem 8 for the set
$$\theta_1,\theta_2,\ldots,\theta_k;\qquad 0,0,\ldots,0,\alpha_k.$$
These two conclusions together imply that Theorem 8 is valid for the set
$$\theta_1,\theta_2,\ldots,\theta_k;\qquad\alpha_1,\alpha_2,\ldots,\alpha_k,$$
for if the difference of $t\theta_m$ from $\alpha_m$ is nearly an integer, and the difference of $t'\theta_m$ from $\beta_m$ is nearly an integer, then the difference of $(t+t')\theta_m$ from $\alpha_m+\beta_m$ is nearly an integer. Thus the equivalence of Theorem 7 and Theorem 8 is proved.

We shall now give a proof of Theorem 8 due to H. BOHR.

PROOF OF THEOREM 8. If $c$ is real, $T>0$, and $i^2=-1$, then we have
$$\lim_{T\to\infty}\frac1T\int_0^{T}e^{cit}\,dt=\begin{cases}0,&\text{if } c\ne 0,\\ 1,&\text{if } c=0.\end{cases}$$
Thus if $c_\nu$ is real, and
$$\chi(t)=\sum_{\nu=1}^{r}b_\nu e^{c_\nu it},\qquad c_m\ne c_n\ \text{if } m\ne n, \tag{11}$$
then
$$b_\nu=\lim_{T\to\infty}\frac1T\int_0^{T}\chi(t)\,e^{-c_\nu it}\,dt. \tag{12}$$
Let
$$F(t)=1+\sum_{m=1}^{k}e^{2\pi i(t\theta_m-\alpha_m)}, \tag{13}$$
where $t$ is real, and $\varphi(t)=|F(t)|$. Then obviously $0\le\varphi(t)\le k+1$. If Theorem 8 is true, then for a sufficiently large $t$, every number $t\theta_m-\alpha_m$ is nearly an integer, and $\varphi(t)$ is nearly $k+1$. For if $x_m=t\theta_m-\alpha_m$, and $\varepsilon>0$ is given, there exists a $\delta$, such that if $p_m$ is an integer, and $|x_m-p_m|<\delta$, then $|e^{2\pi i x_m}-1|<\varepsilon$.

Conversely, if $\varphi(t)$ is nearly $k+1$ for some large $t$, then every term in the sum (13) must be nearly 1, since no term can exceed 1 in absolute value, and Theorem 8 must be true. This can be seen as follows. If there exists an $\eta$, $0<\eta<1$, such that $\varphi(t)\ge k+1-\eta$, and $z=e^{2\pi i x_m}=x+iy$, say, then it follows that $|y|\le 2\eta^{1/2}$. For
$$k+1-\eta\le\varphi(t)\le(k-1)+\left|1+e^{2\pi i x_m}\right|,$$
or
$$2\ge\left|1+e^{2\pi i x_m}\right|\ge 2-\eta,$$
for $m=1,2,\ldots,k$. And
$$|1+z|^2=(1+x)^2+y^2=(1+x)^2+(1-x^2)=2+2x\ge(2-\eta)^2\ge 4-4\eta,$$
so that $1\ge x\ge 1-2\eta$. Now
$$y^2=1-x^2=(1-x)(1+x)\le 2(1-x)\le 4\eta,$$
which implies that $|y|\le 2\eta^{1/2}$. Therefore $|z-1|<4\eta^{1/2}$.
Thus Theorem 8 will be proved if it is proved that
$$\limsup_{t\to\infty}\varphi(t)=k+1. \tag{14}$$
Let
$$\psi(x_1,x_2,\ldots,x_k)=1+x_1+x_2+\cdots+x_k,$$
and $p$ be a positive integer. Then
$$\psi^p=\sum_{\substack{n_1+\cdots+n_k\le p\\ n_j\ge 0,\ j=1,\ldots,k}}a_{n_1,\ldots,n_k}\,x_1^{n_1}x_2^{n_2}\cdots x_k^{n_k}, \tag{15}$$
where the coefficients $a_{n_1,\ldots,n_k}$ have the following properties: (i) they are positive; (ii) their sum $\sum a_{n_1,\ldots,n_k}=\psi^p(1,1,\ldots,1)=(k+1)^p$; (iii) they are at most $(p+1)^k$ in number. We use this formalism to consider
$$F^p(t)=\left(1+\sum_{m=1}^{k}e^{2\pi i(t\theta_m-\alpha_m)}\right)^{p}.$$
If we use (15) with $e^{2\pi i(t\theta_j-\alpha_j)}$ in place of $x_j$, we see that $F^p(t)$ is a sum of the form given in (11), with $2\pi(n_1\theta_1+\cdots+n_k\theta_k)$ taking the place of $c_\nu$. Since the $\theta$'s are linearly independent, the $c_\nu$'s are all different. In place of the $b_\nu$ in (11), we have the $a_{n_1,\ldots,n_k}$ given in (15), multiplied by the factor $e^{-2\pi i(n_1\alpha_1+\cdots+n_k\alpha_k)}$. Hence
$$|b_\nu|=a_{n_1,\ldots,n_k}. \tag{16}$$
Since $\varphi(t)\le k+1$, to prove (14) it is sufficient to prove that
$$\limsup_{t\to\infty}\varphi(t)<k+1 \tag{17}$$
is impossible. Now (17) implies that $|F(t)|=\varphi(t)\le\lambda<k+1$, for sufficiently large $t$, hence
$$\limsup_{T\to\infty}\frac1T\int_0^{T}|F(t)|^p\,dt\le\lim_{T\to\infty}\frac1T\int_0^{T}\lambda^p\,dt=\lambda^p.$$
However, by (12),
$$b_\nu=\lim_{T\to\infty}\frac1T\int_0^{T}F^p(t)\,e^{-c_\nu it}\,dt,$$
hence
$$|b_\nu|\le\limsup_{T\to\infty}\frac1T\int_0^{T}|F(t)|^p\,dt\le\lambda^p,$$
so that every coefficient in (15) satisfies the inequality
$$a_{n_1,\ldots,n_k}\le\lambda^p.$$
Since there are at most $(p+1)^k$ such coefficients, we have
$$(k+1)^p=\sum a_{n_1,\ldots,n_k}\le(p+1)^k\lambda^p.$$
Since $\mu=\lambda/(k+1)<1$, and $\mu^p(p+1)^k\to 0$, as $p\to\infty$, it follows that (17) is impossible, so that (14) is established, and with it the theorem.

Chapter IX

Minkowski's theorem on lattice points in convex sets

§ 1. Convex sets. We have encountered in Chapter VI problems connected with the number of lattice points in certain regions of the plane. If $R^n$ denotes the Euclidean space of dimension $n$, $n\ge 1$, we call a point in it a lattice point if all its co-ordinates are integers. In this chapter we shall prove Minkowski's theorem that a convex set in $R^n$, symmetric about the origin, whose volume is greater than $2^n$, contains a lattice point other than the origin.

DEFINITIONS.
Let $S$ be a set in $R^n$. If $\lambda$ is a real number, we denote by $\lambda S$ the set obtained by magnifying $S$ by the factor $\lambda$, that is $\lambda S = [\lambda x \mid x \in S]$. We say that $S$ is convex, if and only if $x \in S$ and $y \in S$ imply that $\lambda x + \mu y \in S$, for all real numbers $\lambda$, $\mu$, such that $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$. If $S$ is convex, so is $\lambda S$. We say that $S$ is symmetric with respect to the origin, or just symmetric, if and only if $x \in S$ implies that $-x \in S$. If $S$ is symmetric, so is $\lambda S$.

If $g$ is a lattice point in $R^n$, the set $S_g$, called the translate of $S$ by $g$, is defined by the property that $x \in S_g$ if and only if $x - g \in S$. If $S$ is a Lebesgue measurable set of measure $V(S)$, then $V(S) = V(S_g)$, for any lattice point $g$.

CONVEX SYMMETRIC SETS. (a) If $S$ is convex and symmetric, and $x \in S$, then $\lambda x \in S$, for every real $\lambda$, such that $|\lambda| \le 1$.

For if $x \in S$, then $-x \in S$ because $S$ is symmetric, and

$$\Bigl(\frac{1}{2} + \frac{\lambda}{2}\Bigr) x + \Bigl(\frac{1}{2} - \frac{\lambda}{2}\Bigr)(-x) = \lambda x \in S, \qquad \text{if } |\lambda| \le 1,$$

because $S$ is convex.

(b) If $S$ is convex and symmetric, and $x \in S$, $y \in S$, then $\lambda x + \mu y \in S$, for all real $\lambda$ and $\mu$, such that $|\lambda| + |\mu| \le 1$.

If $\lambda = 0$, or $\mu = 0$, this reduces to property (a). Let us therefore assume that $\lambda \neq 0$, and $\mu \neq 0$, and define $\varepsilon_1 = \operatorname{sgn}\lambda$, $\varepsilon_2 = \operatorname{sgn}\mu$. Then, because of property (a), and of the assumption $|\lambda| + |\mu| \le 1$, we have

$$x' = \varepsilon_1(|\lambda| + |\mu|)\,x \in S, \qquad y' = \varepsilon_2(|\lambda| + |\mu|)\,y \in S.$$

If we define

$$\rho = \frac{|\lambda|}{|\lambda| + |\mu|}, \qquad \sigma = \frac{|\mu|}{|\lambda| + |\mu|},$$

then $\rho > 0$, $\sigma > 0$, and $\rho + \sigma = 1$. Since $S$ is convex, it follows that $\rho x' + \sigma y' \in S$. But $\rho x' + \sigma y' = \lambda x + \mu y$. Hence we have property (b).

§ 2. Minkowski's theorem.

THEOREM 1 (MINKOWSKI). A bounded, measurable, convex, symmetric set $S$ in $R^n$, of measure $V > 2^n$, contains a lattice point different from the origin.

We shall give a proof of this theorem, due to C. L. SIEGEL, which is based on a formula for the measure of a bounded, measurable, convex, symmetric set which does not contain a lattice point different from the origin.
The assumption of boundedness in Theorem 1 is not necessary (cf. Theorem 3, and the Notes on Chapter IX).

PROOF OF THEOREM 1 (SIEGEL). Let $S$ be a bounded, measurable, convex, symmetric set in $R^n$ of measure $V$, and let $L^2(S)$ denote the set of square-integrable functions on $S$. Let $\varphi \in L^2(S)$, and define $\varphi(x) = 0$ for $x \notin S$. We write, as usual, $k = (k_1, k_2, \dots, k_n)$, $x = (x_1, x_2, \dots, x_n)$, $kx = k_1 x_1 + k_2 x_2 + \dots + k_n x_n$, and $dx = dx_1\,dx_2 \cdots dx_n$. Consider the function

$$f(x) = \sum_k \varphi(2x - 2k), \tag{1}$$

where $k$ runs through all the lattice points in $R^n$. For any given $x$, this sum is finite, since $\varphi$ vanishes outside $S$, and $S$ is bounded. Since $k$ runs through all lattice points, the sum remains unaltered by the substitution $k_\nu \to k_\nu + 1$. Thus $f(x)$ is periodic in each of the variables $x_1, x_2, \dots, x_n$, with period 1. Parseval's formula for the Fourier series of $f$ gives

$$\int_E |f|^2\,dx = \sum_l |a_l|^2, \tag{2}$$

where $E$ is an $n$-dimensional cube of side 1, $l$ a lattice point in $R^n$, and $a_l$ is the Fourier coefficient of $f$, namely

$$a_l = \int_E f(x)\,e^{-2\pi i l x}\,dx.$$

Because of (1), this implies that

$$a_l = \int_E \sum_k \varphi(2x - 2k)\,e^{-2\pi i l x}\,dx = \sum_k \int_E \varphi(2x - 2k)\,e^{-2\pi i l x}\,dx, \tag{3}$$

where $k$ runs through all the lattice points in $R^n$. Set $x - k = t$. Then as $x$ ranges over $E$, and $k$ over all lattice points, $t$ ranges over all of $R^n$. Thus, since $e^{-2\pi i l k} = 1$,

$$a_l = \int_{R^n} \varphi(2t)\,e^{-2\pi i l (t + k)}\,dt = \int_{R^n} \varphi(2t)\,e^{-2\pi i l t}\,dt.$$

If we now write $2t = x$, then since $\varphi$ vanishes outside $S$, we get

$$a_l = 2^{-n} \int_S \varphi(x)\,e^{-\pi i l x}\,dx. \tag{4}$$

On the other hand, we get from (1),

$$\int_E |f|^2\,dx = \int_E \sum_{k'} \Bigl(\sum_k \varphi(2x - 2k)\,\overline{\varphi}(2x - 2k')\Bigr) dx = \int_{R^n} \sum_k \varphi(2x - 2k)\,\overline{\varphi}(2x)\,dx$$

$$= 2^{-n} \int_{R^n} \sum_k \varphi(x - 2k)\,\overline{\varphi}(x)\,dx = 2^{-n} \sum_k \int_S \varphi(x - 2k)\,\overline{\varphi}(x)\,dx. \tag{5}$$

If we use (4) and (5) in (2), we get

$$\sum_k \int_S \varphi(x - 2k)\,\overline{\varphi}(x)\,dx = 2^{-n} \sum_l \Bigl| \int_S \varphi(x)\,e^{-\pi i l x}\,dx \Bigr|^2. \tag{6}$$

Now if $\varphi(x - 2k)\,\overline{\varphi}(x) \neq 0$, then we have $x \in S$, and $x - 2k \in S$. And because $S$ is symmetric and convex, it follows that $\tfrac{1}{2}x + \tfrac{1}{2}(2k - x) = k \in S$.
Therefore, if $S$ contains no lattice point different from the origin, we must have $\varphi(x - 2k)\,\overline{\varphi}(x) = 0$ for $k \neq 0$, in which case (6) reduces to

$$\int_S |\varphi(x)|^2\,dx = 2^{-n} \sum_l \Bigl| \int_S \varphi(x)\,e^{-\pi i l x}\,dx \Bigr|^2. \tag{7}$$

If we now choose $\varphi$, such that $\varphi(x) = 1$ for $x \in S$, then $\int_S |\varphi(x)|^2\,dx = V$, and (7) gives

$$V = 2^{-n} \sum_l \Bigl| \int_S e^{-\pi i l x}\,dx \Bigr|^2 = 2^{-n} \Bigl( V^2 + \sum_{l \neq 0} \Bigl| \int_S e^{-\pi i l x}\,dx \Bigr|^2 \Bigr).$$

Since $-l$ runs through all lattice points if $l$ does, we can write this in the form

$$2^n = V + \frac{1}{V} \sum_{l \neq 0} \Bigl| \int_S e^{\pi i l x}\,dx \Bigr|^2, \tag{8}$$

which is Siegel's formula for the measure $V$ of a bounded, measurable, convex, symmetric set $S$ in $R^n$, which contains no lattice point other than the origin. It follows that $V \le 2^n$, and Theorem 1 is an immediate consequence.

If we wanted only to prove Minkowski's theorem, and not formula (8), we could use Schwarz's inequality

$$\int_E |f|^2\,dx \ge |a_0|^2,$$

instead of Parseval's formula. We have

$$a_0 = 2^{-n} \int_S \varphi(x)\,dx = 2^{-n} V$$

by (4), and if $S$ contains no lattice point other than the origin, then by (5), we have

$$\int_E |f|^2\,dx = 2^{-n} \int_S |\varphi(x)|^2\,dx = 2^{-n} V,$$

hence $2^{-n} V \ge 2^{-2n} V^2$, that is $V \le 2^n$.

Theorem 1 is false for some bounded, measurable, convex, symmetric sets of measure $V = 2^n$, as can be seen by considering the set: $|x_i| < 1$, $1 \le i \le n$. This has measure $V = 2^n$, but contains no lattice point other than the origin. If $S$ is closed, however, we have

THEOREM 2. A closed, bounded, convex, symmetric set $S$ in $R^n$, of measure $V(S) \ge 2^n$, contains a lattice point other than the origin.

PROOF. Given $\varepsilon$, $0 < \varepsilon < 1$, consider the set $S' = (1 + \varepsilon)S$. Since $S$ is measurable, so is $S'$, and if $V(S)$ and $V(S')$ denote the respective measures, then

$$V(S') = (1 + \varepsilon)^n V(S) \ge 2^n (1 + \varepsilon)^n > 2^n.$$

Therefore, by Theorem 1, $S'$ contains a lattice point $l_\varepsilon$ other than the origin. Since $S$ is bounded, the sets $S'$ all lie in the bounded set $2S$, and there are only a finite number of possibilities for $l_\varepsilon$. Therefore there exists a lattice point $l_0$, other than the origin, such that $l_0 \in (1 + \varepsilon)S$ for a sequence of values of $\varepsilon$ tending to zero, and hence, by property (a) of § 1, for every $\varepsilon$, $0 < \varepsilon < 1$. That is $l_0/(1 + \varepsilon) \in S$. If $\varepsilon \to 0$, it follows that $l_0 \in S$, since $S$ is closed, and the proof of Theorem 2 is complete.
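Theorems 1 and 2, and the counter-example of the open cube, can be checked by brute force in the plane ($n = 2$, so the critical measure is $2^2 = 4$). The following is a minimal sketch, not part of the original argument: the helper `nonzero_lattice_points`, the search bounds, and the particular ellipse with semi-axes 4 and 0.35 (area $\pi \cdot 4 \cdot 0.35 > 4$) are illustrative assumptions.

```python
from itertools import product

def nonzero_lattice_points(contains, bound, dim=2):
    """List the nonzero integer points g with |g_i| <= bound for which contains(g) holds."""
    coords = range(-bound, bound + 1)
    return [g for g in product(coords, repeat=dim) if any(g) and contains(g)]

# Open square |x_i| < 1: measure exactly 4, convex and symmetric, but not closed.
def open_cube(g):
    return all(abs(c) < 1 for c in g)

# Closed square |x_i| <= 1: measure 4, closed, so Theorem 2 applies.
def closed_cube(g):
    return all(abs(c) <= 1 for c in g)

# Hypothetical thin ellipse x^2/4^2 + y^2/0.35^2 < 1 of area pi*4*0.35 > 4 (Theorem 1).
def ellipse(g):
    return (g[0] / 4.0) ** 2 + (g[1] / 0.35) ** 2 < 1

open_pts = nonzero_lattice_points(open_cube, bound=3)
closed_pts = nonzero_lattice_points(closed_cube, bound=3)
ellipse_pts = nonzero_lattice_points(ellipse, bound=5)
print(len(open_pts), len(closed_pts), ellipse_pts)
```

The open square yields no lattice point other than the origin, while the closed square and the ellipse each yield some, as Theorems 2 and 1 respectively require.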
Theorem 2 implies the following

THEOREM 2'. If $S$ is a bounded, convex, symmetric set of measure $V(S) \ge 2^n$, then there exists a lattice point, other than the origin, in the closure of $S$.

PROOF. Given a bounded, convex, symmetric set $S$, consider $\bar S$, the closure of $S$. $\bar S$ is convex since $S$ is; it is closed; and bounded, since $S$ is; and $V(\bar S) \ge V(S) \ge 2^n$. Hence $\bar S$ is a set which satisfies the conditions of Theorem 2. Hence it contains a lattice point other than the origin.

To provide an alternative proof of Minkowski's theorem, we first prove the following

LEMMA (G. D. BIRKHOFF). If $S$ is a measurable set in $R^n$, of measure $V(S) > 1$, then there exist two distinct points $x \in S$ and $y \in S$, such that $x - y$ is a lattice point.

PROOF. Let $g = (g_1, g_2, \dots, g_n)$ be any lattice point, and consider the cube $[x \mid g_i \le x_i < g_i + 1]$, $i = 1, 2, \dots, n$. Let $S^g$ denote the intersection of $S$ with this cube:

$$S^g = S \cap [(x_1, \dots, x_n) \in R^n,\ g_i \le x_i < g_i + 1,\ 1 \le i \le n].$$

Let $S^g_{-g}$ be the translate of $S^g$ by $-g$ (cf. § 1). Then $S^g_{-g}$ is contained in the unit cube $0 \le x_i < 1$, $1 \le i \le n$. Let its measure be $V_g$. Then $V_g$ is also the measure of $S^g$, and $\sum_g V_g = V > 1$. Since the unit cube has measure 1, it follows that there exist at least two sets $S^g_{-g}$ and $S^{g'}_{-g'}$, where $g$ and $g'$ are different lattice points, which overlap. In other words, there exist two points $x \in S^g$, $y \in S^{g'}$, such that $x - g = y - g'$. Therefore $x \in S$, $y \in S$, and $x - y = g - g'$, which is a lattice point. (It need not, of course, be in $S$.) Hence the lemma.

With this lemma we prove

THEOREM 3 (MINKOWSKI). If $S$ is a measurable, convex, symmetric set of measure $V > 2^n$ (possibly $V = \infty$), then it contains a lattice point other than the origin.

PROOF. Consider the set $\tfrac{1}{2}S$, whose measure is $(\tfrac{1}{2})^n V > 1$. By the above lemma, there exist two different points $x \in \tfrac{1}{2}S$, $y \in \tfrac{1}{2}S$, such that $x - y = g$, a lattice point. Now $\tfrac{1}{2}S$ is convex and symmetric, because $S$ is. It follows that $\tfrac{1}{2}x - \tfrac{1}{2}y = \tfrac{1}{2}g \in \tfrac{1}{2}S$, by property (b) of § 1; hence $g \in S$, and $g$ is not the origin since $x$ and $y$ are distinct.
Thus the proof of the theorem is complete.

These theorems can be applied to homogeneous linear forms. Let

$$\xi_i = a_{i1}x_1 + a_{i2}x_2 + \dots + a_{in}x_n, \qquad i = 1, 2, \dots, n, \tag{9}$$

be $n$ homogeneous linear forms in $n$ variables $x_1, \dots, x_n$ with real coefficients $a_{ij}$. Let $\Delta$ be the determinant of the matrix $(a_{ij})$. We suppose, at first, that $\Delta \neq 0$. These forms define a linear transformation of the $x$-space into the $\xi$-space, and if a set $S$ is convex and symmetric in the $x$-space, its image $T$ in the $\xi$-space is also convex and symmetric, since convexity and symmetry are unaffected by linear transformations. But the measure is altered, and if $\Delta \neq 0$, then

$$\int_T d\xi_1\,d\xi_2 \cdots d\xi_n = |\Delta| \int_S dx_1\,dx_2 \cdots dx_n, \tag{10}$$

so that the measure of $T$ is $|\Delta|$ times the measure of $S$.

Consider the linear transformation $L$ of $R^n$ into itself given by $(x_1, x_2, \dots, x_n) \to (\xi_1, \dots, \xi_n)$. The image of points with integer co-ordinates is called the lattice $\Lambda$ associated with $L$. The determinant of $L$ is called the determinant of the lattice $\Lambda$.

An application of Theorem 3 to the $\xi$-space gives

THEOREM 4. If $\Lambda$ is a lattice with determinant $\Delta \neq 0$, and $P$ a measurable, convex, symmetric set of measure $V > 2^n|\Delta|$ (possibly $V = \infty$), then $P$ contains a point of $\Lambda$ different from the origin.

An application of Theorem 2 leads to

THEOREM 4'. If $\Lambda$ is a lattice with determinant $\Delta \neq 0$, and $P$ is a closed, bounded, convex, symmetric set of measure $V \ge 2^n|\Delta|$, then $P$ contains a point of $\Lambda$ different from the origin.

§ 3. Applications. (A) Consider the closed set $S$ in the $x$-space defined by the inequalities

$$|\xi_i| \le c_i, \qquad i = 1, 2, \dots, n. \tag{11}$$

$S$ is obviously symmetric. It is convex, for if $x \in S$, $y \in S$, and $z = \lambda x + \mu y$, where $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$, then

$$|a_{i1}z_1 + a_{i2}z_2 + \dots + a_{in}z_n| \le \lambda\,|a_{i1}x_1 + \dots + a_{in}x_n| + \mu\,|a_{i1}y_1 + \dots + a_{in}y_n|$$

$$\le \max\bigl(|a_{i1}x_1 + \dots + a_{in}x_n|,\ |a_{i1}y_1 + \dots + a_{in}y_n|\bigr).$$

$S$ is bounded, for if $(\alpha_{ij})$ is the inverse matrix of $(a_{ij})$, then $\xi_i = \sum_{j=1}^{n} a_{ij}x_j$ implies that $x_i = \sum_{j=1}^{n} \alpha_{ij}\xi_j$, so that $|x_i| \le \sum_{j=1}^{n} |\alpha_{ij}|\,c_j$. By formula (10), the measure of $S$ is $2^n |\Delta|^{-1} c_1 c_2 \cdots c_n$. The corresponding set in the $\xi$-space is a rectangle of measure $2^n c_1 c_2 \cdots c_n$. An application of Theorem 4' therefore gives

THEOREM 5. If $\xi_1, \xi_2, \dots, \xi_n$ are homogeneous linear forms in the variables $x_1, x_2, \dots, x_n$, with real coefficients, and determinant $\Delta \neq 0$, and if $c_1, c_2, \dots, c_n$ are real numbers $> 0$, such that $c_1 c_2 \cdots c_n \ge |\Delta|$, then there exist integers $x_1, x_2, \dots, x_n$, not all zero, for which

$$|\xi_1| \le c_1,\quad |\xi_2| \le c_2,\quad \dots,\quad |\xi_n| \le c_n.$$

We can, in particular, choose $c_i = |\Delta|^{1/n}$, $i = 1, 2, \dots, n$, and have the same bound for all the $n$ inequalities in (11).

We have, so far, assumed that $\Delta \neq 0$. If $\Delta = 0$, then it is easily seen that the set $S$ in the $x$-space defined by (11) has infinite volume if $c_i > 0$ for every $i$, and the conclusion of Theorem 5 remains valid.

If, instead of (11), we consider fewer inequalities than the number of variables, namely

$$|\xi_i| \le c_i, \qquad i = 1, 2, \dots, m, \quad m < n, \tag{12}$$

then the set which they define in the $x$-space cannot be bounded. But the conclusion of Theorem 5 holds good, because of Theorem 3. There exist integers $x_1, x_2, \dots, x_n$, not all zero, which satisfy the $m$ inequalities in (12). We note that the case $m < n$ is reduced to the former case $m = n$, $\Delta = 0$, by writing condition (12) for $i = m$ exactly $n - m + 1$ times.

(B) As a second application, consider the set $T$ in the $\xi$-space defined by the inequality

$$|\xi_1| + |\xi_2| + \dots + |\xi_n| \le c.$$

It is obviously symmetric. It is convex, for if $\xi = (\xi_1, \dots, \xi_n) \in T$, $\xi' = (\xi'_1, \xi'_2, \dots, \xi'_n) \in T$, and $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$, then

$$\sum_{k=1}^{n} |\lambda\xi_k + \mu\xi'_k| \le \lambda \sum_{k=1}^{n} |\xi_k| + \mu \sum_{k=1}^{n} |\xi'_k| \le \max\Bigl(\sum_{k=1}^{n} |\xi_k|,\ \sum_{k=1}^{n} |\xi'_k|\Bigr).$$

If $n = 2$, $T$ is a square; if $n = 3$, $T$ is an octahedron. The volume of $T$ can be calculated as follows. $T$ consists of $2^n$ congruent parts, one in each octant, and that part which lies in the octant $\xi_1 > 0$, $\xi_2 > 0$, ..., $\xi_n > 0$, has the volume $c^n/n!$; hence $T$ has volume $V = (2c)^n/n!$. If $c^n \ge n!\,|\Delta|$, Theorem 4' gives

THEOREM 6. There exist integers $x_1, x_2, \dots, x_n$, not all zero, such that

$$|\xi_1| + |\xi_2| + \dots + |\xi_n| \le (n!\,|\Delta|)^{1/n}.$$

Since $|\xi_1 \xi_2 \cdots \xi_n|^{1/n} \le \dfrac{1}{n}\bigl(|\xi_1| + \dots + |\xi_n|\bigr)$, this implies

THEOREM 6'. There exist integers $x_1, x_2, \dots, x_n$, not all zero, such that

$$|\xi_1 \xi_2 \cdots \xi_n| \le \frac{n!}{n^n}\,|\Delta|.$$

(C) As a third application, we consider the set $P$ in the $\xi$-space defined by the inequality

$$\xi_1^2 + \xi_2^2 + \dots + \xi_n^2 \le c^2.$$

It is symmetric; and convex, for if $\xi \in P$, $\xi' \in P$, and $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu = 1$, then, by Schwarz's inequality,

$$\sum_{k=1}^{n} (\lambda\xi_k + \mu\xi'_k)^2 = \lambda^2 \sum_{k=1}^{n} \xi_k^2 + 2\lambda\mu \sum_{k=1}^{n} \xi_k\xi'_k + \mu^2 \sum_{k=1}^{n} \xi_k'^{\,2} \le (\lambda + \mu)^2 c^2 = c^2.$$

Its volume is

$$c^n \int \cdots \int_{\sum \xi_k^2 \le 1} d\xi_1 \cdots d\xi_n = \frac{c^n \pi^{n/2}}{\Gamma(n/2 + 1)} = c^n s_n, \text{ say}.$$

Hence, if $c = 2(|\Delta|/s_n)^{1/n}$, we can apply Theorem 4' and get

THEOREM 7. There exist integers $x_1, x_2, \dots, x_n$, not all zero, such that

$$\xi_1^2 + \xi_2^2 + \dots + \xi_n^2 \le 4\Bigl(\frac{|\Delta|}{s_n}\Bigr)^{2/n}.$$

This theorem can be carried over to a general positive definite quadratic form

$$Q(x_1, \dots, x_n) = \sum_{r,s=1}^{n} a_{rs}x_r x_s,$$

with real $a_{rs} = a_{sr}$. $Q$ is positive definite if and only if $Q(x_1, \dots, x_n) > 0$ for all $x_1, x_2, \dots, x_n$, other than $0, 0, \dots, 0$. The determinant $D$ of the matrix $(a_{rs})$ is called the determinant of $Q$, and $D > 0$, if $Q$ is positive definite. Any positive definite form $Q$ can be expressed as

$$Q = \xi_1^2 + \xi_2^2 + \dots + \xi_n^2,$$

where the $\xi_k$ are linear forms in $x_1, x_2, \dots, x_n$, with real coefficients, and determinant $\sqrt{D}$. Theorem 7 can therefore be restated as

THEOREM 8. If $Q$ is a positive definite quadratic form in $n$ variables, with determinant $D$, then there exist integers $x_1, x_2, \dots, x_n$, not all zero, such that

$$Q(x_1, \dots, x_n) \le 4\Bigl(\frac{D}{s_n^2}\Bigr)^{1/n},$$

where $s_n = \pi^{n/2}/\Gamma(n/2 + 1)$.

Chapter X

Dirichlet's theorem on primes in an arithmetical progression

§ 1. Introduction. We have seen by elementary arguments that there exist infinitely many primes, and that, in fact, each of the arithmetical progressions $4k+1$ and $4k+3$, where $k = 1, 2, 3, \dots$, contains infinitely many primes (Chapter III, § 3).
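The way the primes are shared between the progressions $4k+1$ and $4k+3$ can be inspected numerically. The following is only an illustrative sketch: the helper names `primes_up_to` and `count_primes_in_class`, and the bound 100, are assumptions made for the example, and a finite count naturally proves nothing about infinitude.

```python
def primes_up_to(limit):
    """Return all primes p <= limit by the sieve of Eratosthenes (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            # Strike out the multiples of p, starting from p*p.
            sieve[p * p::p] = [False] * len(range(p * p, limit + 1, p))
    return [n for n, is_prime in enumerate(sieve) if is_prime]

def count_primes_in_class(limit, a, m):
    """Count the primes p <= limit with p congruent to a modulo m."""
    return sum(1 for p in primes_up_to(limit) if p % m == a % m)

ones = count_primes_in_class(100, 1, 4)    # primes of the form 4k+1 up to 100
threes = count_primes_in_class(100, 3, 4)  # primes of the form 4k+3 up to 100
print(ones, threes)
```

Up to 100 there are 11 primes of the form $4k+1$ and 13 of the form $4k+3$; together with the prime 2 these account for all 25 primes below 100.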
We shall now prove Dirichlet's theorem that there exist infinitely many primes in any arithmetical progression $a + mk$, where $a$ and $m$ are integers, $m > 0$, $(a, m) = 1$, and $k$ runs through all positive integers.

We proved in Chapter VII that the series $\sum_p 1/p$ diverges, where $p$ runs through all the primes. The proof can be reformulated as follows. For real $s > 1$, we have Euler's identity

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \Bigl(1 - \frac{1}{p^s}\Bigr)^{-1},$$

and, for $0 < x < 1$, we have

$$\log\Bigl(\frac{1}{1-x}\Bigr) = \sum_{n=1}^{\infty} \frac{x^n}{n} < \sum_{n=1}^{\infty} x^n = \frac{x}{1-x},$$

so that, for $0 < x \le \tfrac{1}{2}$, we have

$$\log\Bigl(\frac{1}{1-x}\Bigr) < 2x.$$

Thus for any prime $p$, and real $s > 1$, we get the inequality

$$\log\Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} < \frac{2}{p^s}.$$

Hence

$$\log \zeta(s) = \sum_p \log\Bigl(1 - \frac{1}{p^s}\Bigr)^{-1} < 2\sum_p p^{-s}, \qquad s > 1.$$

If $\sum_p 1/p$ were convergent, we should have $2\sum_p p^{-s} < 2\sum_p 1/p$. We know, however, that $\zeta(1 + \varepsilon) \to \infty$, as $\varepsilon \to +0$. Hence $\sum_p 1/p$ must diverge.

Just as the divergence of $\sum_p 1/p$ is connected with the behaviour of $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$ $(s > 1)$, the divergence of the series $\sum_{p \equiv a \ (\mathrm{mod}\ m)} 1/p$, where $a$ and $m$ are integers, $m > 0$, $(a, m) = 1$, is connected with the behaviour of Dirichlet series of the form $\sum_{n=1}^{\infty} a_n/n^s$, where both $s$ and the coefficients $a_n$ are complex numbers. We prepare for a study of the connexion by considering the function $\zeta(s)$ for complex values of $s$.

Let $s = \sigma + it$, where $\sigma$ and $t$ are real, and $i^2 = -1$. Let us assume, to begin with, that $\sigma > 1$. For real, positive $x$, we set $x^s = e^{s\log x}$, where $\log x$ is the real natural logarithm of $x$. We then have

$$\sum_{n=1}^{\infty} \frac{1}{|n^s|} = \sum_{n=1}^{\infty} \frac{1}{n^\sigma},$$

so that the series $\sum_{n=1}^{\infty} 1/n^s$ converges absolutely for $\sigma > 1$, and uniformly in any half-plane $\sigma \ge 1 + \delta > 1$, where it defines a regular analytic function. Because of the absolute convergence of the series for $\sigma > 1$, by Theorem 5 of Chapter VII, the identity

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_p \Bigl(1 - \frac{1}{p^s}\Bigr)^{-1}$$

remains valid for complex $s$ with real part $\sigma > 1$.
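Euler's identity can be checked numerically by comparing a partial sum of $\sum n^{-s}$ with a partial product over the primes, both for real and for complex $s$ with $\sigma > 1$. The following is a sketch only: the truncation points ($10^5$ terms, primes up to $10^4$), the sample values of $s$, and the helper names are assumptions chosen for illustration.

```python
import math

def primes_up_to(limit):
    """Return all primes p <= limit by the sieve of Eratosthenes (limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p::p] = [False] * len(range(p * p, limit + 1, p))
    return [n for n, is_prime in enumerate(sieve) if is_prime]

def zeta_partial_sum(s, terms):
    """Partial sum of the series sum_{n <= terms} n^(-s); s may be real or complex."""
    return sum(n ** -s for n in range(1, terms + 1))

def euler_partial_product(s, prime_bound):
    """Partial product prod_{p <= prime_bound} (1 - p^(-s))^(-1); s may be real or complex."""
    value = 1.0
    for p in primes_up_to(prime_bound):
        value *= 1.0 / (1.0 - p ** -s)
    return value

real_sum = zeta_partial_sum(2, 100_000)
real_product = euler_partial_product(2, 10_000)
complex_sum = zeta_partial_sum(2 + 1j, 100_000)
complex_product = euler_partial_product(2 + 1j, 10_000)
print(real_sum, real_product, math.pi ** 2 / 6)
print(abs(complex_sum - complex_product))
```

For $s = 2$ both truncations should lie close to $\pi^2/6$ (the classical value of $\zeta(2)$, quoted here as a known fact rather than from the text), and for $s = 2 + i$ the sum and the product should agree to several decimal places.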
The absolute convergence of the product $\prod_p (1 - 1/p^s)^{-1}$ for $\sigma > 1$ follows from that of the series $\sum_p 1/p^s$. Thus in the half-plane $\sigma > 1$, $\zeta(s)$ can be represented by this absolutely convergent product of non-zero factors. Hence $\zeta(s) \neq 0$, for $\sigma > 1$.

The function $\zeta(s)$, defined for $\sigma > 1$, by the relation

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s},$$

can be continued analytically into the half-plane $\sigma > 0$, where it is regular except for a simple pole, with residue 1, at the point $s = 1$. In order to prove this, we use Abel's summation formula given in Theorem 6 of Chapter VII, with $\lambda_n = n$, $\varphi(x) = x^{-s}$, and $a_n = 1$. Then $A(x) = [x]$, the integral part of $x$, and

$$\sum_{n \le x} \frac{1}{n^s} = \frac{[x]}{x^s} + s\int_1^x \frac{[u]}{u^{s+1}}\,du.$$

Now $[x]/x^s \to 0$, as $x \to \infty$, for $\sigma > 1$; and the series $\sum_{n=1}^{\infty} 1/n^s$ converges for $\sigma > 1$. If we write $[u] = u - \{u\}$, we get the representation

$$\sum_{n=1}^{\infty} \frac{1}{n^s} = s\int_1^\infty \frac{du}{u^s} - s\int_1^\infty \frac{\{u\}}{u^{s+1}}\,du = \frac{s}{s-1} - s\int_1^\infty \frac{\{u\}}{u^{s+1}}\,du,$$

that is

$$\sum_{n=1}^{\infty} \frac{1}{n^s} = 1 + \frac{1}{s-1} - s\int_1^\infty \frac{\{u\}}{u^{s+1}}\,du \qquad (\sigma > 1). \tag{1}$$

Obviously we have $0 \le \{u\} < 1$. The integral in (1) is therefore absolutely and uniformly convergent in every half-plane $\sigma \ge \delta > 0$, and represents a regular function of $s$ for $\sigma > 0$. Hence $\zeta(s)$ is meromorphic in $\sigma > 0$, with a simple pole at $s = 1$ with residue 1. It is called Riemann's zeta-function.

§ 2. Characters. A character $\chi$ of a finite, abelian group $G$ is a complex-valued function, not identically zero, defined on the group, such that if $A \in G$, $B \in G$, then $\chi(AB) = \chi(A)\chi(B)$. If $E$ denotes the unit element of $G$, and $A^{-1}$ denotes the group inverse of $A \in G$, the characters of $G$ have the following properties.

(i) $\chi(A) \neq 0$, for every $A \in G$. For if $\chi(A) = 0$, then $\chi(A)\chi(A^{-1}) = \chi(AA^{-1}) = \chi(E) = 0$. That is, $\chi(C) = \chi(E)\chi(C) = 0$, for every $C \in G$, which contradicts the definition of $\chi$. We observe that $\chi(E) = 1$, since $\chi(E) = \chi(E)^2$ and $\chi(E) \neq 0$.

(ii) If $G$ is of order $h$, then $A^h = E$, for every $A \in G$. Hence $\chi(A)^h = \chi(A^h) = \chi(E) = 1$. That is, $\chi(A)$ is an $h$th root of unity. The character $\chi_1$, defined by the property $\chi_1(A) = 1$ for every $A \in G$, is called the principal character of $G$.
(iii) An abelian group of order $h$ has exactly $h$ characters. We first prove this property for cyclic groups. A group $G$ is cyclic, if it consists of the powers $A, A^2, \dots, A^r = E$, of a single element $A$, which is called a generator of $G$. The order $r$ of $G$ is the smallest positive integer $r$, such that $A^r = E$.

Let $\chi$ be a character of the cyclic group $G$. Then (a) $\chi$ is completely defined by the value $\chi(A)$, for $\chi(A^n) = (\chi(A))^n$; (b) $A^r = E$ implies that $(\chi(A))^r = 1$, that is, $\chi(A)$ is an $r$th root of unity; (c) if $\rho$ is an $r$th root of unity, then we can define a character $\chi$ by the relation $\chi(A) = \rho$ (that is, $\chi(A^n) = \rho^n$), for if $A^{a_1} \cdot A^{a_2} = A^{a_3}$, then $a_1 + a_2 \equiv a_3 \pmod r$, hence $\rho^{a_1} \cdot \rho^{a_2} = \rho^{a_3}$. Since there exist only $r$ different $r$th roots of unity, it follows from (a) and (b) that there are at most $r$ different characters of $G$. On the other hand, (c) implies that there are at least $r$ characters. Hence a cyclic group of order $r$ has exactly $r$ characters.

In order to prove property (iii) for an arbitrary abelian group $G$, we use the following result: every finite (multiplicative) abelian group $G$ is a direct product of cyclic groups.

Suppose that $G = G_1 \times \dots \times G_k$, where $G_j$ is cyclic for $1 \le j \le k$. Let $r_j$ be the order of $G_j$, and $A_j$ a generator of $G_j$. The order of $G$ is then $h = r_1 r_2 \cdots r_k$, and every $A \in G$ can be uniquely expressed as

$$A = A_1^{t_1} A_2^{t_2} \cdots A_k^{t_k}, \qquad 0 \le t_j \le r_j - 1, \quad j = 1, 2, \dots, k.$$

If $\chi$ is a character of $G$, we then have

$$\chi(A) = \chi(A_1)^{t_1}\,\chi(A_2)^{t_2} \cdots \chi(A_k)^{t_k}.$$

If $\rho_j$ is an $r_j$th root of unity, then there exists one and only one character $\chi$ of $G$, such that $\chi(A_j) = \rho_j$, $j = 1, 2, \dots, k$. Since $\rho_j$ can take exactly $r_j$ different values, $G$ has exactly $h$ different characters, where $h = r_1 r_2 \cdots r_k$.

(iv) Let $G$ be a finite, multiplicative, abelian group of order $h$. It follows from property (i) that $\chi(E) = 1$ for every character $\chi$ of $G$. We shall now see that given any $A \in G$, $A \neq E$, there exists a character $\chi$, such that $\chi(A) \neq 1$.
We again use the representation of $G$ as a direct product of cyclic groups. As in (iii), let $A = A_1^{t_1} A_2^{t_2} \cdots A_k^{t_k}$. Since $A \neq E$, not all $t_i$ are zero. For example, let $t_1 \neq 0$. We take $\chi(A_2) = \chi(A_3) = \dots = \chi(A_k) = 1$, and $\chi(A_1) = e^{2\pi i/r_1}$. Then $\chi(A) = e^{2\pi i t_1/r_1} \neq 1$, since $0 < t_1 < r_1$.

(v) The characters of a finite, multiplicative, abelian group $G$ again form a finite, multiplicative, abelian group $\hat G$. By the 'product' $\chi'\chi''$ of two characters $\chi'$ and $\chi''$ of $G$ we mean the character $\chi$ defined by the property: $\chi(A) = \chi'(A)\chi''(A)$, $A \in G$. To see that $\chi'\chi''$ is, in fact, a character, we observe that

$$\chi(AB) = \chi'(AB)\chi''(AB) = \chi'(A)\chi'(B)\chi''(A)\chi''(B) = \chi(A)\chi(B).$$

The principal character $\chi_1$ of $G$ is the unit element of $\hat G$. The inverse character $\chi^{-1}$ of a character $\chi$ is defined by the requirement $\chi^{-1}(A) = \chi(A^{-1})$, so that $\chi^{-1}(A) = (\chi(A))^{-1}$. We see that $\chi^{-1}$ is, in fact, a character, for

$$\chi^{-1}(AB) = \chi((AB)^{-1}) = \chi(A^{-1})\chi(B^{-1}) = \chi^{-1}(A)\chi^{-1}(B).$$

The character $\chi$ considered in (iv) generates a cyclic subgroup of $\hat G$, of order $r_1$. Similarly there exist cyclic subgroups of orders $r_2, \dots, r_k$. The argument used to show that $G$ has exactly $h$ distinct characters, where $h$ is the order of $G$, shows that $\hat G$ is the direct product of these cyclic subgroups of orders $r_1, r_2, \dots, r_k$. Hence $G$ and $\hat G$ are isomorphic, such an isomorphism depending on the decomposition of $G$ into cyclic factors, which is not unique in general, and on the choice of generators for these cyclic factors.

§ 3. Sums of characters, orthogonality relations. Let $G$ be a finite, multiplicative, abelian group, of order $h$. Let us consider the sum

$$S = \sum_A \chi(A),$$

where $A$ runs through all elements of $G$, and the sum

$$T = \sum_\chi \chi(A),$$

where $\chi$ runs through all elements of the character group $\hat G$.

If $B$ is a fixed element of $G$, and $A$ runs through all elements of $G$, so does $AB$. Hence

$$S \cdot \chi(B) = \sum_A \chi(AB) = \sum_A \chi(A) = S,$$

which implies that $(\chi(B) - 1)\,S = 0$.
Hence either $S = 0$, or $S \neq 0$ and $\chi(B) = 1$ for every $B \in G$, in which case $\chi = \chi_1$, the principal character, and the sum $S$ has the value $h$, the order of $G$. Hence

$$S = \sum_A \chi(A) = \begin{cases} h, & \text{if } \chi = \chi_1,\\ 0, & \text{if } \chi \neq \chi_1. \end{cases} \tag{2}$$

If we multiply the sum $T$ by $\chi'(A)$, where $\chi'$ is some character of $G$, then we similarly obtain

$$\chi'(A) \cdot T = \sum_\chi \chi(A)\chi'(A) = \sum_\chi \chi(A) = T.$$

Hence either $T = 0$, or $\chi'(A) = 1$ for every $\chi' \in \hat G$, in which case, because of (iv) of § 2, $A = E$ and $T = h$. Thus

$$T = \sum_\chi \chi(A) = \begin{cases} h, & \text{if } A = E,\\ 0, & \text{if } A \neq E. \end{cases} \tag{3}$$

Let $m$ be a positive integer. We know that the $\varphi(m)$ prime residue classes modulo $m$ form a multiplicative abelian group of order $h = \varphi(m)$ (Chapter II, § 1). We can therefore consider the characters of this group. But the definition of character can be carried over from the prime residue classes modulo $m$ to the integers themselves, as follows. We define

$$\chi(a) = \chi(A), \quad \text{if } a \in A,$$

where $A$ is a prime residue class modulo $m$. Then obviously $\chi(a) = \chi(b)$, if $a \equiv b \pmod m$; and $\chi(ab) = \chi(a)\chi(b)$, if $(a, m) = (b, m) = 1$. Since $\chi(A) \neq 0$ for every prime residue class $A$, it follows that $\chi(a) \neq 0$, if $(a, m) = 1$.

This definition applies only to integers $a$ which are prime to $m$. We can extend it to all integers by the requirement that $\chi(a) = 0$, if $(a, m) > 1$.

A character modulo $m$ is therefore an arithmetical function $\chi$, with the properties:

$\chi(a) = \chi(b)$, if $a \equiv b \pmod m$,
$\chi(ab) = \chi(a)\chi(b)$, for all integers $a$ and $b$,
$\chi(a) = 0$, if $(a, m) > 1$,
$\chi(a) \neq 0$, if $(a, m) = 1$.

There exist $\varphi(m)$ characters modulo $m$, where $\varphi(m)$ is the number of positive integers, not exceeding $m$, which are prime to $m$. They form a (multiplicative) abelian group, which is isomorphic to the group of prime residue classes $(\mathrm{mod}\ m)$. The unit element of this group is the principal character $\chi_1$, which is such that $\chi_1(a) = 1$ if $(a, m) = 1$. Further we have the relations of orthogonality:

$$\sum_{n \ (\mathrm{mod}\ m)} \chi(n) = \begin{cases} \varphi(m), & \text{if } \chi = \chi_1,\\ 0, & \text{if } \chi \neq \chi_1, \end{cases} \tag{4}$$

$$\sum_\chi \chi(n) = \begin{cases} \varphi(m), & \text{if } n \equiv 1 \pmod m,\\ 0, & \text{if } n \not\equiv 1 \pmod m. \end{cases} \tag{5}$$

EXAMPLES. (I) Let $m = 4$.
Then there are 2 prime residue classes, namely the class $E$ consisting of integers congruent to 1 (mod 4), and the class $A$ of integers congruent to 3 (mod 4). $A$ and $E$ form a cyclic group of order 2. There are two characters $\chi_1$ and $\chi_2$, where $\chi_1(E) = \chi_1(A) = 1$, the principal character, and

$$\chi_2(E) = 1, \qquad \chi_2(A) = -1.$$

By the definition of character, carried over to the integers, we have

$$\chi_1(n) = \begin{cases} 0, & \text{if } n \text{ is even},\\ 1, & \text{if } n \text{ is odd}, \end{cases}$$

and

$$\chi_2(n) = \begin{cases} 0, & \text{if } n \text{ is even},\\ 1, & \text{if } n \equiv 1 \pmod 4,\\ -1, & \text{if } n \equiv 3 \pmod 4. \end{cases}$$

Further we have

$$\chi_1(1) + \chi_1(3) = 2, \qquad \chi_1(1) + \chi_2(1) = 2,$$
$$\chi_2(1) + \chi_2(3) = 0, \qquad \chi_1(3) + \chi_2(3) = 0.$$

(II) Let $m = 5$. Then the prime residue classes are $E, A, A^2, A^3$, where $A$ is the class of all integers congruent to 2 (mod 5). $A^2$ is then the class of integers congruent to 4 (mod 5), and $A^3$ the class of integers congruent to 3 (mod 5). $E$ contains all integers congruent to 1 (mod 5). The four characters are as follows:

$$\begin{array}{c|rrrr} & E & A & A^2 & A^3\\ \hline \chi_1 & 1 & 1 & 1 & 1\\ \chi_2 & 1 & i & -1 & -i\\ \chi_3 & 1 & -1 & 1 & -1\\ \chi_4 & 1 & -i & -1 & i \end{array}$$

§ 4. Dirichlet series, Landau's theorem. A Dirichlet series is a series of the form $\sum_{n=1}^{\infty} a_n n^{-s}$, where $s$ is a complex number, and the coefficients $a_n$ are likewise complex numbers. More generally, a series of the form

$$\sum_{n=1}^{\infty} \frac{a_n}{\lambda_n^s}, \qquad \text{or} \qquad \sum_{n=1}^{\infty} a_n e^{-s\lambda_n},$$

where $0 < \lambda_1 < \lambda_2 < \dots$, and $\lambda_n \to \infty$ as $n \to \infty$, is called a Dirichlet series. Many of the Dirichlet series which appear in the theory of numbers are of the type $\sum a_n n^{-s}$, and we shall consider some elementary properties of such series. We usually write $s = \sigma + it$, where $\sigma$ and $t$ are real, and $i^2 = -1$.

THEOREM 1. If the series $\sum_{n=1}^{\infty} a_n/n^s$ converges for $s = s_0$, it converges uniformly in the angular region defined by $|\arg(s - s_0)| \le \pi/2 - \theta < \pi/2$.

PROOF.
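We may suppose, without loss of generality, that $s_0 = 0$, for

$$\sum_{n=1}^{\infty} \frac{a_n}{n^s} = \sum_{n=1}^{\infty} \frac{a_n}{n^{s_0}} \cdot \frac{1}{n^{s - s_0}} = \sum_{n=1}^{\infty} \frac{b_n}{n^{s - s_0}}, \qquad b_n = a_n \cdot n^{-s_0},$$

and the convergence of $\sum_{n=1}^{\infty} a_n n^{-s}$ for $s = s_0$, is equivalent to the convergence of $\sum_{n=1}^{\infty} b_n$.

Let $\sum_{n=1}^{\infty} a_n$ converge. Then $\lim_{n\to\infty} r_n = 0$, where $r_n = \sum_{\nu=n+1}^{\infty} a_\nu$. Let $M$ and $N$ be positive integers, such that $M < N$. Then

$$\sum_{n=M}^{N} \frac{a_n}{n^s} = \sum_{n=M}^{N} \frac{r_{n-1} - r_n}{n^s} = \sum_{n=M}^{N-1} r_n \Bigl(\frac{1}{(n+1)^s} - \frac{1}{n^s}\Bigr) + \frac{r_{M-1}}{M^s} - \frac{r_N}{N^s}.$$

If $\sigma > 0$, we have

$$\Bigl|\frac{1}{(n+1)^s} - \frac{1}{n^s}\Bigr| = \Bigl| s \int_n^{n+1} \frac{du}{u^{s+1}} \Bigr| \le |s| \int_n^{n+1} \frac{du}{u^{\sigma+1}} = \frac{|s|}{\sigma}\Bigl(\frac{1}{n^\sigma} - \frac{1}{(n+1)^\sigma}\Bigr).$$

Further, if $\varepsilon > 0$ is given, then $|r_n| < \varepsilon$, for $n \ge n_0(\varepsilon)$, where $n_0$ is independent of $s$. Hence, for $M > n_0$, we have

$$\Bigl|\sum_{n=M}^{N} \frac{a_n}{n^s}\Bigr| \le \frac{\varepsilon |s|}{\sigma} \sum_{n=M}^{N-1} \Bigl(\frac{1}{n^\sigma} - \frac{1}{(n+1)^\sigma}\Bigr) + \frac{\varepsilon}{M^\sigma} + \frac{\varepsilon}{N^\sigma}.$$

If $\sigma > 0$, and $M > n_0(\varepsilon)$, we therefore have the estimate

$$\Bigl|\sum_{n=M}^{N} \frac{a_n}{n^s}\Bigr| \le \frac{\varepsilon |s|}{\sigma} \cdot \frac{1}{M^\sigma} + \frac{2\varepsilon}{M^\sigma} \le \frac{3\varepsilon |s|}{\sigma},$$

since $|s|/\sigma \ge 1$. To prove the required uniform convergence, we observe that

$$\frac{|s|}{\sigma} = \frac{1}{\cos|\arg s|} \le \frac{1}{\cos(\pi/2 - \theta)} = \frac{1}{\sin\theta},$$

that is, for every $s$, such that $|\arg s| \le \pi/2 - \theta < \pi/2$, we have

$$\Bigl|\sum_{n=M}^{N} \frac{a_n}{n^s}\Bigr| \le \frac{3\varepsilon}{\sin\theta},$$

which proves Theorem 1.

It follows that if $\sum a_n/n^s$ converges for $s = \sigma_0 + it_0$, then it converges for all $s = \sigma + it$, with $\sigma > \sigma_0$. Hence we have

THEOREM 2. If $\sum_{n=1}^{\infty} a_n/n^s$ converges for $s = s_0$, it converges in the half-plane $\sigma > \sigma_0$, and uniformly in every compact set contained in that half-plane.

From the uniform convergence we also have

THEOREM 3. If $\sum_{n=1}^{\infty} a_n/n^s$ converges for $s = s_0$ to the sum $f(s_0)$, where $f(s)$ denotes its sum function in the half-plane $\sigma > \sigma_0$, then $f(s) \to f(s_0)$, as $s \to s_0$ along any path in the region $|\arg(s - s_0)| \le \pi/2 - \theta < \pi/2$.

Theorem 2 shows that the region of convergence of a Dirichlet series is a half-plane. For if the points of the real axis are divided into two classes $U$ and $L$, such that

$$U = \Bigl\{\sigma \Bigm| \sum_{n=1}^{\infty} \frac{a_n}{n^\sigma} \text{ is convergent}\Bigr\}, \qquad L = \Bigl\{\sigma \Bigm| \sum_{n=1}^{\infty} \frac{a_n}{n^\sigma} \text{ is divergent}\Bigr\},$$

then every member of $U$ is greater than any member of $L$, and the classification defines a real number $\sigma_0$, such that the series converges for $\sigma > \sigma_0$, and diverges for $\sigma < \sigma_0$, the case $\sigma = \sigma_0$ being undecided.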
If $U$ is empty, we define $\sigma_0 = +\infty$, and if $L$ is empty, $\sigma_0 = -\infty$.

The number $\sigma_0$ is called the abscissa of convergence, the line $\sigma = \sigma_0$ the line of convergence, and the half-plane $\sigma > \sigma_0$ the half-plane of convergence, of the Dirichlet series $\sum_{n=1}^{\infty} a_n/n^s$.

The series $\sum_{n=1}^{\infty} n!/n^s$ converges nowhere ($\sigma_0 = +\infty$), while the series $\sum_{n=1}^{\infty} 1/(n!\,n^s)$ converges everywhere ($\sigma_0 = -\infty$).

Theorem 1, together with Weierstrass's theorem on uniform limits of analytic functions, gives

THEOREM 4. A Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ represents in its half-plane of convergence a regular analytic function of $s$, whose successive derivatives are obtained by termwise differentiation of the series.

These theorems do not say anything about the convergence of the series, or the regularity of the sum function, on the line of convergence. In contrast to a power series which always has a singularity on the circle of convergence, a Dirichlet series need not necessarily have any singularity on the line of convergence. Nor can we conclude from the convergence or divergence of a Dirichlet series at a fixed point on the line of convergence, the regularity or singularity of the sum function of the series at that point. We shall revert to this question a little later.

ABSOLUTE CONVERGENCE. The series $\sum_{n=1}^{\infty} a_n/n^s$ is absolutely convergent if $\sum_{n=1}^{\infty} |a_n|/n^\sigma$ is convergent. The abscissa of absolute convergence $\bar\sigma$ of the series $\sum a_n/n^s$ is the abscissa of convergence of $\sum_{n=1}^{\infty} |a_n|/n^s$.

Obviously we have $\bar\sigma \ge \sigma_0$, since absolute convergence implies convergence. If $\bar\sigma > \sigma_0$, then there exists a strip of the complex $s$-plane in which the series converges but not absolutely. This strip $\sigma_0 < \sigma < \bar\sigma$ is called the strip of conditional convergence. To take an example, the series

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^s}$$

converges for real $s > 0$, since it is an alternating series of decreasing terms.
It obviously diverges for real $s < 0$. Hence $\sigma_0 = 0$. It converges absolutely for $\sigma > 1$, and does not converge absolutely for $\sigma < 1$. Hence $\bar\sigma = 1$. The strip of conditional convergence has width 1.

It is interesting to note that

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^s} = (1 - 2^{1-s})\,\zeta(s), \qquad \text{for } \sigma > 0, \tag{6}$$

where $\zeta(s)$ is the Riemann zeta-function, for the series on the left is absolutely convergent for $\sigma > 1$ and can therefore be rearranged:

$$\sum_{n=1}^{\infty} \frac{(-1)^{n-1}}{n^s} = \Bigl(\frac{1}{1^s} + \frac{1}{2^s} + \frac{1}{3^s} + \dots\Bigr) - 2\Bigl(\frac{1}{2^s} + \frac{1}{4^s} + \frac{1}{6^s} + \dots\Bigr) = (1 - 2^{1-s})\,\zeta(s), \qquad \text{for } \sigma > 1.$$

But the series $\sum (-1)^{n-1}/n^s$ converges for $\sigma > 0$, and the function $(1 - 2^{1-s})\,\zeta(s)$ is regular for $\sigma > 0$, the simple pole of $\zeta(s)$ at $s = 1$ being cancelled by the zero of $1 - 2^{1-s}$. Hence (6) is valid, by analytic continuation, for $\sigma > 0$.

We have noted that the strip of conditional convergence of the series in (6) is of width 1. It can be shown that the strip of conditional convergence of any Dirichlet series $\sum a_n/n^s$ can be at most of width 1, so that if it converges for a given $s$, it converges absolutely when the real part of $s$ is increased by $1 + \varepsilon$ with any $\varepsilon > 0$.

THEOREM 5. For any Dirichlet series $\sum_{n=1}^{\infty} a_n/n^s$, we have $\bar\sigma - \sigma_0 \le 1$.

PROOF. If $\sum_{n=1}^{\infty} a_n/n^\sigma$ converges, then $\lim_{n\to\infty} |a_n|/n^\sigma = 0$, hence the series $\sum_{n=1}^{\infty} |a_n|/n^{1+\sigma+\varepsilon}$ converges for $\varepsilon > 0$.

This theorem does not hold for Dirichlet series of the more general form $\sum a_n \lambda_n^{-s}$, where $(\lambda_n)$ is not the set of positive integers, as the following examples show:

$$\sum_{n=2}^{\infty} \frac{(-1)^n}{(\log n)^s} \quad \text{converges for } \sigma > 0, \text{ but never absolutely};$$

$$\sum_{n=2}^{\infty} \frac{(-1)^n}{\sqrt{n}\,(\log n)^s} \quad \text{converges for all } s, \text{ but never absolutely}.$$

We now return to the question of the regularity of the sum function of a Dirichlet series $\sum a_n/n^s$ on the line of convergence. In case the coefficients $(a_n)$ are non-negative, we have

THEOREM 6 (LANDAU).
If $a_n \ge 0$ for all $n \ge 1$, and $\sigma_0$ is finite, then the point of intersection of the real axis with the line of convergence is a singularity of the sum function $f(s)$ of the Dirichlet series $\sum_{n=1}^{\infty} a_n/n^s$.

PROOF. Since $a_n \ge 0$, we have $\bar\sigma = \sigma_0$. We can assume, without loss of generality, that $\sigma_0 = 0$. We wish to show that the point $s = 0$ is a singularity of $f$. If $f$ were regular at $s = 0$, then the Taylor series of $f$ at the point $s = 1$ would have a radius of convergence $\rho > 1$. Hence there would exist a real $s < 0$, for which the Taylor series

$$\sum_{\nu=0}^{\infty} \frac{(s-1)^\nu}{\nu!}\,f^{(\nu)}(1)$$

converges. But, for $\sigma > 0$,

$$f(s) = \sum_{n=1}^{\infty} a_n e^{-s\log n},$$

and by Theorem 4,

$$f^{(\nu)}(s) = \sum_{n=1}^{\infty} a_n (-\log n)^\nu e^{-s\log n},$$

so that

$$f^{(\nu)}(1) = \sum_{n=1}^{\infty} \frac{a_n (-\log n)^\nu}{n}.$$

The Taylor series of $f$ at $s = 1$ is therefore

$$\sum_{\nu=0}^{\infty} \frac{(s-1)^\nu}{\nu!} \sum_{n=1}^{\infty} \frac{a_n (-\log n)^\nu}{n} = \sum_{\nu=0}^{\infty} \frac{(1-s)^\nu}{\nu!} \sum_{n=1}^{\infty} \frac{a_n (\log n)^\nu}{n}.$$

Since all terms of this double series are non-negative, if $s < 0$, we may interchange the order of summation, and it follows that

$$\sum_{n=1}^{\infty} \frac{a_n}{n} \sum_{\nu=0}^{\infty} \frac{(1-s)^\nu (\log n)^\nu}{\nu!}$$

converges for some $s < 0$. However,

$$\sum_{\nu=0}^{\infty} \frac{(1-s)^\nu (\log n)^\nu}{\nu!} = e^{(1-s)\log n}.$$

Hence $\sum_{n=1}^{\infty} a_n n^{-s}$ converges for some $s < 0$, which is impossible, since $\sigma_0 = 0$. Hence the point $s = 0$ must be a singularity of $f(s)$.

MULTIPLICATION OF DIRICHLET SERIES. The formal product of two given Dirichlet series $\sum_{k=1}^{\infty} a_k/k^s$ and $\sum_{m=1}^{\infty} b_m/m^s$ is defined to be $\sum_{n=1}^{\infty} c_n/n^s$, where $c_n = \sum_{km=n} a_k b_m$. If both the given series are absolutely convergent for a given $s$, they can be multiplied out and rearranged; and the series $\sum_{n=1}^{\infty} c_n/n^s$ is then absolutely convergent, and is called the product of the given series. For $\sigma > \sigma_0$, let

$$f(s) = \sum_{k=1}^{\infty} \frac{a_k}{k^s}, \qquad g(s) = \sum_{m=1}^{\infty} \frac{b_m}{m^s}.$$

The function $h(s)$, where $h(s) = f(s) \cdot g(s)$ is representable by the product of the Dirichlet series in the half-plane $\sigma > \sigma_0 + 1$, by Theorem 5.

The representation of a function by a Dirichlet series is unique, as shown by the following

THEOREM 7.
If the series $\sum_{n=1}^{\infty} a_n/n^{s}$ and $\sum_{n=1}^{\infty} b_n/n^{s}$ converge in a common half-plane, and if their sum functions coincide in a non-empty open set contained in that half-plane, then $a_n = b_n$ for all $n \ge 1$.

PROOF. Consider the Dirichlet series $\sum_{n=1}^{\infty} (a_n - b_n)/n^{s}$. It converges in a half-plane $\sigma > \sigma_0$, say, where it defines a regular analytic function. That function vanishes on a non-empty open set contained in that half-plane. Hence it is identically zero in the whole half-plane $\sigma > \sigma_0$. Suppose that $a_n \ne b_n$ for some $n$; let $M$ be the first value of the index $n$ such that $a_n \ne b_n$, and let $c_n = a_n - b_n$. Then, for $\sigma > \sigma_0$, we have
\[
\sum_{n=1}^{\infty} \frac{c_n}{n^{\sigma}} = \sum_{n=M}^{\infty} \frac{c_n}{n^{\sigma}} = 0.
\]
Hence
\[
c_M = -M^{\sigma} \sum_{n=M+1}^{\infty} \frac{c_n}{n^{\sigma}}.
\]
Because of the uniform convergence of the series for $\sigma > \sigma_0 + 2$, if we let $\sigma \to \infty$, it follows that $c_M = 0$. This contradicts the definition of $M$. Hence $c_n = 0$ for all $n \ge 1$.

§ 5. Dirichlet's theorem. We shall now apply the knowledge of characters obtained from § 3, and of Dirichlet series obtained from § 4, to series of the form
\[
\sum_{n=1}^{\infty} \frac{\chi(n)}{n^{s}}, \qquad s = \sigma + it, \tag{7}
\]
where $\chi$ is a character modulo $m$. There are $\varphi(m)$ such series, where $\varphi$ is Euler's function. Since $|\chi(n)| \le 1$, the series in (7) converges for $\sigma > 1$, by comparison with the series $\sum 1/n^{\sigma}$, and we denote its sum function by $L(s,\chi)$. For different characters $\chi$, we get different functions $L(s,\chi)$, and these are called Dirichlet's L-functions. To study their properties, it is convenient to distinguish the case where $\chi$ is the principal character $\chi_1$, from the case where $\chi \ne \chi_1$.

(i) If $\chi \ne \chi_1$, then the series in (7) converges in the half-plane $\sigma > 0$, since the partial sums $\sum_{n \le x} \chi(n)$ are bounded, which can be seen as follows. If we partition the integers from 1 to $[x]$ into residue classes (mod $m$), and write $[x] = mq + r$, $0 \le r \le m-1$, then
\[
\sum_{n \le x} \chi(n) = \sum_{n=1}^{[x]} \chi(n)
= \Bigl(\sum_{n=1}^{m} + \sum_{n=m+1}^{2m} + \cdots + \sum_{n=m(q-1)+1}^{mq}\Bigr)\chi(n) + \sum_{n=mq+1}^{mq+r} \chi(n),
\]
and because of the orthogonality relation (4), we have
\[
\sum_{n \le x} \chi(n) = \sum_{n=mq+1}^{mq+r} \chi(n),
\]
which implies that
\[
\Bigl|\sum_{n \le x} \chi(n)\Bigr| \le \sum_{n=mq+1}^{mq+r} |\chi(n)| \le r < m.
\]
Since $n^{-\sigma}$, for $\sigma > 0$, decreases monotonically to zero as $n \to \infty$, it follows that $\sum \chi(n)/n^{s}$ converges for real $s = \sigma > 0$, and consequently for all $s$ in the half-plane $\sigma > 0$, if $\chi \ne \chi_1$. If $\sigma < 0$, it obviously diverges. Its abscissa of convergence is $\sigma_0 = 0$, and its abscissa of absolute convergence is $\bar\sigma = 1$. By Theorem 4, the function $L(s,\chi)$, $\chi \ne \chi_1$, is a regular analytic function of $s$, for $\sigma > 0$.

(ii) If $\chi = \chi_1$, we use, once again, Euler's identity
\[
\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^{s}} = \prod_{p} \Bigl(1 - \frac{1}{p^{s}}\Bigr)^{-1}, \qquad \sigma > 1,
\]
where $p$ runs through all the primes. Since each character $\chi$ is a completely multiplicative arithmetical function, by Theorem 5 of Chapter VII, we have, for all $\chi$, the identity
\[
L(s,\chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^{s}} = \prod_{p} \Bigl(1 - \frac{\chi(p)}{p^{s}}\Bigr)^{-1}, \qquad \sigma > 1. \tag{8}
\]
This implies that $L(s,\chi) \ne 0$, for $\sigma > 1$. If $\chi_1$ is the principal character (mod $m$), we know that
\[
\chi_1(a) = \begin{cases} 1, & \text{if } (a,m) = 1, \\ 0, & \text{if } (a,m) > 1. \end{cases}
\]
Using this in (8), we get
\[
L(s,\chi_1) = \prod_{p \nmid m} (1 - p^{-s})^{-1},
\]
or
\[
L(s,\chi_1) = \zeta(s) \cdot \prod_{p \mid m} (1 - p^{-s}) \qquad (\sigma > 1). \tag{9}
\]
We have seen that $\zeta(s)$ is meromorphic in the half-plane $\sigma > 0$, having a simple pole at $s = 1$, with residue 1, as its only singularity. Hence $L(s,\chi_1)$ is regular for $\sigma > 0$, except for the point $s = 1$, where it has a simple pole with residue $\prod_{p \mid m} (1 - p^{-1}) = \varphi(m)/m$ [cf. Chapter II, (1)].

For the proof of Dirichlet's theorem we need the following

LEMMA. If $\chi \ne \chi_1$, then $L(1,\chi) \ne 0$.

PROOF. It is sufficient to show that the product
\[
P(s) = \prod_{\chi} L(s,\chi),
\]
where $\chi$ runs through all characters (mod $m$), is not regular for $\sigma > 0$. For if $L(1,\chi) = 0$ for at least one character $\chi \ne \chi_1$, then the simple pole at $s = 1$ of $L(s,\chi_1)$ in the product $P(s)$ would be cancelled by the zero of $L(s,\chi)$ at $s = 1$, and $P(s)$ would be regular for $\sigma > 0$.
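As a numerical aside on case (i) and on the statement of the lemma, the following Python sketch may be helpful. It assumes, purely for illustration, the two non-principal real characters modulo 3 and modulo 4 (the function names `chi3`, `chi4`, `max_partial_sum` and `l_at_one` are hypothetical and do not come from the text). It checks that the partial sums $\sum_{n \le x} \chi(n)$ stay below $m$ in absolute value, and approximates $L(1,\chi)$ by a partial sum of the series (7). The closed forms $\pi/(3\sqrt{3})$ and $\pi/4$ mentioned in the comments are classical values quoted for comparison only; they are not derived here.

```python
import math


def chi3(n):
    """Non-principal character modulo 3: values 0, 1, -1 on residues 0, 1, 2."""
    return (0, 1, -1)[n % 3]


def chi4(n):
    """Non-principal character modulo 4: values 0, 1, 0, -1 on residues 0, 1, 2, 3."""
    return (0, 1, 0, -1)[n % 4]


def max_partial_sum(chi, limit):
    """Largest value of |sum_{n <= x} chi(n)| over 1 <= x <= limit."""
    total, worst = 0, 0
    for n in range(1, limit + 1):
        total += chi(n)
        worst = max(worst, abs(total))
    return worst


def l_at_one(chi, terms):
    """Partial sum sum_{n <= terms} chi(n)/n, an approximation to L(1, chi)."""
    return sum(chi(n) / n for n in range(1, terms + 1))


if __name__ == "__main__":
    # Classical values, for comparison: L(1, chi3) = pi/(3*sqrt(3)), L(1, chi4) = pi/4.
    reference = {"chi mod 3": math.pi / (3 * math.sqrt(3)), "chi mod 4": math.pi / 4}
    for name, chi, m in (("chi mod 3", chi3, 3), ("chi mod 4", chi4, 4)):
        bound = max_partial_sum(chi, 10_000)
        value = l_at_one(chi, 300_000)
        print(f"{name}: max |partial sum| = {bound} < {m}; "
              f"L(1, chi) ~ {value:.6f} (reference {reference[name]:.6f})")
```

A finite computation of this kind only illustrates the lemma for two particular characters; the proof for a general modulus is the argument with $P(s)$ given in the text.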
For $\sigma > 1$, we have $|\chi(p)p^{-s}| \le p^{-\sigma} < 1$, so that we define
\[
\log\Bigl(1 - \frac{\chi(p)}{p^{s}}\Bigr)^{-1} = \sum_{k=1}^{\infty} \frac{\chi(p^{k})}{k\,p^{ks}}.
\]
Then the function $\log L(s,\chi)$ is uniquely defined in the half-plane $\sigma > 1$, and given by
\[
\log L(s,\chi) = \sum_{p,k} \frac{\chi(p^{k})}{k\,p^{ks}}, \tag{10}
\]
where $p$ runs through all the primes, and $k$ through all positive integers. The double series is absolutely convergent for $\sigma > 1$. Further $e^{\log L(s,\chi)} = L(s,\chi)$.

If we sum $\log L(s,\chi)$ over all the characters $\chi$ (mod $m$), we get
\[
Q(s) = \log P(s) = \sum_{\chi} \log L(s,\chi) = \sum_{\chi} \sum_{p,k} \frac{\chi(p^{k})}{k\,p^{ks}}.
\]
Since there are only finitely many $\chi$, we can interchange the order of summation, and obtain
\[
Q(s) = \sum_{p,k} \frac{1}{k\,p^{ks}} \sum_{\chi} \chi(p^{k}).
\]
Since
\[
\sum_{\chi} \chi(a) = \begin{cases} \varphi(m), & \text{if } a \equiv 1 \pmod{m}, \\ 0, & \text{otherwise}, \end{cases}
\]
we have
\[
Q(s) = \varphi(m) \sum_{p^{k} \equiv 1 \,(\mathrm{mod}\, m)} \frac{1}{k\,p^{ks}}. \tag{11}
\]
If we define
\[
a_n = \begin{cases} \varphi(m)/k, & \text{if } n = p^{k} \equiv 1 \pmod{m}, \\ 0, & \text{otherwise}, \end{cases}
\]
then
\[
Q(s) = \sum_{n=1}^{\infty} \frac{a_n}{n^{s}},
\]
where the coefficients $(a_n)$ are non-negative. We know that the series converges for $\sigma > 1$. In order to find its abscissa of convergence, let $p$ be a prime such that $p \nmid m$. By Euler's theorem (Theorem 2, Chapter II) we have $p^{h} \equiv 1 \pmod{m}$, where $h = \varphi(m)$. If we consider the series (11) for real $s$, and take only the terms for which $k = h$, then
\[
Q(s) \ge \sum_{p \nmid m} \frac{1}{p^{hs}} = \sum_{p} \frac{1}{p^{hs}} - \sum_{p \mid m} \frac{1}{p^{hs}}.
\]
Since $\sum_{p} 1/p$ diverges, and $\sum_{p \mid m} 1/p$ is finite, it follows that the series in (11) diverges for $s = 1/h$. Hence, if $\alpha$ is the abscissa of convergence of the Dirichlet series $Q(s)$, then $\alpha \ge 1/h$. But
\[
P(s) = e^{Q(s)} = 1 + Q(s) + \frac{Q^{2}(s)}{2!} + \cdots. \tag{12}
\]
The product of two convergent Dirichlet series, with non-negative coefficients, is again a Dirichlet series with non-negative coefficients, which converges in the intersection of the two half-planes of convergence. Hence, along with $Q(s)$ all the powers $Q^{n}(s)$ are absolutely convergent, so that the series $P(s)$ in (12) can be written as a Dirichlet series which has non-negative coefficients. Thus if the Dirichlet series of $Q(s)$ converges, so does the Dirichlet series of $P(s)$.
Conversely, if the Dirichlet series of $P(s)$ converges for some real $s$, then so does the Dirichlet series of $Q(s)$ for that value of $s$, because its coefficients are non-negative and, by (12), do not exceed the corresponding coefficients of $P(s)$. Hence the Dirichlet series of $P(s)$, which is unique, has the same abscissa of convergence $\sigma_0 = \alpha$ as the Dirichlet series of $Q(s)$. By Theorem 6, the point $s = \alpha$ is a singularity of $P(s)$. But we know that $\alpha \ge 1/h > 0$. Hence the function $P(s)$ is not regular in the whole half-plane $\sigma > 0$. Thus the lemma is proved.

We are now in a position to prove the main theorem of this chapter, namely

THEOREM 8 (DIRICHLET). If $m$ is a positive integer, and $(a,m) = 1$, then there exist infinitely many primes $p \equiv a \pmod{m}$.

PROOF. It is sufficient to prove that the series $\sum 1/p$ summed over all primes $p \equiv a \pmod{m}$ diverges. For this purpose, we use the functions $L(s,\chi)$. If $\sigma > 1$, then by (10), we have
\[
\log L(s,\chi) = \sum_{p} \sum_{k=1}^{\infty} \frac{\chi(p^{k})}{k\,p^{ks}}.
\]
If we separate the terms for which $k = 1$ from the others, we get
\[
\log L(s,\chi) = \sum_{p} \chi(p)p^{-s} + R(s,\chi), \tag{13}
\]
where $R(s,\chi)$ denotes the sum of the terms with $k \ge 2$, and the series defining it converges for $\sigma > \tfrac12$. Since $(a,m) = 1$, there exists an integer $b$, such that $ab \equiv 1 \pmod{m}$. If we multiply (13) by $\chi(b)$, and sum over all characters $\chi$ (mod $m$), we get
\[
\sum_{\chi} \chi(b)\log L(s,\chi) = \sum_{p} \sum_{\chi} \chi(bp)\,p^{-s} + \sum_{\chi} \chi(b)R(s,\chi), \qquad \sigma > 1.
\]
Since $R(s,\chi)$ is regular for $\sigma > \tfrac12$, the function $R^{*}(s) = \sum_{\chi} \chi(b)R(s,\chi)$ is also regular for $\sigma > \tfrac12$. Further
\[
\sum_{\chi} \chi(bp) = \begin{cases} h, & \text{if } bp \equiv 1 \pmod{m}, \\ 0, & \text{otherwise}. \end{cases}
\]
If $ab \equiv 1 \pmod{m}$, then the congruence $bp \equiv 1 \pmod{m}$ is equivalent to $p \equiv a \pmod{m}$. Hence
\[
\sum_{\chi} \chi(b)\log L(s,\chi) = h \sum_{p \equiv a \,(\mathrm{mod}\, m)} p^{-s} + R^{*}(s). \tag{14}
\]
If we now let $s \to 1+0$ along the real axis, the left-hand side of (14) tends to $\infty$.
For $L(s,\chi_1) \to \infty$ as $s \to 1+0$; $L(s,\chi)$, $\chi \ne \chi_1$, is regular for $\sigma > 0$; $L(1,\chi) \ne 0$ for $\chi \ne \chi_1$ by the lemma; and $\log L(s,\chi)$, $\chi \ne \chi_1$, as defined by (10), has a finite limit as $s \to 1+0$, because of the formula
\[
\log L(s,\chi) = -\int_{s}^{c} \frac{L'(u,\chi)}{L(u,\chi)}\,du + \log L(c,\chi),
\]
for $s = \sigma > 1$, $c > \sigma$, if we note that $L(u,\chi) \ne 0$ for $u \ge 1$, $\chi \ne \chi_1$, and that $L(s,\chi)$ is regular for $\sigma > 0$, $\chi \ne \chi_1$. Further $R^{*}(s)$ is regular for $\sigma > \tfrac12$. Hence
\[
\sum_{p \equiv a \,(\mathrm{mod}\, m)} p^{-s} \to \infty, \quad \text{as } s \to 1+0.
\]
Hence $\sum_{p \equiv a \,(\mathrm{mod}\, m)} 1/p$ diverges.

Chapter XI

The prime number theorem

§ 1. The non-vanishing of $\zeta(1+it)$. We have seen in the preceding chapter that Dirichlet's L-functions have the property that $L(1,\chi) \ne 0$ for $\chi \ne \chi_1$, and used it to show that every arithmetical progression of the form $a + mk$, where $m > 0$, $(a,m) = 1$, and $k = 1, 2, \ldots$, contains infinitely many primes. We shall now prove that the Riemann zeta-function has the property that $\zeta(1+it) \ne 0$ for $t \ne 0$, and use it to prove the prime number theorem.

The prime number theorem is usually stated in the form
\[
\pi(x) \sim \frac{x}{\log x}, \tag{1}
\]
where $\pi(x)$ denotes the number of primes not exceeding $x$, and the symbol $\sim$ in (1) means that $\pi(x)/(x/\log x) \to 1$ as $x \to \infty$. Since we have seen in Chapter VII that (1) is equivalent to the relation
\[
\lim_{x \to \infty} \frac{\psi(x)}{x} = 1, \tag{2}
\]
where $\psi$ is Chebyshev's function, we shall prove the prime number theorem in this form. For this we need the relation
\[
-\frac{\zeta'(s)}{\zeta(s)} = s \int_{1}^{\infty} \frac{\psi(u)}{u^{s+1}}\,du, \tag{3}
\]
which we have proved in § 4, Chapter VII, for real $s > 1$, as a consequence of Abel's summation formula. By analytic continuation, (3) is valid for complex $s$ with real part $\sigma > 1$. (We write, as usual, $s = \sigma + it$, with $\sigma$, $t$ real, and $i^{2} = -1$.) If we substitute $u = e^{x}$ in (3), we get
\[
-\frac{\zeta'(s)}{s\,\zeta(s)} = \int_{0}^{\infty} \psi(e^{x})e^{-xs}\,dx, \qquad \sigma > 1, \tag{4}
\]
from which we shall deduce that $\psi(e^{x}) \sim e^{x}$, that is $\psi(x) \sim x$, as $x \to \infty$.

We have already seen that $\zeta(s)$ is analytic in the half-plane $\sigma > 0$, except for a simple pole at $s = 1$ with residue 1, and that $\zeta(s) \ne 0$ for $\sigma > 1$.
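Before turning to the line $\sigma = 1$, the two statements recalled in this section (Theorem 8 of Chapter X, and the form (1) of the prime number theorem) can be explored numerically. The Python sketch below is an illustration only: the helper names `primes_up_to`, `primes_by_residue` and `pnt_ratio`, and the choice of the moduli and bounds in the usage lines, are assumptions made here and are not part of the text. It counts the primes up to a bound in each reduced residue class modulo $m$, and evaluates the quotient $\pi(x)\log x / x$.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: the list of all primes <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Strike out the multiples p*p, p*p + p, ... of the prime p.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def primes_by_residue(limit, m):
    """Number of primes p <= limit in each class a (mod m) with (a, m) = 1."""
    counts = {a: 0 for a in range(m) if math.gcd(a, m) == 1}
    for p in primes_up_to(limit):
        a = p % m
        if a in counts:  # primes dividing m fall outside the reduced classes
            counts[a] += 1
    return counts


def pnt_ratio(x):
    """The quotient pi(x) * log(x) / x appearing in (1), for an integer x >= 2."""
    return len(primes_up_to(x)) * math.log(x) / x


if __name__ == "__main__":
    print(primes_by_residue(100, 4))
    print(primes_by_residue(10**6, 10))
    for x in (10**3, 10**4, 10**5, 10**6):
        print(x, round(pnt_ratio(x), 4))
```

The quotient in (1) approaches 1 only slowly, so a table of this kind suggests the theorem without giving any evidence about the rate of convergence.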
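We shall now prove that $\zeta(s) \ne 0$ on the line $\sigma = 1$.

THEOREM 1 (HADAMARD-DE LA VALLÉE POUSSIN). If $t \ne 0$, then $\zeta(1+it) \ne 0$.

PROOF. If $\sigma > 1$, then we have
\[
\zeta(s) = \prod_{p} (1 - p^{-s})^{-1},
\]
and if we take logarithms, then as in Chapter X,
\[
\log \zeta(s) = \sum_{m,p} \frac{1}{m\,p^{ms}}, \qquad \sigma > 1, \tag{5}
\]
where $m$ runs through all positive integers, and $p$ through all primes. Hence
\[
\log|\zeta(s)| = \operatorname{Re}\bigl(\log \zeta(s)\bigr) = \operatorname{Re}\Bigl(\sum_{m,p} \frac{1}{m\,p^{ms}}\Bigr).
\]
Now $\sum_{m,p} 1/(m\,p^{ms}) = \sum_{n=2}^{\infty} c_n/n^{s}$ is a Dirichlet series with coefficients
\[
c_n = \begin{cases} 1/m, & \text{if } n = p^{m}, \\ 0, & \text{otherwise}. \end{cases}
\]
Hence
\[
\log|\zeta(s)| = \operatorname{Re} \sum_{n=2}^{\infty} \frac{c_n}{n^{s}}.
\]
Since
\[
\frac{c_n}{n^{s}} = \frac{c_n}{n^{\sigma}}\,n^{-it} = \frac{c_n}{n^{\sigma}}\bigl(\cos(t\log n) - i\sin(t\log n)\bigr),
\]
it follows that
\[
\log|\zeta(s)| = \sum_{n=2}^{\infty} \frac{c_n}{n^{\sigma}}\cos(t\log n).
\]
Hence
\[
\log\bigl|\zeta^{3}(\sigma)\,\zeta^{4}(\sigma+it)\,\zeta(\sigma+2it)\bigr|
= 3\log|\zeta(\sigma)| + 4\log|\zeta(\sigma+it)| + \log|\zeta(\sigma+2it)|
\]
\[
= \sum_{n=2}^{\infty} \frac{c_n}{n^{\sigma}}\bigl(3 + 4\cos(t\log n) + \cos(2t\log n)\bigr) \ge 0,
\]
since $c_n \ge 0$, and
\[
3 + 4\cos\theta + \cos 2\theta = 2(1+\cos\theta)^{2} \ge 0, \tag{6}
\]
for real $\theta$. Hence
\[
\bigl|\zeta^{3}(\sigma)\,\zeta^{4}(\sigma+it)\,\zeta(\sigma+2it)\bigr| \ge 1,
\]
so that we have
\[
\bigl|(\sigma-1)\zeta(\sigma)\bigr|^{3} \cdot \Bigl|\frac{\zeta(\sigma+it)}{\sigma-1}\Bigr|^{4} \cdot \bigl|\zeta(\sigma+2it)\bigr| \ge \frac{1}{\sigma-1}. \tag{7}
\]
We shall show that the assumption that $\zeta(1+it) = 0$ for $t = t_0 \ne 0$ leads to a contradiction. For if we take $t = t_0$ in (7), and let $\sigma \to 1+0$, then the right-hand side tends to $\infty$, while the left-hand side tends to the limit $|\zeta'(1+it_0)|^{4} \cdot |\zeta(1+2it_0)|$, under the assumption that $\zeta(1+it_0) = 0$; and the limit is finite, since $\zeta(s)$ is analytic for $\sigma > 0$, $s \ne 1$. Hence $\zeta(1+it_0) \ne 0$, which proves the theorem.

§ 2. The Wiener-Ikehara theorem. We deduce the prime number theorem from the following

THEOREM 2 (WIENER-IKEHARA). Let $A(x)$ be a non-negative, non-decreasing function of $x$, defined for $0 \le x < \infty$. Let the integral
\[
\int_{0}^{\infty} A(x)e^{-xs}\,dx, \qquad s = \sigma + it,
\]
converge for $\sigma > 1$ to the function $f(s)$. Let $f(s)$ be analytic for $\sigma \ge 1$, except for a simple pole at $s = 1$ with residue 1. Then
\[
\lim_{x \to \infty} e^{-x}A(x) = 1.
\]

PROOF. We shall prove the theorem in two parts. Setting
\[
B(x) = e^{-x}A(x), \tag{8}
\]
we shall first prove that, for any $\lambda > 0$,
\[
\lim_{y \to \infty} \int_{-\infty}^{\lambda y} B\Bigl(y - \frac{v}{\lambda}\Bigr)\frac{\sin^{2} v}{v^{2}}\,dv = \pi. \tag{9}
\]
We shall then deduce from (9) that
\[
\lim_{x \to \infty} B(x) = 1. \tag{10}
\]

FIRST PART.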
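Since, for $\sigma > 1$, we have
\[
f(s) = \int_{0}^{\infty} A(x)e^{-xs}\,dx, \qquad \frac{1}{s-1} = \int_{0}^{\infty} e^{-(s-1)x}\,dx,
\]
it follows that
\[
f(s) - \frac{1}{s-1} = \int_{0}^{\infty} \bigl(B(x)-1\bigr)e^{-(s-1)x}\,dx \qquad (\sigma > 1).
\]
If we put
\[
g(s) = f(s) - \frac{1}{s-1}, \quad \text{and} \quad g_{\varepsilon}(t) = g(1+\varepsilon+it), \quad \varepsilon > 0,
\]
then $g(s)$ is analytic for $\sigma \ge 1$, because of the assumption on $f(s)$. For $\lambda > 0$, we have
\[
\frac12 \int_{-2\lambda}^{2\lambda} g_{\varepsilon}(t)\Bigl(1 - \frac{|t|}{2\lambda}\Bigr)e^{iyt}\,dt
= \frac12 \int_{-2\lambda}^{2\lambda} \Bigl(1 - \frac{|t|}{2\lambda}\Bigr)e^{iyt}\Bigl(\int_{0}^{\infty} \bigl(B(x)-1\bigr)e^{-(\varepsilon+it)x}\,dx\Bigr)dt. \tag{11}
\]
We wish to show that the order of integration in (11) can be interchanged. Since $A(x)$ is non-negative and non-decreasing, we have for real $s$, and $x > 0$,
\[
f(s) = \int_{0}^{\infty} A(u)e^{-us}\,du \ge A(x)\int_{x}^{\infty} e^{-us}\,du = \frac{A(x)e^{-xs}}{s},
\]
that is, $A(x) \le s f(s)e^{xs}$. Since $f(s)$ is analytic for $\sigma > 1$, it follows that $A(x) = O(e^{xs})$ for every $s > 1$, which implies that $A(x) = o(e^{xs})$ for every $s > 1$. Hence $B(x)e^{-\delta x} = A(x)e^{-(1+\delta)x} = o(1)$, for every $\delta > 0$. This implies that the integral
\[
\int_{0}^{\infty} \bigl(B(x)-1\bigr)e^{-(\varepsilon+it)x}\,dx
\]
converges uniformly in the interval $-2\lambda \le t \le 2\lambda$. Hence we can interchange the order of integration in (11), and obtain
\[
\frac12 \int_{-2\lambda}^{2\lambda} g_{\varepsilon}(t)\Bigl(1 - \frac{|t|}{2\lambda}\Bigr)e^{iyt}\,dt
= \int_{0}^{\infty} \bigl(B(x)-1\bigr)e^{-\varepsilon x}\Bigl(\frac12 \int_{-2\lambda}^{2\lambda} e^{i(y-x)t}\Bigl(1 - \frac{|t|}{2\lambda}\Bigr)dt\Bigr)dx
\]
\[
= \int_{0}^{\infty} \bigl(B(x)-1\bigr)e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx. \tag{12}
\]
Since $g(s)$ is analytic for $\sigma \ge 1$, it follows that $g_{\varepsilon}(t) \to g(1+it)$, as $\varepsilon \to 0$, uniformly in any interval $-2\lambda \le t \le 2\lambda$. Further
\[
\lim_{\varepsilon \to 0} \int_{0}^{\infty} e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx = \int_{0}^{\infty} \frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx.
\]
Hence the limit
\[
\lim_{\varepsilon \to 0} \int_{0}^{\infty} B(x)e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx
\]
exists. Further, since the integrand is non-negative, and monotone increasing as $\varepsilon \to 0$, we have
\[
\lim_{\varepsilon \to 0} \int_{0}^{\infty} B(x)e^{-\varepsilon x}\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx = \int_{0}^{\infty} B(x)\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx.
\]
Thus we get from (12)
\[
\frac12 \int_{-2\lambda}^{2\lambda} g(1+it)\Bigl(1 - \frac{|t|}{2\lambda}\Bigr)e^{iyt}\,dt
= \int_{0}^{\infty} B(x)\,\frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx - \int_{0}^{\infty} \frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx.
\]
Now, if we let $y \to \infty$, then by the Riemann-Lebesgue lemma, the left-hand side tends to zero, while on the right-hand side the second term gives
\[
\lim_{y \to \infty} \int_{0}^{\infty} \frac{\sin^{2}\lambda(y-x)}{\lambda(y-x)^{2}}\,dx = \lim_{y \to \infty} \int_{-\infty}^{\lambda y} \frac{\sin^{2} v}{v^{2}}\,dv = \pi;
\]
hence
\[
\lim_{y \to \infty} \int_{-\infty}^{\lambda y} B\Bigl(y - \frac{v}{\lambda}\Bigr)\frac{\sin^{2} v}{v^{2}}\,dv = \pi,
\]
which proves (9).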
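The value $\pi$ of the integral of $\sin^{2}v/v^{2}$ over the whole real line, used in the last step, can be checked numerically. The following Python sketch is an illustration under stated assumptions: the function name `sinc_squared_integral`, the trapezoidal rule, the step size and the truncation point are all choices made here, not taken from the text. Truncating the range to $[-T, T]$ discards at most $2\int_{T}^{\infty} v^{-2}\,dv = 2/T$, which bounds the error of the approximation from that source.

```python
import math


def sinc_squared_integral(half_width, step=0.01):
    """Trapezoidal approximation of the integral of (sin v / v)^2 over [-half_width, half_width]."""

    def integrand(v):
        # The integrand extends continuously to v = 0 with the value 1.
        return 1.0 if v == 0.0 else (math.sin(v) / v) ** 2

    n = int(round(2 * half_width / step))  # number of subintervals
    h = 2 * half_width / n                 # actual step length
    total = 0.5 * (integrand(-half_width) + integrand(half_width))
    for k in range(1, n):
        total += integrand(-half_width + k * h)
    return total * h


if __name__ == "__main__":
    for T in (10, 100, 1000):
        approx = sinc_squared_integral(T)
        print(f"T = {T:5d}: integral ~ {approx:.6f}, pi - integral ~ {math.pi - approx:.6f}")
```

The printed differences should shrink roughly like $1/T$, in line with the tail estimate above.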
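SECOND PART. We shall prove (10) in two steps, namely
\[
\limsup_{x \to \infty} B(x) \le 1, \tag{13}
\]
and
\[
\liminf_{x \to \infty} B(x) \ge 1. \tag{14}
\]
Given positive numbers $a$ and $\lambda$, let $y > a/\lambda$. Then, by (9), we have
\[
\limsup_{y \to \infty} \int_{-a}^{a} B\Bigl(y - \frac{v}{\lambda}\Bigr)\frac{\sin^{2} v}{v^{2}}\,dv \le \pi,
\]
because the integrand is non-negative. Further $A(u) = B(u)e^{u}$ is non-decreasing; hence, for $-a \le v \le a$, we have
\[
e^{y - a/\lambda}\,B\Bigl(y - \frac{a}{\lambda}\Bigr) \le e^{y - v/\lambda}\,B\Bigl(y - \frac{v}{\lambda}\Bigr),
\]
which implies that
\[
B\Bigl(y - \frac{v}{\lambda}\Bigr) \ge B\Bigl(y - \frac{a}{\lambda}\Bigr)e^{(v-a)/\lambda} \ge B\Bigl(y - \frac{a}{\lambda}\Bigr)e^{-2a/\lambda}.
\]
Hence
\[
\limsup_{y \to \infty} \int_{-a}^{a} B\Bigl(y - \frac{a}{\lambda}\Bigr)e^{-2a/\lambda}\,\frac{\sin^{2} v}{v^{2}}\,dv \le \pi,
\]
that is
\[
\limsup_{y \to \infty} B\Bigl(y - \frac{a}{\lambda}\Bigr) \cdot e^{-2a/\lambda} \int_{-a}^{a} \frac{\sin^{2} v}{v^{2}}\,dv \le \pi.
\]
For fixed numbers $a$ and $\lambda$ we have $\limsup_{y \to \infty} B(y - a/\lambda) = \limsup_{y \to \infty} B(y)$. Hence
\[
e^{-2a/\lambda}\,\limsup_{y \to \infty} B(y) \int_{-a}^{a} \frac{\sin^{2} v}{v^{2}}\,dv \le \pi
\]
for all $a > 0$, $\lambda > 0$. Now let $a \to \infty$ and $\lambda \to \infty$, in such a way that $a/\lambda \to 0$. Then
\[
\limsup_{y \to \infty} B(y) \int_{-\infty}^{\infty} \frac{\sin^{2} v}{v^{2}}\,dv \le \pi,
\]
or
\[
\pi \limsup_{y \to \infty} B(y) \le \pi,
\]
which proves (13).

We shall use (13) to prove (14), for (13) implies that $|B(x)| \le c$, for a suitable constant $c$, so that for fixed positive $a$ and $\lambda$, and a sufficiently large $y$, we have
\[
\int_{-\infty}^{\lambda y} B\Bigl(y - \frac{v}{\lambda}\Bigr)\frac{\sin^{2} v}{v^{2}}\,dv
\le c\Bigl[\int_{-\infty}^{-a} + \int_{a}^{\infty}\Bigr]\frac{\sin^{2} v}{v^{2}}\,dv + \int_{-a}^{a} B\Bigl(y - \frac{v}{\lambda}\Bigr)\frac{\sin^{2} v}{v^{2}}\,dv. \tag{15}
\]
As before, for $-a \le v \le a$, we have
\[
B\Bigl(y - \frac{v}{\lambda}\Bigr) \le B\Bigl(y + \frac{a}{\lambda}\Bigr)e^{(a+v)/\lambda} \le B\Bigl(y + \frac{a}{\lambda}\Bigr)e^{2a/\lambda},
\]
so that
\[
\int_{-a}^{a} B\Bigl(y - \frac{v}{\lambda}\Bigr)\frac{\sin^{2} v}{v^{2}}\,dv \le B\Bigl(y + \frac{a}{\lambda}\Bigr)e^{2a/\lambda} \int_{-a}^{a} \frac{\sin^{2} v}{v^{2}}\,dv. \tag{16}
\]
From (9), (15) and (16) we get
\[
\pi \le c\Bigl[\int_{-\infty}^{-a} + \int_{a}^{\infty}\Bigr]\frac{\sin^{2} v}{v^{2}}\,dv + \liminf_{y \to \infty} B\Bigl(y + \frac{a}{\lambda}\Bigr)e^{2a/\lambda} \int_{-a}^{a} \frac{\sin^{2} v}{v^{2}}\,dv.
\]
That is
\[
\pi \le c\Bigl[\int_{-\infty}^{-a} + \int_{a}^{\infty}\Bigr]\frac{\sin^{2} v}{v^{2}}\,dv + \liminf_{y \to \infty} B(y)\,e^{2a/\lambda} \int_{-a}^{a} \frac{\sin^{2} v}{v^{2}}\,dv.
\]
Now let $a \to \infty$, and $\lambda \to \infty$, such that $a/\lambda \to 0$. We then get
\[
\pi \le \pi \liminf_{y \to \infty} B(y),
\]
which proves (14), and hence also Theorem 2.

§ 3. The prime number theorem. If $\psi$ denotes Chebyshev's function (cf. Chapter VII), we take $A(x) = \psi(e^{x})$, and note that $\psi$ is non-decreasing and $\psi(e^{x}) \ge 0$. Relation (4) enables us to verify the other hypothesis of Theorem 2, for $\zeta(s)$ is analytic for $\sigma > 0$, except for a simple pole at $s = 1$, and, after Theorem 1, $\zeta(s)$ does not vanish in the half-plane $\sigma \ge 1$. Hence $\psi(e^{x}) \sim e^{x}$, or $\psi(x) \sim x$, as $x \to \infty$, which is the prime number theorem.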
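The relation $\psi(x) \sim x$ just obtained can be illustrated numerically. The Python sketch below is an assumption-laden illustration, not part of the proof: the function name `chebyshev_psi` and the sample values of $x$ are chosen here. It computes $\psi(x) = \sum_{p^{k} \le x} \log p$ with a simple sieve and prints the quotient $\psi(x)/x$.

```python
import math


def chebyshev_psi(x):
    """Chebyshev's function psi(x): the sum of log p over all prime powers p^k <= x."""
    limit = int(x)
    if limit < 2:
        return 0.0
    # Sieve of Eratosthenes for the primes up to limit.
    sieve = bytearray([1]) * (limit + 1)
    sieve[0] = sieve[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    total = 0.0
    for p in range(2, limit + 1):
        if sieve[p]:
            log_p = math.log(p)
            power = p
            while power <= limit:  # one contribution log p for each of p, p^2, p^3, ...
                total += log_p
                power *= p
    return total


if __name__ == "__main__":
    for x in (10**2, 10**3, 10**4, 10**5, 10**6):
        print(x, round(chebyshev_psi(x) / x, 5))
```

For small arguments the function can be checked by hand: the prime powers not exceeding 10 are 2, 4, 8, 3, 9, 5, 7, so $\psi(10) = 3\log 2 + 2\log 3 + \log 5 + \log 7 = \log 2520$.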
Thus the prime number theorem follows from the Wiener-Ikehara theorem, if we assume that $\zeta(1+it) \ne 0$ for $t \ne 0$. On the other hand, if we assume the prime number theorem, it is easy to deduce that $\zeta(1+it) \ne 0$ for $t \ne 0$. For let
\[
\Phi(s) = -\frac{\zeta'(s)}{s\,\zeta(s)} - \frac{1}{s-1} = \int_{1}^{\infty} \frac{\psi(x)-x}{x^{s+1}}\,dx, \qquad \sigma > 1.
\]
Then $\Phi(s)$ is regular for $\sigma > 0$, except for simple poles at the zeros of $\zeta(s)$. The prime number theorem implies that $\psi(x) = x + o(x)$, as $x \to \infty$. Hence, given $\varepsilon > 0$, there exists a number $x_0(\varepsilon)$, such that for $x \ge x_0(\varepsilon) > 1$, we have $|\psi(x) - x| < \varepsilon x$. Thus, for $\sigma > 1$, we have
\[
|\Phi(s)| < \int_{1}^{x_0} \frac{|\psi(x)-x|}{x^{2}}\,dx + \int_{x_0}^{\infty} \frac{\varepsilon}{x^{\sigma}}\,dx,
\]
and, since
\[
\int_{x_0}^{\infty} \frac{\varepsilon}{x^{\sigma}}\,dx < \int_{1}^{\infty} \frac{\varepsilon}{x^{\sigma}}\,dx = \frac{\varepsilon}{\sigma-1},
\]
we get
\[
|\Phi(s)| < K + \frac{\varepsilon}{\sigma-1}, \qquad \sigma > 1,
\]
where $K = K(x_0) = K(\varepsilon)$. Thus
\[
(\sigma-1)|\Phi(s)| < K(\sigma-1) + \varepsilon, \qquad \sigma > 1.
\]
If we now let $\sigma \to 1+0$, we get, for any fixed $t$,
\[
\lim_{\sigma \to 1+0} (\sigma-1)\,\Phi(\sigma+it) = 0. \tag{17}
\]
If $1+it$, for $t \ne 0$, were a zero of $\zeta(s)$, then the limit of $(\sigma-1)\Phi(\sigma+it)$, as $\sigma \to 1+0$, would be equal to the residue of $\Phi(s)$ at the simple pole $s = 1+it$, and therefore different from zero, which contradicts (17). Hence $\zeta(1+it) \ne 0$ for $t \ne 0$. Thus the assertion that $\zeta(1+it) \ne 0$ for $t \ne 0$ is 'equivalent' to the prime number theorem.

Another equivalent assertion is that $p_n \sim n\log n$, where $p_n$ denotes the $n$th prime, when the primes are arranged in natural order.

For, if
\[
\frac{\pi(x)\log x}{x} \to 1, \quad \text{as } x \to \infty,
\]
then
\[
\log \pi(x) + \log\log x - \log x \to 0,
\]
hence
\[
\frac{\log \pi(x)}{\log x} \to 1,
\]
so that
\[
\frac{\pi(x)\log \pi(x)}{x} \to 1,
\]
from which it follows that $p_n \sim n\log n$, if we take $x = p_n$. Conversely, if $x$ is defined by the inequality $p_n \le x < p_{n+1}$, and $p_n \sim n\log n$, then $p_{n+1} \sim (n+1)\log(n+1) \sim n\log n$, so that $x \sim n\log n$, or $x \sim y\log y$, where $y = \pi(x) = n$. That is, $\log x \sim \log y$, hence
\[
y \sim \frac{x}{\log x}.
\]

A list of books

L. E.
Dickson, History of the theory of numbers (Carnegie Institution, Washington), i (1919), ii (1920), iii (1923), reprinted (Chelsea, New York, 1952).
G. H. Hardy and E. M. Wright, An introduction to the theory of numbers (Oxford University Press, 1938, 2nd edition, 1945).
A. E. Ingham, The distribution of prime numbers (Cambridge University Press, 1932), reprinted (Stechert-Hafner, New York, 1964).
E. Landau, Handbuch der Lehre von der Verteilung der Primzahlen (2 volumes, Teubner, Leipzig, 1909), reprinted (Chelsea, New York, 1953).
J. V. Uspensky and M. A. Heaslet, Elementary number theory (McGraw-Hill, New York, 1939).
I. M. Vinogradov, An introduction to the theory of numbers (Pergamon Press, London, 1955).

Notes

Notes on Chapter I

As general references, see J. V. Uspensky and M. A. Heaslet, loc. cit., Chs. 1-6; and G. H. Hardy and E. M. Wright, loc. cit., Chs. 1-3.

§ 2. Theorem 2 was stated by Gauss, Disquisitiones Arithmeticae (1801), § 16, reprinted in his Werke, i (1863), 15. For what we call the "first proof of Theorem 2", reference may be made to E. Zermelo, Göttinger Nachrichten (new series), i (1934), 43-44. According to Zermelo, his proof dates from 1912. See also H. Hasse, J. für Math. 159 (1928), 3-6; and F. A. Lindemann, Quarterly J. Math. (Oxford), 4 (1933), 319-320.

§ 3. For the "second proof of Theorem 2", see E. Hecke, Vorlesungen über die Theorie der algebraischen Zahlen (1923), Ch. 1. What we have called a module of integers is simply a subgroup of the additive group of integers. For Theorem 6, see Euclid's Elements, book 7, prop. 30, given in T. L. Heath's The thirteen books of Euclid's Elements (Cambridge, 1926).

§ 5. Farey's name is associated with the Farey sequences because of Cauchy, who noticed J. Farey's statement of Theorem 7, without proof, in 1816, and published a proof himself. See A. Cauchy, Oeuvres, 2e série, tome 6, 146. Theorems 7 and 9 seem to have been first stated and proved by C. Haros in 1802.
See Dickson's History, loc. cit., i, 156. The following comment by C. L. Siegel on the proof of Theorem 7 may be of interest: "Let $kl - hm = 1$, $k > 0$, $m > 0$. The homogeneous linear substitution $\lambda = ka - hb$, $\mu = -ma + lb$ of the integer variables $a, b$ has the inverse $a = \lambda l + h\mu$, $b = m\lambda + k\mu$. Hence the conditions $h/k \le a/b \le l/m$, $b > 0$, $(a,b) = 1$, are satisfied if and only if $\lambda \ge 0$, $\mu \ge 0$, $\lambda + \mu > 0$, $(\lambda,\mu) = 1$, and then $b \le m + k$ exactly in the three cases $\lambda, \mu = 0, 1$; $1, 1$; $1, 0$. This is independent of the notion of $F_n$."

§ 6. For Theorem 12, see Euclid's Elements, book 9, prop. 20. For Pólya's proof of Theorem 13, see G. Pólya and G. Szegő, Aufgaben und Lehrsätze aus der Analysis (1925), ii, 133, 342. The remark about allowing $f_0 = 3$ is due to C. L. Siegel. The proof, by G. T. Bennett, of Euler's result that $f_5$ is divisible by 641, is given in the book by Hardy and Wright, loc. cit., 15. An alternative proof is given by Kraitchik, Théorie des nombres (Paris, 1926), ii, 221.

Notes on Chapter II

As general references, see Uspensky and Heaslet, loc. cit., Chs. 6, 7; Hardy and Wright, loc. cit., Ch. 5; and Vinogradov, loc. cit., Chs. 1, 2.

§ 1. The theory of congruences was developed by Gauss in his Disquisitiones Arithmeticae, loc. cit., though Fermat and Euler were perhaps aware of some of the main results.

§ 2. For Fermat's statement of Theorem 3, in 1640, see his Oeuvres, ii, 209. Euler proved Theorem 2 in 1760. See his Opera, (1), ii, 531. See also Dickson's History, loc. cit., i, Ch. 3.

§ 3. For Theorem 7, see Lagrange, Oeuvres (1868), ii, 667-9.

Notes on Chapter III

§ 2. For the proofs of Theorems 5 and 7, see, for instance, H. Rademacher, Lectures on elementary number theory (Blaisdell, New York, 1964), 33-35.

§ 3. For Theorem 6, see Lucas, Théorie des nombres (1891), i, 353-4.

§ 4. Theorem 9 is due to A. Hurwitz, Math. Annalen, 39 (1891), 279-284. The proof given here is due to A. Khinchin (= A. Khintchine), Math.
Annalen, 111 (1935), 631-637, and the author's attention was drawn to it by Raghavan Narasimhan. In the author's Einführung in die Analytische Zahlentheorie, Springer Lecture Notes, 29 (1966), Ch. 3, a different proof was sketched, which originated with L. R. Ford, American Math. Monthly, 45 (1938), 586-601.

Notes on Chapter IV

As general references, see Uspensky and Heaslet, Hardy and Wright, and Vinogradov, loc. cit.

§ 1. For the introduction of the Legendre symbol, see Legendre, Essai sur la théorie des nombres (1798), 2nd edition (1808), § 135. We do not consider the case $p = 2$, since all integers are quadratic residues modulo 2.

§ 2. The first published proof (1773) of Wilson's theorem is due to Lagrange, Oeuvres, iii, 425. The theorem was first stated by Waring, Meditationes algebraicae (1770), 218, and attributed to J. Wilson. Hardy and Wright say that "there is evidence that it was known long before to Leibniz".

§ 3. Theorems 5, 6, 7 can be found in Hardy and Wright's book, loc. cit., 70, 297. The proof of Theorem 7 given here is due to Hermite, Journal de Math. (1), 13 (1848), 15; Oeuvres, i, 264.

§ 4. Waring stated without proof that every positive integer is a sum of four squares, Meditationes algebraicae (1770), 204-5, and Lagrange proved it the same year, see his Oeuvres, iii, 189. See also Dickson's History, loc. cit., ii, Ch. 8.

Notes on Chapter V

§ 1. Theorem 1, though stated by Euler, and partly proved by Legendre, was completely proved by Gauss in 1795. See P. Bachmann, Niedere Zahlentheorie (1902), i, Ch. 6, where several proofs are described.

§§ 2-3. The idea of proving Theorem 1 by means of a reciprocity formula for Gaussian sums goes back to Kronecker, Monatsber. Kgl. Preuss. Akad. Wiss. Berlin (1880), 686-698; 854-860; J. für die reine und angewandte Math. 105 (1889), 267-268; Werke (1929), iv, 278-300.
[There is, however, a reference to a paper by Schaar (1848), on the reciprocity formula for Gaussian sums, in Lindelöf's Calcul des Résidus, p. 68, as pointed out by C. L. Siegel.] It was extended to algebraic number fields by E. Hecke, Göttinger Nachrichten (1919), 265-278; Werke, 235-248; and by C. L. Siegel, Göttinger Nachrichten (1960), 1-16; Ges. Abhandlungen (1966), iii, 334-349. The proof given here is, in substance, Siegel's. The integral, used in the proof of Theorem 2, is of importance in the theory of the zeta-function of Riemann. See C. L. Siegel, Quellen und Studien zur Geschichte der Math. 2 (1932), 45-80; Gesammelte Abhandlungen (1966), i, 275. For the evaluation of ordinary Gaussian sums by contour integration, see also L. J. Mordell, Messenger of Math. 48 (1919), 54-56.

The deduction of (14) from (12) is slightly shorter here than in the author's Lecture Notes (loc. cit., Notes, Ch. 3), as a result of a comment by C. L. Siegel. Since $g(-m,-n) = g(m,n)$, the case $m < 0$, $n > 0$ can be reduced to the case $m > 0$, $n < 0$.

Relation (21) shows that $-1$ is a quadratic residue of primes $\equiv 1 \pmod 4$, and a quadratic non-residue of primes $\equiv 3 \pmod 4$.

§ 4. Theorem 3 is due to Euler, Opera, (1), iii, 240. For the example and the remark, see Rademacher (loc. cit., Notes, Ch. 3), 74, 82.

Notes on Chapter VI

As general references, see Hardy and Wright, loc. cit., Chs. 16-18, and Vinogradov, loc. cit.

§ 2. The statement that $r(n) = O(n^{\varepsilon})$, for every $\varepsilon > 0$, is equivalent to the statement that $r(n) = o(n^{\varepsilon})$, for every $\varepsilon > 0$. For Theorem 1, see Gauss, Werke, (ii), 272-5.

§ 3. For the proof of Theorem 4, see Pólya and Szegő (loc. cit., Notes, Ch. 1), ii, 160-1, 386. For Theorems 5 and 6, see Hardy and Wright, loc. cit., 259. Theorem 9 was proved by Dirichlet in 1849, see his Werke, ii, 49-66. G. Voronoi's improvement of the error-term is given in Ann. Sci. École Norm. Sup. (3), 21 (1904), 207-267; 459-533. That the error-term is not $O(N^{1/4})$ was proved by Hardy, Proc.
London Math. Soc. (2) 15 (1916), 192-213.

§ 4. For the history of Mersenne numbers, and of perfect numbers, see Dickson, loc. cit., i, Chs. 1-2.

§ 5. For Theorems 15 and 19, see A. F. Möbius, J. für die reine und angewandte Math. 9 (1832), 105-123; Werke (1887), iv, 589-612. See also Landau's Handbuch, loc. cit., §§ 150-152. Theorems 16 and 17 were proved by Dedekind, J. für die reine und angewandte Math. 54 (1857), 21, and by Liouville, J. de Math. pures et appliquées, (2) 2 (1857), 111, at about the same time.

§ 6. For Theorem 20, see Landau's Handbuch, loc. cit., § 59. Theorem 22 is due to F. Mertens, J. für die reine und angewandte Math. 77 (1874), 290-291, and is given in Landau's book, § 152.

The evaluation of $\sum_{n=1}^{\infty} \mu(n)n^{-2}$, without the use of Euler's identity (proved later in Chapter VII, § 4) is a result of a comment by Raghavan Narasimhan. For a proof of the formula $\sum_{n=1}^{\infty} n^{-2} = \pi^{2}/6$, see, for instance, K. Knopp, Theory and application of infinite series (1951), 237, 323, 376.

Notes on Chapter VII

As general references, see Landau's Handbuch, loc. cit., §§ 12-28, and Ingham's book, loc. cit., Ch. 1.

§ 1. For Theorem 1, see Euler, Opera (1), 8, § 279; (1), 14, 216-244.

§ 2. Theorem 3 is due to Chebyshev, Oeuvres, i, 49-70.

§ 3. S. S. Pillai's proof of Theorem 4 is given in Bull. Calcutta Math. Soc. 36 (1944), 97-99; 37 (1944), 27. See also Landau's Handbuch, loc. cit., § 22.

§ 4. Theorem 7 is due to Chebyshev, Oeuvres, i, 27-48. See Ingham's book, loc. cit., 16-21. Euler used the formal identity.

§ 5. Theorem 8 is due to F. Mertens, J. für die reine und angewandte Math. 78 (1874), 46-62. See Ingham's book, loc. cit., 22. Stirling's formula is given, for instance, in the book by E. C. Titchmarsh, The theory of functions (Oxford, 1932), 2nd edition (1939), § 1.87.

Notes on Chapter VIII

§§ 1-4. Weyl's theorems were proved by him in Math. Annalen, 77 (1916), 313-352.
An exposition using the notion of "discrepancy" is given by J. W. S. Cassels, An introduction to Diophantine approximation (Cambridge, 1957), Ch. 4.

§ 5. Kronecker proved his theorem in the Berliner Sitzungsberichte (1884); see his Werke, iii (i), 47-110. For further developments, see J. F. Koksma, Diophantische Approximationen, Ergebnisse der Math. Band iv, Heft 4 (1937). H. Bohr's proof of Theorem 8 is given in J. London Math. Soc. 9 (1934), 5-6. See also Hardy and Wright, Ch. 23.

Notes on Chapter IX

As general references, see Minkowski's Geometrie der Zahlen, 1st edition (1896), and his Diophantische Approximationen (1927). See also the Lecture Notes on the Geometry of Numbers by C. L. Siegel (New York University, 1945).

§ 2. Theorem 1 is true without the assumption that the set $S$ is bounded. For if it is unbounded, with measure $V(S) > 2^{n}$, one can take its intersection with a cube $K_M$ given by $|x_k| < M$, $1 \le k \le n$, and if $M$ is sufficiently large, then $S_M = S \cap K_M$ will be a bounded set satisfying the required conditions, because of the countable additivity of Lebesgue measure.

We do not seek to give the optimal hypotheses here since we do not wish to go into questions of measurability in greater detail. The formulation and proof of Theorem 3 support this line.

Minkowski proved Theorem 3 in 1891, see his Gesammelte Abhandlungen, i, 264. Siegel's proof of his formula (8) is given in Acta Math. 65 (1935), 307-323.

The lemma which appears between Theorems 2 and 3 is due to G. D. Birkhoff, as stated by Blichfeldt, Trans. American Math. Soc. 15 (1914), 230. See also an Appendix in Cassels's book (loc. cit., Notes, Ch. 8).

In Theorem 2 we use the fact that a closed set in $\mathbb{R}^{n}$ is Lebesgue measurable.

Minkowski (loc. cit.) shows that a bounded convex set in $\mathbb{R}^{n}$ has a volume in the sense of Jordan. See his Geometrie der Zahlen, 50-60; also his Theorie der konvexen Körper, Ges. Abh. 2, 142-143; Blaschke's Kreis und Kugel (Leipzig, 1916), 57.
If a convex set $S$ has Lebesgue measure $V(S)$, $0 < V(S) < \infty$, then it is bounded. See J. W. S. Cassels, An introduction to the geometry of numbers (Springer, 1959), 109.

Notes on Chapter X

As a general reference, see Landau's Handbuch, loc. cit., §§ 95-103. See also C. L. Siegel, Lectures on Analytic Number Theory (New York University, 1945).

§ 5. The main theorem of this chapter, namely Theorem 8, was first proved by Dirichlet in 1837, see his Werke (i), 307-342. An elementary proof was given by Mertens, Wiener Sitzungsberichte, 106 (1897), 254-286. A proof of the theorem by a new elementary method is due to A. Selberg, Annals of Math. (2) 50 (1949), 297-304; Canadian J. of Math. 2 (1950), 66-78. Another elementary proof is due to H. Zassenhaus, Comm. Math. Helvetici, 22 (1949), 232-259.

Notes on Chapter XI

As a general reference, see Landau's Handbuch (loc. cit.) including the Appendix by P. T. Bateman, pp. 929-931, which gives a history of the proof of the prime number theorem as an asymptotic relation.

The idea of connecting the behaviour of $\pi(x)$ with the properties of $\zeta(s)$, where $s$ is a complex variable, and, in particular, with the location of its non-real zeros, originated with Riemann, Über die Anzahl der Primzahlen unter einer gegebenen Größe, Monatsberichte der Preuss. Akad. der Wissenschaften (Berlin, 1859-1860), 671-680; Werke (1st edition, 1876), 136-144; (2nd edition, 1892), 145-155.

§ 1. The first proof of the prime number theorem was given by J. Hadamard, Bull. de la Soc. Math. de France, 24 (1896), 199-220; and by C.-J. de la Vallée Poussin, Annales de la Soc. sci. de Bruxelles, 20, 2 (1896), 183-256. For a clear presentation of the classical proof, see Ingham's book (loc. cit.), Ch. 2.

§ 2. For the theorem of Wiener-Ikehara, see S. Ikehara, J. Math. Phys. Mass. Inst. Tech. 10 (1931), 1-12; N. Wiener, Annals of Math. 33 (1932), 1-100, 787; and N. Wiener, The Fourier integral (Cambridge, 1933), § 19.
The theorem is true with weaker hypotheses, but for the deduction of the prime number theorem, it does not make much difference what form we consider.

The proof of the Wiener-Ikehara theorem given here, which does not use Wiener's general Tauberian theorem, is substantially that of S. Bochner, Math. Zeit. 37 (1933), 1-9, as simplified by E. Landau, Berliner Sitzungsberichte (1932), 514-521, and by Bochner in his Lectures on Fourier Analysis (Princeton University, 1936). It is the same as the proof given in the author's Lectures on the Riemann zeta-function (Tata Institute of Fundamental Research, Bombay, 1953).

A proof of the prime number theorem by a new elementary method has been given by A. Selberg, Annals of Math. (2) 50 (1949), 305-313.

Subject index

Abel's summation formula 78
arithmetical function 14; completely multiplicative - 76; multiplicative - 14; - r(n) 45; - R(N) 46; - d(n) 47; - D(N) 50; - σ(n) 54; - μ(n) 55; - Λ(n) 57; - φ(n) 13; - Φ(N) 59; - π(n) 63; - ϑ(n) 64; - ψ(n) 64
Bertrand's postulate 71
Birkhoff's lemma 100
Character: - of an abelian group 107; - modulo m 110; principal - 107, 110
Chebyshev's: - functions 64; - inequality 74; - lemma 68; - theorem 67
composite number 1
congruences 11; sum, difference, product of - 11
Congruent: - modulo m 11; - modulo 1 84
Convergence: abscissa of - 113; abscissa of absolute - 114; half-plane of - 113; line of - 113; strip of conditional - 114
Dirichlet's: - formula for D(N) 53; - theorems 84, 120; - L-functions 117
Dirichlet series 78, 111; coefficients of - 78, 111; formal product of - 116; product of - 116; uniqueness of - 116
discrepancy 85; - modulo 1 87
divisibility 1
divisor 1; greatest common - 3; - function 47
Euclid's theorem 4
Euler's: - constant 51; - criterion 27; - function φ 13; - identity 76; - theorem 13
Farey: - fraction 6; - sequence 6
Fermat number 10
Fermat's theorem 13
fraction 6; irreducible - 6; proper - 6; reduced - 6
fractional part 84
Gaussian sum 34; generalized - 34
group: abelian - 107;
cyclic107; generator of 107 Hadamard's theorem 123 Hurwitz's theorem 23; Khinchin's proof of- 23 integral part 18 interval function 85 Kronecker's theorem 91; proof of 93 Bohr's Lagrange's theorem: - on congruences 16; sums of squares 31 Landau's theorem 115 lattice 101; determinant of a 101 lattice point 45; - function r(n) 45 Legendre symbol 26 linearly independent 91 von Mangoldt's function 57 mediant 8 Mersenne: - number 55; -prime 55 Mertens's: - formulae 81; - theorem 59 Minkowski's theorem 98; Siegel's proof of 98 Mobius's function 55 Mobius inversion formula: first - 56; second 58 module 3; the trivial- 4 1; least multiple 1; integral common- 5 Orthogonality relations Polya's theorem 10 perfect number 54 110 140 Subject index prime: - number 1; - residue class 12; relatively 3 prime number theorem 122 principal character 107, 110 quadratic: - residue 26; - nonresidue 26; - reciprocity law 34 quadratic form: positive definite104; determinant of a - I 04 quotient 1 remainder 1 representation: imprimitive 30; primitive 30 residue class 11 ; prime 12 residue system: complete 12; complete prime 12 Riemann's zeta-function 107 set: convex 97; symmetric 97; translate of a 97 Siegel's formula 99 standard form 2 Stirling's formula 81 summatory function 45 uniformly distributed 85; -modulo 1 86 unique factorization theorem 2 de la Vallee Poussin's theorem Weyl's theorems 87 Wiener-Ikehara theorem 124 Wilson's theorem 27 123 Die Grundlehren der mathematischen Wissenschaften in Einzeldarstellungen mit besonderer Berticksichtigung der Anwendungsgebiete Lieferbare Biinde: 2. 3. 4. 10. 14. 15. 16. 19. 20. 22. 26. 27. 30. 31. 32. 38. 40. 50. 52. 57. 58. 59. 60. 61. 62. 64. 65. 66. 68. 69. 71. Knopp: Theorie und Anwendung der unendlichen Reihen. DM 48,-; US $ 12.00 Hurwitz: Vorlesungen iiber allgemeine Funktionentheorie und elliptische Funktionen. DM 49,-; US $ 12.25 Madelung: Die mathematischen Hilfsmittel des Physikers. DM 49,70; US $ 12.45 Schouten: Ricci-Calculus. 