Least fixed point of a monotone function
5. Advanced Mathematics
This part surveys parts of Mathlib that go beyond first-year mathematics: order theory and lattices, the algebraic hierarchy, filters as a unifying language for limits, topology and measure theory, and discrete probability. The aim is not to redo the mathematics from scratch but to show how Mathlib organizes it -- which typeclasses to use, where to find the lemmas, and what the design trade-offs were.
Each chapter starts with a short notation table so the symbols are
introduced before they appear, and uses Verso's {docstring} block
to render Mathlib definitions and lemmas inline so the notes stay in
sync with the library.
5.1. Orders, Lattices, and Complete Lattices
In this chapter we explore how Mathlib formalizes the fundamental notions of order theory. Ordered structures pervade all of mathematics: partial orders appear in set inclusion, divisibility of natural numbers, and subgroup relations; lattices arise whenever we can form meets and joins; and complete lattices are essential for fixed-point theorems and topology.
5.1.1. Notation and naming conventions
Several symbols below are unicode characters typed in VS Code via a
backslash escape (e.g. \le produces ≤). Hover over a symbol in the
editor to see how to type it.
Symbol | Lean name | Reads as | Typed as |
|---|---|---|---|
|
| "a less than or equal to b" |
|
|
| "a strictly less than b" |
|
|
| "s is a subset of t" |
|
|
| "a join b" / "a sup b" |
|
|
| "a meet b" / "a inf b" |
|
|
| "top" |
|
|
| "bottom" |
|
|
| "supremum of the set s" | (ASCII) |
|
| "infimum of the set s" | (ASCII) |
|
| "indexed supremum" | (ASCII) |
|
| "indexed infimum" | (ASCII) |
|
| "order homomorphism α to β" |
|
Naming hints:
-
The prefix
sinsSup,sInfstands for set: it takes aSet α. -
The prefix
iiniSup,iInfstands for indexed: it takes a function. -
The bare operators
⊔and⊓are binary;sSup,sInf,iSup,iInfare their set/indexed counterparts. -
Lemma naming follows the convention
le_sup_left,inf_le_right,le_antisymm, etc.: the conclusion is named first, then the hypotheses in order.
5.1.2. Partial orders
A partial order on a type α is a reflexive, transitive, and antisymmetric
relation ≤. In Mathlib this is captured by the typeclass PartialOrder α,
which extends Preorder α (reflexive and transitive) with antisymmetry.
Key lemmas:
-
le_refl a : a ≤ a -
le_trans : a ≤ b → b ≤ c → a ≤ c -
le_antisymm : a ≤ b → b ≤ a → a = b
A linear order (LinearOrder α) additionally satisfies totality: for every
a b : α, either a ≤ b or b ≤ a. The natural numbers, integers, and
reals all carry linear orders.
#check (inferInstance : LinearOrder ℕ)
-- Every linear order is a partial order:
#check (inferInstance : PartialOrder ℕ)
-- A simple proof using transitivity
example (a b c : ℕ) (hab : a ≤ b) (hbc : b ≤ c) : a ≤ c :=
le_trans hab hbc
-- Antisymmetry
example (a b : ℕ) (hab : a ≤ b) (hba : b ≤ a) : a = b :=
le_antisymm hab hba
The strict order < is defined in terms of ≤:
a < b ↔ a ≤ b ∧ a ≠ b (or equivalently a ≤ b ∧ ¬ b ≤ a).
example (a b : ℕ) (h : a < b) : a ≤ b :=
le_of_lt h
example (a b : ℕ) (h : a < b) : a ≠ b :=
ne_of_lt h
5.1.3. The power set as a partial order
For any type α, the type Set α carries a partial order given by set
inclusion ⊆. This is a canonical example of a partial order that is not
linear in general.
example (A B C : Set ℕ) (hAB : A ⊆ B) (hBC : B ⊆ C) : A ⊆ C :=
Set.Subset.trans hAB hBC
example (A B : Set ℕ) (hAB : A ⊆ B) (hBA : B ⊆ A) : A = B :=
Set.Subset.antisymm hAB hBA
5.1.4. Lattices
A lattice is a partial order in which every pair of elements has a least upper bound (supremum, join) and a greatest lower bound (infimum, meet).
In Mathlib:
-
⊔(typed\sup) denotes the join (supremum of two elements). -
⊓(typed\inf) denotes the meet (infimum of two elements). -
The typeclass is
Lattice α.
Key API:
-
le_sup_left : a ≤ a ⊔ b -
le_sup_right : b ≤ a ⊔ b -
sup_le : a ≤ c → b ≤ c → a ⊔ b ≤ c -
inf_le_left : a ⊓ b ≤ a -
inf_le_right : a ⊓ b ≤ b -
le_inf : c ≤ a → c ≤ b → c ≤ a ⊓ b
example (A B : Set ℕ) : A ⊆ A ∪ B :=
Set.subset_union_left
example (A B C : Set ℕ) (hA : C ⊆ A) (hB : C ⊆ B) : C ⊆ A ∩ B :=
Set.subset_inter hA hB
-- In a general lattice:
example {α : Type*} [Lattice α] (a b c : α) (ha : a ≤ c) (hb : b ≤ c) :
a ⊔ b ≤ c :=
sup_le ha hb
example {α : Type*} [Lattice α] (a b : α) : a ⊓ b ≤ a :=
inf_le_left
5.1.5. Complete lattices
A complete lattice is a partial order in which every subset has a supremum
and an infimum (not just pairs). In Mathlib the typeclass is
CompleteLattice α.
The operations are:
-
sSup S: the supremum of a setS : Set α -
sInf S: the infimum of a setS : Set α -
⊤(\top) : the greatest element (=sSup Set.univ) -
⊥(\bot) : the least element (=sInf Set.univ) -
iSupandiInffor indexed suprema and infima.
#check (inferInstance : CompleteLattice (Set ℕ))
-- sSup of a family of sets is their union
example (S : Set (Set ℕ)) : sSup S = ⋃₀ S :=
rfl
-- Every element of the family is below the sSup
example {α : Type*} [CompleteLattice α] (s : Set α) (a : α) (ha : a ∈ s) :
a ≤ sSup s :=
le_sSup ha
5.1.6. Monotone and antitone functions
A function f : α → β between preorders is monotone if a ≤ b → f a ≤ f b,
and antitone if a ≤ b → f b ≤ f a. Mathlib provides Monotone f and
Antitone f.
example : Monotone (fun n : ℕ ↦ n + 1) :=
fun _ _ h ↦ Nat.add_le_add_right h 1
-- A constant function is both monotone and antitone
example {α β : Type*} [Preorder α] [Preorder β] (c : β) :
Monotone (fun _ : α ↦ c) :=
fun _ _ _ ↦ le_refl c
There is also StrictMono f for strictly increasing functions.
5.1.7. The Knaster-Tarski fixed point theorem
The Knaster-Tarski theorem is one of the cornerstones of fixed point theory. It generalizes Tarski's earlier fixed point theorem for the real line and has applications in logic, denotational semantics, and set theory (e.g. the Schröder-Bernstein theorem).
Theorem (Knaster-Tarski). Let α be a complete lattice and let
f : α → α be a monotone function. Then:
-
fhas a least fixed point, given bysInf {x | f x ≤ x}. -
fhas a greatest fixed point, given bysSup {x | x ≤ f x}. -
Moreover, the set of fixed points of
fis itself a complete lattice under the induced order.
The proof of the first part is short and instructive. Let
S = {x | f x ≤ x} (the prefixed points) and set a = sInf S.
Since f is monotone and a ≤ x for every x ∈ S, we get
f a ≤ f x ≤ x for all x ∈ S, so f a is a lower bound of S,
hence f a ≤ a. Thus a ∈ S. Applying f monotonically gives
f (f a) ≤ f a, so f a ∈ S, hence a ≤ f a. Antisymmetry yields
f a = a, and any other fixed point y = f y belongs to S, so
a ≤ y.
In Mathlib this theorem is available through the OrderHom API. The
least fixed point is OrderHom.lfp and the greatest is OrderHom.gfp:
The least fixed point and the key lemmas:
OrderHom.lfp.{u} {α : Type u} [CompleteLattice α] : (α →o α) →o αOrderHom.lfp.{u} {α : Type u} [CompleteLattice α] : (α →o α) →o α
OrderHom.map_lfp.{u} {α : Type u} [CompleteLattice α] (f : α →o α) : f (OrderHom.lfp f) = OrderHom.lfp fOrderHom.map_lfp.{u} {α : Type u} [CompleteLattice α] (f : α →o α) : f (OrderHom.lfp f) = OrderHom.lfp f
OrderHom.lfp_le.{u} {α : Type u} [CompleteLattice α] (f : α →o α) {a : α} (h : f a ≤ a) : OrderHom.lfp f ≤ aOrderHom.lfp_le.{u} {α : Type u} [CompleteLattice α] (f : α →o α) {a : α} (h : f a ≤ a) : OrderHom.lfp f ≤ a
Dually, the greatest fixed point:
OrderHom.gfp.{u} {α : Type u} [CompleteLattice α] : (α →o α) →o αOrderHom.gfp.{u} {α : Type u} [CompleteLattice α] : (α →o α) →o α
Greatest fixed point of a monotone function
OrderHom.map_gfp.{u} {α : Type u} [CompleteLattice α] (f : α →o α) : f (OrderHom.gfp f) = OrderHom.gfp fOrderHom.map_gfp.{u} {α : Type u} [CompleteLattice α] (f : α →o α) : f (OrderHom.gfp f) = OrderHom.gfp f
OrderHom.le_gfp.{u} {α : Type u} [CompleteLattice α] (f : α →o α) {a : α} (h : a ≤ f a) : a ≤ OrderHom.gfp fOrderHom.le_gfp.{u} {α : Type u} [CompleteLattice α] (f : α →o α) {a : α} (h : a ≤ f a) : a ≤ OrderHom.gfp f
Note that α →o α is the type of monotone self-maps of α (the arrow
→o denotes an OrderHom). A useful small example:
example {X : Type*} :
OrderHom.lfp
⟨(fun _ : Set X ↦ (∅ : Set X)),
fun _ _ _ ↦ le_refl _⟩
= ∅ := X:Type u_1⊢ OrderHom.lfp { toFun := fun x => ∅, monotone' := ⋯ } = ∅
X:Type u_1⊢ OrderHom.lfp { toFun := fun x => ∅, monotone' := ⋯ } ≤ ∅X:Type u_1⊢ ∅ ≤ OrderHom.lfp { toFun := fun x => ∅, monotone' := ⋯ }
X:Type u_1⊢ OrderHom.lfp { toFun := fun x => ∅, monotone' := ⋯ } ≤ ∅ All goals completed! 🐙
X:Type u_1⊢ ∅ ≤ OrderHom.lfp { toFun := fun x => ∅, monotone' := ⋯ } All goals completed! 🐙
Applications.
-
Recursive definitions. The semantics of recursion in programming languages is given by least fixed points of monotone functionals on a domain of partial functions.
-
The Schröder-Bernstein theorem. Given injections
f : α → βandg : β → α, one constructs a bijection by taking the least fixed point of a certain monotone operator onSet α. -
Inductively defined sets. Many sets in mathematics (the natural numbers, syntactic terms, well-founded relations) are defined as least fixed points of monotone operators.
5.1.8. Summary of key API
Here is a quick reference for the most important order-theoretic lemmas:
Lemma | Statement |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The omega tactic is very useful for goals involving natural number and integer
inequalities. For more general ordered structures, gcongr can help with
monotonicity goals.
5.2. The Algebraic Hierarchy in Mathlib
One of the great achievements of Mathlib is a carefully designed hierarchy of algebraic structures, encoded using Lean's typeclass system. In this section we explore this hierarchy from the ground up: from basic operations to groups, rings, fields, and their morphisms and substructures.
5.2.1. Notation and naming conventions
Unicode symbols can be typed via a backslash escape (e.g. \inv produces
⁻¹). Hover over a symbol in VS Code to see how to type it.
Symbol | Lean name | Reads as | Typed as |
|---|---|---|---|
|
| "a times b" | (ASCII) |
|
| "a plus b" | (ASCII) |
|
| "zero" | (ASCII) |
|
| "one" | (ASCII) |
|
| "a inverse" |
|
|
| "minus a" | (ASCII) |
|
| "a divided by b" | (ASCII) |
|
| "monoid homomorphism" |
|
|
| "additive monoid homomorphism" |
|
|
| "ring homomorphism" |
|
|
| "R modulo I" |
|
|
| "the trivial sub-object" |
|
|
| "the whole object as a sub-object" |
|
Naming hints.
-
Additive versus multiplicative. Mathlib systematically duplicates algebraic notions into an additive (
Add,+,0,-a) and a multiplicative (Mul,*,1,a⁻¹) version. Lemmas for the additive side are prefixed withadd_:add_assoc,add_comm,add_zero,neg_add_cancel, etc. -
"Class" suffix.
MulOneClass,AddZeroClassare the bare algebraic laws for unit elements; they are weaker than a full monoid. -
Bundled morphisms.
→*,→+,→+*are bundled: they package a function together with proofs that it preserves the relevant operations. Lemmas are namedmap_mul,map_add,map_one,map_zero,map_pow,map_neg. -
Subgroups and ideals.
Subgroup G,Subring R,Ideal Rare bundled sub-objects with membershipa ∈ H, closed under the relevant operations (H.mul_mem,H.inv_mem,H.add_mem, etc.).
5.2.2. From operations to monoids
At the bottom of the hierarchy sit the basic operation typeclasses:
-
Mul αprovides a multiplication*onα. -
Add αprovides an addition+onα. -
One αprovides a distinguished element1 : α. -
Zero αprovides a distinguished element0 : α.
These are combined step by step:
-
MulOneClass α: a type with*and1, satisfying1 * a = aanda * 1 = a. -
Monoid α: aMulOneClasswith associativity of*. -
AddMonoid α: the additive version (with+and0). -
CommMonoid α: a monoid where*is commutative.
#check (inferInstance : AddCommMonoid ℕ)
-- Associativity is available as a lemma
example (a b c : ℕ) : a + b + c = a + (b + c) :=
add_assoc a b c
-- The identity element
example (a : ℕ) : a + 0 = a :=
add_zero a
5.2.3. Groups
A group is a monoid where every element has an inverse.
-
Group α: multiplicative group (inverse writtena⁻¹, typed\invor\-1). -
AddGroup α: additive group (inverse written-a). -
CommGroup α,AddCommGroup α: commutative (abelian) versions.
#check (inferInstance : AddCommGroup ℤ)
-- Cancellation
example (a b : ℤ) : a + b - b = a := a:ℤb:ℤ⊢ a + b - b = a
All goals completed! 🐙
-- In a general group
example {G : Type*} [Group G] (a : G) : a * a⁻¹ = 1 :=
mul_inv_cancel a
example {G : Type*} [Group G] (a b : G) :
(a * b)⁻¹ = b⁻¹ * a⁻¹ :=
mul_inv_rev a b
5.2.4. Rings and fields
A ring has both addition (forming an abelian group) and multiplication (forming a monoid), linked by distributivity.
-
Ring α: a (unital) ring. -
CommRing α: a commutative ring. -
Field α: a field (commutative ring where every nonzero element is invertible).
#check (inferInstance : Field ℝ)
#check (inferInstance : CommRing ℝ)
-- Distributivity
example (a b c : ℝ) : a * (b + c) = a * b + a * c :=
mul_add a b c
-- Division in a field
example (a : ℝ) (ha : a ≠ 0) : a * a⁻¹ = 1 :=
mul_inv_cancel₀ ha
Mathlib also provides ordered variants:
-
OrderedRing α: a ring with a compatible order. -
LinearOrderedField α: a linearly ordered field (likeℝorℚ).
#check (inferInstance : LinearOrder ℝ)
#check (inferInstance : IsStrictOrderedRing ℝ)
5.2.5. Morphisms
Algebraic morphisms are functions that preserve the relevant structure. Mathlib uses bundled morphisms (structures carrying both the function and the proof of preservation).
-
MonoidHom(notation→*) : maps preserving*and1. -
AddMonoidHom(notation→+) : maps preserving+and0. -
RingHom(notation→+*) : maps preserving both+,*,0,1.
#check (Int.castRingHom ℚ : ℤ →+* ℚ)
-- A ring homomorphism preserves addition
example (f : ℤ →+* ℚ) (a b : ℤ) : f (a + b) = f a + f b :=
f.map_add a b
-- A ring homomorphism preserves multiplication
example (f : ℤ →+* ℚ) (a b : ℤ) : f (a * b) = f a * f b :=
f.map_mul a b
-- A ring homomorphism maps 0 to 0
example (f : ℤ →+* ℚ) : f 0 = 0 :=
f.map_zero
5.2.6. Substructures
Mathlib provides types for substructures of algebraic objects:
-
Subgroup G: a subgroup ofG. -
Subring R: a subring ofR. -
Subalgebra R A: a subalgebra ofAoverR. -
Ideal R: an ideal of a (commutative) ringR.
These are bundled: a Subgroup G consists of a carrier set together with
proofs that it is closed under the group operations.
example {G : Type*} [Group G] : Subgroup G :=
⊥
-- The whole group as a subgroup
example {G : Type*} [Group G] : Subgroup G :=
⊤
-- Membership
example {G : Type*} [Group G] (H : Subgroup G) (a b : G)
(ha : a ∈ H) (hb : b ∈ H) : a * b ∈ H :=
H.mul_mem ha hb
-- Subgroups form a complete lattice
#check (inferInstance : CompleteLattice (AddSubgroup ℤ))
5.2.7. Quotient structures
Given a normal subgroup N of a group G, the quotient G ⧸ N is again a
group. Similarly, given an ideal I of a ring R, the quotient R ⧸ I is a
ring.
example (R : Type*) [CommRing R] (I : Ideal R) : CommRing (R ⧸ I) :=
inferInstance
The quotient map is a ring homomorphism:
Ideal.Quotient.mk.{u} {R : Type u} [Ring R] (I : Ideal R) [I.IsTwoSided] : R →+* R ⧸ IIdeal.Quotient.mk.{u} {R : Type u} [Ring R] (I : Ideal R) [I.IsTwoSided] : R →+* R ⧸ I
The ring homomorphism from a ring R to a quotient ring R/I.
5.2.8. The ext tactic for algebraic structures
The ext tactic is very useful when proving equalities between algebraic
objects. For instance, two subgroups are equal if they have the same elements;
two morphisms are equal if they agree on all inputs.
example {G : Type*} [Group G] (H K : Subgroup G) (h : ∀ g, g ∈ H ↔ g ∈ K) :
H = K := G:Type u_1inst✝:Group GH:Subgroup GK:Subgroup Gh:∀ (g : G), g ∈ H ↔ g ∈ K⊢ H = K
G:Type u_1inst✝:Group GH:Subgroup GK:Subgroup Gh:∀ (g : G), g ∈ H ↔ g ∈ Kg:G⊢ g ∈ H ↔ g ∈ K
All goals completed! 🐙
-- Two ring homomorphisms agreeing on all elements are equal
example {R S : Type*} [Ring R] [Ring S] (f g : R →+* S) (h : ∀ r, f r = g r) :
f = g := R:Type u_1S:Type u_2inst✝¹:Ring Rinst✝:Ring Sf:R →+* Sg:R →+* Sh:∀ (r : R), f r = g r⊢ f = g
R:Type u_1S:Type u_2inst✝¹:Ring Rinst✝:Ring Sf:R →+* Sg:R →+* Sh:∀ (r : R), f r = g rr:R⊢ f r = g r
All goals completed! 🐙
5.2.9. Navigating the hierarchy
When working with Mathlib's algebraic hierarchy, recall the
exploration commands introduced in the Lean chapter: #check for
types, #print for definitions and fields of typeclasses,
inferInstance for checking whether a given instance exists, and
exact? / apply? for lemma search. In addition:
-
Look at
Mathlib.AlgebraandMathlib.GroupTheoryfor the files relevant to the structure you are working with. -
Lemma names follow the predictable pattern
<operation>_<result>, e.g.mul_assoc,add_comm,map_mul,inv_inv.
#check (inferInstance : Field ℝ)
#check (inferInstance : LinearOrder ℝ)
#check (inferInstance : AddCommGroup ℝ)
-- A general proof that works for any commutative ring
example {R : Type*} [CommRing R] (a b : R) : (a + b) ^ 2 = a ^ 2 + 2 * a * b + b ^ 2 := R:Type u_1inst✝:CommRing Ra:Rb:R⊢ (a + b) ^ 2 = a ^ 2 + 2 * a * b + b ^ 2
All goals completed! 🐙
The ring tactic is a powerful decision procedure that closes goals that are
identities in commutative (semi)rings. Similarly, group works for group
identities, and field_simp helps simplify field expressions involving
division.
5.3. Filters in Mathlib
Filters are one of the most distinctive design choices in Mathlib. They provide a unified framework for limits, neighborhoods, convergence, and "eventually" -- replacing the many different epsilon-delta definitions that appear in classical analysis.
The exposition below follows the pedagogical approach of Kevin Buzzard and Bhavik Mehta in their Formalising Mathematics notes: a filter is a generalized subset.
5.3.1. Notation and naming conventions
Many filter-related symbols are unicode and typed via backslash escapes in VS Code (hover over a symbol to see its input sequence).
Symbol | Lean name | Reads as | Typed as |
|---|---|---|---|
|
| "filter on α" | (ASCII) |
|
| "the sets of the filter F" | (ASCII) |
|
| "s is a member of F" (F-large) |
|
|
| "F is finer than G" |
|
|
| "the principal filter of s" |
|
|
| "the top filter" (only Set.univ) |
|
|
| "the bottom filter" (every set) |
|
|
| "at top" / "going to ∞" | (ASCII) |
|
| "at bottom" / "going to −∞" | (ASCII) |
|
| "the cofinite filter" | (ASCII) |
|
| "neighborhood filter at x" |
|
|
| same as above |
|
|
| "preimage of s under f" |
|
|
| "pushforward of F along f" | (ASCII) |
|
| "pullback of G along f" | (ASCII) |
|
| "f tends from F to G" | (ASCII) |
|
| "eventually p, along F" |
|
|
| "frequently p, along F" |
|
Naming hints.
-
F ≤ Gis pronounced "F is finer than G": smaller filters contain more sets. The slogan is: the smaller the filter, the bigger its collection of sets. -
The prefix
mem_(as inmem_principal,mem_atTop_sets,mem_nhds_iff) names lemmas that characterize membership in a given filter. -
𝓟and𝓝are mathematical script letters, typed\McPand\nhds; they are notation, not definitions. -
⁻¹'is the preimage of a set under a function (not to be confused with the inverse⁻¹of a group element); it is typed\inv'(or\i\n\v 'in sequence).
5.3.2. Motivation: why filters?
In a traditional analysis course one meets many similar-looking definitions:
-
A sequence
a : ℕ → ℝconverges tolif for everyε > 0there existsNsuch that for alln ≥ N,|a n - l| < ε. -
A function
f : ℝ → ℝhas limitlatx₀if for everyε > 0there existsδ > 0such that for allxwith0 < |x - x₀| < δ,|f x - l| < ε. -
A function
f : ℝ → ℝtends to+∞asx → +∞if for everyMthere existsNsuch that for allx ≥ N,f x ≥ M.
All of these have the shape: "for sufficiently large/close inputs, the output lies in a given set." Filters capture this pattern abstractly.
5.3.3. Filters as generalized subsets
Morally, a filter on a type α is a generalized subset of α. Every
ordinary subset gives rise to a filter, but there are other filters
representing "ideas" that cannot be described by a single subset, such as
an infinitesimal neighborhood of a point in a topological space, or a
neighborhood of ∞ in a totally ordered set. Isaac Newton wanted dx
to be "a real number infinitesimally close to 0"; filters are a modern
way to recover such thoughts.
The property we want a generalized subset F to have is that it is
uniquely determined by the actual subsets which contain F. If F is
a generalized subset of α, what properties should the collection of
actual subsets containing F have?
-
If
ScontainsFandS ⊆ T, thenTcontainsF. -
If
SandTboth containF, then so doesS ∩ T. -
The whole type
Set.univ : Set αcontainsF.
These three axioms turn out to be enough: a filter is modelled precisely
as a collection of subsets of α satisfying them.
5.3.4. The formal definition
The Mathlib definition, rendered directly from the library:
Filter.{u_1} (α : Type u_1) : Type u_1Filter.{u_1} (α : Type u_1) : Type u_1
A filter F on a type α is a collection of sets of α which contains the whole α,
is upwards-closed, and is stable under intersection. We do not forbid this collection to be
all sets of α.
Constructor
Filter.mk.{u_1}
Fields
sets : Set (Set α)
The set of sets that belong to the filter.
univ_sets : Set.univ ∈ self.sets
The set Set.univ belongs to any filter.
sets_of_superset : ∀ {x y : Set α}, x ∈ self.sets → x ⊆ y → y ∈ self.sets
If a set belongs to a filter, then its superset belongs to the filter as well.
inter_sets : ∀ {x y : Set α}, x ∈ self.sets → y ∈ self.sets → x ∩ y ∈ self.sets
If two sets belong to a filter, then their intersection belongs to the filter as well.
Some authors add a fourth axiom forbidding ∅ ∈ sets. Mathlib does not:
this is analogous to disallowing a ring as an ideal of itself. It makes
the initial definition cleaner but introduces many awkward edge cases
later, so Mathlib keeps the "empty filter" ⊥ as a legitimate object.
If F : Filter α and S : Set α, the notation S ∈ F is sugar for
S ∈ F.sets. Think of it as morally meaning F ⊆ S -- but of course
that expression does not type-check, because F is a filter, not a set.
The axioms are restated in the Filter namespace under friendlier
names, rendered here directly from Mathlib:
5.3.5. The order on filters
Filters form a partial order by ≤. The definition is
F ≤ G ↔ G.sets ⊆ F.sets
which looks reversed, but matches the "generalized subset" intuition: if
F is contained in G as generalized subsets, then every actual subset
containing G also contains F, so G.sets ⊆ F.sets. A useful slogan:
The smaller the filter, the more sets it has.
The API lemma is:
5.3.6. The principal filter
The simplest filter is the principal filter, with notation 𝓟 s.
It is the filter whose sets are exactly the supersets of s -- i.e.
the generalized subset corresponding to the ordinary subset s.
The principal filter of s is the collection of all supersets of s.
example {α : Type*} (s t : Set α) :
t ∈ Filter.principal s ↔ s ⊆ t :=
Filter.mem_principal
-- Every s belongs to its own principal filter
example {α : Type*} (s : Set α) :
s ∈ Filter.principal s :=
Filter.mem_principal_self s
-- The principal filter on the whole space is ⊤
example {α : Type*} :
Filter.principal (Set.univ : Set α) = ⊤ :=
Filter.principal_univ
-- The order on principal filters matches subset inclusion
example {α : Type*} (s t : Set α) :
Filter.principal s ≤ Filter.principal t ↔ s ⊆ t :=
Filter.principal_mono
5.3.7. The atTop filter
On any preorder, Filter.atTop is the filter of sets that contain all
sufficiently large elements. Morally, atTop is a "neighborhood of
∞": it represents "arbitrarily large elements" even though no single
largest element exists.
atTop is the filter representing the limit → ∞ on an ordered set.
It is generated by the collection of up-sets {b | a ≤ b}.
(The preorder need not have a top element for this to be well defined,
and indeed is trivial when a top element x exists, i.e., it coincides with pure x.)
example (s : Set ℕ) :
s ∈ Filter.atTop ↔ ∃ a, ∀ b, a ≤ b → b ∈ s :=
Filter.mem_atTop_sets
example :
{n : ℕ | n ≥ 100} ∈ (Filter.atTop : Filter ℕ) := ⊢ {n | n ≥ 100} ∈ Filter.atTop
⊢ ∃ a, ∀ b ≥ a, b ∈ {n | n ≥ 100}
All goals completed! 🐙
5.3.8. The cofinite filter
Another instructive example: on any type α, the cofinite filter
consists of sets whose complement is finite.
The cofinite filter is the filter of subsets whose complements are finite.
example (α : Type*) (s : Set α) :
s ∈ (Filter.cofinite : Filter α) ↔ sᶜ.Finite :=
Iff.rfl
This filter is a generalized subset of α representing "almost every
element". On ℕ, the cofinite filter is strictly smaller than atTop
(every cofinite set eventually contains all large n, but the set of
even numbers is in atTop... wait -- actually it is not; and conversely,
the set of even numbers is not cofinite). Comparing these two filters
is a good way to train the "generalized subset" intuition.
5.3.9. Filter.map: the pushforward
Given F : Filter α and m : α → β, there is a natural way to build a
filter on β: a set s : Set β is in the pushforward if its preimage
is in F. This is the filter-theoretic counterpart of the image of a
subset.
The forward map of a filter
example {α β : Type*} (m : α → β) (F : Filter α)
(s : Set β) :
s ∈ Filter.map m F ↔ m ⁻¹' s ∈ F := Iff.rfl
There is also a pullback:
The inverse map of a filter. A set s belongs to Filter.comap m f if either of the following
equivalent conditions hold.
-
There exists a set
t ∈ fsuch thatm ⁻¹' t ⊆ s. This is used as a definition. -
The set
kernImage m s = {y | ∀ x, m x = y → x ∈ s}belongs tof, seeFilter.mem_comap'. -
The set
(m '' sᶜ)ᶜbelongs tof, seeFilter.mem_comap_iff_complandFilter.compl_mem_comap.
map and comap form a Galois connection:
Filter.map_le_iff_le_comap.{u_1, u_2} {α : Type u_1} {β : Type u_2} {f : Filter α} {g : Filter β} {m : α → β} : Filter.map m f ≤ g ↔ f ≤ Filter.comap m gFilter.map_le_iff_le_comap.{u_1, u_2} {α : Type u_1} {β : Type u_2} {f : Filter α} {g : Filter β} {m : α → β} : Filter.map m f ≤ g ↔ f ≤ Filter.comap m g
5.3.10. Filter.Tendsto: the general notion of limit
With map in hand, the general definition of a limit is strikingly
short:
Filter.Tendsto is the generic "limit of a function" predicate.
Tendsto f l₁ l₂ asserts that for every l₂ neighborhood a,
the f-preimage of a is an l₁ neighborhood.
Unpacking: Tendsto f l₁ l₂ says that for every s ∈ l₂, the preimage
f ⁻¹' s is in l₁. In the generalized-subset picture, it says that
f sends the generalized subset l₁ into the generalized subset l₂.
This single definition unifies all classical limits:
Classical statement | Filter version |
|---|---|
|
|
|
|
|
|
|
|
example (a : ℕ → ℝ) (l : ℝ) :
Filter.Tendsto a Filter.atTop (nhds l) ↔
∀ ε > 0, ∃ N, ∀ n ≥ N, |a n - l| < ε := a:ℕ → ℝl:ℝ⊢ Filter.Tendsto a Filter.atTop (nhds l) ↔ ∀ ε > 0, ∃ N, ∀ n ≥ N, |a n - l| < ε
a:ℕ → ℝl:ℝ⊢ (∀ ε > 0, ∃ N, ∀ n ≥ N, dist (a n) l < ε) ↔ ∀ ε > 0, ∃ N, ∀ n ≥ N, |a n - l| < ε
All goals completed! 🐙
Composition: if a → l and f is continuous at l, then f ∘ a → f l:
Filter.Tendsto.comp.{u_1, u_2, u_3} {α : Type u_1} {β : Type u_2} {γ : Type u_3} {f : α → β} {g : β → γ} {x : Filter α} {y : Filter β} {z : Filter γ} (hg : Filter.Tendsto g y z) (hf : Filter.Tendsto f x y) : Filter.Tendsto (g ∘ f) x zFilter.Tendsto.comp.{u_1, u_2, u_3} {α : Type u_1} {β : Type u_2} {γ : Type u_3} {f : α → β} {g : β → γ} {x : Filter α} {y : Filter β} {z : Filter γ} (hg : Filter.Tendsto g y z) (hf : Filter.Tendsto f x y) : Filter.Tendsto (g ∘ f) x z
5.3.11. Filter.Eventually and Filter.Frequently
Filters let us speak precisely about something being true "eventually".
Filter.Eventually p F, with notation ∀ᶠ x in F, p x, is defined as
{x | p x} ∈ F. Read it as: "p holds for generalized-almost-all x
with respect to F."
-
For
F = atToponℕ,∀ᶠ n in atTop, p nmeans "for all sufficiently largen,p n". -
For
F = nhds x₀,∀ᶠ x in nhds x₀, p xmeans "pholds on a neighborhood ofx₀". -
For
F = cofinite, it means "pholds off a finite set".
The dual Filter.Frequently p F (notation ∃ᶠ x in F, p x) is the
negation of Eventually (¬ p) and means "p holds on an F-non-negligible
set".
example {p : ℕ → Prop} :
(∀ᶠ n in Filter.atTop, p n) ↔
∃ N, ∀ n ≥ N, p n :=
Filter.eventually_atTop
-- Finite conjunctions of eventually hold eventually
example {α : Type*} {F : Filter α} {p q : α → Prop}
(hp : ∀ᶠ x in F, p x) (hq : ∀ᶠ x in F, q x) :
∀ᶠ x in F, p x ∧ q x :=
hp.and hq
-- Tendsto preserves Eventually
#check @Filter.Tendsto.eventually
When you have several ∀ᶠ-hypotheses and want to conclude another
∀ᶠ-statement pointwise, the dedicated tactic
filter_upwards [h₁, h₂, …] with x h₁ h₂ … intersects them and
hands you the pointwise goal:
example {α : Type*} {F : Filter α} {p q : α → Prop}
(hp : ∀ᶠ x in F, p x) (hq : ∀ᶠ x in F, q x) :
∀ᶠ x in F, p x ∧ q x := α:Type u_1F:Filter αp:α → Propq:α → Prophp:∀ᶠ (x : α) in F, p xhq:∀ᶠ (x : α) in F, q x⊢ ∀ᶠ (x : α) in F, p x ∧ q x
filter_upwards [hp, hq] with x α:Type u_1F:Filter αp:α → Propq:α → Prophp✝:∀ᶠ (x : α) in F, p xhq:∀ᶠ (x : α) in F, q xx:αhp:p x⊢ q x → p x ∧ q x α:Type u_1F:Filter αp:α → Propq:α → Prophp✝:∀ᶠ (x : α) in F, p xhq✝:∀ᶠ (x : α) in F, q xx:αhp:p xhq:q x⊢ p x ∧ q x
All goals completed! 🐙
A remark on ≥. Lean generally prefers ≤ over ≥, and most lemmas
are stated with ≤. The binder ∀ n ≥ N, ... is a special exception
(because ∀ N ≤ n, ... would bind N rather than n), but in bare
inequalities prefer to write 1 < n over n > 1.
5.3.12. A worked example: 1 / (n + 1) → 0
To see all of the pieces working together, let us show that
1 / (n + 1) → 0 as n → ∞, using Tendsto, atTop, and nhds 0
in the metric setting:
example : Filter.Tendsto
(fun n : ℕ => 1 / ((n : ℝ) + 1))
Filter.atTop (nhds 0) := ⊢ Filter.Tendsto (fun n => 1 / (↑n + 1)) Filter.atTop (nhds 0)
⊢ ∀ ε > 0, ∃ N, ∀ n ≥ N, dist (1 / (↑n + 1)) 0 < ε
intro ε ε:ℝhε:ε > 0⊢ ∃ N, ∀ n ≥ N, dist (1 / (↑n + 1)) 0 < ε
-- Find N with 1/(N+1) < ε:
-- it suffices that N + 1 > 1/ε.
ε:ℝhε:ε > 0N:ℕhN:1 / ε < ↑N⊢ ∃ N, ∀ n ≥ N, dist (1 / (↑n + 1)) 0 < ε
ε:ℝhε:ε > 0N:ℕhN:1 / ε < ↑Nn:ℕhn:n ≥ N⊢ dist (1 / (↑n + 1)) 0 < ε
have hε' : (0 : ℝ) < (n : ℝ) + 1 := ⊢ Filter.Tendsto (fun n => 1 / (↑n + 1)) Filter.atTop (nhds 0) All goals completed! 🐙
have hN' : (1 : ℝ) / ε < (n : ℝ) + 1 := ⊢ Filter.Tendsto (fun n => 1 / (↑n + 1)) Filter.atTop (nhds 0)
calc (1 : ℝ) / ε < N := hN
_ ≤ n := ε:ℝhε:ε > 0N:ℕhN:1 / ε < ↑Nn:ℕhn:n ≥ Nhε':0 < ↑n + 1 :=
Right.add_pos_of_nonneg_of_pos (Nat.cast_nonneg' n)
(Mathlib.Meta.Positivity.pos_of_isNat (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one) (Eq.refl (Nat.ble 1 1)))⊢ ↑N ≤ ↑n All goals completed! 🐙
_ < (n : ℝ) + 1 := ε:ℝhε:ε > 0N:ℕhN:1 / ε < ↑Nn:ℕhn:n ≥ Nhε':0 < ↑n + 1 :=
Right.add_pos_of_nonneg_of_pos (Nat.cast_nonneg' n)
(Mathlib.Meta.Positivity.pos_of_isNat (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one) (Eq.refl (Nat.ble 1 1)))⊢ ↑n < ↑n + 1 All goals completed! 🐙
ε:ℝhε:ε > 0N:ℕhN:1 / ε < ↑Nn:ℕhn:n ≥ Nhε':0 < ↑n + 1 :=
Right.add_pos_of_nonneg_of_pos (Nat.cast_nonneg' n)
(Mathlib.Meta.Positivity.pos_of_isNat (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one) (Eq.refl (Nat.ble 1 1)))hN':1 / ε < ↑n + 1 :=
Trans.trans (Trans.trans hN (cast (Eq.symm Nat.cast_le._simp_1) hn))
(lt_of_not_ge fun a =>
Mathlib.Tactic.Linarith.lt_irrefl
(Eq.mp
(congrArg (fun _a => _a < 0)
(Mathlib.Tactic.Ring.of_eq
(Mathlib.Tactic.Ring.add_congr
(Mathlib.Tactic.Ring.neg_congr
(Mathlib.Tactic.Ring.cast_pos (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one))
(Mathlib.Tactic.Ring.neg_add
(Mathlib.Tactic.Ring.neg_one_mul
(Mathlib.Meta.NormNum.IsInt.to_raw_eq
(Mathlib.Meta.NormNum.isInt_mul (Eq.refl HMul.hMul)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.negOfNat 1)))))
Mathlib.Tactic.Ring.neg_zero))
(Mathlib.Tactic.Ring.sub_congr
(Mathlib.Tactic.Ring.add_congr (Mathlib.Tactic.Ring.atom_pf ↑n)
(Mathlib.Tactic.Ring.cast_pos (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one))
(Mathlib.Tactic.Ring.add_pf_add_gt (Nat.rawCast 1)
(Mathlib.Tactic.Ring.add_pf_add_zero (↑n ^ Nat.rawCast 1 * Nat.rawCast 1 + 0))))
(Mathlib.Tactic.Ring.atom_pf ↑n)
(Mathlib.Tactic.Ring.sub_pf
(Mathlib.Tactic.Ring.neg_add
(Mathlib.Tactic.Ring.neg_mul (↑n) (Nat.rawCast 1)
(Mathlib.Tactic.Ring.neg_one_mul
(Mathlib.Meta.NormNum.IsInt.to_raw_eq
(Mathlib.Meta.NormNum.isInt_mul (Eq.refl HMul.hMul)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.negOfNat 1))))))
Mathlib.Tactic.Ring.neg_zero)
(Mathlib.Tactic.Ring.add_pf_add_lt (Nat.rawCast 1)
(Mathlib.Tactic.Ring.add_pf_add_overlap_zero
(Mathlib.Tactic.Ring.add_overlap_pf_zero (↑n) (Nat.rawCast 1)
(Mathlib.Meta.NormNum.IsInt.to_isNat
(Mathlib.Meta.NormNum.isInt_add (Eq.refl HAdd.hAdd)
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1)) (Eq.refl (Int.ofNat 0)))))
(Mathlib.Tactic.Ring.add_pf_zero_add 0)))))
(Mathlib.Tactic.Ring.add_pf_add_overlap_zero
(Mathlib.Meta.NormNum.IsInt.to_isNat
(Mathlib.Meta.NormNum.isInt_add (Eq.refl HAdd.hAdd)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.ofNat 0))))
(Mathlib.Tactic.Ring.add_pf_zero_add 0)))
(Mathlib.Tactic.Ring.cast_zero (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_zero))))
(Mathlib.Tactic.Linarith.add_lt_of_neg_of_le (neg_neg_of_pos Mathlib.Tactic.Linarith.zero_lt_one)
(Mathlib.Tactic.Linarith.sub_nonpos_of_le a))))⊢ 1 / (↑n + 1) < ε
ε:ℝhε:ε > 0N:ℕhN:1 / ε < ↑Nn:ℕhn:n ≥ Nhε':0 < ↑n + 1 :=
Right.add_pos_of_nonneg_of_pos (Nat.cast_nonneg' n)
(Mathlib.Meta.Positivity.pos_of_isNat (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one) (Eq.refl (Nat.ble 1 1)))hN':1 / ε < ↑n + 1 :=
Trans.trans (Trans.trans hN (cast (Eq.symm Nat.cast_le._simp_1) hn))
(lt_of_not_ge fun a =>
Mathlib.Tactic.Linarith.lt_irrefl
(Eq.mp
(congrArg (fun _a => _a < 0)
(Mathlib.Tactic.Ring.of_eq
(Mathlib.Tactic.Ring.add_congr
(Mathlib.Tactic.Ring.neg_congr
(Mathlib.Tactic.Ring.cast_pos (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one))
(Mathlib.Tactic.Ring.neg_add
(Mathlib.Tactic.Ring.neg_one_mul
(Mathlib.Meta.NormNum.IsInt.to_raw_eq
(Mathlib.Meta.NormNum.isInt_mul (Eq.refl HMul.hMul)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.negOfNat 1)))))
Mathlib.Tactic.Ring.neg_zero))
(Mathlib.Tactic.Ring.sub_congr
(Mathlib.Tactic.Ring.add_congr (Mathlib.Tactic.Ring.atom_pf ↑n)
(Mathlib.Tactic.Ring.cast_pos (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one))
(Mathlib.Tactic.Ring.add_pf_add_gt (Nat.rawCast 1)
(Mathlib.Tactic.Ring.add_pf_add_zero (↑n ^ Nat.rawCast 1 * Nat.rawCast 1 + 0))))
(Mathlib.Tactic.Ring.atom_pf ↑n)
(Mathlib.Tactic.Ring.sub_pf
(Mathlib.Tactic.Ring.neg_add
(Mathlib.Tactic.Ring.neg_mul (↑n) (Nat.rawCast 1)
(Mathlib.Tactic.Ring.neg_one_mul
(Mathlib.Meta.NormNum.IsInt.to_raw_eq
(Mathlib.Meta.NormNum.isInt_mul (Eq.refl HMul.hMul)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.negOfNat 1))))))
Mathlib.Tactic.Ring.neg_zero)
(Mathlib.Tactic.Ring.add_pf_add_lt (Nat.rawCast 1)
(Mathlib.Tactic.Ring.add_pf_add_overlap_zero
(Mathlib.Tactic.Ring.add_overlap_pf_zero (↑n) (Nat.rawCast 1)
(Mathlib.Meta.NormNum.IsInt.to_isNat
(Mathlib.Meta.NormNum.isInt_add (Eq.refl HAdd.hAdd)
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1)) (Eq.refl (Int.ofNat 0)))))
(Mathlib.Tactic.Ring.add_pf_zero_add 0)))))
(Mathlib.Tactic.Ring.add_pf_add_overlap_zero
(Mathlib.Meta.NormNum.IsInt.to_isNat
(Mathlib.Meta.NormNum.isInt_add (Eq.refl HAdd.hAdd)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.ofNat 0))))
(Mathlib.Tactic.Ring.add_pf_zero_add 0)))
(Mathlib.Tactic.Ring.cast_zero (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_zero))))
(Mathlib.Tactic.Linarith.add_lt_of_neg_of_le (neg_neg_of_pos Mathlib.Tactic.Linarith.zero_lt_one)
(Mathlib.Tactic.Linarith.sub_nonpos_of_le a))))⊢ 1 < ε * (↑n + 1)
have : 1 < ε * ((n : ℝ) + 1) := ⊢ Filter.Tendsto (fun n => 1 / (↑n + 1)) Filter.atTop (nhds 0)
ε:ℝhε:ε > 0N:ℕhN:1 / ε < ↑Nn:ℕhn:n ≥ Nhε':0 < ↑n + 1 :=
Right.add_pos_of_nonneg_of_pos (Nat.cast_nonneg' n)
(Mathlib.Meta.Positivity.pos_of_isNat (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one) (Eq.refl (Nat.ble 1 1)))hN':1 / ε < ↑n + 1 :=
Trans.trans (Trans.trans hN (cast (Eq.symm Nat.cast_le._simp_1) hn))
(lt_of_not_ge fun a =>
Mathlib.Tactic.Linarith.lt_irrefl
(Eq.mp
(congrArg (fun _a => _a < 0)
(Mathlib.Tactic.Ring.of_eq
(Mathlib.Tactic.Ring.add_congr
(Mathlib.Tactic.Ring.neg_congr
(Mathlib.Tactic.Ring.cast_pos (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one))
(Mathlib.Tactic.Ring.neg_add
(Mathlib.Tactic.Ring.neg_one_mul
(Mathlib.Meta.NormNum.IsInt.to_raw_eq
(Mathlib.Meta.NormNum.isInt_mul (Eq.refl HMul.hMul)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.negOfNat 1)))))
Mathlib.Tactic.Ring.neg_zero))
(Mathlib.Tactic.Ring.sub_congr
(Mathlib.Tactic.Ring.add_congr (Mathlib.Tactic.Ring.atom_pf ↑n)
(Mathlib.Tactic.Ring.cast_pos (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_one))
(Mathlib.Tactic.Ring.add_pf_add_gt (Nat.rawCast 1)
(Mathlib.Tactic.Ring.add_pf_add_zero (↑n ^ Nat.rawCast 1 * Nat.rawCast 1 + 0))))
(Mathlib.Tactic.Ring.atom_pf ↑n)
(Mathlib.Tactic.Ring.sub_pf
(Mathlib.Tactic.Ring.neg_add
(Mathlib.Tactic.Ring.neg_mul (↑n) (Nat.rawCast 1)
(Mathlib.Tactic.Ring.neg_one_mul
(Mathlib.Meta.NormNum.IsInt.to_raw_eq
(Mathlib.Meta.NormNum.isInt_mul (Eq.refl HMul.hMul)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.negOfNat 1))))))
Mathlib.Tactic.Ring.neg_zero)
(Mathlib.Tactic.Ring.add_pf_add_lt (Nat.rawCast 1)
(Mathlib.Tactic.Ring.add_pf_add_overlap_zero
(Mathlib.Tactic.Ring.add_overlap_pf_zero (↑n) (Nat.rawCast 1)
(Mathlib.Meta.NormNum.IsInt.to_isNat
(Mathlib.Meta.NormNum.isInt_add (Eq.refl HAdd.hAdd)
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1)) (Eq.refl (Int.ofNat 0)))))
(Mathlib.Tactic.Ring.add_pf_zero_add 0)))))
(Mathlib.Tactic.Ring.add_pf_add_overlap_zero
(Mathlib.Meta.NormNum.IsInt.to_isNat
(Mathlib.Meta.NormNum.isInt_add (Eq.refl HAdd.hAdd)
(Mathlib.Meta.NormNum.IsInt.of_raw ℝ (Int.negOfNat 1))
(Mathlib.Meta.NormNum.IsNat.to_isInt (Mathlib.Meta.NormNum.IsNat.of_raw ℝ 1))
(Eq.refl (Int.ofNat 0))))
(Mathlib.Tactic.Ring.add_pf_zero_add 0)))
(Mathlib.Tactic.Ring.cast_zero (Mathlib.Meta.NormNum.isNat_ofNat ℝ Nat.cast_zero))))
(Mathlib.Tactic.Linarith.add_lt_of_neg_of_le (neg_neg_of_pos Mathlib.Tactic.Linarith.zero_lt_one)
(Mathlib.Tactic.Linarith.sub_nonpos_of_le a))))h:1 < (↑n + 1) * ε := (div_lt_iff₀ hε).mp hN'⊢ 1 < ε * (↑n + 1)
All goals completed! 🐙
All goals completed! 🐙
The structure of the proof mirrors the classical epsilon-N argument
exactly: pick N, then show n ≥ N → |1 / (n+1) - 0| < ε. The
filter language lets us state convergence in one line; the actual
work is still the inequality manipulation.
5.3.13. Ultrafilters
An ultrafilter is a maximal proper filter: for every set s, either
s or sᶜ belongs to the ultrafilter. Ultrafilters are fundamental in
logic (they correspond to consistent complete theories) and in topology
(compactifications, nonstandard analysis).
#check Ultrafilter
#check (pure : ℕ → Ultrafilter ℕ)
5.3.14. Why filters are better for formalization
Filters may feel abstract at first, but they pay off:
-
Unification. One definition of
Tendstoreplaces dozens of epsilon-delta variants. -
Composability. Limits compose naturally via
Filter.Tendsto.comp. -
Algebraic structure. Filters form a complete lattice, so we can take meets and joins of families of filters and reason about them order-theoretically.
-
Avoidance of partial functions. Instead of "the limit of
fatx" (which may not exist), we useTendsto f (nhds x) (nhds l)as a proposition. -
Smooth interaction with topology. The neighborhood filter
nhds xis the bridge between filters and topological spaces.
5.4. Topology via Filters
Topology in Mathlib is built on top of the filter framework. Rather than
defining continuity via epsilon-delta or open preimages directly, the primary
definition uses Filter.Tendsto applied to neighborhood filters. This section
explains how topological spaces are set up in Mathlib, how continuity works,
and how metric spaces fit in.
5.4.1. Notation and naming conventions
Symbol | Lean name | Reads as | Typed as |
|---|---|---|---|
|
| "topology on α" | (ASCII) |
|
| "s is open" | (ASCII) |
|
| "s is closed" | (ASCII) |
|
| "s is clopen (open and closed)" | (ASCII) |
|
| "neighborhood filter at x" |
|
|
| "f is continuous" | (ASCII) |
|
| "f is continuous at x" | (ASCII) |
|
| "s is compact" | (ASCII) |
|
| "s is connected" | (ASCII) |
|
| "s is preconnected" | (ASCII) |
|
| "distance between x and y" | (ASCII) |
|
| "open ball of radius r at x" | (ASCII) |
|
| "closed ball" | (ASCII) |
|
| "preimage of s under f" |
|
|
| "image of s under f" |
(ASCII, two |
|
| "product of α and β" |
|
|
| "complement of s" |
|
Naming hints.
-
Predicates over sets are mostly prefixed with
Is:IsOpen,IsClosed,IsCompact,IsConnected. -
The filter-flavored counterpart of a property is usually
Tendsto+ some filter:ContinuousAt f x ↔ Tendsto f (𝓝 x) (𝓝 (f x)). -
'and''are two different ASCII sequences:''is the image operator (two apostrophes), while⁻¹'is the preimage operator (unicode,\inv'). -
Product-related continuity lemmas live in the
Continuous.prodnamespace:Continuous.prodMk,Continuous.fst,Continuous.snd.
5.4.2. TopologicalSpace
A topological space in Mathlib is a typeclass TopologicalSpace α that
specifies which sets are open. However, the actual definition in Mathlib
uses the neighborhood filter approach: a TopologicalSpace is given by
specifying, for each point x, its neighborhood filter nhds x.
Equivalently, one can define it via IsOpen. Here is the structure
as it appears in Mathlib:
TopologicalSpace.{u} (X : Type u) : Type uTopologicalSpace.{u} (X : Type u) : Type u
A topology on X.
Instance Constructor
TopologicalSpace.mk.{u}
Methods
IsOpen : Set X → Prop
A predicate saying that a set is an open set. Use IsOpen in the root namespace instead.
isOpen_univ : TopologicalSpace.IsOpen Set.univ
The set representing the whole space is an open set.
Use isOpen_univ in the root namespace instead.
isOpen_inter : ∀ (s t : Set X), TopologicalSpace.IsOpen s → TopologicalSpace.IsOpen t → TopologicalSpace.IsOpen (s ∩ t)
The intersection of two open sets is an open set. Use IsOpen.inter instead.
isOpen_sUnion : ∀ (s : Set (Set X)), (∀ t ∈ s, TopologicalSpace.IsOpen t) → TopologicalSpace.IsOpen (⋃₀ s)
The union of a family of open sets is an open set.
Use isOpen_sUnion in the root namespace instead.
The real numbers carry the standard topological space structure:
#check (inferInstance : TopologicalSpace ℝ)
5.4.3. Open and closed sets
-
IsOpen s:sis an open set. -
IsClosed s:sᶜis open, i.e.,sis closed. -
IsClopen s:sis both open and closed.
example {α : Type*} [TopologicalSpace α] : IsOpen (∅ : Set α) :=
isOpen_empty
example {α : Type*} [TopologicalSpace α] : IsOpen (Set.univ : Set α) :=
isOpen_univ
-- A set is closed iff its complement is open
example {α : Type*} [TopologicalSpace α] (s : Set α) :
IsClosed s ↔ IsOpen sᶜ :=
isOpen_compl_iff.symm
Finite intersections and arbitrary unions of open sets are open:
IsOpen.inter.{u} {X : Type u} [TopologicalSpace X] {s t : Set X} (hs : IsOpen s) (ht : IsOpen t) : IsOpen (s ∩ t)IsOpen.inter.{u} {X : Type u} [TopologicalSpace X] {s t : Set X} (hs : IsOpen s) (ht : IsOpen t) : IsOpen (s ∩ t)
isOpen_iUnion.{u, v} {X : Type u} {ι : Sort v} [TopologicalSpace X] {f : ι → Set X} (h : ∀ (i : ι), IsOpen (f i)) : IsOpen (⋃ i, f i)isOpen_iUnion.{u, v} {X : Type u} {ι : Sort v} [TopologicalSpace X] {f : ι → Set X} (h : ∀ (i : ι), IsOpen (f i)) : IsOpen (⋃ i, f i)
5.4.4. Closure, interior, and frontier
For a set s in a topological space, three derived sets are
fundamental:
-
closure s-- the smallest closed set containings; -
interior s-- the largest open set contained ins; -
frontier s-- the boundary, defined asclosure s \ interior s.
The closure of s is the smallest closed set containing s.
The interior of a set s is the largest open subset of s.
A point x lies in closure s if every neighborhood of x meets
s; equivalently, if s is frequently close to x:
mem_closure_iff_frequently.{u} {X : Type u} [TopologicalSpace X] {x : X} {s : Set X} : x ∈ closure s ↔ ∃ᶠ (x : X) in nhds x, x ∈ smem_closure_iff_frequently.{u} {X : Type u} [TopologicalSpace X] {x : X} {s : Set X} : x ∈ closure s ↔ ∃ᶠ (x : X) in nhds x, x ∈ s
example {α : Type*} [TopologicalSpace α] (s : Set α) :
IsClosed s ↔ closure s = s :=
⟨fun h => h.closure_eq, fun h => h ▸ isClosed_closure⟩
-- The interior of a set is open.
example {α : Type*} [TopologicalSpace α] (s : Set α) :
IsOpen (interior s) :=
isOpen_interior
5.4.5. The neighborhood filter nhds
For a point x in a topological space, nhds x is the filter of
neighborhoods of x. A set s is a neighborhood of x if there is an
open set U with x ∈ U ⊆ s.
A set is called a neighborhood of x if it contains an open set around x. The set of all
neighborhoods of x forms a filter, the neighborhood filter at x, is here defined as the
infimum over the principal filters of all open sets containing x.
mem_nhds_iff.{u} {X : Type u} [TopologicalSpace X] {x : X} {s : Set X} : s ∈ nhds x ↔ ∃ t ⊆ s, IsOpen t ∧ x ∈ tmem_nhds_iff.{u} {X : Type u} [TopologicalSpace X] {x : X} {s : Set X} : s ∈ nhds x ↔ ∃ t ⊆ s, IsOpen t ∧ x ∈ t
The neighborhood filter is the bridge between topology and the filter framework.
5.4.6. Continuity via Tendsto
A function f : α → β between topological spaces is continuous if it
preserves the neighborhood filter at every point. In Mathlib:
Continuous f ↔ ∀ x, Filter.Tendsto f (nhds x) (nhds (f x))
This is equivalent to the classical definition: preimages of open sets are open.
Continuous.{u, v} {X : Type u} {Y : Type v} [TopologicalSpace X] [TopologicalSpace Y] (f : X → Y) : PropContinuous.{u, v} {X : Type u} {Y : Type v} [TopologicalSpace X] [TopologicalSpace Y] (f : X → Y) : Prop
A function between topological spaces is continuous if the preimage of every open set is open. Registered as a structure to make sure it is not unfolded by Lean.
Constructor
Continuous.mk.{u, v}
Fields
isOpen_preimage : ∀ (s : Set Y), IsOpen s → IsOpen (f ⁻¹' s)
The preimage of an open set under a continuous function is an open set. Use IsOpen.preimage
instead.
continuous_def.{u_1, u_2} {X : Type u_1} {Y : Type u_2} {x✝ : TopologicalSpace X} {x✝¹ : TopologicalSpace Y} {f : X → Y} : Continuous f ↔ ∀ (s : Set Y), IsOpen s → IsOpen (f ⁻¹' s)continuous_def.{u_1, u_2} {X : Type u_1} {Y : Type u_2} {x✝ : TopologicalSpace X} {x✝¹ : TopologicalSpace Y} {f : X → Y} : Continuous f ↔ ∀ (s : Set Y), IsOpen s → IsOpen (f ⁻¹' s)
Equivalence with the filter formulation:
continuous_iff_continuousAt.{u_1, u_2} {X : Type u_1} {Y : Type u_2} [TopologicalSpace X] [TopologicalSpace Y] {f : X → Y} : Continuous f ↔ ∀ (x : X), ContinuousAt f xcontinuous_iff_continuousAt.{u_1, u_2} {X : Type u_1} {Y : Type u_2} [TopologicalSpace X] [TopologicalSpace Y] {f : X → Y} : Continuous f ↔ ∀ (x : X), ContinuousAt f x
example {α : Type*} [TopologicalSpace α] : Continuous (id : α → α) :=
continuous_id
-- Composition of continuous functions is continuous
example {α β γ : Type*} [TopologicalSpace α] [TopologicalSpace β]
[TopologicalSpace γ] (f : α → β) (g : β → γ)
(hf : Continuous f) (hg : Continuous g) : Continuous (g ∘ f) :=
hg.comp hf
A useful variant is ContinuousAt f x, meaning f is continuous at
the single point x:
example {α β : Type*} [TopologicalSpace α] [TopologicalSpace β]
(f : α → β) (x : α) :
ContinuousAt f x ↔ Filter.Tendsto f (nhds x) (nhds (f x)) :=
Iff.rfl
5.4.7. Compact spaces
A set s is compact if every open cover has a finite subcover. In Mathlib,
compactness is defined using filters: s is compact if every filter that
contains s (via the principal filter) and is nontrivial has a cluster point
in s.
A set s is compact if for every nontrivial filter f that contains s,
there exists a ∈ s such that every set of f meets every neighborhood of a.
example : IsCompact (Set.Icc (0 : ℝ) 1) :=
isCompact_Icc
A compact space is one where the whole space is compact:
Type class for compact spaces. Separation is sometimes included in the definition, especially in the French literature, but we do not include it here.
Instance Constructor
CompactSpace.mk.{u_1}
Methods
isCompact_univ : IsCompact Set.univ
In a compact space, Set.univ is a compact set.
The image of a compact set under a continuous function is compact:
IsCompact.image.{u, v} {X : Type u} {Y : Type v} [TopologicalSpace X] [TopologicalSpace Y] {s : Set X} {f : X → Y} (hs : IsCompact s) (hf : Continuous f) : IsCompact (f '' s)IsCompact.image.{u, v} {X : Type u} {Y : Type v} [TopologicalSpace X] [TopologicalSpace Y] {s : Set X} {f : X → Y} (hs : IsCompact s) (hf : Continuous f) : IsCompact (f '' s)
5.4.8. Connected spaces
A topological space is connected if it cannot be written as the union of
two nonempty disjoint open sets. A set s is connected if, viewed as a
subspace, it is connected.
#check (inferInstance : ConnectedSpace ℝ)
The interval [a, b] is connected:
isPreconnected_Icc.{u} {α : Type u} [ConditionallyCompleteLinearOrder α] [TopologicalSpace α] [OrderTopology α] [DenselyOrdered α] {a b : α} : IsPreconnected (Set.Icc a b)isPreconnected_Icc.{u} {α : Type u} [ConditionallyCompleteLinearOrder α] [TopologicalSpace α] [OrderTopology α] [DenselyOrdered α] {a b : α} : IsPreconnected (Set.Icc a b)
A closed interval in a densely ordered conditionally complete linear order is preconnected.
The continuous image of a connected set is connected:
IsPreconnected.image.{u, v} {α : Type u} {β : Type v} [TopologicalSpace α] [TopologicalSpace β] {s : Set α} (H : IsPreconnected s) (f : α → β) (hf : ContinuousOn f s) : IsPreconnected (f '' s)IsPreconnected.image.{u, v} {α : Type u} {β : Type v} [TopologicalSpace α] [TopologicalSpace β] {s : Set α} (H : IsPreconnected s) (f : α → β) (hf : ContinuousOn f s) : IsPreconnected (f '' s)
The image of a preconnected set is preconnected as well.
5.4.9. Metric spaces
A MetricSpace α provides a distance function dist : α → α → ℝ satisfying
the usual axioms. Every metric space is automatically a topological space
(the topology is induced by the metric).
#check (inferInstance : MetricSpace ℝ)
-- The distance function
#check @dist
-- dist : α → α → ℝ
-- Key properties
example (x y : ℝ) : dist x y ≥ 0 :=
dist_nonneg
example (x y : ℝ) : dist x y = dist y x :=
dist_comm x y
-- Triangle inequality
example (x y z : ℝ) : dist x z ≤ dist x y + dist y z :=
dist_triangle x y z
-- dist x y = 0 iff x = y
example (x y : ℝ) : dist x y = 0 ↔ x = y :=
dist_eq_zero
In a metric space, nhds x is generated by open balls:
Metric.ball.{u} {α : Type u} [PseudoMetricSpace α] (x : α) (ε : ℝ) : Set αMetric.ball.{u} {α : Type u} [PseudoMetricSpace α] (x : α) (ε : ℝ) : Set α
ball x ε is the set of all points y with dist y x < ε
Metric.nhds_basis_ball.{u} {α : Type u} [PseudoMetricSpace α] {x : α} : (nhds x).HasBasis (fun x => 0 < x) (Metric.ball x)Metric.nhds_basis_ball.{u} {α : Type u} [PseudoMetricSpace α] {x : α} : (nhds x).HasBasis (fun x => 0 < x) (Metric.ball x)
5.4.10. Continuity in metric spaces
For metric spaces, continuity at a point can be stated in the familiar epsilon-delta form:
Metric.continuousAt_iff.{u, v} {α : Type u} {β : Type v} [PseudoMetricSpace α] [PseudoMetricSpace β] {f : α → β} {a : α} : ContinuousAt f a ↔ ∀ ε > 0, ∃ δ > 0, ∀ ⦃x : α⦄, dist x a < δ → dist (f x) (f a) < εMetric.continuousAt_iff.{u, v} {α : Type u} {β : Type v} [PseudoMetricSpace α] [PseudoMetricSpace β] {f : α → β} {a : α} : ContinuousAt f a ↔ ∀ ε > 0, ∃ δ > 0, ∀ ⦃x : α⦄, dist x a < δ → dist (f x) (f a) < ε
example : Continuous (fun x : ℝ ↦ 2 * x) := ⊢ Continuous fun x => 2 * x
All goals completed! 🐙
5.4.11. Product topology
Given topological spaces α and β, the product α × β carries the
product topology, where a basis of open sets is given by products of open
sets.
#check (inferInstance : TopologicalSpace (ℝ × ℝ))
-- Projections are continuous
example : Continuous (Prod.fst : ℝ × ℝ → ℝ) :=
continuous_fst
example : Continuous (Prod.snd : ℝ × ℝ → ℝ) :=
continuous_snd
A function into a product is continuous iff both components are:
Continuous.prodMk.{u, v, u_2} {X : Type u} {Y : Type v} {Z : Type u_2} [TopologicalSpace X] [TopologicalSpace Y] [TopologicalSpace Z] {f : Z → X} {g : Z → Y} (hf : Continuous f) (hg : Continuous g) : Continuous fun x => (f x, g x)Continuous.prodMk.{u, v, u_2} {X : Type u} {Y : Type v} {Z : Type u_2} [TopologicalSpace X] [TopologicalSpace Y] [TopologicalSpace Z] {f : Z → X} {g : Z → Y} (hf : Continuous f) (hg : Continuous g) : Continuous fun x => (f x, g x)
5.4.12. Summary of key Mathlib API for topology
Concept | Mathlib name |
|---|---|
Open set |
|
Closed set |
|
Neighborhood filter |
|
Continuous function |
|
Continuous at a point |
|
Compact set |
|
Connected set |
|
Metric space |
|
Open ball |
|
Distance |
|
The key insight is that all of these interact seamlessly with the filter framework from the previous section. Limits, continuity, and compactness are all expressed in terms of filters.
5.5. Measure Theory in Mathlib
Measure theory in Mathlib is a substantial part of the library, covering sigma-algebras, measurable functions, measures, integration, and probability. It builds on the filter framework for convergence theorems. This section gives an overview of the main definitions and how to work with them.
5.5.1. Notation and naming conventions
Several measure-theory symbols are unicode characters. In VS Code, hover over a symbol to see the backslash escape that types it.
Symbol | Lean name | Reads as | Typed as |
|---|---|---|---|
|
| "sigma-algebra on α" | (ASCII) |
|
| "s is measurable" | (ASCII) |
|
| "f is measurable" | (ASCII) |
|
| "measure on α" | (ASCII) |
|
| "the Lebesgue measure" | (ASCII) |
|
| "measure of s under μ" | (ASCII) |
|
| "nonnegative reals" |
|
|
| "extended nonneg reals, 0 to ∞" |
|
|
| "Bochner integral of f w.r.t. μ" |
|
|
| "lower Lebesgue integral" |
|
|
| "p holds μ-almost-everywhere" |
|
|
| "f equals g μ-almost-everywhere" |
|
|
| "the almost-everywhere filter" | (ASCII) |
Naming hints.
-
Prefix
Isfor properties of measures:IsProbabilityMeasure μ,IsFiniteMeasure μ,IsSigmaFiniteMeasure μ. -
Prefix
ae(short for "almost everywhere"):ae μis the filter of conull sets;AEMeasurable f,AEStronglyMeasurable f,AEEqFunare ae-variants of stronger regularity properties. -
The integral symbol
∫takes a binder∫ x, ...followed by the measure∂μ. Without∂μ, the integral is with respect to the defaultvolumemeasure. -
ℝ≥0∞is the preferred codomain for measures: they take values in[0, ∞]. Coercions to/fromℝuseENNReal.toReal,ENNReal.ofReal. -
Almost-everywhere statements are just the filter quantifier
∀ᶠapplied to theaefilter:∀ᵐ x ∂μ, p xis definitionally∀ᶠ x in μ.ae, p x.
5.5.2. Sigma-algebras: MeasurableSpace
A sigma-algebra (or measurable space) on a type α is specified by the
typeclass MeasurableSpace α. It tells Lean which subsets of α are
measurable.
MeasurableSpace.{u_7} (α : Type u_7) : Type u_7MeasurableSpace.{u_7} (α : Type u_7) : Type u_7
A measurable space is a space equipped with a σ-algebra.
Instance Constructor
MeasurableSpace.mk.{u_7}
Methods
MeasurableSet' : Set α → Prop
Predicate saying that a given set is measurable. Use MeasurableSet in the root namespace
instead.
measurableSet_empty : MeasurableSpace.MeasurableSet' self ∅
The empty set is a measurable set. Use MeasurableSet.empty instead.
measurableSet_compl : ∀ (s : Set α), MeasurableSpace.MeasurableSet' self s → MeasurableSpace.MeasurableSet' self sᶜ
The complement of a measurable set is a measurable set. Use MeasurableSet.compl instead.
measurableSet_iUnion : ∀ (f : ℕ → Set α), (∀ (i : ℕ), MeasurableSpace.MeasurableSet' self (f i)) → MeasurableSpace.MeasurableSet' self (⋃ i, f i)
The union of a sequence of measurable sets is a measurable set. Use a more general
MeasurableSet.iUnion instead.
#check (inferInstance : MeasurableSpace ℝ)
-- The whole space and the empty set are measurable
example : MeasurableSet (Set.univ : Set ℝ) :=
MeasurableSet.univ
example : MeasurableSet (∅ : Set ℝ) :=
MeasurableSet.empty
-- Complements of measurable sets are measurable
example (s : Set ℝ) (hs : MeasurableSet s) : MeasurableSet sᶜ :=
hs.compl
Countable unions of measurable sets are measurable:
MeasurableSet.iUnion.{u_1, u_6} {α : Type u_1} {ι : Sort u_6} {m : MeasurableSpace α} [Countable ι] ⦃f : ι → Set α⦄ (h : ∀ (b : ι), MeasurableSet (f b)) : MeasurableSet (⋃ b, f b)MeasurableSet.iUnion.{u_1, u_6} {α : Type u_1} {ι : Sort u_6} {m : MeasurableSpace α} [Countable ι] ⦃f : ι → Set α⦄ (h : ∀ (b : ι), MeasurableSet (f b)) : MeasurableSet (⋃ b, f b)
On ℝ (and other topological spaces), the default sigma-algebra is the
Borel sigma-algebra, generated by the open sets.
5.5.3. Measurable functions
A function f : α → β between measurable spaces is measurable if the
preimage of every measurable set is measurable.
Measurable.{u_1, u_2} {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (f : α → β) : PropMeasurable.{u_1, u_2} {α : Type u_1} {β : Type u_2} [MeasurableSpace α] [MeasurableSpace β] (f : α → β) : Prop
A function f between measurable spaces is measurable if the preimage of every
measurable set is measurable.
example {α : Type*} [MeasurableSpace α] :
Measurable (id : α → α) :=
measurable_id
-- Composition of measurable functions is measurable
example {α β γ : Type*} [MeasurableSpace α] [MeasurableSpace β]
[MeasurableSpace γ] (f : α → β) (g : β → γ)
(hf : Measurable f) (hg : Measurable g) :
Measurable (g ∘ f) :=
hg.comp hf
Continuous functions are measurable (with Borel sigma-algebras):
Continuous.measurable.{u_1, u_3} {α : Type u_1} {γ : Type u_3} [TopologicalSpace α] [MeasurableSpace α] [OpensMeasurableSpace α] [TopologicalSpace γ] [MeasurableSpace γ] [BorelSpace γ] {f : α → γ} (hf : Continuous f) : Measurable fContinuous.measurable.{u_1, u_3} {α : Type u_1} {γ : Type u_3} [TopologicalSpace α] [MeasurableSpace α] [OpensMeasurableSpace α] [TopologicalSpace γ] [MeasurableSpace γ] [BorelSpace γ] {f : α → γ} (hf : Continuous f) : Measurable f
A continuous function from an OpensMeasurableSpace to a BorelSpace
is measurable.
Important: every continuous function is measurable (when both spaces carry the Borel sigma-algebra), but not vice versa.
5.5.4. Measures
A measure on a measurable space α is a countably additive function from
measurable sets to [0, ∞]. In Mathlib, measures take values in ℝ≥0∞
(the extended nonnegative reals, typed \R\ge0\infty or using ENNReal).
A measure is defined to be an outer measure that is countably additive on measurable sets, with the additional assumption that the outer measure is the canonical extension of the restricted measure.
The measure of a set s, denoted μ s, is an extended nonnegative real. The real-valued version
is written μ.real s.
Constructor
MeasureTheory.Measure.mk.{u_6}
Extends
Fields
measureOf : Set α → ENNReal
-
MeasureTheory.OuterMeasure α
empty : self.measureOf ∅ = 0
-
MeasureTheory.OuterMeasure α
mono : ∀ {s₁ s₂ : Set α}, s₁ ⊆ s₂ → self.measureOf s₁ ≤ self.measureOf s₂
-
MeasureTheory.OuterMeasure α
iUnion_nat : ∀ (s : ℕ → Set α), Pairwise (Function.onFun Disjoint s) → self.measureOf (⋃ i, s i) ≤ ∑' (i : ℕ), self.measureOf (s i)
-
MeasureTheory.OuterMeasure α
m_iUnion : ∀ ⦃f : ℕ → Set α⦄, (∀ (i : ℕ), MeasurableSet (f i)) → Pairwise (Function.onFun Disjoint f) → self.toOuterMeasure (⋃ i, f i) = ∑' (i : ℕ), self.toOuterMeasure (f i)
trim_le : self.trim ≤ self.toOuterMeasure
example {α : Type*} [MeasurableSpace α]
(μ : MeasureTheory.Measure α) (s : Set α) : ENNReal :=
μ s
-- The measure of the empty set is 0
example {α : Type*} [MeasurableSpace α]
(μ : MeasureTheory.Measure α) : μ ∅ = 0 :=
MeasureTheory.measure_empty
Monotonicity: if s ⊆ t, then μ s ≤ μ t:
MeasureTheory.measure_mono.{u_1, u_3} {α : Type u_1} {F : Type u_3} [FunLike F (Set α) ENNReal] [MeasureTheory.OuterMeasureClass F α] {μ : F} {s t : Set α} (h : s ⊆ t) : μ s ≤ μ tMeasureTheory.measure_mono.{u_1, u_3} {α : Type u_1} {F : Type u_3} [FunLike F (Set α) ENNReal] [MeasureTheory.OuterMeasureClass F α] {μ : F} {s t : Set α} (h : s ⊆ t) : μ s ≤ μ t
5.5.5. The Lebesgue measure on ℝ
The standard measure on ℝ is the Lebesgue measure, available as
MeasureTheory.volume (or via the notation volume).
The Lebesgue measure of an interval:
Real.volume_Icc {a b : ℝ} : MeasureTheory.volume (Set.Icc a b) = ENNReal.ofReal (b - a)Real.volume_Icc {a b : ℝ} : MeasureTheory.volume (Set.Icc a b) = ENNReal.ofReal (b - a)
open MeasureTheory in
#check (volume : Measure ℝ)
5.5.6. A small worked computation
A concrete sanity check: the Lebesgue measure of the closed unit
interval is 1.
open MeasureTheory in
example : volume (Set.Icc (0 : ℝ) 1) = 1 := ⊢ volume (Set.Icc 0 1) = 1
⊢ ENNReal.ofReal (1 - 0) = 1
All goals completed! 🐙
Mathlib also gives the volume of Ico, Ioo, Ioc (all equal to
b - a when a ≤ b); a half-open interval can be written as a
disjoint union with a single point, and points have measure zero
(Real.volume_singleton).
5.5.7. Integration
Mathlib provides the Bochner integral ∫ x, f x ∂μ (typed \int), which
generalizes both the Lebesgue integral for real-valued functions and the
integral for Banach-space-valued functions.
MeasureTheory.integral.{u_6, u_7} {α : Type u_6} {G : Type u_7} [NormedAddCommGroup G] [NormedSpace ℝ G] {x✝ : MeasurableSpace α} (μ : MeasureTheory.Measure α) (f : α → G) : GMeasureTheory.integral.{u_6, u_7} {α : Type u_6} {G : Type u_7} [NormedAddCommGroup G] [NormedSpace ℝ G] {x✝ : MeasurableSpace α} (μ : MeasureTheory.Measure α) (f : α → G) : G
The Bochner integral
Linearity of the integral:
MeasureTheory.integral_add.{u_1, u_5} {α : Type u_1} {G : Type u_5} [NormedAddCommGroup G] [NormedSpace ℝ G] {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f g : α → G} (hf : MeasureTheory.Integrable f μ) (hg : MeasureTheory.Integrable g μ) : ∫ (a : α), f a + g a ∂μ = ∫ (a : α), f a ∂μ + ∫ (a : α), g a ∂μMeasureTheory.integral_add.{u_1, u_5} {α : Type u_1} {G : Type u_5} [NormedAddCommGroup G] [NormedSpace ℝ G] {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f g : α → G} (hf : MeasureTheory.Integrable f μ) (hg : MeasureTheory.Integrable g μ) : ∫ (a : α), f a + g a ∂μ = ∫ (a : α), f a ∂μ + ∫ (a : α), g a ∂μ
MeasureTheory.integral_smul.{u_1, u_4, u_5} {α : Type u_1} {𝕜 : Type u_4} [NormedDivisionRing 𝕜] {G : Type u_5} [NormedAddCommGroup G] [NormedSpace ℝ G] {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} [Module 𝕜 G] [NormSMulClass 𝕜 G] [SMulCommClass ℝ 𝕜 G] (c : 𝕜) (f : α → G) : ∫ (a : α), c • f a ∂μ = c • ∫ (a : α), f a ∂μMeasureTheory.integral_smul.{u_1, u_4, u_5} {α : Type u_1} {𝕜 : Type u_4} [NormedDivisionRing 𝕜] {G : Type u_5} [NormedAddCommGroup G] [NormedSpace ℝ G] {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} [Module 𝕜 G] [NormSMulClass 𝕜 G] [SMulCommClass ℝ 𝕜 G] (c : 𝕜) (f : α → G) : ∫ (a : α), c • f a ∂μ = c • ∫ (a : α), f a ∂μ
The Bochner integral is linear. Note this requires 𝕜 to be a normed division ring, in order
to ensure that for c ≠ 0, the function c • f is integrable iff f is. For an analogous
statement for more general rings with an a priori integrability assumption on f, see
MeasureTheory.Integrable.integral_smul.
Integral over an indicator function:
MeasureTheory.integral_indicator.{u_1, u_3} {X : Type u_1} {E : Type u_3} {mX : MeasurableSpace X} [NormedAddCommGroup E] [NormedSpace ℝ E] {f : X → E} {s : Set X} {μ : MeasureTheory.Measure X} (hs : MeasurableSet s) : ∫ (x : X), s.indicator f x ∂μ = ∫ (x : X) in s, f x ∂μMeasureTheory.integral_indicator.{u_1, u_3} {X : Type u_1} {E : Type u_3} {mX : MeasurableSpace X} [NormedAddCommGroup E] [NormedSpace ℝ E] {f : X → E} {s : Set X} {μ : MeasureTheory.Measure X} (hs : MeasurableSet s) : ∫ (x : X), s.indicator f x ∂μ = ∫ (x : X) in s, f x ∂μ
For a function f and a measurable set s, the integral of indicator s f
over the whole space is equal to ∫ x in s, f x ∂μ defined as ∫ x, f x ∂(μ.restrict s).
For nonnegative functions, there is also the ∫⁻ (lower Lebesgue integral)
taking values in ℝ≥0∞:
#check @MeasureTheory.lintegral
-- ∫⁻ x, f x ∂μ : ℝ≥0∞
5.5.8. Convergence theorems
The fundamental convergence theorems of measure theory are available in Mathlib:
Monotone Convergence Theorem: If f₀ ≤ f₁ ≤ f₂ ≤ ... is an increasing
sequence of nonnegative measurable functions, then
∫⁻ x, (⨆ n, f n x) ∂μ = ⨆ n, ∫⁻ x, f n x ∂μ.
MeasureTheory.lintegral_iSup.{u_1} {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : ℕ → α → ENNReal} (hf : ∀ (n : ℕ), Measurable (f n)) (h_mono : Monotone f) : ∫⁻ (a : α), ⨆ n, f n a ∂μ = ⨆ n, ∫⁻ (a : α), f n a ∂μMeasureTheory.lintegral_iSup.{u_1} {α : Type u_1} {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {f : ℕ → α → ENNReal} (hf : ∀ (n : ℕ), Measurable (f n)) (h_mono : Monotone f) : ∫⁻ (a : α), ⨆ n, f n a ∂μ = ⨆ n, ∫⁻ (a : α), f n a ∂μ
Monotone convergence theorem, version with Measurable functions.
Dominated Convergence Theorem: If fₙ → f pointwise a.e., and
|fₙ| ≤ g a.e. for an integrable g, then ∫ x, fₙ x ∂μ → ∫ x, f x ∂μ.
MeasureTheory.tendsto_integral_of_dominated_convergence.{u_1, u_3} {α : Type u_1} {G : Type u_3} [NormedAddCommGroup G] [NormedSpace ℝ G] {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {F : ℕ → α → G} {f : α → G} (bound : α → ℝ) (F_measurable : ∀ (n : ℕ), MeasureTheory.AEStronglyMeasurable (F n) μ) (bound_integrable : MeasureTheory.Integrable bound μ) (h_bound : ∀ (n : ℕ), ∀ᵐ (a : α) ∂μ, ‖F n a‖ ≤ bound a) (h_lim : ∀ᵐ (a : α) ∂μ, Filter.Tendsto (fun n => F n a) Filter.atTop (nhds (f a))) : Filter.Tendsto (fun n => ∫ (a : α), F n a ∂μ) Filter.atTop (nhds (∫ (a : α), f a ∂μ))MeasureTheory.tendsto_integral_of_dominated_convergence.{u_1, u_3} {α : Type u_1} {G : Type u_3} [NormedAddCommGroup G] [NormedSpace ℝ G] {m : MeasurableSpace α} {μ : MeasureTheory.Measure α} {F : ℕ → α → G} {f : α → G} (bound : α → ℝ) (F_measurable : ∀ (n : ℕ), MeasureTheory.AEStronglyMeasurable (F n) μ) (bound_integrable : MeasureTheory.Integrable bound μ) (h_bound : ∀ (n : ℕ), ∀ᵐ (a : α) ∂μ, ‖F n a‖ ≤ bound a) (h_lim : ∀ᵐ (a : α) ∂μ, Filter.Tendsto (fun n => F n a) Filter.atTop (nhds (f a))) : Filter.Tendsto (fun n => ∫ (a : α), F n a ∂μ) Filter.atTop (nhds (∫ (a : α), f a ∂μ))
Lebesgue dominated convergence theorem provides sufficient conditions under which almost
everywhere convergence of a sequence of functions implies the convergence of their integrals.
We could weaken the condition bound_integrable to require HasFiniteIntegral bound μ instead
(i.e. not requiring that bound is measurable), but in all applications proving integrability
is easier.
These theorems are stated using filters: convergence of the integrals is
expressed as Filter.Tendsto with respect to Filter.atTop.
5.5.9. Probability measures
A measure μ is a probability measure if μ Set.univ = 1. Mathlib uses
the typeclass MeasureTheory.IsProbabilityMeasure μ for this.
MeasureTheory.IsProbabilityMeasure.{u_1} {α : Type u_1} {m0 : MeasurableSpace α} (μ : MeasureTheory.Measure α) : PropMeasureTheory.IsProbabilityMeasure.{u_1} {α : Type u_1} {m0 : MeasurableSpace α} (μ : MeasureTheory.Measure α) : Prop
A measure μ is called a probability measure if μ univ = 1.
Instance Constructor
MeasureTheory.IsProbabilityMeasure.mk.{u_1}
Methods
measure_univ : μ Set.univ = 1
example {α : Type*} [MeasurableSpace α]
(μ : MeasureTheory.Measure α)
[MeasureTheory.IsProbabilityMeasure μ] (s : Set α) :
μ s ≤ 1 := α:Type u_1inst✝¹:MeasurableSpace αμ:MeasureTheory.Measure αinst✝:MeasureTheory.IsProbabilityMeasure μs:Set α⊢ μ s ≤ 1
calc μ s ≤ μ Set.univ :=
MeasureTheory.measure_mono (Set.subset_univ s)
_ = 1 :=
MeasureTheory.IsProbabilityMeasure.measure_univ
There is also MeasureTheory.IsFiniteMeasure μ for finite measures and
MeasureTheory.SigmaFinite μ for sigma-finite measures.
5.5.10. Almost everywhere and ae-filter
Many statements in measure theory hold "almost everywhere" (a.e.), meaning
outside a set of measure zero. In Mathlib, this is captured by the
ae-filter MeasureTheory.Measure.ae μ, which is the filter of sets whose
complement has measure zero.
MeasureTheory.ae.{u_1, u_3} {α : Type u_1} {F : Type u_3} [FunLike F (Set α) ENNReal] [MeasureTheory.OuterMeasureClass F α] (μ : F) : Filter αMeasureTheory.ae.{u_1, u_3} {α : Type u_1} {F : Type u_3} [FunLike F (Set α) ENNReal] [MeasureTheory.OuterMeasureClass F α] (μ : F) : Filter α
The “almost everywhere” filter of co-null sets.
open MeasureTheory in
example {α : Type*} [MeasurableSpace α]
(μ : Measure α) (f g : α → ℝ) :
f =ᵐ[μ] g ↔ ∀ᶠ x in ae μ, f x = g x :=
Iff.rfl
This is a beautiful example of the filter framework in action: "almost everywhere" is just "eventually" with respect to the ae-filter.
5.5.11. Conditional expectation
For probability theory, Mathlib provides conditional expectation with respect to a sub-sigma-algebra. This is an advanced topic, but the key definition is:
MeasureTheory.condExp.{u_5, u_6} {α : Type u_5} {E : Type u_6} (m : MeasurableSpace α) {m₀ : MeasurableSpace α} [NormedAddCommGroup E] [CompleteSpace E] [NormedSpace ℝ E] (μ : MeasureTheory.Measure α) (f : α → E) : α → EMeasureTheory.condExp.{u_5, u_6} {α : Type u_5} {E : Type u_6} (m : MeasurableSpace α) {m₀ : MeasurableSpace α} [NormedAddCommGroup E] [CompleteSpace E] [NormedSpace ℝ E] (μ : MeasureTheory.Measure α) (f : α → E) : α → E
Conditional expectation of a function, with notation μ[f | m].
It is defined as 0 if any one of the following conditions is true:
-
mis not a sub-σ-algebra ofm₀, -
μis not σ-finite with respect tom, -
fis not integrable.
The conditional expectation μ[f | m] (notation
MeasureTheory.condExp m μ f) is the unique (up to
μ-almost-everywhere equality) function g : α → E such that
-
gism-measurable, wheremis a sub-sigma-algebra of the ambient sigma-algebra onα; -
for every
m-measurable sets,∫ x in s, f x ∂μ = ∫ x in s, g x ∂μ.
In probability terms: μ[f | m] is the best m-measurable
predictor of f in the L² sense. The defining property is
captured by MeasureTheory.setIntegral_condExp. Mathlib has the
linearity, monotonicity, tower, and Jensen properties; the API
lives in the MeasureTheory namespace and uses condExp_ as a
prefix.
The conditional expectation μ[f | m] is the unique (a.e.) m-measurable
function that agrees with f in integral over every m-measurable set.
5.5.12. Summary of key Mathlib API for measure theory
Concept | Mathlib name |
|---|---|
Sigma-algebra |
|
Measurable set |
|
Measurable function |
|
Measure |
|
Lebesgue measure |
|
Bochner integral |
|
Lebesgue integral |
|
Probability measure |
|
Almost everywhere |
|
Integrable |
|
Conditional expectation |
|
Working with measure theory in Mathlib requires familiarity with ℝ≥0∞
(extended nonnegative reals) and the filter framework. Many lemmas require
measurability or integrability hypotheses, so expect to prove these as
side goals.
5.6. Monads and Probability Mass Functions
This chapter has two parts. The first is a general introduction to
monads -- a design pattern from functional programming that
provides a uniform way to sequence computations with additional
structure (failure, multiple results, state, I/O, probability). The
second part specialises the discussion to Mathlib's PMF α, the
type of discrete probability distributions on α, which forms a
monad in exactly this sense.
5.6.1. Monads: the general pattern
Monads are a design pattern from functional programming that provide a uniform way to sequence computations with additional structure -- such as computations that may fail, that carry state, or that perform input/output. The name "monad" comes from category theory, but in programming they are a practical tool rather than an abstract concept.
5.6.1.1. The problem: chaining computations that may fail
Consider looking up values in a list of key-value pairs. The lookup might fail if the key is not present. Lean represents this with the Option type:
#check @Option
-- Option : Type u → Type u
A value of type Option α is either some a (for some a : α) or none. This is like a computation that might produce a result or might fail.
def safeDivide (a b : ℕ) : Option ℕ :=
if b == 0 then none else some (a / b)
#eval safeDivide 10 3 -- outputs some 3
#eval safeDivide 10 0 -- outputs none
Now suppose we want to chain several such computations: divide a by b, then divide the result by c. Without special support, we would need to write nested pattern matches:
def chainDivide (a b c : ℕ) : Option ℕ :=
match safeDivide a b with
| none => none
| some r => safeDivide r c
#eval chainDivide 100 5 4 -- outputs some 5
#eval chainDivide 100 0 4 -- outputs none
#eval chainDivide 100 5 0 -- outputs none
This nesting becomes deeply unreadable with more steps. Monads solve this problem.
5.6.1.2. The Bind operation
The key operation of a monad is bind (also written >>=). For Option, it works like this:
def myBind (x : Option α) (f : α → Option β) : Option β :=
match x with
| none => none
| some a => f a
It says: if x is none, propagate the failure; if x is some a, apply f to a. Now we can rewrite our chained division:
def chainDivide' (a b c : ℕ) : Option ℕ :=
safeDivide a b >>= fun r => safeDivide r c
This reads naturally: "divide a by b, then take the result r and divide it by c."
The monad instance for Option is given by:
-
pure (a : α) : Option α := some a -
bind (x : Option α) (f : α → Option β) : Option β := match x with | none => none | some a => f a
So pure wraps a value in some, and bind propagates none or applies the continuation f to the contained value.
5.6.1.3. The do-notation
Writing >>= and fun can still be verbose. Lean provides do-notation, which makes monadic code look almost like imperative programming:
def chainDivideDo (a b c : ℕ) : Option ℕ := do
let r ← safeDivide a b
safeDivide r c
#eval chainDivideDo 100 5 4 -- outputs some 5
#eval chainDivideDo 100 0 4 -- outputs none
The let r ← e syntax means: "run the computation e; if it succeeds with value r, continue; if it fails, propagate the failure." Under the hood, Lean translates this into >>= calls.
We can chain many steps:
def multiDivide (a b c d : ℕ) : Option ℕ := do
let r1 ← safeDivide a b
let r2 ← safeDivide r1 c
safeDivide r2 d
#eval multiDivide 1000 10 5 4 -- outputs some 5
#eval multiDivide 1000 10 0 4 -- outputs none
5.6.1.4. The Functor typeclass
Before we discuss monads, we need the simpler concept of a functor. In Lean, Functor is a typeclass for type constructors f : Type → Type that support a map operation:
-
map : (α → β) → f α → f β-- applies a function to the value(s) inside the container, without changing the structure.
The infix notation for map is <$>, so g <$> x means Functor.map g x. Every monad is automatically a functor (since map can be defined in terms of bind and pure), but the functor interface is useful on its own when you only need to transform values without chaining computations.
Here is what map does for each of our examples:
-
Option:map f xappliesfto the contained value ifx = some a, givingsome (f a), and returnsnoneifx = none.
#eval (· + 10) <$> (some 5 : Option ℕ) -- some 15
#eval (· + 10) <$> (none : Option ℕ) -- none
-
List:map f xsappliesfto every element of the list, i.e.List.map f xs.
#eval (· * 2) <$> [1, 2, 3] -- [2, 4, 6]
-
IO:map f actionproduces a new IO action that runsactionand then appliesfto its result. -
StateM σ:map f xruns the stateful computationx, obtains resultaand new states', and returns(f a, s'). The state is threaded through unchanged.
#eval (Functor.map (· * 10) (get : StateM ℕ ℕ)).run 7
-- outputs (70, 7)
-
Except ε:map f xappliesfto the value ifx = Except.ok a, givingExcept.ok (f a), and returnsExcept.error eunchanged ifx = Except.error e.
#eval (· + 1) <$> (Except.ok 5 : Except String ℕ)
-- Except.ok 6
#eval (· + 1) <$> (Except.error "fail" : Except String ℕ)
-- Except.error "fail"
-
Set:map f sis the image ofsunderf, i.e.{f a | a ∈ s}. This is the familiar image operation from mathematics. -
Filter:Filter.map f Fis the pushforward filter. A setsbelongs toFilter.map f Fif and only iff ⁻¹' s ∈ F. This generalizes the image operation to filters. -
Finset:map f sapplies an embeddingfto each element of the finite set. For a plain function, one usesFinset.image f sinstead.
In summary, map always means "apply a function inside the container." The key property is that map preserves the structure: it cannot change none to some, shorten a list, or introduce new side effects. It only transforms the values.
5.6.1.5. The Monad typeclass
In Lean, Monad is a typeclass. Any type constructor m : Type → Type can be a monad if it provides two operations:
-
pure : α → m α-- wraps a plain value into the monadic context. -
bind : m α → (α → m β) → m β-- chains computations.
These must satisfy the monad laws (associativity and identity), which ensure that do-notation behaves sensibly. Every Monad is also a Functor, with map f x = x >>= (pure ∘ f).
Lean already provides Monad Option, Monad List, Monad IO, and many more. For each monad we encounter below, we will spell out what pure and bind mean concretely.
5.6.1.6. Lists as a monad
Lists can also be viewed as monads, representing computations with multiple possible results:
def pairs : List (ℕ × ℕ) := do
let x ← [1, 2, 3]
let y ← [10, 20]
return (x, y)
#eval pairs
-- outputs [(1, 10), (1, 20), (2, 10), (2, 20), (3, 10), (3, 20)]
Here, let x ← [1, 2, 3] means "for each x in the list [1, 2, 3]," and return (x, y) creates a singleton list. The monadic bind for lists is essentially List.flatMap (also known as concatMap or List.bind).
This is similar to set-builder notation in mathematics: { (x, y) | x ∈ {1,2,3}, y ∈ {10,20} }.
The monad instance for List is given by:
-
pure (a : α) : List α := [a] -
bind (xs : List α) (f : α → List β) : List β := xs.flatMap f
So pure creates a singleton list, and bind applies f to each element and concatenates the resulting lists. For example, [1, 2].bind (fun x => [x, x * 10]) evaluates to [1, 10, 2, 20].
5.6.1.7. The IO monad
Functional programming insists on pure functions, but real programs need to interact with the outside world: reading files, printing to the screen, communicating over a network. The IO monad solves this: an IO α computation is a description of an action that, when executed, may perform side effects and produce a value of type α.
def greet (name : String) : IO Unit := do
IO.println s!"Hello, {name}!"
IO.println "Welcome to Lean."
Inside do-notation, we can sequence IO actions. The key insight is that greet itself is a pure function -- it returns a description of actions, not the actions themselves. The actions are only performed when the program runs.
A complete program in Lean is a value of type IO Unit:
def myMain : IO Unit := do
IO.println "What is 2 + 2?"
IO.println s!"It is {2 + 2}."
The monad instance for IO is given by:
-
pure (a : α) : IO α-- returnsawithout performing any side effect. -
bind (x : IO α) (f : α → IO β) : IO β-- first executes the actionx, obtaining a valuea : α, then executesf a.
Internally, IO α is defined as EStateM IO.Error IO.RealWorld α, a state monad over a token representing the real world. However, these implementation details are hidden: you cannot pattern-match on an IO value.
5.6.1.8. Stateful computation with StateM
The StateM monad allows computations that read and modify state. A value of type StateM σ α is a computation that can access and modify state of type σ and produces a result of type α.
def counter : StateM ℕ ℕ := do
let n ← get
set (n + 1)
return n
def countThree : StateM ℕ (ℕ × ℕ × ℕ) := do
let a ← counter
let b ← counter
let c ← counter
return (a, b, c)
#eval countThree.run 0 -- outputs ((0, 1, 2), 3)
The .run function executes the stateful computation with an initial state. The result ((0, 1, 2), 3) tells us the three counter values were 0, 1, 2, and the final state is 3.
The type StateM σ α is defined as σ → (α × σ), i.e. a function that takes an initial state and returns a result together with an updated state. The monad instance is:
-
pure (a : α) : StateM σ α := fun s => (a, s)-- returnsaand leaves the state unchanged. -
bind (x : StateM σ α) (f : α → StateM σ β) : StateM σ β := fun s => let (a, s') := x s; f a s'-- runsxwith states, obtaining resultaand new states', then runsf awith states'.
The operations get : StateM σ σ := fun s => (s, s) and set (s' : σ) : StateM σ Unit := fun _ => ((), s') allow reading and writing the state.
5.6.1.9. Error handling with Except
The Except type is like Option, but it carries an error message when something goes wrong:
def safeDivideE (a b : ℕ) : Except String ℕ :=
if b == 0 then Except.error "division by zero"
else Except.ok (a / b)
def computation : Except String ℕ := do
let r1 ← safeDivideE 100 5
let r2 ← safeDivideE r1 4
return r2
#eval computation -- outputs Except.ok 5
def failingComputation : Except String ℕ := do
let r1 ← safeDivideE 100 0
let r2 ← safeDivideE r1 4
return r2
#eval failingComputation -- outputs Except.error "division by zero"
With Except ε α, we can propagate meaningful error messages instead of just none.
The type Except ε α has two constructors: Except.ok (a : α) and Except.error (e : ε). The monad instance is:
-
pure (a : α) : Except ε α := Except.ok a -
bind (x : Except ε α) (f : α → Except ε β) : Except ε β := match x with | Except.error e => Except.error e | Except.ok a => f a
This is exactly analogous to Option, but the failure case carries an error value of type ε instead of just being none.
5.6.1.10. Sets as a monad
In mathematics, a natural analogue of the List monad is the Set monad. A set s : Set α represents a nondeterministic choice among the elements of s. Mathlib provides a monad instance for Set, but it is not registered globally because sets have no computational content (you cannot iterate over an arbitrary Set). To use it, we enable it locally:
section SetMonad
attribute [local instance] Set.monad
def setPairs : Set (ℕ × ℕ) := do
let x ← ({1, 2, 3} : Set ℕ)
let y ← ({10, 20} : Set ℕ)
return (x, y)
example : setPairs = {(1, 10), (1, 20), (2, 10), (2, 20), (3, 10), (3, 20)} := ⊢ setPairs = {(1, 10), (1, 20), (2, 10), (2, 20), (3, 10), (3, 20)}
⊢ {(3, 20), (3, 10), (2, 20), (2, 10), (1, 20), (1, 10)} = {(1, 10), (1, 20), (2, 10), (2, 20), (3, 10), (3, 20)}
a:ℕb:ℕ⊢ (a, b) ∈ {(3, 20), (3, 10), (2, 20), (2, 10), (1, 20), (1, 10)} ↔
(a, b) ∈ {(1, 10), (1, 20), (2, 10), (2, 20), (3, 10), (3, 20)}
a:ℕb:ℕ⊢ a = 3 ∧ b = 20 ∨ a = 3 ∧ b = 10 ∨ a = 2 ∧ b = 20 ∨ a = 2 ∧ b = 10 ∨ a = 1 ∧ b = 20 ∨ a = 1 ∧ b = 10 ↔
a = 1 ∧ b = 10 ∨ a = 1 ∧ b = 20 ∨ a = 2 ∧ b = 10 ∨ a = 2 ∧ b = 20 ∨ a = 3 ∧ b = 10 ∨ a = 3 ∧ b = 20
All goals completed! 🐙
end SetMonad
The monad instance for Set is given by:
-
pure (a : α) : Set α := {a}-- the singleton set. -
bind (s : Set α) (f : α → Set β) : Set β := ⋃ a ∈ s, f a-- the union of all images underf.
This is exactly the mathematical counterpart of the List monad: where List.bind concatenates lists, Set.bind takes unions of sets.
5.6.1.11. Filters as a monad
Recall that a filter F : Filter α is a collection of sets that is upward closed and closed under finite intersections. Filters generalize the notion of "sets of large elements" or "neighborhoods." Mathlib defines pure and bind operations for filters:
#check @Filter.bind
-- Filter.bind : Filter α → (α → Filter β) → Filter β
The Pure instance for Filter is registered so that pure a is the principal filter of {a}:
-
pure (a : α) : Filter α := 𝓟 {a}-- the collection of all sets containinga. In particular,s ∈ pure a ↔ a ∈ s. -
Filter.bind (F : Filter α) (m : α → Filter β) : Filter β := Filter.join (Filter.map m F)-- informally, this takes the "union" of the filtersm aasaranges overF.
However, Mathlib does not register a Monad Filter instance. The reason is that this bind is incompatible with the Applicative Filter instance (which is based on Filter.seq), so the monad laws would not hold with respect to the existing applicative structure. One can still use pure and Filter.bind directly, but do-notation is not available for filters.
5.6.1.12. Finite sets and Finset
Finset α is the type of finite sets with elements in α. Unlike Set, finite sets do have computational content, and Mathlib registers a global Monad Finset instance.
section FinsetMonad
open scoped Classical
noncomputable def finsetPairs : Finset (ℕ × ℕ) := do
let x ← ({1, 2, 3} : Finset ℕ)
let y ← ({10, 20} : Finset ℕ)
return (x, y)
end FinsetMonad
Note that we need open scoped Classical because the Monad Finset instance requires all propositions to be decidable. This makes the definition noncomputable, so we cannot use #eval on it. However, we can still reason about it in proofs.
The monad instance for Finset is given by:
-
pure (a : α) : Finset α := {a}-- the singleton finite set. -
bind (s : Finset α) (f : α → Finset β) : Finset β := s.sup f-- which equalss.biUnion f, the finite union of all images underf.
This is the finite analogue of the Set monad.
5.6.1.13. Connection to category theory
The name "monad" comes from category theory. In that setting, a monad on a category C is an endofunctor T : C → C together with natural transformations η : Id → T (the unit, corresponding to pure) and μ : T ∘ T → T (the multiplication, related to bind/join), satisfying certain coherence conditions.
The programming notion of monad is exactly this, specialized to the category of types and functions. The monad laws in programming:
-
pure a >>= f = f a-- left identity -
m >>= pure = m-- right identity -
(m >>= f) >>= g = m >>= (fun x => f x >>= g)-- associativity
correspond precisely to the coherence conditions for monads in category theory. So if you study category theory later, you will find that you already understand monads from a programming perspective.
5.6.1.14. Summary of monads seen so far
Monads provide a unified pattern for sequencing computations:
-
Option-- computations that may fail silently -
Except ε-- computations that may fail with an error of typeε -
List-- computations with multiple results -
Set-- nondeterministic choice over (possibly infinite) sets -
Finset-- nondeterministic choice over finite sets -
StateM σ-- computations with mutable state of typeσ -
IO-- computations with real-world side effects
Filters have pure and bind operations but do not form a Monad instance due to incompatibility with the applicative structure.
The do-notation makes monadic code readable and close to imperative style, while preserving the benefits of functional programming (purity, type safety, composability).
The next sections apply exactly this framework to discrete
probability distributions, packaged in Mathlib as PMF α.
5.6.2. Probability mass functions
Mathlib's PMF α models a discrete probability distribution on a
type α: it is a function α → ℝ≥0∞ whose values sum (as a
tsum) to 1. Unlike full measure theory, working with PMF lets
us avoid most measurability hypotheses: the support of a PMF is
automatically countable, and there is no sigma-algebra on α to
keep track of.
The goal of the rest of this chapter is to explain the monadic
structure of PMF: the four operations pure, bind, map,
join, and the laws they satisfy. This structure is exactly the
one introduced above, now applied to probability distributions.
5.6.2.1. Notation and naming conventions
Symbol | Lean name | Reads as | Typed as |
|---|---|---|---|
|
| "probability mass function on α" | (ASCII) |
|
| "the probability of a under p" | (ASCII) |
|
| "the support of p" | (ASCII) |
|
| "the Dirac measure at a" | (ASCII) |
|
| "bind p with f" | (ASCII) |
|
| "push p along f" | (ASCII) |
|
| "f mapped over p" | (ASCII) |
|
| "p bind f" | (ASCII) |
|
| "extended nonneg reals" |
|
5.6.2.2. The definition
PMF.{u} (α : Type u) : Type uPMF.{u} (α : Type u) : Type u
A probability mass function, or discrete probability measures is a function α → ℝ≥0∞ such
that the values have (infinite) sum 1.
So a PMF α is a function p : α → ℝ≥0∞ together with a proof
HasSum p 1. In particular:
-
p a : ℝ≥0∞is the probability mass ata; -
∑' a, p a = 1(total mass is one); -
p.support = {a | p a ≠ 0}is automatically countable.
The countability of the support is the key reason we can largely
avoid measurability: a discrete distribution only "sees" countably
many values, so integrals against a PMF reduce to infinite sums.
5.6.2.3. pure: the Dirac distribution
The simplest PMF is the Dirac mass at a point a : α: it
assigns probability 1 to a and 0 everywhere else.
The API:
example : (PMF.pure (0 : ℕ)) 0 = 1 :=
PMF.pure_apply_self 0
example (n : ℕ) (h : n ≠ 0) : (PMF.pure (0 : ℕ)) n = 0 :=
PMF.pure_apply_of_ne 0 n h
5.6.2.4. bind: composing distributions
Given p : PMF α and a family of distributions
f : α → PMF β, the composite distribution p.bind f : PMF β
draws a according to p, then draws b according to f a.
Its probability mass is
(p.bind f) b = ∑' a, p a * (f a) b.
The support of the bind is the union of the supports of the second-stage distributions along the support of the first:
The monad laws hold by definition (no measurability needed):
These are exactly the two identity laws and the associativity law
introduced earlier in this chapter, specialised from Option and
List to PMF.
5.6.2.5. map: pushing a distribution along a function
If p : PMF α and f : α → β, the image distribution
p.map f : PMF β assigns to each b the total probability mass
that p places on the preimage f ⁻¹' {b}.
By definition,
p.map f = p.bind (pure ∘ f),
so map is just bind composed with pure. This is the generic
derivation of Functor.map from the monad operations.
The functor laws, again purely by calculation with bind/pure:
A handy identity relates bind and map:
Notice how no measurability hypothesis appears: PMF.map f p makes
sense for every function f : α → β. This is in stark contrast
to Measure.map, which requires Measurable f (and produces the
zero measure otherwise). Measurability only re-enters if we want
to compare p.map f with the pushforward measure p.toMeasure.map f,
as in PMF.toMeasure_map_apply -- but nothing on the PMF side
requires it.
5.6.2.6. join: flattening distributions of distributions
A distribution of distributions is a PMF (PMF α): a random
choice of a random variable. The join operation collapses it
into a single PMF α by sampling at both stages.
Mathlib does not define PMF.join separately, because every monad
already supplies it as bind id. In Lean:
noncomputable def PMF.join {α : Type*} (P : PMF (PMF α)) : PMF α :=
P.bind id
Equivalently, unfolding bind,
(P.join) a = ∑' q, P q * q a.
The general monad identities give us
-
join (pure p) = p(fromPMF.pure_bind); -
join (p.map pure) = p(fromPMF.bind_pure); -
join (P.map (fun p ↦ p.map f)) = (join P).map f(naturality); -
join (P.map join) = join (join P)(associativity of join),
all derivable from pure_bind, bind_pure, and bind_bind. No
measurability is used anywhere.
5.6.2.7. The Monad PMF instance
Everything above is packaged into a Monad instance for PMF,
so that do-notation works out of the box:
noncomputable example (die : PMF (Fin 6)) : PMF ℕ := do
let x ← die
let y ← die
pure (x.val + y.val + 2)
Under the hood, this is elaborated to
die.bind fun x => die.bind fun y => pure (x.val + y.val + 2).
5.6.2.8. Concrete distributions
Mathlib provides several familiar named distributions as PMFs:
-
PMF.bernoulli p hp : PMF Bool-- the Bernoulli distribution with probabilitypoftrue, requiringp ≤ 1. -
PMF.binomial p hp n : PMF (Fin (n + 1))-- the Binomial distributionB(n, p). -
PMF.uniformOfFinset s hs : PMF α-- the uniform distribution on a nonempty finite set.
noncomputable example :
PMF.bernoulli (1 / 2) (⊢ 1 / 2 ≤ 1 All goals completed! 🐙) true = 1 / 2 := ⊢ (PMF.bernoulli (1 / 2) ⋯) true = 1 / 2
All goals completed! 🐙
5.6.2.9. Beyond PMFs: measures, expectation, independence
PMF is the right tool for discrete distributions, but probability
theory in Mathlib is much larger. A sketch of where to go next:
-
Every
PMF αinduces a probability measure onαviaPMF.toMeasure; conversely,Measure αis the general setting that handles continuous distributions. -
Expectation of a function
f : α → ℝagainst a probability measureμis the Bochner integral∫ x, f x ∂μ, found inMathlib.MeasureTheory.Integral.Bochner. -
Independence of two events / sigma-algebras / random variables is
ProbabilityTheory.IndepSets,IndepFun, etc., inMathlib.Probability.Independence. -
Conditional probability and conditional expectation live in
Mathlib.Probability.ConditionalProbabilityand the measure-theory chapter of these notes.
5.6.2.10. When do we still need measurability?
The PMF monad goes a remarkably long way with no mention of
sigma-algebras. You only need to reach for measurability when:
-
You want to convert a
PMF αto a MathlibMeasure α: this usesPMF.toMeasure, which does require aMeasurableSpace α. -
You want to integrate a function
g : α → ℝagainst aPMFand reuse the Bochner-integral library. For discrete sums this is usually avoidable; thePMF-level API (e.g.PMF.bind_apply) gives you an explicittsum. -
You want to compare
PMF.map fwith the measure-theoretic pushforwardMeasure.map f.
For most manipulations of discrete distributions -- composing
samplers, pushing them along functions, computing marginals --
working directly with PMF and its monad laws is cleaner and
measurability-free.