Formal Languages, Coinductively Formalized

Andreas Abel
Department of Computer Science and Engineering
Chalmers and Gothenburg University

Departmental Seminar
Department of Computer and Information Sciences
Strathclyde University, Glasgow, Scotland, UK
19 April 2016

Andreas Abel (GU) Languages coinductively Strathclyde 2016

Contents

1 Formal Languages
2 Coinductive Types and Copatterns
3 Bisimilarity
4 Sized Coinductive Types
5 Conclusions

Formal Languages

A language is a set of strings over some alphabet A.

Real life examples:
  Orthographically and grammatically correct English texts (infinite set).
  Orthographically correct English texts (even bigger set).
  List of university employees plus their phone extension:
    AbelAndreas1731, CoquandThierry1030, DybjerPeter1035, ...

Programming language examples:
  The set of grammatically correct Java programs.
  The set of decimal numbers.
  The set of well-formed string literals.

Languages can describe protocols, e.g. file access.
  A = {o, r, w, c} (open, read, write, close)
  Read-only access: orc, oc, orrrc, orcorrrcoc, ...
  Illegal sequences: c, rr, orr, oco, ...

Running Example: Even binary numbers

Even binary numbers: 0, 10, 100, 110, 1000, 1010, ...
Excluded: 00, 010 (non-canonical); 1, 11 (odd); ...
Alphabet A = {a, b} where a is zero and b is one.
So E = {a, ba, baa, bba, baaa, baba, ...}.

Tries

An infinite trie is a node-labeled A-branching tree, i.e., each node has one branch for each letter a ∈ A.
A language can be represented by an infinite trie.
To check whether the word a1 ··· an is in the language:
  We start at the root.
  At step i, we choose branch ai.
  At the final node, the label tells us whether the word is in the language or not.
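The lookup procedure above can be sketched in Python. This is an illustration only, not code from the talk: the names Trie, trie_of, member and is_even_binary are hypothetical, and the infinite trie is simulated by building each subtree on demand. Building the trie from a membership predicate is circular as a definition of E; it only serves to show the walk from the root.

```python
from dataclasses import dataclass
from typing import Callable

ALPHABET = "ab"  # a stands for zero, b stands for one


@dataclass(frozen=True)
class Trie:
    label: bool                      # is the word leading to this node in the language?
    branch: Callable[[str], "Trie"]  # one subtree per letter, built only when visited


def trie_of(pred: Callable[[str], bool], prefix: str = "") -> Trie:
    """Infinite trie of the language described by a membership predicate."""
    return Trie(pred(prefix), lambda letter: trie_of(pred, prefix + letter))


def member(trie: Trie, word: str) -> bool:
    """Start at the root, follow one branch per letter, read the final label."""
    node = trie
    for letter in word:
        node = node.branch(letter)
    return node.label


def is_even_binary(w: str) -> bool:
    # Canonical even binary numbers: "a" (zero), or a word that starts with b and ends in a.
    return w == "a" or (w.startswith("b") and w.endswith("a"))


E = trie_of(is_even_binary)

if __name__ == "__main__":
    for word in ["a", "ba", "baba", "b", "aa", "aba"]:
        print(word, member(E, word))
```

Only the nodes along the path of the queried word are ever constructed, which is what makes the infinite tree usable.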
Trie of E

[Figure: the infinite trie of E. Every node has an a-branch and a b-branch; the node labels mark the accepted words.]

Regular Languages

A trie is regular if it has only finitely many different subtrees.
Each node of the trie corresponds to one of these languages:

  E   even binary numbers
  Z   strings ending in a
  N   strings not ending in b
  ε   the empty string
  ∅   nothing (empty language)

[Figure: the trie of E with each node labeled by its language (E, ε, ∅, N, Z).]

Cutting duplications at depth 3

[Figure: the labeled trie of E with duplicate subtrees cut off at depth 3.]

Bending branches ...

[Figure: the cut trie with branches bent back; nodes labeled ∅, ε, E, N, Z.]

Finite Automata

We have arrived at a familiar object: a finite automaton.
Depending on what we cut, we get different automata for E.
If we cut all duplicate subtrees, we get the minimal automaton.

Removing duplicate subtrees II ...

[Figure: the labeled trie of E with all duplicate subtrees removed; nodes labeled ε, ∅, E, N, Z.]

Bending branches II ...

[Figure: the resulting automaton with states ∅, ε, E, Z, N.]

Extensional Equality of Automata

All automata for E unfold to the same trie.
This gives an extensional notion of automata equality:
  1. Recognizing the same language.
  2. I.e., unfolding to the same trie.

Automata, Formally

An automaton consists of
  1. A set of states S.
  2. A function ν : S → Bool singling out the accepting states.
  3. A transition function δ : S → A → S.

  s ∈ S    E   ε   ∅   Z   N
  ν s      ✗   ✓   ✗   ✗   ✓
  δ s a    ε   ∅   ∅   N   N
  δ s b    Z   ∅   ∅   Z   Z

Language automaton
  1. State = language ℓ accepted when starting from that state.
  2. ν ℓ: is the language ℓ nullable (does it accept the empty word)?
  3. δ ℓ a = {w | aw ∈ ℓ}: Brzozowski derivative.

Differential equations

Language E and friends can be specified by differential equations; ν gives the initial value.

  ν ∅ = false    δ ∅ x = ∅
  ν ε = true     δ ε x = ∅
  ν E = false    δ E a = ε    δ E b = Z
  ν N = true     δ N a = N    δ N b = Z
  ν Z = false    δ Z a = N    δ Z b = Z

For these simple forms, solutions always exist.
What is the general story?

Coinductive Types and Copatterns

Final Coalgebras

(Weakly) final coalgebra.

\[
\begin{array}{ccc}
S & \xrightarrow{\;f\;} & F(S) \\
{\scriptstyle \mathsf{coit}\, f}\Big\downarrow & & \Big\downarrow{\scriptstyle F(\mathsf{coit}\, f)} \\
\nu F & \xrightarrow{\;\mathsf{force}\;} & F(\nu F)
\end{array}
\]

Coiteration = finality witness.
  force ∘ coit f = F (coit f) ∘ f
Copattern matching defines coit by corecursion:
  force (coit f s) = F (coit f) (f s)

Automata as Coalgebra

Arbib & Manes (1986), Rutten (1998), Traytel (2016).
Automaton structure over set of states S:
  o : S → Bool         "output": acceptance
  t : S → (A → S)      transition
An automaton is a coalgebra with F(S) = Bool × (A → S).
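A small Python sketch of coiteration for this particular functor, F(S) = Bool × (A → S). It is an illustration under assumptions, not the talk's Agda code: the class name Trie, the helper accepts, and the string names for the states are hypothetical. The transition data is the automaton table for E given earlier, and laziness is simulated with a closure.

```python
from dataclasses import dataclass
from typing import Callable

# Automaton for E: state -> (accepting?, {letter: next state}).
# "eps" and "empty" are hypothetical names for the states ε and ∅.
AUTOMATON = {
    "E":     (False, {"a": "eps",   "b": "Z"}),
    "eps":   (True,  {"a": "empty", "b": "empty"}),
    "empty": (False, {"a": "empty", "b": "empty"}),
    "Z":     (False, {"a": "N",     "b": "Z"}),
    "N":     (True,  {"a": "N",     "b": "Z"}),
}


@dataclass(frozen=True)
class Trie:
    """An element of the final coalgebra: a Bool and one successor per letter."""
    label: bool
    branch: Callable[[str], "Trie"]


def o(s: str) -> bool:
    """Output: is state s accepting?"""
    return AUTOMATON[s][0]


def t(s: str) -> Callable[[str], str]:
    """Transition: the successor of state s for each letter."""
    return lambda a: AUTOMATON[s][1][a]


def coit(o, t, s) -> Trie:
    # force (coit f s) = F (coit f) (f s), instantiated at F(S) = Bool × (A → S):
    # keep the Bool component, and unfold the successor state lazily.
    return Trie(o(s), lambda a: coit(o, t, t(s)(a)))


def accepts(trie: Trie, word: str) -> bool:
    for a in word:
        trie = trie.branch(a)
    return trie.label


if __name__ == "__main__":
    lang_E = coit(o, t, "E")
    for word in ["a", "baba", "bab", "aa"]:
        print(word, accepts(lang_E, word))
```

The recursive call to coit sits inside the closure, so it runs only when a branch is observed; this mirrors the idea that coit is defined by what force returns rather than by building the whole trie.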
The structure map is the pairing ⟨o, t⟩ : S → Bool × (A → S).

Formal Languages as Final Coalgebra

\[
\begin{array}{ccc}
S & \xrightarrow{\;\langle o, t\rangle\;} & \mathsf{Bool} \times (A \to S) \\
{\scriptstyle \ell \,:=\, \mathsf{coit}\,\langle o, t\rangle}\Big\downarrow & & \Big\downarrow{\scriptstyle \mathsf{id} \times (\mathsf{coit}\,\langle o, t\rangle \,\circ\, \_)} \\
\mathsf{Lang} & \xrightarrow{\;\langle \nu, \delta\rangle\;} & \mathsf{Bool} \times (A \to \mathsf{Lang})
\end{array}
\]

  ν ∘ ℓ = o                ν (ℓ s) = o s            "nullable"
  δ ∘ ℓ = (ℓ ∘ _) ∘ t      δ (ℓ s) = ℓ ∘ (t s)      (Brzozowski) derivative
                           δ (ℓ s) a = ℓ (t s a)

Languages – Rule-Based

Coinductive tries Lang defined via observations/projections ν and δ.
Lang is the greatest type consistent with these rules:

\[
\frac{l : \mathsf{Lang}}{\nu\, l : \mathsf{Bool}}
\qquad
\frac{l : \mathsf{Lang} \quad a : A}{\delta\, l\, a : \mathsf{Lang}}
\]

Empty language ∅ : Lang.
Language of the empty word ε : Lang defined by copattern matching:
  ν ε   = true : Bool
  δ ε a = ∅    : Lang

Corecursion

Empty language ∅ : Lang defined by corecursion:
  ν ∅   = false
  δ ∅ a = ∅
Language union k ∪ l is pointwise disjunction:
  ν (k ∪ l)   = ν k ∨ ν l
  δ (k ∪ l) a = δ k a ∪ δ l a
Language composition k · l à la Brzozowski:
  ν (k · l)   = ν k ∧ ν l
  δ (k · l) a = (δ k a · l) ∪ δ l a    if ν k
  δ (k · l) a = (δ k a · l)            otherwise
Not accepted because ∪ is not a constructor.

Bisimilarity

Equality of infinite tries is defined coinductively.
_≅_ is the greatest relation consistent with

\[
\frac{l \cong k}{\nu\, l \equiv \nu\, k}\;{\cong}\nu
\qquad
\frac{l \cong k \quad a : A}{\delta\, l\, a \cong \delta\, k\, a}\;{\cong}\delta
\]

Equivalence relation via provable ≅refl, ≅sym, and ≅trans.
  ≅trans               : (p : l ≅ k) → (q : k ≅ m) → l ≅ m
  ≅ν (≅trans p q)      = ≡trans (≅ν p) (≅ν q)       : ν l ≡ ν m
  ≅δ (≅trans p q) a    = ≅trans (≅δ p a) (≅δ q a)   : δ l a ≅ δ m a
Congruence for language constructions.

\[
\frac{k \cong k' \quad l \cong l'}{(k \cup l) \cong (k' \cup l')}\;{\cong}{\cup}
\]

Proving bisimilarity

Composition distributes over union.
  dist : ∀ k l m. k · (l ∪ m) ≅ (k · l) ∪ (k · m)
Proof.
Observation δ _ a, case k nullable, l not nullable.

    δ (k · (l ∪ m)) a
  = δ k a · (l ∪ m) ∪ δ (l ∪ m) a                  by definition
  ≅ (δ k a · l ∪ δ k a · m) ∪ (δ l a ∪ δ m a)      by coind. hyp. (wish)
  ≅ (δ k a · l ∪ δ l a) ∪ (δ k a · m ∪ δ m a)      by union laws
  = δ ((k · l) ∪ (k · m)) a                        by definition

Formal proof attempt.
  ≅δ dist a = ≅trans (≅∪ dist ...) ...
Not coiterative / guarded by constructors!

Sized Coinductive Types

Construction of greatest fixed-points

Iteration to greatest fixed-point.

\[
\top \supseteq F(\top) \supseteq F^2(\top) \supseteq \cdots \supseteq F^\omega(\top) = \bigcap_{n<\omega} F^n(\top)
\]

Naming ν^i F = F^i(⊤).

\[
\begin{aligned}
\nu^0 F &= \top \\
\nu^{n+1} F &= F(\nu^n F) \\
\nu^\omega F &= \bigcap_{n<\omega} \nu^n F
\end{aligned}
\]

Deflationary iteration.

\[
\nu^i F = \bigcap_{j<i} F(\nu^j F)
\]

Sized coinductive types

Add to syntax of type theory:
  Size       type of ordinals
  i          ordinal variables
  ν^i F      sized coinductive type
  Size< i    type of ordinals below i

Bounded quantification ∀j<i. A = (j : Size< i) → A.
Well-founded recursion on ordinals, roughly:

\[
\frac{f : \forall i.\ (\forall j<i.\ \nu^j F) \to \nu^i F}{\mathsf{fix}\, f : \forall i.\ \nu^i F}
\]

Sized coinductive type of languages

Lang i ≅ Bool × (∀j<i. A → Lang j)

\[
\frac{l : \mathsf{Lang}\ i}{\nu\, l : \mathsf{Bool}}
\qquad
\frac{l : \mathsf{Lang}\ i \quad j < i \quad a : A}{\delta\, l\, \{j\}\, a : \mathsf{Lang}\ j}
\]

∅ : ∀i. Lang i by copatterns and induction on i:
  ν (∅ {i})       = false : Bool
  δ (∅ {i}) {j} a = ∅ {j} : Lang j
Note j < i. On the right-hand side, ∅ : ∀j<i. Lang j (coinductive hypothesis).
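One informal way to read the size index is as an observation depth: an element of Lang i can be taken apart i times with δ. Python has no sized types, so the sketch below only imitates this reading by passing the depth explicitly to a comparison function. All names (Lang, char, concat, equal_upto) are hypothetical, laziness is simulated with closures, and a depth-bounded check is a test, not a proof of dist.

```python
from dataclasses import dataclass
from typing import Callable

ALPHABET = "ab"


@dataclass(frozen=True)
class Lang:
    nu: bool                        # nullable: is the empty word accepted?
    delta: Callable[[str], "Lang"]  # derivative, computed only when observed


def empty() -> Lang:
    return Lang(False, lambda a: empty())


def eps() -> Lang:
    return Lang(True, lambda a: empty())


def char(c: str) -> Lang:
    """Single-letter language {c} (helper not present in the talk)."""
    return Lang(False, lambda a: eps() if a == c else empty())


def union(k: Lang, l: Lang) -> Lang:
    return Lang(k.nu or l.nu, lambda a: union(k.delta(a), l.delta(a)))


def concat(k: Lang, l: Lang) -> Lang:
    def step(a: str) -> Lang:
        head = concat(k.delta(a), l)
        # Brzozowski: add the derivative of l only when k is nullable.
        return union(head, l.delta(a)) if k.nu else head
    return Lang(k.nu and l.nu, step)


def equal_upto(i: int, k: Lang, l: Lang) -> bool:
    """Compare k and l by observations of depth at most i."""
    if k.nu != l.nu:
        return False
    return i == 0 or all(
        equal_upto(i - 1, k.delta(a), l.delta(a)) for a in ALPHABET
    )


if __name__ == "__main__":
    k = union(char("a"), eps())          # nullable
    l = char("b")
    m = concat(char("a"), char("b"))
    lhs = concat(k, union(l, m))
    rhs = union(concat(k, l), concat(k, m))
    print(equal_upto(6, lhs, rhs))
    print(equal_upto(0, char("a"), char("b")), equal_upto(1, char("a"), char("b")))
```

At depth 0 only ν is compared, so {a} and {b} are indistinguishable; one derivative step separates them. Python accepts the definition of concat without complaint because it performs no productivity check at all, which is exactly the guarantee the sized types are meant to supply.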
Type-based guardedness checking

Union preserves size/guardedness:

\[
\frac{k : \mathsf{Lang}\ i \quad l : \mathsf{Lang}\ i}{k \cup l : \mathsf{Lang}\ i}
\]

  ν (k ∪ l)       = ν k ∨ ν l
  δ (k ∪ l) {j} a = δ k {j} a ∪ δ l {j} a

Composition is accepted and also guardedness-preserving:

\[
\frac{k : \mathsf{Lang}\ i \quad l : \mathsf{Lang}\ i}{k \cdot l : \mathsf{Lang}\ i}
\]

  ν (k · l)       = ν k ∧ ν l
  δ (k · l) {j} a = (δ k {j} a · l) ∪ δ l {j} a    if ν k
  δ (k · l) {j} a = (δ k {j} a · l)                otherwise

Guardedness-preserving bisimilarity proofs

Sized bisimilarity ≅ is the greatest family of relations consistent with

\[
\frac{l \cong^i k}{\nu\, l \equiv \nu\, k}\;{\cong}\nu
\qquad
\frac{l \cong^i k \quad j < i \quad a : A}{\delta\, l\, a \cong^j \delta\, k\, a}\;{\cong}\delta
\]

Equivalence and congruence rules are guardedness preserving.
  ≅trans                 : (p : l ≅^i k) → (q : k ≅^i m) → l ≅^i m
  ≅ν (≅trans p q)        = ≡trans (≅ν p) (≅ν q)           : ν l ≡ ν m
  ≅δ (≅trans p q) j a    = ≅trans (≅δ p j a) (≅δ q j a)   : δ l a ≅^j δ m a
Coinductive proof of dist accepted.
  ≅δ dist j a = ≅trans j (≅∪ (dist j) (≅refl j)) ...

Conclusions

Tracking guardedness in types allows
  natural modular corecursive definition
  natural bisimilarity proof using equation chains
Implemented in Agda (ongoing)
  Abel et al. (POPL 13): Copatterns
  Abel/Pientka (ICFP 13): Well-founded recursion with copatterns

Related work

Hagino (1987): Coalgebraic types
Cockett et al.: Charity
Dmitriy Traytel (PhD TU Munich, 2015): Languages coinductively in Isabelle
Kozen, Silva (2016): Practical coinduction
Hughes, Pareto, Sabry (POPL 1996)
Papers on sized types (1998–2015): e.g. Sacchini (LICS 2013)