1992 Draft of Notes for Undergraduate Discrete Math Course CS301/201 LOGICO-LINGUISTIC FOUNDATIONS OF COMPUTER SCIENCE REFERENCES For logico-linguistics: Barbara Partee, et al., Mathematical Methods in Linguistics. William Cooper, Foundations of Logico-Linguistics. For set theory based on minimal origins: Patrick Suppes, Axiomatic Set Theory. Elliot Mendelson, Introduction to Mathematical Logic. For grammars, automata, and computability: Harry Lewis and Christos Papadimitriou, Elements of The Theory of Computation. George Boolos and Richard Jeffrey, Computability and Logic. For formalization of logics and model theories: Harry Lewis and Christos Papadimitriou, Elements of The Theory of Computation. Herbert Enderton, Introduction to Mathematical Logic. PROLOGUE Everything is computable. Everything is specifiable. Are these true? They are very different questions from "how do I specify a computation of sorting in C?" To decide if everything is computable, one need decide what counts as computation. Is photosynthesis computation? What is the input to a computation? To decide if everything is specifiable, one need decide what counts as specification. Can one describe in language something that is not specifiable in language (e.g., describe something that is ineffable)? How does one describe the language in which specification occurs? Sixty years of studying computing has arrived at these partial answers: there is a sense of computing in which computing whether something computes something is uncomputable; but there are languages that can be specified that are powerful enough to specify themselves. Even if we are not interested in these questions and these answers, it is clear that a study of computing has much to do with language, specification, abstraction, and formalism. MACHINES Abstract machines are a basic conception of computation. Other models of computation include Markov algorithms and recursive functions. One model of machines presupposes symbolic input and makes transitions between machine states in a pre-defined way in response to input. Sometimes, when input is exhausted, the computation terminates. Sometimes there is output. Sometimes we can say that the computation was successful; other times, it can be said that it was not. As a definition of machines, this description is unsatisfactory. Our main interest in what follows is the development of sufficient formalism to define this model of machines satisfactorily. We are interested in the development of formalism, especially formal language powerful enough to define further formalism; we postpone results about formal models of computation to later computer science courses, and we defer to philosophers of mathematics the task of building formalism from minimal origins. SYMBOLS Machines take symbols as input. Examples of symbols input to a machine are "0", "1", CR, menu-open, button-release, disk-in, boot. What makes something a symbol is that someone takes it to stand for something in some context. Examples of non-textual symbols include: nautical flags on a halyard, a flashing icon, the hands of a clock, "How are you?" uttered as a salutation, this kiss (a symbol of our undying love), a Montblanc pen (a symbol of writing refinment), a dot in Morse code, an "T" in the pictorial representation of the board in rogue (symbolizing a troll), a flashing red traffic light. Not all things that are symbolic of something actually symbolize that something: morning sickness forebodes pregnancy, but it would take an idealistic person to perceive morning sickness as a symbol for pregnancy. Symbols are grouped by type. Symbol types can be tokened in multiple instances. "a" is a symbol type which is tokened (instantiated) at least five times in this sentence. Someone who writes mostly vowels on bathroom walls may not token the symbol type "z" very often, but may token "a"'s frequently. Symbols can be evaluated. The tty-evaluator in UNIX evaluates ctrl-C, ctrl-H. The shell variable evaluator in UNIX evaluates expressions such as "$TERM". The shell command evaluator in UNIX evaluates expressions such as "echo ls". Programs such as more evaluate command-line arguments, such as filenames. The usual evaluation of textual symbols is dereferencing. PRINT X; Set "x" to stand for "a". x is the same symbol type as "a". We have just dereferenced the symbol "x". We have not dereferenced x, because we do not know what "a" symbolizes. We also have not said the same thing as "x = a", which is a symbolization of an assertion, that an equivalence holds between whatever "x" symbolizes and whatever "a" symbolizes. Dereferencing is prevented by quoting. PRINT "X"; Double-quotes prevent "St. Louis" from having a large river run along it. St. Louis has a large river running along it. The translation of "unbekannt" is unfamiliar. The translation of "unbekannt" is "unfamiliar". The name of my pet is the first word in this sentence. The name of my pet is "the first word in this sentence". Consider the first word in this sentence. Consider "the first word in this sentence". Consider the first word in "this sentence". Consider the name of my pet. Despite old-fashioned rules of punctuation, it is incorrect to say that "me" is the same as "me." Double-quotes prevent "me" from referring to me, when it is used in this sentence. But notice that "me" refers to me. It is the first expression, ``"me"'' that doesn't refer to me. The second expression, "me" does refer to me. Some people think ""me"", the four-letter expression, refers to "me", the two-letter expression. I think the first is a tough expression to parse. The single-quote in the LISP S-expression "(list a 'a)" prevents the last atom from being evaluated. The backslash in UNIX prevents quotes from quoting. Consider echo \" "a" which prints a quote, a space, and an a. The ctrl-V in UNIX prevents tty-evaluation (so does backslashing a carriage return). Unquoting forces evaluation. Single backquotes in UNIX force command evaluation: "echo date" echoes four characters, while "echo `ls`" echoes considerable more. Commas under backquote unquotes in LISP: "(list `(a b ,c))" evaluates "c", but not "a" nor "b". The newspaper reported the angry politician's response to the allegations: "They're all lies." LateX files have uninterpreted text except where preceded by a "\" and delimited by a " " or other delimiter. Text unquoted in this manner is evaluated, as in "\begin{center} hello \end{center}". SETS Objects so far include (a) those objects we choose to denote by symbols, whose referents are understood in some context, and (b) those symbols we choose to denote by quoting the symbol. Prior to introducing proper names, we can currently refer to the objects me, you, "a", "b", "c", "d", and so forth. Classes collect objects. Some classes are sets, which obey the following properties. An object is an element of the set that explicitly lists the object among its members. "a" in {"a", me, you}. me in {me, you}. Let a = me. a !in {"a", "b"}. An object is an element of a set with explicitly listed membership only if the object equals one of the listed members. {"a", me} = {x | x = "a" or x = me}. Note that explicit membership lists cannot specify infinite sets (sets with an infinite membership, whatever that means). Infinite sets are created by implicit specification. The set that abstracts some property contains all and only those objects that have the property. At present, the language that expresses the property is not restricted. S = {x, 2x | x = 1 or x = 2} is equivalent to saying for all x: if x = 1 or x = 2, then x in S and 2x in S and for all x: if x in S or 2x in S, then x =1 or x = 2. "a" in {x | x is a letter}. not(me in {x | x is a letter}). S = {f(x) | P(x)} is equivalent to saying for all x: if P(x) then f(x) in S and for all x: if f(x) in S then P(x). Not all abstractions are successful. The set {f(x) | P(x)} is not adequately specified if "P" is unspecified or if "f" is unspecified. The set S = {x | x !in x} has inherent problems. Is S in S? We normally do not allow a set to be a member of itself. We normally do not consider the set of all sets. An specification can be successful even if there is no known algorithm for computing all members of the set. {x | x is a Turing machine program that halts} successfully specifies a set, even though there can be no program that generates all of its members (it is a non-constructive specification of a set). Two sets are equal just in case they have the same elements. {"a", me} = {me, "a"}. {"a", me, me} = {me, "a"}. {"a"} != {"a", me}. There is a set that contains no elements, for all x: x !in {}. Sets that contain a single element are called singleton sets. Sets containing exactly two elements are called doubletons. Formally, if S is a singleton, then for all x: for all y: if x in S and y in S then x = y; if S is a doubleton, then 1. for all x: for all y: for all z: (i.e., for all x, y, z:) if x in S and y in S and z in S then (x = y or x = z); and 2. there is some x: there is some y: x != y and x in S and y in S. For any two sets, A and B, A is a subset of B just in case every member of A is also a member of B. {"a", "b"} SUBSET {x | x is a letter}. {"a", "b"} SUBSET {"a", "b"}. {"a", "b", "c"} !SUBSET {"a", "b"}. For all sets A, B: A PROPER-SUBSET B just in case A SUBSET B and for some x in B, x !in A. {me, you} PROPER-SUBSET {me, you, "a"} {me, you} !PROPER-SUBSET {me, you} Set union, intersection, and difference are defined as follows: For all x: x in A u B just in case x in A or x in B. For all x: x in A n B just in case x in A and x in B. For all x: x in A -- B just in case x in A and x !in B. "a" in {"a"} u {"b"}. "c" !in {"a" u {"b"}. "a" in {"a", me} n {x | x is a letter}. me !in {"a", me} n {x | x is a letter}. me in {me, you} -- {you}. me !in {me, you} -- {me}. me !in {me, you} -- {me, {}}. The following can be proved: For all sets A: {} SUBSET A; and if A SUBSET {} then A = {}. For all A, B, C: A u (B u C) = (A u B) u C; and A n (B n C) = (A n B) n C; and A u (B n C) = (A u B) n (A u C); and A n (B u C) = (A n B) u (A n C). For all A, A u {} = A and A n {} = {}. For all sets A, B: A SUBSET A u B; and A n B SUBSET A; and A u B = B u A; and A n B = B n A; and A -- {} = A; and if A SUBSET B then A -- B = {}; and A SUBSET (A -- B) u B; and A = (A -- B) u (A n B). For all sets x: 2**x = {y | y SUBSET x}. 2**{me, you} = { {}, {me}, {you}, {me, you}}. 2**{me} = { {}, {me} }. 2**{} = { {} }. 2**{me, you, {}} = { {}, {me}, {you}, {{}}, {me, you}, {me, {}}, {you, {}}, {me, you, {}} }. The following can be proved: For all sets A: {} in 2**A; A in 2**A; for all x: if x in A then {x} in 2**A. For all sets A, B: 2**A SUBSET 2**B just in case A SUBSET B. 2**A u 2**B SUBSET 2**(A u B). ALPHABETS An alphabet is a set of symbol types. {"0", "1"} is the input alphabet of most low-level machine models. EBCDIC was a smaller alphabet than ASCII. {"a", "b", "c", "d", "e"} contains part of our normal alphabet. An alphabet that contains "a" and "aa" as atomic symbols will make unambiguous composition of symbols difficult. An alphabet that contains the symbol "", assuming "" successfully refers to a symbol, is an alphabet asking for trouble. {} is a useless alphabet. The keyboard alphabet is not capable of symbolizing mathematics without the adoption of creative conventions. STRINGS Some symbol systems allow composition of simple or atomic symbols into complex or composite symbols. Composition can be syntactic: merely that subparts of the symbol are distinguishable; or semantic: the meaning of the composite can be determined from the meanings of the parts composed. A spoken assertion followed by the the saying of "not" is a spoken symbolization of a denial. A gesture followed by a pointing to oneself gestures attribution of a property to oneself. A double-mouse click has different semantics from a single click. Two flags of nations flown side-by-side are a composite symbol. A pictorial symbol with an "X" drawn through it is a prohibition in those contexts where the pictorial symbol alone is a license. Western textual symbol systems compose by juxtaposing, left-to-right. "a" juxtaposed with "b" is "ab". "8" juxtaposed with "1" is "81". Hebrew, Chinese, and other older languages sometimes juxtapose right-to-left or top-to-bottom. Gestures and spoken symbols juxtapose subsequently in time. Symbols juxtaposed in this way form strings. "aaaa" is a string of four "a"'s. A string such as "81^9+" symbolizes a sequence of inputs to a reverse-polish-notation calculator. Strings also have types and tokens. We assume that reference to type or to token is determinable from context. The string "flat-head" is tokened twice in this sentence, because I won't let the sentence end until I write "flat-head" again. There is a null string. "" is the null string. Convention (symbols are strings): a single atomic symbol shall be considered a string of one symbol. In C, there is a difference between 'a' and "a". There is an append-right operation that extends strings by exactly one atomic symbol. "abc" # "z" = "abcz". "" # "a" = "a". "a" # "a" = "aa". A string formed by juxtaposing symbols from an alphabet is a string over that alphabet. The null string is a string over any alphabet. "01" is a string over {"0", "1"}. "" is a string over {"a", "e", "i", "o", "u"}. "" is a string over {}. It is not always possible to recover the sequence of atomic symbols once juxtaposed. The string "aaaba" over {"a","aa", "ab"} could represent several compositions. If a string can be composed from an alphabet by only one sequence of append-right operations, then it is lexically unambiguous. "aa" over {"a", "aa"} is lexically ambiguous. "x = a+++b" is lexically ambiguous in C. "a0a0a0" is not lexically ambiguous over {"a0", "0a"}. SEQUENCES, LISTS, AND STRINGS Mathematicians define 2-tuples as the result of applying a tuple-forming 2-operator (an operator on two arguments). 3-tuples are defined as a conventional way of writing 2-tuples the first element of which is a 2-tuple. Computer scientists conceive of lists which take internal structure more seriously and have inherent length. Our notation is overloaded: we use the same right-angle brackets for both kinds of structures and the same tilde for concatenation. We prefer to use lists instead of sequences in this course, simply to break from a convention that is unduly entrenched. We present the mathematicians' approach first for historical reasons. "< ..., ... >" is an operator that surrounds its two arguments (as opposed to operators that are written prefix, infix, or postfix). SEQUENCES: For all x, y, z, t, m, n: = if and only if x = z and y = t. != . <"a", "ba"> != <"ab", "a">. if x = "a" then <"a", "b"> = . convention (k-tuples): = <, z> and = <, z, t>, etc. This forms 3-tuples from some 2-tuples, but not from others. = <, me>. > cannot be written as a 3-tuple. Note that there are no 0-tuples, nor are there 1-tuples. Kuratowski defines to be {{x}, {x, y}}. Wiener defines to be {{{}, {x}}, {{y}}}. Cantor defines to be {{{}, x}, {{{}}, y}}. All of these definitions are superfluous once we have the axiom above. The Cartesian product, A X B, of two sets, A and B is { | x in A and y in B}. {me, you} X {me} = {, }. {me} X {me} = {}. {} X {me} = {}. {me, you, "a"} X {<"a", "b">, "c"} = { >, >, <"a", <"a", "b">>, , , <"a", "c"> } . Since <, c> is , 3-tuples are formed using iteration of this same Cartesian product: {me} X {you, me} X {me} = ({me} X {you, me}) X {me} = {, } X {me} = {< , me>, < , me>} = {, }. Set powers are iterated Cartesian products of a set with itself. For all sets A: A**1 = A; for all n in |N -- {0}: A**(n+1) = A**n X A = { | x in A**n and y in A}. {me, you}**3 = {, , , , , , , } {{}}**3 = {<{}, {}, {}>}. STRINGS AS SEQUENCES: Clearly, strings cannot be k-tuples of symbols because there is a null string but no 0-tuple: "" is a string but <> is not a tuple; string concatenation is associative but pairing is not: "ab" concatenated with "c" equals "a" concatenated with "bc"; <<"a", "b">, "c"> != <"a", <"b", "c">>; strings have inherent length: "abc" is one symbol longer than "ab"; <"a", "b", "c"> and <"a", "b"> could both be written as 2-tuples; One approach is to define concatenation over an alphabet, Q, and adopt axioms for the null string and a definition of string length: For all x, y, z: x ~ "" = x; "" ~ x = x; "" has length 0; if x in Q then "" ~ x = x and has length 1; if y in Q and x has length n then x ~ y = and has length n+1; convention (juxtaposition): we write "a" ~ "b" as "ab", including both symbols under the double-quotes; if none of x, y, and z is "", then x ~ = (x ~ y) ~ z; "" ~ "a" = "a" and has length 1. "abc" ~ "d" = <"abc", "d"> and has length 1 greater than the length of "abc". <"a", "b"> ~ <"d", "e"> = "ab" ~ "de" = (<"a", "b"> ~ "d") ~ "e" = <<"a", "b">, "d"> ~ "e" = <<<"a", "b">, "d">, "e"> = <<"ab", "d">, "e"> = <""abd", "e"> = "abde" and has length 4. Note that x & y defined above becomes a special case of x ~ y. For lists, define the operators "&" (append) and "~" (concat), and use the angle-bracket notation to display the lists via a different convention. LISTS: <> is a list of length 0, called the null list. For all x, x & <> = is a list of length 1. & is like cons in LISP. <> & <> = <<>>. me & <> = . & <> = <>. For all x, and all lists y of length m: x & y is a list of length m+1. For all x and y: if x and y are lists with different lengths, then x != y <> & != . For all x, y, z: x & (y & z) != y & z. For all lists y, t and all x, z: x & y = z & t just in case x = z and y = t. <> & != & <>. me & = me & (you & <>). convention (list notation): = x & = x & (y & <>), and = x & = x & (y & ) = x & (y & (z & <>)), etc. me & <> = . me & (me & <>) = . you & (me & (you & <>)) = . you & ((me & <>) & (you & <>))))) = , you>. <> ~ <> = <>. For all lists x: <> ~ x = x, and x ~ <> = x. <> ~ = <> ~ = ~ <> = . For all x, and all lists y: ~ y = x & y; i.e., (x & <>) ~ y = x & y. ~ = me & = . <> ~ = & = <, you, me>. ~ <> = . Note that there are now two ways to enumerate lists; 1. <> is a list of length 0; for all x and all lists y of length m, x & y is a list of length m+1. 2. <> is a list; for all x, x & <> is a list of length 1; for all lists x of length m and all lists y of length n, x ~ y is a list of length m+n. For all a, and all lists s and t: (a & s) ~ t = (a & <>) ~ (s ~ t) = a & (s ~ t), because of the above rule that (x & <>) ~ y = x & y. (me & ) ~ <"u"> = (me & <>) ~ ( ~ <"u">) = ~ (you & <"u">) = ~ = me & = . Thm. For all lists x, y, z: x ~ (y ~ z) = (x ~ y) ~ z (proof requires induction). For lists, a product analogous to Cartesian product must respect length: For all sets A, B: X_0() = <>; X_1(A) = { | x in A}; X_2(A, B) = { | x in A and y in B}. A general definition is not easily given for list-forming products. A definition for list-forming product-powers of a single set is easy: For all sets A: A**0 = {<>}; for all n in |N - {0}: A**(n+1) = {x ~ | x in A**n and y in A}. STRINGS AS LISTS: flat(<>) = <>. For all x: if x is not a list then flat(x) = . For all x: flat(x & <>) = flat(x) ~ <> = flat(x). For all x, y: flat(x & y) = flat(x) ~ flat(y). For all x: if x is a list and x = flat(x) then x is a flat list. flat(me) = . flat() = flat(me & <>) = flat(me) ~ <> = ~ <> = . flat(<"a", "b", "c">) = flat("a" & ("b" & ("c" & <>))) = flat("a") ~ flat("b" & ("c" & <>)) = <"a"> ~ (flat("b") ~ flat("c" & <>)) = <"a"> ~ (<"b"> ~ (flat("c") ~ <>)) = <"a"> ~ (<"b"> ~ (<"c"> ~ <>)) = <"a", "b", "c">. flat(>) = flat(me & <>) = flat(me) ~ flat(<>) = ~ (flat() ~ <>) = ~ ( ~ <>) = . Thm. For all lists x: flat() = flat(x). Thm. For all lists x: flat(flat(x)) = flat(x). Strings are flat lists of symbols. We could have said that strings are sequences of symbols, but chose not to do so. For all x, y: x is a prefix of y just in case there is some z such that x ~ z = y. For all x, y: x is a proper prefix of y just in case there is some z such that z != <> and x ~ z = y. "a" is a prefix of "aab". "" is a prefix of "aaa". "aaa" is a prefix of "aaa" but not a proper prefix. Juxtaposition of symbols appears to be concatenation of 1-symbol strings. This appearance is not real since lexical ambiguity is not a sensible property for lists of symbols, but it is for strings. Also, the apparent equivalence forces an interpretation of symbols as distinct from 1-symbol strings, since <"a"> != "a". The programming language, C, enforces such a distinction by using two kinds of quotes: one (single-quote) which refers to the symbol and one (double quote) which refers to the 1-symbol string. We simply apologize for not earlier enforcing a distinction between symbols and 1-symbol strings. And we take strings to be a compressed representation of lists of symbols, which may introduce lexical ambiguity. Note that lists are defined by a pre-pending operator (&), while strings were defined by an appending operation (juxtaposition). It is an artifice of computer science that pre-pending is chosen over appending: strings are stored in pointer-based data structures front-to-back, instead of back-to-front, so that pre-pending is a faster operation than appending. QUASI-QUOTING Sometimes we want to refer to a string built up from concatenating variables and constants. x = "abc" "(" ~ x ~ ")" has five symbols in it. It is often cumbersome to write the infix operator symbol, so juxtaposition often stands for concatenation of strings. "("x")" has more symbols than x. In those situations where quoting is also bothersome, quasi-quotes are used: '`(x)'` has more than three symbols. In these cases, it is assumed that the symbols denoting variables and the symbols denoting string constants can be distinguished. The usual symbols for quasiquoting are matching left and right square quotes. INDUCTIVE DEFINITIONS Strings, sequences, and lists are defined recursively (inductively). <> is a list of length 0. For all x: x & <> is a list of length 1. For all x, y, and n: if y is a list of length n, then x & y is a list of length n+1. nothing else is a list. A set is defined inductively by giving three parts: 1. the base; 2. the recursion; 3. the minimization. At present, the language of the base and recursion is not restricted. The base need not be restricted to a single element. For all x in the Tribe of Zebulun: x is a Child of Israel. For all x in the Tribe of Gad: x is a Child of Israel. For all x in the Tribe of Ephrxim: x is a Child of Israel. For all x: if x is a Child of Israel, then child(x) is a Child of Israel. The recursion need not imply a deterministic computation. S is the smallest set such that: "a" in S; "b" in S; for all x: if x in S then x~"a" in S and x~"b" in S. The minimization is required for the set to be well-defined. Well-defined means that there are not two distinct sets both satisfying the base and recursion. "a" is an a-string. For all strings x: if x is an a-string then x~"a" is an a-string. is "b" an a-string? Clearly {"a", "aa", "aaa", ...} satisfies the requirements for the set of a-strings. So does {"a", "b", "aa", "ba", "aaa", "baa", ...}. Although an algorithm that generates a-strings in order would never produce "b", a-strings are not defined in terms of an algorithm. The set of a-strings is defined by requiring it to satisfy the base and inductive constraints. It is not well-defined, since the constraints do not uniquely determine the set. Minimization of a set satisfying P1 and P2 is equivalent to the following statement: S satisfies P1 and P2, and for all sets T that satisfy P1 and P2, S SUBSET T. Maximization does not always result in well-defined sets for the same reason that set abstraction sometimes fails. Let S be the largest set such that: 0 !in S; for all x: if x !in S then succ(x) !in S. This is equivalent to S = {x | x !in |N}. Let S be the largest subset of |N such that: 5 !in S; for all x: if x !in S then x+5 !in S. This is equivalent to |N -- {5x + 5 | x in |N}. NATURAL NUMBERS |N, the set of natural numbers, is defined for 0 a given object, and succ a given function subject to the following constraint: there is no x such that 0 = succ(x). 1. 0 in |N; 2. for all x: if x in |N then succ(x) in |N; 3. nothing else is in |N Let 0 = <> and succ(x) = me & x. Define for all x, y in |N: x < y just in case x is a proper prefix of y. Then 0 = <>; 1 = ; 2 = ; 3 = ; etc. Alternatively, Let 0 = {} and succ(x) = 2**x. Define for all x, y in |N: x < y just in case x is a proper subset of y. Then 0 = {}; 1 = {{}}; 2 = { {}, {{}} }; etc. |N+ = |N -- {0}. Alternatively, 1. succ(0) in |N+; 2. for all x: if x in |N+ then succ(x) in |N+; 3. nothing else is in |N+. Define also: 1. 0 in EVENS; 2. for all x: if x in EVENS then succ(succ(x)) in EVENS; 3. nothing else is in EVENS; and 1. 1 in ODDS; 2. for all x: if x in ODDS then succ(succ(x)) in ODDS; 3. nothing else is in ODDS. We also write x' for succ(x), and x'' for succ(succ(x)). So 3 is 0''' and 5 is 0'''''. All members of |N are finite numbers. Any number not a member of |N is an infinite number. pred(1) = 0; for all x: pred(succ(x)) = x. Note that pred is not defined on all members of |N, since pred(0) is not defined. RECURSIVE FUNCTIONS Functions exist in English. The eldest brother of The Chief Justice is also a racist. The function, "the eldest brother of, operates on its argument, The Chief Justice. The eldest brother of the eldest brother of The Chief Justice is also a racist. The number of apples is large, whereas the apples are numerous; "number of" is a function that produces singulars from plurals. A function is defined by giving base and recursion parts. It is an effective definition if there is no way of determining more than one value of the function for a given set of arguments. 1. f("a") = 1. 2. For all x, n: if f(x) = n then f(x~"a") = n+1. This is the length function. Note that no minimization is needed. The function simply is not defined for strings that contain non-"a"'s. There is no way to determine two values of f for a given argument. By not minimizing, the definition allows multiple ways f can be extended to strings that contain non-"a"'s. The following definition is ineffective: 1a. f("a") = 1. 1b. f("aa") = 1. 2. For all x,y: if f(x) = n then f(x~y) = f(x)+f(y). Since "aa" is "a"~"a", two values are specified for "aa" under f, which is not allowed for functions. Functions of two arguments, or more, can also be defined recursively: 1. For all x: x + 0 = x. 2. For all x, y, z: if y = succ(z) then x + y = succ(x) + z. Consider the function that indicates whether the first argument evenly divides the second argument. 1a. For all x: divides(1, x) = 1; 1b. For all x: divides(x, x) = 1; 2. For all x, y: if divides(x, y) then divides(x, y+x); 3. all other values of divides(x, y) are 0. This A function is k-ary just in case it takes k arguments. A lambda-expression declares the arguments of the function and the value of the function expressed in terms of its arguments. At present, the language that expresses the value of the function is not restricted. "+" is a 2-ary function, lambda(x, y)(x + y). "eldest-brother-of" is a 1-ary function. {} is a 0-ary function. Note that a k-ary function is not quite the same as a 1-ary function whose argument must be a list of length k. manhattan() = x. For all x, y, z: if y = succ(z) then manhattan() = manhattan(succ(x), z). "manhattan" is analogous to "+", but not identical. A large subset of inductively defined functions can be identified rigorously. The set of primitive recursive functions is defined recursively: 1a. The 0-ary function "zero" is a primitive recursive function; i.e., zero_0() = lambda(0) = 0. The 1-ary function zero_1(x) = 0 is primitive recursive. There is a k-ary function for all k in |N, zero_k, which simply returns the value zero. We refer to all of these functions simply as "zero". 1b. For all k >= 1 and all i, 1 <= i <= k: the i-from-k selection function is a primitive recursive function; i.e., sel(i,k)(a_1, a_2, a_3, ..., a_k) = a_i. sel(2,4)(me, you, , ) = you. 1c. The successor function succ(x) is a primitive recursive function; i.e., succ(0) = 1. succ(4) = 5. 2a. For all g, l, k: if g is an l-ary primitive recursive function then for all h_1, h_2, ..., h_l, if each of which is a k-ary primitive recursive function then lambda(x_1, ..., x_k) ( g( h_1(x_1, ..., x_k), h_2(x_1, ..., x_k), ... h_l(x_1, ..., x_k), ) ) is a primitive recursive function (the generalized composition of g and h_1 through h_l. for l=2, k=3, this is: lambda(x_1, x_2, x_3) ( g( h_1(x_1, x_2, x_3), h_2(x_1, x_2, x_3) ) ) is a primitive recursive function. Here, g is 2-ary, h_1 is 3-ary, and h_2 is 3-ary. Let l=1, k=5: h_1 = sel(1, 5); g = lambda(x)(succ(x)); then for all x, y, z, u, v: lambda(x, y, z, u, v)( succ( sel(1, 5)(x, y, z, u, v) ) ) = lambda(x, y, z, u, v)( succ(x) ). 2b. For all k, g, h: if g is a k-ary function, and h is a (k+2)-ary function, both primitive recursive: then f, defined as follows, is primitive recursive: For all x_1, ..., x_k: f(x_1, ..., x_k, 0) = g(x_1, ..., x_k). i.e., g defines the base case for f. For all x_1, ..., x_k, and all m: f(x_1, ..., x_k, m+1) = h( x_1, ..., x_k, m, f(x_1, ..., x_n, m) ). i.e, h determines the next f-value from the previous f-value and its place in the sequence. Let k=1. For all x: f(x, 0) = sel(1, 1)(x) = x. For all x, m: f(x, m+1) = succ(sel(3, 3)(x, m, f(x, m))). = succ( f(x, m) ). f(0, 0) = 0; f(1, 0) = 1; f(0, 1) = succ( f(0, 0) ) = succ(0) = 1; f(1, 1) = succ( f(1, 0) ) = succ(1) = 2; f(1, 2) = succ( f(1, 1) ) = succ(2) = 3. f(2, 0) = 2; f(2, 1) = succ( f(2, 0) ) = succ(2) = 3; f(2, 2) = succ( f(2, 1) ) = succ(3) = 4; These functions try our patience (they appear to be mathturbation); it would be a poor choice of functional programming language. However, they define rigorously a subset of the inductively defined functions. This subset is surprisingly large (equivalent to all functions computable on machines that halt on all inputs). Writing recursive function definitions is a bit like recursive programming. This connection allows relationships to be proved between computability and inductive definitions: to characterize exactly when an inductive definition is a specification of a set or function that can be computed (i.e., realized as a computer program). The set of MU-recursive functions is a superset of the set of primitive recursive functions. MU-recursive functions include functions formed from primitive recursive functions using the following rule: if there is a y such that: g(x_1, ..., x_k, y) = 0, then (mu y)(g(x_1, ..., x_k, y) = 0) is the least y such that: g(x_1, ..., x_k, y) = 0. This can be used, for example, to complete the definition of the function that indicates whether x divides y ("divides" is the original definition which is not defined except when the value is 1, while "Divides" is the new function, which can return the value 0 when appropriate): 1. for all x, y: g(x, y, 0) = 0. 2. for all x, y: g(x, y, 1) = 1 - divides(x, y). 3. Divides(x, y) = 1 - (mu z) (g(x, y, z) = 0). INDUCTIVE PROOFS Every member of an inductively defined set can be shown to have some property via an inductive argument. "a" is no shorter than "b". For all x: if x is no shorter than "b" then x~"a" is no shorter than "b". Thus, every member of the set of 1-long or longer strings of "a"'s is no shorter than "b". Intuitively, an inductive proof is an algorithm for computing a proof that an object has the property in question, given a proof that the object belongs to the set. Let S be the smallest set such that: "a" in S; for all x: if x in S then x~"a" in S. "aaa" is a member of S because "aa" is a member of S, and "aaa"="aa"~"a"; "aa" is a member of S because "a" is a member of S, and "aa"="a"~"a"; "a" is no shorter than b. if "a" is no shorter than b, then "a"~"a" is no shorter than b. if "aa" is no shorter than b, then "aa"~"a" is no shorter than b. The base and recursive steps of an inductive proof need not match the base and recursive steps of the set's definition; the base and recursive step of the proof should generate a set that is at least as large as the set for members of which the property is being proved. Odds is the smallest set such that: 1 in Odds; for all x: if x in Odds then x+2 in Odds. Prove that each member of Odds is the successor of some member of |N. Base: 1 = succ(0); so there is some y s.t. 1 = succ(y). Recursion: For all x: if there is some y s.t. x = succ(y) then there is some z s.t. x+1 = succ(z). (Technically, this requires proof: for some y, if x = succ(y), then x+1 = succ(succ(y)), so there is some z, namely succ(y), such that x+1 = succ(z)). Sufficiency: This base and recursion suffice for all members of Odds, since Odds is a subset of the smallest set, S, such that: 1 in S; if x in S then x+1 in S. If there are two ways of inductively defining the same set, the proof need not reflect the set's definition at all, though the sufficiency claim may not be obvious. Axes is the smallest set such that: <0, 1> in Axes; for all x: if <0, x> in Axes then <0, x+1> in Axes; for all x, y: if in Axes than in Axes. Show that each member of Axes has a non-zero value under the manhattan function. Base: manhattan(<1, 0>) = 1 != 0. manhattan(<0, 1>) = 1 != 0. Recursion: For all x: if manhattan() != 0 then manhattan() != 0. For all x: if manhattan(<0, x>) != 0 then manhattan(<0, x+1>) != 0. (Technically, this requires proof: manhattan() = succ(manhattan()) != 0, since there is no z s.t. 0 = succ(z); similarly for manhattan(<0, x+1>)). Sufficiency: Since Axes is contained in (actually equivalent to) the smallest set, S, such that: <1, 0> in S; <0, 1> in S; for all x: if in S then in S; for all x: if <0, x> in S then <0, x+1> in S; this suffices to prove that every member of Axes has a non-zero manhattan-value. LANGUAGES A language is a set of strings. {"a", "aa", "aaa"} is a language. {"a", "b", "aa", "bb", ...} is a language. {} is a language. Infinite languages are expressed inductively, or as the result of applying inductively-defined functions to finite sets with explicit membership. The following operations on languages are useful for expressing large languages. Set-Concatenation: For all sets A, B: A ~~ B = { x~y | x in A and y in B}. {"a", "b"} ~~ {"b", "c"} = {"ab", "ac", "bb", "bc"}. {"aaa"} ~~ {} = {}. {"aaa"} ~~ {"", "b"} = {"aaa", "aaab"}. Set-Powers: For all sets A: 1. A**0 = {""}; 2. for all n in |N: A**(n+1) = A ~~ A**n. {"a", "b"}**0 = {""}. {"a", "b"}**1 = {"a", "b"}. {"a", "b"}**2 = {"aa", "ab", "ba", "bb"}. {"a", "b"}**3 = {"aaa", "aab", "aba", "abb", "baa", "bab", "bba", "bbb"}. Note that set-powers are written by juxtaposing a "**" and a natural number onto the set's name. This is also the notation for iterated Cartesian products of a set with itself. The notation is overloaded. Kleene-closures: For all sets A: A+ is the smallest set such that: for all x in |N, if x != 0, then A**x SUBSET A+; A* is the smallest set such that: for all x in |N, A**x SUBSET A+; {"a"}+ = {"a", "aa", "aaa", "aaaa", ...}. {"a"}* = {"", "a", "aa", "aaa", "aaaa", ...}. {"a", "bb"}+ = {"a", "bb", "abb", "bba", "aa", "bbbb", "aaa", ...}. Let A = {"a", "b"}; let B = {"0", "1"}; (A ~~ B)* = {"", "a0", "a1", "b0", "b1", "a0a0", "a0a1", ...}; (A u B)* = {"", "a", "b", "0", "1", "aa", "ab", "a0", a1", ...}. For all A: A* is the set of all strings over A. The definition of Kleene closure is inductive because it refers to an inductively defined set, |N. Alternatively, A+ is the smallest set such that: A**1 SUBSET A+; for all n: if A**n SUBSET A+ then A**succ(n) SUBSET A+. AUTOMATA AS SPECIFYING LANGUAGES A machine (or an automaton), M, takes symbolic input from an alphabet SIGMA, and make transitions from state-to-state according to some regimen DELTA, starting in the start state, q_0, and arriving, perhaps, at a terminal state. If the terminal state belongs to the set ACC, the accepting subset of the set of states, Q, then M accepts the input. The sequential input can be represented as a list of symbols over SIGMA, hence, as a string over SIGMA, assuming no lexical ambiguity. The machine DOUBLE-CHECK checks for pairs of numbers such that the second number is twice the first. Input strings over the alphabet {"a", ","} are interpreted as pairings of numbers. "aaaa,aaaaaaaa" is interpreted as a check whether 8 is the doubling of 4. The machine starts in some state q_0 and moves to successively higher states, two at a time, as if counting by two: q_0 to q_2, q_2 to q$, and so forth. Once a "," is found in the input, it reverses its count, moving from state-to-state as if decrementing by one, for each "a" that is input. If the computation exhausts the input when in state q_0 again, then DOUBLE-CHECK accepts the input. Note that if a machine accepts the language {"a", "aa", "aaa"}, we do not also say that it accepts the language {"a", "aa"}. Clearly, a machine that accepts the language {"a", "aa", "aaa"} also accepts any string in the set {"a", "aa"}, which is a language. But it is misleading to say that the machine accepts the language {"a", "aa"} when this language is a proper subset of the largest set of strings that it accepts. To say a machine accepts a particular language is to say that that language fully characterizes the machine. A machine accepts a string just in case the list of symbols represented by the string initiates a computation that terminates in an accepting state. Not all computations are deterministic. Consider the smallest set, W, such that: "" in W; for all x: if x in W then x~"a" in W; for all x: if x in W then x~"b" in W. There is clearly a machine M which always terminates in an accepting state if its input string is in W. It starts in q_0, which is the only accepting state, and moves from q_0 to q_0 (i.e., it stays in q_0) when an "a" or a "b" is read. On any other symbol, it moves to q_1, which is not an accepting state. It stays in q_1 on any input. There is also the following non-deterministic machine: It starts in q_0 and guesses whether the next symbol will be an "a" or a "b". That is, it moves for no reason and with no regularity into either q_a or q_b. From q_a, if the next symbol is an "a", it returns to q_0; otherwise, it moves to q_1, where it stays. From q_b, it moves to q_0 on a "b", and to q_1 otherwise. Then the process repeats if the input is not consumed. For every string in W, this machine also has at least one accepting computation. In this case, non-determinism is more complex, but in other examples, non-determinism simplifies. If the machine is non-deterministic, it could initiate several computations. In that case, a machine accepts a string just in case there is one or more ov the several computations that the list of symbols represented by the string could have initiated, that would have terminated in an accepting state. Not all computations terminate. Imagine a machine with internal storage on which it writes a string over the same alphabet as the input alphabet, {"a"}. It checks for members of the language, L, the smallest set such that: "a" in L; for all x: if x in L then x~"aa" in L. Clearly there is a machine that accepts L and terminates on all inputs. The following machine accepts L, but terminates only on inputs in L. The machine puts "a" in internal storage. It runs a subroutine that checks whether the input is equivalent to the internally stored string. If so, it terminates. Otherwise, it concatenates "aa" onto the contents of internal storage, resets the input, and repeats the check again. This machine does not terminate on the input "aa". L(M) is the language accepted by M. PICTORIALLY SPECIFIED FINITE STATE AUTOMATA A finite state automaton is a machine which receives its input in order (it cannot reset or back up), has a finite number of states, and has no additional storage. Input strings are over an alphabet SIGMA. q_0 is the start state, and ACC is a (possibly empty) set of accepting states. Finite state automata (FSA's, or finite state machines, FSM's), are represented pictorially. States are circled names, transitions are labeled arrows, the initial state is always q_0, and accepting states are indicated by double-circling. As a convention, we do not draw the state from which there is no escape, and any symbol encountered in the input at any state for which there is no labeled transition is understood to make a transition into this undrawn trapping state. Also as a convention, labels on transitions are not quoted. ((q_0))---a--->(q_1)---b--->((q_2)) | ^ | | +----------b----------------+ This machine accepts the language {"", "b", "ab"}. ((q_0))---a--->(q_1)---b--->((q_2)) ^ | | | +----------b----------------+ This machine accepts the language {"", "ab", "abb", "abab", ...}, which is more conspicuously written in terms of operations on languages: {"ab", "abb"}*. (q_0)---a--->(q_1) This machine accepts the language {}. {} is a language, but machines that accept {} are sometimes (improperly) said to accept no language (in fact, they accept no strings). ((q_0)) This machine accepts the language {""}. ((q_0))---a--->((q_1))---b---+ ^ | | | +----------+ This machine accepts the language {""} u ({"a"} ~~ {"b"}*), or {""} u {"a"}~~{"b"}*, assuming set-concatenation takes precedence over set-union. (q_0)---0--->(q_2) | | | | 2 2 | | | | v v (q_1)---0--->((q_3)) This machine accepts the language {"02", "20"}. A non-deterministic finite state automaton can have two transitions out of a state with the same label. It can also have transitions with the label "@", which indicates that no symbol is required to be read from input in order to make the transition. Such a transition may or may not occur each time it is a possible transition. Several @-transitions (also called epsilon transitions can be made in series). A computation that consumes its input in a state that can still make @-transitions may not be a terminated computation; the @-transitions may still be made. (q_0)---a--->(q_1)---b--->(q_2)---b--->((q_3)) ^ | | | | | | b | | | | | v | | ((q_4)) | | | | | +-----------@---------------------------+ This machine accepts {"abb"}* ~~ ({""} u {"ab"}). GRAMMARS A grammar is a system of rules over a terminal and non-terminal alphabet, with a distinguished non-terminal symbol, S, the start symbol. Rules have left-sides and right-sides, where each side is a string over the union of the two alphabets. The following is a grammar: S --> AB A --> a B --> b B --> bb Grammars are displayed diagrammatically. Grammars transform the start symbol into strings, according to the rules, and subsequently transform parts of the resulting string into further strings. If a string over the terminal alphabet can be generated in this way, the string is derived by the grammar. The above grammar derives "ab". It also derives "abb". The set of all strings that can be derived by the grammar is the language generated by the grammar. The language of the above grammar is {"ab", "abb"}. Convention: rules with the same left-sides can be compacted. S --> AB A --> a B --> b | bb is the same grammar as above. Convention: when written diagramatically, symbols and strings from either alphabet are not quoted. Here is a fragment of the grammar for the commands that can be given to my telephone: Com --> Mode|Call Mode --> *1|*2|*3|*4|*5DDDD|*7|*8|*9 Call --> Internal|Local|Toll|Long|International Internal --> DDDD|0 D --> 0|1|2|3|4|5|6|7|8|9 Local --> 9DDDDDDD|91411|9911|90 Toll --> 91DDDDDDD Long --> 91AreaDDDDDDD International --> 9011CountryCityLocal @ can occur in a rule, which stands for the null string. . S --> A | B A --> a | Aa B --> @ | Bb This grammar generates the language {"a"}+ u {"b"}*. Convention: In some cases, the alphabet of non-terminals is distinguished from the alphabet of terminals by enclosing characters within angle-brackets. Note that in this convention, "<@>" denotes the epsilon production, not simply "@", which is the single character. --> the | a --> | | --> , , could be a fragment of a grammar for a subset of English. Convention: All real grammars adopt shorthands. A common extension is to allow a string of symbols to be put in braces, or raised to the Kleene power * or +. They often use square-brackets to enclose optional parts of a rule. It should be easy to reduce any occurrence of '`{s}'` where s is a string over the terminal and non-terminal symbols with a new symbol "N" and the new production: N --> s | Ns or N --> s | Ns | @ as appropriate. Similarly, '`[s]'` can be replaced with the new symbol "N" and the production: N --> @ | s ; (note that this discussion would be better if there were some way of quasi-quoting diagrams that represent rules). Here is a fragment of the grammar for the programming language Ada; note that space must be treated very carefully (from Preliminary Ada Reference Manual, SIGPLAN Notices v. 14, #6, 1979): --> {[_]} --> | --> | --> | --> | --> {[_]} --> #{[_]} --> --> | --> .[E] --> [+]integer|-integer --> "{character}" Here is a fragment of the grammar for the programming language Scheme; note the difficulty in using "*", "<", and ">" (this is from J. Rees and W. Clinger, eds., Revised3 Report on the Algorithmic Language Scheme, SIGPLAN Notices v. 21, #12, December 1986): --> ||||| (|)|#(|'|`|,|,@|. --> |(|)|"|; --> | --> *| --> | --> a|b|c|d|e|f|g|h|i|j|k|l|m|n|o|p|q|r|s|t|u|v|w|x|y|z --> !|$|%|&|*|/|:|<|=|>|?|~|_^ --> || --> 0|1|2|3|4|5|6|7|8|9|0 --> .|+|- --> +|- --> |else|=>|define|unquote|unquote-splicing --> quote|lambda|if|set!|begin|cond|and| or|case|let|let*|letrec|do|delay|quasiquote Here is a fragment of the grammar for the programming language C; note that an optional argument is specified by subscripting with "opt"; brackets cannot be used because the bracket characters are needed at face value (from B. Kernighan and D. Ritchie, The C Programming Language, Prentice Hall, 1978): --> *|&|-| !|~|++|--|++| --|() |sizeof | sizeof () --> *| /|% --> |()|*| ()|[] Here is a fragment of the grammar for the programming language ICON (from C. Coutant, R. Griswold, and S. Wampler, Reference Manual for the Icon Programming Language Version 5, University of Arizona TR81-4a, 1981): --> |||| ||||| --> if then [ else ] --> while [ do ] --> until [ do ] --> every [ do ] --> repeat --> case of {;[]} --> :|default: --> not --> to [by --> next --> break[ expr] Here is a fragment of the grammar for the programming language PASCAL: --> FUNCTION: --> PROCEDURE --> <@>|()|()|() --> |; --> VAR : --> , | Here is a fragment of the grammar for the command language of the editor VI; note the problem of referring to the character "|" with "<|>" in the rule about movement: --> | --> :|a|i||d| c|j|o|I|A| O|r|s|R|dd|u|x p|P|Y|y|.|~| --> ||0|[[|]] --> $|<|>||G|| |(|)|}|{|-|+|k|j|h|l|''|n|N|w|W|e|b|, --> ||ctrl-L> Natural language programs are based on grammars for subsets of English. Almost all compilers, interpreters, and even language-specific editors are based on grammars. A parse tree is a diagram that depicts the dependence of rules used in the derivation of a string from "S". S --> ABC | ACB | BAC | BCA | CBA | CAB A --> Aa | a B --> Bb | b C --> Cc | c | cC One parse tree for the string "cccabb" is: S +---------------------+---------------------+ | | | | | | v v v C A B +--------+ | +-------+ | | | | | | | | | | v v v v v C c a B b +----+ | | | | | | v v v b C c | | v c Another parse tree for the same string is: S +---------------------+---------------------+ | | | | | | v v v C A B +--------+ | +-------+ | | | | | | | | | | v v v v v C c a B b +----+ | | | | | | v v v b c C | | v c There are two other parse trees of this same string, for this grammar. The grammar: S --> ABA | BB A --> Aa | @ B --> b derives "aaab" with the following parse tree: S +-------------------+-------+ | | | | | | v v v A B A +-----------+ | | | | | | v v v v A a b @ +------+ | | | | v v A a +---+ | | | | v v A a | | v @ GRAMMARS FORMALIZED Formally, a grammar is any 4-tuple such that: (let SIGMA = SIGMA_N u SIGMA_T): 1. SIGMA_N n SIGMA_T = {}; 2. "@" !in SIGMA; 3. "S" in SIGMA_N; 4. DELTA SUBSET SIGMA*~~SIGMA_N~~SIGMA* X (SIGMA_N u SIGMA_T u {"@"})+. The grammar S --> A | AA | @ A --> a is the 4-tuple < {"S", "A"}, {"a"}, "S", {<"S", "A"}, <"S", "AA">, <"S", "@">, <"A", "a">} > . A string s is directly derivable from a string t in grammar just in case there is some in DELTA and some a, b, c, d such that s = a ~ u ~ b and (t = a ~ v ~ b or (t = a~b and v = "@")). "AA" is directly derivable from "S" in the above grammar. "MooMooa" is directly derivable from "MooMooA" in the above grammar, even though there is no way to obtain "MooMoo". "aa" is not directly derivable from "S" in the above grammar, even though "AA" is directly derivable from "S", and "Aa" is directly derivable from "AA", and "aa" is directly derivable from "Aa". A string s is derivable from a string t in grammar just in case 1. s is directly derivable from t in the same grammar; or 2. there is some u such that s is directly derivable from u, and u is derivable from t, in the same grammar. "aaa" is derivable from "AAA" in the above grammar, even though there is no way to derive "AAA" from "S". A string s is derived by grammar just in case 1. s is derivable from "S" in the same grammar, and 2. s in ST*. Any string in the language {"", "a", "aa"} is derived by the above grammar. This completes the formal definition of grammars. This formal definition of grammars induces a formal definition of languages acceptable by grammars. This formal definition has been given in a language that is nearly rigorous enough to be defined by a grammar. REGULAR EXPRESSIONS The language of regular expressions over the alphabet {"0", "1"}, which we will call "RE(01)", is defined recursively. Regular expressions over a large subset of the ASCII alphabet are used throughout UNIX programs and are the basis of the program grep. Regular expressions are yet another way of specifying languages; they are less powerful than grammars; they are well-suited to the languages accepted by finite state automata. RE(01) has the alphabet { "0", "1", "+", "*", "(", ")", "++", "e", "%"}. "0" is a (well-formed) string in RE(01). "1" is a (well-formed) string in RE(01). "e" is a (well-formed) string in RE(01). "%" is a (well-formed) string in RE(01). For all x in RE(01): '`(x)'` in RE(01); i.e., "("~x~")" in RE(01); and '`(x)+'` in RE(01); '`(x)*'` in RE(01). For all x in RE(01): for all y in RE(01): '`(x++y)'` in RE(01); and '`xy'` in RE(01); i.e. x~y in RE(01); Nothing else is a string in RE(01). A string in RE(01) is a regular expression (regexp) (over the alphabet {"0", "1"}). Parentheses may be omitted if they can be recovered using the operator precedence: "*", "+", concatenation, "++" (this rule is difficult to define formally, but too useful not to adopt!). The following strings are regexp's: "e" which omits no parentheses. "0++1" which is "(0++1)". "(00++10)+" which is "((00++10))+". "01100" which omits no parentheses. "(01++11)*++111" which is "(((01++11))*++111)" "%*" which is "(%)*" "(01++11)*++(111*++000)+" which is "(((01++11))*++(((111)*++000))+)" The denotation of a regexp is the set of strings to which it refers. Note that "01" as a string denotes the string "01". But "01" as a regular expression denotes the language {"01"}. This ambiguity is usually resolved by boldfacing regular expressions instead of quoting them. We could have used a different quoting device, such as \re 01\, but chose to treat them as strings. Thus, we have strings denoting sets of strings. It is the bane of computer science to have frequently to switch between many languages. The regexp "0" denotes the language {"0"}. The regexp "1" denotes the language {"1"}. The regexp "e" denotes the language {""}. The regexp "%" denotes the language {}. If the regexp x denotes the language S1, and the regexp y denotes the language S2, then the regexp '`xy'` denotes {v~w | v in S1 and w in S2}. Use heavy brackets to symbolize denotation of the bracketed expression. [[ "01" ]] = {"01"} = {"0"} ~~ {"1"}. [[ "001" ]] = [["00"]] ~~ {"1"} = {"0"} ~~ {"0"} ~~ {"1"} = {"001"}. [[ "(0++1)1" ]] = [["(0++1)"]] ~~ [["1"]] = {"0", "1"} ~~ {"1"} = ["01", "11"}. [[ "(0++1)*(11++00)+" ]] = [["(0++1)*"]] ~~ [["(11+00)+"]]. If the regexp x denotes the language S, then the regexp '`(x)+'` denotes the smallest set, T, such that: any element in S is an element in T, and for any s in S and any t in T, s~t is in T. [[ "11(00)+" ]] = [["11"]] ~~ [["(00)+"]] = {"11"} ~~ {"00"}+. If the regexp x denotes the language S, then the regexp '`(x)*'` denotes {v | v in the denotation of '`(x)+'` or v=""}. [[ "(00)*1" ]] = {"00"}* ~~ {"1"}. If the regexp x denotes the language S1, and the regexp y denotes the language S2, then the regexp '`(x++y)'` denotes {v | v in S1 or v in S2}. [[ "(000 ++ 1)" ]] = {v | v in [["000"]] or v in [["1"]]} = {"000", "1"}. The rules in summary are: 1a. [[ "0" ]] = {"0"}; 1b. [[ "1" ]] = {"1"}; 1c. [[ "e" ]] = {""}; 1d. [[ "%" ]] = {}; 2. [[ '`xy'` ]] = [[x]] ~~ [[y]]; 3. [[ '`(x++y)'` ]] = [[x]] u [[y]]; 4. if [[ x ]] = S then [[ '`(x)+'` ]] = T such that 4a. if s in S then s in T, 4b. if s in S and t in T, then s~t in T, and 4c. there is no smaller such T; 5. if [[ x ]] = S then [[ '`(x)*'` ]] = [[ '`(x)+'` ]] u {""}. A machine accepting a language that can be denoted by a regexp in RE(01) will be called a regular automaton. ((q_0))---a--->(q_1)---b--->((q_2)) | ^ | | +----------b----------------+ This machine accepts the language {"", "b", "ab"}, or "e++b++ab". ((q_0))---a--->(q_1)---b--->((q_2)) ^ | | | +----------b----------------+ This machine accepts the language {"", "ab", "abb", "abbab", ...}, or "(abb)*(e++ab)". (q_0)---a--->(q_1) This machine accepts the language {}, or "%". ((q_0)) This machine accepts the language {""}, or "e". ((q_0))---a--->((q_1))---b---+ ^ | | | +----------+ This machine accepts the language {""} u ({"a"} ~~ {"b"}*), or "e++ab*". (q_0)---0--->(q_2) | | | | 2 2 | | | | v v (q_1)---0--->((q_3)) This machine accepts the language {"02", "20"}, or "02++20". (q_0)---a--->(q_1)---b--->(q_2)---b--->((q_3)) ^ | | | | | | b | | | | | v | | ((q_4)) | | | | | +-----------@---------------------------+ This machine accepts {"abb"}* ~~ ({""} u {"ab"}), or "(abb)*(e++ab)". There are sets of strings over the alphabet {"0", "1"} that cannot be denoted by a regexp in RE(01). Let the reverse of a string be defined inductively: rev("") = "". for all s, t: rev(s~t) = rev(t)~rev(s). Consider the language {x~rev(x) | x in S*} for alphabets S with at least two elements. This language cannot be denoted by a regexp in RE(S). Regular expressions are not expressive enough to denote the set of regular expressions over any alphabet. The problem has to do with making sure parentheses are matched. There is however a grammar for regular expressions over an alphabet, such as {"0", "1"}. The start symbol and only non-terminal is "S", and the terminal symbols are { "0", "1", "+", "*", "(", ")", "++", "e"}. S --> 0 | 1 | e | % S --> (S)+ | (S)* | (S ++ S) | SS generates the language RE(01). There is also a grammar that derives all grammars. First, describe how a string represents a grammar. Let the string "S --> A | B; A --> Aa | a; B --> Bb| b" represent the grammar <{"S", "A", "B"}, {"a", "b"}, "S", {<"S", "A">, <"S", "B">, <"A", "Aa">, <"A", "a">, <"B", "Bb">, <"B", "b">} > or S --> A | B A --> Aa | a B --> Bb | b . That is, for given NON-TERMS and TERMS, '`p --> q; s'` in string representing a grammar includes the rule for each '`r |'` or '`| r |'` or '`| r'` a substring of q, or for r = q, if q does not contain "|". The complete system is , where s represents the additional rules. The grammar that generates grammars has the start symbol "1", the terminals TERMS u NON-TERMS u {"|", "-->", ";"}, and has the additional non-terminal symbols {"2"}; its rules include: <"1", "2">; <"1", "2; 1">; <"2", "3 --> 4">; <"3", s> for any s in (TERMS u NON-TERMS)*; <"4", s> for any s in (TERMS u NON-TERMS)*. ENTAILMENT AND DERIVATION Some regexp's in RE(01) are stronger than others in the sense that the language denoted by one regexp contains the language denoted by another. A regexp v entails a regexp w just in case x is in [[ v ]] whenever x is in [[ w ]]. The following reductions of a regexp result in a derived regexp (we call these diagrams or schemata; later, they will have formal structure; note that since they are displayed text, they can stand without quasi-quotation); each is given a suggestive name: a, b, and c are variables in the following schemata: R1. ++ -reduction (a ++ b) -------- a [["10++01"]] contains [["10"]]; i.e., {"10"} SUBSET {"10", "01"}. R2. * -reduction-to- + (a*) ---- (a+) [["01*"]] contains [["01+"]]; i.e., {"01", "011", "0111", ...} SUBSET {"", "01", "011", "0111", ...}. R3. * -reduction-to- e (a*) ---- e [["(0100)*"]] contains [["e"]]; i.e., {""} SUBSET {"", "0100", "01000100", "010001000100", ...}. R4. + -reduction (a+) ---- a [["(0100)*"]] contains [["0100"]]; i.e., {"0100"} SUBSET {"", "0100", "01000100", "010001000100", ...}. R5. + -introduction (a+) ---- a(a+) [["(01)+"]] contains [["01(01)+"]]; i.e., {"0101", "010101", ...} SUBSET {"01", "0101", "010101", ...}. E1. right absorption-into- * (a*)a ----- (a+) [["1*1"]] = [["1+"]] = {"1", "11", "111", ...}. E2. left absorption-into- * a(a*) ----- (a+) [["11*"]] = [["1+"]] = [["1*1"]]. E3. self-association-with- + (a+)a ----- a(a+) [["(1++01)+(1++01)"]] = [["(1++01)(1++01)+"]] = {"11", "101", "011", "0101", "111", "1101", "10101", "010101", "01101", ...} E4. commutativity-of- ++ (a ++ b) -------- (b ++ a) [["(11++1)"]] = [["(1++11)"]] = {"11", "1"}. E5. associativity-of- ++ (a ++ (b ++ c)) --------------- ((a ++ b) ++ c) [["(1++(11++0*))"]] = [["((1++11)++0*)"]] = {"1", "11", "", "0", "00", "000", ...}. E6. distribution-of-concatenation-over- ++ a(b ++ c) --------- (ab ++ ac) [["1(0*++1+)"]] = ["(10*++11+)"]] = {"1", "10", "100", ..., "11", "111", ...}. E7. left concatenation of e ea -- a [["e111"]] = [["111"]] = {"111"}. E8. right concatenation of e ae -- a [["10+e"]] = [["10+"]] = {"10", "100", "1000", ...}. E9. union of % (% ++ a) -------- a [["%++1*"]] = [["1*"]] = {"", "1", "11", "111", ...}. E10. right concatenation of % a% -- % [["1%"]] = [["%"]] = {}. E11. left concatenation of % %a -- % [["%(1++11)"]] = [["%"]] = {}. The following rules govern the use of these schemata: S1. E schemata can be reversed. S2. for all a,b,x,y: if a derives b, then x~a~y derives x~b~y. S3. for all a,b,c: if a derives b, and b derives c, then a derives c. (ab ++ ac) --------- a(b ++ c) follows from S1. a(b*)c ------ a(b+)c follows from S2. "110*" derives "11" because "110*" derives "11e" through R3 and "11e" derives "11" through E8, and S3 allows the intermediary not to be mentioned. It can be proved inductively that anything derived from a is also entailed by a. Base Cases. For all x, y, z, r, s, t: R1. [[s]] SUBSET [[s]] u [[t]]. R2. [[s]]+ SUBSET [[s]]*, by definition of Kleene closures. R3. [["e"]] SUBSET [[s]]*, because {} SUBSET A, for all A. R4. [[s]] SUBSET [[s]]+, since A+ contains A**1. R5. [[s]] ~~ [[s]]+ SUBSET [[s]]+. E1. [[s]]* ~~ [[s]] = [[s]]+. First, note that x ~~ (y u z) = (x ~~ y) u (x ~~ z); both are {x~m | m in y or m in z}. Then from the definition of [[s]]* as the smallest set containing all [[s]]**i for i in |N, see that [[s]]* ~~ [[s]] is the smallest set containing all [[s]]**(i+1) for i in |N. This is the definition of [[s]]+. E2. The same argument except that [[s]] ~~ [[s]]* produces the smallest union of [[s]]**(i+1) for i in |N. E3. [[s]]+ ~~ [[s]] = [[s]] ~~ [[s]]+. Each is the smallest union of [[s]]**(i+1) for i in |N -- {0}; i.e., of [[s]]**i for i in |N -- {0, 1}. E4. [[s]] u [[t]] = [[t]] u [[s]]. E5. [[s]] u ([[t]] u [[r]]) = ([[s]] u [[t]]) u [[r]]. E6. As in the argument for E1, x ~~ (y u z) == (x~~y) u (x~~ z). E7. {""} ~~ [[s]] = [[s]]. E8. [[s]] ~~ {""} = [[s]]. E9. {} u [[s]] = [[s]]. E10. {} ~~ [[s]] = {}. E11. [[s]] ~~ {} = {}. Inductive cases. if [[s]] SUBSET [[t]] then [[x]]~~[[s]]~~[[y]] SUBSET [[x]]~~[[s]]~~[[y]]. if [[s]] SUBSET [[t]] and [[t]] SUBSET [[r]] then [[s]] SUBSET [[r]]. Since this exhausts all the possible derivations of one regexp from another, all regexp's preserve the SUBSET relation for the regexps' denotations. EQUIVALENCE RELATIONS is a binary relation over A just in case R SUBSET { | x in A and y in A}. Relations are often called by the name of their first element since the domain is often unimportant. Distinguish by context. <{<0, 1>, <1, 1>}, {0, 1}> is a relation over {0, 1}; it is often called "{<0, 1>, <1, 1>}". is often called just "R". In that case, dom(R) = A, is R's domain. It would be better to say dom() = A, and extension() = R, but nobody does this because it is easier to assume "R" is overloaded as the name as the relation and the name of its extension. A binary relation, R, is reflexive just in case R is a binary relation over A, and (A X A) SUBSET R. That is, for all x: if x in A then in R. The relation {<0, 0>, <1, 1>} on {0, 1} is reflexive. The relation {<0, 0>, <1, 1>, <1, 0>} on {0, 1} is reflexive. The relation {<0, 0>, <1, 1>} on {0, 1, 2} is not reflexive. Love is a reflexive relation if everyone loves himself. Equality is a reflexive relation. A binary relation, R, is reflexive on B, B SUBSET dom(R), just in case for all x: if x in B then in R. Ever relation is reflexive on {}. The restriction of a binary relation, R, to B is (R n (B X B)), with domain (dom(R) n B)**2. A binary relation, , is reflexive on B just in case its restriction to B is reflexive. A binary relation, R, is symmetric just in case for all x, y: if in R then in R. Note that symmetry does not refer to the relation's domain. {<0, 1>} is not symmetric. {<0, 1>, <1, 0>} is symmetric. {<0, 1>, <1, 0>, <1, 1>, <2, 1>, <1, 2>} is symmetric. {} is symmetric no matter what the claimed domain. Love is a symmetric relation in a perfect world. Equality is a symmetric relation. A binary relation, R, is transitive just in case for all x, y, z: if in R and in R then in R. Transitivity does not refer to the relation's domain. {<0, 1>, <1, 2>, <0, 2>} is transitive. {} is transitive. {<0, 1>} is transitive. {<0, 0>} is transitive. {<0, 1>, <1, 0>} is not transitive. {<0, 1>, <1, 0>, <0, 0>, <1, 1>} is transitive. Love is not a transitive relation, but exposure to infection is. A binary relation that is reflexive, symmetric, and transitive is an equivalence relation (over its domain). Equivalence relations are also called equivalences. The following union is an equivalence relation on {0, 1, 2, 3, 4, 5}: {<0, 0>, <1, 1>, <1, 0>, <0, 1>} u {<2, 2>} u {<3, 3>, <4, 4>, <5, 5>, <3, 4>, <4, 3>, <3, 5>, <5, 3>, <4, 5>, <5, 4>}. The relation "starts-with-the-same-symbol-as" is an equivalence on the language denoted by the regexp "a+ ++ b+". We can now write equivalence schemata as equivalence relations over sentences in a language. Inference schemata will also be relations, but not equivalence relations, over sentences in a language. The equivalence class of a under the equivalence relation R is: [a]_R = {x | in R}. Thm. For all equivalences R and all x, y in dom(R): if in R then [x]_R = [y]_R. The equivalence relation "has-the-same-length-as" on the language {"a", "bb", "ab", "ba", "aaa", "abb"} has three equivalence classes: ["a"] = {"a"}; ["bb"] = ["ab"] = ["ba"] = {"bb", "ab", "ba"}; ["aaa"] = ["abb"] = {"aaa", "abb"}. A set of sets, P, is pairwise exclusive just in case for every x, y in P: if x != y then x n y = {}. { {0, 1}, {1}, {} } is not pairwise exclusive. { {0, 2}, {1}, {} } is pairwise exclusive. x is in a generalized union, UNION{p in P}(f(p)), just in case for some y in P: x in f(y). UNION{x in {0, 1, 2}}({"a"}**x) = {"", "a", "aa"}. Thm. UNION{x in 2**A}(x) = A. A partition, P, of a set S, is a set of pairwise exclusive non-empty sets whose union is S; i.e., 1. for all p_1, p_2 in P: if x != y then x n y = {}; and 2. UNION{p in P}(p) = S. Elements of P are called blocks. { {0, 1}, {2, 4}, {3, 5, 6, 7}, {8} } partitions the first nine natural numbers into four sets. { {0, 1}, {}, {1, 0} } is not a partition of {0, 1, 2} for three reasons: it contains an empty block, its blocks are not pairwise exclusive, and its generalized union is not {0, 1, 2}. {} cannot be partitioned. For any partition, P, of any set S, the binary relation over S: { | for some p in P, x in p and y in p} is an equivalence relation over S; and each member of P is an equivalence class under this equivalence relation. The partition { {0}, {1, 2} } of {0, 1, 2} induces the equivalence {<0, 0>, <1, 1>, <2, 2>, <1, 2>, <2, 1>}. { {, }, {}, {} } is a three-set partition of {me, you}**2. The equivalence relation it induces has three equivalence classes: [] = [] = {, }; [], a singleton; [], also a singleton. The equivalence relation induced by this partition is: { < , >, < , >, < , >, < , >, < , >, < , > }. A partition, P, of S, is also called the quotient set of S under the equivalence relation, R, induced by the partition. This is because the equivalence relation "divides" S into its equivalence classes. GENERAL RELATIONS and FUNCTIONS , or just R, is a relation on A X B just in case R SUBSET A X B. Note that A or B themselves could be Cartesian products of sets, which allows relations that contain any kind of tuple. For example, {<0, 1>, <1, 1>} is a relation on |N**2; {<0, 1, 0>, <1, 1, 0>} is a relation on |N**3; {<0, 1, 1, 0>} is a relation on |N**4; {<0, 1, 1, 1, 1>, <0, 0, 0, 0, 0>, <1, 1, 1, 1, 1>} is a relation on |N**5; and so forth. Note that a binary relation over A is a binary relation on A X A. This terminology is unfortunate. f is an extensionally represented function (or just a function) from A (f's domain) to B (f's co-domain) just in case 1. f is a relation on A X B; 2. for all x, y, z: if in f and in f then y = z; i.e., no member of the domain is related to two distinct members of the co-domain under f. {<0, 1>, <2, 4>} is a function on {0, 2} X {1, 4}. It is also a function on any superset of {0, 2} crossed with any superset of {1, 4}. {} is a function on any domain (and co-domain). succ is a function on |N**2. Declaring that f is a function from A to B is often written f: A --> B. Write f(x) = y when f is a function and in f. If f as an extensionally represented function is {<0, 1>, <2, 4>}, then f(0) = 1, since <0, 1> in f. A rather different notation takes juxtaposition with parenthesization to imply evaluation. Thus, f(0) applies f to 0, and f must have the special form of a LAMBDA expression, which states how 0 is transformed by the function, e.g., f = LAMBDA(x)(1 if x=0; else 4 if x=2). f(x) in the present notation is the y such that in f; which some write (iy)( in f). An assertion that makes use of this latter expression, e.g., (iy)( in f) in A, as follows: for all y: if in f then y in A; and for all y, z: if in f and in f, then y = z. R is a total relation just in case dom(R) = A and for every x in A: there is some y such that in R. {<0, 0>, <1, 0>, <2, 0>} is total on {0, 1, 2}**2. {} is total only on the domain {} (with any co-domain). R is a partial relation just in case R is a relation and is not a total relation. As special cases, f is a total function if it is defined over its entire domain; otherwise it is a partial function. {<0, 1>, <1, 1>, <3, 3>} over {0, 1, 2, 3} is partial. {} is a partial function on all domains except {}. Let the function "next" be defined inductively on |N: next(3) = 5; for all x: if next(x) = 5 then next(2x) = 5. This function is partial on |N. It is the relation {<3, 5>, <6, 5>, <12, 5>, <24, 5>, ...} = {<3(2**i), 5> | i in |N}. It does not associate a value with many of the members of |N. GRAPHS is a directed graph (digraph, or just graph) just in case E SUBSET (V X V). V is the set of vertices (or nodes). E is the set of edges (or links). Since digraphs are difficult to depict in this character set, we will use a connectivity-matrix representation: a V X V grid is provided, and a "!" appears in row i and column j when there is an edge from vertex i to vertex j in the graph. All other entries are "-". 0 1 2 3 4 0 - - - - - 1 - - - - - 2 - - - - - 3 - - - - - 4 - - - - - is a graph with five vertices and no edges. 5 6 7 8 5 - - - - 6 - - ! - 7 - - - - 8 - - - - is a graph with five vertices {5, 6, 7, 8} and the single edge <6, 7>. For a in V, the digraph has a loop at a, just in case in E. 1 2 1 ! - 2 - ! is a graph with loops at both vertices. For a digraph , is a subdigraph (or just subgraph) of just in case S SUBSET V; and T SUBSET E; and is a digraph. For a digraph , is a proper subdigraph (or just subgraph) of just in case S SUBSET V; and T SUBSET E; and is a digraph; and S != V or E != T. That is, a subdigraph can be created by deleting a vertex, an edge, both, or many of each. <{}, {}> is a subdigraph of any digraph, but not necessarily a proper subdigraph, since <{}, {}> is itself a digraph. An equally popular definition of subdigraph is what we have called restriction: A restriction of to a proper subset of V, S, is the digraph . 0 1 2 3 4 5 0 - - - - - - 1 - ! - - - - 2 - - - ! - - 3 ! - - - - - 4 ! - - ! ! ! 5 - - - - - - has many subgraphs; among them are: itself, which is not a proper subgraph; 0 1 0 - - 1 - ! which is a restriction to {0, 1}; 1 2 5 1 ! - - 2 - - - 5 - - - which is a restriction to {1, 2, 5}; and 0 1 2 3 4 5 0 - - - - - - 1 - - - - - - 2 - - - - - - 3 - - - - - - 4 - - - - ! ! 5 - - - - - - which is a proper subgraph, but not a restriction. There is a path from s to t in , which we write as path(s, t, ), just in case: 1. in E; or 2. for some a, in E, and path(a, t, ). The following two definitions are equivalent: path(s, t, ) just in case: 1. in E; or 2. for some a, path(s, a, ), and in E. path(s, t, ) just in case: 1. in E; or 2. for some a, path(s, a, ), and path(a, t, ). Let the graph depicted below be G: 0 1 2 3 4 0 - ! - - - 1 - - ! - - 2 - - - ! - 3 - - - - ! 4 ! - - - - then, path(0, 1, G), path(1, 2, G), path(0, 2, G), etc. A sequence of vertices is a path in just in case 1. the sequence has at least two elements; 2. in E; 3. for all i such that v_i is not the final element in the sequence: in E. Again in G, <0, 1> is a path, as are: <1, 2>, <3, 4>, <4, 0>, <0, 1, 2>, <1, 2, 3>, <2, 3, 4>, <3, 4, 0>, <0, 1, 2, 3>, <1, 2, 3, 4>, <2, 3, 4, 0>, <3, 4, 0, 1>, etc. Alternatively, since is <, v_2>, 1. is a path if in E; 2. if is a path and in E, then <, x> is a path. If the sequence has length k, for k in |N, then the path has length k. A digraph is strongly connected just in case for all x, y in V: if x != y then path(x, y, ). G is strongly connected since: path(0, 1, G); i.e., <0, 1>; ... path(1, 0, G); i.e., <1, 2, 3, 4, 0>; ... ... path(3, 4, G); i.e., <3, 4>; The graph G_2: 0 1 2 3 4 0 - ! - - - 1 - - ! - - 2 - - - ! - 3 - - - - ! 4 - - - - - is not strongly connected. One reason is that !path(4, 0, G_2). A digraph is connected just in case for all x, y in V: if x != y then (path(x, y, ) or path(y, x, )). Strong connectivity requires paths between any distinct pair of vertices, in both directions. Connectivity (simpliciter) requires only a path in one direction between distinct vertices. A component, C, of digraph is a connected subdigraph of such that for all D: if D is a connected subdigraph of , then C is not a proper subdigraph of D. That is, a component is a maximal connected subgraph; i.e., if C is a component of , then it cannot be augmented, remain connected, and remain a subgraph all at the same time. 0 1 2 3 4 5 0 - - - - - - 1 - - ! ! - - 2 - - - - - - 3 - - - - - - 4 - - - - - ! 5 - - - - - - has three components: <{0}, {}>, <{1, 2, 3}, {<1, 2>, 1, 3>}>, and <{4, 5}, {<4, 5>}>. A graph, , is complete just in case E = V**2. 1 2 1 ! ! 2 ! ! is a complete graph with two vertices. GRAPHS DEPICTING RELATIONS A digraph depicts a binary relation E over V. A digraph contains a loop at vertex v just in case in E. A binary relation is reflexive just in case its digraph has loops at all vertices. An equivalence class is a digraph whose components are completely connected. A binary relation, R, is irreflexive just in case for all x in dom(R): !in R. A binary relation, R, is irreflexive just in case no vertex in its digraph has a loop. If is an edge in and x != y, then is edge's reverse. A binary relation is symmetric just in case E contains the reverses of all edges in E; i.e., E is closed under edge-reversal. A binary relation, R, is asymmetric just in case for all x, y: if in R then !in R. A binary relation, R, is asymmetric just in case for no edge in E does E also contain its reverse; and E contains no loops. A binary relation, R, is antisymmetric just in case for all x, y: if in R and in R, then x = y. A binary relation is antisymmetric just in case E contains the reverse of no edge in E. Not that this does not preclude loops. Three equivalent definitions of antisymmetric are: R is antisymmetric just in case for all x, y: if x != y then !in R or !in R. R is antisymmetric just in case for all x, y: if x != y then if in R then !in R. R is antisymmetric just in case for all x, y: if x != y then if in R or !in R. Thm. An asymmetric relation is also antisymmetric. Pf. Suppose R is asymmetric but not antisymmetric. If not antisymmetric, then there is some in R such that is also in R and a != b. But asymmetry already disallows the first two conditions holding at the same time. Thm. An irreflexive antisymmetric relation is asymmetric. Pf. Let R be irreflexive and antisymmetric. From irreflexivity, there are no x and y such that x = y and in R. Since the consequent of the condition for antisymmetry can never be satisfied at the same time as the antecedent, the antecedent must never be true: that is, for all x, y: it can never be the case that in R and in R. This implies asymmetry. PARTIAL ORDERS R is a partial order on B just in case R is a binary relation over B that is 1. reflexive (i.e., reflexive on B); 2. antisymmetric; and 3. transitive. LESS-OR-EQ = {<0, 0>, <0, 1>, <1, 1>, <0, 2>, <2, 2>, ...} is a partial order on |N. is-a-prefix-of = { | x is a prefix of y} is a partial order on the language denoted by the regexp "01+". is-no-longer-than = { | x is no longer than y} is not a partial order on the language denoted by the regexp "a+ ++ b+". The relation depicted as: 0 1 2 3 4 5 0 - - - - - - 1 - - ! ! - - 2 - - - - - - 3 - - - - - - 4 - - - - - ! 5 - - - - - - is antisymmetric and transitive, but not reflexive, so it is not a partial order. But the relation: 0 1 2 3 4 5 0 ! - - - - - 1 - ! ! ! - - 2 - - ! - - - 3 - - - ! - - 4 - - - - ! ! 5 - - - - - ! is a partial order. 1 precedes 2 and 3, 4 precedes 5. There is no ordering between 2 and 3, or between any members of different components. x is a minimal element in R just in case there is no z in dom(R) s.t. in R. Thm. All finite partial orders have at least one minimal element. The order G_1, 0 1 2 3 4 5 0 ! - - - - - 1 - ! ! ! - - 2 - - ! - - - 3 - - - ! - - 4 - - - - ! ! 5 - - - - - ! has three minimal elements: 0, 1, and 4. The order, G_2, 0 1 2 3 4 5 0 ! - - - - - 1 ! ! ! ! ! ! 2 ! - ! - - ! 3 ! - ! ! ! ! 4 ! - - - ! ! 5 ! - - - - ! has one minimal element, 1. x is the least element of R just in case 1. x is a minimal element of R; and 2. for all y in dom(R): in R. Thm. Not all partial orders have a least element. Thm. A least element, if it exists, is unique. G_2 has a least element, but G_1 does not, since 0, 1, and 4 are not comparable in the order. x is an upper bound under R of a set S just in case for all z in S: in R. In G_2, 5 and 0 are both upper bounds of {2, 3, 4}. x is the least upper bound under R of a set S just in case 1. x is an upper bound; and 2. for all y: if y is an upper bound under R of S, then in R. In G_2, 5 is the least upper bound of {2, 3, 4}, since 5 precedes 0 in the order. x is a minimal upper bound under R of a set S just in case 1. x is an upper bound; and 2. for all z: if z is an upper bound under R of S, then !in R. In G_2, 2 and 4 are both minimal upper bounds of {1, 3}. Similar definitions can be given for maximal elements, greatest element, lower bounds, greatest lower bound, and maximal lower bounds: x is a maximal element in R just in case there is no y in dom(R) s.t. in R. x is the greatest element of R just in case 1. x is a maximal element of R; and 2. for all z in dom(R): in R. x is a lower bound under R of a set S just in case for all y in S: in R. x is the greatest lower bound under R of a set S just in case 1. x is a lower bound; and 2. for all z: if z is a lower bound under R of S, then in R. x is a maximal lower bound under R of a set S just in case 1. x is a lower bound; and 2. for all y: if y is a lower bound under R of S, then !in R. G_1's maximal elements are 0, 2, 3, and 5. G_2 has a single maximal element, a greatest element, which is 0. The lower bounds of {2, 4} in G_2 are 1 and 3. The greatest lower bound of {2, 4} in G_2 is 3, since 1 precedes 3 in the order. {5, 0} in G_2 has the lower bounds 1, 2, 3, and 4. 1 precedes 3, which precedes both 2 and 4. But 2 and 4 are not ordered. So 2 and 4 are both maximal lower bounds of {5, 0} in G_2. CLOSURE, REDUCTION, AND COMPOSITION OF RELATIONS A relation R on B is an extension of a relation S on B (and S is a reduction of R) just in case S SUBSET R. Note that reductions differ from subdigraphsL a reduction is a relation with the same domain but perhaps smaller extension; i.e. subrelations correspond to subdigraphs where the set of vertices remains the same. {<0, 1>, <1, 1>} on {0, 1} is an extension of {<0, 1>} on {0, 1}. Obs. Reduction is one way to form a subgraph of the depicted relation: by deleting edges. The SUBSET relation partially orders the set of extensions of a relation. The extensions of {} on {0, 1}: E_1: {<0, 0>}; E_2: {<1, 1>}; E_3: {<0, 0>, <1, 1>}; are partially ordered by {, , , , }. The reflexive closure, r(R), of a binary relation, R over B, is the smallest extension of R that is reflexive. Thm. This minimization is well-defined; i.e., there is a unique minimal reflexive extension of R. Thm. r(R) = R u { | x in dom(R)}. The reflexive closure of 0 1 2 3 4 5 0 - - - - - - 1 - - ! ! - - 2 - - - - - - 3 - - - - - - 4 - - - - - ! 5 - - - - - - is 0 1 2 3 4 5 0 ! - - - - - 1 - ! ! ! - - 2 - - ! - - - 3 - - - ! - - 4 - - - - ! ! 5 - - - - - ! The symmetric closure, s(R), of a binary relation, R over B, is the smallest extension of R that is symmetric. Thm. This minimization is well-defined. Thm. s(R) = R u { | in R}. Thm. For all binary relations R over B, s(r(R)) = r(s(R)); i.e., s and r commute. The symmetric closure of 0 1 2 3 4 5 0 - - - - - - 1 - - ! ! - - 2 - - - - - - 3 - - - - - - 4 - - - - - ! 5 - - - - - - is 0 1 2 3 4 5 0 - - - - - - 1 - - ! ! - - 2 - ! - - - - 3 - ! - - - - 4 - - - - - ! 5 - - - - ! - The irreflexive reduction, i(R), of a binary relation, R over B, is the greatest reduction of R that is irreflexive. Thm. i(R) = R -- I_B, where I_B = { | x in B}. The irreflexive reduction of 0 1 2 3 4 5 6 7 8 0 - - - - - - - - - 1 - ! ! ! - - - - - 2 - ! - - - - - - - 3 - ! - - - - - - - 4 - - - - ! ! - - - 5 - - - - ! - - - - 6 - - - - ! - ! - - 7 - - - - ! - - - - 9 - - - - ! - - - - is 0 1 2 3 4 5 6 7 8 0 - - - - - - - - - 1 - - ! ! - - - - - 2 - ! - - - - - - - 3 - ! - - - - - - - 4 - - - - - ! - - - 5 - - - - ! - - - - 6 - - - - ! - - - - 7 - - - - ! - - - - 9 - - - - ! - - - - An asymmetric reduction, S, of a binary relation, R over B, is a maximal reduction of R that is asymmetric. Obs. Since asymmetric reduction is not unique, it cannot be written as a function, e.g., a(R). One asymmetric reduction of 1 2 1 ! ! 2 ! ! is 1 2 1 - - 2 ! - and the other is 1 2 1 - ! 2 - - . The transitive closure, t(R), of a binary relation, R over B, is the smallest extension of R that is transitive. The transitive closure of 1 2 3 1 - ! - 2 ! - ! 3 - - - is 1 2 3 1 ! ! ! 2 ! ! ! 3 - - - . For a binary relation over B, the composition of R with itself, R o R, is { | for some y: in R and in R}, a binary relation over B. The composition of a relation R over A X B, with a relation S over B X C, is the relation R o S = { | for some y: in R and in S}, a relation over A X C. Compositional powers: 1. R**0 = I_dom(R) = { | x in dom(R)}. 2. R**n = R o R**(n-1). Note that the notation is overloaded again: **-powers represent repeated set-concatenation, repeated Cartesian product, repeated relational composition, and repeated multiplication. Distinguish by context. R**1 = R. R**2 = R o R. R**3 = R o R**2 = R o (R o R). etc. Thm. For all R: R o (R o R) = (R o R) o R. Let R be depicted as 0 1 2 3 4 0 - ! - - - 1 - - ! - - 2 - - - ! - 3 - - - - ! 4 - - - - - . R**0 is 0 1 2 3 4 0 ! - - - - 1 - ! - - - 2 - - ! - - 3 - - - ! - 4 - - - - ! . R**1 is R. R**2 is 0 1 2 3 4 0 - - ! - - 1 - - - ! - 2 - - - - ! 3 - - - - - 4 - - - - - . R**3 is 0 1 2 3 4 0 - - - ! - 1 - - - - ! 2 - - - - - 3 - - - - - 4 - - - - - . R**4 is 0 1 2 3 4 0 - - - - ! 1 - - - - - 2 - - - - - 3 - - - - - 4 - - - - - . R**4 and all higher powers are 0 1 2 3 4 0 - - - - - 1 - - - - - 2 - - - - - 3 - - - - - 4 - - - - - . Thm. For any relation R, t(R) = UNION{i in |N+}(R**i). Thm. For any binary relation over a finite-domain, there is an upper bound on the number of distinct compositional powers. Thm. The number of distinct reflexive binary relations over a domain of n elements is 2**(n**2 - n). Pf. Considering the connectivity matrix, there are n choices along the diagonal that are fixed. There are n**2 total choices. Thm. The number of distinct irreflexive binary relations over a domain of n elements is 2**(n**2 - n). Thm. The number of distinct symmetric binary relations over a domain of n elements is 2**((n**2 + n)/2). Pf. There are (n**2 - n)/2 choices in the lower left triangle of the connectivity matrix. Choices here fix the choices in the upper right triangle. There are n additional choices along the diagonal. Thm. The number of distinct binary relations over a domain of n elements is 2**(n**2). This bounds the number of distinct compositional powers. Thm. For all R: r(t(R)) = t(r(R)) = t(R) u R**0 = UNION{i in |N}(R**i). Obs. For some R: s(t(R)) != t(s(R)). Pf. Consider S = {<0, 1>, <2, 1>}. s(S) = {<0, 1>, <2, 1>, <1, 0>, <1, 2>}, and t(s(S)) = {0, 1, 2}**2. But t(S) = S, so s(t(S)) = s(S), which is not t(s(S)). The order, G_2, 0 1 2 3 4 5 0 ! - - - - - 1 ! ! ! ! ! ! 2 ! - ! - - ! 3 ! - ! ! ! ! 4 ! - - - ! ! 5 ! - - - - ! is the reflexive transitive closure of 0 1 2 3 4 5 0 - - - - - - 1 - - ! ! ! - 2 ! - - - - ! 3 - - ! - ! - 4 ! - - - - ! 5 ! - - - - - . Since orders are more easily depicted without loops and shortcuts, they are usually so depicted, with the understanding that transitive and reflexive closures are to be taken. A diagram that is the minimal relation whose transitive reflexive closure is the desired partial order is called the Hasse diagram of the order. Thm. If f is a function and g is a function then f o g, sometimes written gf, is a function. Function composition is often written in columns: 0 -------> 0 -- 1 ----> 1 \ \----> 1 2 --/ 2 \ 2 3 \----> 3 which is more easily written 0 ------> 0 ------> 1 1 2 2 ------> 1 ------> 3 3 2 depicts the composition of {<0, 0>, <2, 1>} with {<0, 1>, <1, 3>}. The resulting, composed fuction is {<0, 1>, <2, 3>} a partial function on {0, 1, 2, 3} X {1, 2, 3}. EDGE-WEIGHTED DIGRAPHS As a prelude to the definitions of finite state automata and parse trees, consider the edge-weighted digraph. is an edge-weighted digraph just in case 1. is a digraph; and 2. f: E --> |R, the set of real numbers. The weight of a path in the graph, s = , is defined inductively: 1. if for some a, b: s = , and in E, then weight(s) = f(); 2. if for some a, b, t: s = a & (b & t), and a, b in V, then weight(s) = f() + weight(b & t). The least-weighted path (shortest path) between vertices v_s and v_f in an edge-weighted digraph, , is defined inductively: 1. for all x: if in E then LW_0(x) = f(); otherwise LW_0(x) = |N; 2. for all i: for all x: for all y: if in E and ( LW_i(x) + f() < LW_(i)(y) or LW_(i)(y) = |N ) then LW_(i+1)(y) = LW_i(x) + f(). If for some i: for all x in V: LW_i(x) = LW_(i-1)(x), then LW_i(v_f) is the length of the least-weighted path from v_s to v_f. FINITE STATE AUTOMATA FORMALIZED A deterministic finite state automaton is a 5-tuple just in case: 1. Q is finite (the set of states); 2. q_0 in Q (the starting state); 3. A SUBSET Q (the set of accepting states); 4. S is a finite set of symbols (the input alphabet), not containing the special symbol epsilon, which we denote @ (not quoted); and 5. d is a function on (Q X S) X Q (the transition relation). A non-deterministic finite state automaton is a 5-tuple just in case: 1-4. (same as above) 5. d is a relation on (Q X (S u {@})) X Q (the transition relation). If d is partial on (Q X S) then is understood as: < Q u {q#}, q_0, A, S, d u { | there is no q' such that in d} u { } >, where q# does not already belong to Q. This adds a waste state, q#, which is the destination of all unnamed transitions, and from which there is no exit. ((q_0))---a--->(q_1)---b--->((q_2)) | ^ | | +----------b----------------+ is the machine M_1 = < {q_0, q_1, q_2}, q_0, {q_0, q_2}, {"a", "b"}, {, , } >. ((q_0))---a--->(q_1)---b--->((q_2)) ^ | | | +----------b----------------+ is the machine M_2 = < {q_0, q_1, q_2}, q_0, {q_0, q_2}, {"a", "b"}, {, , } >. A FSA is deterministic just in case 1. @ !in S; and 2. d is a function; otherwise it is non-deterministic. Both M_1 and M_2 above are deterministic. (q_0)---a--->(q_1)---b--->(q_2)---b--->((q_3)) ^ | | | | | | b | | | | | v | | ((q_4)) | | | | | +-----------@---------------------------+ is a non-deterministic machine: M_3 = < {q_0, q_1, q_2, q_3, q_4}, q_0, {q_3, q_4}, {"a", "b", @}, {, , , , } >. A deterministic machine makes transitions according to the following regimen: if s is a sequence of symbols from S then: s reaches q_f from q_s in just in case: 1. (base I) 1a. s in S; and 1b. in d; or 2. (base II) 2a. s = ""; and 2b. q_s = q_f; or 3. (recursion) for some a, t, q: 3a. s = a~t; 3b. a in S (is the first single symbol of s); 3c. in d; and 3d. t reaches q_f from q in . In M_1, "ab" reaches q_2 from q_0. "b" reaches q_2 from q_1. In M_2, "abbabb" reaches q_0 from q_0, as does "abb", and "". For all s: For all q: A deterministic machine terminates in q on input s just in case s reaches q from q_0. A deterministic machine accepts s just in case for some q: the machine terminates in q on s, and q in A. A non-deterministic machine can make transitions according to the following: if s is a sequence of symbols from S then: s can reach q_f from q_s in just in case: 1. (base I) 1a. s in S; and 1b. in d; or 2. (base II) 2a. s = ""; and 2b. q_s = q_f; or 3. (recursion I) for some a, t, q: 3a. s = a~t; 3b. a in S (is the first single symbol of s); 3c. in d; and 3d. t can reach q_f from q in ; or 4. (recursion II; epsilon-transitions) 4or some q: 4a. in d; and 4b. s can reach q_f from q in . In M_3, "ab" reaches q_2 from q_0 and "ab" also reaches q_4 from q_0. In M_3, "abb" reaches q_3 from q_0 and also reaches q_0 from q_0. For all s: For all q: A non-deterministic machine can terminate in q on input s just in case s can reach q from q_0. A non-deterministic machine accepts s just in case for some q: the machine can terminate in q on s, and q in A. This completes the definition of a finite state machine. Thm. If a language is accepted by some non-deterministic machine then there is some deterministic machine that also accepts this language. Thm. The set of automata that accept languages describable by regular expressions is exactly the set of finite state machines. This formal definition of finite state machines induces a formal definition of languages acceptable by finite state machines. This definition has been given in language that is rigorous enough that it can in principle be accepted by a machine (but not a finite state machine). TREES A tree is a connected digraph such that: 1. V != {}; 2. for some r in V: 2a. there is no w such that in E; 2b. for all s in V: if s != r, then there is a unique path from r to s; i.e., for all x and y: if x is a path from r to s in and if y is also a path from r to s in , then x = y. Note that connectivity and 2a guarantee that the vertex in 2a is unique. The vertex in 2a is called the root of the tree and is designated root(). A leaf of the tree is any member, x, of V such that there is no c such that in E. If x is a leaf of tree , then x in leaves(). For any vertex x in a tree , 1. y is a child of x just in case in E; 2. p is the parent of x just in case in E. y is-progeny-of x just in case in t(E). For a given tree, children(x) = {y | y is a child of x}. Note that the root has no parent. Leaves have no children (sorry about the mixed metaphor). +-------------12----------+ | | | | +--------10-------+ | | | | | | | +---7---+ +------8------+ +---9---+ | | | | | | | | | | | | | | 0 1 2 33 4 5 6 depicts a tree, with the convention that the edges point down. This tree has root 12, which has no parent. The leaves are 0, 1, 2, 33, 4, 5, and 6. children(12) = {9, 10}. children(10) = {7, 8}. children(10) = {7, 8}. 1 is-progeny-of 10. Alternatively, with the definition of leaf as above: 1. <{x}, {}> is a tree; 2. if is a tree and x !in V and y in V is a leaf, then }> is a tree. A relation R on A is connected just in case for every x and y in A: in A or in A. A linear order is a partial order that is connected. 1 ----> 2 ----> 4 ----> 3 ----> 0 ----> 5 depicts the Hasse diagram of a linear order, r(t( {<1, 2>, <2, 4>, <4, 3>, <3, 0>, <0, 5>} )). Linear orders are also called simple orders. Linear orders with least elements (including all finite linear orders) are well-orders. A sequence s is non-repeating just in case there are no x and y such that: for some a, b, c, d: 1. s = a~x~b and s = c~y~d; and 2. x = y and a != c. <1, 1, 3, 4> is not non-repeating. <1, 3, 4> is non-repeating. If P is a linear order on A, and B is a subset of A, then a non-repeating sequence, s, of objects from B is ascending in P just in case for every x, y, a, b, c: if s = a~x~b~y~c, for non-null x and y, and possibly null a, b, or c, then in P. <2, 6, 10, 14> is a non-repeating sequence of objects from EVENS that is ascending in LESS-OR-EQ. If is a tree and P linearly orders V, then the ordered children of a vertex v in V is the sequence s which is the longest non-repeating ascending sequence of objects from children(v). A parse tree for a string s derived by the grammar is an ordered tree where 1. root() = "S"; 2. leaves() SUBSET SIGMA_T; 3. P is a linear order on V; 4. for all x in V: x in (SIGMA_N u SIGMA_T); This requires nodes to be terminal or non-terminal symbols. Thus, ordered children are sequences of symbols, which are strings. 5. for all x in (V -- leaves()): for all sequences s: if s is the ordered children of x then in DELTA. PC AND VC PC: PC is the logical language that is intended to symbolize the following kinds of assertions: It is not spring. It is autumn or it is spring. If it is autumn then it is not spring. If it is autumn and it is spring, then air is water. but not: Man chases woman until she catches him. Every man loves some woman, and there is some woman who is loved by every man. If I had a hammer, I'd hammer in the morning. It is not raining and I know that it is not raining. Whoever is a man is taxed. The degree to which something is snow affects the degree to which it is white. A man is not really a man without a woman. That is, the connectives of interest to PC are motivated by consideration of English's "and", "or", "not", and "if ... then" constructions, and how they might interact. PC is not concerned with nuance (autumn is not autumn), shades of meaning (degrees of autumn), imperatives (look at autumn now!), interrogatives (what is autumn?), attitudes toward propositions (belief, desire, knowledge), variables (whoever, any), quantifiers (most, all, some, uniquely one), counterfactuals (if something were the case, then something else would be the case), dynamics of state or of belief (it became autumn), pronominal reference (it is a season), and so forth. PC is a small, unambitious language of historical interest to logicians and lingering practical interest to hardware designers. WELL-FORMEDNESS OF PC: PC (propositional calculus) can be defined on any of a variety of alphabets. We choose the union of the sets: 1-connectives = {"-"}; 2-connectives = {"v", "^", "J"}; Proposition-Letters = {"1", "0", "P", "Q", "R", "P1", "P2", "P3", ...}; Grouping-Symbols = {"(", ")"}. Propositions are defined recursively as the smallest set s.t.: for all x: if x in Proposition-Letters then x in Propositions; for all x, y, in Propositions: '`(-x)'` in Propositions; and '`(x^y)'` in Propositions; and '`(xvy)'` in Propositions; and '`(xJy)'` in Propositions. "(P v (Q v P))" in Propositions. "(-P1)" in Propositions. "-(P v -(Q ^ P1))" in Propositions. "(P v (Q v (Q v -P)))" in Propositions. "(P ^ -P)" in Propositions. "(P v 1)" in Propositions. As with the language of regular expressions, parentheses can be omitted if they can be restored using the precedence order "-", "^", "v", "J". "P ^ Q v -P1 J P2" must be "(((P ^ Q) v (-P1)) J P2) Note that it is common in engineering to think of PC-connectives as (boolean) functions that map {0, 1}**2 to {0, 1}. This is not what we are doing here. Here, a PC-connective is a string. Concatenation, which is a 2-function, can take a 1-connective and a Proposition, and map them to a Proposition. PROOF-THEORY FOR PC: The following rules are used to define derivation in PC: Syntactic equivalence (Eq) is the smallest equivalence relation such that: for all q, r, s, t: E1. double negation '`-(-s)'` Eq s E2. distributing - over v '`-(s v t)'` Eq '`(-s ^ -t)'` E3. distributing - over ^ '`-(s ^ t)'` Eq '`(-s v -t)'` E4. commutativity of v '`(s v t)'` Eq '`(t v s)'` E5. commutativity of ^ '`(s ^ t)'` Eq '`(t ^ s)'` E6. associativity of ^ '`(s ^ (t ^ r))'` Eq '`((s ^ t) ^ r)'` E7. associativity of v '`(s v (t v r))'` Eq '`((s v t) v r)'` E8. distributing ^ over v '`(s ^ (t v r))'` Eq '`((s ^ t) v (s ^ r))'` E9. distributing v over ^ '`(s v (t ^ r))'` Eq '`((s v t) ^ (s v r))'` E10. disjoining tautology '`(s v 1)'` Eq "1" E11. disjoining falsehood '`(s v 0)'` Eq s E12. conjoining tautology '`(s ^ 1)'` Eq s E13. conjoining falsehood '`(s ^ 0)'` Eq "0" E14. tautology '`(s v -s)'` Eq "1" E15. falsehood '`(s ^ -s)'` Eq "0" E16. material conditional '`(s J q)'` Eq '`(-s v q)'` E17. idempotence of ^ '`(s ^ s)'` Eq s E18. idempotence of v '`(s v s)'` Eq s E19. substring equivalence if s Eq t then q~s~r Eq q~t~r Reduction (Red) is the smallest relation on Propositions**2 such that: for all s, t, q, r: R1. equality if s Eq t then s Red t R2. dilution s Red '`(s v t)'` R3. simplification '`(s ^ t)'` Red s R4. disjunctive syllogism '`(s ^ (-s v t))'` Red t R5. substring reduction if s Red t then q~s~r Red q~t~r "P1 ^ -P1" Eq "0". "-(-(-P1))" Eq "-P1". "P2 v (P1 ^ -P1)" Eq "P2". "P1 ^ P2 ^ P3 ^ P4 ^ P5" Eq "P2 ^ P2 ^ (P3 v P3) ^ P4 ^ P5 ^ P1". "-(P1 ^ -P2)" Eq "-P1 v P2". "P1 J P2" Eq "P2 v -P1". "P1 J (P2 J P3)" Eq "-P1 v (P2 J P3)" Eq "-P1 v (-P2 v P3)" Eq "-P2 v (-P1 v P3)" Eq "P2 J (P1 J P3)". "P1 J (P1 J P2)" Eq "-P1 v (-P1 v P2)" Eq "-P1 v P2" Eq "P1 J P2". "P1 ^ P2 ^ (-P1 v -P2 v P3)" Eq "P1 ^ (P2 ^ (-P2 v (P3 v -P1)))" Red "P1 ^ (P3 v -P1)" Eq "P1 ^ (-P1 v P3)" Red "P3". "P1 ^ P2 ^ (P1 J (P2 J P3))" Eq "(P1 ^ P2) ^ (-P1 v -P2 v P3)" Eq "(P1 ^ P2) ^ (-(P1 ^ P2) v P3)" Red "P3". The relation of syntactic entailment, |--, on Propositions**2, is the transitive closure of Red. "P1 J (P2 J P3)" |-- "P2 J (P1 J P3)". "P1 J (P1 J P2)" |-- "P1 J P2". "P1 ^ P2 ^ (-P1 v -P2 v P3)" |-- "P3". "P1 ^ P2 ^ (P1 J (P2 J P3))" |-- "P3". SEMANTICS FOR PC: Consider a function, f: Propostions --> {4, 7}. 4 and 7 are arbitrarily chosen so that you do not make unnecessary associations. The usual values are 0 and 1, which make PC's connectives look like operators in a boolean algebra. Historically, this is how PC was first developed. But modern logicians interested in multi-valued logics use the special values top and bottom, in some pre-defined order, epistemologists use the special values True and False, and real semanticists use arbitrary values; hence, our use of 4 and 7. If f satisfies the following constraints, then f is a model of PC: for all s, t in Propositions: 1. f("1") = 7; 2. f("0") = 4; 3. f('`s ^ t'`) = min(f(s), f(t)); i.e., f('`s ^ t'`) = 7 if f(s) = 7 and f(t) = 7; otherwise f('`s ^ t'`) = 4; 4. f('`s v t'`) = max(f(s), f(t)); i.e., f('`s v t'`) = 4 if f(s) = 4 and f(t) = 4; otherwise f('`s v t'`) = 7; 5. f('`-s'`) = absval(f(s) - 11). i.e. f('`-s'`) = 4 if f(s) = 7; otherwise f('`-s'`) = 7. It is not easy to write f, since it is infinite. However, if f is a model of PC and f is specified on all Proposition-Letters that occur in a subset of Propostions (members of which are called sentences of PC, or well-formed formulae, wffs), then f's value on all of those propositions is determined by the constraints on models, just like a recursive function is specified by base and recursion. A model can be displayed in a table: P1 P2 P3 P4 | -P1 --P1 P1^P2 P1vP2 P1JP2 --------------------+-------------------------------------- f 4 4 7 7 | 7 4 4 4 7 It is customary to include all models in a single table. Here is a fragment of that infinite table: P1 P2 P3 P4 | -P1 --P1 P1^P2 P1vP2 P1JP2 --------------------+-------------------------------------- f1 4 4 4 4 | 7 4 4 4 7 f2 4 4 4 7 | 7 4 4 4 7 f3 4 4 7 4 | 7 4 4 4 7 f4 4 4 7 7 | 7 4 4 4 7 f5 4 7 4 4 | 7 4 4 7 7 f6 4 7 4 7 | 7 4 4 7 7 f7 4 7 7 4 | 7 4 4 7 7 f8 4 7 7 7 | 7 4 4 7 7 f9 7 4 4 4 | 4 7 4 7 4 f10 7 4 4 7 | 4 7 4 7 4 f11 7 4 7 4 | 4 7 4 7 4 f12 7 4 7 7 | 4 7 4 7 4 f13 7 7 4 4 | 4 7 7 7 7 f14 7 7 4 7 | 4 7 7 7 7 f15 7 7 7 4 | 4 7 7 7 7 f16 7 7 7 7 | 4 7 7 7 7 But even this fragment is redundant. A proposition-letter need appear on the left side of the table only if it is mentioned somewhere in some wff on the right side. P1 P2 | -P1 --P1 P1^P2 P1vP2 P1JP2 ----------+-------------------------------------- g1 4 4 | 7 4 4 4 7 g2 4 7 | 7 4 4 7 7 g3 7 4 | 4 7 4 7 4 g4 7 7 | 4 7 7 7 7 Note that whenever f("P1^P2") = 7, f("P1vP2") = 7. Moreover, note that "P1" and "--P1" have the same entry in every row. For all s, t in Propositions: s is semantically equivalent to t just in case for every f: if f is a model of PC then f(s) = f(t). s semantically entails t, i.e., s |== t, just in case for every f: if f(s) = f("1") then f(t) = f("1"). Thm. For all s, t: s Eq t just in case s is semantically equivalent to t. Thm. For all s, t: if s |-- t then s |== t. That is, the definition of the relation "|--" is sound with respect to the definition of the relation "|==". If we were to add a new reduction rule, R6. P3-introduction s Red '`P3 ^ s'` then were to take |-- to be the transitive closure of Red, |-- would be unsound with respect to |==. From "P1", derive "P3 ^ P1", but there is a model, f, such that f("P1") = 7 and f("P3 ^ P1") = 4. Thm. For all s, t: if s |== t then s |-- t. That is, the definition of the relation "|--" is complete with respect to the definition of the relation "|==". If we were to remove R3 (simplification) from the requirements for the relation "Red", then take |-- to be the transitive closure of Red, then |-- is not complete with respect to |==. There would be no derivation from "P1 ^ P2" to "P1", but clearly any f that models PC that assigns "P1 ^ P2" the value 7 must also assign "P1" the value 7. Model-theory reflects proof theory in mathematical structure. Proof theory is one way to define entailment; model theory is another. One is superfluous in the presence fo the other, but the relation between the two has historical significance. Sometimes when '`s ^ t'` |-- q, one writes {s, t} |-- q. Technically, this changes |-- from a relation on Propositions**2 to a relation on (Propostions u 2**Propositions) X Propositions. This shorthand is only permissible when "^" is commutative, associative, and idempotent; otherwise, it is inappropriate to use a set to represent conjunction. M satisfies s, M sat s, just in case M is a model of PC and M(s) = 1. s is a tautology just in case for all M, a model of PC, M sat s. s is a falsehood just in case for all M, a model of PC, M !sat s. Note s |== t just in case for all M: if M sat s then M sat t. VC: VC is a language that shows that the material conditional in PC is not necessarily the most natural treatment of the locution "if ... then ...". VC is also one of the simplest languages with possible-worlds semantics. VC is intended to symbolize assertions of the form: If it were Spring, it would next be Autumn. and distinguish it from assertions of the form: If it is Spring, then it is Autumn next. The counterfactual conditional is false, whatever the season. The material conditional is true if it is not Spring. Material conditionals are always true when their antecedents are false (I have not uttered a falsehood if the condition for my assertion does not hold). Counterfactual conditionals do not share this property. WELL-FORMEDNESS OF VC: Like PC, VC (propositional calculus) can be defined on any of a variety of proposition letters. We choose the alphabet to include: 1-connectives = {"-"}; 2-connectives = {"v", "^", "J", ">"}; Proposition-Letters = {"1", "0", "P", "Q", "R", "P1", "P2", "P3", ...}; Grouping-Symbols = {"(", ")"}. Propositions are defined recursively as the smallest set s.t.: for all x: if x in Proposition-Letters then x in PC-Wffs; for all x, y, in PC-Wffs: '`(-x)'` in PC-Wffs; and '`(x^y)'` in PC-Wffs; and '`(xvy)'` in PC-Wffs; and '`(xJy)'` in PC-Wffs; for all x, y, in PC-Wffs: x in VC-Wffs; and '`(x>y)'` in VC-Wffs. for all x, y, in VC-Wffs: '`(-x)'` in VC-Wffs; and '`(x^y)'` in VC-Wffs; and '`(xvy)'` in VC-Wffs; and '`(xJy)'` in VC-Wffs; "(P1 J P2)", "(P1 > P2)" in VC-Wffs. "(P1 ^ P2) v ((P1 ^ P3) > P2)" in VC-Wffs. But "(P1 > (P2 > P3))" !in VC-Wffs. Normally, variable-conditional logics allow iteration of the variable conditional, ">". But this makes the semantics unncecessarily complex, so we limit the use of ">". The same parenthesis conventions hold, with ">" taking the least priority; i.e., ">" is last in the precedence order, after "J". All PC rules for Eq and Red remain, and are augmented by: VC-E1. factual antecedent '`(1 > t)'` Eq t VC-E2. counterfactual consequent '`(t > 0)'` Eq "0" VC-E3. tautological counterfactual '`(t > t)'` Eq "1" VC-R1. tautological counterfactual '`(s > t)'` Red '`(s J t)'` Again, |-- is defined as the transitive closure of Red. SEMANTICS FOR VC: The idea is to say that a conditional '`s > t'` holds if its consequent, t, holds in the closest possible world in which its antecedent holds. A possible-world for VC is a function that satisfies the constraints for being a PC-model. For all f: f is a possible world for VC just in case: for all s, t in PC-Wffs: 1. f("1") = me; 2. f("0") = you; 3. f('`s ^ t'`) = me if f(s) = me and f(t) = me; otherwise f('`s ^ t'`) = you; 4. f('`s v t'`) = me if f(s) = me or f(t) = me; otherwise f('`s v t'`) = you; 5. f('`-s'`) = me if f(s) = you; otherwise f('`-s'`) = you; Like models for PC, possible worlds for VC are infinite, but can be tersely specified by writing the possible world's value on each proposition letter of interest. Writing an "M" for me and a "Y" for you, here are four of the sixteen possible worlds (partial functions) that are distinguishable on "P1", "P2", "P3", and "P4". P1 P2 P3 P4 w1 M M M M w2 M M M Y w3 M M Y Y w4 M Y Y Y A model for VC is a triple where 1. W is a set of VC possible worlds; and i.e., a set of functions that are PC-models; and 2. w_0 in W; and i.e., w_0 is a distinguished world, the actual world; 3. CLOSER is a partial order on W with least element w_0. Like VC possible worlds, VC models are abbreviated and list only the values for the relevant proposition letters. One VC model that uses three possible worlds: f_1 = {<"P1", me>, <"P2", you>}; f_2 = {<"P1", you>, <"P2", you>}; f_3 = {<"P1", you>, <"P2", me>}; is: < {f_1, f_2, f_3}, f_1, r(t({, })) > which can be depicted: +----------+ | P1 ^ -P2 | +----------+ | | +----------+ | -P1 ^ -P2| +----------+ | | +----------+ | -P1 ^ P2 | +----------+ The partial order takes f_1 to be closer (to actual) than f_2, and f_2 to be closer than f_3. If M = is a VC model, then for all s, t in VC-Wffs: 1. if s in PC-Wffs then M(s) = w_0(s); and i.e., sentences not containing the variable conditional have the value that w_0 gives them; 2. M('`s > t'`) = me just in case for all w_i minimal in (CLOSER n {w | w in W and w(s)=me }**2): w_i(t) = me; and i.e., sentences containing the variable conditional are evaluated in all closest worlds in which the antecedent holds; if the consequent holds in all of those worlds, then the conditional holds; 3a. M('`-s'`) = me if M(s) = you; otherwise M(s) = you; and 3b. M('`s ^ t'`) = me if M(s) = me and M(t) = me; otherwise M('`s ^ t'`) = you; 3c. M('`s v t'`) = me if M(s) = me or M(t) = me; otherwise M('`s v t'`) = you. i.e., "(P1 > P2) ^ P3" holds just in case "(P1 > P2)" holds and so does "P3". M sat s just in case M is a VC model and M(s) = me. In the VC model f_1 = {<"P1", me>, <"P2", you>}; f_2 = {<"P1", you>, <"P2", you>}; f_3 = {<"P1", you>, <"P2", me>}; M = < {f_1, f_2, f_3}, f_1, r(t({, })) >, +----------+ | P1 ^ -P2 | +----------+ | | +----------+ | -P1 ^ -P2| +----------+ | | +----------+ | -P1 ^ P2 | +----------+ M("P1") = me; M("P2") = you; M("-P1") = you; M("P1 ^ P2") = you; M("P1 v P2") = me; M("P1 J P2") = M("-P1 v P2") = you; M("P1 > P2") = you; M("P2 > P1") = you; M("P2 > -P1") = me; M("P2 > (P1 v P2)") = me; M("(P2 > -P1) ^ (P2 > P2)") = me. In the VC model M_1 = < {f_1, f_2, f_3}, f_1, r(t({, })) >, +----------+ | P1 ^ -P2 | +----------+ / \ / \ +----------+ +----------+ | -P1 ^ -P2| | -P1 ^ P2 | +----------+ +----------+ note that f_2 and f_3 are non-comparable: M("-P1 > P2") = you; because even though f_3 is minimal among worlds satisfying the antecedent, and assigns "P2" the value me, f_2 is also minimal among worlds that satisfy the antecedent, and assigns "P2" the value you. The rules for M-valuation require that all such minimal worlds satisfy the consequent. Note that '`s J t'` and '`s > t'` do not always have the same value under a model of VC. FOL SUBSET OF ENGLISH So far, we have been using English as our meta-language. But we have been careful to use, as much as possible, a fragment of English. This fragment of English has certain formal aspects, and is governed by certain conventions. Among them are: COMPOSITIONALITY: The basic components are term constants, which refer to objects; ``me'' is a term constant; ``you'' is a term constant; ``"a"'' is a term constant; ``{}'' is a term constant; ``0'' is a term constant; and predicates, which assert that an object has some property, or that certain objects stand in some relation; ``in'' is a predicate; ``is a substring of'' is a predicate; ``PROPER-SUBSET'' is a predicate; ``SUBSET'' is a predicate; ``is a grammar'' is a predicate; ``is a finite state automaton'' is a predicate; ``is a predicate'' is a predicate. Predicates, especially 2-predicates (predicates that take two objects), are often called relations. This is unfortunate, since we already call extensionally represented relations "relations". The first is a string, the second is a 2-tuple. There are also functions, which map an object or objects to a single object; ``~'' is a function; ``&'' is a function; ``succ'' is a function; ``n'' is a function; ``u'' is a function; ``2**'' is a function. This terminology is also unfortunate, since we know extensionally represented functions to be "functions". Here we have strings we care calling functions, whereas earlier we took special 2-tuples to be functions. There are quantifiers, which allow assertions about unnamed objects; these require variables, which are bound by the quantifiers, and vary over possible objects: Universal: ``for all x:''; ``for every x:''; ``for each x:''; ``for no x:'' is the same as ``it is not the case that for all x:''; Existential: ``for some x:''; ``for at least one x:''; ``there is an x:''. ``there is no x:'' is the same as ``it is not the case that there is an x:'' These two kinds of quantifiers appear in a variety of forms: Repeated: ``for all x, y:''; is the same as ``for all x: for all y:''; ``there are x, y:''; is the same as ``there is an x: there is a y:''; Fixed Scope: A variable is bound when its interpretation is governed by a quantifier. Otherwise it is free. The scope of a quantifier is the range over which it binds a variable. ``for all x: if x in S then x in T; and for all x: if x in U then x in V'' contains two universal quantifications of ``x'', but they have non-overlapping scopes. The scope of the first quantification of "x" extends until the "and". One might as well have used two different variables for the two different scopes. Implicit Universal: variables unbound by quantifiers are understood as universally quantified with the widest possible scope: ``if x in S then {x} in 2**S''; is the same as ``for all x, S: if x in S then {x} in 2**S''; Typed: ``for x in S:''; e.g., ``for x in S: x in T'' is the same as ``for all x: if x in S then x in T''; ``there is an x in S:''; e.g., ``there is an x in S: x in T'' is the same as ``there is an x: x in S and x in T''; Nested: ``for all x: there is some y:''; ``for some x: for every y:''; note that the latter choice of what the variable names depends on what choices were made in the outer quantification: ``for all x: there is some y: loves(x, y)'' reports veridically of a situation in which everyone has a (possibly different) object of affection; ``for some y: for every x: loves(x, y)'' reports on some one person whom everyone likes. There are connectives, which combine assertions. ``x in T and y in S''; ``x in T or y in S''; ``if x in T then y in S''; ``x !in T''. There are parenthetical devices, which help with the parsing of the assertion. ``if (x in T and y in S) then in (T X S)''; ``if x in T and y in S then in (T X S)''. INFERENCE: Users of this fragment of English frequently presuppose certain inference rules governing the use and meaning of their locutions: Some of them are: ``or'' is inclusive: ``x in S or y in T'' is true even when both are true; ``and'' is a compound assertion: if ``x in S and y in T'' is true, then ``x in S'' is true; ``if ... then ...'' is true when its antecedent is false: ``if x in S then y in T'' is true when ``x in S'' is false, regardless of the veracity of ``y in T''; universals can be instantiated: if ``for all x: x in S'' is true, then ``0 in S'' is true; individuals can be generalized to existentials: if ``0 in S'' is true, then ``for some x: x in S'' is true; variables can be changed if they do not alter scope: if ``for all x: x in S'' is true, then ``for all y: y in S'' is also true; but sometimes ``for all x: x in S'' is not true when ``for all S: S in S'' is true; the order of quantification can be switched if it does not swap existentials and universals: ``for all x: for all y: in S'' says the same thing as ``for all y: for all x: in S''; but ``for all x: there is some y: in S'' is not the same as ``there is some y: for all x: in S''; negated existentials are universals and vice versa: ``for all x: x !in S'' is the same as ``for no x: x in S''; and ``there is no x: x in S'' is the same as ``for all x: x !in S''; quantifiers need not quantify assertions that do not contain the variables they quantify: ``for all x: x in S and {} in S'' is the same as ``{} in S and for all x: x in S''; conjoined universals can be collapsed: ``(for all x: x in S) and (for all y: y in T)'' which is the same as ``(for all x: x in S) and (for all x: x in T)'' amounts to the same thing as ``for all x: x in S and x in T''; disjoined existentials can be collapsed: ``(for some x: x in S) or (for some y: y in T)'' which is the same as ``(for some x: x in S) or (for some x: x in T)'' amounts to the same thing as ``for some x: x in S or x in T''; but note that disjoined universals are weakened by collapsing, ``(for all x: x in S) or (for all x: x in T)'' is stronger than ``(for all x: x in S or x in T)'', since the first requires at least one of the sets to contain everything; whereas the second requires only that the union of the two sets be universal; and conjoined existentials are strengthened by collapsing, ``(for some x: x in S) and (for some x: x in T)'' is weaker than ``(for some x: x in S and x in T)'', since the first allows one of the sets to be empty, while the second requires that both sets be non-empty. There are probably more inference rules that govern the user of the first-order-logic fragment of En