\section{Sets} \label{sets} These are abstract mathematical sets: unordered collections with uniqueness (an item can only occur once). They contain values, which are of the type specified when the set was created. Types are not checked here (all typechecking is in the evaluation module). Every set has an address in the type of sets [[*set-type*]]. Since a pointer is the only thing that can change, sets are static. To "edit" a set, you create a new set based on the old one and change a pointer somewhere to point at the new set. This module currently only implements finite sets. Later iterations will certainly implement sets that contain a possibly-infinite number of addresses. <>= <> <> <> <> @ <>= (in-package max) @ Tests: We need to write some tests for sets. However, it's not a high priority because I believe this implementation is rather straightforward and likely to be correct. @ \subsection{Implementation} Since there are an infinite number of possible sets, and there is no limit on the number of items in a set, a set's address may consume a variable amount of type. The contents of the set are stored in the address's value field as a hash table. The hash table is used because it is setlike: an unordered collection with keys occurring at most once. The hash values will be NIL since we don't need them. Another thing we don't need is the mutability of LISP's hash tables. The hash tables are never modified, instead they are copied when constructing a new set based on an old one. <>= <> <> <> <> <> <> @ Define a new class to implement the type of sets, because they're different from other types. <>= (defclass set-type (max-type) nil ) @ Create the type of sets and assign its address to the global *set-type*. This is probably going to be the only instance of class set-type. <>= (defparameter *set-type* (make-type :class 'set-type :key 'SET)) (add-init-hook #'(lambda () (add-name *set-type* "sets"))) @ Define how set address values are compared for equality. Two set addresses should be equal iff the sets they describe are set-equal: <>= (defmethod addr-value-equal ((s set-type) a b) (set-equal a b) ) @ The format of a set address: The type is sets, and the value is a dotted pair of the type to be stored in the set and a hash table with hash-keys being the Max values in the set, and hash-values being NIL. <>= ; Create a set address with hash table h (defun make-set-helper (type h) (make-address-form (addr-value *set-type*) (cons (addr-value type) h)) ) @ Given a set address, get its hash table <>= (defun set-implementation (setaddr) (cdr (addr-value setaddr)) ) @ <>= (defun set-type (setaddr) (make-address-form (addr-value *types*) (car (addr-value setaddr))) ) @ \subsection{The empty set} Return the empty set. We'll depart from the practice seen in other modules of creating a global variable for common values in a type. This was originally because we did modify the hash tables in a set, but providing a function to return a common value is not a worse approach. The philosophy in Max is that every value is the result of calling some function. It makes little difference if we provide the value or the function because we'll be able to obtain one from the other. When we have the function, we can call it to get the value. When we have the value, we can use introspection to find out where it came from. It makes some sense to consider functions primitive, though. Bucky Fuller believed that verbs are primitive and nouns are just an illusion, since everything is continuously changing anyway. This is not a bad belief to have when dealing with a reflective system. Here's the LISP function: <>= (defun get-empty-set (&optional (type *objects*)) ; return the empty-set address (make-set-helper type (make-hash-table :test 'equalp)) ) @ Now we create a Max function as a builtin that returns the empty set. <>= (defparameter get-empty-set (make-function 'builtin-function *null-type* *set-type* #'(lambda (ignore-nil) (get-empty-set)) 'GET-EMPTY-SET )) (add-init-hook #'(lambda () (add-public-function get-empty-set) (add-name get-empty-set "emptyset") )) @ There are no optional arguments in Max, so we have to make a separate function for creating an empty set specifying the type of items that can be in the set. <>= (defparameter get-empty-typed-set (make-function 'builtin-function *types* *set-type* #'(lambda (type) (get-empty-set type)) 'GET-EMPTY-TYPED-SET )) (add-init-hook #'(lambda () (add-public-function get-empty-typed-set) (add-name get-empty-typed-set "emptyset-typed") )) @ \subsection{Operations on sets} We'll define the standard mathematical operations on sets. <>= <> <> <> <> <> <> <> <> <> <> <> <> @ \subsubsection{Addition} Given the address of a set and another address ``addr'', return a new set containing the contents of old set with ``addr'' added. If ``addr'' was already in the set, the old set is returned. The original set is unchanged. The type of the new set is the same as the old set. <>= (defun set-add (set_addr addr) ; Create a copy of the original hash table (let ((newhash (copy-hash (set-implementation set_addr)))) ; Add the addr to the newhash (setf (gethash (addr-value addr) newhash) nil) ; return a new set address (make-set-helper (set-type set_addr) newhash) ) ) @ Similarly to FUNCTION-QUOTIENT, the type of SET-ADD's second argument depends on the value of the first argument: specifically, the type of the value to add to the set must be the same as the type of addresses stored in the set. We use the same approach that we did with FUNCTION-QUOTIENT of splitting the work into two functions. One takes the first argument and returns another function taking the second argument. SET-ADD is used by ADD-PUBLIC-FUNCTION so it has to be declared before initialization time. <>= (defparameter set-add (make-function 'builtin-function *set-type* *functions* #'(lambda (set_addr) (make-function 'builtin-function (set-type set_addr) *set-type* #'(lambda (value_to_add) (set-add set_addr value_to_add) ) ) ) ) ) (add-init-hook #'(lambda () (add-public-function set-add) (add-name set-add "add") ) ) @ \subsubsection{Membership predicate} Given a set and an address, return a (LISP) boolean value based on whether the value is in the set. A predicate is just a function returning a boolean. <>= (defun set-member (set addr) (hash-exists (addr-value addr) (set-implementation set)) ) (add-init-hook #'(lambda () (let ((set-member (make-function 'builtin-function *set-type* *functions* #'(lambda (set_addr) (make-function 'builtin-function (set-type set_addr) *boolean-type* #'(lambda (value_to_check) (set-member set_addr value_to_check) ) ) ) ))) (add-public-function set-member) (add-name set-member "member") ) )) @ \subsubsection{Iteration} Given a Max set and a LISP function taking one address as its argument, call the function on each element in the set. Returns NIL like maphash does. <>= (defun mapset (func set) (maphash #'(lambda (item ignore-hashval) (funcall func (make-address-form (addr-value (set-type set)) item)) ) (set-implementation set) ) ) (add-init-hook #'(lambda () (let ((mapset (make-function 'builtin-function *set-type* *functions* #'(lambda (set_addr) (make-function 'builtin-function *functions* *set-type* #'(lambda (func_addr) (let ((result (get-empty-set (set-type set_addr)))) (mapset #'(lambda (item) (setq result (set-add result (function-apply func_addr item)))) set_addr ) result )) )) ))) (add-public-function mapset) (add-name mapset "map") ) )) @ \subsubsection{Set union} Take the union of two sets. For every element in 'a' or 'b', add them all to a new set, 'c' and return it. Results are undefined if the sets have different types. <>= (defun set-union (a b) (let ((c (get-empty-set))) (mapset #'(lambda (item) (setq c (set-add c item))) a) (mapset #'(lambda (item) (setq c (set-add c item))) b) ; return the new set c ) ) (add-init-hook #'(lambda () (let ((set-union (make-function 'builtin-function (type-product *set-type* *set-type*) *set-type* #'(lambda (twosets) (apply #'set-union (product-values twosets)) )) )) (add-public-function set-union) (add-name set-union "union") ) )) @ \subsubsection{Set intersection} Take the intersection of two sets. For every element in 'a' and 'b', add them all to a new set, 'c' and return it. Results are undefined if the sets have different types. <>= (defun set-intersect (a b) (let ((c (get-empty-set))) (mapset #'(lambda (item) (if (set-member b item) (setq c (set-add c item)) ) ) a ) ; return the new set c ) ) (add-init-hook #'(lambda () (let ((set-intersect (make-function 'builtin-function (type-product *set-type* *set-type*) *set-type* #'(lambda (twosets) (apply #'set-intersect (product-values twosets)) )) )) (add-public-function set-intersect) (add-name set-intersect "intersect") ) )) @ \subsubsection{Set difference} Take the difference between two sets. Return the set consisting of all elements of 'a' that are not in 'b'. Common LISP defines the functions UNION, INTERSECTION, and SET-DIFFERENCE, which operate on lists. Set complement is defined as taking the difference of ``the universe'' (some set treated in some context as ``everything''), and some other set (subtracting a set from the universe). Such an operation would be a specialization of the set difference operation, specifying the first argument as constant and leaving the second argument variable (free). Results are undefined if the sets have different types. <>= (defun set-diff (a b) (let ((c (get-empty-set))) (mapset #'(lambda (item) (if (not (set-member b item)) (setq c (set-add c item)) ) ) a ) ; return the new set c ) ) (add-init-hook #'(lambda () (let ((set-diff (make-function 'builtin-function (type-product *set-type* *set-type*) *set-type* #'(lambda (twosets) (apply #'set-diff (product-values twosets)) )) )) (add-public-function set-diff) (add-name set-diff "difference") ) )) @ Return the set consisting of the original set minus the removed item (both Max addresses). This can be treated as a special case of set difference where you subtract a set containing one item. It's not worth having its own section. <>= (defun set-remove (origset item) (let ((newset (copy-hash (set-implementation origset)))) (remhash item newset) ; return the new set (make-set-helper (set-type origset) newset) ) ) (add-init-hook #'(lambda () (let ((set-remove (make-function 'builtin-function *set-type* *functions* #'(lambda (set_addr) (make-function 'builtin-function (set-type set_addr) *set-type* #'(lambda (item_to_delete) (set-remove set_addr item_to_delete) )) )) )) (add-public-function set-remove) (add-name set-remove "remove") ) )) @ \subsubsection{Subset} Return true if A is a subset of B. A and B are both Max set addresses. Results are undefined if the sets have different types. <>= (defun set-subset (a b) (mapset #'(lambda (item) (if (not (set-member b item)) (return-from set-subset nil) ) ) a ) t ) (add-init-hook #'(lambda () (let ((set-subset (make-function 'builtin-function (type-product *set-type* *set-type*) *boolean-type* #'(lambda (twosets) (apply #'set-subset (product-values twosets)) )) )) (add-public-function set-subset) (add-name set-subset "subset") ) )) @ \subsubsection{Set equality} Two sets are equal if they are subsets of each other. Results are undefined if the sets have different types. <>= (defun set-equal (a b) (and (set-subset a b) (set-subset b a)) ) @ \subsubsection{Set quotient} Given a set, a function, and a value from the function's codomain, return the subset of the original set consisting of items that, when the function is applied, returns the specified value. This is similar to a function quotient except that in a function quotient, the starting set is always the function's domain (which is a type, not a set). <>= (defun set-quotient (set func result) (let ((newset (get-empty-set (set-type set)))) (mapset #'(lambda (item) (if (addr-equal (function-apply func item) result) (setq newset (set-add newset item)) ) ) set ) newset ) ) ; first 3-argument function (add-init-hook #'(lambda () (let ((set-quotient (make-function 'builtin-function (type-product *set-type* *functions* *objects*) *set-type* #'(lambda (sfr) (apply #'set-quotient (product-values sfr))) ))) (add-public-function set-quotient) (add-name set-quotient "quotient") ) )) @ \subsection{Counting} Return the LISP integer of the number of items in a set. <>= (defun set-count (setaddr) (hash-table-count (set-implementation setaddr)) ) @ \subsection{Relations} \label{relation} Relations are sets of tuples, and are used to ``associate'' some values. A tuple is a value in a product type. We only need binary relations, because anything not in a many-to-many relationship can just use a function. Indeed, a function is a special case of a relation with the constraint that every member of the domain must be present in the relation, but we won't formalize this special case yet. We can already create product types, and sets of product types, so this module doesn't define a new Relation type. Instead, it just provides some convenient operations for dealing with relations. These are taken straight from \emph{relational algebra}. These operations work fine on sets of things other than tuples, so they're part of the sets interface. <>= <>= (defun select (relation predicate) (set-quotient relation predicate TRUE) ) (add-init-hook #'(lambda () (let ((select (make-function 'builtin-function (type-product *set-type* *functions*) *set-type* #'(lambda (set_and_func) (apply #'select (product-values set_and_func)) ) ))) (add-public-function select) (add-name select "select") ) )) @ \item[PROJECT] ($\pi$) For each tuple, create a new tuple from a subset of its fields. Return the new set. This could be implemented by using UPDATE with a function that takes a tuple of the existing kind and returns a new tuple. You can currently specify such a function in LISP, using positions in the list of members of the product to choose fields to copy and construct a value in a new product type. However, that approach misses the point of a projection operation, which is supposed to be easy, and Max is lacking an easy way to specify a member field in a product type. So implementation of projection will occur after the introduction of a Max list type and the Max versions of the operations on product types. \item[RENAME] ($\rho$) This operation is not provided, since it is totally unnecessary. Relational algebra needs a renaming operation to prevent column name conflicts, since it refers to columns by name. However, the concept of naming has nothing to do with relations, so it shouldn't be dealt with in a relational algebra. Max refers to columns by direct reference, so there can't be any conflicts. We'll do the same thing with Lambda calculus when we come to it: Alpha reduction is totally unnecessary when variables are referred to by direct addressing instead of by names. In general, names clutter up a formal specification and create unnecessary renaming operations. Predicate calculus has this too, and it's really annoying! \end{itemize} The names for the following operations are from SQL (Simple Query Language, a subset of relational algebra used in many database systems), with behavior modified to be functional instead of procedural. To actually perform changes (relations are sets, and sets are immutable), you'll need a pointer. \begin{itemize} \item[UPDATE] Takes a function on a tuple and applies it to every tuple, returning the new relation. <>= (defun update (relation func) (let ((newset (get-empty-set (set-type relation)))) (mapset #'(lambda (item) (setq newset (set-add newset (function-apply func item))) ) relation ) newset ) ) (add-init-hook #'(lambda () (let ((update (make-function 'builtin-function (type-product *set-type* *functions*) *set-type* #'(lambda (set_and_func) (apply #'update (product-values set_and_func)) )))) (add-public-function update) (add-name update "update") ) )) @ \item[DELETE] Takes a relation and a predicate, applies the function to every tuple and returns a new relation containing just the tuples that resulted in FALSE. This is the complement of the SELECT operation above: <>= (defun relation-delete (relation predicate) (set-quotient relation predicate FALSE) ) (add-init-hook #'(lambda () (let ((delete (make-function 'builtin-function (type-product *set-type* *functions*) *set-type* #'(lambda (set_and_func) (apply #'delete (product-values set_and_func)) )))) (add-public-function delete) (add-name delete "delete") ) )) @ \end{itemize}