Class 6: Set operations, Function definitions


Reading
Sections 1.4 of Theory of Computing, A Gentle Introduction ... again. You really need to know this stuff!

Homework
Printout the homework and answer the questions on that paper.

Set Operations
There are several basic operations on sets that you need to know, all of which were covered in detail in discrete! Here's a summary:

Complement machine algorithm: the formal version
Now we can give a precise algorithm that takes as input a machine M = (Q,Σ,δ,s,W) and returns as output a machine M' such that L(M') = L(M), i.e. the language accepted by M' is the complement of the language accepted by M. Here it is:
Algorithm: Complement Algorithm
Input: machine M = (Q,Σ,δ,s,W)
Output: machine M' = (Q,Σ,δ,s,Q - W).   (Note: We claim that L(M') = L(M))
From our informal discussions on this algorithm, it should be clear that it does what it's supposed to, i.e. that it produces a machine M' accepting the complement of the language accepted by M. A strictly formal proof of the algorithm's correctness relies on the concept of configuration, which we'll need to even give a formal definition of what it means for a machine to accept a string! On the other hand, hopefully this algorithm is so obviously correct that no further proof is needed.

Set construction notation
When we want to describe fancy sets, often union, intersection Cartesian product, etc just aren't enough. There is a special notation for constructing sets that we often employ in these situations. We'll start with an example. Suppose we wanted to construct the set of all ordered pairs of natural numbers (recall, the natural numbers N = {0,1,2,3,4,...}) whose sum is 1000. Here's how we'd do it: {(a,b) ∈ NxN | a + b = 1000}. This expression reads:
The set of all (a,b), which are pairs of natural numbers, such that a + b = 1000.
Essentially, the form of these expressions is {A | B} where A defines the set from which you're choosing objects, the | means "such that", and the B gives a criterion for selecting which objects will actually end up in your set. People are usually pretty free-wheeling about what goes in the A and B, as long as it's clear what set's getting constructed. The big thing to remember, is that in A you're also giving a name or pattern representing a random object which you'll refer to in B --- like I did with (a,b) in the above example. If I'd chosen a different "pattern", like x ∈ NxN, it would've been difficult to express the selection criterion. I would've ended up with something like:
{x ∈ NxN | first component of x + second component of x = 1000} ← ugly way to get the job done!
which is pretty ugly!

Languages (more formally)
Languages are just sets of strings, so we often use set constructor notation to describe languages. This often leaves us in the situation of wanting to refer to the language/set of all strings over a given alphabet Σ. The notation for this is Σ*.

Problems:

Function definitions
We know how to specify function prototypes in the mathematical language of sets, but we haven't dealt with function definitions yet. They're straightforward, actually. under the prototype, you just give a rule for getting the function's value from its input. For example, suppose we wanted to define the "successor function" s, which simply takes an integer and returns the next integers. (Recall that Z stands for the set of all integers.)
s:Z → Z
s(x) = x+1
The prototype you should already know. The definition says that for any x from the function's domain, the function produces x+1.

Often we end up breaking things up into cases when we define functions. For example, we might give the following definition for the "signed successor function" ss.

ss:Z →Z
ss(x) =
{ x+1 if x > 0
x-1 if x < 0
x if x = 0
As I said before, you get a fair amount of flexibility in how you define these things. Here's an example defining a function f that takes a string and concatenates it with its reverse, with the hitch that if the string has odd length, the last character isn't doubled up in the result string. For example: f(ab) = abba, but f(abb) = abbba. Let Σ = {a,b}. We'll define f in a few different ways:
f:Σ* →Σ*
f(λ) = λ
f(wa) =
w∈Σ*, a ∈ Σ
{ waaw if |wa| even
waw if |wa| odd
f:Σ* →Σ*
f(a1a2a3...ak) =
{ a1a2a3...akakak-1ak-2...a1 if k even
a1a2a3...akak-1ak-2...a1 if k odd

Intersection machine algorithm: more formally
This algorithm takes as input machines M1 = (Q1,Σ,δ1,s1,W1) and M2 = (Q2,Σ,δ2,s2,W2) and returns as output a machine M such that L(M) = L(M1) ∩ L(M2), i.e. the language accepted by M is the intersection of the languages accepted by M1 and M2. (Notice that the alphabets of the two machines must be the same!)
Algorithm: the "Intersection Algorithm"
Input: M1 = (Q1,Σ,δ1,s1,W1) and M2 = (Q2,Σ,δ2,s2,W2)
Output: M = (Q,Σ,δ,s,W), where    (Note: We claim that L(M) = L(M1) ∩ L(M2))
  • Q = Q1 x Q2
  • δ:Q x Σ → Q
    &delta((q,p),a) = (δ1(q,a),δ2(p,a))
  • s = (s1,s2)
  • W = W1 x W2
Once again, a formal proof that this algorithm really constructs a machine accepting L(M1) ∩ L(M2) will have to wait. We're familiar enough with the idea of this algorithm that we probably don't need any more convincing.

Notes on writing down algorithms
When you write down an algorithm you must:

  1. Clearly specify what the input is.
  2. Clearly specify what the output is, i.e. what kind of object does the algorithm return, what task does the algorithm accomplish.
  3. Write down the body of the algorithm clearly and unambiguously enough that another person could follow it, apply it to example inputs and get the answer you intend ... maybe even implement it as a program.

All the variables describing the input are like parameters to a function, i.e. they are values you can use to construct your output result --- you do not have to give them values in your algorithm. All variables appearing in your output description that are not part of the input are variables you must give values to. For example, in the intersection algorithm the variable Σ appears in the output expression, but nowhere in the "output" section do I give Σ a value. No problem, because Σ is given to us as input to the algorithm. The variable W, on the other hand, is not given to us as input and it appears in the output expression. Therefore, we must define W, i.e. give it a value, which we do on the last line with "W = W1 x W2".