Course...
Advanced Control Flow
- Basic control flow consists of conditional, loop, routine mechanisms.
- Arbitrary transfer due to goto statement makes program hard to understand and maintain.
- Structured programming uses a small set of control structures to replace goto, making programs easier to understand.
- A multi-exit loop is a loop with one or more exit locations within body of loop.
- Code-typing style should allow insertion of new code without changing much of existing code.
- Short-circuit logical operators are constrol structures because e1 && e2 ≠ &&(e1, e2).
- Static multi-level exits arbitrary number of control structures where exit points are known at compile time. It is execution-time efficient when needed.
A: for( ;; ){ B: for( ;; ){ break A; } } - Break differs from goto that it cannot be used to create loop, and it cannot be used to jump into a control structure.
- It's good to label all exits, because control structures can be nested without affecting break's destination.
- Return statements in a routine can generate multi-exit loop and multi-level exit.
- Recurisve algorithm subdivides a problem into recurisve cases and base cases. ie. induction.
- When a routine calls itself direct or indirectly, forming a call cycle is called a recursion.
- Recursion has ability to remember data and state by using routine-call stack with local variables and return mechanism.
- Tail recursion is when the recursion is the last execution.
- Backward links are automatically stored in stack when using recursion.
- Functional programming is a style of programming that restricts variables to be write-once read-only. Recursion becomes heavily used.
- Routine pointers allows code to be passed in as an argument.
int f( int v, int (*p)( int ) ){ return p( v*2 ) + 2; }The above function f takes in a function pointer p, which accpets an int parameter and returns an int value.template<typename T> T f( T v, T (*p)( T ) ){ return p( v*2) + 2; }The above is a more generalized version, given T has operator* and operator+ overloaded. - Routine pointers are passed around as constant pointers for it does not make sense to copy code.
- Routine pointers are used implicitly for inheritance.
Exceptions
- Exceptions provides flexible ways to jump among routines, cannot be achieved by normal control structures.
- Exceptional event occurs with low frequency.
- Control-flows can be characterized by two properties: static/dynamic call, static/dynamic return.
- sequel = static call and return; routine = static call dynamic return; exceptions = dynamic call static return; routine pointer, resumption = dynamic call and return.
- Traditional Exception Handling Mechanism (EHM) returns code, sets status flag, or calls fix-up routine (either global or passed as function pointer).
- Traditional EHM's drawbacks: checking is optional, return code is hard to differentiate, local fixup routine adds more parameters, status flag can be overwritten, return codes compounds together during multi-level exists.
- EHM is demanded to be more complex when dealing with advanced objects, coroutines, and threads.
- Execution entity is the language unit in which an exception can be raised, usually has its own stack.
- Exception type is a type name representing an exceptional event. Exception is an instance of the exception type.
- Raise (throw) creates an exception.
- Source execution is the execution entity that rasies an exception. Faulting execution is the execution entity that changes control flow due to the raised exception.
- Local exception means source execution = faulting execution. Nonlocal exception means otherwise.
- Concurrent exception is a nonlocal exception where source and faulting executions are executing concurrently.
- Propagation directs control from a raise in the source execution to a handler in the faulting execution. Propagation mechanism is the rule used to locate a handler.
- Handler is a nested routine that catches a raised exception and handles it.
- Guarded block is try block.
- Termination means stacks are unwinded while exception is thrown down, causing control cannot return to the raise point. Resumption is when stack is not unwinded.
- EHM = Exception type + raising + propagation + handlers.
- In C++, exceptions can be truncated when it's thrown and caught, this is often fixed by throwing and catching by reference.
- Static propagation is done by sequel, advantage is that handler is statically known, disadvantage is only works for monolithic programs, cannot be in libraries.
- Dynamic propagation (dynamic multi-level exit) is done by termination or resumption, dynamic raise with static / dynamic return, respectively.
- Advantage is works for separately compiled programs, disadvantage is handler is not statically known.
- 3 forms of non-recoverable termination: nonlocal, terminate, retry.
- Resumption provides limited mechanism to generate new blocks on call stack, when handler returns, it performs dynamic return.
- Resumption handler is nested routine, requires lexical link (because it supports nested access).
- Derived exception-types is a mechanism for inheritance of exception types. This provides ability to handle exceptions at different degrees along hierarchy. It also provides ability to catch multiple exceptions.
- Most propagation mechanisms perform linear search of handlers, so more specific handlers should appear before generic ones.
- Reraise is a mechanism to terminate current handling and continuing propagation of caught exception. Useful, when current handler cannot fully handle the exception.
} catch( E ){ ... throw; // no exception type needed } - Exception list is part of routine prototype specifying which exception types this routine throws. Allows static detection of unhandled exceptions and runtime detection where the exceptions get converted into failure exception.
- 2 kinds of exception checks: by exception-type (static, ie Java) or by routines (dynamic, ie C++).
- Exception-type check precludes code reuse when method is generic and may throw all kinds of exceptions based on type.
- Exception list can preclude reuse for arguments of routine pointers when a routine has less restrictive prototype than the routine it contains.
- Determining exception list for a routine is impossible for concurrent exceptions since they can propagate at any time.
Coroutine
- Coroutine is a routine that can be suspended as control leaves it and carry on when control returns to it.
- States of coroutine: current location, execution state (variables and such), and execution status (inactive/active/terminated).
- There is no concurrency among coroutines.
- Semi-coroutine acts asymmetrically, ie. a calls b. Full-coroutine acts symmetrically, ie. a calls b and b directly or indirectly calls a.
- In uC++, coroutine is an object.
#include <uC++.h> #include <iostream> using namespace std; _Coroutine fibonacci { // : public uBaseCoroutine int fn; // used for communication void main(){ // distinguished member int fn1, fn2; fn = 0; fn1 = fn; suspend(); // return to last resume fn = 1; fn2 = fn1; fn1 = fn; suspend(); // return to last resume for( ;; ){ fn = fn1 + fn2; fn2 = fn1; fn1 = fn; suspend(); // return to last resume } } public: int next(){ resume(); // transfer to last suspend return fn; } }; void uMain::main() { fibonacci f1, f2; for( int i = 1; i <= 10; i+= 1 ){ cout << f1.next() << " " << f2.next() << endl; } } - With coroutine, flag variables to retain execution state becomes unnecessary.
- When control reaches end of main, it goes back to the coroutine that started main.
- 3 phases of full coroutine: starting cycle, executing cycle, stopping cycle.
- Mutually recursive references for full coroutine: fc x(y), y(x). y hasn't been defined when used by x.
uC++ EHM
- Declaring exception type in uC++:
_Event exception_type_name { ... }; - To raise exception in uC++:
_Throw [ exception-type ] [ _At uBaseCoroutine-id ]; _Resume [ exception-type ] [ _At uBaseCoroutine-id ];
- uC++ always throw the actual object, so that handlers catch the more specific exception:
_Event B {}; _Event D : public B {}; void f( B &t ){ _Throw t // in uC++ D is thrown, not B }When catching, should catch by reference to avoid truncation. - 2 types of handlers: termination(try/catch), resumption(try
...{} where h1, h2 are functors). - Functor syntax:
struct H1 { void operator()( R1 &1 ){...} }; - _Throw is only caught by termination handlers, _Resume is only caught by resumption handlers.
- Exception types may overlap, ie, resume and throw same type.
- Recursive resuming is when resumption handlers throwing exception of the same type. This is prevented by uC++ by marking visited handlers to be ineligibee.
- uBaseCoroutine::UnhandledException is thrown at last resumer regardless of exception type if coroutine does not handle exception locally. Upon this condition, coroutine exists to its last resumer instead of starter.
- Nonlocal delivery is disabled initially for coroutine. Exceptions are not propagated until the faulting entity is active.
Concurrency
- Thread is an independent sequential execution path through a program.
- Process is a program component that has its own thread.
- Task is similar to process, but tasks share common memory, sometimes referred to as light-weight process.
- Parallel execution is when 2 or more operations occur simultaneously, only possible on multi-processor machines.
- Concurrent execution is when multiple threads appear to be performed in parallel. This can be done on single processor machines using context switching.
- Advantages of concurrent programming: dividing a problem into multiple threads is an important technique; the problem may be easily mapped to a multi-threads solution; enhances execution time efficiency.
- Difficulties of concurrent programming: hard to understand, specify, and debug.
- 2 types of concurrency: those attempt to discover concurrency in an otherwise sequential program; those provide concurrency through explicit constructs.
- A good concurrent system must support variety of concurrent approaches while not requiring programmers to work at low level.
- Multiprocessing on a uniprocessor cpu is done by context-switching based on timer interrupt independent of program execution, may happen between any two machine instructions.
- A thread goes through following states: new -> ready <- blocked -> running -> halted.
- State transitions initiated in response to events: timer alarm( running -> ready ), completion of I/O operation( blocked -> ready ), exceeding some limit( running -> halted ), exceptions( running -> halted ).
- Concurrency requires the ability to specify the following 3 mechanisms: thread creation, thread synchronization, thread communication.
- A thread graph represents thread creations.
- Termination synchronization examples:
Version 1: Restricted to creating trees of threads_Task T1 { void main(){ i = 1; } }; _Task T2 { void main(){ p1(5); } }; void uMain::main(){ { // COBEGIN T1 t1; T2 t2; } // COEND }Version 2:_Task T1 { void main(){ i = 1; } }; _Task T2 { void main(){ p1(5); } }; void uMain::main(){ T1 *p1 = new T1; // start a T1 ... s1 ... T2 *f1p = new T2; // start a T2 ... s2 ... delete p1p; // wait for p1 ... s3 ... delete f1p; // wait for f1 ... s4 ... } - A thread terminates when it finishes normally, with an error, gets killed by parent or parent terminates. Latter 2 are not supported in uC++.
- Synchronization occurs when one thread waits until another thread has reached a certain point in its code.
- Threads can transfer information via address (variable parameters) if memory is shared, or by value if memory is separated.
- Exceptions can be handled by coroutines, single task, or multiple tasks concurrently.
- Concurrent exception example:
_Task searcher { searcher &partner; void main() { try { ... if( key == ... ){ _Throw stopEvent() _At partner; } catch( stopEvent ) { ... } . . . };The above example has 2 tasks looking for a key, as soon as one finds it, it informs its partner to stop looking. - In uC++, nonlocal delivery is initially disabled for a task. _Enable must be used to allow delivery.
- Potential problem may arise when multiple threads attempt to operate on a same object. This is not a problem is operations are atomic (not divisible in cpu).
- A group of instructions on an object that must be performed atomically is called a critical section. Preventing simultaneous execution of a critical section is called mutual exclusion.
- In general, it is best to avoid using shared static variables in concurrency.
- Mutual exclusion rules:
- only one thread can be in critical section at a time
- threads run at arbitrary speed and order
- if a thread is not in critical section, it may not prevent others from entering
- no indefinite postponement
- no starvation.
- Self-testing critical section provides a mechanism to alarm the system when it detects multiple threads have operated on it simultaneously.
- Various software solutions to solve mutual exlusion:
- Lock
- Alternation
- Declare intent
- Retract intent
- Prioritize entry
- Dekker
- Peterson
- N-Thread prioritized entry
- N-Thread bakery (Tickets)
- Tournament
- Arbiter
Lock Abstraction
- Locks are constructed for synchronization or mutual exclusion or both. The two general categories are spinning and blocking.
- Spin lock is implemented using busy waiting. CPU is wasting time constantly checking for event. Spin lock may break rule 5: no bound on service.
class uSpinLock { public: uSpinLock(); void acquire(); bool tryacquire(); void release(); }; - uSpinLock is non-preemptive, no other task may execute once the lock is acquired. non-preemptive lock can only be used for mutual exclusion.
- Blocking locks reduce busy waiting by making task releasing the lock do additional work, called cooperoation
- Mutex lock is used only for mutual exclusion. Single acquisition task that acquired lock cannot acquire it again. Multiple acquisition owner can acquire it multiple times, called an owner lock.
- Implementation of a mutex lock requires a blocking task and a releasing task.
- Stream locks: osacquire( cout ) / isacquire( cin ).
- Synchronization lock is used soley for synchronization, often called a condition lock.
- Synchronization lock involves external locking and internal locking.
- External locking:
class SyncLock { Task *list; public: SyncLock() : list( NULL ){} void acquire( MutexLock &mutexlock ){ // add self to lock's blocked list mutexlock.release(); // yield } void release(){ if( list != NULL ){ // remove task from blocked list and make ready } } }; - Internal locking:
class SyncLock { spinlock mlock; Task *list; public: void acquire( MutexLock &userlock ){ mlock.acquire(); // add self to task list userlock.release(); // release before blocking mlock.release(); //yield } void release(){ mlock.acquire(); if( list != NULL ){ // remove task from blocked list and make ready } mlock.release(); } }; - A barrier is a synchronization mechanism used to repeatedly coordinate a group of tasks performing concurrent operation surrounded by a series of sequential operations. Most barriers use internal blocking.
- A semaphore provides synchronization and mutual exclusion.
- A binary semaphore is when a semaphore only has 2 values (0,1).
- A counting semaphore is a multi-valued semaphore. uC++ only provides a counting semaphore as it subsumes a binary one.
class uSemaphore{ public: uSemaphore( unsigned int count = 1 ); void P(); bool TryP(); void V( unsigned int times = 1 ); int counter() const; bool empty() const; }; - Semaphore usually starts at 0 when building synchronization and starts at 1 or N when used o build mutual exclusion.
- A split binary semaphore is a collection of semaphores whose value sum up less or equal to 1. These are used with baton passing.
- Baton passing rules involve:
- there is exactly one baton.
- nobody moves in the entry/exit code unless they have it.
- once the baton is released, cannot read/write variables in entry/exit.
Concurrent Errors
- Race condition
- Live-lock
- Dead-lock
- Synchronization deadlock
- Mutual exclusion deadlock
- 5 conditions for a set of processes to get into deadlock:
- There exists a shared resource requiring mutual exclusion
- A process holds a resource while waiting for access to a resource held by another process
- Once a process has gained access to a resource, the O/S cannot get it back
- There exists a circular wait of processes on resources
- These conditions must occur simultaneously
- Deadlock prevention eliminate one or more of the conditions required for a deadlock.
- Banker's Algorithm
- Allocation Graphs
- 3 ways of deadlock detection and recovery:
- Let it happen and recover
- Build and check for cycles in allocation graph
- Preemption of one or more processes in cycle
High Level Communication Techniques
- Monitor is an abstract data type that comebines shared data with serialization of its modification.
_Monitor name{ // shared data // members that see and modify the data }; - A mutex member is a member that does not begin execution if there is another active mutex member.
- Exceptions raised in mutex member always release the mutex lock.
- To achieve synchronization with monitor, external scheduling or internal scheduling is used. External scheduling scheules tasks outside the monitor and is accomplished with the accept statement. Internal scheduling schedules tasks inside the monitor and is accomplished using condition variables with signal and wait.
- An acceptor blocks until a call to the specified mutex member occurs.
if( count == 20 ) _Accept( remove ); // blocks until remove() is called
- uC++'s implementation for conditional lock used for internal scheduling is uCondition.
uCondition x; x.wait(); // waits until x.signal() is called x.signal(); // unblocks the first waiter in the FIFO queue
The signaller does not block, so the signalled task must continue waiting until the signaller exist or waits. All signals before any waits are wasted. - Explicit schedulings are internal scheduling (signal) and external scheduling (accept). Implicit scheduling involves monitor selects (wait/exit).
- Internal scheduling must be used when scheduling depends on member parameter values and a task might be further blocked while in the monitor.
- Conditional wait always blocks, where P only blocks if semaphore = 0.
- If signal occurs before a wait, it is lost, while a V before a P affects the P.
- Multiple Vs may start multiple tasks simultaneously, while multiple signals only start one task at a time because each task must exit serially through the monitor.
- Monitor type depends on the order of execution determined by implicit scheduling.
- A immediate-return signal is a monitor type that requires that the signaller exit immediately from monitor.
- 10 useful monitor types:
- Priority Blocking C < S < W
- No Priority Blocking C = S < W
- Priority Nonblocking C < W < S (uC++)
- No Priority Nonblocking C = W < S (Java/C#)
- Priority Quasi C < W = S
- No Priority Quasi C = W = S
- Priority Return C < W
- No Priority Return C = W
- Priority Implicit Signal C < W
- No Priority Implicit Signal C = W
- A coroutine monitor is called _Cormonitor
Direct Communication
- Communication among tasks with a monitor is indirect.
- A tasl is like a monitor as it provides mutual exclusion and can perform synchronization.
- Task's public members are implicitly mutexed.
- Task is also like a coroutine because it has distinguished member (main).
- If conditions are true but the calls are not made, acceptor blocks.
- Terminating else clause allows a conditional attempt to accept a call without blocking acceptor.
- When an acception is raised in a task member, it propagates to the caller, and a special exception called uSerial::uRendezvousFailure is thrown at this.
- Code in _Accept clause is executed after the mutex member being called, with desctructor as an exception.
- When a task's thread is halted, it behaves like a monitor.
- To increase concurrency, public mutex members should terminate as quickly as possible to allow callers to continue on. All admin jobs should be performed in the task's main method. Also, a larger internal buffer could be used to allow clients to get in and out of the server faster, however, this should only be used if the production and consumption speed is approximately equal, otherwise the buffer will be always full or empty.
- Administrator is used to manage a complex interaction or complex work. Administrators create workers to perform actual work, itself does not do any real work. Administrator makes no call ot another task because such calls may block the administrator.
- To achieve asyncrhony when a method has return value, split the call into argument transmission call and result retriving call.
- Tickets can be given out to callers to allow them to come back later to obtain results.
- Call-Back routine allows server to call clients's method when result is ready, this method must not block.
- Future provides the same asynchrony. It allows server to secretly store result in clients and hope that when clients retrieve the result after the result is stored. The caller is made to think that the result is immediately there.
Other Approaches
- Ada 95 is not as general.
- when clause is only external scheduling because it can only be used at start of entry routine.
- select is external scheduling and only appear in a task.
- has no direct internal-scheduling mechanism.
- requeue is used to re-block mutex member's entry queue. It suffers from accumulated temporary result, meaning intermediate results either need to be repackaged and taken along or recomputed at re-entry.
- Modula-3/Java/C#
- threads start in member run.
- Termination synchronization is achieved by called join.
- Synchronized is equivalent to _Mutex.
- Threads and lock libraries are unsafe or inefficient because the compiler thinks the program is sequential.
- Message passing is an alternative mechanism to parameter passing. All information is grouped into a single data area and pushed through a wire. This makes a message independent of the context of the message sender, and pointers need to be dereferenced.
- Message format could be variable sized, fixed sized, variable & fixed size or typed message.
Distributed Communication
- Remote Procedure Call(RPC) is an important mechanism in distributed computing.
- Difficult problems involve
- How does the code from called routine get on another computer?
- How is the code's address determined for the call once it is available on the other machine?
- How are the arguments made available on the other machine?
- How are pointer arguments dealt with?
- How are different data representations dealt with?
- How are different languages dealt with?
- How are arguments and parameters type checked?
- All schemes must provide a mechanism to
- make the call
- transimit the arguments and return a result
- RPC without shared memory
- local call from client to client stub
- client stub collects arguments into a message (marshalling) and passes it to the local transport
- client transport enity sends message to server transport entity
- server transport entity passes message to server stub
- server stub unmarshalls arguments and calls local procedure
- server performs its processing and returns to server stub
- marshall return values into message, pass back to client in similar fashion
- Defective communications are lost message, damaged message, out-of-order message, duplicated message.