* producer-consumer asynchronous queues

P: producer -- one that wants some work to be performed
C: consumer -- one that will perform the work
T(P): time it takes the producer to give its work to someone else
T(C): time it takes the consumer to process the work given
Q: a queue of length MAX
R(P): rate of incoming producers (e.g., items per second)
R(C): rate of consumers draining the queue (e.g., items per second)

e.g., a process issuing an I/O.  I/O is slow, so there's no need to
wait for it: instead, let someone else do the work, and the process
can go do something else useful (e.g., CPU computation, another I/O).
Sometimes called "interleaving I/O and CPU".

Async makes sense when T(P) << T(C) (if T(C) is very short, better to
do the work yourself synchronously).

1. Design a queue Q.
2. Producer adds some work W to Q, then returns immediately.
3. Consumer picks up work from Q, processes it, then returns results
   in some way.

Need to limit the size of the work queue Q, so as not to consume too
many resources:
- if Q is "full", put new producers to wait
- once Q size drops below a threshold, wake up some producers
- if Q is empty, put consumers to wait
- once Q has at least one item, wake up some consumers

If Q isn't full, producers can add work quickly to Q, then go back and
do something else.  That "something else" can be CPU activity, or
producing MORE work items for Q.  Such producers are called "heavy
writers".  By putting those producers to sleep when Q is full, we hold
back the heavy writers, forcing them to stop making more work.  This
is called "throttling the heavy writers".  This in turn frees up the
system so that consumers can get their work done and "drain" the
queue.

How to inform producers of the status of the async work?
- signals, callback functions, message passing, etc.
- note: the producer may no longer be running (so no one to inform)

Often, APIs that submit jobs asynchronously take an argument or two
for filling in a callback function and/or a callback structure
(void*).
If you set them to NULL, it means you don't want to be informed.
Otherwise, the consumer, when done, will call your callback fxn with
data such as success/fail return codes.  If so, then the
producer-consumer queue also has to record these callback values, so
it knows who/what to inform.  See for example Linux's AIO (Async I/O)
calls for reads/writes.

The number of producers that can add work without blocking is bounded
by the size (or depth) of Q (max).  In modern systems, even the max
size of Q can grow/shrink, within some limits, to accommodate needs
vs. system resources -- a form of load balancing among queues.  The
number of consumers can also be tunable: too few and the system is
underutilized; too many and you waste memory and other resources when
there's not much work.  A rule of thumb: one consumer per CPU core.

Scenarios (let e(psilon) be a small number):
1. R(P) >> R(C): in steady state, Q is full and most producers are
   waiting.
2. R(P) << R(C): in steady state, Q is empty on avg and most
   consumers are sleeping.
3. R(P) = R(C) + e: R(P) is just slightly faster than R(C).  Q still
   ends up full in steady state; it just takes longer.
4. R(P) + e = R(C): R(P) is just slightly slower than R(C).  Q still
   ends up empty in steady state; it just takes longer.
5. R(P) == R(C): perfectly balanced system; Q size hovers around a
   fixed number b/t 1..MAX.

// producer-consumer queues
struct work_item W; // generic struct to hold description of any work to do
                    // (may need to encode not just the work requested,
                    //  but also any callbacks)
list_head Q;        // simple d-s: a list, FCFS
u_int max = 10, count = 0; // max queue len allowed, and current size
//spinlock L;       // can't use a spinlock if we may sleep (e.g., kmalloc)
mutex L;            // protects Q, max(?), and count
waitq WP;           // producers waiting for Q size to drop below max
waitq WC;           // consumers waiting for Q size to go above 0

// a producer (process/user) adding work to the queue
produce(W) {
    lock(L);
    // assume we need to alloc some struct to hold W -- kmalloc(...)
    while (count >= max)
        add_me_to_waitq(WP); // Q full: sleep until it drains; assume the
                             // waitq releases L while asleep, re-acquires
                             // it on wakeup, and we re-check the condition
    add2queue(W, Q);         // add work item W to task list Q
    count++;
    if (!empty(WC))
        wakeup_one_consumer(WC); // wakeup one consumer from waitq WC
    unlock(L);
}

// consumer: picks up work and processes it
consume(??)  // what arg?  maybe the Q to process (consumer kthreads
             // could operate on multiple queues)
{
    struct work_item *T = NULL; // temp to hold a work item

    lock(L);
#if 0
    if (count == 0) { // bad: consumer toggles b/t READY/RUNNING sched states
        unlock(L);
        return;
    }
#endif
    if (count > 0) {
        T = remove_from_queue(Q); // remove one item from work Q (FCFS?)
        count--;
    }
    if (count < max)
        wakeup_one_producer(WP); // wakeup one producer from waitq WP
    if (count == 0)
        add_me_to_waitq(WC); // put this consumer/kthread into waitq WC
    unlock(L);

    if (T)
        ret = process(T); // actually perform the work encoded in item T
                          // what if the work succeeded or failed?
    yield(); // return to scheduler (consumer kthread put to sleep or READY)
}