#!/bin/sh
# example test shell script
echo about to run test 1
./testprog -flags
status=$?
if test "$status" = 0 ; then
    echo "program exited with 0, success"
    # do something else
else
    # note: save $? first, b/c echo itself resets $?
    echo "program exited with $status, failed"
    # do something else, maybe exit
fi

# prepare a test input file
echo hello world > testfile1.txt
# now test with this input file

# cleanup
/bin/rm -f testfile1.txt

# run other test scripts
sh test1.sh
sh test2.sh

* assorted hw1 issues, cont.

1. testing
   - write a series of small shell scripts, each testing one thing at a time
   - it may help to have 2 driver C programs: one that checks all params
     passed, and one that doesn't (the latter can be used to test your
     kernel syscall)

2. partial output files
   Inside the read-copy-write loop, serious errors can occur: can't read
   the file (EIO), can't write the file (EIO, ENOSPC, EDQUOT, etc.).
   Clearly you need to return an error, and also:
   - clean up kernel state (close files, kfree, etc.)
   - clean up on-disk persistent state:
     a. delete any partially written o/p file
     b. hint: look at sys_unlink(); follow other code such as sys_read,
        sys_write, and sys_open
     c. note: try to delete the partial o/p file AFTER you filp_close() it

"In the hw1 syscall, why not call sys_foo() directly?"
- in theory you can, but...
- syscalls assume __user buffers
- it's the wrong abstraction layer

"I didn't verify __user buffers and it worked... why?"
- on systems with less physical RAM, Linux optimizes the placement of
  user vs. physical buffers
- e.g., a 32-bit OS (max 4GB virtual address space) with only 1GB of
  physical RAM
- you can limit every process to no more than, say, 2GB of RAM
- when the kernel creates virtual addresses, it could create them anywhere
  in the 32-bit space, but instead it creates them in the upper 2GB
  (from 2GB to 4GB)
- physical addresses start at 0 and go up to 1GB
- so given any address, you can tell if it's physical (<1GB) or
  virtual (>2GB)

**********************************************************************
* Virtual File System (VFS)

Early on, computers had only one storage device (a floppy or an early
version of a hard disk).
There were no POSIX standards; vendors created any kernel API they wanted.
API incompatibility meant difficult, non-portable code.

At some point, computers started having 2+ storage devices: a hard disk,
a floppy, and even a cdrom. How do you read from a given device?!

The first iteration created APIs per device: floppy_read, floppy_open,
cdrom_read, cdrom_write, disk_open, disk_write, ...
- so you had to know which device you were accessing, and code it into
  the program
- incompatible code, hard to port, dependent on h/w configuration
- inside the OS, separate "file system" code accessed each kind of media

Sun Microsystems created the first abstraction of APIs to access data
files:
- realized that the "file systems" for disk/cdrom/floppy had a lot of
  common code
- noted that the APIs themselves were nearly identical save the API name

They created an abstraction called a "virtual" file system, which offers
two things:
1. a uniform API for system calls and file system developers
   - the VFS "knows" which file system a file lives on, and redirects
     syscalls to the specific f/s code
2. a "library" of helper utilities for f/s developers (a la "libc")

Goal:
- f/s developers get a well-defined API through which system calls will
  call into them
- a f/s developer only has to worry about the specifics of their media
  (hdd, flash, network, cdrom, etc.) and their f/s format (data,
  meta-data, namespace)

* Key "objects" in the Linux VFS

1. A "struct inode" ("I"): represents info about a file on media
   (e.g., disk).
   Meta-data:
   - file size
   - last modification time
   - owner, group
   - permissions
   - ... see the stat(2) syscall
   Pointers to actual data:
   (a) pointers to data objects on persistent media: the locations where
       the data lives on the media (e.g., Logical Block Addresses, or LBAs)
   (b) pointers to "struct page" objects where some of the data is cached
       in the page cache

2. A "struct super_block" ("SB"): like the inode structure, but contains
   info about a WHOLE (mounted) file system.
   - file system size
   - number of files/inodes used and free
   - block size
   - number of used/free blocks
   - ... see df(1), which uses the statfs(2) syscall
     (du(1) instead uses stat(2) and inode sizes)
   The SB also has pointers to on-media locations where different types
   of info are placed:
   - where to place file data blocks/sectors
   - where to place inode objects
   - other meta-data information (global stats about the whole f/s):
     the "super block" file system info

3. A "struct dentry" ("D"): a directory entry object.
   - records the name of an object; part of the "namespace" of the f/s
     (a file can have multiple names -- link(2))
   - provides fast caching and lookup of file names (and their inodes)
     inside the OS
   - b/c many syscalls pass pathnames, and each component of a pathname
     translates to one inode, you need a fast in-memory cache to look up
     those names and avoid I/O

4. A "struct file" ("F"): an instance of an opened file/object.
   - corresponds to a user-level "file descriptor" (fd)
   Contains:
   - a read/write offset into the file: updated each time you read(2),
     write(2), or lseek(2) the file
   - the open-file mode: O_RDONLY, O_RDWR, etc.

* connecting objects

Different VFS objects contain different information, but they're
connected together: a dentry points to the inode that exists for that
name, designated D -> I.

The OS manages many objects in various caches: it has to decide what to
keep, what to throw away, and when. Suppose the OS needs to clean up
caches to make room:
1. discard D alone: OK, b/c it merely points to I
2. discard I alone: not OK, b/c it'd leave a "dangling pointer" -- an
   invalid D pointer to I
3. discard both D and I: OK

How to manage all the relationships b/t objects: a reference counter (RC)
-- a number that every object keeps to count "how many others point to
me?"
- inside every struct there's an "integer" that acts as the RC
- example, D -> I:
  - D's RC is 0
  - I's RC is 1
- we can discard any object with RC==0
- each time you remove a pointer association, decrement the RC of the
  pointed-to object by 1
So discard D first (its RC is 0); removing the D -> I pointer drops I's RC from 1 to 0, and now you can free up "I" as well, b/c it too has RC==0.