* Virtual File System (VFS), cont.

3. A "struct dentry" ("D"): a directory entry object.  Dentry objects are
   organized into a special-purpose fast cache called the "dentry cache" or
   "dcache".  This is sometimes called a Directory Name Lookup Cache (DNLC).

* connecting objects

Different VFS objects contain different information, but they're connected
together by pointers.  Notation: "X.n" means object X with reference count
(RC) n.

    D.0 -> I.1

Meaning:
- D's RC is 0
- I's RC is 1
- So we can discard any object with RC==0 (here, 'D').
- Each time you remove a pointer association, reduce the RC of the
  pointed-to object by 1.  Discarding D drops I's RC to 0, so now you can
  free up "I" too.

Rule:
1. when creating a link from object X to object Y, inc Y's RC by 1
2. when removing a link from object X to object Y, dec Y's RC by 1

* opening a file

User 1:
    fd1 = open("foo.txt", ...); // succeeds
In kernel:
    F1.0? -> D1.1 ("foo.txt") -> I1.1

User 2 (after user 1's open succeeded):
    fd2 = open("foo.txt", ...); // succeeds
Kernel:
    F2.0? -> D1.2 ("foo.txt") -> I1.1
    F1.0? -> D1.2 ("foo.txt") -> I1.1

User 1:
    close(fd1); // succeeds
Kernel: discard object F1, "break" its link to D1, D1->RC--
    F2.0? -> D1.1 ("foo.txt") -> I1.1

User 3: (assume clean state in kernel)
    $ ln foo.txt bar.txt # create a hard link from "bar.txt" to "foo.txt"
    fd1 = open("foo.txt", O_RDONLY); // ok
    fd2 = open("bar.txt", O_RDWR); // ok
Kernel:
    F1.? -> D1.1 ("foo.txt") -> I1.2
    F2.? -> D2.1 ("bar.txt") -> I1.2

User:
    close(fd2);
Kernel: discard F2, D2->RC-- goes to 0 - but I1 still has RC==2
    F1.? -> D1.1 ("foo.txt") -> I1.2
            D2.0 ("bar.txt") -> I1.2

Later on, the kernel cleans up the dcache and removes D2.0; I1->RC-- goes
from 2 to 1.

* RC of open files

(assume again clean kernel state)

User:
    fd1 = open("foo.txt", ...); // ok
Kernel:
    F1.1 -> D1.1 ("foo.txt") -> I1.1

F1's RC is 1, because there's a link to the struct file from the process's
"struct task".  Inside struct task you'll find a "struct file *files[]"
field (simplified here; the real kernel reaches the array through a
struct files_struct).  The length of the array is limited by an OS
parameter for the max allowed open FDs per process.
The position within the array corresponds to the user file descriptor
(integer).  If "fd1 = open(...)" returned 3, then in the kernel, for the
task struct *current:

    current->files[3] == pointer to "struct file F1"

What is current->files[0]?  // stdin, and so on for stdout/stderr

If the same process opens a file by the same name (or an alias name) twice,
it gets two different struct file's.  Each such F has RC=1.

Sometimes, struct file's in a single process have an RC>1.
- meaning two different slots in the task struct point to the same
  struct file.
- created using dup(2) and dup2(2).
- also after fork(2): the child gets a copy of the parent's descriptor
  table, so each struct file is then pointed to from both processes.
  (With clone(2) and CLONE_FILES, the descriptor table itself is shared
  instead.)

* Dentries and Inodes

    D ("name1") -> I     (file "name1" exists): a POSITIVE cache entry
    D ("name2") -> NULL  (file "name2" does not exist): a NEGATIVE cache
                         entry (or "negative dentry")

A negative dentry is created:
- after a file is deleted/unlinked (a positive dentry is demoted to a
  negative one)
- if any user tried to look up a name for an object that doesn't exist.

A negative dentry can be "promoted" to a positive dentry
- when an inode for that dentry name is created anew

A negative dentry can't have any files pointing to it
- neg dentry RC is 0 (can be cleaned up any time)

* look at struct inode in include/linux/fs.h

include/linux/fs.h is the main header file for the VFS.

Certain fields that are likely to be accessed together are stored close to
each other in the inode, to improve CPU cache-line efficiency.
e.g., time fields, or permission-related fields (uid/gid/mode)

    struct super_block *i_sb; // inode ptr to the superblock it belongs to
    struct address_space *i_mapping; // access to cached data pages of file
    // ptr to a table of function pointers that can operate on THIS inode struct.
    const struct inode_operations *i_op;

Many Linux structures are programmed in an object-oriented manner:
- obj has "methods" (fxn ptrs) to operate on the object
- private and public fields (who is allowed to access them)

Some inode fields are read-only: they can be accessed freely, but not
modified (they are initialized at struct inode creation time).  No lock
needed.
Some fields can be accessed, but only under a certain lock.
- different kinds of locks (fast spinlocks, blocking mutexes, etc.)
- comments in the struct define which lock(s) protect which fields, e.g.:

    spinlock_t i_lock; /* i_blocks, i_bytes, maybe i_size */

* inode RC

    atomic_t i_count; // inode reference counter

Uses atomic_t:
- atomic_t is an int wrapped in a struct and updated only through atomic
  ops (atomic CPU instructions on most architectures, or a lock where the
  hardware lacks them), so the counter needs no separate lock of its own
- includes ops to initialize/set atomic_t objects (ATOMIC_INIT, atomic_set)
- atomic_inc(atomic_t *v) // inc by 1
- atomic_dec(atomic_t *v) // dec by 1
- atomic_add/sub: to add/subtract a value N to/from a counter
- atomic_read: read the value of an atomic counter

    struct list_head i_lru; /* inode LRU list */
    struct list_head i_sb_list;
    struct list_head i_wb_list; /* backing dev writeback list */
    union {
        struct hlist_head i_dentry;
        struct rcu_head i_rcu;
    };
    atomic64_t i_version;
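
* sketch: the RC rule with atomic counters (user space)

To tie the inc/dec-on-link rule from earlier to the atomic counter ops, here
is a minimal user-space sketch.  It is NOT kernel code: it uses C11
<stdatomic.h> in place of the kernel's atomic_t, and "struct obj",
obj_link(), obj_unlink(), obj_rc() and demo_hard_link() are hypothetical
names made up for illustration.  Each object has at most one outgoing
pointer, which is enough to replay the F -> D -> I chains from the
hard-link example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for any refcounted VFS object (F, D, or I). */
struct obj {
    const char *name;    /* label, for illustration only */
    atomic_int  rc;      /* reference count, playing the role of i_count */
    struct obj *target;  /* the one object this object points to, or NULL */
};

/* Rule 1: creating a link X -> Y increments Y's RC by 1. */
static void obj_link(struct obj *x, struct obj *y)
{
    assert(x->target == NULL);
    atomic_fetch_add(&y->rc, 1);
    x->target = y;
}

/* Rule 2: removing the link X -> Y decrements Y's RC by 1.
 * Returns Y's new RC; 0 means Y is now eligible to be discarded. */
static int obj_unlink(struct obj *x)
{
    struct obj *y = x->target;
    assert(y != NULL);
    x->target = NULL;
    int old = atomic_fetch_sub(&y->rc, 1);  /* returns the value before the dec */
    assert(old > 0);
    return old - 1;
}

/* Read the current RC (analogous in spirit to atomic_read). */
static int obj_rc(struct obj *o)
{
    return atomic_load(&o->rc);
}

/* Replays the hard-link example; returns I1's RC after D2 is reclaimed. */
static int demo_hard_link(void)
{
    struct obj i1 = { "I1", 0, NULL };
    struct obj d1 = { "foo.txt", 0, NULL };
    struct obj d2 = { "bar.txt", 0, NULL };
    struct obj f1 = { "F1", 0, NULL };
    struct obj f2 = { "F2", 0, NULL };

    obj_link(&d1, &i1);             /* D1 -> I1, I1.1 */
    obj_link(&d2, &i1);             /* D2 -> I1, I1.2 */
    obj_link(&f1, &d1);             /* open("foo.txt"): D1.1 */
    obj_link(&f2, &d2);             /* open("bar.txt"): D2.1 */

    int d2_rc = obj_unlink(&f2);    /* close(fd2): D2.0, I1 still at 2 */
    assert(d2_rc == 0 && obj_rc(&i1) == 2);

    return obj_unlink(&d2);         /* dcache cleanup drops D2: I1 goes 2 -> 1 */
}
```

Calling demo_hard_link() from a main() walks through the same states as the
diagrams: D2 reaches RC 0 on close, and I1 only drops once D2 itself is
reclaimed.  The sketch leaves out freeing memory and any locking of the
"target" pointer, both of which real code would need.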