* inode ops, cont.

Use "->" to refer to a generic operation.  For example, ->unlink refers to
the inode unlink op.  Individual f/s will implement their own ->unlink fxn,
called ext4_unlink, ntfs_unlink, nfs_unlink, etc.

->permission op: VFS calls f/s asking if "current" has access to inode, as
listed in 2nd arg (bitmap -- read permission, write permission, etc.).

* ->lookup

struct dentry * (*lookup) (struct inode *, struct dentry *, unsigned int);

Lookup takes:
1. locked inode of dir in which we're looking for an object
2. dentry of name of object we're looking for
3. int flags to control the lookup process

Lookup returns:
1. another dentry: if found, return POSITIVE dentry (can be same dentry as
   passed, but made positive)
2. if not found: returns negative dentry
3. else, returns an error encoded with ERR_PTR() (the caller decodes it
   with PTR_ERR()).

VFS:
1. when user tries to access a file, stat("foo.txt")
2. sys_stat(...) [syscall layer]
3. [vfs layer]: perform a pathname-lookup (aka, "lookup_pn", or "namei")
   procedure.
4. check dcache to see if you find cached object with name "foo.txt"
   - if dentry object found, and is negative: then return -ENOENT
   - if dentry object found, and is POSITIVE: then we can operate on it
     (discuss next)
5. if dentry object not found: call f/s ->lookup method
   - if ->lookup returned positive dentry: cache it, then operate on it.
   - if ->lookup returned negative dentry: cache it, then return -ENOENT
   - if ->lookup returned an ERR_PTR-encoded error: return -errno to caller
     (nothing cached)

* full path lookup

User performs unlink("/home/jdoe/test/foo.txt")
??? ...
series of ->lookups, each one finding the next pathname component in its
parent directory, starting from the root "/".

So where is the dentry for "/"?
A: root dentry+inode get created at boot time and when you mount a new file
system; root I+D get created "manually" by a file system when it is first
initialized.
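Sketch of the ->lookup return convention and the dcache cases from the ->lookup section above. This is a userspace toy, not kernel code: the toy_* names are made up for illustration, and ERR_PTR/PTR_ERR/IS_ERR are simplified stand-ins modeled on the kernel helpers of the same names (ERR_PTR packs a negative errno into a pointer, PTR_ERR unpacks it).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's pointer-encoded errors.
 * The top MAX_ERRNO addresses are reserved to carry a negative errno. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)      { return (void *)error; }
static inline long  PTR_ERR(const void *ptr) { return (long)ptr; }
static inline int   IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical, stripped-down inode and dentry. */
struct toy_inode { unsigned long ino; };

struct toy_dentry {
    const char       *name;
    struct toy_inode *inode;   /* NULL means a negative dentry */
};

/* What the VFS does with a dentry obtained from the dcache or from ->lookup:
 *   ERR_PTR-encoded error -> return that -errno (nothing to cache)
 *   negative dentry       -> return -ENOENT
 *   positive dentry       -> return 0, caller may operate on it */
int toy_lookup_result(const struct toy_dentry *d)
{
    if (IS_ERR(d))
        return (int)PTR_ERR(d);
    assert(d != NULL);
    if (d->inode == NULL)
        return -ENOENT;
    return 0;
}
```

Usage: a f/s ->lookup that hits an I/O error would return ERR_PTR(-EIO), and toy_lookup_result() hands -EIO back to the caller; a dentry with inode == NULL yields -ENOENT.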
perform ->lookup(I, D)
- pass inode for dentry named "jdoe"
- pass dentry name "test"

call f/s ->unlink:
- pass inode for the dentry whose name is "test"
- pass positive dentry name "foo.txt"

Overall process is a series of:
- get global "/" dentry+inode
- call ->permission on newly found dir...
- call ->lookup "home" in "/": get positive dentry
- call ->permission on newly found dir...
- call ->lookup "jdoe" in "home" inode
- call ->permission on newly found dir...
- call ->lookup "test" in "jdoe" inode
- call ->permission on "test" dir for read+execute (traverse into dir)
- call ->lookup "foo.txt" in "test" inode
- call ->permission on "test" inode, looking for "write" permission
  (unlink modifies the directory, not the file)
- call ->unlink on "test" inode, and "foo.txt" dentry

VFS has to un/lock objects before calling the f/s.
VFS checks permission: if permission is denied, return EPERM/EACCES.
If any ->lookup failed, return error or ENOENT.
VFS also checks for cached objects: calls f/s only if object is NOT cached.

For an unlink("/home/jdoe/test/foo.txt") syscall, f/s will see:
1. any lookup for objects that are NOT cached.
2. the actual, final ->unlink op
3. a series of ->permission checks IF the f/s implements ->permission

Linux VFS has several data structures (d-s), each with an ops vector; total
no. of ops is 40+.

Linux doesn't require a f/s to implement all ops:
1. a f/s that doesn't implement a given functionality can set the op to
   NULL.
   - if a user executes a symlink() syscall on FAT, VFS will find that the
     op is NULL, and return an error (for symlink, vfs_symlink() returns
     -EPERM; some other missing ops yield -EOPNOTSUPP)
2. a f/s that wants the VFS to offer a default behavior can also leave the
   op NULL.
   - if f/s leaves ->permission field NULL, VFS will invoke a default POSIX
     permissions model (rwx by user, group, and other).
   - alternative: use a predefined function like "generic_permission" as
     your ->permission.

Note: other OSs force you to implement each VFS op, even if you don't
support it, and you have to return an -ENOTSUPP.
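Sketch of the "NULL op" rule above: the VFS tests the function pointer before calling into the f/s, and either falls back to a default or returns an error. This is a userspace toy under stated assumptions: all toy_* names and the cred struct are made up, the default check has no root/capability override and no ACLs, and the error for a missing ->symlink is -EPERM, which to my knowledge is what the kernel's vfs_symlink() returns (the kernel-internal ENOTSUPP is not available in userspace errno.h).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAY_EXEC  0x1
#define MAY_WRITE 0x2
#define MAY_READ  0x4

/* Hypothetical stand-in for the credentials of "current". */
struct toy_cred { unsigned uid, gid; };

struct toy_inode;

/* Ops vector: any entry may be left NULL by the f/s. */
struct toy_inode_ops {
    int (*permission)(struct toy_inode *, const struct toy_cred *, int mask);
    int (*symlink)(struct toy_inode *dir, const char *name, const char *target);
};

struct toy_inode {
    unsigned mode;                    /* low 9 bits: rwxrwxrwx */
    unsigned uid, gid;
    const struct toy_inode_ops *ops;
};

/* Default POSIX-style check: pick the owner, group, or other rwx triplet,
 * then require every bit in mask to be set in it. */
int toy_generic_permission(const struct toy_inode *inode,
                           const struct toy_cred *cred, int mask)
{
    unsigned bits = inode->mode;

    assert((mask & ~7) == 0);
    if (cred->uid == inode->uid)
        bits >>= 6;
    else if (cred->gid == inode->gid)
        bits >>= 3;
    return ((bits & 7u & (unsigned)mask) == (unsigned)mask) ? 0 : -EACCES;
}

/* VFS side: use the f/s ->permission if present, else the default. */
int toy_inode_permission(struct toy_inode *inode,
                         const struct toy_cred *cred, int mask)
{
    if (inode->ops && inode->ops->permission)
        return inode->ops->permission(inode, cred, mask);
    return toy_generic_permission(inode, cred, mask);
}

/* VFS side: no default makes sense for symlink, so NULL means an error. */
int toy_vfs_symlink(struct toy_inode *dir, const char *name, const char *target)
{
    if (!dir->ops || !dir->ops->symlink)
        return -EPERM;   /* assumption: mirrors vfs_symlink() */
    return dir->ops->symlink(dir, name, target);
}
```

The design point is that the f/s never sees the call at all when the pointer is NULL; the decision between "default behavior" and "error" is made per op inside the VFS.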
If a f/s wants a stronger permission model, it can implement ->get_acl,
->set_acl, and may need to implement xattr (Extended Attribute) methods
(b/c ACLs are built on top of EAs).

ACL: Access Control List.  The ACL model permits multiple owners and groups
per file, including intersections and subtractions:
- e.g., file can be accessed if user is in groups G1 and G2
- or, file can be accessed by anyone who's in group G3, but not G4.

Extended Attributes:
- any extra name/value pairs you can attach to an inode
- extends the default meta-data stat(2) attrs.

Some ops depend on each other.  E.g., for symlink support, you need
->symlink and ->readlink.  Note: VFS also has to check after each ->lookup
if the inode is a symlink, and if so, call ->readlink, and traverse the new
pathname the symlink points to.

->setattr(): change the inode attributes (meta-data), like owner, group,
permissions, etc.  Corresponds to system calls: chown(2) (also used by
chgrp(1)), chmod(2), utimes(2).  Takes a struct iattr, whose ia_valid bitmap
tells which attributes are being changed.

->getattr(): retrieve one or more attributes of an inode, useful for stat(2).

* struct dentry (include/linux/dcache.h)

similar stuff as inode: has ops fields, locks protecting other fields,
void*, etc.

dentry->d_inode: ptr to inode (NULL means the dentry is negative)
dentry->d_parent: ptr to parent dentry, to speed up lookups of "..".
    $ cd ../../foo/bar
An optimization, so ".." doesn't even need a lookup in the dcache
structures.
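Sketch of the two dentry fields just described. This is a userspace toy, not the real struct dentry: the toy_* names are made up, and only the d_inode/d_parent ideas are modeled. It assumes the Linux convention that the root dentry is its own parent, so ".." at "/" stays at "/".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, stripped-down inode and dentry. */
struct toy_inode { unsigned long ino; };

struct toy_dentry {
    const char        *d_name;
    struct toy_inode  *d_inode;    /* NULL: negative dentry (name does not exist) */
    struct toy_dentry *d_parent;   /* the root dentry points to itself */
};

int toy_d_is_negative(const struct toy_dentry *d)
{
    return d->d_inode == NULL;
}

/* Resolve "." and ".." straight from the current dentry, with no search of
 * the dcache.  Any other name returns NULL here, meaning a real dcache
 * lookup (and possibly a f/s ->lookup) would be needed. */
struct toy_dentry *toy_step(struct toy_dentry *cur, const char *name)
{
    assert(cur != NULL && cur->d_parent != NULL);
    if (strcmp(name, ".") == 0)
        return cur;
    if (strcmp(name, "..") == 0)
        return cur->d_parent;
    return NULL;
}
```

Usage: for "cd ../../foo/bar", the two ".." components are each one pointer dereference via toy_step(); only "foo" and "bar" need a name-based lookup.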