Writing Filesystems - Lifecycle of a vnode

From Genunix

This article has been identified as a draft. It is currently undergoing a community review. Please add your comments to the discussion page.

Do not quote any text on this page! It is still a draft!


The Solaris kernel and the UNIX userland know several file-related abstractions. But on-disk files, file descriptors and vnodes do not correspond 1:1, mostly for performance reasons such as caching. Several points come into play here:

  • Under UNIX, removing a file that is open does not revoke access to it for the processes that still have it open. Hence, the notification to the filesystem that a file is now 'to be deleted once and forever' may arrive well after the actual VOP_REMOVE() call.
  • Even if the file is closed by the last application that had it open, the system may want to keep filesystem state cached, because the file could be reopened at any moment.
  • The system page cache will keep pages associated with this file around even if the file is closed.
  • Certain filesystem operations, like stat(2) or rename(2), can be applied to files that are not open.
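
The first point can be observed from userland. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the function name unlink_while_open() and the file name template are made up for illustration. It removes a file's name while a descriptor is still open and checks that the contents stay reachable through that descriptor.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a temporary file, remove its name while the descriptor is
 * still open, and verify that the data is still readable.
 * Returns 0 on success, -1 on any failure.
 */
int
unlink_while_open(void)
{
	char path[] = "/tmp/vnode_demo_XXXXXX";	/* hypothetical template */
	const char msg[] = "still here";
	char buf[sizeof (msg)];
	struct stat st;
	int fd;

	if ((fd = mkstemp(path)) == -1)
		return (-1);
	if (write(fd, msg, sizeof (msg)) != (ssize_t)sizeof (msg))
		goto fail;

	/* Remove the directory entry; the open descriptor keeps the file alive. */
	if (unlink(path) == -1)
		goto fail;

	/* The name is gone ... */
	if (stat(path, &st) == 0)
		goto fail;

	/* ... but the contents are still readable through the descriptor. */
	if (lseek(fd, 0, SEEK_SET) == -1 ||
	    read(fd, buf, sizeof (buf)) != (ssize_t)sizeof (buf) ||
	    memcmp(buf, msg, sizeof (msg)) != 0)
		goto fail;

	/* Only this close lets the filesystem finally discard the file. */
	(void) close(fd);
	return (0);

fail:
	(void) close(fd);
	(void) unlink(path);	/* harmless if the name is already gone */
	return (-1);
}
```

Between the unlink() and the close(), the kernel must still have a vnode for a file that no longer has a name, which is exactly the situation described above.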

All this means that a vnode does not exist only while a process holds the file open. It lives longer than that, and may even be required to exist when no one has the associated file open. The vnode_t data structure is therefore reference counted, via the two macros VN_HOLD() and VN_RELE().

So how are vnodes actually created, and how are they discarded?

  • How does the framework ask the filesystem "please give me a vnode"?
  • How does the framework tell the filesystem "dump all state you may still have for this vnode, it is going away"?

The answer to that is a seemingly strange pair of operations: VFS_VGET() and VOP_INACTIVE().

Why is one of them a VFS operation and the other a vnode operation? Simply because at the time the framework requests a vnode to be created, the vnode obviously cannot exist yet, and hence no operations can be associated with it. VFS_VGET() therefore must be a VFS op, not a vnode op. On the other hand, vnode inactivation is clearly an operation on a vnode. When is that done? When VN_RELE() is called for the last time, i.e. once the last reference on the vnode is gone. VN_RELE() actually evaluates to vn_rele(), which is simple:
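
The asymmetry can be illustrated with a toy userland model. This is only a sketch: every toy_* name below is hypothetical and the structures are far simpler than the real vfs_t and vnode_t. The point is that creation has to be dispatched through the filesystem's operations vector, while inactivation can be dispatched through the operations vector attached to the vnode itself.

```c
#include <assert.h>
#include <stdlib.h>

struct toy_vnode;
struct toy_vfs;

/* Per-vnode operations: only reachable once a vnode exists. */
struct toy_vnodeops {
	void (*vop_inactive)(struct toy_vnode *);
};

/* Per-filesystem operations: reachable from the filesystem alone. */
struct toy_vfsops {
	int (*vfs_vget)(struct toy_vfs *, struct toy_vnode **, int);
};

struct toy_vfs {
	const struct toy_vfsops *vfs_op;
	int vfs_nlive;			/* vnodes currently instantiated */
};

struct toy_vnode {
	const struct toy_vnodeops *v_op;
	struct toy_vfs *v_vfsp;
	unsigned int v_count;
	int v_id;
};

/* Discard all state the toy filesystem keeps for this vnode. */
static void
toy_inactive(struct toy_vnode *vp)
{
	vp->v_vfsp->vfs_nlive--;
	free(vp);
}

static const struct toy_vnodeops toy_vnops = { toy_inactive };

/* Instantiate a vnode for file "id" and attach its operations vector. */
static int
toy_vget(struct toy_vfs *vfsp, struct toy_vnode **vpp, int id)
{
	struct toy_vnode *vp = malloc(sizeof (*vp));

	if (vp == NULL)
		return (-1);
	vp->v_op = &toy_vnops;
	vp->v_vfsp = vfsp;
	vp->v_count = 1;		/* returned with one hold */
	vp->v_id = id;
	vfsp->vfs_nlive++;
	*vpp = vp;
	return (0);
}

static const struct toy_vfsops toy_vfsops_vec = { toy_vget };

/* Walk one vnode through creation and inactivation; returns live count. */
int
toy_lifecycle(void)
{
	struct toy_vfs fs = { &toy_vfsops_vec, 0 };
	struct toy_vnode *vp = NULL;

	/* No vnode yet, so the request goes through the filesystem. */
	if (fs.vfs_op->vfs_vget(&fs, &vp, 42) != 0)
		return (-1);
	assert(fs.vfs_nlive == 1 && vp->v_count == 1);

	/* The vnode exists, so its own ops vector handles the teardown. */
	vp->v_op->vop_inactive(vp);
	return (fs.vfs_nlive);
}
```

After the inactive call the toy filesystem holds no state for the file any more, so toy_lifecycle() reports zero live vnodes.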

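void
vn_rele(vnode_t *vp)
{
	/* Releasing a vnode that nobody holds is a fatal bug. */
	if (vp->v_count == 0)
		cmn_err(CE_PANIC, "vn_rele: vnode ref count 0");
	mutex_enter(&vp->v_lock);
	if (vp->v_count == 1) {
		/*
		 * Last reference: drop the lock and hand the vnode to the
		 * filesystem. Note that v_count is not decremented here.
		 */
		mutex_exit(&vp->v_lock);
		VOP_INACTIVE(vp, CRED());
	} else {
		/* Other holders remain: just drop our reference. */
		vp->v_count--;
		mutex_exit(&vp->v_lock);
	}
}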
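
Because vn_rele() drops v_lock before calling VOP_INACTIVE() and leaves v_count at 1, the filesystem's inactive routine has to take the lock again and recheck the count; another thread may have acquired a new hold in between. The following userland sketch models that hand-off. It assumes a pthread mutex in place of the kernel mutex, and all toy_* names and the v_destroyed flag are hypothetical, not taken from any real filesystem.

```c
#include <assert.h>
#include <pthread.h>

struct toy_vnode {
	pthread_mutex_t v_lock;
	unsigned int v_count;
	int v_destroyed;	/* set once the "filesystem" dropped its state */
};

/* Model of VN_HOLD(): take one more reference. */
static void
toy_hold(struct toy_vnode *vp)
{
	pthread_mutex_lock(&vp->v_lock);
	vp->v_count++;
	pthread_mutex_unlock(&vp->v_lock);
}

/* Model of a filesystem's inactive routine. */
static void
toy_inactive(struct toy_vnode *vp)
{
	pthread_mutex_lock(&vp->v_lock);
	if (vp->v_count > 1) {
		/* Somebody took a new hold meanwhile: just drop ours. */
		vp->v_count--;
		pthread_mutex_unlock(&vp->v_lock);
		return;
	}
	/* Really the last reference: drop it and discard the state. */
	vp->v_count--;
	vp->v_destroyed = 1;
	pthread_mutex_unlock(&vp->v_lock);
}

/* Model of vn_rele(), same structure as the kernel routine above. */
static void
toy_rele(struct toy_vnode *vp)
{
	pthread_mutex_lock(&vp->v_lock);
	assert(vp->v_count > 0);
	if (vp->v_count == 1) {
		pthread_mutex_unlock(&vp->v_lock);
		toy_inactive(vp);
	} else {
		vp->v_count--;
		pthread_mutex_unlock(&vp->v_lock);
	}
}

/* Returns 0 if the vnode is torn down on the last release only. */
int
toy_rele_demo(void)
{
	struct toy_vnode vn = { PTHREAD_MUTEX_INITIALIZER, 1, 0 };

	toy_hold(&vn);		/* count: 2 */
	toy_rele(&vn);		/* count: 1, vnode must survive */
	if (vn.v_count != 1 || vn.v_destroyed)
		return (-1);
	toy_rele(&vn);		/* last release goes through toy_inactive() */
	return ((vn.v_destroyed && vn.v_count == 0) ? 0 : -1);
}
```

The recheck in toy_inactive() is the design choice to note: the release path alone cannot decide that the vnode is dead, only the routine that holds the lock at teardown time can.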


(to be continued ...)

VFS_VGET() -> VN_HOLD() -> VOP_*() -> VN_RELE() -> VN_RELE() -> VOP_INACTIVE()
...