Writing Filesystems
From Genunix
This article has been identified as a draft. It is currently undergoing a community review. Please add your comments to the discussion page.
Do not quote any text on this page! It is still a draft!
How to write a Solaris filesystem
Finally fulfilling a promise I made a while ago, I'm going to braindump a mini-series here on what I know about the source code structure of Solaris filesystems, the undocumented interfaces that filesystem code in Solaris must use to actually work, and how these interfaces work. It also covers how new implementations can avoid the locking issues and typical design faults that Solaris filesystems have had in the past. As the series unspools, a reimplementation of the FAT filesystem driver will be written, explaining step by step how the skeleton code grows into a functional implementation.
As an editorial note, I originally had the title "Writing a Solaris filesystem in 21 days" for this series. I must admit I failed to do it in 21 days, so beware, it might take longer ...
Table of contents
- Introduction
- What does writing a (disk-based) filesystem for Solaris involve? Why this article?
- Sourcecode Structure
- The source file tree for a Solaris filesystem driver and associated utilities
- Build Environment
- How to set up an OpenSolaris workspace for developing your filesystem
- Module glue
- Filesystem drivers are kernel modules - but not 'ordinary' kernel modules ...
- Mount option handling
- See how mount option parsing can be handed off to the framework
- VFS and Vnode interfaces
- Provides an overview of per-mountpoint (VFS) and per-file (Vnode) operations that a filesystem driver must/may implement
- Specifics about VFS Operations
- VFS_MOUNT(), VFS_UNMOUNT(), and VFS_FREEVFS()
- VFS_STATVFS()
- VFS_ROOT()
- VFS_SYNC()
- VFS_VGET() - Requesting a file node from a filesystem instance
- Userdata I/O
- Excursion on how read/write and mmap-based I/O work
- A simple locking protocol for a read/write filesystem
- Reentrancy in filesystems and the need for shared/exclusive access for directory and file updates
- Specifics about Vnode interfaces for file I/O
- Opening and closing files - VOP_OPEN(), VOP_CLOSE()
- Reading and writing files via system calls - VOP_READ(), VOP_WRITE()
- Support for mmap'ed IO - VOP_MAP(), VOP_ADDMAP() and VOP_DELMAP()
- Backend for mapped IO - VOP_GETPAGE(), VOP_PUTPAGE()
- File and directory attributes - VOP_GETATTR(), VOP_SETATTR(), VOP_ACCESS()
- ACLs and security attributes - VOP_GETSECATTR(), VOP_SETSECATTR()
- ioctl on files - VOP_IOCTL()
- Specifics about directory-related Vnode operations
- Reading directory contents - VOP_READDIR() and VOP_LOOKUP()
- File and directory creation/removal/rename - VOP_CREATE(), VOP_REMOVE(), VOP_RENAME(), VOP_MKDIR(), VOP_RMDIR()
- VOP_INACTIVE() and a vnode's lifecycle - Shows how VFS_VGET() and VOP_INACTIVE() complement each other, and describes support for forced umount
- Generic directory walking support code
- Demonstrates how readdir/lookup codepaths can be unified using a generic directory walker mechanism
- Mapping file/directory offsets to disk blocks
- One of the most important tasks of a filesystem - where's my data, dude?
- Timestamps
- POSIX atime, mtime, ctime, and their (non)-equivalents in a given filesystem
- Filesystem Utilities
- mount / unmount
- fsck
- mkfs
