Copyright © 2015 by Alan Conroy. This article may be copied in whole or in part as long as this copyright is included.


1 Introduction
2 Ground Rules

Building a File System
3 File Systems
4 File Content Data Structure
5 Allocation Cluster Manager
6 Exceptions and Emancipation
7 Base Classes, Testing, and More
8 File Meta Data
9 Native File Class
10 Our File System
11 Allocation Table
12 File System Support Code
13 Initializing the File System
14 Contiguous Files
15 Rebuilding the File System
16 Native File System Support Methods
17 Lookups, Wildcards, and Unicode, Oh My
18 Finishing the File System Class

The Init Program
19 Hardware Abstraction and UOS Architecture
20 Init Command Mode
21 Using Our File System
22 Hardware and Device Lists
23 Fun with Stores: Partitions
24 Fun with Stores: RAID
25 Fun with Stores: RAM Disks
26 Init wrap-up

The Executive
27 Overview of The Executive
28 Starting the Kernel
29 The Kernel
30 Making a Store Bootable
31 The MMC
32 The HMC
33 Loading the components
34 Using the File Processor
35 Symbols and the SSC
36 The File Processor and Device Management
37 The File Processor and File System Management
38 Finishing Executive Startup

Users and Security
39 Introduction to Users and Security
40 More Fun With Stores: File Heaps
41 File Heaps, part 2
42 SysUAF
43 TUser
44 SysUAF API

Terminal I/O
45 Shells and UCL
46 UOS API, the Application Side
47 UOS API, the Executive Side
48 I/O Devices
49 Streams
50 Terminal Output Filters
51 The TTerminal Class
52 Handles
53 Putting it All Together
54 Getting Terminal Input
55 QIO
56 Cooking Terminal Input
57 Putting it all together, part 2
58 Quotas and I/O

UCL
59 UCL Basics
60 Symbol Substitution
61 Command execution
62 Command execution, part 2
63 Command Abbreviation
64 ASTs
65 Expressions, Part 1
66 Expressions, Part 2: Support code
67 Expressions, part 3: Parsing
68 SYS_GETJPIW and SYS_TRNLNM
69 Expressions, part 4: Evaluation

UCL Lexical Functions
70 PROCESS_SCAN
71 PROCESS_SCAN, Part 2
72 TProcess updates
73 Unicode revisted
74 Lexical functions: F$CONTEXT
75 Lexical functions: F$PID
76 Lexical Functions: F$CUNITS
77 Lexical Functions: F$CVSI and F$CVUI
78 UOS Date and Time Formatting
79 Lexical Functions: F$CVTIME
80 LIB_CVTIME
81 Date/Time Contexts
82 SYS_GETTIM, LIB_Get_Timestamp, SYS_ASCTIM, and LIB_SYS_ASCTIM
83 Lexical Functions: F$DELTA_TIME
84 Lexical functions: F$DEVICE
85 SYS_DEVICE_SCAN
86 Lexical functions: F$DIRECTORY
87 Lexical functions: F$EDIT and F$ELEMENT
88 Lexical functions: F$ENVIRONMENT
89 SYS_GETUAI
90 Lexical functions: F$EXTRACT and F$IDENTIFIER
91 LIB_FAO and LIB_FAOL
92 LIB_FAO and LIB_FAOL, part 2
93 Lexical functions: F$FAO
94 File Processing Structures
95 Lexical functions: F$FILE_ATTRIBUTES
96 SYS_DISPLAY
97 Lexical functions: F$GETDVI
98 Parse_GetDVI
99 GetDVI
100 GetDVI, part 2
101 GetDVI, part 3
102 Lexical functions: F$GETJPI
103 GETJPI
104 Lexical functions: F$GETSYI
105 GETSYI
106 Lexical functions: F$INTEGER, F$LENGTH, F$LOCATE, and F$MATCH_WILD
107 Lexical function: F$PARSE
108 FILESCAN
109 SYS_PARSE
110 Lexical Functions: F$MODE, F$PRIVILEGE, and F$PROCESS
111 File Lookup Service
112 Lexical Functions: F$SEARCH
113 SYS_SEARCH
114 F$SETPRV and SYS_SETPRV
115 Lexical Functions: F$STRING, F$TIME, and F$TYPE
116 More on symbols
117 Lexical Functions: F$TRNLNM
118 SYS_TRNLNM, Part 2
119 Lexical functions: F$UNIQUE, F$USER, and F$VERIFY
120 Lexical functions: F$MESSAGE
121 TUOS_File_Wrapper
122 OPEN, CLOSE, and READ system services

UCL Commands
123 WRITE
124 Symbol assignment
125 The @ command
126 @ and EXIT
127 CRELNT system service
128 DELLNT system service
129 IF...THEN...ELSE
130 Comments, labels, and GOTO
131 GOSUB and RETURN
132 CALL, SUBROUTINE, and ENDSUBROUTINE
133 ON, SET {NO}ON, and error handling
134 INQUIRE
135 SYS_WRITE Service
136 OPEN
137 CLOSE
138 DELLNM system service
139 READ
140 Command Recall
141 RECALL
142 RUN
143 LIB_RUN
144 The Data Stream Interface
145 Preparing for execution
146 EOJ and LOGOUT
147 SYS_DELPROC and LIB_GET_FOREIGN

CUSPs and utilities
148 The I/O Queue
149 Timers
150 Logging in, part one
151 Logging in, part 2
152 System configuration
153 SET NODE utility
154 UUI
155 SETTERM utility
156 SETTERM utility, part 2
157 SETTERM utility, part 3
158 AUTHORIZE utility
159 AUTHORIZE utility, UI
160 AUTHORIZE utility, Access Restrictions
161 AUTHORIZE utility, Part 4
162 AUTHORIZE utility, Reporting
163 AUTHORIZE utility, Part 6
164 Authentication
165 Hashlib
166 Authenticate, Part 7
167 Logging in, part 3
168 DAY_OF_WEEK, CVT_FROM_INTERNAL_TIME, and SPAWN
169 DAY_OF_WEEK and CVT_FROM_INTERNAL_TIME
170 LIB_SPAWN
171 CREPRC
172 CREPRC, Part 2
173 COPY
174 COPY, part 2
175 COPY, part 3
176 COPY, part 4
177 LIB_Get_Default_File_Protection and LIB_Substitute_Wildcards
178 CREATESTREAM, STREAMNAME, and Set_Contiguous
179 Help Files
180 LBR Services
181 LBR Services, Part 2
182 LIBRARY utility
183 LIBRARY utility, Part 2
184 FS Services
185 FS Services, Part 2
186 Implementing Help
187 HELP
188 HELP, Part 2
189 DMG_Get_Key and LIB_Put_Formatted_Output
190 LIBRARY utility, Part 3
191 Shutting Down UOS
192 SHUTDOWN
193 WAIT
194 SETIMR
195 WAITFR and Scheduling
196 REPLY, OPCOM, and Mailboxes
197 REPLY utility
198 Mailboxes
199 BRKTHRU
200 OPCOM
201 Mailbox Services
202 Mailboxes, Part 2
203 DEFINE
204 CRELNM
205 DISABLE
206 STOP
207 OPCCRASH and SHUTDOWN
208 APPEND

Glossary/Index


Downloads

File Systems

It might seem strange to start with the UOS file system, but it turns out that file system code has some utility outside of UOS. For instance, we can use it to create container files for our applications running on Microsoft Windows. In any case, file support is a central feature for any modern Operating System.

Before we go any further, we should recognize that there is a difference between the file system software and the data structure that is managed by that software. That is, the on-disk structure of a file system simply defines how the file data and folder relationships are stored. The software accesses that data structure to create/read/write/delete files, and usually adds some rules and features (such as locking, which we will discuss later). For now, let us set aside the software side of this and take a brief survey of the data structures used by existing file systems in the real world.

There are multiple file systems in use, depending upon the media that the files reside on. For instance, on a magnetic tape, the media is inherently sequential. There is, therefore, no hierarchy since each file simply occurs, one after another as the tape is read or written. Note that the file system software can present a hierarchical view of the files, but that is not how the files are stored - remember, we are only discussing the data structure on the media for now. Likewise, CDs contain data in a sequential format. But since the read/write heads can move over the CD like a hard disk, a CD can be accessed in a semi-random manner. However, like tapes, they can only be written to in a sequential manner. USB storage devices have their own data structures as well. And we will eventually discuss all of these, but for now we will focus on standard hard disk file system structures.

The first thing we note is that files can be of varying sizes, from 0 bytes to as large as the disk they reside on (or even larger, if the file can span multiple disks). Perhaps there are some specialized cases, where fixed-size files would be useful, but we want the native UOS file system structure to be general-purpose. When files can be any size on the disk, it presents a challenge in how to store and retrieve data from the files. Especially when multiple files may be written to alternatively, thus extending several of them over the course of running one program. And the problem only becomes worse if there are multiple programs running at the same time. Let's say you set aside 4K of disk space for a file. As soon as you write the 4097th byte to that file, it has to be extended to hold the data beyond the 4K that you set aside for it. It may be possible, sometimes, to simply allocate the next 4K on the disk for the file. But it is more likely that the first file area cannot be expanded because immediately after the 4K area, the next 4K area was allocated for another file. This is an example of how it could look:

How can we solve this? Well, one approach could be to move the data following the first 4K buffer so that the file can be expanded in place. In fact, this approach could simplify our data structure and the file system code since each file's data would be contiguous. But there would be a performance issue with having to move the next file's data to a new location on the disk in order to make room for the file each time it is expanded. In fact, a simple write to a file could result in a long pause as the data is moved around on the disk, especially if the next file was quite large. Further, this approach would tend to fragment the free space on the disk such that there may not be a large-enough contiguous free area to move a file into. You could have half of your disk space free, but not have enough free space in one place to be able to extend even a small file. On the other hand, reading the file's data takes very little overhead compared to other means of access. There are certain applications, such as when files are rarely extended, where this might be the right approach. However, in a general case, this is not an ideal solution for the reasons already stated.

How do most file systems handle efficient extending of files on disk? They allocate the files in pieces, and keep track of these pieces. When the file is accessed, the file system trapses down this list of file "extents" or "fragments" to get to the desired data. Since the next extent of the file can be placed where ever there is free space, the disk space is used efficiently. However, when files get too fragmented, system performance can suffer due to the overhead of moving the disk read/write heads all over to get these extents. That is why disk defragmenters are so useful on most computers - they gather all the fragments together to reduce the overhead.

In any programming, there are three things that you can optimize: speed of execution, amount of storage required, and complexity of the algorithm. Essentially, you can trade one against the other two. That is, you can choose to make it run fast and take less time, but only at the cost of extra code/algorithm complexity. Or you can make the code and data structures simple, and get decent speed, but only at the cost of additional storage (memory and/or disk). All programming ends up being a compromise between these three factors.

Chains of file extents seems like a good compromise, and is a widely used (and therefore well understood) implementation. I can't think of a better, overall, approach, and so UOS will use this approach to store file data. And that is what we will tackle in the next article.