File Heaps, part 2
Introduction
The file heap is analogous to a memory heap. They both operate in like manner, just using different kinds of stores. However, memory heaps have the advantages of a) speed and b) the ability to directly read and write data to the heap via pointers. Of course, there are also the disadvantages that a) errant pointers into a memory heap can corrupt the entire structure and b) the heap contents are not persisted across reboots of the system. Obviously, then, which one we use will depend upon our needs for each specific situation. SYSUAF.DAT requires persisting across multiple system reboots, which is why we use a file heap rather than a memory heap for it. We could load the entire structure into memory when needed and persist it back to a file when we shut down, but that might waste a lot of memory on multi-user systems when we only need pieces of the structure at any one time.
The disadvantage of not being able to directly read and write data to the file heap via pointers is the point of the helper classes that we will be discussing in this article. These classes provide for easy use of the file heap structures. Essentially, they are ephemeral in-memory representations of the data in the file heap. They will take care of handling common structures in the file heap so that the rest of the UOS code doesn't have to be bogged down in the details of structure management.
Each of the classes has a header stored in the file heap that indicates the type of structure. This helps to ensure that the code accessing the structures is using them properly. The first three bytes indicate the type of structure. The fourth byte indicates a version (in case we have future changes to the structures). Thus, all three record types have the following starting contents:
packed record
Prefix : TPrefix ;
Version : byte ;
Following is the definition of TPrefix, and the constants used for the file heap structures that we discuss in this article:
type TPrefix = packed array[ 0..2 ] of char ;
const String_Prefix : TPrefix = ( 'S', 'T', 'R' ) ;
const List_Prefix : TPrefix = ( 'L', 'I', 'S' ) ;
const String_List_Prefix : TPrefix = ( 'S', 'L', 'I' ) ;
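To make the four-byte layout concrete, here is a minimal sketch, assuming Free Pascal in objfpc mode. TCommon_Header and Same_Prefix are hypothetical names used only for this illustration; they are not part of UOS. AnsiChar is used so that each prefix character occupies exactly one byte regardless of compiler settings.

```pascal
program PrefixCheck ;

{$mode objfpc}

type TPrefix = packed array[ 0..2 ] of AnsiChar ;

     // Hypothetical name for the common start of every header
     TCommon_Header = packed record
                          Prefix : TPrefix ;
                          Version : byte ;
                      end ;

const String_Prefix : TPrefix = ( 'S', 'T', 'R' ) ;
      List_Prefix : TPrefix = ( 'L', 'I', 'S' ) ;

// Compare two prefixes one byte at a time
function Same_Prefix( const A, B : TPrefix ) : boolean ;

var I : integer ;

begin
    Result := True ;
    for I := low( A ) to high( A ) do
    begin
        if( A[ I ] <> B[ I ] ) then
        begin
            Result := False ;
            exit ;
        end ;
    end ;
end ;

var H : TCommon_Header ;

begin
    H.Prefix := String_Prefix ;
    H.Version := 1 ;
    writeln( 'Header bytes: ', sizeof( H ) ) ; // 3 prefix bytes + 1 version byte
    writeln( 'Is string: ', Same_Prefix( H.Prefix, String_Prefix ) ) ;
    writeln( 'Is list: ', Same_Prefix( H.Prefix, List_Prefix ) ) ;
end.
```

A header that starts with "STR" passes the string check and fails the list check, which is the kind of mismatch that the type prefix exists to catch.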
Each class that wraps the file heap structures not only has a header, but also must have a pointer to an instance of a store which is the file heap, as well as a pointer to the location of the structure in the file heap. Thus, we have a base class from which the other classes descend. It is defined as follows:
type TStore_Object = class( TCommon_COM_Interface )
public // Constructors and destructors...
destructor Destroy ; override ;
protected // Instance data...
_Address : int64 ;
_Store : TUOS_File_Heap ;
public // API...
function Valid_Header : boolean ;
virtual ; abstract ;
protected // Property handlers...
function Get_Address : int64 ; virtual ;
procedure Set_Address( _Value : int64 ) ;
virtual ; abstract ;
procedure Set_Store( _Value : TUOS_File_Heap ) ;
public // Properties...
property Address : int64 // Header address
read Get_Address
write Set_Address ;
property Store : TUOS_File_Heap
read _Store
write Set_Store ;
end ; // TStore_Object
The base class has the instance data for the store and address, and properties to access them. It also has abstract ("pure virtual" in C++ parlance) methods which are overridden by the descendants: Set_Address, which is used to load the structure from the heap, and Valid_Header, which checks for a header that the descendant is compatible with. The implementation of the other methods follows:
// Constructors and destructors...
destructor TStore_Object.Destroy ;
begin
Store := nil ;
inherited Destroy ;
end ;
// API...
procedure TStore_Object.Set_Store( _Value : TUOS_File_Heap ) ;
begin
if( _Value <> nil ) then
begin
_Value.Attach ;
end ;
if( _Store <> nil ) then
begin
_Store.Detach ;
end ;
_Store := _Value ;
end ;
function TStore_Object.Get_Address : int64 ;
begin
Result := _Address ;
end ;
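These are typical getter and setter methods, like those in classes we've talked about in previous articles. Note that Set_Store attaches to the new store before detaching from the old one, and that the destructor releases the store simply by setting the property to nil.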
Now let us move on to the classes themselves.
Strings
One of the structures that we need is a string. These strings are not null-terminated, and may contain nulls. Thus, we need three pieces of data for a file heap string: the location of the string in the heap, the size of the string data, and the actual data itself. We could implement it simply as a size prefix followed by the data, but that creates problems if we change the value of the string. In our heap, we will have structures that reference strings. If the string value is changed to require more space, it might have to be moved to a new location in the heap. This would then require that the pointer to the string be updated. This, in turn, would mean that we'd have to pass a pointer to the string pointer, which will likely be at an arbitrary offset in some structure. If that weren't sticky enough, we have to update both the in-memory copy of the structure (and there might be multiple copies of the structure in memory) as well as the in-file copy. What if we wanted to have multiple references to the same string? Perhaps you can begin to see the headaches with this approach. Therefore, we will implement strings as pointers to a string header structure in the file heap. Any changes to the string will simply result, at most, in a change to that header. All the pointers to the string still point to the header. Once again, indirection saves the day. Here is the string header structure:
TStore_String_Header = packed record
Prefix : TPrefix ; // "STR"
Version : byte ;
Length : longword ;
Flags : longint ; // reserved
RefCount : longint ; // reserved
Data : int64 ;
end ;
A couple of the items are reserved for future use, so we'll ignore those for now. The structure starts with the prefix and version, as discussed above. We could store the string's length as a prefix to the string data, but since we have a header anyway, we store the length there. Finally, we have the pointer to the actual data. The header is 24 bytes long, and the default resolution of a file heap is 16 bytes. At first glance, it might seem that we are "wasting" 8 bytes by not having the header be 32 bytes long. But we have everything we need, and since the file heap implementation uses an 8 byte size prefix for the allocation chunk, it turns out that the header will take exactly 32 bytes anyway. We don't rely on this fact, since that is an implementation detail that could change in the future. Arbitrarily adding reserved data at the end of the header just to round it up to a 16-byte boundary would be relying on an implementation detail and, thus, is a form of pathological coupling.
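The size arithmetic can be checked with a short sketch, assuming Free Pascal in objfpc mode. The Resolution and Size_Prefix constants restate the 16-byte and 8-byte figures from the text, and Round_Up is a hypothetical helper. The assumption that the heap rounds the payload plus its size prefix up to the resolution is ours, made only to reproduce the numbers above; it is not taken from the file heap source.

```pascal
program HeaderSize ;

{$mode objfpc}

type TPrefix = packed array[ 0..2 ] of AnsiChar ;

     TStore_String_Header = packed record
                                Prefix : TPrefix ;
                                Version : byte ;
                                Length : longword ;
                                Flags : longint ;
                                RefCount : longint ;
                                Data : int64 ;
                            end ;

const Resolution = 16 ; // Default file heap resolution, per the text
      Size_Prefix = 8 ; // Allocation size prefix, per the text

// Round Value up to the next multiple of Granularity
function Round_Up( Value, Granularity : int64 ) : int64 ;

begin
    Result := ( Value + Granularity - 1 ) div Granularity * Granularity ;
end ;

begin
    // 3 + 1 + 4 + 4 + 4 + 8 = 24
    writeln( 'Header size: ', sizeof( TStore_String_Header ) ) ;
    // 24 + 8 = 32, which is already a multiple of 16
    writeln( 'Heap space used: ',
        Round_Up( sizeof( TStore_String_Header ) + Size_Prefix, Resolution ) ) ;
end.
```

Under those assumptions, padding the header to 32 bytes would push the allocation to 48 bytes, so the "rounding up" would cost space rather than save it.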
Note that the data pointer is a 64-bit integer. This is so that the string can be located anywhere in a file-heap of any size. However, we restrict individual string lengths to 32-bits (4 gigabytes). This is a reasonable restriction for textual data. If we want non-textual data, we would use some other structure (such as a "blob" object), which could be up to the entire size of the file heap. But we don't need that for the foreseeable future, so we are content with a 4 gigabyte string limit.
The TStore_String class implements a Pascal string interface to a file heap string. Here is the class definition, and related constants:
TStore_String = class( TStore_Object )
private // Instance data...
Header : TStore_String_Header ;
S : string ; // Cached string from store
protected // Property handlers...
procedure Set_Address( _Value : int64 ) ; override ;
public
function Get_Value : string ;
procedure Set_Value( _Value : string ) ;
public // Override...
function Valid_Header : boolean ; override ;
public // API...
procedure Delete ; virtual ;
public // Properties...
property Value : string
read Get_Value
write Set_Value ;
end ; // TStore_String
const Store_String_Facility = -1 ;
Store_StringErr_Success = 0 ;
Store_StringErr_Invalid_Object = 1 ;
The class includes a header and a string which is a cached version of the string in the file heap. It adds a Value property which allows us to read and write string values.
// Override...
function TStore_String.Valid_Header : boolean ;
begin
Result := False ;
if( Header.Version >= 10 ) then // Unknown version
begin
Set_Last_Error( Create_Simple_UE( Store_String_Facility, 1, Store_StringErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
if( Header.Prefix <> String_Prefix ) then // Not a string header
begin
Set_Last_Error( Create_Simple_UE( Store_String_Facility, 1, Store_StringErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
Result := True ;
end ;
The Valid_Header method verifies that the loaded header is valid by checking the prefix and version.
procedure TStore_String.Set_Address( _Value : int64 ) ;
var UEC : TUnified_Exception ;
begin
Set_Last_Error( nil ) ;
if( _Store <> nil ) then
begin
_Store.Read_Data( Header, _Value, sizeof( Header ), UEC ) ;
if( UEC <> nil ) then
begin
Set_Last_Error( UEC ) ;
exit ;
end ;
if( not Valid_Header ) then // Unknown version or wrong prefix
begin
exit ;
end ;
setlength( S, Header.Length ) ;
if( Header.Length > 0 ) then
begin
Store.Read_Data( PChar( S )[ 0 ], Header.Data, Header.Length, UEC ) ;
Set_Last_Error( UEC ) ;
end ;
end ;
_Address := _Value ;
end ; // TStore_String.Set_Address
When the address is set, the method reads the header from the store and validates it. Then it reads a copy of the string text from the heap into the internal string. Finally, it sets the _Address instance data.
procedure TStore_String.Delete ;
var H : TStore_String_Header ;
UEC : TUnified_Exception ;
begin
Set_Last_Error( nil ) ;
if( _Store <> nil ) then
begin
fillchar( H, sizeof( H ), 0 ) ;
_Store.Write_Data( H, Address, sizeof( H ), UEC ) ;
_Store.freemem( Header.Data ) ;
_Store.freemem( Address ) ;
end ;
Free ;
end ;
The Delete method deletes the string from the heap. It zeroes out the header in the heap, frees the actual string storage, then deallocates the header. Finally, the method deletes the current instance - since there is no longer a corresponding string in the heap. The reason for clearing the header in the heap is so that any bug in the code that tries to use the structure again won't have a left-over header that appears to be valid.
Next are the getter and setters for the Value property.
function TStore_String.Get_Value : string ;
begin
Result := S ;
end ;
procedure TStore_String.Set_Value( _Value : string ) ;
var I : int64 ;
UEC : TUnified_Exception ;
begin
Set_Last_Error( nil ) ;
if( _Value <> S ) then
begin
if( _Store <> nil ) then
begin
Get_Value simply returns the internal cached value. Set_Value does nothing if the passed value is the same as the internal cached value. If there is no store assigned, the string in the heap cannot be updated, so only the cached value changes.
// Reallocate the storage for the value...
I := Header.Data ;
if( Header.Data = 0 ) then // Not already allocated
begin
if( length( _Value ) > 0 ) then // Non-null string
begin
I := _Store.getmem( length( _Value ) ) ;
if( I = 0 ) then // Failure
begin
Set_Last_Error( _Store.Last_Error ) ;
exit ;
end ;
end ;
end else
begin
if( length( _Value ) = 0 ) then
begin
I := 0 ;
_Store.freemem( Header.Data ) ;
end else
begin
I := _Store.Reallocmem( Header.Data, length( _Value ) ) ;
if( I = 0 ) then // Failure
begin
Set_Last_Error( _Store.Last_Error ) ;
exit ;
end ;
end ;
end ;
There are two options: either string data has already been allocated in the heap, or not. If not, then we only need to allocate space if the value being set is non-null. Obviously, a null string requires no storage for text. If the header's Data field is not 0, then some space has already been allocated for the string. In that case, there are two options. If we are setting the string to null, we can free the existing allocation for string contents and set I to 0 (the I variable will be the new Data pointer). If the string value is non-null then we reallocate the current string allocation to match the size of the new value.
// Update the header...
if( ( Header.Data <> I ) or ( Header.Length <> length( _Value ) ) ) then
begin
Header.Data := I ;
Header.Length := length( _Value ) ;
_Store.Write_Data( Header, Address, sizeof( Header ), UEC ) ;
if( UEC <> nil ) then
begin
Set_Last_Error( UEC ) ;
exit ;
end ;
end ; // if
If the header data pointer or length values changed, we update the header and write the new one to the heap. But we want to avoid unnecessary I/O, so if neither value changed, we don't bother doing the write.
// Write the new value...
if( Header.Data <> 0 ) then
begin
_Store.Write_Data( PChar( _Value )[ 0 ], Header.Data, Header.Length, UEC ) ;
if( UEC <> nil ) then
begin
Set_Last_Error( UEC ) ;
exit ;
end ;
end ;
end ; // if( _Store <> nil )
S := _Value ;
end ; // if( _Value <> S )
end ; // TStore_String.Set_Value
If we have a memory location to write to (Header.Data <> 0), then we write the new string contents to the heap. Finally, we update the internal cached value.
That completes the store string support class. It is pretty simple and straightforward. However, it does require some care in terms of its use. If the code using it ever has two instances of the class that point to the same string, the cached string contents could become inconsistent. Consider if instances A and B point to the same string. At creation time, they would have the same cached value. But if we change the string via instance A, instance B will still have the old value cached even though instance A will have the new value. Thus, it is best to make sure that code which accesses a store string has only a single instance of the class for any given string.
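The stale-cache hazard can be reproduced with a toy model, assuming Free Pascal in objfpc mode. TToy_String and Heap_Text are hypothetical stand-ins for TStore_String and the file heap. Only the caching behavior is modeled; there are no headers or allocations here.

```pascal
program StaleCache ;

{$mode objfpc}{$H+}

// Toy stand-in for the heap: a single shared string
var Heap_Text : string = 'old' ;

type TToy_String = class
                       private
                           S : string ; // Cached copy, loaded once
                       public
                           constructor Create ;
                           procedure Set_Value( const _Value : string ) ;
                           property Value : string
                               read S
                               write Set_Value ;
                   end ;

constructor TToy_String.Create ;

begin
    inherited Create ;
    S := Heap_Text ; // Load the cache from the "heap"
end ;

procedure TToy_String.Set_Value( const _Value : string ) ;

begin
    Heap_Text := _Value ; // Write through to the "heap"
    S := _Value ; // Update only this instance's cache
end ;

var A, B : TToy_String ;

begin
    A := TToy_String.Create ;
    B := TToy_String.Create ;
    A.Value := 'new' ;
    writeln( 'A: ', A.Value ) ; // new
    writeln( 'B: ', B.Value ) ; // old - cached when B was created
    writeln( 'Heap: ', Heap_Text ) ; // new
    A.Free ;
    B.Free ;
end.
```

B never learns about the write made through A, which is why a single instance per heap string is the safe pattern.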
Lists
There are many possible types of list structures that can be used to store arbitrary-sized lists of values. Singly-linked lists and doubly-linked lists are but two examples. We've been using a Pascal class called TList, which is implemented as a buffer holding an array of values. Whenever the size of a TList grows or shrinks, the buffer is resized. This approach has much faster lookup performance than a linked list, but has the disadvantage of requiring contiguous memory large enough to hold the entire list, and resizing a large list can result in a lot of overhead in the form of copying data around in memory. Because a file heap is slower than a memory heap, either approach can have serious performance consequences. But the utility of lists that can change size is such that we want one that works with a file heap. Therefore, we will implement a store list class which takes a hybrid approach.
What we will do is use buffers to hold several values, and if we run out of space we expand the list by allocating another buffer and linking the two together. With a sufficiently large buffer size, the lookup can still be much faster than traversing lots of pointers. But wait. This all sounds very familiar, doesn't it? Where have we seen a case where data was allocated in chunks but accessed as a single large "buffer"? It sounds like files on the UOS file system. Each chunk is a cluster, and the Allocation Chain logically connects all the clusters together. Am I suggesting that we implement an entire file system in a file heap, and store lists as files? Certainly, we could implement it that way. However, there is a lot of overhead associated with a general-purpose file system that is not needed for our immediate purposes. But we can certainly borrow some code from that part of UOS. Specifically, we will use an Allocation Cluster Manager instance to keep track of the buffers (clusters) of the list. What we will store in the list are 64-bit integer values. These can be used for any purpose: they might simply be integer values, or they might be pointers into other parts of the heap. The class doesn't know, and doesn't care. If you delete the list, or resize it to be smaller, the excess data is simply lost. Thus, it is the responsibility of the code that uses the store list to make sure that any data pointed to by the list is properly disposed of before the list is deleted or shrunk. The list will logically look like an array of int64s, starting at offset 0.
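As a rough in-memory illustration of the hybrid approach, here is a sketch assuming Free Pascal in objfpc mode. The Chunks array is a hypothetical stand-in for the allocation chain; the real class keeps its clusters in the file heap and asks the ACM where each one lives. The names Add and Get_Item mirror the class methods but are plain routines here.

```pascal
program ChunkedList ;

{$mode objfpc}

const Delta = 4 ; // Items per chunk (kept tiny for the demonstration)

type TChunk = array[ 0..Delta - 1 ] of int64 ;
     PChunk = ^TChunk ;

var Chunks : array of PChunk ; // Stand-in for the allocation chain
    Count : longint = 0 ; // Logical number of items

// Append a value, adding a chunk only when the existing ones are full
function Add( Value : int64 ) : longint ;

begin
    if( Count = length( Chunks ) * Delta ) then
    begin
        setlength( Chunks, length( Chunks ) + 1 ) ;
        new( Chunks[ high( Chunks ) ] ) ;
    end ;
    Chunks[ Count div Delta ]^[ Count mod Delta ] := Value ;
    Result := Count ;
    inc( Count ) ;
end ;

// Look up an item: the quotient picks the chunk, the remainder the slot
function Get_Item( Index : longint ) : int64 ;

begin
    Result := Chunks[ Index div Delta ]^[ Index mod Delta ] ;
end ;

var I : longint ;

begin
    for I := 0 to 9 do
    begin
        Add( I * 100 ) ;
    end ;
    writeln( 'Chunks allocated: ', length( Chunks ) ) ; // 3 chunks hold 10 items
    writeln( 'Item 6: ', Get_Item( 6 ) ) ; // 600
    for I := 0 to high( Chunks ) do
    begin
        dispose( Chunks[ I ] ) ;
    end ;
end.
```

Growing the list never copies existing items, and a lookup costs one division rather than a walk along a chain of per-item links.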
Like strings, lists have a header. Here is the list header structure:
TStore_List_Header = packed record
Prefix : TPrefix ; // "LIS"
Version : byte ;
Delta : longint ;
Capacity : longint ;
Max : longint ;
Data : int64 ;
end ;
We've previously described the prefix and version. The Delta value indicates how large the buffers are that we allocate. This is in numbers of int64 values. Thus, this is the clustersize used by the ACM, divided by the size of an int64 (8 bytes). Therefore, a delta of 8 would correspond to a clustersize of 64. Capacity indicates the physical size of allocated clusters, in terms of int64s. Max indicates the number of int64s that are in use (logical size). Max is always less than, or equal to, Capacity. Data is the root of the allocation chain used by the ACM.
File heap lists are implemented via the TStore_List class. Here is the class definition, and related constants:
TStore_List = class( TStore_Object )
public // Constructors and destructors...
destructor Destroy ; override ;
protected // Instance data...
ACM : TCOM_Allocation_Cluster_Manager64 ;
Buf : pInt64_Array ;
Header : TStore_List_Header ;
public // Property handlers...
function Get_Item( Index : integer ) : int64 ;
procedure Set_Item( Index : integer ; Value : int64 ) ;
public // Override...
procedure Set_Address( _Value : int64 ) ; override ;
function Valid_Header : boolean ; override ;
public // API...
function Add( _Value : int64 ) : longint ; virtual ;
function Get_Capacity : longint ; virtual ;
procedure Set_Capacity( _Value : longint ) ; virtual ;
function Get_Count : longint ; virtual ;
procedure Set_Count( _Value : longint ) ; virtual ;
procedure Delete ; virtual ;
public // Properties...
property Capacity : longint
read Get_Capacity
write Set_Capacity ;
property Count : longint
read Get_Count
write Set_Count ;
property Items[ Index : integer ] : int64
read Get_Item
write Set_Item ;
default ;
end ; // TStore_List
const Store_List_Facility = -1 ;
Store_ListErr_Success = 0 ;
Store_ListErr_Invalid_Object = 1 ;
Store_ListErr_Index_Error = 2 ;
The Capacity property corresponds to the Header.Capacity value. The Count property corresponds to the Header.Max value. The Items property provides an array-like way to read and write the values in the list. ACM is the allocation cluster manager instance that the store list instance uses. Buf is a buffer used in various methods. Header is the store list header.
destructor TStore_List.Destroy ;
begin
if( ACM <> nil ) then
begin
ACM.Detach ;
ACM := nil ;
end ;
freemem( Buf ) ;
Buf := nil ;
inherited Destroy ;
end ;
The destructor detaches from the ACM and frees the temporary buffer, then calls the ancestor's destructor.
procedure TStore_List.Set_Address( _Value : int64 ) ;
var Old : TCOM_Allocation_Cluster_Manager64 ;
UEC : TUnified_Exception ;
begin
if( ( Store = nil ) or ( _Value = 0 ) ) then
begin
exit ;
end ;
Store.Read_Data( Header, _Value, sizeof( Header ), UEC ) ;
if( UEC <> nil ) then
begin
exit ;
end ;
if( not Valid_Header ) then // Unknown version
begin
exit ;
end ;
_Address := _Value ;
Old := ACM ;
ACM := TCOM_Allocation_Cluster_Manager64.Create ;
ACM.Attach ;
if( Old <> nil ) then
begin
Old.Detach ;
end ;
ACM.Store := Store ;
ACM.Clustersize := Header.Delta * sizeof( int64 ) ;
if( Header.Data <> 0 ) then
begin
ACM.Set_Root_And_Size( Header.Data, Header.Capacity * sizeof( int64 ) ) ;
end ;
end ; // TStore_List.Set_Address
function TStore_List.Valid_Header : boolean ;
begin
Set_Last_Error( nil ) ;
Result := False ;
if( Header.Version >= 10 ) then // Unknown version
begin
Set_Last_Error( Create_Simple_UE( Store_List_Facility, 1, Store_ListErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
if( Header.Prefix <> List_Prefix ) then // Not a list header
begin
Set_Last_Error( Create_Simple_UE( Store_List_Facility, 1, Store_ListErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
Result := True ;
end ;
The Valid_Header method is almost identical to the one used for TStore_String, except that we check for a different prefix. Likewise, the Set_Address method is similar to TStore_String.Set_Address, but we have a little more to do. We load the header and validate it. Then we create an allocation cluster manager (and detach from an existing one, if we had one). Then we set the ACM clustersize and, if there is any existing data (Header.Data <> 0), set the ACM root and size.
function TStore_List.Get_Count : longint ;
begin
Result := Header.Max ;
end ;
procedure TStore_List.Set_Count( _Value : longint ) ;
var UEC : TUnified_Exception ;
begin
if( _Value = Count ) then
begin
exit ; // No change
end ;
if( Capacity < _Value ) then
begin
Capacity := _Value ;
end ;
if( _Store <> nil ) then
begin
Header.Max := _Value ;
Store.Write_Data( Header, Address, sizeof( Header ), UEC ) ;
end ;
end ;
Get_Count simply returns the Max value from the header. Set_Count does nothing if the new count is the same as the current count. If the Capacity is less than the new count, we set the capacity to the new count since the physical size must be at least as large as the logical size. Finally, we update the header and write it to the heap (if we have a store assigned). Note that if no store is assigned, the method effectively does nothing.
function TStore_List.Get_Capacity : longint ;
begin
Result := Header.Capacity ;
end ;
procedure TStore_List.Set_Capacity( _Value : longint ) ;
var UEC : TUnified_Exception ;
begin
if( Header.Delta = 0 ) then
begin
exit ;
end ;
_Value := ( _Value + Header.Delta - 1 ) div Header.Delta * Header.Delta ; // Round up by delta
if( _Value = Capacity ) then
begin
exit ; // No change
end ;
if( _Value < Count ) then
begin
Count := _Value ;
end ;
if( ( _Store <> nil ) and ( ACM <> nil ) ) then
begin
ACM.Set_Size( _Value * sizeof( int64 ) ) ;
Header.Capacity := _Value ;
Header.Data := ACM.Get_Root ;
Store.Write_Data( Header, Address, sizeof( Header ), UEC ) ;
end ;
end ;
Get_Capacity returns the Capacity value from the header. Set_Capacity immediately exits if the delta is 0. This shouldn't happen, but we are being cautious. Next we round the new capacity up to the nearest clustersize boundary - capacity must always correlate to the maximum physical space available to the list. If the result indicates no difference from the current capacity, we exit. If the new capacity is less than the count, we first set the count to the capacity - otherwise the header would indicate that the logical size was larger than the physical size. Finally, assuming that we have a store assigned, we set the ACM size, update the header, and write the header to the heap.
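The rounding step is easy to verify in isolation. The following is a minimal sketch, assuming Free Pascal in objfpc mode; Round_Capacity and Items_Per_Cluster are hypothetical names, and the expression is the same one used in Set_Capacity.

```pascal
program CapacityRounding ;

{$mode objfpc}

const Items_Per_Cluster = 8 ; // A delta of 8, so a 64-byte cluster

// Round a requested capacity up to a whole number of clusters.
// Both values are counts of int64 items, as in the list header.
function Round_Capacity( Requested, Delta : longint ) : longint ;

begin
    Result := ( Requested + Delta - 1 ) div Delta * Delta ;
end ;

begin
    writeln( 'Clustersize: ', Items_Per_Cluster * sizeof( int64 ) ) ; // 64
    writeln( ' 0 -> ', Round_Capacity( 0, Items_Per_Cluster ) ) ; // 0
    writeln( ' 1 -> ', Round_Capacity( 1, Items_Per_Cluster ) ) ; // 8
    writeln( ' 8 -> ', Round_Capacity( 8, Items_Per_Cluster ) ) ; // 8
    writeln( ' 9 -> ', Round_Capacity( 9, Items_Per_Cluster ) ) ; // 16
    writeln( '17 -> ', Round_Capacity( 17, Items_Per_Cluster ) ) ; // 24
end.
```

A request that is already a multiple of the delta is left alone, and a request of 0 stays 0, which is what lets a list release all of its clusters.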
function TStore_List.Add( _Value : int64 ) : longint ;
begin
Result := Count ;
Count := Result + 1 ;
Set_Item( Result, _Value ) ;
end ;
procedure TStore_List.Delete ;
var H : TStore_List_Header ; // Same size as the string header, but the correct type
UEC : TUnified_Exception ;
begin
Set_Last_Error( nil ) ;
if( ( _Store <> nil ) and ( ACM <> nil ) ) then
begin
ACM.Set_Size( 0 ) ;
fillchar( H, sizeof( H ), 0 ) ;
_Store.Write_Data( H, Address, sizeof( H ), UEC ) ;
_Store.freemem( Header.Data ) ;
_Store.freemem( Address ) ;
end ;
Free ;
end ;
The Add method appends a value to the end of the list and returns the index of the new item. The method simply increments the Count (which may increase the capacity), then calls Set_Item to set the new value.
The Delete method is almost exactly like the Delete method for TStore_String. The only addition is that we set the ACM size to 0 in order to release the allocation chain for the list.
function TStore_List.Get_Item( Index : integer ) : int64 ;
var AC, I : int64 ;
Offset : longint ;
UEC : TUnified_Exception ;
begin
Set_Last_Error( nil ) ;
if( ( Store = nil ) or ( Index < 0 ) or ( Index >= Header.Max ) or ( ACM = nil ) ) then
begin
Set_Last_Error( Create_Simple_UE( Store_List_Facility, 1, Store_ListErr_Index_Error,
UE_Error, 'Index error', '' ) ) ;
Result := 0 ;
exit ;
end ;
AC := Index * sizeof( int64 ) div ACM.Clustersize ;
I := ACM.Get_Allocation_Cluster( AC ) ;
Offset := Index * sizeof( int64 ) - AC * ACM.Clustersize ;
ReAllocmem( Buf, ACM.Clustersize ) ;
Store.Read_Data( Buf^[ 0 ], I, ACM.Clustersize, UEC ) ;
move( Buf^[ Offset div sizeof( int64 ) ], Result, sizeof( Result ) ) ;
end ;
The Get_Item method retrieves a value from the list and returns it. First we clear any pending errors, then we check that we have a store assigned, and that the index is within the range of 0 to Count - 1. If any of these checks fail, we set an error and return 0. Otherwise, we calculate the allocation cluster index (AC) and offset within that cluster. We get the cluster's offset from the ACM (in I), make sure our buffer is appropriately sized, then read the cluster into the buffer. Finally, we return the appropriate value to the caller.
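Here is the index arithmetic worked through for one item, as a minimal sketch assuming Free Pascal in objfpc mode and a clustersize of 64 bytes (a delta of 8). The variable names follow Get_Item; the program itself is only an illustration.

```pascal
program ClusterIndex ;

{$mode objfpc}

const Clustersize = 64 ; // Bytes per cluster (a delta of 8)

var Index : longint ;
    AC : int64 ; // Which cluster in the allocation chain
    Offset : longint ; // Byte offset within that cluster

begin
    Index := 19 ;
    AC := Index * sizeof( int64 ) div Clustersize ; // 152 div 64 = 2
    Offset := Index * sizeof( int64 ) - AC * Clustersize ; // 152 - 128 = 24
    writeln( 'Item ', Index, ' is in cluster ', AC,
        ', byte offset ', Offset,
        ', slot ', Offset div sizeof( int64 ) ) ; // Slot 3
end.
```

Item 19 lands in the third cluster (index 2) at slot 3, so only that one 64-byte cluster has to be read from the heap.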
procedure TStore_List.Set_Item( Index : integer ; Value : int64 ) ;
var AC, I : int64 ;
Offset : longint ;
UEC : TUnified_Exception ;
begin
Set_Last_Error( nil ) ;
if( ( Store = nil ) or ( Index < 0 ) or ( Index >= Header.Max ) or ( ACM = nil ) ) then
begin
Set_Last_Error( Create_Simple_UE( Store_List_Facility, 1, Store_ListErr_Index_Error,
UE_Error, 'Index error', '' ) ) ;
exit ;
end ;
AC := Index * sizeof( int64 ) div ACM.Clustersize ;
I := ACM.Get_Allocation_Cluster( AC ) ;
Offset := Index * sizeof( int64 ) - AC * ACM.Clustersize ;
ReAllocmem( Buf, ACM.Clustersize ) ;
Store.Read_Data( Buf^[ 0 ], I, ACM.Clustersize, UEC ) ;
move( Value, Buf^[ Offset div sizeof( int64 ) ], sizeof( Value ) ) ;
Store.Write_Data( Buf^[ 0 ], I, ACM.Clustersize, UEC ) ;
end ;
The Set_Item method is very similar to Get_Item. We do a sanity check and cause an error if something is wrong. Then we calculate the allocation cluster index and offset within the cluster. Then we reallocate the buffer to the necessary size and read the allocation cluster into the buffer. Finally, we move the value into the buffer and write it back out to the heap.
Using the ACM simplifies the implementation of the store list. The reason it works is that file heaps are a store like any other and can be used interchangeably. However, we need a couple of changes to TUOS_File_Heap to make sure it works properly in this case. In short, we need to make sure that the overrides for certain operations map onto the appropriate new file heap methods. To wit:
function TUOS_File_Heap.Allocate( Size : TStore_Address64 ) : TStore_Address64 ;
begin
Result := Getmem( Size ) ;
end ;
procedure TUOS_File_Heap.Deallocate( PTR, Size : TStore_Address64 ) ;
begin
Freemem( PTR ) ;
end ;
String Lists
The third class that we will use with file heaps is the store string list. This is identical to the store list except that the items in the list are not arbitrary integer values. Rather, they are pointers to string headers in the heap. Of course, we could use a normal store list and assign the string pointers manually. However, this next class will provide an easier interface and will manage the allocation and deallocation of the strings automatically since it knows exactly what the values stored in the list are.
The TStore_String_List class implements a string list for us. The list will logically look like an array of strings, starting at offset 0.
Like the previous classes, string lists have a header. Because the string list is nothing more than a variant of a normal list, the header is exactly the same for both classes, although a different prefix value is used to uniquely identify the structure. TStore_String_List is a descendent of TStore_List. Here is the class definition, and related constants:
TStore_String_List = class( TStore_List )
protected
_Format : longint ;
public
function Get_Item( Index : integer ) : string ;
procedure Set_Item( Index : integer ; Value : string ) ;
function Get_Format : longint ;
procedure Set_Format( Value : longint ) ;
public // Override...
function Valid_Header : boolean ; override ;
public // API...
function Add( _Value : string ) : longint ;
procedure Delete ; override ;
procedure Set_Count( _Value : longint ) ; override ;
function IndexOf( _Value : string ; CI : boolean ) : longint ;
public // Properties...
property Items[ Index : integer ] : string
read Get_Item
write Set_Item ;
default ;
end ; // TStore_String_List
const Store_String_List_Facility = -1 ;
Store_String_ListErr_Success = 0 ;
Store_String_ListErr_Invalid_Object = 1 ;
Store_String_ListErr_Index_Error = 2 ;
Store_String_ListErr_Creation_Error = 3 ;
The main differences from the TStore_List ancestor are that we redefine the Items property (and its getter/setter) to deal with strings instead of integers, and that we override Delete and Set_Count so that the strings themselves are released. We also add a new method named IndexOf, and a _Format instance variable that is used in IndexOf.
function TStore_String_List.Get_Format : longint ;
begin
Result := _Format ;
end ;
procedure TStore_String_List.Set_Format( Value : longint ) ;
begin
_Format := Value ;
end ;
function TStore_String_List.Valid_Header : boolean ;
begin
Set_Last_Error( nil ) ;
Result := False ;
if( Header.Version >= 10 ) then // Unknown version
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
if( Header.Prefix <> String_List_Prefix ) then // Not a string list header
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
Result := True ;
end ;
Get_Format and Set_Format allow access to the _Format instance variable. The Valid_Header method is identical to the ancestor's method except that we compare the Prefix field of the structure to String_List_Prefix instead of List_Prefix.
procedure TStore_String_List.Delete ;
begin
Count := 0 ;
inherited Delete ;
end ;
function TStore_String_List.Add( _Value : string ) : longint ;
var S : TStore_String ;
begin
Result := -1 ; // Assume failure
S := Create_Store_String( Store, _Value ) ; // Create string
if( S = nil ) then
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Creation_Error,
UE_Error, 'Creation error', '' ) ) ;
exit ;
end ;
Result := inherited Add( S.Address ) ;
S.Free ; // Free the in-memory instance; the string itself stays in the heap
end ;
The Delete method sets the count to 0, which will result in all strings being deleted (see following code), and then calls the inherited method to do the rest of the work.
The Add function takes a string, creates a store string for that value, and then calls the inherited Add with the address of the string's header.
procedure TStore_String_List.Set_Count( _Value : longint ) ;
var Loop : longint ;
I : int64 ;
S : TStore_String ;
begin
// Delete strings that we are removing...
for Loop := Count - 1 downto _Value do
begin
I := inherited Get_Item( Loop ) ;
if( I <> 0 ) then
begin
S := Get_Store_String( Store, I ) ;
if( S <> nil ) then
begin
S.Delete ; // Delete the string from the heap
S.Free ; // Free the in-memory instance
end ; // if( S <> nil )
end ; // if( I <> 0 )
inherited Set_Item( Loop, 0 ) ;
end ; // for Loop := Count - 1 downto _Value
inherited Set_Count( _Value ) ;
end ;
As mentioned above, since we know that our list contains pointers to string headers, when the count is reduced, we need to release any strings that are above the new count. We also set such pointers to 0 so that if we expand the count again, we don't have pointers to deleted data in the list. Once we've deleted the strings, we call the inherited Set_Count method.
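The same shrink-and-clean-up pattern can be shown with ordinary objects in memory. This is a sketch under stated assumptions, not the store code: the global Items array and this stand-alone Set_Count are hypothetical stand-ins, with nil playing the role of a 0 pointer in the heap.

```pascal
program ShrinkSketch ;

{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

// Hypothetical in-memory stand-in: each slot owns an object, and nil plays the role of pointer 0.
var Items : array of TObject ;

procedure Set_Count( Value : integer ) ;
var Loop : integer ;
begin
    // Release whatever is owned by the slots that are about to disappear...
    for Loop := length( Items ) - 1 downto Value do
    begin
        Items[ Loop ].Free ; // Free is safe to call on nil
        Items[ Loop ] := nil ; // Never leave a pointer to released data behind
    end ;
    setlength( Items, Value ) ; // Slots added by growing start out as nil
end ;

begin
    Set_Count( 3 ) ; // Growing: the loop does not run
    Items[ 0 ] := TObject.Create ;
    Items[ 2 ] := TObject.Create ;
    Set_Count( 1 ) ; // Shrinking: slot 2 is freed, slot 1 was already nil
    writeln( length( Items ), ' ', Items[ 0 ] <> nil ) ;
    Set_Count( 0 ) ; // Releases the remaining object
end.
```

The important detail is the order: the owned data is released before the count changes, because afterwards the list no longer knows about those slots.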
function TStore_String_List.Get_Item( Index : integer ) : string ;
var P : int64 ;
S : TStore_String ;
begin
Set_Last_Error( nil ) ;
Result := '' ;
if( ( Store = nil ) or ( Index < 0 ) or ( Index >= Header.Max ) or ( ACM = nil ) ) then
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Index_Error,
UE_Error, 'Index error', '' ) ) ;
exit ;
end ;
P := inherited Get_Item( Index ) ;
if( P = 0 ) then // Null pointer
begin
exit ;
end ;
S := Get_Store_String( Store, P ) ;
if( S = nil ) then // Not a valid string
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
try
Result := S.Value ;
finally
S.Free ;
end ;
end ;
Get_Item does the same sanity checks as the ancestor's method. The difference is that it gets an instance of the store string and returns that value. This is done via the Get_Store_String function, which we will look at later in this article. Once we have the string's value, we free the TStore_String instance.
procedure TStore_String_List.Set_Item( Index : integer ; Value : string ) ;
var P : int64 ;
S : TStore_String ;
begin
Set_Last_Error( nil ) ;
if( ( Store = nil ) or ( Index < 0 ) or ( Index >= Header.Max ) or ( ACM = nil ) ) then
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Index_Error,
UE_Error, 'Index error', '' ) ) ;
exit ;
end ;
P := inherited Get_Item( Index ) ;
if( P <> 0 ) then // Pre-existing string
begin
S := Get_Store_String( Store, P ) ;
if( S = nil ) then
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Invalid_Object,
UE_Error, 'Invalid object', '' ) ) ;
exit ;
end ;
S.Value := Value ; // Change string
end else
begin
S := Create_Store_String( Store, Value ) ; // Create string
if( S = nil ) then
begin
Set_Last_Error( Create_Simple_UE( Store_String_List_Facility, 1, Store_String_ListErr_Creation_Error,
UE_Error, 'Creation error', '' ) ) ;
exit ;
end ;
inherited Set_Item( Index, S.Address ) ;
end ;
S.Free ;
end ; // TStore_String_List.Set_Item
Set_Item does the same sanity checks as the ancestor method. What we do next depends upon the current value of the specified index in the list. If it is 0, there is no string associated with the index (it is considered a null string). In such a case, we create a new store string and then set the index to the pointer for that new string. If the value is non-zero, we load the existing string and change its value. We will discuss both Get_Store_String and Create_Store_String later in this article.
function TStore_String_List.IndexOf( _Value : string ; CI : boolean ) : longint ;
var Loop : longint ;
S : TStore_String ;
U_S : TUOS_String ;
begin
Result := -1 ; // Assume not found
if( ( Store = nil ) or ( Count = 0 ) ) then
begin
exit ;
end ;
S := TStore_String.Create ;
U_S := TUOS_String.Create ;
try
if( CI ) then
begin
U_S.Assign_From_String( _Value, _Format ) ;
U_S.Lowercase ;
_Value := U_S.As_String ;
end ;
IndexOf is used to find a string within the list. The string to search for is passed, along with a flag indicating whether the comparisons are case-sensitive or case-insensitive. The result is the (first) index of the string, or -1 if the string wasn't found. We will talk about TUOS_String at the end of the article, but the point of it here is to create a lowercase version of the passed string. If the CI parameter (Case Insensitive) is true, we convert the passed string to lowercase once, before we do a bunch of string comparisons. We create a single instance of a TStore_String, which we will use to obtain the strings from the heap.
S.Store := Store ;
for Loop := 0 to Count - 1 do
begin
S.Address := inherited Get_Item( Loop ) ;
if( CI ) then
begin
U_S.Assign_From_String( S.Value, SF_UTF8 ) ;
U_S.Lowercase ;
if( U_S.As_String = _Value ) then
begin
Result := Loop ;
exit ;
end ;
end else
if( S.Value = _Value ) then
begin
Result := Loop ;
exit ;
end ;
end ;
finally
S.Free ;
U_S.Free ;
end ;
end ; // TStore_String_List.IndexOf
The main portion of the method is a loop that starts at offset 0 and ends at the last string in the list (Count - 1). Within the loop, we set the S instance Address to the pointer value. Then if we are doing a case-insensitive operation, we convert the string to lowercase. Either way, we compare the values and return the index if the string was found. At the end of the method, we free the string objects.
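The search strategy (fold the key once, then fold each candidate as it is visited) can be sketched without a file heap. The Index_Of function and the Names array below are hypothetical stand-ins, and SysUtils.LowerCase folds only ASCII letters, so this illustrates the control flow rather than the Unicode handling.

```pascal
program IndexOfSketch ;

{$ifdef FPC}{$mode objfpc}{$H+}{$endif}

uses SysUtils ;

// Hypothetical stand-in for the list contents.
const Names : array[ 0..2 ] of string = ( 'SYSTEM', 'Guest', 'guest' ) ;

// Returns the first index of Value in Items, or -1 if it is not present.
// SysUtils.LowerCase only folds ASCII letters, unlike TUOS_String.Lowercase.
function Index_Of( const Items : array of string ; Value : string ; CI : boolean ) : integer ;
var Loop : integer ;
begin
    Result := -1 ; // Assume not found
    if( CI ) then
    begin
        Value := LowerCase( Value ) ; // Fold the search key once, outside the loop
    end ;
    for Loop := 0 to high( Items ) do
    begin
        if( ( CI and ( LowerCase( Items[ Loop ] ) = Value ) ) or
            ( ( not CI ) and ( Items[ Loop ] = Value ) ) ) then
        begin
            Result := Loop ;
            exit ;
        end ;
    end ;
end ;

begin
    writeln( Index_Of( Names, 'guest', True ) ) ; // Case-insensitive: matches 'Guest' at index 1
    writeln( Index_Of( Names, 'guest', False ) ) ; // Case-sensitive: matches 'guest' at index 2
    writeln( Index_Of( Names, 'nobody', True ) ) ; // Not present: -1
end.
```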
API
There are three basic operations that we want to perform on these three store structures: creating a new one, deleting an old one, and accessing an existing one. Although we can construct the instances as needed, it would be nice to have a single function call that would allow us to do each one. In fact, we saw that the TStore_String_List class methods used some of these API calls. So, let's look at them - one for each of the classes for all three operations.
function Create_Store_String( Store : TUOS_File_Heap ;
S : string ) : TStore_String ;
var I : int64 ;
begin
Result := nil ;
if( Store = nil ) then
begin
exit ;
end ;
I := Store.Getmem( sizeof( TStore_String_Header ) ) ;
if( I = 0 ) then
begin
exit ;
end ;
Result := TStore_String.Create ;
fillchar( Result.Header, sizeof( Result.Header ), 0 ) ;
Result.Header.Prefix := String_Prefix ;
Result.Header.Version := 1 ;
Store.Write( I, sizeof( Result.Header ), Result.Header ) ;
Result.Store := Store ;
Result.Address := I ;
Result.Value := S ;
end ;
function Create_Store_List( Store : TUOS_File_Heap ;
_Delta : integer ) : TStore_List ;
var I : int64 ;
begin
Result := nil ;
if( Store = nil ) then
begin
exit ;
end ;
I := Store.Getmem( sizeof( TStore_List_Header ) ) ;
if( I = 0 ) then
begin
exit ;
end ;
Result := TStore_List.Create ;
fillchar( Result.Header, sizeof( Result.Header ), 0 ) ;
Result.Header.Prefix := List_Prefix ;
Result.Header.Version := 1 ;
_Delta := ( _Delta + Store.Min_Storage - 1 ) and ( not ( Store.Min_Storage - 1 ) ) ;
Result.Header.Delta := _Delta ;
Store.Write( I, sizeof( Result.Header ), Result.Header ) ;
Result.Store := Store ;
Result.Address := I ;
end ; // Create_Store_List
function Create_Store_String_List( Store : TUOS_File_Heap ;
_Delta : integer ) : TStore_String_List ;
var I : int64 ;
begin
Result := nil ;
if( Store = nil ) then
begin
exit ;
end ;
I := Store.Getmem( sizeof( TStore_List_Header ) ) ;
if( I = 0 ) then
begin
exit ;
end ;
Result := TStore_String_List.Create ;
fillchar( Result.Header, sizeof( Result.Header ), 0 ) ;
Result.Header.Prefix := String_List_Prefix ;
Result.Header.Version := 1 ;
_Delta := ( _Delta + Store.Min_Storage - 1 ) and ( not ( Store.Min_Storage - 1 ) ) ;
Result.Header.Delta := _Delta ;
Store.Write( I, sizeof( Result.Header ), Result.Header ) ;
Result.Store := Store ;
Result.Address := I ;
end ; // Create_Store_String_List
These functions create a new instance of one of the classes in the file heap. In all cases, the store in question is passed to the function. In the case of the store string, the string value is passed. For the other two, a delta value is passed, as described previously. Although the code is slightly different in each case, the basic operation is the same: validate the store, allocate room in the heap for a header of the appropriate type, create an instance of the appropriate class, initialize its header, and write the header to the store. For the two list types, the delta is rounded up to a multiple of the store's Min_Storage value before it is stored in the header. The instance's store and address are then set. In the case of a store string, the passed string value is set in the string object. Lists are always created empty. The instance is then returned to the caller; if the store is nil or the allocation fails, the function returns nil instead.
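The expression used to adjust the delta is a common bit trick for rounding up to a granularity. As a sketch (Round_Up is a hypothetical helper name, and the trick only works when the granularity is a power of two, which we assume here for Min_Storage):

```pascal
program RoundDelta ;

{$ifdef FPC}{$mode objfpc}{$endif}

// Round Value up to the next multiple of Granularity.
// Only valid when Granularity is a power of two.
function Round_Up( Value, Granularity : integer ) : integer ;
begin
    Result := ( Value + Granularity - 1 ) and ( not ( Granularity - 1 ) ) ;
end ;

begin
    writeln( Round_Up( 1, 16 ) ) ; // 16
    writeln( Round_Up( 16, 16 ) ) ; // 16: already a multiple, so unchanged
    writeln( Round_Up( 17, 16 ) ) ; // 32
end.
```

Adding Granularity - 1 pushes any value that is not already a multiple past the next boundary, and the mask then clears the low bits to land exactly on that boundary.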
procedure Delete_Store_String( Store : TUOS_File_Heap ; Address : int64 ) ;
var S : TStore_String ;
begin
S := Get_Store_String( Store, Address ) ;
if( S <> nil ) then
begin
S.Delete ;
S.Free ;
end ;
end ;
procedure Delete_Store_List( Store : TUOS_File_Heap ; Address : int64 ) ;
var S : TStore_List ;
begin
S := Get_Store_List( Store, Address ) ;
if( S <> nil ) then
begin
S.Delete ;
S.Free ;
end ;
end ;
procedure Delete_Store_String_List( Store : TUOS_File_Heap ; Address : int64 ) ;
var S : TStore_String_List ;
begin
S := Get_Store_String_List( Store, Address ) ;
if( S <> nil ) then
begin
S.Delete ;
S.Free ;
end ;
end ;
These three procedures are used to delete an existing file heap object. They all do the same operation: get an instance of the appropriate class, call its Delete method, and then free the instance. If no valid object exists at the given address, nothing happens.
function Get_Store_String( Store : TUOS_File_Heap ;
Address : int64 ) : TStore_String ;
var Header : TStore_String_Header ;
UEC : TUnified_Exception ;
begin
Result := nil ;
if( ( Store = nil ) or ( Address = 0 ) ) then
begin
exit ;
end ;
Store.Read_Data( Header, Address, sizeof( Header ), UEC ) ;
if( UEC <> nil ) then
begin
exit ;
end ;
Result := TStore_String.Create ;
Result.Header := Header ;
if( not Result.Valid_Header ) then // Unknown version
begin
Result.Free ;
Result := nil ;
exit ;
end ;
Result.Store := Store ;
Result.Address := Address ; // Force a read
end ; // Get_Store_String
function Get_Store_List( Store : TUOS_File_Heap ;
Address : int64 ) : TStore_List ;
var Header : TStore_List_Header ;
UEC : TUnified_Exception ;
begin
Result := nil ;
if( ( Store = nil ) or ( Address = 0 ) ) then
begin
exit ;
end ;
Store.Read_Data( Header, Address, sizeof( Header ), UEC ) ;
if( UEC <> nil ) then
begin
exit ;
end ;
Result := TStore_List.Create ;
Result.Header := Header ;
if( not Result.Valid_Header ) then // Unknown version
begin
Result.Free ;
Result := nil ;
exit ;
end ;
Result.Store := Store ;
Result.Address := Address ; // Force a read
end ; // Get_Store_List
function Get_Store_String_List( Store : TUOS_File_Heap ;
Address : int64 ; Format : longint = SF_UTF8 ) : TStore_String_List ;
var Header : TStore_List_Header ;
UEC : TUnified_Exception ;
begin
Result := nil ;
if( ( Store = nil ) or ( Address = 0 ) ) then
begin
exit ;
end ;
Store.Read_Data( Header, Address, sizeof( Header ), UEC ) ;
if( UEC <> nil ) then
begin
exit ;
end ;
Result := TStore_String_List.Create ;
Result.Header := Header ;
if( not Result.Valid_Header ) then // Unknown version
begin
Result.Free ;
Result := nil ;
exit ;
end ;
Result.Format := Format ;
Result.Store := Store ;
Result.Address := Address ; // Force a read
end ; // Get_Store_String_List
These final three functions are used to obtain an instance of one of the classes that is wrapped around the file heap data. The basic operation is the same in all cases: verify the parameters, read the header from the store, create an instance of the appropriate class, assign the header, and check the header for validity. If the passed parameters are invalid, the read fails, or the header doesn't match the type of class, we return nil. Otherwise, we set the store and address for the instance and return it to the caller. In Get_Store_String_List, the instance's format value is set from the Format parameter. The format defaults to Unicode UTF-8, which is what UOS uses.
TUOS_String
You may recall the TUnicode_String class from article 17. This was a class optimized for use with the file system, so it isn't suited for more general use. TUOS_String is a more generalized form of TUnicode_String. We covered TUnicode_String thoroughly in the aforementioned article, so we won't go over the details of TUOS_String. The essential differences are that TUnicode_String has a fixed size whereas TUOS_String is dynamically sized, and that TUnicode_String has support for wildcard matching whereas TUOS_String does not. The ability to dynamically size the string is accomplished via a dynamic array. Since the size of the array is managed by the array, we don't need to store the size ourselves. Hence, instead of using offset 0 to store the length (as we did in TUnicode_String), we use offset 0 for the first character of content. Thus, the data for TUnicode_String starts at offset 1 whereas it starts at offset 0 in TUOS_String. This change in starting offset also resulted in several related changes in TUOS_String from the TUnicode_String code. However, despite the internal offset change, externally the string starts with offset 1. Here is the implementation of the class:
const SF_ASCII = 1 ;
SF_UTF16 = 2 ;
SF_UTF8 = 3 ;
SF_UTF32 = 4 ;
type TUOS_String = class
private // Instance data...
Contents : array of cardinal ;
protected // Property handlers...
// Return length of our contents...
function Get_Length : integer ;
procedure Set_Length( Value : integer ) ;
public // API...
function As_String : string ;
// Assign our contents from a string in the given format (SF_*)...
procedure Assign_From_String( const S : string ;
Format : integer ) ;
// Return a new string holding a portion of our contents...
function Copy( Start, Len : integer ) : TUOS_String ;
// Return true if our contents are equal to the match
function Equal( Match : TUOS_String ) : boolean ;
// Insert character at given position
procedure Insert( Position : integer ;
Value : cardinal ) ;
// Convert our characters to lowercase...
procedure Lowercase ;
// Position of substring...
function Pos( const Value : string ;
Start : integer = 1 ) : integer ; overload ;
function Pos( const Value : TUOS_String ;
Start : integer = 1 ) : integer ; overload ;
// Return rightmost instance of Value
function RPos( Value : char ) : integer ;
public // Properties...
property Length : integer
read Get_Length
write Set_Length ;
end ; // TUOS_String
// API...
function TUOS_String.As_String : string ;
var Dummy, Loop : integer ;
begin
System.setlength( Result, Length ) ;
for Loop := 0 to Length - 1 do
begin
Dummy := Contents[ Loop ] ;
if( Dummy > 127 ) then
begin
Dummy := Dummy or 128 ;
end ;
Result[ Loop + 1 ] := chr( Dummy ) ;
end ;
end ;
procedure TUOS_String.Assign_From_String( const S : string ;
Format : integer ) ;
var Index, Size, Mask : integer ;
Value : cardinal ;
begin
Index := 1 ; // Index in S
setlength( Contents, 0 ) ;
if( Format = SF_UTF8 ) then // UTF8
begin
while( Index <= system.length( S ) ) do
begin
Value := 0 ;
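if( S[ Index ] >= #$FC ) then // 1111110x: six-byte sequence
begin
Size := 6 ;
Mask := 1 ;
end else
if( S[ Index ] >= #$F8 ) then // 111110xx: five-byte sequence
begin
Size := 5 ;
Mask := 3 ;
end else
if( S[ Index ] >= #$F0 ) then // 11110xxx: four-byte sequence
begin
Size := 4 ;
Mask := 7 ;
end else
if( S[ Index ] >= #$E0 ) then // 1110xxxx: three-byte sequence
begin
Size := 3 ;
Mask := $F ;
end else
if( S[ Index ] >= #$C0 ) then // 110xxxxx: two-byte sequence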
begin
Size := 2 ;
Mask := $1F ;
end else
begin
Size := 1 ;
Mask := $7F ;
end ;
while( Size > 0 ) do
begin
dec( Size ) ;
Value := Value or ( ord( S[ Index ] ) and Mask ) ;
if( Size > 0 ) then
begin
Value := Value shl 6 ;
end ;
Mask := $3F ;
inc( Index ) ;
end ;
setlength( Contents, system.length( Contents ) + 1 ) ;
Contents[ system.length( Contents ) - 1 ] := Value ;
end ; // while( Index <= system.length( S ) )
end else
begin
Value := 0 ;
Index := 1 ; // Index in S
while( Index <= system.length( S ) ) do
begin
move( PChar( S )[ Index - 1 ], Value, Format ) ;
Index := Index + Format ;
setlength( Contents, system.length( Contents ) + 1 ) ;
Contents[ system.length( Contents ) - 1 ] := Value ;
end ;
end ; // if( Format = SF_UTF8 )
end ; // TUOS_String.Assign_From_String
function TUOS_String.Get_Length : integer ;
begin
Result := system.length( Contents ) ;
end ;
procedure TUOS_String.Set_Length( Value : Integer ) ;
begin
setlength( Contents, Value ) ;
end ;
function lowcase( V : cardinal ; var _Folding_Index : integer ) : cardinal ;
var L, H, Index : integer ;
begin
Result := V ;
_Folding_Index := -1 ; // Indicates no translation
if( ( V < $41 ) or ( V > $118BF ) ) then // Not within range of our table
begin
exit ;
end ;
L := 0 ;
H := high( Foldings ) + 1 ; // Exclusive upper bound, so the last table entry is searched too
while( L < H ) do
begin
Index := L + ( ( H - L ) div 2 ) ;
if( V = Foldings[ Index, 0 ] ) then
begin
if( Foldings[ Index, 2 ] > 0 ) then // Multiple outputs
begin
Result := 0 ;
_Folding_Index := Index ;
exit ;
end ;
Result := Foldings[ Index, 1 ] ;
exit ;
end else
if( V < Foldings[ Index, 0 ] ) then
begin
H := Index ;
end else
begin
L := Index + 1 ;
end ;
if( L >= H ) then
begin
exit ;
end ;
end ; // while( L < H )
end ; // lowcase
function TUOS_String.Copy( Start, Len : integer ) : TUOS_String ;
begin
// Setup...
Result := TUOS_String.Create ;
if( Start > Length ) then
begin
exit ;
end ;
if( Start < 1 ) then
begin
Start := 1 ;
end ;
if( Start + Len - 1 > Length ) then
begin
Len := Length - Start + 1 ;
end ;
if( Len <= 0 ) then // Nothing to copy
begin
exit ;
end ;
Result.Length := Len ;
move( Contents[ Start - 1 ], Result.Contents[ 0 ], Len * sizeof( cardinal ) ) ;
end ; // TUOS_String.Copy
function TUOS_String.Equal( Match : TUOS_String ) : boolean ;
var Loop : integer ;
begin
Result := False ;
if( Length <> Match.Length ) then
begin
exit ;
end ;
for Loop := 0 to Length - 1 do
begin
if( ( Contents[ Loop ] <> Match.Contents[ Loop ] ) ) then
begin
exit ;
end ;
end ;
Result := True ;
end ; // TUOS_String.Equal
procedure TUOS_String.Insert( Position : integer ; Value : cardinal ) ;
var Old_Length : integer ;
begin
Old_Length := system.length( Contents ) ;
if( ( Position < 0 ) or ( Position > Old_Length ) ) then
begin
exit ;
end ;
setlength( Contents, Old_Length + 1 ) ;
if( Position < Old_Length ) then // Shift the tail up by one element
begin
move( Contents[ Position ], Contents[ Position + 1 ], ( Old_Length - Position ) * sizeof( cardinal ) ) ;
end ;
Contents[ Position ] := Value ;
end ;
procedure TUOS_String.Lowercase ;
var Dummy, V : integer ;
_Folding_Index : integer ;
begin
Dummy := 0 ;
while( Dummy < Length ) do
begin
V := lowcase( Contents[ Dummy ], _Folding_Index ) ;
if( ( V = 0 ) and ( _Folding_Index >= 0 ) ) then
begin
Contents[ Dummy ] := Foldings[ _Folding_Index, 1 ] ;
for V := 2 to 3 do
begin
if( Foldings[ _Folding_Index, V ] <> 0 ) then
begin
Insert( Dummy + 1, Foldings[ _Folding_Index, V ] ) ;
inc( Dummy ) ;
end ;
end ;
end else
begin
Contents[ Dummy ] := V ;
end ;
inc( Dummy ) ;
end ;
end ; // TUOS_String.Lowercase
function TUOS_String.Pos( const Value : TUOS_String ;
Start : integer = 1 ) : integer ;
var Dummy, Dummy1 : integer ;
Found : boolean ;
begin
Result := 0 ;
if( Start > Length ) then
begin
exit ;
end ;
if( Value.Length > Length - Start + 1 ) then
begin
exit ; // Substring is longer than our contents
end ;
for Dummy := Start to Length - Value.Length + 1 do
begin
Found := True ;
for Dummy1 := 1 to Value.Length do
begin
if( Value.Contents[ Dummy1 - 1 ] <> Contents[ Dummy1 + Dummy - 2 ] ) then // Contents is 0-based
begin
Found := False ;
break ;
end ;
end ; // for Dummy1
if( Found ) then
begin
Result := Dummy ;
exit ;
end ;
end ; // for Dummy
end ; // TUOS_String.Pos
function TUOS_String.Pos( const Value : string ;
Start : integer = 1 ) : integer ;
var Dummy, Dummy1 : integer ;
Found : boolean ;
begin
Result := 0 ;
if( Start > Length ) then
begin
exit ;
end ;
if( System.Length( Value ) > Length - Start + 1 ) then
begin
exit ; // Substring is longer than our contents
end ;
for Dummy := Start - 1 to Length - system.length( Value ) do
begin
Found := True ;
for Dummy1 := 1 to system.length( Value ) do
begin
if( ord( Value[ Dummy1 ] ) <> Contents[ Dummy1 + Dummy - 1 ] ) then
begin
Found := False ;
break ;
end ;
end ; // for Dummy1
if( Found ) then
begin
Result := Dummy + 1 ;
exit ;
end ;
end ; // for Dummy
end ; // TUOS_String.Pos
function TUOS_String.RPos( Value : char ) : integer ;
var Loop : integer ; // Signed, so the loop is skipped when the string is empty
V : cardinal ;
begin
V := ord( Value ) ;
for Loop := Length - 1 downto 0 do
begin
if( Contents[ Loop ] = V ) then
begin
Result := Loop + 1 ;
exit ;
end ;
end ;
Result := 0 ;
end ;
The main point of TUOS_String is to handle Unicode in terms of converting between different Unicode formats and doing case conversions. The Pascal String that we use is really just a byte collection. This collection of bytes can be interpreted in any way. However, if we use the built-in lowercase function, it assumes an ASCII string. Thus, we cannot use Pascal's lowercase function. We can pass data around using Pascal strings, but if we are going to interpret the data as UTF-8 for such things as converting to lower case, we use TUOS_String.
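To see why a byte-oriented function cannot safely be applied to UTF-8 data, it helps to decode a few bytes by hand. The following sketch is not the class code: Next_Code_Point is a hypothetical helper, it works on a byte array rather than a Pascal string so that it does not depend on the compiler's string type, it only handles sequences of up to four bytes, and it does no validation.

```pascal
program Utf8Decode ;

{$ifdef FPC}{$mode objfpc}{$endif}

// "A", U+00E9 and U+20AC as raw UTF-8 bytes.
const Data : array[ 0..5 ] of byte = ( $41, $C3, $A9, $E2, $82, $AC ) ;

// Decode the sequence starting at Bytes[ Index ] and advance Index past it.
// A sketch only: continuation bytes are not validated and overlong forms are not rejected.
function Next_Code_Point( const Bytes : array of byte ; var Index : integer ) : cardinal ;
var Size : integer ;
    Mask : byte ;
begin
    // The lead byte tells us how many bytes make up the character...
    if( Bytes[ Index ] >= $F0 ) then begin Size := 4 ; Mask := 7 ; end else
    if( Bytes[ Index ] >= $E0 ) then begin Size := 3 ; Mask := $F ; end else
    if( Bytes[ Index ] >= $C0 ) then begin Size := 2 ; Mask := $1F ; end else
    begin Size := 1 ; Mask := $7F ; end ;
    Result := 0 ;
    while( ( Size > 0 ) and ( Index <= high( Bytes ) ) ) do
    begin
        Result := ( Result shl 6 ) or ( Bytes[ Index ] and Mask ) ;
        Mask := $3F ; // Continuation bytes carry six payload bits
        inc( Index ) ;
        dec( Size ) ;
    end ;
end ;

var Index : integer ;

begin
    Index := 0 ;
    while( Index <= high( Data ) ) do
    begin
        writeln( Next_Code_Point( Data, Index ) ) ; // Prints 65, then 233, then 8364
    end ;
end.
```

Six bytes yield three characters. A routine that treats each byte as a character sees six of them, and none of the bytes above $7F mean what an ASCII case conversion would assume, which is why the conversion has to happen on decoded code points.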
In the next article, we will make use of these classes and a file heap to implement the SYSUAF.DAT file.
Copyright © 2017 by Alan Conroy. This article may be copied
in whole or in part as long as this copyright is included.