OpenEdge Database Concepts
Storage areas are the foundation of your OpenEdge database and are used to store all of your database objects (equivalent to tablespaces or filegroups in other databases). Each storage area is a grouping of one or more extents (physical files) and each extent stores a collection of blocks.
There are three different categories of storage areas you need to be concerned about: Data Storage, Before Image and After Image areas.
Data storage areas store table and index data, either for your application or for the internal metaschema required by OpenEdge. The before image area exists to enable transaction rollbacks as well as to make sure that your database is transactionally consistent in case of a crash. The after image area enables data recovery by logging completed transactions and allowing you to apply those transactions against a copy of your database. Although enabling AI is optional it is highly recommended that you enable it to protect your data in case of corruption.
Blocks... Not Records
All reads or writes in OpenEdge are physically performed at a block level and not at an individual record level. This rule applies if you are reading from disk or from one of the buffer pools. This is an important distinction to understand when tuning the various storage settings for your database.
Once the database block is retrieved, OpenEdge parses the block to retrieve the appropriate record or index data for the request. Why this is so important will become clearer in the next few sections.
All Roads Lead To The Buffer Pool
In order for a database block to be read or written it must be placed into a buffer pool. If there are no free blocks available another block will have to be evicted from the buffer pool. If the block has been modified and not yet written to disk it will be written to disk prior to eviction.
This process applies to both index and data blocks used by your application. It also applies for probkup and the majority of OpenEdge utilities.
Indexes In Action
When your code looks up a record or a set of records the database must read index blocks in addition to the data blocks. Depending on the size and complexity of your index at least 2 index blocks will be read (either from memory or disk). If your index contains many columns or is badly fragmented it may read several more blocks.
Once the proper index blocks are retrieved, OpenEdge will parse the block for the ROWID for the first record that matches the index selection. The ROWID tells OpenEdge which block in the database that needs to be retrieved for that record.
This process will be repeated if OpenEdge finds more records that match the index criteria.
Once OpenEdge knows which block to read it will attempt to retrieve that block from the appropriate buffer pool or from disk. OpenEdge will then parse the block based on the ROWID to determine where in that block the first fragment of the desired record exists. If there are multiple fragments for that record OpenEdge will then retrieve each of the blocks that contain those fragments and assemble the entire record into memory.
Data Storage Areas
How and where OpenEdge decides to store records and index blocks depends on a number of factors; all of which determine how many IO operations are required to retrieve or store an individual record.
You can specify the storage area for each database object (table, index and LOB) at the time of creation or by moving the object with the proutil utility. Indexes and LOBs should be stored in different areas than the tables they belong to.
Tables should arranged in storage areas according to their record size, total number of records, maintenance needs and rate of growth. Do not group tables together by logical relationships or application usage.
Never use the "Schema Area" to store your application objects, it should be considered reserved for the OpenEdge internal metaschema objects.
OpenEdge supports two different versions of data storage areas; Type I areas and Type II areas. Type I areas should be considered deprecated and should only be used for the default "Schema Area" which is used to store the OpenEdge metaschema.
Properly configured Type II areas will perform much more efficiently than Type I areas and also support a number of features that Type I areas do not.
Database Block Size
At the current time OpenEdge does not allow you to set the data area block size. Instead it is inherited from the database block size. The block size defines the unit of measure not just for storage, but for how much data will be read or written at one time.
The supported values are 1K, 2K, 4K and 8K. Until OpenEdge supports larger block sizes you should always choose 8K as your database block size. This allows storing the most data possible in one block as well as reducing the number of IO operations required when writing to the database.
Records Per Block
For each data storage area you must define a records per block setting (RPB). This setting controls how many "slots" are available for record storage in each block. The maximum size of each slot can be roughly approximated by diving the block size by the RPB setting. If a record exceeds the size of a slot it will be split into as many fragments as needed in order to store the entire record.
When records are fragmented between database blocks; more disk IO and memory are required to access those records. How much extra IO is required is directly related to how many records are fragmented and how frequently those records are accessed.
Currently supported RPB values are 1, 2, 4, 8, 16, 32, 64, 128 and 256. Choosing the correct value will depend on the average size of the records you intend to store in that area.
Differences Between Type I And Type II Areas
Type II areas support the following features that Type I areas do not:
- Multi Tenant (segregating data based on user)
- Table Partitioning
- Buffer pool assignment at an object level
- Enhanced performance for index rebuilds
- Enhanced performance for table drops
- Segregating table data from other tables
- Support for reading tables without an index
- Extending database extents by a cluster of blocks instead of one block at a time
No new features are planned for Type I areas and they have been relatively unchanged since the 1990s. All of your application objects should be stored in appropriately configured Type II areas.
When Being Antisocial Is Appropriate
The most important performance issue with Type I areas is that database blocks are "social"; meaning that one database block can contain records from multiple tables. This is the main reason why databases with Type I areas need to be dumped and reloaded on a regular basis.
Mixing table data reduces the effectiveness of the buffer pools and increases the number of IO operations required to read data from that table. Assume that the example block below is representative of the database as a whole.
|Record Slot||Table Name|
If you wanted to read all 50 State records the database would be forced to read multiple blocks (as many as 50) and store all of those blocks in the buffer pool. In a properly configured Type II area the entire State table could fit in one block.
This can make a substantial performance difference even if you are fortunate enough to have enough memory to cache your entire database in memory. Eliminating unnecessary IO operations and making more effective use of the buffer pools is key to getting the best performance out of your application.