Period BladeLet

The Period Bladelet provides SQL support for managing information about fixed intervals in a timeline: real world phenomenon like the duration of a hotel stay, or the scheduled activity of some manufacturing equipment. The BladeLet implements a pair of OPAQUE TYPE User-Defined Type (UDT) instances, and a set of User-Defined Functions (UDFs) to perform the operations appropriate to these types. In addition to being a useful bundle of extensions in its own right, the Period Bladelet also provides a good example of several extensibility features: building an OPAQUE TYPE, how to use ORDBMS the Operator Class feature with R-Trees, a User-Defined Aggregate example, and a set of statistics and selectivity facilities.

Introduction and Background.

Design Overview.

User-Defined Types and User-Defined Functions

List of User-Defined Functions.

User-Defined Aggregate.

Code Walk Through.

Glossary and References

Introduction and Background.

The motivation for Period extensions to SQL is best explained with an example. Consider a scheduling system. For dramatic effect our example discusses a railway, but the same logic applies to hotel rooms, rental cars, production scheduling, personal appointments, and any application where a shared resource is being allocated for fixed periods of time. Temporal information is becoming an increasingly important part of many information systems. See Snodgrass, Richard T. Developing Time-Oriented Database Applications in SQL ISBN: 1-55860-436-7 for more details than there is space for here.
Figure 1, which is intended to convey a set of track usage schedules for railway trains, illustrates the basic concept of temporal intervals.

         Time:
Track  : T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15 T16 T17
Seg #:

  1      |--|  |--|  |-----|     |--|               |-------|

  2         |-----|     |-----|          |---|

  3            |--------------|     |-----------|       |---|

  4         |--|  |--|--|        |-----|        |-------|   |-------)

  5      |----------------------------------------------------------)

  6                                 |--|  

  7         |-----|     |-----|     |-------|       |---------------)


  8            |-----|  |--------|              |---|

  9     

 10      (-----|              |----------|   |-------|   |----------)

Figure 1: Diagrammatic Representation of Temporal Periods

Each number on the left identifies a track segment. Each of the '|---|' strings signifies that some train is scheduled onto a particular track segment for some period of time. Segment schedules all have a start date or datetime indicator, extend for some number of units of time, and then finish. Each of the single interval instances we call a period. More formally, a period is a fixed interval in the time line, which contrast periods with the SQL-92 INTERVAL type corresponding to floating intervals in a time-line.
In certain special cases we also need to consider schedules that are "open". A period's start may be unknown, or the situation may have 'always been thus', or the time at which a period's finishes may be undetermined. Railway schedules are not open-ended in this way but in applications such as command and control systems managing "open" Periods is a necessity..
A situation like the one illustrated in Figure 1 might be accommodated in a SQL-92 RDBMS with a table that looks like Figure 2:
CREATE TABLE Track_Schedule  (
        Train    INTEGER                 NOT NULL,
        Track    INTEGER                 NOT NULL,
        Starts   DATETIME YEAR TO SECOND NOT NULL,
        Ends     DATETIME YEAR TO SECOND NOT NULL,
        CHECK ( Starts < Ends ) CONSTRAINT Starts_must_preceed_Ends,
            FOREIGN KEY ( Train ) REFERENCES Trains ( Id ) CONSTRAINT Train_FK,
            FOREIGN KEY ( Track ) REFERENCES Tracks ( Id ) CONSTRAINT Track_FK
);

Figure 2: SQL-92 Handling of Temporal Periods

Figure 2's Track_Schedule table is a reasonable approach to the problem, but it has several deficiencies. First, querying this table's information in an application requires writing some fairly complex SQL-92 expressions. Second, within the limits of most RDBMS products it is impossible to index queries that perform temporal operations like overlap, within, and contains. And third, some behaviors of range data structures like Period cannot be reasonably accommodated within SQL-92. All of these problems are addressed by the Period Bladelet's functionality.
To illustrate each problem in more detail, let us examine a couple of fairly obvious business questions and see how they would be handled using the Track_Schedule table/SQL-92 approach. In each of the examples that follow we first list the question in plain language, and then present the corresponding query.
Clearly, it is rather desirable for our railway company to prevent two trains from being scheduled on the same track segment for overlapping periods. (Obviously the situation is more complex than this, but bear with us here.) Answering this question using just the standard SQL-92 predicates is more awkward than it needs to be.

"How are there any situations where two different trains are scheduled on the same track over the next week?"

SELECT T1.Track, T1.Train, T2.Train, T1.Starts, T1.When,
       T2.Start, T2.Ends
FROM Track_Schedule T1, Track_Schedule T2
WHERE T1.Finish >= CURRENT
   AND T1.Start <= CURRENT + 7 UNIT DAY
   AND T2.Start <= T1.Finish
   AND T2.Finish >= T1.Start
   AND T1.Track = T2.Track
   AND T1.Start <> T2.Start
   AND T1.End    <> T2.End
   AND T1.Train <> T2.Train;

Figure 3: SQL-92 Query Performing Temporal Overlaps Predicate

While this is not a particularly complex query expressions, it takes a rather close reading of the query's logical details to figure out that its WHERE clause is computing an Overlap(). Other temporal expressions, such as Contains() and Within() are similar in form, and this similarity makes it difficult to spot errors in temporal expressions, particularly because applications that care about overlapping ranges employ these query language patterns often. And what about more complex kinds of query; ones that join tables based on overlapping intervals such as how many stays in a hotel overlap a conference or a special price? Simplifying and clarifying declarative SQL would reduce many of these problems.
In Figure 4, we present the way this same question would be answered using the facilities of the Period BladeLet. (Look ahead to Figure 6 for the re-defined Track_Schedule table used in this query). Even though it accomplishes more than the expression in Figure 3, the SQL in Figure 4 is clearly simpler, and it is fairly obvious what it is that the query is attempting to do.
SELECT T1.Track, T1.Train, T2.Train, GetIntersect ( T1.When, T2.When)
   FROM Track_Schedule T1, Track_Schedule T2
WHERE Overlap ( T1.When, DT_Period( CURRENT, CURRENT + 7 UNITS DAY) )
   AND Overlap ( T1.When, T2.When )
   AND T1.When <> T2.When
   AND T1.Track = T2.Track
   AND T1.Train <> T2.Train;

Figure 4: Query Performing Temporal Overlaps Predicate with Period BladeLet

The second problem with the query in Figure 3 relates to performance. Developers familiar with SQL can tell by looking at the WHERE clause in Figure 3 that an RDBMS will probably be unable to use an index to accelerate this query. The best that can be hoped for is that the DBA builds a compound B-Tree < Starts, Ends > or < Ends, Starts > index. But because of the different legs of the query uses a different operator (<= or >=) with different columns, the best plan that a SQL-92 RDBMS can come up with is to scan of both indices and a hash join to identify matching rows. This is generally so expensive that the query planner is likely to punt on the question and simply scan the entire table.
By contrast, both Overlap() predicates in Figure 4, and a range of other temporal operations, are amenable to indexing. An R-Tree on the Track_Schedule.When column can be used both to limit the segments checked to only those over the next week, and also on the inner of a join between the tables. In contrast with some other approaches, there is no absolute requirement to use a resolution table. Sometimes, a HASH join is preferred to a NEST LOOP. Sometimes, it is not. Part of the Period BladeLet's functionality is a statistics and selectivity estimator to help make this choice.
Finally, some intuitively simple temporal business questions are enormously difficult to express in SQL-92. Sometimes, for example, it is desirable to determine whether or not there is some fixed interval of time that all of the Periods in some set overlap, and if they do, the extent of this overlap is. That is, is there a period of time during which all of a set of events are occuring simultaneously? Performing this operation in SQL-92 requires finding the maximum starting time and minimum finish time for which there exists no non-overlapping period in the set. Figure 5 illustrates how this operation is handled using the Period BladeLet.

"Between 10:30am and 10:35am on 10th April, 2001 there will be a scheduled reduction in power to electrical rail in Montana. This is not a problem unless all tracks are in use at the same time. Is this the case?"

SELECT Min_Overlap( TS.When )
FROM Track_Schedule TS, Track_Segment TR
WHERE TS.Track = TR.Id
   AND TR.IsElectric
   AND Overlaps ( DT_Period("2001-04-10 10:30:00",
                            "2001-04-10 10:35:00" ),
                  TS.When )
   AND Within ( TR.Where, :Montana );

Figure 5: User-Defined Aggregate Query

Min_Overlap() is an example of a User-Defined Aggregate (UDA) implemented as part of the Period BladeLet. In much the same way that MIN() takes a set of numbers or strings and returns the smallest value in the list, Min_Overlap() accepts a list of Period instances and returns a Period value -- which might or might not be one of the values ipassed in -- that is the minimum overlapping period for all of them. Actually, this aggregate is not that useful on its own. It is rather more useful when used as a building block for other, more complex business operations.
The Period BladeLet ships with complete source code so it can be modified to accommodate the specifics of other problems. In this release the BladeLet only supports Periods with DATE and DATETIME YEAR TO SECOND delimiters. Developers needing FRACTIONs or other, more exotic time units will need to change the code. The Period BladeLet deviates from some more common practices among datablade products in that it does not install using Blade Manager. Instead, you should register and unregister it using the SQL scripts that come with.
This page provides an overview of the BladeLet's functionality, and a guide to the finer points of its implementation. Currently, this work is ported as far as NT, Solaris and RedHat Linux without problems. Experience has shown that other platforms are rather easy to do, but APITB.

Contents

Design Overview.

To overcome both the query complexity and performance problems, the Period BladeLet provides a pair of new OPAQUE TYPE extensions: Period, and DT_Period. They correspond to T-SQL Period ( DATE ) and Period ( DATETIME ) objects, respectively. Both of these types come with a set of UDFs to handle common temporal operations, and to interact the built-in DATE, DATETIME and INTERVAL types. Both of them can be indexed using the R-Tree for optimal performance.

In Figure 6, we re-write the table introduced in Figure 2 to illustrate one of the new types. The idea is that instead of storing the start and finish values in separate columns, DT_Period combines them into a single data object, stored in a single column. The CHECK() integrity constraint in Figure 2 is moved within the user-defined functions that implement the type's behaviors. Encapsulating this rule within the type's behavior simplifies its use in SPL variables and query expressions. Both the DT_Period and Period data types enforce this rule.
CREATE TABLE Track_Schedule  (
        Train    INTEGER      NOT NULL,
        Track    INTEGER      NOT NULL,
        When     DT_Period    NOT NULL,
            FOREIGN KEY ( Train ) REFERENCES Trains ( Id ) CONSTRAINT Train_FK,
            FOREIGN KEY ( Track ) REFERENCES Tracks ( Id ) CONSTRAINT Track_FK
);
Figure 6: Track Schedule Table Re-defined using Period BladeLet Types

The public format of these types is quite similar, and both Period and DT_Period handle the "start is epoch" and "finish is unbound" cases in similar fashions. Rather than relying on programmers to get the syntax right each time--which is not always easy in the case of DATETIME and INTERVAL strings--the BladeLet includes a set of "constructor" UDFs to simplify data management. Figure 7 lists several examples of both of these types.

"1999-10-10 12:10:10" to "1999-12-20 22:20:20" "2000-02-09 08:30:30" to "2000-03-20 08:30:30"

"2000-03-20 08:30:30" to "2000-08-07 18:40:40" "2000-08-07 18:40:40" to "2000-10-07 04:50:50"

"EPOCH" to "1999-10-20 22:20:20" "1999-12-20 22:20:20" to "FOREVER"

"EPOCH" to "2001-03-07 01:11:10" "EPOCH" to "FOREVER"

"10/10/1999" to "12/20/1999" "2/9/2000" to "3/20/2000"

"3/20/2000" to "8/7/2000" "8/7/2000" to "10/7/2000"

"EPOCH" to "10/20/1999" "12/20/1999" to "FOREVER"

"EPOCH" to "3/7/2001" "EPOCH" to "FOREVER"

Figure 7: Examples of DT_Period and Period Data Type Instances

So far as the rest of the DBMS is concerned, the Period and DT_Period types are treated like any of the built-in types. They both fit entirely within a data row, the Period being 8 bytes long, and the DT_Period being 48 bytes long. Included in the regression tests for this Bladelet are back-up and recovery tests. Although I have not tested it, it should also be possible to use replication with these types.
A large set of temporal operations are implemented as part of the BladeLet. These deal with the intuitively obvious set of comparisons between two Period instances; Overlap, Contains, Equal, Within, Before, After, etc. Consult the section on User-Defined Functions for a detailed list. Where ever it was possible to do so, the operations have been combined using an R-Tree operator class (opclass) to allow developers to exploit the R-Tree indexing technology. In practice this is incredibly useful: without indexing the regression test suite takes almost two hours to run on an NT laptop, but only one minute with indexing.
So far as the indexing operations are concerned, EPOCH and FOREVER are treated as the smallest possible and largest possible date or datetime values. Internally, the byte level values used to store them are not valid SQL-92 internal formats. Exercise caution when writing code to modify these values.

Contents

User-Defined Types and User-Defined Functions in the BladeLet.

The BladeLet ships with two OPAQUE TYPE instances, and a large set of User-Defined Functions implementing behaviors for them. In addition to the DATE and DATETIME YEAR TO SECOND boundaries, the BladeLet implements two "special" values: EPOCH and FOREVER. The idea is to support periods where the open or the close is not known at the time the object is created. These words can appear in the public string, and there are user-defined functions that can get and set these values.
Tables involving these types can be both backed up using standard techniques, and because these range values can reside entirely within a single data page, they can be replicated.

List of User-Defined Types.

Period

A Period is a fixed interval bound by a Start and a Finish DATE. Internally, DATE values are represented using four byte INTEGER days, and the EPOCH and FOREVER values are represented using a DATE one larger than the minimal possible date value, and one less than the maximum possible date value, respectively. So far as a SQL developer is concerned, this detail is invisible. All of the UDFs described in the next section apply to the type, although the nature of the INTERVAL values, and the sizes of each Period object, are somewhat different.

DT_Period

DT_Period is a fixed interval bound by a DATETIME Start and Finish. Internally, Start and Finish are stored using real DATETIME data structures, and to implement EPOCH and FOREVER these SAPI data structures are extended with additional macro values. All of the User-Defined Functions described in the next section work for DT_Period, although the details of INTERVAL and size vary from the Period.

List of User-Defined Functions.

This list of UDFs implemented as part of the BladeLet is broken into several sections, each containing a particular category of UDF. Different categories of UDF perform different functions: some of them are SQL-level expressions, some of them are never seen by SQL developers. Each UDF description includes the UDF's signature (its name, and the vector of its arguments), its return type, and a brief description of what it does. Note that this list can be extracted from the database's system catalogs using the query presented in Figure 8.
SELECT UPPER ( S.procname ||
              ' ( ' || S.paramtypes::LVARCHAR || ' ) -> ' ||
              ifx_ret_types ( S.procid )::LVARCHAR
             )
FROM sysprocedures S
WHERE S.paramtypes::LVARCHAR LIKE '%dt_period%'
    OR ifx_ret_types(S.procid)::LVARCHAR LIKE '%dt_period%';

Figure 8: Query to Extract Information About DT_Period User-Defined Functions from Catalogs

Constructor Functions.
Constructor UDFs are the input (and output) logic used to create an instance of the new data type. In addition to functions converting the public string format into the internal data structures, other constructor functions convert a vector of data values into an instance of the new data type.

DT_PERIODIN ( LVARCHAR ) -> DT_PERIOD -- CAST LVARCHAR to DT_Period

DT_PERIOD ( DATETIME,DATETIME ) -> DT_PERIOD

DT_PERIOD ( DATETIME ) -> DT_PERIOD

Each of these three UDFs creates a valid instance of a DT_Period type based on the values in its arguments. Figure 6 illustrates several examples of the kind of strings DT_PERIODIN() converts into the type's internal structure. The second pair of functions creates DT_Period instances with the start and finish elements either populated by the function arguments. Code ensuring that DT_Period.Finish is >= DT_Period.Start is implemented within these functions.

DT_PERIODOUT ( DT_PERIOD ) -> LVARCHAR -- CAST DT_Period to LVARCHAR

This last UDF is not strictly speaking a constructor, but because it is commonly invoked by the ORDBMS as part of query processing, it is included in this section. It converts a DT_Period object into an LVARCHAR string, suitable for printing, or for passing as an argument into a DT_PeriodIn() UDF. Figure 9 provides several illustrations of how this routine is called. Note that the DT_PeriodOutput() UDF is actually called on every occasion.

EXECUTE FUNCTION DT_PeriodIn('"1999-10-10 12:10:10" to "1999-12-20 22:20:20"');
(expression) "1999-10-10 12:10:10" to "1999-12-20 22:20:20"

EXECUTE FUNCTION DT_PeriodIn('"2000-02-09 08:30:30" to "2000-03-20 08:30:30"');
(expression) "2000-02-09 08:30:30" to "2000-03-20 08:30:30"

EXECUTE FUNCTION DT_PeriodIn('"2000-03-20 08:30:30" to "2000-08-07 18:40:40"');
(expression) "2000-03-20 08:30:30" to "2000-08-07 18:40:40"

EXECUTE FUNCTION DT_PeriodIn('"2000-04-19 18:40:40" to "2000-06-08 18:40:40"');
(expression) "2000-04-19 18:40:40" to "2000-06-08 18:40:40"

EXECUTE FUNCTION DT_PeriodIn('"EPOCH" to "2000-06-08 18:40:40"');
(expression) "EPOCH" to "2000-06-08 18:40:40"

EXECUTE FUNCTION DT_PeriodIn('"2000-04-19 18:40:40" to "FOREVER"');
(expression) "2000-04-19 18:40:40" to "FOREVER"

EXECUTE FUNCTION DT_PeriodIn('"EPOCH" to "FOREVER"');
(expression) "EPOCH" to "FOREVER"

Figure 9: Constructor UDF for the DT_Period Data Type

Typically, constructor UDFs appear in write queries; INSERT and UPDATE. The public format seen here is the same one used when a table with a DT_Period column is unloaded to a text file.

Manipulator Functions.

Manipulator UDFs modify the value of a DT_Period. Strictly, what each of these UDFs does is to create a new instance of the data type, and populate it with data taken from the arguments, one of which is usually another (original) instance of the type. Duplicating the data is useful to support the transactional, set oriented query operations.

SET_START_EPOCH ( DT_PERIOD ) -> DT_PERIOD
SET_FINISH_FOREVER ( DT_PERIOD ) -> DT_PERIOD

These first two functions open, respectively, the start and the finish DATE or DATETIME of a DT_Period. Open ends--which are called either EPOCH if the start is open, or FOREVER if the finish is open--indicates either that the open or close is unknown, or unscheduled. It is important to distinguish between such circumstances, and the situation where the entire Period is unknown, which is handled by using a NULL instead of the entire Period instance.

EXECUTE FUNCTION Set_Start_Epoch (
DT_PeriodIn('"1999-10-10 12:10:10" to "1999-12-20 22:20:20"')
);
(expression) "EPOCH" to "1999-12-20 22:20:20"

EXECUTE FUNCTION Set_Finish_Forever (
DT_PeriodIn('"2000-02-09 08:30:30" to "2000-03-20 08:30:30"')
);
(expression) "2000-02-09 08:30:30" to "FOREVER"

EXECUTE FUNCTION Set_Start_Epoch (
                  Set_Finish_Forever (
                   DT_PeriodIn('"2000-03-20 08:30:30" to "2000-08-07 18:40:40"')
                 ));
(expression) "EPOCH" to "FOREVER"

Figure 10: Opening the DT_Period at Either End

GETINTERSECT ( DT_PERIOD,DT_PERIOD ) -> DT_PERIOD
GETUNION ( DT_PERIOD,DT_PERIOD ) -> DT_PERIOD

Given two range values, these two UDFs calculate either their Intersection (if an intersection exists) or their Union. Any two ranges can be used to compute a Union, which is derived from the two outlier values of the ranges: the smallest start and the largest finish. Intersection is more complex, because only some pairs of ranges actually overlap. When the arguments to the GetIntersection() do not overlap, the UDF does not throw an exception. Instead, it returns a NULL value.

To gain a clearer understanding of these UDFs (and several other examples below) we introduce a set of DT_Period data values in Figure 11. On the left of this diagram, a list of identifiers--'A' through 'H'--each related to a range. For simplicity, this set of examples includes no open ended Periods.

     A    |----|
     B             |---|
     C                 |------------|
     D                     |----|
     E                              |----|
     F                                   |----|
     G                                            |----|
     H                            |---|

Figure 11: Sample Data for the DT_Period/Period Data Type

Suppose these are stored in a table called Test_DT_Period. For the purposes of this documentation--and to provide a minimum of positive 'regression tests'--the following query invokes a range of User-Defined Functions over these examples.

SELECT '==========================================================' AS Spacer,
       T1.When AS First,
       T2.When AS Second,
       T1.Picture AS First_Pic,
       T2.Picture AS Secon_Pic,
       '----------------------------------------------------------' AS Spacer_1,
       '     CompareString (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
           CompareString( T1.When, T2.When )::LVARCHAR
          AS CompareString,
       '             Equal (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
           Equal( T1.When, T2.When )::LVARCHAR::CHAR(2)
          AS Equal,
        .
        .    One for each of the boolean operator UDFs included with this BladeLet.
        .
       '      AfterTouches (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
           AfterTouches ( T1.When, T2.When )::LVARCHAR::CHAR(2) AS AfterTouches,

       '----------------------------------------------------------' AS Spacer_2,
       'GetIntersect (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
           GetIntersect ( T1.When, T2.When)::LVARCHAR::VARCHAR(48)
         AS Intersect,
       'GetUnion (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
           GetUnion ( T1.When, T2.When)::LVARCHAR::VARCHAR(48)
         AS Union,
        .
        . One for each of the several UDFs included with the BladeLet.
        .
      'Period_Interval( GetIntersect (' || T1.Id || ' , ' || T2.Id || ' )) is ' ||
           Period_Interval(GetIntersect (T1.When, T2.When))::LVARCHAR::VARCHAR(32)
         AS Union_Interval
FROM Test_DT_Period T1, Test_DT_Period T2;

Figure 12: Query Performing Minimal Regression Tests for UDFs in BladeLet

The following figure presents a single printout comparing two pairs of DT_Period data instances.Similar queries are used throughout this documentation to illustrate how the various User-Defined Functions operate.

spacer              ==========================================================
first               "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
second              "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
first_pic             A    |----|
secon_pic             A    |----|
spacer_1            ----------------------------------------------------------
comparestring            CompareString (A , A ) is EQ_LT_GT_EQ
equal                            Equal (A , A ) is t
notequal                      NotEqual (A , A ) is f
containsnottouches ContainsNotTouches (A , A ) is f
contains                      Contains (A , A ) is t
withinnottouches      WithinNotTouches (A , A ) is f
within                          Within (A , A ) is t
overlapnottouches    OverlapNotTouches (A , A ) is f
overlap                        Overlap (A , A ) is t
before                          Before (A , A ) is f
beforetouches            BeforeTouches (A , A ) is f
after                            After (A , A ) is f
aftertouches              AfterTouches (A , A ) is f
spacer_2            ----------------------------------------------------------
intersect           GetIntersect (A , A ) is "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
union              GetUnion (A , A ) is "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
start_union         Start( GetUnion (A , A )) is 1999-10-10 12:10:10
finish_intersect    Finish( GetIntersect (A , A )) is 1999-12-20 22:20:20
length_union        Length( GetUnion (A , A )) is 6171010
intersect_interval Period_Interval( GetIntersect (A , A )) is 71 10:10:10

.
.
.

spacer              ==========================================================
first               "2000-02-09 08:30:30" to "2000-03-20 08:30:30"
second              "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
first_pic             B             |---|
secon_pic             A    |----|
spacer_1            ----------------------------------------------------------
comparestring            CompareString (B , A ) is GT_GT_GT_GT
equal                            Equal (B , A ) is f
notequal                      NotEqual (B , A ) is t
containsnottouches ContainsNotTouches (B , A ) is f
contains                      Contains (B , A ) is f
withinnottouches      WithinNotTouches (B , A ) is f
within                          Within (B , A ) is f
overlapnottouches    OverlapNotTouches (B , A ) is f
overlap                        Overlap (B , A ) is f
before                          Before (B , A ) is f
beforetouches            BeforeTouches (B , A ) is f
after                            After (B , A ) is t
aftertouches              AfterTouches (B , A ) is f
spacer_2            ----------------------------------------------------------
intersect
union               GetUnion (B , A ) is "1999-10-10 12:10:10" to "2000-03-20 08:30:30"
start_union         Start( GetUnion (B , A )) is 1999-10-10 12:10:10
finish_intersect
length_union        Length( GetUnion (B , A )) is 13983620
intersect_interval

Figure 13: Output of Regression Tests Query for Two Pairs of DT_Periods

Figure 13 illustrates the effect of these two manipulator UDFs. In the query that compares 'A' with itself, the Intersection and Union of the two DT_Period data values is the same, as the two values are 'Equal'. In the second query, comparing 'A' with 'B', the two DT_Period objects are not equal and do not overlap (in fact, 'B' is After 'A'). Therefore, the Union of 'A' and 'B' is simple A.Start to B.Finish, and their Intersection is NULL. When a NULL argument is passed into a UDF, by default it simply returns a NULL result. Calculating Finish_Intersect and Intersect_Interval illustrate NULL argument behavior.

Interrogator Functions.

Interrogators extract information from a single Period or DT_Period. They calculate intermediate results in more complex query expressions. Figures 12 and 13 include examples that illustrate how each of these UDFs is invoked.

START_IS_EPOCH ( DT_PERIOD ) -> BOOLEAN

Returns true if the start of the DT_Period arguement is set to EPOCH.

FINISH_IS_FOREVER ( DT_PERIOD ) -> BOOLEAN

Returns true if the finish of the DT_Period arguement is set to FOREVER.

LENGTH ( DT_PERIOD ) -> INTEGER
PERIOD_INTERVAL ( DT_PERIOD ) -> INTERVAL

These two user-defined functions return a calculation of the length of the DT_Period argument. The Length() UDF returns the size in INTEGER seconds. The Period_Interval() UDF returns the size as an INTERVAL DAY(8) TO SECOND data value.

START ( DT_PERIOD ) -> DATETIME
FINISH ( DT_PERIOD ) -> DATETIME

Returns, respectively, the start and finish DATETIME YEAR TO SECOND of the argument DT_Period. If the start is EPOCH, or the finish is FOREVER, this UDF will return a NULL result. If alternative behavior is desired, use the SQL NVL() expression and a UDF to specificy what the application's default minimum and maximum date or datetime are.

Comparison Functions.

The key to making use of any new data type is the set of comparison functions that allow developers to write queries using the type. All comparison UDFs take two instances of the data type, and return a boolean value signifying whether or not the UDF's comparison was true or false.

EQUAL ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Indexed
NOTEQUAL ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Not Indexed

Equal() returns true if the start and finish of the first argument are equal to, respectively, the start and finish of the second. NotEqual() is true in all other cases. In Figure 12, the Period labelled 'A' is equal to itself.

Example:

' Equal (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
Equal( T1.When, T2.When )::LVARCHAR::CHAR(2) AS Equal,

' NotEqual (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
NotEqual ( T1.When, T2.When )::LVARCHAR::CHAR(2) AS NotEqual,

'----------------------------------------------------------' AS Spacer_2
FROM Test_DT_Period T1, Test_DT_Period T2
WHERE T1.Id IN ('A', 'D' )
AND T2.Id IN ('A', 'D' );

spacer         ==========================================================
first          "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
second         "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
first_pic        D                     |----|
secon_pic        A    |----|
spacer_1       ----------------------------------------------------------
comparestring       CompareString (D , A ) is GT_GT_GT_GT
equal                       Equal (D , A ) is f
notequal                 NotEqual (D , A ) is t
spacer_2       ----------------------------------------------------------

spacer         ==========================================================
first          "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
second         "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
first_pic        A    |----|
secon_pic        D                     |----|
spacer_1       ----------------------------------------------------------
comparestring       CompareString (A , D ) is LT_LT_LT_LT
equal                       Equal (A , D ) is f
notequal                 NotEqual (A , D ) is t
spacer_2       ----------------------------------------------------------

spacer         ==========================================================
first          "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
second         "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
first_pic        D                     |----|
secon_pic        D                     |----|
spacer_1       ----------------------------------------------------------
comparestring       CompareString (D , D ) is EQ_LT_GT_EQ
equal                       Equal (D , D ) is t
notequal                 NotEqual (D , D ) is f
spacer_2       ----------------------------------------------------------

Figure 14: Equal() and NotEqual() Example Query Expressions

The results of the Equal() and NotEqual() UDF are consistent with the Compare() UDF (see below). Further, the if Equal ( 'A', 'B' ), then Hash('A') = Hash ( 'B' ). Designing the

CONTAINSNOTTOUCHES ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Indexed
CONTAINS ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Indexed

For any two Periods, there is a possibility that the first contains the second. For these UDFs, contains is interpreted to mean that both the start and the fnish of the second argument fall between the start and finish of the first. Where these two UDFs differ is in their treatment of cases where ends touch: that is, when first.start = second.start or first.finish = second.finish.

'ContainsNotTouches (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
ContainsNotTouches ( T1.When, T2.When ) AS ContainsNotTouches,

' Contains (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
Contains ( T1.When, T2.When )::LVARCHAR::CHAR(2) AS Contains,

'----------------------------------------------------------' AS Spacer_2

FROM Test_DT_Period T1, Test_DT_Period T2
WHERE T1.Id IN ('A', 'C', 'D' )
AND T2.Id IN ('C', 'D' );

spacer              ==========================================================
first               "1999-10-10 12:10:10" to "1999-12-20 22:20:20"
second              "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
first_pic             A    |----|
secon_pic             C                 |------------|
spacer_1            ----------------------------------------------------------
comparestring            CompareString (A , C ) is LT_LT_LT_LT
containsnottouches ContainsNotTouches (A , C ) is f
contains                      Contains (A , C ) is f
spacer_2            ----------------------------------------------------------

spacer              ==========================================================
first               "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
second              "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
first_pic             C                 |------------|
secon_pic             C                 |------------|
spacer_1            ----------------------------------------------------------
comparestring            CompareString (C , C ) is EQ_LT_GT_EQ
containsnottouches ContainsNotTouches (C , C ) is f
contains                      Contains (C , C ) is t
spacer_2            ----------------------------------------------------------

spacer              ==========================================================
first               "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
second              "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
first_pic             D                     |----|
secon_pic             C                 |------------|
spacer_1            ----------------------------------------------------------
comparestring            CompareString (D , C ) is GT_LT_GT_LT
containsnottouches ContainsNotTouches (D , C ) is f
contains                      Contains (D , C ) is f
spacer_2            ----------------------------------------------------------

spacer              ==========================================================
first               "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
second              "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
first_pic             C                 |------------|
secon_pic             D                     |----|
spacer_1            ----------------------------------------------------------
comparestring            CompareString (C , D ) is LT_LT_GT_GT
containsnottouches ContainsNotTouches (C , D ) is t
contains                      Contains (C , D ) is t
spacer_2            ----------------------------------------------------------

Figure 15: Contains() and ContainsNotTouches() Example Query Expressions

Figure 15 includes several rows illustrating aspects of the Contains() and ContainsNotTouches() UDFs. From the comparison between 'C' and itself, you can see the difference between Contains() and ContainsNotTouches(). Comparing 'C' with 'D' illustrates the asymmetry of both of these routines: 'C' contains 'D', and 'C' contains 'D' without touching it, but it is not the case that 'D' contains 'C'.

WITHINNOTTOUCHES ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Indexed
WITHIN ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Indexed

The Within() UDFs are the commutative UDFs for Contains(). If Contains ( 'A', 'B'), then Within( 'B', 'A'). For these UDFs, within is interpreted to mean that both the start and the fnish of the first argument fall between the start and finish of the second. Where these two UDFs differ is in their treatment of cases where ends touch: that is, when first.start = second.start or first.finish = second.finish. Simple within includes touching, but WithinNotTouches() is only true when the containment is complete.

OVERLAPNOTTOUCHES ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Indexed
OVERLAP ( DT_PERIOD,DT_PERIOD ) -> BOOLEAN -- Indexed

Four UDFs calculate an OUT Parameter, in addition to the boolean result. These OUT parameters can be used in statement local variables in a query expression. Each of these UDFs computes the size of the overlap in seconds. Returning an INTERVAL would be useful, but unfortunately there are two kinds of interval, and figuring out which is to be returned is difficult.

OVERLAPNOTTOUCHES ( DT_PERIOD,DT_PERIOD,INTEGER ) -> BOOLEAN -- Indexed
OVERLAP ( DT_PERIOD,DT_PERIOD,INTEGER ) -> BOOLEAN -- Indexed

In the following figure we illustrate how these UDFs are used in a query. The Statement Local Variable is called Overlap_Length, and is of type INTEGER. This is computed by the OverlapNotTouches() user-defined function that is the last predicate in the WHERE clause. Note that the result data in Figure 15 is the entire list of results. Each table in the FROM clause is restricted to three rows, so the query in Figure 16 performs nine comparisons. 'H' and 'D' do not overlap at all, and the three comparisons between 'C', 'D' and 'H' with themselves do not satisfy OverlapNotTouches(). Only four pairs of Period objects do satisfy the UDF, so this query only produces four result rows.

       ' OverlapNotTouches (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
           OverlapNotTouches ( T1.When, T2.When )::LVARCHAR::CHAR(2) ||
       ' with Length ' || Overlap_Length AS OverlapNotTouches,

' Overlap (' || T1.Id || ' , ' || T2.Id || ' ) is ' ||
Overlap ( T1.When, T2.When )::LVARCHAR::CHAR(2) AS Overlap,

       '----------------------------------------------------------' AS Spacer_2
FROM Test_DT_Period T1, Test_DT_Period T2
WHERE T1.Id IN ( 'C', 'D', 'H' )
   AND T2.Id IN ( 'C', 'D', 'H' )
   AND OverlapNotTouches ( T1.When, T2.When, Overlap_Length # INTEGER );

spacer             ==========================================================
first              "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
second             "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
first_pic            C                 |------------|
secon_pic            D                     |----|
spacer_1           ----------------------------------------------------------
comparestring           CompareString (C , D ) is LT_LT_GT_GT
overlapnottouches   OverlapNotTouches (C , D ) is t with Length 4320000
overlap                       Overlap (C , D ) is t
spacer_2           ----------------------------------------------------------

spacer             ==========================================================
first              "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
second             "2000-07-28 08:30:30" to "2000-09-17 04:50:50"
first_pic            C                 |------------|
secon_pic            H                            |---|
spacer_1           ----------------------------------------------------------
comparestring           CompareString (C , H ) is LT_LT_GT_LT
overlapnottouches   OverlapNotTouches (C , H ) is t with Length 900610
overlap                       Overlap (C , H ) is t
spacer_2           ----------------------------------------------------------

spacer             ==========================================================
first              "2000-04-19 18:40:40" to "2000-06-08 18:40:40"
second             "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
first_pic            D                     |----|
secon_pic            C                 |------------|
spacer_1           ----------------------------------------------------------
comparestring           CompareString (D , C ) is GT_LT_GT_LT
overlapnottouches   OverlapNotTouches (D , C ) is t with Length 4320000
overlap                       Overlap (D , C ) is t
spacer_2           ----------------------------------------------------------

spacer             ==========================================================
first              "2000-07-28 08:30:30" to "2000-09-17 04:50:50"
second             "2000-03-20 08:30:30" to "2000-08-07 18:40:40"
first_pic            H                            |---|
secon_pic            C                 |------------|
spacer_1           ----------------------------------------------------------
comparestring           CompareString (H , C ) is GT_LT_GT_GT
overlapnottouches   OverlapNotTouches (H , C ) is t with Length 900610
overlap                       Overlap (H , C ) is t
spacer_2           ----------------------------------------------------------

Figure 16: OverlapsNotTouches() Returning a Statement Local Variable

As with the Length() UDF, the Overlap() and OverlapNotTouches() UDFs returning a Length value differ in the units. When computing DT_Period overlaps, where the start and finish are datetime values, the value returned is the number of seconds in the interval. When the start, or the finish, is EPOCH, the OUT parameter is set to 0, because there is a bug that prevents setting an OUT parameter to NULL. For Period objects, where the start and finish are even dates, the interval length is computed in number of days.

BEFORE             ( DT_PERIOD, DT_PERIOD ) -> BOOLEAN -- Not Indexed
BEFORETOUCHES      ( DT_PERIOD, DT_PERIOD ) -> BOOLEAN -- Indexed
AFTER              ( DT_PERIOD, DT_PERIOD ) -> BOOLEAN -- Not Indexed
AFTERTOUCHES       ( DT_PERIOD, DT_PERIOD ) -> BOOLEAN -- Indexed

These last four functions refer to relationships where the two Periods relate to non-overlapping intervals. One Period can come before, or after, another, if the finish of one comes (on or) before the start of the other. Touches is a special case where the start or one Period equals the finish of the other. As we have defined the terms, touching Periods do Overlap. They do not, however, OverlapNotTouches().

It is important to bring up a subtle aspect of this definition. An entirely different set of rules, where Overlap() is true only if there is some non-zero interval of time common to both Periods, results in a different, although entirely consistent, semantic.

R-Tree Support Functions and Operator Class

All of the UDFs in the previous section--except Before() and After()--can be indexed using the R-Tree. Building an R-Tree requires a set of support routines. These are introduced in the following section. These UDFs are not called from directly from SQL. There are alternative UDF calls--GetUnion(), GetIntersection(), and Length()--which return these results.

UNION ( DT_PERIOD,DT_PERIOD,DT_PERIOD ) -> INTEGER
SIZE ( DT_PERIOD,FLOAT ) -> INTEGER
INTER ( DT_PERIOD,DT_PERIOD,DT_PERIOD ) -> INTEGER

For each type, there is an operator class that must be used in the CREATE INDEX statement, as we see in the following figure.

CREATE INDEX Index_Index_1 ON Index_Test ( When DT_Period_ops) USING RTREE;

Part of the regression tests for the Period bladelet is a test of the logical correctness, and the efficacy, of the R-Tree. This CREATE INDEX statement is in a script file Period\install\reg_index.sql. That script also measures the performance advantages of the index over table scans.

User-Defined Aggregate Support Function.

As we mentioned in the introduction, the Period BladeLet includes a User-Defined Aggregate called Min_Overlap(), which simplifies the task of figuring out whether or not there is some period of time during which every one of a set of DT_Periods is current. For example, Figure 17 below reproduces a sub-set of the data from Figure 1. Each of the scheduled periods in Figure 17 would correspond to a single row in a table like Figure 6 Track_Schedule. The final line in Figure 17 shows what the result would be if the Min_Overlap() UDA were run over just this set of scheduled intervals.

         Time:
Track  : T1 T2 T3 T4 T5 T6 T7 T8 T9 T10 T11 T12 T13 T14 T15 T16 T17
Seg #:

  3                                  |-----------|

  4                               |-----|

  5      |----------------------------------------------------------)

  6                                 |---|  

  7                                 |-------|

Min_Overlap()                       |===|

Figure 17: Sub-Set of Figure 1 Data Illustrating Functionality of Min_Overlap() Aggregate

Implementing a UDA requires that the developer implements four support routines and then combines them to implement a single aggregate. Four support routines are necessary due to the parallelizable query processing infrastructure built into the IDS product. When a query specifies that the Min_Overlap() aggregate is to be invoked, the IDS product divides the total data set into some number of sub-tasks and decides how many parallel threads to apply to the problem and runs the INIT support routine once for each of them. Then, within each sub-set of the overall aggregat, IDS invokes the ITER function once per row. Once all rows in all sub-sets have been examined, IDS then combines the results of each sub-task using COMBINE, before finally calling FINAL to compute the result.

DT_PERIOD_OVERLAP_INIT ( DT_PERIOD )                  ->
                               DT_PERIOD_OVERLAP_STATE
DT_PERIOD_OVERLAP_ITER ( DT_PERIOD_OVERLAP_STATE,
                          DT_PERIOD )                  ->
                               DT_PERIOD_OVERLAP_STATE
DT_PERIOD_OVERLAP_COMBINE ( DT_PERIOD_OVERLAP_STATE,
                            DT_PERIOD_OVERLAP_STATE )   ->
                               DT_PERIOD_OVERLAP_STATE
DT_PERIOD_OVERLAP_FINAL ( DT_PERIOD_OVERLAP_STATE )    ->
                               DT_PERIOD

Readers should note that there is potential for serious mis-understanding when reading this description of the Min_Overlap() aggregate. Were you to run the aggregate over all of the data presented in Figure 1, the result would be NULL. There is no period of time during which every one of the periods is active. This case needs to be distinguished from more intuitively obvious, but formally more complex, question relating to periods of common activity for a set of resources (rather than overlapping periods per se.) Min_Overlaps() only ever returns a single Period (or DT_Period) instance. But when a set of resources are being allocated, and the question concerns periods during which all of the resources are allocated, the aggregate would need to return a set of Periods, rather than a single value.

User-defined Aggregates can return SET{} results, but it is suggested that such work be done in using SPL rather than 'C'.

Miscellaneous Support Function.

This user-defined function returns a string that reflects the 4 relationships between the start and finish of the first and second argument. Internally, this UDF simply computes a compare() between all four pairs of start and finish values of the first and second arguments. In the final section of this page, where we describe the internal design of the Bladelet's functionality, we list the possible values that this UDF returns.

COMPARESTRING ( DT_PERIOD,DT_PERIOD ) -> LVARCHAR

Several figures illustrate the CompareString() UDF in action. It is primarily intended as an aid for debugging.

Query Processing Support Functions.

As part of query processing, for example, to perform equi-joins and merge-joins, the following user-defined functions are useful. They are never called as part of any SQL query, but are instead invoked by the engine in HASH joins, or MERGE-SORT operations.

HASH ( DT_PERIOD ) -> INTEGER
COMPARE ( DT_PERIOD,DT_PERIOD ) -> INTEGER

Database Administrative Support Functions.

Database Administrative Support Functions are typically invoked by the ORDBMS as part of its data processing. They are never called directly by SQL queries but are instead invoked when data from the ORDBMS is backed up, or sent to a client. These functions are paired into symmetric operations, each of which relates to a different administrative activity.

DT_PERIODSEND ( DT_PERIOD ) -> SENDRECV
DT_PERIODRECV ( SENDRECV ) -> DT_PERIOD
DT_PERIODIMPT ( IMPEXP ) -> DT_PERIOD
DT_PERIODEXPT ( DT_PERIOD ) -> IMPEXP
DT_PERIODIMPB ( IMPEXPBIN ) -> DT_PERIOD
DT_PERIODEXPB ( DT_PERIOD ) -> IMPEXPBIN

The regression test set for the Period BladeLet includes the use of the ontape utility to back-up and recover a database that involves these types. The idea is to test that these UDFs are working. They appear to be, but this is the weakest area of the testing.

Contents

Code Walk through and Implementation Overview

The 'C' code implementing this BladeLet is to be found in ./src/Period.c, ./src/Period.h, and ./src/DT_Period.h. Several support functions are found in ./src/support.c. To build the shared object, the wad includes a Microsoft Developer Studio project file; ./Period.dsw, and a UNIX makefile in ./UNIX.mak.
A secondary objective of this BladeLet was to provide a generalizable framework for doing a variety of range object; ranges of INTEGER, FLOAT, VARCHAR in addition to DATE and DATETIME. Key to achieving this is the design of the type's comparison sub-system. It turns out that, given the rule that within any range object, finish >= start, comparing the start and the finish of two ranges yields only 18 logically possible outcomes, and that these can be determined entirely using only a Compare() function for the types at either end of the range. In Figure 18 below, we present the list of these outcomes from ./src/Period.h.
#define DT_CMP_ERROR 0 /* Invalid - throw exception    */
#define EQ_EQ_EQ_EQ 1   /* ( 1 -> 1 )   ( 1 -> 1 ) EQ */
#define EQ_LT_EQ_LT 2   /* ( 1 -> 1 )   ( 1 -> 2 ) LT */
#define LT_LT_LT_LT 3   /* ( 1 -> 1 )   ( 2 -> 2 ) LT */
#define EQ_EQ_GT_GT 4   /* ( 1 -> 2 )   ( 1 -> 1 ) GT */
#define EQ_LT_GT_EQ 5   /* ( 1 -> 2 )   ( 1 -> 2 ) EQ */
#define EQ_LT_GT_LT 6   /* ( 1 -> 2 )   ( 1 -> 3 ) LT */
#define LT_LT_EQ_EQ 7   /* ( 1 -> 2 )   ( 2 -> 2 ) LT */
#define LT_LT_EQ_LT 8   /* ( 1 -> 2 )   ( 2 -> 3 ) LT */
#define EQ_LT_GT_GT 9   /* ( 1 -> 3 )   ( 1 -> 2 ) GT */
#define LT_LT_GT_GT 10 /* ( 1 -> 3 )   ( 2 -> 2 ) LT */
#define LT_LT_GT_EQ 11 /* ( 1 -> 3 )   ( 2 -> 3 ) LT */
#define LT_LT_GT_LT 12 /* ( 1 -> 3 )   ( 2 -> 4 ) LT */
#define GT_GT_GT_GT 13 /* ( 2 -> 2 )   ( 1 -> 1 ) GT */
#define GT_EQ_GT_EQ 14 /* ( 2 -> 2 )   ( 1 -> 2 ) GT */
#define GT_LT_GT_LT 15 /* ( 2 -> 2 )   ( 1 -> 3 ) GT */
#define GT_EQ_GT_GT 16 /* ( 2 -> 3 )   ( 1 -> 2 ) GT */
#define GT_LT_GT_EQ 17 /* ( 2 -> 3 )   ( 1 -> 3 ) GT */
#define GT_LT_GT_GT 18 /* ( 2 -> 4 )   ( 1 -> 3 ) GT */

Figure 18: Set of Logically Possible Comparisons Between Two Legal Range Objects

In Figure 18, the composition of the macro string ("GT_LT_GT_LT", for example) reflects the relationships between the four elements of the two range values; First.Start and Second.Start, First.Start and Second.Finish, First.Finish and Second.Start, and finally First.Finish and Second.Finish. Each relationship can have one of three possible values; LessThan, Equal, or GreaterThan. To clarify the relationship between the ranges, each macro is accompanied by a pair of examples where the start and finish values are taken from the range of INTEGERS, 1 through 4.
Thus, for example, in case number 9, First.Start equals Second.Start, First.Start is less than Second.Finish, First.Finish is greater than Second.Start, and First.Finish is greater than Second.Finish. This situation arises when the pair of ranges involved are of the form ( 1 -> 3 ) and ( 1 -> 2 ).
Different logical relationships between pairs of ranges--Overlap(), Within(), Equal() and so on--can be determined as a mapping from these basic possibilities. Further, sorting operations can be supported by implementing a Compare() for ranges although such a concept has little semantic value. The macro strings in Figure 18 are returned the CompareString( DT_Period, DT_Period ) User-defined function.
Statistics and Selectivity
The Period BladeLet includes statistics gathering and selectivity estimation functionality. Given the nature of the objects being indexed, standard approaches--histograms--are a bit awkward. It is not clear, for example, how one would figure out the average size of an Period in a column given histograms of the start and finish. And using histogram techniques to determine the selectivity of operations like Overlap() and Within() is problematic because the start and finish values are not independent: a necessary assumption before standard probability can be used.

Glossary

Terms and acronyms used by this tech note include:

Blade or BladeLet Set of semantically related extensions -- types and functions -- to the ORDBMS.

APITB A Pain in the Behind.

COLLECTION Non-first normal form object. That is, a set of data values that can be considered as a single data value for some purposes (variables). COLLECTIONS can also be thought of as small, in-memory, temporary tables for the purpose of querying.

DBA DataBase Administrator.

Iterator An iterator is a special kind of UDF that returns more than one result. Implementing Iterators raises conceptual and engineering difficulties. This Bladelet contains an example of a quite complex Iterator.

User-defined Function (UDF) Module of procedural logic that extends SQL. This Bladelet included UDFs implemented in 'C' and SPL. Through out this document I use the term Routine synonymously with UDF.

SAPI Server Application Programming Interface. This is the set of data structures and 'C' functions used to implement ORDBMS extensions.

SPL Stored Procedure Language. Simple procedural language that can be used to implement UDFs. SPL is much simpler than 'C', but it is not parallelizable, nor as run-time efficient for complex operations.

Statement Local Variable Variable value returned by reference from a UDF and associated with a name in the query expression. When referring to this as the mechanism for returning more than one value from a UDF it is more common to refer to it as an OUT parameter.

References

Contents

Last updated 31-March-2000.

"1999-10-10 12:10:10" to "1999-12-20 22:20:20"	"2000-02-09 08:30:30" to "2000-03-20 08:30:30"
"2000-03-20 08:30:30" to "2000-08-07 18:40:40"	"2000-08-07 18:40:40" to "2000-10-07 04:50:50"
"EPOCH" to "1999-10-20 22:20:20"	"1999-12-20 22:20:20" to "FOREVER"
"EPOCH" to "2001-03-07 01:11:10"	"EPOCH" to "FOREVER"
"10/10/1999" to "12/20/1999"	"2/9/2000" to "3/20/2000"
"3/20/2000" to "8/7/2000"	"8/7/2000" to "10/7/2000"
"EPOCH" to "10/20/1999"	"12/20/1999" to "FOREVER"
"EPOCH" to "3/7/2001"	"EPOCH" to "FOREVER"

Blade or BladeLet	Set of semantically related extensions -- types and functions -- to the ORDBMS.
APITB	A Pain in the Behind.
COLLECTION	Non-first normal form object. That is, a set of data values that can be considered as a single data value for some purposes (variables). COLLECTIONS can also be thought of as small, in-memory, temporary tables for the purpose of querying.
DBA	DataBase Administrator.
Iterator	An iterator is a special kind of UDF that returns more than one result. Implementing Iterators raises conceptual and engineering difficulties. This Bladelet contains an example of a quite complex Iterator.
User-defined Function (UDF)	Module of procedural logic that extends SQL. This Bladelet included UDFs implemented in 'C' and SPL. Through out this document I use the term Routine synonymously with UDF.
SAPI	Server Application Programming Interface. This is the set of data structures and 'C' functions used to implement ORDBMS extensions.
SPL	Stored Procedure Language. Simple procedural language that can be used to implement UDFs. SPL is much simpler than 'C', but it is not parallelizable, nor as run-time efficient for complex operations.
Statement Local Variable	Variable value returned by reference from a UDF and associated with a name in the query expression. When referring to this as the mechanism for returning more than one value from a UDF it is more common to refer to it as an OUT parameter.