Database Management Systems:
The following relations are given (primary keys are underlined):
PLAY-ACTOR(AId, Name, Nationality, BirthDate)
COMEDY(ComId, Title, Director, SceneNumber, Year)
PLAY-ACTOR-IN-COMEDY(ComId, AId, Role)
PLANNING(ComId, Theater, Date, StartTime, LengthOfTime)
Assume the following cardinalities:
• card(PLAY-ACTOR)= 104 tuples,
MIN(BirthDate) = 1-1-1960, MAX(BirthDate) = 31-12-1999,
• card(COMEDY)= 103 tuples,
distinct values of SceneNumber ≈ 15,
• card(PLAY-ACTOR-IN-COMEDY)= 106 tuples,
distinct values of Role ≈ 30,
• card(PLANNING)= 108 tuples,
MIN(Date) = 1-1-2010, MAX(Date) = 31-12-2010,
MIN(LengthOfTime) = 81, MAX(LengthOfTime) = 180,
Furthermore, assume the following reduction factor for the group by condition:
• having count(DISTINCT Theater)≥50 ≈ 1/10 .
Consider the following SQL query:
select Title, Director
from COMEDY C, PLANNING P, PLAY-ACTOR-IN-COMEDY AC
where P.ComId=C.ComId and AC.ComId=C.ComId
and LengthOfTime=180 and SceneNumber>12
and AC.Aid in (select Aid from PLAY-ACTOR
where BirthDate ≥ 1996)
group by ComId, Title, Director
having count(DISTINCT Theater) ≥ 50
For the SQL query:
1) Report the corresponding algebraic expression and specify the cardinality of each node (representing an intermediate result or a leaf). If necessary, assume a data distribution. Analyze also the group by anticipation.
2) Select one or more secondary physical structures to increase query performance. Justify your choice and report the corresponding execution plan (join orders, access methods, etc.).