[Back to Results | New Search]

Student Number 91426003 Author Chieh-Huan Ku(¥j³Ç·Ø) Author's Email Address No Public. Statistics This thesis had been viewed 1608 times. Download 0 times. Department Graduate Institute of Industrial Management Year 2003 Semester 2 Degree Master Type of Document Master's Thesis Language English Title Simultaneous Determination of View Selection and Update Policy with Stochastic Query and Response Time Constraints Date of Defense 2004-06-14 Page Count 85 Keyword AMPL/MINOS Data warehouse configuration M/G/1 MVPP Poisson process Renewal theory View maintenance policy View selection Abstract Data warehouse is built up to reply queries efficiently. The view selection is to select a set of views to materialize under constraints, when minimizing the total of query processing cost and view maintenance cost. The update policy decides when to refresh the data in a data warehouse. Previous researches dealt with these two problems independently, however under the real situation, they are correlated with each other. Therefore, we simultaneously determine view selection and update policy in designing a data warehouse when the arrival of query follows a Poisson process and the response time of query is within a given threshold with a desired probability.

In this research, we propose a mathematical model for determining optimal update policy when the set of materialized views are known. In the model, we adopt view maintenance policy for view update frequency, which does not change with the selected views in the former researches. Our model also incorporates the stochastic phenomenon to reflect the uncertain demand of query which is common in the real life. The mean system response time constrained by a specified time is formulated by a M/G/1 model. Furthermore, we develop a two-phase greedy algorithm for searching a better set of views to materialize.

As to application, we consider different special cases to implement the mathematical model and the greedy algorithm. A computational analysis is conducted to explore the impact of different constraints and system parameters on view selection. In addition, we also design some experiments to evaluate the difference of view selection and solution running time between the greedy algorithm and exhaustive algorithm.Table of Content Contents

Abstract......................................................................I

Contents.....................................................................II

List of Figures..............................................................IV

List of Tables.............................................................VIII

Chapter 1 Introduction........................................................1

1.1 Research background and motivation........................................1

1.2 Problem description.......................................................3

1.3 Research objectives.......................................................4

1.4 Research methodology and procedure........................................4

Chapter 2 Literature Review...................................................7

2.1 Frameworks for view selection.............................................7

2.2 Cost model...............................................................11

2.3 View maintenance.........................................................12

2.4 Algorithm................................................................13

Chapter 3 Mathematical Model and Greedy Algorithm............................16

3.1 Definitions and assumptions..............................................16

3.1.1 Definitions............................................................16

3.1.2 Assumptions............................................................17

3.2 Notations and mathematical model.........................................18

3.2.1 Notations..............................................................18

3.2.2 Mathematical model.....................................................19

3.2.2.1 Modeling of view maintenance policy..................................22

3.2.2.2 Modeling of response time constraint.................................30

3.3 Greedy algorithm.........................................................34

3.4 An example for greedy algorithm..........................................37

Chapter 4 Model Application..................................................41

4.1 Implementation of the model: two special cases...........................41

4.2 Validation and evaluation................................................44

4.2.1 Validation.............................................................44

4.2.2 Evaluation.............................................................49

4.3 Remarks for computational analysis.......................................52

Chapter 5 Conclusion.........................................................53

5.1 Research Contribution....................................................53

5.2 Further Research.........................................................53

Reference....................................................................55

Appendix A. Mathematical model in AMPL format................................58

Appendix B. Greedy algorithm in AMPL format..................................63

Appendix C. Additional Cases.................................................67

List of Figures

Fig.1.1 Query answering path..................................................1

Fig.1.2 Research procedure....................................................6

Fig.2.1 Lattice (The eight TPC-D views).......................................8

Fig.2.2 (a) An expression A-DAG...............................................8

Fig.2.2 (b) An expression AO-DAG..............................................8

Fig.2.3 Multiple view processing plan........................................10

Fig.3.1 Generation of viewsets...............................................17

Fig.3.2 Hint Graph...........................................................19

Fig.3.3 Framework of case 1..................................................20

Fig.3.4 The relational matrix between queries and viewsets for case 1........20

Fig.3.5 The relational matrix between viewsets and views for case 1..........21

Fig.3.6 The relational matrix between views and update operations for case 1.21

Fig.3.7 The relational matrix between viewsets and operations for case 1.....21

Fig.3.8 Query replying procedure.............................................24

Fig.3.9 Complete view selection model........................................33

Fig.3.10 Complete view selection model (cont.)...............................34

Fig.3.11 Algorithm (1).......................................................35

Fig.3.12 Algorithm (2).......................................................36

Fig.3.13 Example for greedy algorithm........................................37

Fig.3.14 The relational matrix between queries and viewsets for the example

in Fig.3.13.........................................................38

Fig.3.15 The relational matrix between viewsets and views for the example

in Fig.3.13.........................................................38

Fig.3.16 The relational matrix between views and update operations for the

example in Fig.3.13.................................................38

Fig.3.17 The relational matrix between viewsets and operations for the example

in Fig.3.13.........................................................38

Fig.4.1 Case 1...............................................................43

Fig.4.2 Case 2...............................................................44

Fig.4.3 The effect of different response time constraints on final view

selection............................................................45

Fig.4.4 The effect of different storage space constraints on final view

selection............................................................46

Fig.4.5 The effect of different currency constraints on final view

selection............................................................48

Fig.4.6 The effect of different probabilities that the response time is

within the response time constraint on final view selection..........49

Fig.4.7 The improvement of solution running time by greedy algorithm under

different number of viewsets.........................................51

Fig.C.1 The relational matrix between queries and viewsets for case 2........67

Fig.C.2 The relational matrix between viewsets and views for case 2..........67

Fig.C.3 The relational matrix between views and update operations for case 2.68

Fig.C.4 The relational matrix between viewsets and operations for case 2.....68

Fig.C.5 Case 3...............................................................69

Fig.C.6 The relational matrix between queries and viewsets for case 3........69

Fig.C.7 The relational matrix between viewsets and views for case 3..........70

Fig.C.8 The relational matrix between views and update operations for case 3.70

Fig.C.9 The relational matrix between viewsets and operations for case 3.....71

Fig.C.10 Case 4..............................................................72

Fig.C.11 The relational matrix between queries and viewsets for case 4.......72

Fig.C.12 The relational matrix between viewsets and views for case 4.........73

Fig.C.13 The relational matrix between views and update operations for

case 4..............................................................73

Fig.C.14 The relational matrix between viewsets and operations for case 4....74

Fig.C.15 Case 5..............................................................75

Fig.C.16 The relational matrix between queries and viewsets for case 5.......75

Fig.C.17 The relational matrix between viewsets and views for case 5.........76

Fig.C.18 The relational matrix between views and update operations for

case 5..............................................................76

Fig.C.19 The relational matrix between viewsets and operations for case 5....77

Fig.C.20 Case 6..............................................................78

Fig.C.21 The relational matrix between queries and viewsets for case 6.......78

Fig.C.22 The relational matrix between viewsets and views for case 6.........79

Fig.C.23 The relational matrix between views and update operations for

case 6..............................................................79

Fig.C.24 The relational matrix between viewsets and operations for

case 6..............................................................80

Fig.C.25 Case 7..............................................................81

Fig.C.26 The relational matrix between queries and viewsets for case 7.......82

Fig.C.27 The relational matrix between viewsets and views for case 7.........83

Fig.C.28 The relational matrix between views and update operations for

case 7..............................................................84

Fig.C.29 The relational matrix between viewsets and operations for

case 7..............................................................85

List of Tables

Table 2.1 Comparisons of literatures in framework............................10

Table 2.2 Comparisons of literatures in cost model...........................12

Table 2.3 Comparisons of literatures in algorithm............................15

Table 3.1 Modification of cost and Temp_SET_2 in the first loop..............39

Table 3.2 Modification of cost and Temp_SET_2 in the second loop.............40

Table 4.1 The effect of different response time constraints on final view

selection..........................................................45

Table 4.2 The effect of different storage space constraints on final view

selection..........................................................46

Table 4.3 Reasonable range of storage space constraint.......................47

Table 4.4 The effect of different currency constraints on final view

selection..........................................................47

Table 4.5 The effect of different probabilities that the response time is

within the response time constraint on final view selection........49

Table 4.6 Quality evaluation for model and algorithm in case 1...............50

Table 4.7 Quality evaluation for model and algorithm in case 2...............50

Table 4.8 Applicability evaluation for model and algorithm...................51Reference Reference

Allen, A. O., Probability, Statistics, and Queuing Theory with computer science applications, Academic Press, 1978

Baralis, E., S. Paraboschi, E. Teniente, ¡§Materialized view selection in a multidimensional database¡¨, 23rd VLDB Conference, pp. 156-165, 1997

Chaudhuri, S., U. Dayal, ¡§An Overview of Data Warehousing and OLAP Technology¡¨, http://www.cs.sfu.ca/CourseCentral/459/han/papers/chaudhuri97.pdf

Chen, J. W., ¡§Determine the Planning Strategy for BOM Items in the Context of MTO Environment: Model and Application¡¨, Master thesis, National Central University, 2003

Colby, L. S., A. Kawaguchi, D. F. Lieuwen, I. S. Mumick, K. A. Ross, ¡§Supporting Multiple View Maintenance Policies¡¨, ACM SIGMOD Conference, pp. 405-416, May 1997.

Fourer, R., D. M. Gay, B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming, The Scientific Press Series, 1993.

Fourer, R., D. M. Gay, B. W. Kernighan, AMPL: A Modeling Language for Mathematical Programming with AMPL Plus Student Edition for Microsoft Windows, Duxbury Press, 1997.

Gupta, A., J. A. Blakeley, ¡§Using partial information to update materialized views¡¨ Information Systems Vol. 20, No. 8 pp. 641-662, 1995

Gupta, H., V. Harinarayan, A. Rajaraman, J. D. Ullman, ¡§Index Selection for OLAP¡¨, In Proc. Of the Intl. Conf. on Data Engineering, 1997

Gupta, H. ¡§Selections of views to materialize in a data warehouse¡¨, in ICDT, 1997

Gupta, H., I. S. Mumick, ¡§Selections of views to materialize under a maintenance cost constraint¡¨, Proceedings of the International Conference on Data Engineering, 1998

Harinarayan, V., A. Rajaraman, J. D.Ullman, ¡§Implementing data cubes efficiently¡¨, ACM Sigmod, June 1996

Horng, J. T., C. W. Chen, ¡§A mechanism for view consistency in a data warehousing system¡¨, The journal of systems and software (56), pp. 23-37, 2001

Horng, J. T., Y. J. Chang, B. J. Lin, C. Y. Kao, ¡§Materialized View Selection Using Genetic Algorithms in a Data Warehouse System¡¨, IEEE, pp. 2221-2227, 1999

Inmon, W. H., Building the Data Warehouse, John Wiley & Sons, Inc., 3rd edition

Kuno, H. A., E. A. Rundensteiner, ¡¨Materialized object¡Voriented views in multiview¡¨, IEEE, 1995

Kuno, H. A., E. A. Rundensteiner, ¡§Using object oriented principles to optimize update propagation to materialized views¡¨, IEEE, pp. 310-317, 1996

Kuno, H. A., E. A. Rundensteiner, ¡§Incremental maintenance of materialized views in multiview: Strategies and performance evaluation¡¨, IEEE Transactions on Knowledge and Data Engineering, Vol. 10, No.5, pp. 768-792, September/October 1998

Martin, J., System Analysis for Data Transmission, Prentice-Hall, Englewood Cliffs, New Jersey, 1972

Mistry, H., P. Roy, S. Sudarshan, K. Ramamritham, ¡§Materialized view selection and maintenance using multi-query optimization¡¨, http://www.cse.iitb.ac.in/~krithi/papers/ sigmod2001_viewmaint.pdf, 2000

Patel, J. K., C. B. Read, Handbook of the Normal Distribution, Marcel Dekker, 1982

Rajagopalan, S., ¡§Make to Order or Make to Stock: Model and Application¡¨, Management Science Vol. 48, No.2, pp. 241-256, February 2002.

Ross, S. M., Introduction to Probability Models, 8th edition, Academic Press, 2003

Salem, K., K. S. Beyer, R. Cochrane, B.G. Lindsay, ¡§How To Roll a Join: Asynchronous Incremental View Maintenance¡¨, SIGMOD Conference, pp. 129-140,2000..

Segev, A., W. Fang, ¡§Currency based updates to distributed materialized views¡¨, IEEE, pp.512-520, 1990

Segev, A., W. Fang, ¡§Optimal update policies for distributed materialized views¡¨, Management Science Vol.37, No7, pp. 851-870, July 1991

Soutyrina, E., F. Fotouhi, ¡§Optimal View Selection for Multidimensional Database Systems¡¨, IEEE, pp. 45-52, 1997

Theodoratos, D., T. Sellis, ¡§Data warehouse configuration¡¨, Proceedings of the 23rd VLDB Conference, pp. 126-135, 1997

Theodoratos, D., T. Sellis, ¡§Designing data warehouses¡¨, Elsevier, pp. 279-301, June 1999

Theodoratos, D., M. Bouzeghoub, ¡§Data Currency Quality Satisfaction in the Design of Data Warehouse¡¨, International Journal of Cooperative Information Systems Vol.10, No.3 pp. 299-326, 2001

Yang, J., K. Karlapalem, Q. Li, ¡§A framework for designing materialized views in data warehousing environment¡¨, Technical Report HKUST-CS96-35, October 1996

Yang, J., K. Karlapalem, Q. Li, ¡§Algorithms for Materialized View Design in Data Warehouse Environment¡¨, Proceedings of the 23rd VLDB Conference, pp. 136-145, 1997

Zhang, C., X. Yao and J. Yang, ¡§Evolving Materialized Views in Data Warehouse¡¨, IEEE, pp. 823- 829, 1999

Zhang, C., X. Yao, Senior Member, IEEE, J. Yang, ¡§An Evolutionary Approach to Materialized Views Selection in a Data Warehouse Environment¡¨, IEEE Transactions on Systems, Man, and Cybernetics-Part C: Application and Reviews, Vol. 31, No. 3, pp. 282-294, August 2001

Zhuge, Y., G. M. Hector, J. Hammer, J. Widom, ¡§View Maintenance in a Warehousing Environment¡¨, Technical Report, Stanford University, 1995Advisor Gwo-Ji Sheen(¨H°ê°ò)

Files disapprove authorization

91426003.pdf Date of Submission 2004-06-28

Our service phone is (03)422-7151 Ext. 57407,E-mail is also welcomed.