1) Briefly discuss the role of data cube aggregation and dimensionality reduction in the data reduction process (16).

2) Explain the architecture of a typical data mining system.

3) Suppose that the data for analysis include the attribute age. The age values for the data tuples are (in increasing order):

13,15,16,16,19,20,20,21,22,22,25,25,25,30,33,33,35,35,35,36,40,45,46,52,70

i) What is the mean of the data? What is the median?

ii) What is the mode of the data? Comment on the data’s modality (i.e., bimodal, trimodal, etc.).

iii) What is the midrange of the data?

iv) Can you find (roughly) the first quartile (Q1) and the third quartile (Q3) of the data?

v) Give the five-number summary of the data.

vi) Show a boxplot of the data.

vii) How is a quantile-quantile plot different from a quantile plot?
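
A minimal sketch of how parts i) to v) could be checked numerically, using only Python's standard `statistics` module on the age list exactly as printed in question 3. The function name `five_number_summary` is hypothetical, and the quartiles use the module's "inclusive" interpolation method, which is only one of several common conventions, so a hand-worked answer may differ slightly.

```python
from statistics import mean, median, multimode, quantiles

# Age values exactly as listed in question 3 (already sorted).
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum).

    Quartiles use the 'inclusive' method of statistics.quantiles;
    other textbooks and tools may use a different convention.
    """
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


avg = mean(ages)                       # arithmetic mean
mid = median(ages)                     # middle value of the sorted list
modes = multimode(ages)                # every value tied for highest frequency
midrange = (min(ages) + max(ages)) / 2
summary = five_number_summary(ages)

print("mean:", avg)
print("median:", mid)
print("modes:", modes)
print("midrange:", midrange)
print("five-number summary:", summary)
```

The length of `modes` indicates the modality (two entries would mean bimodal). The five-number summary is also what a boxplot draws: the box spans Q1 to Q3, with a line at the median and whiskers toward the minimum and maximum.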

4) What is a transactional database? Describe any five advanced database systems.

5) Briefly describe the issues to consider during data integration.

6) What are the value ranges of the following normalization methods?

i) Min-max normalization

ii) Z-score normalization

iii) Normalization by decimal scaling

7) Suppose that the data for analysis include the attribute age. The age values for the data tuples are (in increasing order):

13,15,16,16,19,20,21,22,22,25,25,25,25,30,33,33,35,35,35,35,36,40,45,46,52,70

i) Use min-max normalization to transform the value 35 for age onto the range [0.0, 0.1].

ii) Use z-score normalization to transform the value 35 for age, where the standard deviation of age is 12.94.

iii) Use normalization by decimal scaling to transform the value 35 for age.
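
The three transformations in question 7 can be sketched as below. This is an illustrative sketch, not a model answer: the function names `min_max`, `z_score`, and `decimal_scaling` are hypothetical, the age list is copied as printed in question 7, the target range is the one stated in part i), and the standard deviation 12.94 is taken from part ii) rather than recomputed.

```python
from statistics import mean

# Age values exactly as listed in question 7.
ages = [13, 15, 16, 16, 19, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def min_max(v, values, new_min, new_max):
    """Linearly map v from [min(values), max(values)] to [new_min, new_max]."""
    lo, hi = min(values), max(values)
    return (v - lo) / (hi - lo) * (new_max - new_min) + new_min


def z_score(v, values, std_dev):
    """Distance of v from the mean, in units of the given standard deviation."""
    return (v - mean(values)) / std_dev


def decimal_scaling(v, values):
    """Divide v by 10**j, where j is the smallest integer
    that makes every scaled absolute value less than 1."""
    max_abs = max(abs(x) for x in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return v / 10 ** j


print("min-max:", min_max(35, ages, 0.0, 0.1))       # range from part i)
print("z-score:", z_score(35, ages, 12.94))          # std dev from part ii)
print("decimal scaling:", decimal_scaling(35, ages))
```

Passing a different `new_min` and `new_max` to `min_max` rescales onto any other target interval without changing the rest of the code.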

8) Describe the mining methodology and user interaction issues (major issues of data mining).

(or)

Describe three challenges to data mining regarding data mining methodology and user interaction issues.

9) What is data reduction? Explain dimensionality reduction. What are lossless and lossy dimensionality reduction?

10) Define data reduction. Explain numerosity reduction.

11) What is redundancy? Why is correlation analysis useful? Describe how the correlation coefficient is computed.

12) What are lossy and lossless dimensionality reduction? Describe any one technique for lossy dimensionality reduction.

13) What is data integration? What is entity identification and why is it useful? What is redundancy? Why is correlation analysis useful? Describe how the correlation coefficient is computed.

14) What is numerosity reduction? What are the available techniques for numerosity reduction? Describe any two techniques for numerosity reduction.

15) Briefly describe the functionalities of a data mining system.

16) What is data discretization? Explain entropy-based data discretization.

17) What is data reduction? Discuss dimensionality reduction.

18) Explain various data reduction techniques.

19) Why are data mining functionalities used? Explain data characterization and data discrimination with an example.

20) Briefly explain the various forms of data preprocessing with a neat diagram.

21) What is data integration? Discuss the issues to be considered for data integration.

(or)

Briefly discuss data integration.

22) What is data mining? Briefly explain the knowledge discovery process.

(or)

Explain data mining as a step in the process of knowledge discovery.

23) What is data cleaning? Describe the approaches to fill missing values.

24) With examples, describe in detail the available techniques for concept hierarchy generation for categorical data.

25) What is data dispersion? Describe the common measures for data dispersion.

26) Discuss the parametric and nonparametric methods of numerosity reduction.

27) Briefly discuss the discretization and concept hierarchy techniques.

28) What kinds of data are required for data mining?

29) What kinds of patterns can be mined?

30) Discuss dimensionality reduction and data compression.

31) Discuss in detail the major issues in data mining.

32) Explain discretization and concept hierarchy generation for numeric data.

33) What motivated data mining? Why is it important? What is data mining?

34) Why preprocess the data? Explain data cleaning.

35) Explain concept hierarchy generation for categorical data.

36) Briefly discuss data smoothing techniques.

37) Briefly explain the role of data cube aggregation and dimensionality reduction in the data reduction process.

38) Briefly discuss data transformation.

39) Discuss the role of data compression and numerosity reduction in the data reduction process.

40) Explain advanced database systems and advanced database applications.

41) Suppose that the data for analysis include the attribute age. The age values for the data tuples are (in increasing order):

13,15,16,16,19,20,21,22,22,25,25,25,25,30,33,33,35,35,35,35,36,40,45,46,52,70

a) Use smoothing by bin means to smooth the above data, using a bin depth of 3. Illustrate your steps. Comment on the effect of this technique for the given data.
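
The steps of smoothing by bin means can be sketched as follows, assuming equal-depth (equal-frequency) partitioning of the sorted values. The function name `smooth_by_bin_means` is hypothetical. Because the list as printed in question 41 has 26 values, a depth of 3 leaves a final bin of only two values in this sketch.

```python
# Age values exactly as listed in question 41.
ages = [13, 15, 16, 16, 19, 20, 21, 22, 22, 25, 25, 25, 25,
        30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def smooth_by_bin_means(values, depth):
    """Sort the values, split them into consecutive bins of `depth`
    values each, and replace every value with the mean of its bin.
    The last bin may be shorter when len(values) is not a multiple
    of depth."""
    ordered = sorted(values)
    smoothed = []
    for start in range(0, len(ordered), depth):
        bin_values = ordered[start:start + depth]
        bin_mean = sum(bin_values) / len(bin_values)
        smoothed.extend([bin_mean] * len(bin_values))
    return smoothed


smoothed = smooth_by_bin_means(ages, depth=3)

# Show each bin next to its smoothed counterpart.
for start in range(0, len(ages), 3):
    print(ages[start:start + 3], "->", smoothed[start:start + 3])
```

Printing the bins side by side makes the effect visible: values inside a bin collapse to one local average, so small fluctuations disappear, while an extreme value pulls the mean of its own bin toward it.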

