Exploring Data Exercises

last updated 21-Jun-2021


Selected exercises from the end of the Exploration chapter (supplemental reading) in the first edition of the text.

2. Identify at least two advantages and two disadvantages of using color to visually represent information.

3. What are the arrangement issues that arise with respect to three-dimensional plots?
It may be better to state this more generally as “What are the issues . . . ,” since selection, as well as arrangement, plays a key role in displaying a three-dimensional plot.

4. Discuss the advantages and disadvantages of using sampling to reduce the number of data objects that need to be displayed. Would simple random sampling (without replacement) be a good approach to sampling? Why or why not?
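As a starting point for Exercise 4, the following minimal sketch draws a simple random sample without replacement using Python's standard library. The data set (1000 object IDs with a 5-object "rare" group), the sample size, and the helper name simple_random_sample are all made up for illustration; they are not from the text.

```python
import random

def simple_random_sample(objects, n, seed=None):
    """Draw n distinct objects uniformly at random (without replacement)."""
    rng = random.Random(seed)
    return rng.sample(objects, n)

# Hypothetical data: 1000 object IDs, of which only 5 belong to a rare group.
objects = list(range(1000))
rare = set(range(5))

sample = simple_random_sample(objects, 50, seed=0)

# How many members of the rare group made it into the sample?
rare_in_sample = [o for o in sample if o in rare]
print("sample size:", len(sample))
print("rare objects in sample:", rare_in_sample)
```

Rerunning with different seeds and sample sizes is one way to see how often a small group is represented in the reduced display.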

7. How might you address the problem that a histogram depends on the number and location of the bins?
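To make the problem in Exercise 7 concrete, the sketch below counts the same values with different numbers of bins and with shifted bin edges. The values and the helper name histogram are made up for illustration, and equal-width bins are an assumption of this sketch rather than something the exercise requires.

```python
def histogram(values, n_bins, lo, hi):
    """Count values in n_bins equal-width bins covering [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lo <= v <= hi:
            # Put v == hi in the last bin instead of overflowing.
            idx = min(int((v - lo) / width), n_bins - 1)
            counts[idx] += 1
    return counts

# Made-up attribute values.
values = [1.0, 1.2, 1.9, 2.1, 2.2, 2.8, 3.1, 3.9, 4.0, 4.1]

two_bins = histogram(values, 2, 0.0, 5.0)      # few, wide bins
five_bins = histogram(values, 5, 0.0, 5.0)     # more, narrower bins
shifted_bins = histogram(values, 5, 0.5, 5.5)  # same width, edges shifted by 0.5

print(two_bins)      # [5, 5]
print(five_bins)     # [0, 3, 3, 2, 2]
print(shifted_bins)  # [2, 3, 2, 3, 0]
```

The same ten values look flat, left-empty, or right-empty depending only on the bin count and bin location.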

8. Describe how a box plot can give information about whether the value of an attribute is symmetrically distributed. What can you say about the symmetry of the distributions of the attributes shown in the box plot below?

9. Compare sepal length, sepal width, petal length, and petal width, using the scatterplot matrix below.

10. Comment on the use of a box plot to explore a data set with four attributes: age, weight, height, and income.

11. Give a possible explanation as to why most of the values of petal length and width fall in the buckets along the diagonal in the scatterplot matrix above.

12. Use the figures above to identify a characteristic shared by the petal width and petal length attributes.


16. Construct a data cube from the table below. Is this a dense or sparse data cube? If it is sparse, identify the cells that are empty.

Fact table for Exercise 16.

Product ID   Location ID   Number Sold
1            1             10
1            3             6
2            1             5
2            2             22
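
One way to check a hand-built cube for Exercise 16 is to lay the fact table out as a product-by-location grid. The sketch below assumes the dimension values are exactly the IDs that occur in the table (products 1 and 2, locations 1 to 3); the helper name build_cube and the use of None for an empty cell are choices made for this illustration.

```python
# Fact table rows from Exercise 16: (product_id, location_id, number_sold).
facts = [(1, 1, 10), (1, 3, 6), (2, 1, 5), (2, 2, 22)]

def build_cube(facts):
    """Return (products, locations, cells); cells maps (product, location)
    to the total number sold, or None if no fact exists for that pair."""
    products = sorted({p for p, _, _ in facts})
    locations = sorted({l for _, l, _ in facts})
    cells = {(p, l): None for p in products for l in locations}
    for p, l, sold in facts:
        cells[(p, l)] = (cells[(p, l)] or 0) + sold
    return products, locations, cells

products, locations, cells = build_cube(facts)

# Print the grid, one row per product; "-" marks an empty cell.
print("product", *[f"loc{l}" for l in locations], sep="\t")
for p in products:
    row = ["-" if cells[(p, l)] is None else cells[(p, l)] for l in locations]
    print(p, *row, sep="\t")

# Cells with no corresponding row in the fact table.
empty = [key for key, sold in cells.items() if sold is None]
print("empty cells (product, location):", empty)
```

Summing each row or column of the grid gives the marginal totals that are usually included when the cube is written out in full.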