# Exploring Data Exercises

last updated 21-Jun-2021

Selected end-of-chapter exercises from the Exploration chapter (supplemental reading) of the text's first edition.

2. Identify at least two advantages and two disadvantages of using color to visually represent information.

3. What are the arrangement issues that arise with respect to three-dimensional plots?
It may be better to state this more generally as “What are the issues . . . ,” since selection, as well as arrangement, plays a key role in displaying a three-dimensional plot.

4. Discuss the advantages and disadvantages of using sampling to reduce the number of data objects that need to be displayed. Would simple random sampling (without replacement) be a good approach to sampling? Why or why not?
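As a starting point for Exercise 4, the following minimal sketch contrasts simple random sampling without replacement with a per-group (stratified) sample. The data set, the `common`/`rare` labels, and both function names are assumptions made up for illustration; they do not come from the text.

```python
import random
from collections import Counter, defaultdict

def simple_random_sample(objects, n, seed=0):
    """Draw n distinct objects uniformly at random (without replacement)."""
    rng = random.Random(seed)
    return rng.sample(objects, n)

def stratified_sample(objects, key, n_per_group, seed=0):
    """Draw up to n_per_group objects from each group defined by key(obj)."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for obj in objects:
        groups[key(obj)].append(obj)
    sample = []
    for members in groups.values():
        k = min(n_per_group, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical data: 990 objects in a common group, 10 in a rare group.
data = [(i, "common") for i in range(990)] + [(i, "rare") for i in range(990, 1000)]

srs = simple_random_sample(data, 20)
strat = stratified_sample(data, key=lambda obj: obj[1], n_per_group=10)

# Compare how well each sample represents the rare group.
print(Counter(label for _, label in srs))
print(Counter(label for _, label in strat))
```

A simple random sample gives every object the same chance of selection, so a small group can end up with few or no representatives in the display; sampling per group is one way to keep such groups visible.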

7. How might you address the problem that a histogram depends on the number and location of the bins?
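To make the bin dependence in Exercise 7 concrete, here is a small sketch that bins the same values twice, changing only where the bins start. The `histogram` helper and the sample values are hypothetical and chosen only to make the effect visible.

```python
import math
from collections import Counter

def histogram(values, width, origin=0.0):
    """Count values in bins [origin + i*width, origin + (i+1)*width).

    Returns a dict mapping each non-empty bin's left edge to its count.
    """
    counts = Counter()
    for v in values:
        index = math.floor((v - origin) / width)
        counts[index] += 1
    return {origin + i * width: counts[i] for i in sorted(counts)}

# Hypothetical sample values.
values = [1.0, 1.9, 2.1, 2.9, 3.1, 3.9]

flat = histogram(values, width=1.0, origin=0.0)
peaked = histogram(values, width=1.0, origin=0.5)

print(flat)    # {1.0: 2, 2.0: 2, 3.0: 2}
print(peaked)  # {0.5: 1, 1.5: 2, 2.5: 2, 3.5: 1}
```

The same six values look uniform with one bin origin and peaked with another. Comparing several bin settings is one way to check whether a feature of the histogram is real; a smoothed estimate such as kernel density estimation is a commonly used alternative that avoids bin edges.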

8. Describe how a box plot can give information about whether the value of an attribute is symmetrically distributed. What can you say about the symmetry of the distributions of the attributes shown in the box plot below?

• (a) If the line representing the median of the data is in the middle of the box, then the data is symmetrically distributed, at least in terms of the 50% of the data between the first and third quartiles. For the remaining data, the lengths of the whiskers and the positions of the outliers are also an indication, although, since these features do not involve as many points, they may be misleading.
• (b) Sepal width and length seem to be relatively symmetrically distributed, petal length seems to be rather skewed, and petal width is somewhat skewed.

9. Compare sepal length, sepal width, petal length, and petal width, using the scatterplot matrix below.

10. Comment on the use of a box plot to explore a data set with four attributes: age, weight, height, and income.
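The symmetry check described in answer (a) to Exercise 8 can be sketched numerically: compute the quartiles and see where the median sits inside the box. The function name `median_position` and both data lists are assumptions for illustration, and `statistics.quantiles` uses its default (exclusive) method, which may differ slightly from the quartile rule a given plotting tool applies.

```python
import statistics

def median_position(values):
    """Return where the median falls inside the box, as a fraction.

    0.5 means the median is centred between Q1 and Q3; values well
    below or above 0.5 suggest skew in the middle half of the data.
    """
    q1, median, q3 = statistics.quantiles(values, n=4)
    if q3 == q1:
        raise ValueError("interquartile range is zero; position is undefined")
    return (median - q1) / (q3 - q1)

# Hypothetical data: one evenly spread list, one with a long right tail.
symmetric = list(range(1, 10))
right_skewed = [1, 1, 1, 2, 2, 3, 5, 8, 13]

pos_sym = median_position(symmetric)
pos_skew = median_position(right_skewed)

print(pos_sym)   # 0.5: median in the middle of the box
print(pos_skew)  # below 0.5: median sits near the lower quartile
```

This only examines the middle half of the data; the whiskers and outliers still need to be inspected separately, as the answer notes.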

11. Give a possible explanation as to why most of the values of petal length and width fall in the buckets along the diagonal in the scatterplot matrix above.

12. Use the figures above to identify a characteristic shared by the petal width and petal length attributes.

16. Construct a data cube from the table below. Is this a dense or sparse data cube? If it is sparse, identify the cells that are empty.

Fact table for Exercise 16.

| Product ID | Location ID | Number Sold |
|------------|-------------|-------------|
| 1          | 1           | 10          |
| 1          | 3           | 6           |
| 2          | 1           | 5           |
| 2          | 2           | 22          |
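One way to approach Exercise 16 is to build the cube programmatically from the fact table and list the cells that have no entry. This is a minimal sketch; it assumes the only dimension values are those that appear in the table (products 1 and 2, locations 1 to 3), and the variable names are illustrative.

```python
# Fact table rows: (product_id, location_id, number_sold)
fact_table = [
    (1, 1, 10),
    (1, 3, 6),
    (2, 1, 5),
    (2, 2, 22),
]

# Assumption: the dimensions contain only the IDs seen in the fact table.
products = sorted({p for p, _, _ in fact_table})
locations = sorted({l for _, l, _ in fact_table})

# Two-dimensional cube: one cell per (product, location) pair, None if empty.
cube = {(p, l): None for p in products for l in locations}
for p, l, sold in fact_table:
    cube[(p, l)] = sold

empty_cells = [cell for cell, sold in cube.items() if sold is None]

# Marginal totals (aggregating over one dimension) and the grand total.
by_product = {p: sum(cube[(p, l)] or 0 for l in locations) for p in products}
by_location = {l: sum(cube[(p, l)] or 0 for p in products) for l in locations}
total = sum(by_product.values())

print(empty_cells)   # [(1, 2), (2, 3)]
print(by_product)    # {1: 16, 2: 27}
print(by_location)   # {1: 15, 2: 22, 3: 6}
print(total)         # 43
```

Under these assumptions the cube has six cells, four of which are filled; whether that counts as dense or sparse is the judgment the exercise asks for.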