Why is data gathered?

Some data is gathered because it is required by law, some is gathered as part of the operational process of an institution or company, and some is gathered just for the sake of gathering it. The reality is that data is gathered all the time, often without people being aware of it. Collected data can often be used to help answer future questions or solve future problems. One of the challenges in these situations is that the data was not necessarily gathered with one particular goal in mind. This means that the data scientist must determine which subset of the data is most relevant, because superfluous data can make it difficult to draw reasonable conclusions.

The previous section discussed examples of data that is gathered about people. You may have discussed some of these examples with your students. Now consider what can be done with that data. For example, what are the benefits of giving doctors access to a person’s complete medical history? What can a website do with a person’s browsing habits? What is one way that a person’s financial transactions might be used? Who should have access to all this information?

Student Activity

Consider the following scenarios. For each scenario, describe what data you would gather and how you could use it.

1. A city wants to find ways to reduce traffic congestion on one major, busy street. Decide what kind of data would be most useful in coming up with a solution. Some possibilities are: the average number of cars using the street at a given time of day, the busiest times, the average speed of cars using the street, the most common routes cars are taking (that is, where they are coming from when they enter the busy street and where they are going when they leave it), and the busiest intersections.

2. A hotel wants to increase its profit. Profit is the money coming in (revenue) minus the money going out (expenses). So to increase profit, the hotel must increase revenue, decrease expenses, or both. Consider: What data will help the hotel make decisions that increase its profit? Money comes in when a guest stays at the hotel, but there are other sources as well, such as the restaurant and amenities such as Wi-Fi. What expenses must the hotel consider? These include salaries paid, utility bills, free services offered, and more.
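
For the traffic scenario, it can help to show students what gathered data might look like once it is written down. The following Python sketch is only an illustration: the observations, the units, and the function name summarize_by_hour are all invented for this example and do not come from a real city. It groups hypothetical counts by hour of day and reports the average number of cars and the average speed for each hour.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical observations (invented for illustration):
# (hour of day, cars counted, average speed in km/h)
observations = [
    (8, 420, 22.0),
    (8, 460, 18.0),
    (12, 250, 41.0),
    (12, 270, 39.0),
    (17, 510, 15.0),
    (17, 490, 17.0),
]


def summarize_by_hour(rows):
    """Return the average car count and average speed for each hour."""
    counts = defaultdict(list)
    speeds = defaultdict(list)
    for hour, cars, speed in rows:
        counts[hour].append(cars)
        speeds[hour].append(speed)
    return {
        hour: {"avg_cars": mean(counts[hour]), "avg_speed": mean(speeds[hour])}
        for hour in counts
    }


summary = summarize_by_hour(observations)

# The busiest hour is the one with the highest average car count.
busiest_hour = max(summary, key=lambda h: summary[h]["avg_cars"])

for hour in sorted(summary):
    print(hour, summary[hour])
print("Busiest hour:", busiest_hour)
```

A summary like this covers only two of the possibilities listed in the scenario (busiest times and average speed). Routes and intersections would require different data, which is a useful point to raise with students.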

In both scenarios, you do not have to solve the problem. You only need to come up with examples of data that should be gathered in order to make informed decisions.
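
To make the hotel scenario concrete, the relationship "profit equals revenue minus expenses" can be written as a short Python sketch. All of the figures and category names below are hypothetical and chosen only for illustration; a real hotel would track many more categories.

```python
# Hypothetical monthly figures (invented for illustration).
revenue = {"rooms": 120000, "restaurant": 30000, "amenities": 5000}
expenses = {"salaries": 70000, "utilities": 15000, "free_services": 8000}


def profit(revenue, expenses):
    """Profit is total revenue minus total expenses."""
    return sum(revenue.values()) - sum(expenses.values())


monthly_profit = profit(revenue, expenses)
print(monthly_profit)  # 155000 - 93000 = 62000
```

Each key in the two dictionaries is a kind of data the hotel would need to gather. Students can suggest further categories and discuss which ones the hotel could realistically change.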