This week you are going to build your first prototype graph data model for a real-world dataset. To start, download the flights.csv dataset from the following link:
Note: you only need flights.csv; the other two files in the download are not used.
About the data
The U.S. Department of Transportation’s (DOT) Bureau of Transportation Statistics tracks the on-time performance of domestic flights operated by large air carriers. Summary information on the number of on-time, delayed, canceled, and diverted flights is published in DOT’s monthly Air Travel Consumer Report and in this dataset of 2015 flight delays and cancellations.
To simplify things in these early stages of your graph data modeling career, the application question is pre-defined:
“As an air travel enthusiast, I want to know how airports are connected, so that I can identify the busiest airports.”
As you prepare to answer this question with your initial model, be sure to consider the following:
- What are the entities?
- What are the connections between the entities?
- What properties are needed?
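
One possible starting point, sketched in Python, treats airports as the entities (nodes), flights between them as the connections (directed edges), and the number of flights on each route as an edge property. This is only an illustrative sketch, not the expected solution: the column names ORIGIN_AIRPORT and DESTINATION_AIRPORT are assumptions that should be checked against the header row of flights.csv, the function names are made up for this example, and the sample rows are invented so the snippet runs without the real file.

```python
import csv
import io
from collections import Counter

# Assumed column names; verify them against the header row of flights.csv.
ORIGIN_COL = "ORIGIN_AIRPORT"
DEST_COL = "DESTINATION_AIRPORT"


def build_route_counts(rows):
    """Count flights per directed (origin, destination) pair.

    Each key is an edge between two airport nodes; the count is an edge property.
    """
    routes = Counter()
    for row in rows:
        routes[(row[ORIGIN_COL], row[DEST_COL])] += 1
    return routes


def busiest_airports(routes, top_n=3):
    """Rank airports by total flights departing from or arriving at them."""
    traffic = Counter()
    for (origin, dest), flights in routes.items():
        traffic[origin] += flights
        traffic[dest] += flights
    return traffic.most_common(top_n)


# Invented sample rows standing in for flights.csv (airport codes are placeholders).
SAMPLE = """ORIGIN_AIRPORT,DESTINATION_AIRPORT
AAA,BBB
AAA,CCC
BBB,AAA
CCC,AAA
BBB,CCC
"""

routes = build_route_counts(csv.DictReader(io.StringIO(SAMPLE)))
top = busiest_airports(routes)
print(top)
```

To try it on the real data, replace the sample with `csv.DictReader(open("flights.csv", newline=""))`. Counting flights is only one way to define "busiest"; counting distinct connected airports instead would lead to a different ranking and possibly a different choice of properties.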