Optimizing a Graph Data Model

By now you should be comfortable creating graph data models to answer specific business needs, implementing those models, and testing them for scalability. The final stop on our tour of graph data modeling is the inevitable, often maligned, constantly dreaded second request or change to the business question. In your data modeling career, you will undoubtedly face changes in scope that require retooling and redevelopment. Luckily, you already have the tools to handle anything the business unit can throw your way!

Open the database you created in Week 4 (in which you completed Exercises 1-2) and run the following browser command:

:play 4.0-neo4j-modeling-exercises

Follow the instructions for Exercises 3 and 4. Stop after Exercise 4.

Part 1 – Adding another question to the model

Our new question is as follows:

“As a frequent traveller I want to find flights from <origin> to <destination> on <date> so that I can book my business flight.”

For example:

Find all the flights going from Las Vegas (LAS) to Chicago Midway International (MDW) on the 3rd of January, 2019.

Implementing the query

Here is the query:

MATCH (origin:Airport {code: 'LAS'})
    <-[:ORIGINATES_FROM]-(flight:Flight)-[:LANDS_IN]->
    (destination:Airport {code: 'MDW'})
WHERE flight.date = '2019-1-3'
RETURN origin, destination, flight

Deliverable 1

Profile the query above, take a screenshot of the output, and write 2-3 paragraphs explaining what you see in the query plan.

Part 2 – Performing another refactor

We want to introduce AirportDay nodes so that we do not have to scan through all the flights going from an airport when we are only interested in a subset of them.

This is an instance where we do not want to remove the relationships between airports and flights because we need them for our first query “What are the airports and flight information for flight number 1016 for airline WN?”.

In this case, we are adding an AirportDay node that holds the date information, so a query no longer has to go through the Flight nodes to find a date. Just like the Flight nodes, each AirportDay node will have a unique ID, AirportDay.airportDayId, that queries can use to look it up.

Deliverable 2

Create a constraint that ensures the airportDayId property of each new AirportDay node is unique. Then, perform the following:

  • MATCH the data you want to move.
  • Create the new AirportDay nodes.
  • Connect the new nodes to the existing graph.

A helpful hint about MERGE:

MERGE lets us add a single AirportDay node per airportDayId value and ensures that only one relationship is created between a Flight and an AirportDay node.
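The steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the required answer: it uses the Neo4j 4.x constraint syntax (to match the 4.0 guide), a hypothetical constraint name, a hypothetical HAS_DAY relationship type, and an airportDayId built from the airport code and the flight date. It also assumes that ORIGINATES_FROM and LANDS_IN are reused between Flight and AirportDay. None of these choices come from the assignment text.

```cypher
// Assumed Neo4j 4.x syntax. Neo4j 5 instead uses:
// CREATE CONSTRAINT ... FOR (a:AirportDay) REQUIRE a.airportDayId IS UNIQUE
CREATE CONSTRAINT airport_day_id_unique
ON (a:AirportDay) ASSERT a.airportDayId IS UNIQUE;

// 1. MATCH the data to move: every flight with its origin and destination.
MATCH (origin:Airport)<-[:ORIGINATES_FROM]-(flight:Flight)-[:LANDS_IN]->(destination:Airport)

// 2. Create one AirportDay per airport/date pair (the ID format is an assumption).
MERGE (originDay:AirportDay {airportDayId: origin.code + '_' + flight.date})
  ON CREATE SET originDay.date = flight.date
MERGE (destinationDay:AirportDay {airportDayId: destination.code + '_' + flight.date})
  ON CREATE SET destinationDay.date = flight.date

// 3. Connect the new nodes to the existing graph.
//    HAS_DAY is a hypothetical relationship type; the existing
//    Airport-Flight relationships are left in place.
MERGE (origin)-[:HAS_DAY]->(originDay)
MERGE (flight)-[:ORIGINATES_FROM]->(originDay)
MERGE (destination)-[:HAS_DAY]->(destinationDay)
MERGE (flight)-[:LANDS_IN]->(destinationDay);
```

Because every step uses MERGE, running the second statement again should not create duplicate nodes or relationships. In Neo4j Browser, either enable the multi-statement query editor or run the two statements one at a time.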

Add your code to your Word document from Part 1 and include a screenshot of the result.

Part 3 – Profile your queries

After a refactor, you should check that all existing queries still perform acceptably. Start by profiling the query for the original question:

What are the airports and flight information for flight number 1016 for airline WN?

PROFILE
MATCH (origin)<-[:ORIGINATES_FROM]-(flight:Flight)-[:LANDS_IN]->(destination)
WHERE flight.airline = 'WN' AND flight.number = '1016'
RETURN origin, destination, flight

Deliverable 3

Briefly discuss the results of your profile. Keep in mind that the query will have more db hits than when we originally ran it, because we added another 10k nodes to the graph.

Next, rerun the original query for the second question:

PROFILE
MATCH (origin:Airport {code: 'LAS'})
    <-[:ORIGINATES_FROM]-(flight:Flight)-[:LANDS_IN]->
    (destination:Airport {code: 'MDW'})
WHERE flight.date = '2019-1-3'
RETURN origin, destination, flight

Briefly discuss the results of your profile.

Finally, write a revised query for the second question based on the additions you made to the model. Profile the query and discuss what changes you saw in performance. Evaluate whether your refactor was successful and whether you would recommend additional changes to the model for our current two questions.
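One possible shape for such a revised query is sketched below. It rests on assumptions that are not part of the assignment text: each Airport is linked to its AirportDay nodes by a hypothetical HAS_DAY relationship, ORIGINATES_FROM and LANDS_IN are reused between Flight and AirportDay, and airportDayId values have the form code_date. Adjust the names to match whatever your own refactor created.

```cypher
// Sketch only: HAS_DAY and the 'LAS_2019-1-3' ID format are assumptions.
PROFILE
MATCH (origin:Airport {code: 'LAS'})-[:HAS_DAY]->(:AirportDay {airportDayId: 'LAS_2019-1-3'})
      <-[:ORIGINATES_FROM]-(flight:Flight)-[:LANDS_IN]->
      (:AirportDay {airportDayId: 'MDW_2019-1-3'})<-[:HAS_DAY]-(destination:Airport {code: 'MDW'})
RETURN origin, destination, flight
```

Anchoring on airportDayId lets the planner use the index behind the uniqueness constraint, so only the flights attached to that airport and day need to be expanded. The profile output should show whether that holds for your data.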

Turn in all three deliverable sections in a single Word document.
