Safekipedia

Query optimization

Adapted from Wikipedia · Discoverer experience

Query optimization is a feature of many types of databases, such as relational database management systems, NoSQL databases, and graph databases. It helps these databases find an efficient way to answer the questions, or queries, that people ask them.

When someone sends a query to a database, it first gets checked by a parser. Then the query optimizer takes over. The optimizer looks at many possible ways the work could be done, called query plans, and tries to pick the best one. Some database engines also let people guide the optimizer with special instructions called hints.

Queries can be simple, like asking for the address of the person with a particular Social Security number. Or they can be very complicated, like asking for the average salary of a certain group of people. Because databases hold lots of information arranged in many ways, there are usually many possible paths to the same answer. Some paths are quick, while others can take a long time. Query optimization tries to pick a quick path without spending too much time choosing it, so the answer comes back fast.
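To make the two kinds of queries concrete, here is a small sketch using SQLite, a database that comes with Python. The table, its columns, and the people in it are all made up for this example. The last step asks SQLite to describe the path its own optimizer picked for the simple query; the exact wording of that description depends on the SQLite version.

```python
import sqlite3

# Hypothetical table and rows, invented for this example.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE people ("
    "ssn TEXT PRIMARY KEY, address TEXT, state TEXT, married INTEGER, salary REAL)"
)
db.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?, ?)",
    [
        ("000-00-0001", "1 Oak Street", "California", 1, 50000.0),
        ("000-00-0002", "2 Elm Street", "California", 1, 70000.0),
        ("000-00-0003", "3 Pine Street", "Nevada", 0, 40000.0),
    ],
)

# Simple query: look up one address by Social Security number.
simple_sql = "SELECT address FROM people WHERE ssn = ?"
address = db.execute(simple_sql, ("000-00-0002",)).fetchone()[0]

# More complicated query: the average salary of one group of people.
complex_sql = "SELECT AVG(salary) FROM people WHERE state = ? AND married = 1"
average_salary = db.execute(complex_sql, ("California",)).fetchone()[0]

# Ask SQLite which path its optimizer chose for the simple query.
# The last column of each row is a short description in words.
plan_rows = db.execute("EXPLAIN QUERY PLAN " + simple_sql, ("000-00-0002",))
plan_steps = [row[-1] for row in plan_rows]

print(address)
print(average_salary)
print(plan_steps)
```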

General considerations

Finding the very best plan can itself take a long time. Spending more time planning can lead to a better plan, but the planning time is added to the time the query takes. Databases must balance the two, so the optimizer often settles for a plan that is good enough rather than the best one possible.

Some databases use a method called cost-based optimization. They look at different plans for answering the query and estimate a "cost" for each one. This cost includes things like how much reading from the disk is needed, how much work the computer's processor needs to do, and how much memory is used. By picking the plan with the smallest estimated cost, the database hopes to answer the query as quickly as possible.
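The idea of picking the plan with the smallest cost can be sketched in a few lines. This is only an illustration: the plan names, the numbers, and the three weights are invented here, and real databases use much more detailed cost formulas.

```python
from dataclasses import dataclass


@dataclass
class Plan:
    name: str
    disk_pages_read: int  # how much reading from the disk
    cpu_operations: int   # how much work for the processor
    memory_pages: int     # how much space is used


# Hypothetical weights: reading from disk is treated as the most expensive.
DISK_WEIGHT = 4.0
CPU_WEIGHT = 0.01
MEMORY_WEIGHT = 0.1


def estimated_cost(plan):
    """Combine the three estimates into a single cost number."""
    return (
        DISK_WEIGHT * plan.disk_pages_read
        + CPU_WEIGHT * plan.cpu_operations
        + MEMORY_WEIGHT * plan.memory_pages
    )


def cheapest_plan(plans):
    """Return the plan with the smallest estimated cost."""
    return min(plans, key=estimated_cost)


candidates = [
    Plan("read the whole table", 10000, 1000000, 10),
    Plan("use an index", 30, 500, 5),
]

best = cheapest_plan(candidates)
print(best.name)
```

With these made-up numbers, the index plan wins because it reads far fewer pages from the disk.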

Implementation

Most query optimizers show how a query will be done using a tree of steps. Each step is a node in the tree and does one job to help finish the query. For example, one node might join two groups of information together, while another might sort the information.
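A tree of steps like this can be sketched with a small class. The class name `PlanNode`, the helper `describe`, and the step names are invented for this example and do not come from any real database.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    operation: str                                 # the one job this step does
    children: list = field(default_factory=list)   # steps that feed into it


def describe(node, depth=0):
    """Return the tree as indented lines, one line per step."""
    lines = ["  " * depth + node.operation]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines


# The sort step sits at the top and gets its input from a join step,
# which in turn gets its input from two steps that read tables.
plan = PlanNode("sort by name", [
    PlanNode("join people with addresses", [
        PlanNode("read table: people"),
        PlanNode("read table: addresses"),
    ]),
])

print("\n".join(describe(plan)))
```

The steps at the bottom of the tree run first, and each step passes its results up to its parent.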

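The order in which tables are joined can change how fast a query runs. An order that makes large in-between results early on can take much longer than an order that keeps those results small. Because the number of possible orders grows very quickly as more tables are added, query optimizers use special methods to search for a good join order.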
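Here is a rough sketch of how join order changes the amount of work. It tries every order for three made-up tables and adds up how many rows each in-between join produces. The table sizes and the matching fractions are assumptions for this example, and trying every order only works for a handful of tables; real optimizers use smarter search methods.

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical table sizes (number of rows).
table_rows = {"orders": 100000, "customers": 1000, "countries": 50}

# Hypothetical fraction of row pairs that match when two tables are joined.
# Pairs not listed here have no join rule, so every pair of rows matches.
match_fraction = {
    frozenset(["orders", "customers"]): Fraction(1, 1000),
    frozenset(["customers", "countries"]): Fraction(1, 50),
}


def join_cost(order):
    """Add up the rows produced by each join when tables are joined in order."""
    joined = [order[0]]
    rows = table_rows[order[0]]
    total = 0
    for table in order[1:]:
        rows *= table_rows[table]
        for other in joined:
            rows *= match_fraction.get(frozenset([other, table]), 1)
        joined.append(table)
        total += rows
    return total


best_order = min(permutations(table_rows), key=join_cost)

for order in permutations(table_rows):
    print(order, int(join_cost(order)))
```

In this example the cheapest orders join the two small tables first and bring in the big `orders` table last.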

Estimating how long different ways of running a query will take is tricky. It depends on guessing how many rows of information each step will produce. These guesses can be wrong when the information in a table is connected in certain ways, like when picking a specific car model also means picking a specific car make. Keeping statistics about the tables up to date helps the optimizer make better choices.
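The car example can be worked through with a tiny made-up table. The sketch below assumes a common simple way of guessing: treat the two conditions as if they had nothing to do with each other and multiply their fractions. The list of cars is invented for this example.

```python
# Hypothetical table of 100 cars as (make, model) pairs.
cars = (
    [("Honda", "Civic")] * 30
    + [("Honda", "Accord")] * 20
    + [("Toyota", "Corolla")] * 50
)


def fraction_matching(rows, condition):
    """Fraction of rows for which the condition is true."""
    return sum(1 for row in rows if condition(row)) / len(rows)


make_fraction = fraction_matching(cars, lambda car: car[0] == "Honda")
model_fraction = fraction_matching(cars, lambda car: car[1] == "Civic")

# Guess made by pretending make and model are unrelated.
independent_guess = make_fraction * model_fraction * len(cars)

# True number of rows where make is Honda AND model is Civic.
actual_rows = sum(1 for car in cars if car == ("Honda", "Civic"))

print(round(independent_guess), actual_rows)
```

Every Civic is already a Honda, so the second condition adds nothing new. The guess comes out at about half the true number of rows, and a mistake like that can lead the optimizer to a slower plan.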

Extensions

Classical query optimization looks at different ways to run a query and picks the fastest one. But sometimes, there are more things to think about than just speed. For example, in cloud computing, you might also care about how much money a query costs.

Some ways to improve query optimization take into account many different goals at once, like speed and cost. They try to find the best balance between these goals, depending on what the user prefers. This helps make sure the query runs well in many different situations.
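One way to picture balancing two goals is sketched below. It first throws away any plan that another plan beats on both time and money, then uses weights to pick among the plans that are left. The plans, their numbers, and the weights are invented for this example, and this is only one of several possible approaches.

```python
# Hypothetical plans with made-up (seconds, dollars) estimates.
plans = {
    "many machines": (5, 20),
    "a few machines": (12, 6),
    "one machine": (30, 2),
    "wasteful plan": (30, 25),
}


def dominates(a, b):
    """True if a is at least as good as b on both goals and not identical."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def best_tradeoffs(plans):
    """Keep only the plans that no other plan beats on both goals."""
    return {
        name: cost
        for name, cost in plans.items()
        if not any(dominates(other, cost) for other in plans.values())
    }


def pick(plans, time_weight, money_weight):
    """Choose the remaining plan with the lowest weighted score."""
    def score(name):
        seconds, dollars = plans[name]
        return time_weight * seconds + money_weight * dollars

    return min(best_tradeoffs(plans), key=score)


print(sorted(best_tradeoffs(plans)))
print(pick(plans, time_weight=1, money_weight=0))  # only speed matters
print(pick(plans, time_weight=0, money_weight=1))  # only money matters
print(pick(plans, time_weight=1, money_weight=1))  # both matter equally
```

A user who cares only about speed gets the plan with many machines, a user who cares only about money gets the single machine, and a user who cares about both gets the plan in between.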

This article is a child-friendly adaptation of the Wikipedia article on Query optimization, available under CC BY-SA 4.0.