Update (03/07/2023)
- Fixed spelling and grammar mistakes.
Article
Today I was focused on defining the standards for pagination on a new product and decided to dialog with a coworker. Below is the idea I threw out:
Any list of items I return from the API will be in this format:
{
results: [],
next: null,
}
The only thing that would identify the result as a part of pagination… would be the value of the “next” key. If it is
null
then there would be NO more items beyond the ones I have already returned. IFF there are more items… I would return something like… next: 2 Thus identifying that there is a page=2 of items more for the client. And depending on page=2's results… it’s next field would identify if there were a page=3 more of results… etc.What do you think? I feel like I am at a point that making these pagination standards will be critical for the future of other API endpoints.
This sparked my coworker's response:
I really like the standard of using
total
,limit
, andskip
. With only usingnext
you don’t know how many items or pages there are. Using total, limit, and skip really lends itself to handling pagination on the client — where the user can pick how many results to display per page.
Talk about good points. However, there exists similarities between our designs. If I was to go implement either of these, they would both end up being an implementation of the offset method. Whether the offsets were to be statically stored on the server or defined by the front end was the only difference. Therefore, both ideas are implementation preferences defining how the offsets should be handled and NOT differences in the underlying concept of how we would load the data.
This flipped a switch in my brain that reminded me of an issue I faced at a previous job. One day we had customers start to call-in and complain about the time it took to load files(listings) beyond the 10th page. The offset method was used to display the metadata to the user, and this ended up being a side effect.
Taking all this into account I thought there had to be a better way.
What is Offset Pagination?
Perhaps you have seen SQL Query like this before:
SELECT id, firstname, lastname, created, updated
FROM contacts
ORDER BY id
LIMIT 0, 10
The query above grabs the first ten records. Similarly, to get the ten records on page 4 we would execute the following:
SELECT id, firstname, lastname, created, updated
FROM contacts
ORDER BY id
LIMIT 30, 10
Problems with Offset Pagination:
-
The higher the offset the slower the query. Source 2
-
One must calculate the number of pages based off the total number of records.
-
Must scan an index to count the number of rows.
-
Complicated solutions to speed up results on pages farther back.
-
Even with an index, we scan the index, sort the data, then select the items we want to return. The first two steps are obviously a waste as we will be manipulating data that is not relevant to the results we want to return.
There MUST be a better way! And there is. Introducing seek pagination.
Seek/Keyset Pagination
The seek method is based on filtering out the data from the previous pages. We do this by having the client send the ID of the last record listed. We take that ID and place it in the WHERE clause providing us with only relevant data. Obviously, this implementation requires your data to be deterministically sortable.
"Don't touch what you don't need" - Youtube
An example SQL statement follows:
SELECT id, firstname, lastname
FROM contacts
WHERE id < 10
ORDER BY id DESC
LIMIT 10;
Therefore, from a REST perspective the results would appear as so:
{
"results": [
{id: 1, ... },
{id: 2, ... }
{id: 3, ... }
...
{id: 10, ... }
]
}
And when the user wanted to get the next page of items, they simply tack on the last_id
from the previous pagination result.
GET /api/contacts?last_id=10
Resulting:
{
"results": [
{id: 11, ... },
{id: 12, ... }
{id: 13, ... }
...
{id: 20, ... }
]
}
For more information about seek/keyset please visit the following sources.
Sources:
https://blog.jooq.org/2013/10/26/faster-sql-paging-with-jooq-using-the-seek-method/
https://use-the-index-luke.com/sql/partial-results/fetch-next-page
https://stackoverflow.com/a/3215973
https://stackoverflow.com/a/4502426
https://www.citusdata.com/blog/2016/03/30/five-ways-to-paginate/
https://blog.jooq.org/2016/08/10/why-most-programmers-get-pagination-wrong/
https://www.youtube.com/watch?v=GzMaN-IX7wQ
http://www.unicode.org/notes/tn9/tn9-1.html