Context

You have applied the Database per Service pattern. Each service has its own database. Some business transactions, however, span multiple services, so you need a mechanism to implement transactions that span services. For example, let’s imagine that you are building an e-commerce store where customers have a credit limit. The application must ensure that a new order will not exceed the customer’s credit limit. Since Orders and Customers are in different databases owned by different services, the application cannot simply use a local ACID transaction.

Problem

How to implement transactions that span services?

Forces

  • 2PC is not an option

Solution

Implement each business transaction that spans multiple services as a saga. A saga is a sequence of local transactions. Each local transaction updates the database and publishes a message or event to trigger the next local transaction in the saga. If a local transaction fails because it violates a business rule then the saga executes a series of compensating transactions that undo the changes that were made by the preceding local transactions.
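The control flow described above can be sketched in a few lines. This is a minimal in-process illustration, not a production implementation: the names `run_saga`, `BusinessRuleViolation`, and the step functions are hypothetical, and real local transactions would run in separate services and be triggered by messages.

```python
class BusinessRuleViolation(Exception):
    """Raised by a local transaction that fails a business rule (hypothetical name)."""


def run_saga(steps):
    """Run (action, compensation) pairs in order.

    If an action violates a business rule, run the compensations of the
    steps that already completed, in reverse order, and report failure.
    """
    completed = []
    for action, compensation in steps:
        try:
            action()
        except BusinessRuleViolation:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True


log = []

def create_order():
    log.append("order created")

def cancel_order():
    log.append("order cancelled")

def reserve_credit():
    raise BusinessRuleViolation("credit limit exceeded")

def release_credit():
    log.append("credit released")

succeeded = run_saga([(create_order, cancel_order), (reserve_credit, release_credit)])
```

Only the compensation for the step that actually completed (`cancel_order`) runs; the failed step made no change, so `release_credit` is never called.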

There are two ways of coordinating sagas:

  • Choreography - each local transaction publishes domain events that trigger local transactions in other services
  • Orchestration - an orchestrator (object) tells the participants what local transactions to execute

Example: Choreography-based saga

An e-commerce application that uses this approach would create an order using a choreography-based saga that consists of the following steps:

  1. The Order Service receives the POST /orders request and creates an Order in a PENDING state
  2. It then emits an Order Created event
  3. The Customer Service’s event handler attempts to reserve credit
  4. It then emits an event indicating the outcome
  5. The Order Service’s event handler either approves or rejects the Order
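The five steps above can be sketched with an in-memory event bus standing in for a message broker. Everything here is an assumption made for illustration: the `EventBus` class, the event names `OrderCreated`, `CreditReserved`, and `CreditLimitExceeded`, and the synchronous delivery (a real broker would deliver events asynchronously).

```python
from collections import defaultdict


class EventBus:
    """Tiny in-memory stand-in for a message broker (delivers synchronously)."""

    def __init__(self):
        self.handlers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.handlers[event_type].append(handler)

    def publish(self, event_type, payload):
        for handler in self.handlers[event_type]:
            handler(payload)


class OrderService:
    def __init__(self, bus):
        self.bus = bus
        self.orders = {}
        bus.subscribe("CreditReserved", self.on_credit_reserved)
        bus.subscribe("CreditLimitExceeded", self.on_credit_limit_exceeded)

    def create_order(self, order_id, customer_id, total):
        # Steps 1 and 2: create the Order as PENDING, then emit an event.
        self.orders[order_id] = "PENDING"
        self.bus.publish(
            "OrderCreated",
            {"order_id": order_id, "customer_id": customer_id, "total": total},
        )

    def on_credit_reserved(self, event):
        # Step 5 (success): approve the Order.
        self.orders[event["order_id"]] = "APPROVED"

    def on_credit_limit_exceeded(self, event):
        # Step 5 (failure): reject the Order.
        self.orders[event["order_id"]] = "REJECTED"


class CustomerService:
    def __init__(self, bus, credit_limits):
        self.bus = bus
        self.available_credit = dict(credit_limits)
        bus.subscribe("OrderCreated", self.on_order_created)

    def on_order_created(self, event):
        # Steps 3 and 4: try to reserve credit, then emit the outcome.
        customer_id, total = event["customer_id"], event["total"]
        if self.available_credit.get(customer_id, 0) >= total:
            self.available_credit[customer_id] -= total
            self.bus.publish("CreditReserved", {"order_id": event["order_id"]})
        else:
            self.bus.publish("CreditLimitExceeded", {"order_id": event["order_id"]})


bus = EventBus()
order_service = OrderService(bus)
customer_service = CustomerService(bus, {"alice": 100})
order_service.create_order("o1", "alice", 60)  # fits within the limit
order_service.create_order("o2", "alice", 60)  # exceeds the remaining credit
```

Neither service calls the other directly; each one only reacts to events, which is what distinguishes choreography from orchestration.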

Example: Orchestration-based saga

An e-commerce application that uses this approach would create an order using an orchestration-based saga that consists of the following steps:

  1. The Order Service receives the POST /orders request and creates the Create Order saga orchestrator
  2. The saga orchestrator creates an Order in the PENDING state
  3. It then sends a Reserve Credit command to the Customer Service
  4. The Customer Service attempts to reserve credit
  5. It then sends back a reply message indicating the outcome
  6. The saga orchestrator either approves or rejects the Order
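A rough sketch of the same flow with an orchestrator follows. The class and method names (`CreateOrderSaga`, `reserve_credit`, and so on) are hypothetical, and the command and reply messages are modeled as plain method calls and return values to keep the example self-contained.

```python
class OrderService:
    def __init__(self):
        self.orders = {}

    def create_pending(self, order_id):
        self.orders[order_id] = "PENDING"

    def approve(self, order_id):
        self.orders[order_id] = "APPROVED"

    def reject(self, order_id):
        self.orders[order_id] = "REJECTED"


class CustomerService:
    def __init__(self, credit_limits):
        self.available_credit = dict(credit_limits)

    def reserve_credit(self, customer_id, amount):
        """Handle the Reserve Credit command; the return value is the reply."""
        if self.available_credit.get(customer_id, 0) >= amount:
            self.available_credit[customer_id] -= amount
            return True
        return False


class CreateOrderSaga:
    """Orchestrator: tells each participant which local transaction to run."""

    def __init__(self, order_service, customer_service):
        self.order_service = order_service
        self.customer_service = customer_service

    def run(self, order_id, customer_id, total):
        self.order_service.create_pending(order_id)                         # step 2
        reserved = self.customer_service.reserve_credit(customer_id, total)  # steps 3-5
        if reserved:
            self.order_service.approve(order_id)                            # step 6
        else:
            self.order_service.reject(order_id)                             # step 6
        return self.order_service.orders[order_id]


saga = CreateOrderSaga(OrderService(), CustomerService({"alice": 100}))
first = saga.run("o1", "alice", 60)
second = saga.run("o2", "alice", 60)
```

Here the sequencing logic lives in one place, the orchestrator, instead of being spread across event handlers.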

Resulting context

This pattern has the following benefits:

  • It enables an application to maintain data consistency across multiple services without using distributed transactions

This solution has the following drawbacks:

  • Lack of automatic rollback - a developer must design compensating transactions that explicitly undo changes made earlier in a saga rather than relying on the automatic rollback feature of ACID transactions
  • Lack of isolation (the “I” in ACID) - the lack of isolation means that there’s a risk that the concurrent execution of multiple sagas and transactions can cause data anomalies. Consequently, a saga developer must typically use countermeasures, which are design techniques that implement isolation. Moreover, careful analysis is needed to select and correctly implement the countermeasures. See Chapter 4/section 4.3 of my book Microservices Patterns for more information.

There are also the following issues to address:

  • In order to be reliable, a service must atomically update its database and publish a message/event. It cannot use the traditional mechanism of a distributed transaction that spans the database and the message broker. Instead, it must use one of the patterns listed below.
  • A client that initiates the saga, which is an asynchronous flow, using a synchronous request (e.g. HTTP POST /orders) needs to be able to determine its outcome. There are several options, each with different trade-offs:
    • The service sends back a response once the saga completes, e.g. once it receives an OrderApproved or OrderRejected event.
    • The service sends back a response (e.g. containing the orderID) after initiating the saga and the client periodically polls (e.g. GET /orders/{orderID}) to determine the outcome
    • The service sends back a response (e.g. containing the orderID) after initiating the saga, and then sends an event (e.g. websocket, web hook, etc) to the client once the saga completes.
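For the first issue above, atomically updating the database and publishing a message, one commonly used approach is a transactional outbox: the event is written to an outbox table in the same local transaction as the business data, and a separate relay publishes it afterwards. The text here does not name a specific pattern, so treat the choice, the table layout, and the function names below as assumptions. SQLite stands in for the service's database.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, state TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "event_type TEXT NOT NULL, "
    "payload TEXT NOT NULL, "
    "published INTEGER NOT NULL DEFAULT 0)"
)


def create_order(order_id):
    # One local transaction: both rows are committed together or not at all.
    with conn:
        conn.execute("INSERT INTO orders (id, state) VALUES (?, 'PENDING')", (order_id,))
        conn.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("OrderCreated", json.dumps({"order_id": order_id})),
        )


def relay_outbox(publish):
    """Publish unpublished outbox rows in order and mark each one as published."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


published = []
create_order("o1")
relay_outbox(lambda event_type, payload: published.append((event_type, payload)))
```

If the relay crashes after publishing but before marking the row, the event is published again on the next run, so consumers in this design should tolerate duplicate messages.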


APIs (Application Programming Interfaces) are a way for software applications to communicate with one another. They allow developers to create applications that use data and functionality provided by other software systems. APIs are used extensively in modern software development, and they are an essential part of building scalable and performant applications.

One challenge that developers face when working with APIs is how to handle large amounts of data. APIs often return large datasets, and processing these datasets can be time-consuming and resource-intensive. This is where pagination comes in.

What is Pagination?

Pagination is a technique for breaking up large datasets into smaller, more manageable chunks. Instead of returning the entire dataset in one response, an API can return a subset of the data along with metadata that describes the overall dataset. This allows the client application to request additional subsets of data as needed.

Why is Pagination Important?

Pagination is important for several reasons:

  1. Performance: Returning a large dataset in a single response can be slow and resource-intensive. By breaking up the dataset into smaller chunks, the API can return the data more quickly and with fewer resources.

  2. Memory Management: Processing a large dataset can require a lot of memory, which can be a problem for resource-constrained devices like mobile phones or embedded systems. By using pagination, the API can limit the amount of data that needs to be stored in memory at any given time.

  3. User Experience: For client applications that display data to users, pagination can improve the user experience by providing a faster and more responsive interface. Users can see the initial results quickly and can request additional data as needed.

Pagination Techniques

There are several pagination techniques that an API can use. The most common techniques are:

Offset Pagination

A common pagination technique that uses an offset parameter to determine the starting point of the next set of results. For example, if the client has already retrieved the first 100 results, they can request the next 100 results by specifying an offset of 100. The offset-based approach is simple to implement and easy to understand, but it has some potential drawbacks. One issue is that it can be inefficient when dealing with large data sets, as the database has to skip over the previous results to get to the requested offset. Another issue is that the ordering of the results can change between requests, which can lead to inconsistent or unexpected results.
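A minimal sketch of offset pagination, using an in-memory SQLite table as a stand-in for the API's data store. The `items` table and the `fetch_page` function are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 251)],
)


def fetch_page(offset, limit=100):
    """Return up to `limit` rows, skipping the first `offset` rows."""
    return conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()


first = fetch_page(0)     # ids 1..100
second = fetch_page(100)  # ids 101..200
last = fetch_page(200)    # ids 201..250, a short final page
```

The explicit `ORDER BY` matters: without a stable ordering, the same offset can return different rows from one request to the next.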

Cursor Pagination

This uses a cursor parameter to determine the starting point of the next set of results. The cursor can be a unique identifier or a bookmark that points to a specific location in the result set. For example, if the client has already retrieved the first 100 results, they can request the next 100 results by specifying a cursor that corresponds to the last result retrieved. The cursor-based approach can be more efficient than offset-based pagination, as the database can use an index to quickly find the location of the cursor. It also avoids the issue of inconsistent or unexpected results, as the ordering of the results is based on the cursor rather than a fixed offset. However, it can be more complex to implement and requires careful management of the cursor to ensure that it is unique and stable.
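A minimal sketch of cursor pagination, assuming the cursor is simply the `id` of the last row the client received. The table and function names are hypothetical; real APIs often encode the cursor as an opaque string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO items (id, name) VALUES (?, ?)",
    [(i, f"item-{i}") for i in range(1, 251)],
)


def fetch_after(cursor, limit=100):
    """Return up to `limit` rows with id greater than `cursor`, plus the next cursor.

    The next cursor is None once a short page signals the end of the data.
    If the data ends exactly on a page boundary, one extra empty page is fetched.
    """
    rows = conn.execute(
        "SELECT id, name FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (cursor, limit),
    ).fetchall()
    next_cursor = rows[-1][0] if len(rows) == limit else None
    return rows, next_cursor


page1, cursor = fetch_after(0)
conn.execute("DELETE FROM items WHERE id = 5")  # a row the client has already seen
page2, cursor = fetch_after(cursor)
page3, end_cursor = fetch_after(cursor)
```

Because the query filters on `id > cursor`, deleting a row from an earlier page does not shift the next page. With an offset of 100, the same delete would have caused row 101 to be skipped.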

Time-Based Pagination

Another approach to pagination that is used when dealing with time-series data, where the results are ordered by a timestamp or date. In time-based pagination, the client specifies a start time and an end time, and the server returns all the results that fall within that time range. This approach is commonly used in applications such as social media feeds, where users want to see the latest posts or updates.

Time-based pagination can be implemented using either offset-based or cursor-based pagination techniques. In offset-based time pagination, the start and end times are converted into an offset value, and the client can retrieve the next set of results by specifying the next offset value. In cursor-based time pagination, the start and end times are converted into cursor values, and the client can retrieve the next set of results by specifying the cursor that corresponds to the last result retrieved.

One benefit of time-based pagination is that it allows for easy retrieval of the most recent results, which is often a common use case for time-series data. However, it can be more complex to implement than simple offset or cursor-based pagination, as the time range must be converted into an appropriate pagination parameter, and careful management of time zones and date formats is required to ensure consistency and accuracy.
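A small sketch of a time-range query, assuming timestamps are stored as fixed-width UTC strings so that string order matches chronological order. The `posts` table, the sample data, and the helper names are invented for illustration.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, created_at TEXT NOT NULL)")


def to_utc_iso(dt):
    """Normalize an aware datetime to a fixed-width UTC string that sorts chronologically."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Six posts, one per hour starting at midnight UTC (made-up data).
conn.executemany(
    "INSERT INTO posts (id, created_at) VALUES (?, ?)",
    [(h + 1, to_utc_iso(datetime(2024, 1, 1, h, tzinfo=timezone.utc))) for h in range(6)],
)


def fetch_between(start, end, limit=100):
    """Return posts with start <= created_at < end, newest first."""
    return conn.execute(
        "SELECT id, created_at FROM posts "
        "WHERE created_at >= ? AND created_at < ? "
        "ORDER BY created_at DESC, id DESC LIMIT ?",
        (to_utc_iso(start), to_utc_iso(end), limit),
    ).fetchall()


# The client sends times in UTC+1; they are normalized to UTC before querying.
utc_plus_one = timezone(timedelta(hours=1))
recent = fetch_between(
    datetime(2024, 1, 1, 2, tzinfo=utc_plus_one),
    datetime(2024, 1, 1, 5, tzinfo=utc_plus_one),
)
```

Sorting by `id` as a tie-breaker keeps the order stable when several rows share the same timestamp.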

Implementing pagination using query parameters or HTTP headers

Implementing pagination using query parameters or HTTP headers is a common approach to designing RESTful APIs. Here's a brief explanation of each approach:

Using query parameters: In this approach, the client includes pagination parameters in the query string of the API request URL. The most common pagination parameters are "page" and "per_page", which indicate the current page number and the number of results per page, respectively. For example, to retrieve the second page of 10 results per page, the client would send a request to the API with the following URL:
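As a sketch, assuming a hypothetical endpoint `https://api.example.com/items` (not taken from the original text), such a URL can be built on the client and parsed on the server with Python's standard library:

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://api.example.com/items"  # hypothetical endpoint


def page_url(page, per_page):
    """Client side: build the request URL for a given page."""
    return f"{BASE_URL}?{urlencode({'page': page, 'per_page': per_page})}"


def parse_pagination(url, default_per_page=10):
    """Server side: read page and per_page from the query string, with defaults."""
    query = parse_qs(urlsplit(url).query)
    page = int(query.get("page", ["1"])[0])
    per_page = int(query.get("per_page", [str(default_per_page)])[0])
    return page, per_page


url = page_url(2, 10)
# url is "https://api.example.com/items?page=2&per_page=10"
page, per_page = parse_pagination(url)
```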

This approach is simple to implement and widely supported by web frameworks and libraries. However, query parameters alone tell the client nothing about the overall result set: metadata and links to other pages have to be returned separately, in the response body or in headers.

Using HTTP headers: In this approach, pagination information travels in HTTP headers instead of the query string. The most common pagination headers are "Link" and "Range", which allow for more flexibility and control over the pagination behavior. The Link header, which the server sets on the response, can include URLs for the first, last, previous, and next pages of the result set, while the Range header, which the client sends with the request, can specify a range of results to return based on an offset and a limit.

Here's an example of a response using the Link header:
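A minimal sketch of how a server might build the value of a `Link` response header, using the `rel` values `first`, `last`, `prev`, and `next`. The endpoint `https://api.example.com/items` and the function name are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.example.com/items"  # hypothetical endpoint


def build_link_header(page, per_page, total_items):
    """Return a Link header value pointing at the first, last, prev, and next pages."""
    last_page = max(1, -(-total_items // per_page))  # ceiling division

    def url(p):
        return f"{BASE_URL}?{urlencode({'page': p, 'per_page': per_page})}"

    links = [(url(1), "first"), (url(last_page), "last")]
    if page > 1:
        links.append((url(page - 1), "prev"))
    if page < last_page:
        links.append((url(page + 1), "next"))
    return ", ".join(f'<{target}>; rel="{rel}"' for target, rel in links)


header_value = build_link_header(page=2, per_page=10, total_items=35)
```

A client can follow the `next` link until it is no longer present, without having to compute page numbers itself.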

And here's an example of a request using the Range header:
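A minimal sketch of a `Range` header built from an offset and a limit. It assumes the API defines a custom range unit called `items`; that unit name is an assumption for illustration, and an actual API would document its own.

```python
import re


def build_range_header(offset, limit):
    """Client side: ask for `limit` items starting at `offset` (inclusive bounds)."""
    return {"Range": f"items={offset}-{offset + limit - 1}"}


def parse_range_header(value):
    """Server side: turn 'items=10-19' back into (offset, limit)."""
    match = re.fullmatch(r"items=(\d+)-(\d+)", value)
    if match is None:
        raise ValueError(f"unsupported Range header: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    if last < first:
        raise ValueError(f"invalid range: {value!r}")
    return first, last - first + 1


headers = build_range_header(offset=10, limit=10)
# headers is {"Range": "items=10-19"}
offset, limit = parse_range_header(headers["Range"])
```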

This approach is more flexible and allows for more control over the pagination behavior, but it can be more complex to implement and requires careful management of the pagination headers to ensure compatibility with clients.

In general, both query parameters and HTTP headers are valid approaches to implementing pagination in RESTful APIs. The choice of approach will depend on the specific requirements of the API and the preferences of the developer or team.

How to ensure consistency and reliability in pagination results - even in the presence of data changes

Ensuring consistency and reliability in pagination results is crucial, especially when dealing with large amounts of data that may change frequently. To achieve this, there are several techniques that you can use. One such technique is cursor-based pagination, which uses a cursor or pointer to the next subset of data instead of a fixed offset. By using a cursor, you can ensure that the pagination results remain consistent even in the presence of data changes.

Another important aspect is sorting the data consistently, so that the data is sorted in the same way each time a request is made. This is essential in cursor-based pagination, as inconsistent sorting can cause unexpected results and make it difficult to ensure consistency in pagination results.

Handling deletes and inserts is another crucial technique to ensure consistency. You can adjust the cursor or offset to skip over deleted data or include newly inserted data, depending on the situation. Additionally, caching the data and pagination results can help improve API performance and ensure consistency, but it should be done carefully to avoid issues.

Versioning your API is also crucial when making changes that impact pagination, as it ensures clients can migrate to the new API version without breaking their pagination functionality.

By using these techniques, you can ensure consistency and reliability in pagination results, even in the presence of data changes. It's important to test your pagination functionality thoroughly and monitor your API for any issues to ensure that your pagination functionality remains consistent and reliable over time. With these best practices, you can design and implement pagination that improves your API's performance and provides a better user experience for your customers.

Designing pagination for APIs that return large amounts of data

When an API returns large amounts of data, it can impact the performance of both the API server and the client application. Pagination is a common technique used to mitigate these performance issues by breaking up the data into smaller subsets or pages.

To design pagination for APIs that return large amounts of data, you'll need to consider the following factors:

  1. Determine the appropriate page size: The page size is the number of items or records to return per page. The appropriate page size will depend on the amount of data and the performance constraints of the API server and client application. It's important to strike a balance between the number of items returned per page and the response time of the API.

  2. Choose a pagination approach: As mentioned earlier, there are different pagination approaches, including offset-based and cursor-based pagination. The choice of approach will depend on the specific requirements of your application.

  3. Include pagination parameters in the API request: The API request should include parameters that specify the page number or cursor and the page size. The client application can use these parameters to navigate the pagination results.

  4. Return metadata with the pagination results: In addition to the data, the API should return metadata about the pagination results, such as the total number of pages or the total number of items. This metadata can help the client application build the pagination user interface and improve the user experience.

  5. Handle errors and exceptions: When designing pagination for large amounts of data, it's important to handle errors and exceptions properly. For example, if the client requests a page that does not exist, the API should return an appropriate error response.
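Points 4 and 5 above can be sketched together: a helper that returns a page of data plus metadata, and raises an error for a page that does not exist. The response shape (`data` and `meta` keys) and the `PageNotFound` exception are assumptions for illustration; an HTTP layer would map the exception to an appropriate error status.

```python
import math


class PageNotFound(Exception):
    """Raised when the requested page is past the end of the data (hypothetical name)."""


def paginate(items, page, per_page):
    """Return one page of `items` together with pagination metadata."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive integers")
    total_items = len(items)
    total_pages = max(1, math.ceil(total_items / per_page))
    if page > total_pages:
        raise PageNotFound(f"page {page} does not exist (last page is {total_pages})")
    start = (page - 1) * per_page
    return {
        "data": items[start:start + per_page],
        "meta": {
            "page": page,
            "per_page": per_page,
            "total_items": total_items,
            "total_pages": total_pages,
        },
    }


result = paginate(list(range(25)), page=3, per_page=10)
```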

Maximizing API Performance with Efficient Server-Side Pagination

When implementing server-side pagination, optimizing the query used to retrieve data from the database is important. This can be achieved by following a few tips:

First, use the correct database indexes, which can significantly improve query performance. When designing the database schema, it's important to include indexes on columns that are frequently used in queries.

Second, use a LIMIT and OFFSET clause in SQL to limit the number of rows returned and specify the starting row for the query result. This will avoid retrieving all data from the database and only fetch the data needed for the current page.

Third, use query caching to improve performance by caching the query result in memory, which can reduce the number of database queries and the load on the database server.
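A rough sketch of in-memory query caching with `functools.lru_cache`, assuming a single process and a write path that can clear the cache. The table, the `stats` counter, and the function names are invented for illustration; a multi-server deployment would need a shared cache and an invalidation strategy.

```python
import sqlite3
from functools import lru_cache

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item-{i}",) for i in range(1, 31)],
)

stats = {"queries": 0}  # counts how often the database is actually hit


@lru_cache(maxsize=128)
def fetch_page(offset, limit):
    """Return one page of rows; repeated calls with the same arguments hit the cache."""
    stats["queries"] += 1
    rows = conn.execute(
        "SELECT id, name FROM items ORDER BY id LIMIT ? OFFSET ?",
        (limit, offset),
    ).fetchall()
    return tuple(rows)  # immutable, so cached results cannot be modified by callers


def add_item(name):
    """Write path: insert a row, then drop cached pages that may now be stale."""
    with conn:
        conn.execute("INSERT INTO items (name) VALUES (?)", (name,))
    fetch_page.cache_clear()


first = fetch_page(0, 10)
again = fetch_page(0, 10)  # served from the cache, no second query
```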

Fourth, optimize the database queries by using efficient query patterns, optimizing query execution plans, and reducing the number of joins or subqueries.

Lastly, consider using a pagination library, which can help you implement efficient server-side pagination quickly and easily.

7 Best Practices for Your API Design:

  1. Use a standard approach: There are different pagination approaches, including offset-based and cursor-based pagination. It's important to use a standard approach that is well documented and widely adopted to ensure compatibility with client applications and improve the user experience.

  2. Provide clear documentation: Your API documentation should provide clear guidance on how to use pagination, including the required parameters and their meanings. It's important to explain the purpose of each parameter and how to use them properly.

  3. Use consistent naming conventions: Consistent naming conventions can make it easier for developers to use your API. You should use standard names for pagination parameters, such as "page" and "page_size," and avoid using ambiguous or non-standard names.

  4. Handle errors and exceptions properly: When designing and implementing pagination in APIs, it's important to handle errors and exceptions properly. For example, if the client requests a page that does not exist, the API should return an appropriate error response. Error responses should include a clear message explaining the error and a status code that indicates the nature of the error.

  5. Optimize performance: To improve the performance of your API, you should optimize the database queries used to retrieve the data for pagination. This can involve using efficient query patterns, optimizing query execution plans, and reducing the number of joins or subqueries.

  6. Provide metadata with pagination results: In addition to the data, your API should return metadata about the pagination results, such as the total number of pages or the total number of items. This metadata can help the client application build the pagination user interface and improve the user experience.

  7. Test your API thoroughly: Before releasing your API, you should test it thoroughly to ensure that pagination works as expected. This can involve testing different page sizes, different starting pages, and different query parameters to ensure that your API can handle a wide range of use cases.

By following these best practices, you can design and implement pagination in APIs that are easy to use, reliable, and scalable. This can help you provide a better user experience for your API users and improve the performance and reliability of your API.

Pagination: A Game-Changer for Your API Performance and User Experience

Pagination is a critical technique for working with large datasets in APIs. By breaking up large datasets into smaller, more manageable chunks, it lets clients retrieve data incrementally while also reducing the load on servers and improving API performance. When designing and implementing pagination in your API, it's important to consider factors such as the data size, data volatility, and client needs. You should also choose a pagination approach that fits your API's requirements and aligns with RESTful design principles.

We hope this guide has been helpful in understanding the need for pagination in APIs and how to design and implement it effectively. With these principles in mind, you can create APIs that are easy to use, performant, and scalable, making them more valuable to your users and your organization.
