EF Core + Cursor-based pagination 的寫法

我們都知道,可以透過分頁技巧,將大量資料切成小部分回傳給調用端,常見的技巧就是 Offset-based 以及 Cursor-based,我將使用 EF Core + Postgresql 實作分頁

下圖出自:A deep dive into cursor-based pagination in MongoDB (engage.so)

A deep dive into cursor-based pagination in MongoDB

分頁策略

  • 基於頁數和大小的分頁:這種方法是根據頁數和大小計算的結果,並從資料庫中取出這些元素。這種方法的缺點是隨著新資料的到來,計算的結果可能會變動,而且當分頁到集合的較遠處時,效能會下降,也就是當分頁數越大的時候查詢時間就會越久,在小型資料庫並不會有特別的感覺,但是當資料量龐大時,就能感受到時間明顯的差距。
  • 基於游標和大小的分頁:這種方法是使用上一個結果來取得下一個結果,游標是一個能夠保證下一個結果,例如自動增加的識別符或時間戳。這種方法不會受到新資料的影響,而且能夠利用資料庫的索引來提高效能。

對於數據量很大的場景,基於游標的分頁是更好的選擇。但是,基於頁碼的分頁在數據量較小且變化不頻繁的場景下,也是一種簡單有效的方法。開發者應該根據自己的需求和情況,選擇最合適的分頁方式。

開發環境

  • ASP.NET Core 8
  • Rider 2023.3.2
  • PostgreSQL 16.1

定義 DB Schema

這裡我使用 EF Core Code First 產生出 DB Schema

MemberDataEntity 同等於 Member Table,裡面要特別提的有兩個欄位,分別是 Id、SequenceId,

  • Id:為 PK,由應用程式賦予的值,比如來自於 IdGenerate Server
  • SequenceId:是  DB  自動增長的 Auto Increment
using Microsoft.EntityFrameworkCore;

namespace Lab.CursorPaging.WebApi.Member.Repository;

public class MemberDbContext : DbContext
{
    private static readonly bool[] s_migrated = { false };

    public MemberDbContext(DbContextOptions<MemberDbContext> options) : base(options)
    {
        if (!s_migrated[0])
        {
            lock (s_migrated)
            {
                if (!s_migrated[0])
                {
                    this.Database.Migrate();
                    s_migrated[0] = true;
                }
            }
        }
    }

    public DbSet<MemberDataEntity> Members { get; set; } = default!;

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        base.OnModelCreating(modelBuilder);

        modelBuilder.Entity<MemberDataEntity>(builder =>
        {
            //property
            builder.ToTable("Member");
            builder.HasKey(x => x.Id);
            builder.Property(x => x.Name).IsRequired();
            builder.Property(x => x.Age).IsRequired();
            builder.Property(x => x.CreatedAt).IsRequired();
            builder.Property(x => x.CreatedBy).IsRequired();
            builder.Property(x => x.UpdatedAt).IsRequired();
            builder.Property(x => x.UpdatedBy).IsRequired();
            builder.Property(p => p.SequenceId).ValueGeneratedOnAdd();

            //index            
            builder.HasIndex(x => x.SequenceId).IsUnique();
        });
    }
}

public class MemberDataEntity
{
    public string Id { get; set; }

    public string Name { get; set; } = default!;

    public int Age { get; set; }

    public string? Email { get; set; }

    public string? Phone { get; set; }

    public string? Address { get; set; }

    public DateTimeOffset CreatedAt { get; set; }

    public string CreatedBy { get; set; }

    public DateTimeOffset UpdatedAt { get; set; }

    public string UpdatedBy { get; set; }

    // [DatabaseGenerated(DatabaseGeneratedOption.Identity)]
    public long SequenceId { get; set; }
}

 

Raw SQL 的寫法如下

-- auto-generated definition
create table "Member"
(
    "Id"         text                     not null
        constraint "PK_Member"
            primary key,
    "Name"       text                     not null,
    "Age"        integer                  not null,
    "Email"      text,
    "Phone"      text,
    "Address"    text,
    "CreatedAt"  timestamp with time zone not null,
    "CreatedBy"  text                     not null,
    "UpdatedAt"  timestamp with time zone not null,
    "UpdatedBy"  text                     not null,
    "SequenceId" bigint generated by default as identity
);

alter table "Member"
    owner to postgres;

create unique index "IX_Member_SequenceId"
    on "Member" ("SequenceId");

 

產生假資料

這裡我搭配 Faker.Data 產生出了煞有其事的資料

[HttpPost]
[Route("/api/members:batch-generate")]
public async Task<ActionResult> BatchGenerate()
{
    await using var dbContext = await this._memberDbContextFactory.CreateDbContextAsync();
    for (var i = 0; i < 1000; i++)
    {
        var now = DateTimeOffset.UtcNow;
        var member = new MemberDataEntity
        {
            Id = Guid.NewGuid().ToString(),
            Name = Faker.Name.FullName(),
            Age = DateTime.Now.Year - Faker.Date.Birthday().Year,
            Email = Faker.User.Email(),
            Phone = Faker.Phone.GetPhoneNumber(),
            Address = Faker.Address.SecondaryAddress(),
            CreatedAt = now,
            CreatedBy = "sys",
            UpdatedAt = now,
            UpdatedBy = "sys",
        };

        dbContext.Members.Add(member);
    }

    var count = await dbContext.SaveChangesAsync();

    return this.NoContent();
}

 

看,很像有這麼一回事的資料們!

Offset-based 分頁查詢

Offset-based 需要知道幾個資訊,目前在第幾頁,一頁要呈現幾筆,總共有幾筆

  • Offset 來決定要跳過那些資料,根據目前在哪一頁 PageIndex 以及每一頁要呈現幾筆 PageSize 算出要 Skip 多少筆資料,計算公式為 Skip=PageIndex-1 * PageSize。

下圖出自:NestJS Prisma Pagination - Offset vs Cursor with Examples (progressivecoder.com)

NestJS Prisma Pagination - Offset vs Cursor with Examples

API 的設計

我把分頁資訊放在 Request Header,當然,要放在 Body 也是可以的,

  1. X-Page-Index:目前在第幾頁
  2. X-Page-Size:一頁要呈現幾筆
[HttpGet]
[Route("/api/members:page-index")]
public async Task<ActionResult<IEnumerable<MemberDataEntity>>> GetPageIndex()
{
    var pageIndex = this.Request.Headers.TryGetValue("X-Page-Index", out var pages)
        ? int.Parse(pages.FirstOrDefault() ?? string.Empty)
        : 1;

    var pageSize = this.TryGetPageSize();

    await using var dbContext = await this._memberDbContextFactory.CreateDbContextAsync();
    var query = dbContext.Members.Select(p => p);
    var totalCount = await dbContext.Members.CountAsync();
    query = query
        .Skip((pageIndex - 1) * pageSize)
        .Take(pageSize);
    query = query.Take(pageSize + 1);
    
    var results = await query.AsNoTracking().ToListAsync();
    this.Response.Headers.Add("X-Total-Count", totalCount.ToString());
    
    return this.Ok(results);
}

 

Cursor-based 分頁查詢

Cursor-based 是利用比大小,來取決於要指向哪一筆資料,只要比大小的演算法一致,就可以拿來使用,我過去也曾經比過 Guid,也是可以套用,在這個範例我用的是 DB 自動增長的 Sequence Id

下圖出自:A deep dive into cursor-based pagination in MongoDB (engage.so)

A deep dive into cursor-based pagination in MongoDB

API 的設計

我把分頁資訊放在 Request Header,當然,要放在 Body 也是可以的,

  1. X-Next-Page-Token:存放最後一筆的資訊,這裡我存放 Id、SequenceId
  2. X-Page-Size:一頁要呈現幾筆

 

從 Header 取得  X-Next-Page-Token、X-Page-Size 程式碼如下

private int TryGetPageSize() =>
    this.Request.Headers.TryGetValue("X-Page-Size", out var sizes)
        ? int.Parse(sizes.FirstOrDefault() ?? string.Empty)
        : 10;

private (string? LastId, long? LastSequenceId) TryGetPageToken()
{
    if (this.Request.Headers.TryGetValue("X-Next-Page-Token", out var nextToken))
    {
        var decodeResult = DecodePageToken(nextToken);
        return (decodeResult.lastId, decodeResult.lastSequenceId);
    }

    return (null, null);
}

 

將 Id 和 SequenceId 轉換為下一頁的令牌,序列化後轉成 Base64

// 將 Id 和 SequenceId 轉換為下一頁的令牌
private static string EncodePageToken(string? lastId, long? lastSequenceId)
{
    if (lastId == null || lastSequenceId == null)
    {
        return null;
    }

    var json = JsonSerializer.Serialize(new { lastId, lastSequenceId });
    return Convert.ToBase64String(Encoding.UTF8.GetBytes(json));
}

 

將下一頁的令牌解碼為 Id 和 SequenceId

// 將下一頁的令牌解碼為 Id 和 SequenceId
private static (string lastId, long lastSequenceId) DecodePageToken(string nextToken)
{
    if (string.IsNullOrEmpty(nextToken))
    {
        return (null, 0);
    }

    string lastId = null;
    long lastSequenceId = 0;
    var base64Bytes = Convert.FromBase64String(nextToken);
    var json = Encoding.UTF8.GetString(base64Bytes);
    var jsonNode = JsonNode.Parse(json);
    var jsonObject = jsonNode.AsObject();
    if (jsonObject.TryGetPropertyValue("lastSequenceId", out var lastSequenceIdNode))
    {
        lastSequenceId = lastSequenceIdNode.GetValue<long>();
    }

    if (jsonObject.TryGetPropertyValue("lastId", out var lastIdNode))
    {
        lastId = lastIdNode.GetValue<string>();
    }

    return (lastId, lastSequenceId);
}

 

找出比 lastSequenceId、lastId 還要大的資料,指向正確的資料列。

if (string.IsNullOrWhiteSpace(lastId) == false)
{
	query = query.Where(p => p.Id.CompareTo(lastId) > 0);
}

if (lastSequenceId.HasValue)
{
	query = query.Where(p => p.SequenceId > lastSequenceId);
}

PS.這些資料都要有索引

 

取資料時,多取一筆,用來判定有沒有下頁,若有下一頁,

  1. 把多取的一筆資料刪掉。
  2. 產生 Next Page Token 的值,放在 Response Header。
query = query.Take(pageSize + 1);
var results = await query.AsNoTracking().ToListAsync();

// 是否有下一頁
bool hasNextPage = results.Count > pageSize;

if (hasNextPage)
{
	// 有下一頁,刪除最後一筆
	results.RemoveAt(results.Count - 1);

	// 產生下一頁的令牌
	var after = results.LastOrDefault();
	if (after != null)
	{
		var nextToken = EncodePageToken(after.Id, after.SequenceId);
		this.Response.Headers.Add("X-Next-Page-Token", nextToken);
	}
}

 

完整程式碼如下:

[HttpGet]
[Route("/api/members:cursor")]
public async Task<ActionResult<IEnumerable<MemberDataEntity>>> GetCursor()
{
    var pageSize = this.TryGetPageSize();
    var pageTokenResult = this.TryGetPageToken();
    var lastId = pageTokenResult.LastId;
    var lastSequenceId = pageTokenResult.LastSequenceId;

    await using var dbContext = await this._memberDbContextFactory.CreateDbContextAsync();
    var query = dbContext.Members.Select(p => p);
    if (string.IsNullOrWhiteSpace(lastId) == false)
    {
        query = query.Where(p => p.Id.CompareTo(lastId) > 0);
    }

    if (lastSequenceId.HasValue)
    {
        query = query.Where(p => p.SequenceId > lastSequenceId);
    }

    query = query.Take(pageSize + 1);
    var results = await query.AsNoTracking().ToListAsync();

    // 是否有下一頁
    bool hasNextPage = results.Count > pageSize;

    if (hasNextPage)
    {
        // 有下一頁,刪除最後一筆
        results.RemoveAt(results.Count - 1);

        // 產生下一頁的令牌
        var after = results.LastOrDefault();
        if (after != null)
        {
            var nextToken = EncodePageToken(after.Id, after.SequenceId);
            this.Response.Headers.Add("X-Next-Page-Token", nextToken);
        }
    }

    return this.Ok(results);
執行執行}

 

執行結果如下:

Response Header 得到了 X-Next-Page-Token

 

Body 內容如下:

[
    {
        "id": "0002c3f6-bba8-4da4-93f6-938bb9f1e3e5",
        "name": "GisellePoff",
        "age": 58,
        "email": "virgiemarander3@basketballmail.com",
        "phone": "501-157-3781 x16723",
        "address": "Apt. 546",
        "createdAt": "2024-01-07T08:01:27.401095+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.401095+00:00",
        "updatedBy": "sys",
        "sequenceId": 1
    },
    {
        "id": "00bc08f3-436e-4d09-a366-aff2096e81e4",
        "name": "NaomaGottesman",
        "age": 25,
        "email": "Williemae.Dehart1@mail2cardinal.com",
        "phone": "288-733-2245 x4447",
        "address": "Apt. 266",
        "createdAt": "2024-01-07T08:01:27.400446+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.400446+00:00",
        "updatedBy": "sys",
        "sequenceId": 2
    },
    {
        "id": "00da353d-e6e5-49f9-94fb-8964d4846ecd",
        "name": "AbelLuoto",
        "age": 20,
        "email": "Edmundo.Smelser@mail2zoologist.com",
        "phone": "1-210-035-0878 x72348",
        "address": "Apt. 460",
        "createdAt": "2024-01-07T08:01:27.401749+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.401749+00:00",
        "updatedBy": "sys",
        "sequenceId": 3
    },
    {
        "id": "00dbba1c-e40b-49cf-b6e2-077f89170a9a",
        "name": "HyoStockburger",
        "age": 28,
        "email": "lennie2@fromtennessee.com",
        "phone": "046.343.6773 x4560",
        "address": "Apt. 573",
        "createdAt": "2024-01-07T08:01:27.413944+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.413944+00:00",
        "updatedBy": "sys",
        "sequenceId": 4
    },
    {
        "id": "00f32bdd-3738-4883-9777-a4fb05b517d5",
        "name": "MyrticeTondre",
        "age": 23,
        "email": "phetphongsygabrielle@incredimail.com",
        "phone": "427.680.1481",
        "address": "Apt. 809",
        "createdAt": "2024-01-07T08:01:27.423377+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.423377+00:00",
        "updatedBy": "sys",
        "sequenceId": 5
    },
    {
        "id": "0136d7d2-787d-4dca-9685-2ed419015cbd",
        "name": "GermainePanrell",
        "age": 25,
        "email": "heribertomillwee5@mail2rome.com",
        "phone": "(230)360-5188 x206",
        "address": "Apt. 33",
        "createdAt": "2024-01-07T08:01:27.403636+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.403636+00:00",
        "updatedBy": "sys",
        "sequenceId": 6
    },
    {
        "id": "0141fecf-9ae1-4118-b4b9-d0ccfcb87fd3",
        "name": "TrudiMcdonnel",
        "age": 57,
        "email": "tylerhoerr8@4email.net",
        "phone": "537-341-6516 x85628",
        "address": "Apt. 531",
        "createdAt": "2024-01-07T08:01:27.41869+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.41869+00:00",
        "updatedBy": "sys",
        "sequenceId": 7
    },
    {
        "id": "01427c45-c4bb-4e8d-848d-08f485f0af75",
        "name": "KaylaJoline",
        "age": 48,
        "email": "Eleonora.Loveless1@mail2trex.com",
        "phone": "(264)213-1325",
        "address": "Apt. 276",
        "createdAt": "2024-01-07T08:01:27.4157+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.4157+00:00",
        "updatedBy": "sys",
        "sequenceId": 8
    },
    {
        "id": "0190216c-3489-4030-afec-3dfefb4bcdba",
        "name": "ShalaMaccartney",
        "age": 35,
        "email": "Mellisa.Dates6@mail2edgar.com",
        "phone": "716.575.3723 x5682",
        "address": "Apt. 623",
        "createdAt": "2024-01-07T08:01:27.415287+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.415287+00:00",
        "updatedBy": "sys",
        "sequenceId": 9
    },
    {
        "id": "0195bd79-8b10-4c70-abbf-de2717a35243",
        "name": "LoreenTippin",
        "age": 30,
        "email": "Kubic.Blanca1@kbjrmail.com",
        "phone": "1-163-332-1426",
        "address": "Apt. 838",
        "createdAt": "2024-01-07T08:01:27.403883+00:00",
        "createdBy": "sys",
        "updatedAt": "2024-01-07T08:01:27.403883+00:00",
        "updatedBy": "sys",
        "sequenceId": 10
    }
]

 

拿掉序列化和 Base64,寫法可以再更簡單一點

[HttpGet]
[Route("/api/members:cursor2")]
public async Task<ActionResult<IEnumerable<MemberDataEntity>>> GetCursor2()
{
    var pageSize = this.TryGetPageSize();
    long? nextPageId = this.Request.Headers.TryGetValue("X-Next-Page-Id", out var data)
        ? long.TryParse(data.FirstOrDefault(), out var id)
            ? id
            : null
        : null;

    await using var dbContext = await this._memberDbContextFactory.CreateDbContextAsync();
    var query = dbContext.Members.Select(p => p);

    if (nextPageId.HasValue)
    {
        query = query.Where(p => p.SequenceId > nextPageId);
    }

    query = query.Take(pageSize + 1);
    var results = await query.AsNoTracking().ToListAsync();

    // 是否有下一頁
    bool hasNextPage = results.Count > pageSize;

    if (hasNextPage)
    {
        // 有下一頁,刪除最後一筆
        results.RemoveAt(results.Count - 1);

        var after = results.LastOrDefault();
        this.Response.Headers.Add("X-Next-Page-Id", after.SequenceId.ToString());
    }

    return this.Ok(results);
}

 

執行結果如下:

 

範例位置

sample.dotblog/Paging/Lab.CursorPaging at a6e6c839fb73b283f6be56460064a6e213a28395 · yaochangyu/sample.dotblog (github.com)

 

若有謬誤,煩請告知,新手發帖請多包涵


Microsoft MVP Award 2010~2017 C# 第四季
Microsoft MVP Award 2018~2022 .NET

Image result for microsoft+mvp+logo