计算分页前记录的总大小,并在分页后返回它们,而无需两次访问数据库

我有一个分页方法,它需要页面和限制,然后将它们应用于给定的集合(可能会应用预过滤器/查询),然后对其应用分页,因此服务器端无需执行任何操作从计数记录到应用分页(仅统计最终结果,该结果无论如何都存储在内存中)

        public async Task<PagingServiceResponse<T>> Apply<T>(IQueryable<T> set)
        {
            var httpQuery = _accessor.HttpContext.Request.Query;
            var pageValue = httpQuery.LastOrDefault(x => x.Key == "page").Value;
            if (pageValue.Count > 0) int.TryParse(pageValue, out _page);
            var limitValue = httpQuery.LastOrDefault(x => x.Key == "limit").Value;
            if (limitValue.Count > 0) int.TryParse(limitValue, out _limit);
            if (_limit > 1000 || _limit <= 0) _limit = 1000;
            if (_page <= 0) _page = 1;
            _size = await set.CountAsync();
            set = set.Take(_limit);
            if (_page > 1) set = set.Skip(_page * _limit);

            var data = await set.ToListAsync();
            var currentSize = data.Count;
            return new PagingServiceResponse<T>
            {
                Data = data,
                Size = _size,
                CurrentSize = currentSize,
                Page = _page,
                PerPage = _limit
            };
        }

So the problem here is that this hits the database twice, to check the total count (CountAsync) and receive the data (ToListAsync)

并且我尝试不执行此操作,因为它执行两次查询(这不是纯查询),因此对其应用了筛选器操作。

如果有其他建议或其他建议,我会全力以赴。

我正在使用PostgreSQL和实体框架核心(npgsql)

评论
  • 深井冰
    深井冰 回复

    不,分页的全部前提是您需要在获取总记录的子集之前了解完整的行数。 1次查询命中中唯一可以完成的方法是加载所有记录。 (对于大集合,这是一个更糟糕的选择!:)

    One issue I see is that you are using Take to take a limit of rows (0<=1000?) then skipping the page size and page #? To me if the limit is 1000 and your page size is 25, and you're loading the first page, wouldn't this return 1000 rows? (rather than the first page's 25?) Normally I'd expect a paging query to act more like:

    var pagedData = set.Skip(page * pageSize).Take(pageSize).ToList();
    

    Where page is 0-based. (0 = page #1). This ensures that only a max of 25 rows are pulled back.

    您可以做一些事情来进一步减轻分页查询和获得计数的成本:

    1. Structure your query build & execution so that Ordering and Projections (Select/ProjectTo) occur after obtaining the Count.

    2. Make sure the context is short-lived and "fresh". This won't speed up the Count, but loading the sub-set will be slower the more entities that are being tracked.

    3. When an accurate count is not needed, provide a rough one that can be expanded as users select a further page, or can opt for retrieving a complete count.

    粗略计数类似于Google搜索给出近似值,而不是实际结果计数。我使用的一种相对简单的技术是获取当前页面的大小和分页器显示的页面数。需要调整分页控件以不显示导航到“最后”页面,并且还需要调整显示记录数。

    So for example 10 pages with a page size of 25. Before getting a count I base the count on the top ({PageSize} x {MaxPageCount} + 1) or 251. To get maxPageCount we need to look at the page number against the # of expected pages to display. (I.e. 10)

    int maxPageCount = (((page) / 10)+1) * 10;
    int roughCountLimit = pageSize * maxPageCount + 1;
    
    rowCount = set.Take(roughCountLimit).Count();
    bool isRoughCount = rowCount == roughCountLimit;
    var pagedData = set.Skip(page * pageSize).Take(pageSize).ToList();
    

    对于第1页到第10页,最多返回11页。 即

    page #1 (0) / 10 = 0.  (0+1)* 10 = 10.
    page #2 (1) / 10 = 0.  (0+1)* 10 = 10.
    page #10 (9) / 10 = 0. (0+1)* 10 = 10.
    

    这个想法是传呼机将显示如下内容:

    "1 2 3 4 5 6 7 8 9 10 ..." while our page count would be set up to look at isRoughCount and display: "250+" rather than "251" if isRoughCount is True.

    If and when a user selects "..." to load page #11 then going back to the maxPageCount:

     page #11 (10) / 10 = 1. (1+1)* 10 = 20.
    

    This will result in roughCountLimit becoming 501. This will load up to 21 pages of records. If the database happened to only return 251 records, then Page 11 would still display with the 1 remaining record, and since isRoughCount would be false, the row count will update to display "251". If the user continues to navigate through pages using the "...", the rough count limit will continue to increase. This will make the query gradually slower, but for those initial few sets of pages, the query will retrieve counts significantly faster.

    分页和搜索的关键是用户应具有通常在结果的第一页或第一页结果中查找数据的工具。他们需要浏览10页结果的实际次数,更不用说超过10页的结果了。 (这表明您需要更好的搜索/过滤功能)同时,即使搜索非常好,处理了非常大的数据集,用户通常也不会在乎是否有5000行或500,000,000行。我们可以通过报告“至少”有250行来极大地加快查询速度,然后在且仅当需要时才进行扩展。页面计数可以显示为超链接,以在需要时运行特定的全计数查询,或者只是好奇特定的504,231,188行计数。这个(昂贵的)事实不必成为每个查询的一部分。

  • 衮开、
    衮开、 回复

    It is not possible to hit the database just once to get both number of objects and objects. If you want to do pagination both queries are required. Link to similar question