From: Yunsheng Lin <linyunsheng@huawei.com>
Date: Fri, 20 Oct 2023 17:59:48 +0800
Subject: [PATCH] page_pool: unify frag_count handling in
 page_pool_is_last_frag()

Currently when page_pool_create() is called with the
PP_FLAG_PAGE_FRAG flag, page_pool_alloc_pages() is only
allowed to be called under the below constraints:
1. page_pool_fragment_page() needs to be called to set up
   page->pp_frag_count immediately.
2. page_pool_defrag_page() often needs to be called to drain
   page->pp_frag_count when no user is holding on to that
   page any more (see the sketch below).
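
Not part of this patch, just a rough sketch of what those two
constraints look like from a driver's point of view today.
rx_alloc_split_page() and rx_put_frag() are made-up driver-side
names; only the page_pool helpers they call are from the existing
API, and 'nr' is whatever split count the driver wants:

/* illustrative only: split one page into 'nr' fragments under the
 * current PP_FLAG_PAGE_FRAG constraints
 */
static struct page *rx_alloc_split_page(struct page_pool *pool,
					long nr, gfp_t gfp)
{
	struct page *page = page_pool_alloc_pages(pool, gfp);

	if (!page)
		return NULL;

	/* constraint 1: set up page->pp_frag_count right away */
	page_pool_fragment_page(page, nr);
	return page;
}

/* constraint 2: every user drains its share of pp_frag_count when
 * it is done with the page; the last one recycles it
 */
static void rx_put_frag(struct page_pool *pool, struct page *page)
{
	if (page_pool_defrag_page(page, 1) == 0)
		page_pool_put_defragged_page(pool, page, -1, false);
}
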
Those constraints exist in order to support splitting a page
into multiple fragments.

And those constraints have some overhead because of the
cache line dirtying/bouncing and the atomic update.

Those constraints are unavoidable when we need a page to be
split into more than one fragment, but there are also cases
where we want to avoid those constraints and their overhead,
when a page can't be split as it can only hold one fragment
as requested by the user. The different use cases are:
use case 1: allocate a page without page splitting.
use case 2: allocate a page with page splitting.
use case 3: allocate a page with or without page splitting
            depending on the fragment size.

Currently the page pool only provides the page_pool_alloc_pages()
and page_pool_alloc_frag() APIs to enable use cases 1 & 2
separately, so we cannot use a combination of 1 & 2 to enable 3;
it is not possible yet because of the per-page_pool flag
PP_FLAG_PAGE_FRAG.
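
For reference, with the current API the first two use cases map to
two separate entry points. This is only a sketch; 'pool', 'page',
'offset', 'size' and 'gfp' are driver locals, not new API:

	/* use case 1: one buffer per page, no page splitting */
	page = page_pool_alloc_pages(pool, gfp);

	/* use case 2: several buffers per page, the page is split */
	page = page_pool_alloc_frag(pool, &offset, size, gfp);

Use case 3 would need a per-page mix of the two on the same pool,
which the per-page_pool flag currently rules out.
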
So in order to allow allocating unsplit pages without the
overhead of split pages, while still allowing allocation of
split pages, we need to remove the per-page_pool flag in
page_pool_is_last_frag(). As best as I can think of, there
are two methods to do that:
1. Add a per-page flag/bit to indicate whether a page is split
   or not, which means we might need to update that flag/bit
   every time the page is recycled, dirtying the cache line of
   'struct page' for use case 1.
2. Unify the page->pp_frag_count handling for both split and
   unsplit pages by assuming all pages in the page pool are
   split into one big fragment initially (the two methods are
   contrasted in the sketch below).
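
To make the trade-off concrete, here is a hypothetical sketch of
what each method costs on the recycle path; the 'pp_split' field
of method 1 does not exist anywhere, it only illustrates the
rejected approach:

	/* method 1 (rejected): a per-page split bit would have to be
	 * written on every recycle, dirtying the 'struct page' cache
	 * line even for use case 1 (hypothetical field)
	 */
	page->pp_split = false;

	/* method 2 (this patch): establish "one big fragment" once at
	 * allocation time; an unsplit page then needs no extra write
	 * when it is recycled
	 */
	page_pool_fragment_page(page, 1);
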
As the page pool already supports use case 1 without dirtying
the cache line of 'struct page' whenever a page is recycled, we
need to support the above use case 3 with minimal overhead,
especially without adding any noticeable overhead for use case 1.
And as we are already doing an optimization by not updating
pp_frag_count in page_pool_defrag_page() for the last fragment
user, this patch chooses method 2 and unifies the pp_frag_count
handling to support the above use case 3.
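
In other words (a condensed view of the page_pool_defrag_page()
change below, not additional code), once every page enters the
pool with pp_frag_count set to 1, the put path for an unsplit
page stays read-only with respect to 'struct page':

	/* page_pool_is_last_frag() -> page_pool_defrag_page(page, 1):
	 * for an unsplit page pp_frag_count is still 1, so the read
	 * matches and we report "last user" without any atomic update
	 * or extra dirtying of the 'struct page' cache line
	 */
	if (atomic_long_read(&page->pp_frag_count) == 1)
		return 0;
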
Micro-benchmark testing in [1] shows no noticeable performance
degradation with this patch applied, and gives some justification
for unifying the frag_count handling.

1. https://lore.kernel.org/all/bf2591f8-7b3c-4480-bb2c-31dc9da1d6ac@huawei.com/

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231020095952.11055-2-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
---
--- a/include/net/page_pool/helpers.h
+++ b/include/net/page_pool/helpers.h
@@ -115,28 +115,49 @@ static inline long page_pool_defrag_page
 	long ret;
 
 	/* If nr == pp_frag_count then we have cleared all remaining
-	 * references to the page. No need to actually overwrite it, instead
-	 * we can leave this to be overwritten by the calling function.
+	 * references to the page:
+	 * 1. 'n == 1': no need to actually overwrite it.
+	 * 2. 'n != 1': overwrite it with one, which is the rare case
+	 *              for pp_frag_count draining.
 	 *
-	 * The main advantage to doing this is that an atomic_read is
-	 * generally a much cheaper operation than an atomic update,
-	 * especially when dealing with a page that may be partitioned
-	 * into only 2 or 3 pieces.
+	 * The main advantage to doing this is that not only we avoid a atomic
+	 * update, as an atomic_read is generally a much cheaper operation than
+	 * an atomic update, especially when dealing with a page that may be
+	 * partitioned into only 2 or 3 pieces; but also unify the pp_frag_count
+	 * handling by ensuring all pages have partitioned into only 1 piece
+	 * initially, and only overwrite it when the page is partitioned into
+	 * more than one piece.
 	 */
-	if (atomic_long_read(&page->pp_frag_count) == nr)
+	if (atomic_long_read(&page->pp_frag_count) == nr) {
+		/* As we have ensured nr is always one for constant case using
+		 * the BUILD_BUG_ON(), only need to handle the non-constant case
+		 * here for pp_frag_count draining, which is a rare case.
+		 */
+		BUILD_BUG_ON(__builtin_constant_p(nr) && nr != 1);
+		if (!__builtin_constant_p(nr))
+			atomic_long_set(&page->pp_frag_count, 1);
+
 		return 0;
+	}
 
 	ret = atomic_long_sub_return(nr, &page->pp_frag_count);
 	WARN_ON(ret < 0);
+
+	/* We are the last user here too, reset pp_frag_count back to 1 to
+	 * ensure all pages have been partitioned into 1 piece initially,
+	 * this should be the rare case when the last two fragment users call
+	 * page_pool_defrag_page() currently.
+	 */
+	if (unlikely(!ret))
+		atomic_long_set(&page->pp_frag_count, 1);
+
 	return ret;
 }
 
-static inline bool page_pool_is_last_frag(struct page_pool *pool,
-					  struct page *page)
+static inline bool page_pool_is_last_frag(struct page *page)
 {
-	/* If fragments aren't enabled or count is 0 we were the last user */
-	return !(pool->p.flags & PP_FLAG_PAGE_FRAG) ||
-	       (page_pool_defrag_page(page, 1) == 0);
+	/* If page_pool_defrag_page() returns 0, we were the last user */
+	return page_pool_defrag_page(page, 1) == 0;
 }
@@ -161,7 +182,7 @@ static inline void page_pool_put_page(st
 	 * allow registering MEM_TYPE_PAGE_POOL, but shield linker.
 	 */
 #ifdef CONFIG_PAGE_POOL
-	if (!page_pool_is_last_frag(pool, page))
+	if (!page_pool_is_last_frag(page))
 		return;
 
 	page_pool_put_defragged_page(pool, page, dma_sync_size, allow_direct);
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -380,6 +380,14 @@ static void page_pool_set_pp_info(struct
 	page->pp = pool;
 	page->pp_magic |= PP_SIGNATURE;
+
+	/* Ensuring all pages have been split into one fragment initially:
+	 * page_pool_set_pp_info() is only called once for every page when it
+	 * is allocated from the page allocator and page_pool_fragment_page()
+	 * is dirtying the same cache line as the page->pp_magic above, so
+	 * the overhead is negligible.
+	 */
+	page_pool_fragment_page(page, 1);
 	if (pool->p.init_callback)
 		pool->p.init_callback(page, pool->p.init_arg);
 }
@@ -676,7 +684,7 @@ void page_pool_put_page_bulk(struct page
 		struct page *page = virt_to_head_page(data[i]);
 
 		/* It is not the last user for the page frag case */
-		if (!page_pool_is_last_frag(pool, page))
+		if (!page_pool_is_last_frag(page))
 			continue;
 
 		page = __page_pool_put_page(pool, page, -1, false);
@@ -752,8 +760,7 @@ struct page *page_pool_alloc_frag(struct
 	unsigned int max_size = PAGE_SIZE << pool->p.order;
 	struct page *page = pool->frag_page;
 
-	if (WARN_ON(!(pool->p.flags & PP_FLAG_PAGE_FRAG) ||
-		    size > max_size))
+	if (WARN_ON(size > max_size))
 		return NULL;
 
 	size = ALIGN(size, dma_get_cache_alignment());