Discussion:
UBIFS assert failed in ubifs_set_page_dirty at 1421
Artem Bityutskiy
2014-10-20 13:11:59 UTC
Permalink
Hi,

first of all, what is your architecture? ARM? And how easily can you
reproduce this? And can you try a kernel newer than 3.10?

And for fs-devel and mm people, here is the link to the original report:
http://lists.infradead.org/pipermail/linux-mtd/2014-October/055930.html,
cache 16240kB is below limit 16384kB for oom_score_adj 529
Free memory is -1820kB above reserved
lowmemorykiller: Killing '.networkupgrade' (6924), adj 705,
to free 20968kB on behalf of 'kswapd0' (543) because
cache 16240kB is below limit 16384kB for oom_score_adj 529
Free memory is -2192kB above reserved
OK, no memory and OOM starts. So your system is in trouble anyway :-)
UBIFS assert failed in ubifs_set_page_dirty at 1421 (pid 543)
UBIFS complain here that someone marks a page as dirty "directly", not
through one of the UBIFS functions. And that someone is the page reclaim
path.

Now, I do not really know what is going on here, so I am CCing a couple
of mailing lists, may be someone will help.

Here is what I see is going on.

1. UBIFS wants to make sure that no one marks UBIFS-backed pages (and
actually inodes too) as dirty directly. UBIFS wants everyone to ask
UBIFS to mark a page as dirty.

2. This is because for every dirty page, UBIFS needs to reserve certain
amount of space on the flash media, because all writes are out-of-place,
even when you are changing an existing file.

3. There are exactly 2 places where UBIFS-backed pages may be marked as
dirty:

a) ubifs_write_end() [->wirte_end] - the file write path
b) ubifs_page_mkwrite() [->page_mkwirte] - the file mmap() path

4. If anything calls 'ubifs_set_page_dirty()' directly (not through
write_end()/mkwrite()), and the page was not dirty, UBIFS will complain
with the assertion that you see.
CPU: 3 PID: 543 Comm: kswapd0 Tainted: P O 3.10.0_s40 #1
[<8001d8a0>] (unwind_backtrace+0x0/0x108) from [<80019f44>] (show_stack+0x20/0x24)
[<80019f44>] (show_stack+0x20/0x24) from [<80af2ef8>] (dump_stack+0x24/0x2c)
[<80af2ef8>] (dump_stack+0x24/0x2c) from [<80297234>] (ubifs_set_page_dirty+0x54/0x5c)
[<80297234>] (ubifs_set_page_dirty+0x54/0x5c) from [<800cea60>] (set_page_dirty+0x50/0x78)
[<800cea60>] (set_page_dirty+0x50/0x78) from [<800f4be4>] (try_to_unmap_one+0x1f8/0x3d0)
[<800f4be4>] (try_to_unmap_one+0x1f8/0x3d0) from [<800f4f44>] (try_to_unmap_file+0x9c/0x740)
[<800f4f44>] (try_to_unmap_file+0x9c/0x740) from [<800f5678>] (try_to_unmap+0x40/0x78)
[<800f5678>] (try_to_unmap+0x40/0x78) from [<800d6a04>] (shrink_page_list+0x23c/0x884)
[<800d6a04>] (shrink_page_list+0x23c/0x884) from [<800d76c8>] (shrink_inactive_list+0x21c/0x3c8)
[<800d76c8>] (shrink_inactive_list+0x21c/0x3c8) from [<800d7c20>] (shrink_lruvec+0x3ac/0x524)
[<800d7c20>] (shrink_lruvec+0x3ac/0x524) from [<800d8970>] (kswapd+0x854/0xdc0)
[<800d8970>] (kswapd+0x854/0xdc0) from [<80051e28>] (kthread+0xc8/0xcc)
[<80051e28>] (kthread+0xc8/0xcc) from [<80015198>] (ret_from_fork+0x14/0x20)
So the reclaim path seems to be marking UBIFS-backed pages as dirty
directly, I do not know why, the reclaim path is extremely complex and I
am no expert there. But may be someone on the MM list may help.

Note, this warning is not necessarily fatal. It just indicates that
UBIFS sees something which it believes should not happen.
UBIFS assert failed in do_writepage at 936 (pid 543)
CPU: 1 PID: 543 Comm: kswapd0 Tainted: P O 3.10.0_s40 #1
[<8001d8a0>] (unwind_backtrace+0x0/0x108) from [<80019f44>] (show_stack+0x20/0x24)
[<80019f44>] (show_stack+0x20/0x24) from [<80af2ef8>] (dump_stack+0x24/0x2c)
[<80af2ef8>] (dump_stack+0x24/0x2c) from [<802990b8>] (do_writepage+0x1b8/0x1c4)
[<802990b8>] (do_writepage+0x1b8/0x1c4) from [<802991e8>] (ubifs_writepage+0x124/0x1dc)
[<802991e8>] (ubifs_writepage+0x124/0x1dc) from [<800d6eb8>] (shrink_page_list+0x6f0/0x884)
[<800d6eb8>] (shrink_page_list+0x6f0/0x884) from [<800d76c8>] (shrink_inactive_list+0x21c/0x3c8)
[<800d76c8>] (shrink_inactive_list+0x21c/0x3c8) from [<800d7c20>] (shrink_lruvec+0x3ac/0x524)
[<800d7c20>] (shrink_lruvec+0x3ac/0x524) from [<800d8970>] (kswapd+0x854/0xdc0)
[<800d8970>] (kswapd+0x854/0xdc0) from [<80051e28>] (kthread+0xc8/0xcc)
[<80051e28>] (kthread+0xc8/0xcc) from [<80015198>] (ret_from_fork+0x14/0x20)
And here UBIFS sees a page being writted, but there is no budget
allocated for it, so the write may fail with -ENOSPC (no space), which
is not supposed to ever happen.

This is not necessarily fatal either, but indicates that UBIFS's
assumptions about how the system functions are wrong.

Now the question is: is it UBIFS which has incorrect assumptions, or
this is the Linux MM which is not doing the right thing? I do not know
the answer, let's see if the MM list may give us a clue.

Thanks!

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jijiagang
2014-10-21 02:30:41 UTC
Permalink
Hi,
Thanks for your reply.
first of all, what is your architecture? ARM? And how easily can you reproduce this? And can you try a kernel newer than 3.10?
The architecture is ARM. It can be reproduced in about 10 minutes by run Monkey test.
The test command is:
monkey -s 1000 --ignore-crashes --ignore-timeouts --ignore-security-exceptions --pct-trackball 0 --pct-nav 0 --pct-majornav 0 --pct-anyevent 0 -v -v -v --throttle 300 1200000000 > /sdcard/monkey.log 2>&1 &

We have searched commits in the linux stable git repository, but found no related commits.

Lowmomery killer starts often on android platform.
Now the question is: is it UBIFS which has incorrect assumptions, or this is the Linux MM which is not doing the right thing? I do not know the answer, let's see if the MM list may give us a clue.
Please give us a hand, any reply will be appreciated.

Thanks!

-----Original Message-----
From: Artem Bityutskiy [mailto:***@gmail.com]
Sent: Monday, October 20, 2014 9:12 PM
To: Caizhiyong; linux-***@vger.kernel.org; linux-***@kvack.org
Cc: Jijiagang; ***@intel.com; linux-***@lists.infradead.org; Wanli (welly)
Subject: Re: UBIFS assert failed in ubifs_set_page_dirty at 1421

Hi,

first of all, what is your architecture? ARM? And how easily can you reproduce this? And can you try a kernel newer than 3.10?

And for fs-devel and mm people, here is the link to the original report:
http://lists.infradead.org/pipermail/linux-mtd/2014-October/055930.html,
cache 16240kB is below limit 16384kB for oom_score_adj 529
Free memory is -1820kB above reserved
lowmemorykiller: Killing '.networkupgrade' (6924), adj 705,
to free 20968kB on behalf of 'kswapd0' (543) because
cache 16240kB is below limit 16384kB for oom_score_adj 529
Free memory is -2192kB above reserved
OK, no memory and OOM starts. So your system is in trouble anyway :-)
UBIFS assert failed in ubifs_set_page_dirty at 1421 (pid 543)
UBIFS complain here that someone marks a page as dirty "directly", not through one of the UBIFS functions. And that someone is the page reclaim path.

Now, I do not really know what is going on here, so I am CCing a couple of mailing lists, may be someone will help.

Here is what I see is going on.

1. UBIFS wants to make sure that no one marks UBIFS-backed pages (and actually inodes too) as dirty directly. UBIFS wants everyone to ask UBIFS to mark a page as dirty.

2. This is because for every dirty page, UBIFS needs to reserve certain amount of space on the flash media, because all writes are out-of-place, even when you are changing an existing file.

3. There are exactly 2 places where UBIFS-backed pages may be marked as
dirty:

a) ubifs_write_end() [->wirte_end] - the file write path
b) ubifs_page_mkwrite() [->page_mkwirte] - the file mmap() path

4. If anything calls 'ubifs_set_page_dirty()' directly (not through write_end()/mkwrite()), and the page was not dirty, UBIFS will complain with the assertion that you see.
CPU: 3 PID: 543 Comm: kswapd0 Tainted: P O 3.10.0_s40 #1
[<8001d8a0>] (unwind_backtrace+0x0/0x108) from [<80019f44>]
(show_stack+0x20/0x24) [<80019f44>] (show_stack+0x20/0x24) from
[<80af2ef8>] (dump_stack+0x24/0x2c) [<80af2ef8>]
(dump_stack+0x24/0x2c) from [<80297234>]
(ubifs_set_page_dirty+0x54/0x5c) [<80297234>]
(ubifs_set_page_dirty+0x54/0x5c) from [<800cea60>]
(set_page_dirty+0x50/0x78) [<800cea60>] (set_page_dirty+0x50/0x78)
from [<800f4be4>] (try_to_unmap_one+0x1f8/0x3d0) [<800f4be4>]
(try_to_unmap_one+0x1f8/0x3d0) from [<800f4f44>]
(try_to_unmap_file+0x9c/0x740) [<800f4f44>]
(try_to_unmap_file+0x9c/0x740) from [<800f5678>]
(try_to_unmap+0x40/0x78) [<800f5678>] (try_to_unmap+0x40/0x78) from
[<800d6a04>] (shrink_page_list+0x23c/0x884) [<800d6a04>]
(shrink_page_list+0x23c/0x884) from [<800d76c8>]
(shrink_inactive_list+0x21c/0x3c8)
[<800d76c8>] (shrink_inactive_list+0x21c/0x3c8) from [<800d7c20>]
(shrink_lruvec+0x3ac/0x524) [<800d7c20>] (shrink_lruvec+0x3ac/0x524)
from [<800d8970>] (kswapd+0x854/0xdc0) [<800d8970>]
(kswapd+0x854/0xdc0) from [<80051e28>] (kthread+0xc8/0xcc)
[<80051e28>] (kthread+0xc8/0xcc) from [<80015198>]
(ret_from_fork+0x14/0x20)
So the reclaim path seems to be marking UBIFS-backed pages as dirty directly, I do not know why, the reclaim path is extremely complex and I am no expert there. But may be someone on the MM list may help.

Note, this warning is not necessarily fatal. It just indicates that UBIFS sees something which it believes should not happen.
UBIFS assert failed in do_writepage at 936 (pid 543)
CPU: 1 PID: 543 Comm: kswapd0 Tainted: P O 3.10.0_s40 #1
[<8001d8a0>] (unwind_backtrace+0x0/0x108) from [<80019f44>]
(show_stack+0x20/0x24) [<80019f44>] (show_stack+0x20/0x24) from
[<80af2ef8>] (dump_stack+0x24/0x2c) [<80af2ef8>]
(dump_stack+0x24/0x2c) from [<802990b8>] (do_writepage+0x1b8/0x1c4)
[<802990b8>] (do_writepage+0x1b8/0x1c4) from [<802991e8>]
(ubifs_writepage+0x124/0x1dc) [<802991e8>]
(ubifs_writepage+0x124/0x1dc) from [<800d6eb8>]
(shrink_page_list+0x6f0/0x884) [<800d6eb8>]
(shrink_page_list+0x6f0/0x884) from [<800d76c8>]
(shrink_inactive_list+0x21c/0x3c8)
[<800d76c8>] (shrink_inactive_list+0x21c/0x3c8) from [<800d7c20>]
(shrink_lruvec+0x3ac/0x524) [<800d7c20>] (shrink_lruvec+0x3ac/0x524)
from [<800d8970>] (kswapd+0x854/0xdc0) [<800d8970>]
(kswapd+0x854/0xdc0) from [<80051e28>] (kthread+0xc8/0xcc)
[<80051e28>] (kthread+0xc8/0xcc) from [<80015198>]
(ret_from_fork+0x14/0x20)
And here UBIFS sees a page being writted, but there is no budget allocated for it, so the write may fail with -ENOSPC (no space), which is not supposed to ever happen.

This is not necessarily fatal either, but indicates that UBIFS's assumptions about how the system functions are wrong.

Now the question is: is it UBIFS which has incorrect assumptions, or this is the Linux MM which is not doing the right thing? I do not know the answer, let's see if the MM list may give us a clue.

Thanks!

��{.n�+�������+%��lzwm��b�맲��r��zX���u�ޖ)����w*jg��������ݢj/���z�ޖ��
Dave Chinner
2014-10-21 03:38:47 UTC
Permalink
Post by Artem Bityutskiy
3. There are exactly 2 places where UBIFS-backed pages may be marked as
a) ubifs_write_end() [->wirte_end] - the file write path
b) ubifs_page_mkwrite() [->page_mkwirte] - the file mmap() path
4. If anything calls 'ubifs_set_page_dirty()' directly (not through
write_end()/mkwrite()), and the page was not dirty, UBIFS will complain
with the assertion that you see.
CPU: 3 PID: 543 Comm: kswapd0 Tainted: P O 3.10.0_s40 #1
Kernel is tainted. Not worth wasting time on unless it can be
reproduced on an untainted kernel...

Cheers,

Dave.
--
Dave Chinner
***@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Jijiagang
2014-10-21 08:41:38 UTC
Permalink
Dear Dave,
Thanks for your reply.
We removed some modules to keep kernel untainted. It still can be reproduced.

Here is the log:
UBIFS assert failed in ubifs_set_page_dirty at 1421 (pid 543)
CPU: 1 PID: 543 Comm: kswapd0 Not tainted 3.10.0_s40 #1
[<8001d8a0>] (unwind_backtrace+0x0/0x108) from [<80019f44>] (show_stack+0x20/0x24)
[<80019f44>] (show_stack+0x20/0x24) from [<80af2ef8>] (dump_stack+0x24/0x2c)
[<80af2ef8>] (dump_stack+0x24/0x2c) from [<80297234>] (ubifs_set_page_dirty+0x54/0x5c)
[<80297234>] (ubifs_set_page_dirty+0x54/0x5c) from [<800cea60>] (set_page_dirty+0x50/0x78)
[<800cea60>] (set_page_dirty+0x50/0x78) from [<800f4be4>] (try_to_unmap_one+0x1f8/0x3d0)
[<800f4be4>] (try_to_unmap_one+0x1f8/0x3d0) from [<800f4f44>] (try_to_unmap_file+0x9c/0x740)
[<800f4f44>] (try_to_unmap_file+0x9c/0x740) from [<800f5678>] (try_to_unmap+0x40/0x78)
[<800f5678>] (try_to_unmap+0x40/0x78) from [<800d6a04>] (shrink_page_list+0x23c/0x884)
[<800d6a04>] (shrink_page_list+0x23c/0x884) from [<800d76c8>] (shrink_inactive_list+0x21c/0x3c8)
[<800d76c8>] (shrink_inactive_list+0x21c/0x3c8) from [<800d7c20>] (shrink_lruvec+0x3ac/0x524)
[<800d7c20>] (shrink_lruvec+0x3ac/0x524) from [<800d8970>] (kswapd+0x854/0xdc0)
[<800d8970>] (kswapd+0x854/0xdc0) from [<80051e28>] (kthread+0xc8/0xcc)
[<80051e28>] (kthread+0xc8/0xcc) from [<80015198>] (ret_from_fork+0x14/0x20)
UBIFS assert failed in ubifs_writepage at 1009 (pid 6)
CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 3.10.0_s40 #1
Workqueue: writeback bdi_writeback_workfn (flush-ubifs_1_0)
[<8001d8a0>] (unwind_backtrace+0x0/0x108) from [<80019f44>] (show_stack+0x20/0x24)
[<80019f44>] (show_stack+0x20/0x24) from [<80af2ef8>] (dump_stack+0x24/0x2c)
[<80af2ef8>] (dump_stack+0x24/0x2c) from [<80299294>] (ubifs_writepage+0x1d0/0x1dc)
[<80299294>] (ubifs_writepage+0x1d0/0x1dc) from [<800cefc0>] (__writepage+0x24/0x4c)
[<800cefc0>] (__writepage+0x24/0x4c) from [<800cf288>] (write_cache_pages+0x1b0/0x408)
[<800cf288>] (write_cache_pages+0x1b0/0x408) from [<800cf538>] (generic_writepages+0x58/0x70)
[<800cf538>] (generic_writepages+0x58/0x70) from [<800cf594>] (do_writepages+0x44/0x48)
[<800cf594>] (do_writepages+0x44/0x48) from [<80139a30>] (__writeback_single_inode+0x50/0x238)
[<80139a30>] (__writeback_single_inode+0x50/0x238) from [<8013ae48>] (writeback_sb_inodes+0x264/0x44c)
[<8013ae48>] (writeback_sb_inodes+0x264/0x44c) from [<8013b0c4>] (__writeback_inodes_wb+0x94/0xcc)
[<8013b0c4>] (__writeback_inodes_wb+0x94/0xcc) from [<8013b3dc>] (wb_writeback+0x228/0x2f8)
[<8013b3dc>] (wb_writeback+0x228/0x2f8) from [<8013b6b4>] (wb_do_writeback+0x208/0x24c)
[<8013b6b4>] (wb_do_writeback+0x208/0x24c) from [<8013b774>] (bdi_writeback_workfn+0x7c/0x1dc)
[<8013b774>] (bdi_writeback_workfn+0x7c/0x1dc) from [<8004ac64>] (process_one_work+0x160/0x460)
[<8004ac64>] (process_one_work+0x160/0x460) from [<8004b0ac>] (worker_thread+0x148/0x49c)
[<8004b0ac>] (worker_thread+0x148/0x49c) from [<80051e28>] (kthread+0xc8/0xcc)
[<80051e28>] (kthread+0xc8/0xcc) from [<80015198>] (ret_from_fork+0x14/0x20)
UBIFS assert failed in do_writepage at 936 (pid 6)
CPU: 3 PID: 6 Comm: kworker/u8:0 Not tainted 3.10.0_s40 #1
Workqueue: writeback bdi_writeback_workfn (flush-ubifs_1_0)
[<8001d8a0>] (unwind_backtrace+0x0/0x108) from [<80019f44>] (show_stack+0x20/0x24)
[<80019f44>] (show_stack+0x20/0x24) from [<80af2ef8>] (dump_stack+0x24/0x2c)
[<80af2ef8>] (dump_stack+0x24/0x2c) from [<802990b8>] (do_writepage+0x1b8/0x1c4)
[<802990b8>] (do_writepage+0x1b8/0x1c4) from [<802991e8>] (ubifs_writepage+0x124/0x1dc)
[<802991e8>] (ubifs_writepage+0x124/0x1dc) from [<800cefc0>] (__writepage+0x24/0x4c)
[<800cefc0>] (__writepage+0x24/0x4c) from [<800cf288>] (write_cache_pages+0x1b0/0x408)
[<800cf288>] (write_cache_pages+0x1b0/0x408) from [<800cf538>] (generic_writepages+0x58/0x70)
[<800cf538>] (generic_writepages+0x58/0x70) from [<800cf594>] (do_writepages+0x44/0x48)
[<800cf594>] (do_writepages+0x44/0x48) from [<80139a30>] (__writeback_single_inode+0x50/0x238)
[<80139a30>] (__writeback_single_inode+0x50/0x238) from [<8013ae48>] (writeback_sb_inodes+0x264/0x44c)
[<8013ae48>] (writeback_sb_inodes+0x264/0x44c) from [<8013b0c4>] (__writeback_inodes_wb+0x94/0xcc)
[<8013b0c4>] (__writeback_inodes_wb+0x94/0xcc) from [<8013b3dc>] (wb_writeback+0x228/0x2f8)
[<8013b3dc>] (wb_writeback+0x228/0x2f8) from [<8013b6b4>] (wb_do_writeback+0x208/0x24c)
[<8013b6b4>] (wb_do_writeback+0x208/0x24c) from [<8013b774>] (bdi_writeback_workfn+0x7c/0x1dc)
[<8013b774>] (bdi_writeback_workfn+0x7c/0x1dc) from [<8004ac64>] (process_one_work+0x160/0x460)
[<8004ac64>] (process_one_work+0x160/0x460) from [<8004b0ac>] (worker_thread+0x148/0x49c)
[<8004b0ac>] (worker_thread+0x148/0x49c) from [<80051e28>] (kthread+0xc8/0xcc)
[<80051e28>] (kthread+0xc8/0xcc) from [<80015198>] (ret_from_fork+0x14/0x20)
UBIFS assert failed in ubifs_release_budget at 567 (pid 6)
CPU: 3 PID: 6 Comm: kworker/u8:0 Not tainted 3.10.0_s40 #1
Workqueue: writeback bdi_writeback_workfn (flush-ubifs_1_0)

-----Original Message-----
From: Dave Chinner [mailto:***@fromorbit.com]
Sent: Tuesday, October 21, 2014 11:39 AM
To: Artem Bityutskiy
Cc: Caizhiyong; linux-***@vger.kernel.org; linux-***@kvack.org; Jijiagang; ***@intel.com; linux-***@lists.infradead.org; Wanli (welly)
Subject: Re: UBIFS assert failed in ubifs_set_page_dirty at 1421
Post by Artem Bityutskiy
3. There are exactly 2 places where UBIFS-backed pages may be marked as
a) ubifs_write_end() [->wirte_end] - the file write path
b) ubifs_page_mkwrite() [->page_mkwirte] - the file mmap() path
4. If anything calls 'ubifs_set_page_dirty()' directly (not through
write_end()/mkwrite()), and the page was not dirty, UBIFS will
complain with the assertion that you see.
CPU: 3 PID: 543 Comm: kswapd0 Tainted: P O 3.10.0_s40 #1
Kernel is tainted. Not worth wasting time on unless it can be reproduced on an untainted kernel...

Cheers,

Dave.
--
Dave Chinner
***@fromorbit.com
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Loading...