[Unionfs] Problem on 2.6.23.11 build
Erez Zadok
ezk at cs.sunysb.edu
Sun Dec 16 12:35:33 EST 2007
In message <47643E4E.70108 at scalableinformatics.com>, Joe Landman writes:
> Erez Zadok wrote:
>
> > Joe, are you sure your're applying the patch against a 2.6.23 kernel source
> > and not a differet one (say, 2.6.22 or 2.6.24)?
>
> Hi Erez:
>
> Yes, I pulled it from
> http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.23.11.tar.bz2 this
> morning.
>
> It looks Trond's NFS patches for 2.6.23 are stamping on it.
>
> root at pegasus-i:/local1/linux-2.6.23.11# grep fs_private
> include/linux/writeback.h
> root at pegasus-i:/local1/linux-2.6.23.11# cd ..
> root at pegasus-i:/local1# mkdir t
> root at pegasus-i:/local1# cd t
> root at pegasus-i:/local1/t# tar -xjf ../linux-2.6.23.11.tar.bz2
> root at pegasus-i:/local1/t# cd linux-2.6.23.11/
> root at pegasus-i:/local1/t/linux-2.6.23.11# grep fs_private
> include/linux/writeback.h
> void *fs_private; /* For use by ->writepages() */
> root at pegasus-i:/local1/t/linux-2.6.23.11# patch -p1 <
> ../../linux-2.6.23-NFS_ALL.dif
Ah, yeah, if you're applying extra patches, I should know. :-)
Yes, b/t .23 and .24, some things changed in the writeback code, esp. for
nfs. But these changes were accompanied by VFS/MM-level changes too. I
hope the patch(es) you're applying take care of all that stuff.
> patching file Documentation/kernel-parameters.txt
> patching file fs/Kconfig
> patching file fs/lockd/mon.c
> patching file fs/lockd/xdr.c
> patching file fs/lockd/xdr4.c
> ...
> patching file net/sunrpc/xprtrdma/verbs.c
> patching file net/sunrpc/xprtrdma/xprt_rdma.h
> patching file net/sunrpc/xprtsock.c
> root at pegasus-i:/local1/t/linux-2.6.23.11# grep fs_private
> include/linux/writeback.h
> root at pegasus-i:/local1/t/linux-2.6.23.11#
>
> Looking at his patches, it looks like they take fs_private out of the struct
>
> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index b4af6bc..a111d3d 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -61,8 +61,6 @@ struct writeback_control {
> unsigned for_reclaim:1; /* Invoked from the page
> allocator */
> unsigned for_writepages:1; /* This is a writepages() call */
> unsigned range_cyclic:1; /* range_start is cyclic */
> -
> - void *fs_private; /* For use by ->writepages() */
> };
>
>
> > my 2.6.23.11 kernel has a field named fs_private in struct
> > writeback_control, defined in include/linux/writeback.h
>
> I am looking at 2.6.24-rc5 to see what happens there. Sure enough,
> fs_private is gone.
>
> Any words of wisdom on this? Should I try the later patch for the
> 2.6.24-rc* kernel?
>
> Thanks.
>
> Joe
You should be able to use the latest unionfs for .23 patch, but apply this
small patch on top.
Cheers,
Erez.
diff --exclude-from=.diff_exclude -ru 23/fs/unionfs/mmap.c latest/fs/unionfs/mmap.c
--- 23/fs/unionfs/mmap.c 2007-12-15 21:45:38.000000000 -0500
+++ latest/fs/unionfs/mmap.c 2007-12-16 12:32:47.000000000 -0500
@@ -19,40 +19,12 @@
#include "union.h"
-/*
- * Some lower file systems (e.g., NFS) expect the VFS to call its writepages
- * only, which in turn will call generic_writepages and invoke each of the
- * lower file system's ->writepage. NFS in particular uses the
- * wbc->fs_private field in its nfs_writepage, which is set in its
- * nfs_writepages. So if we don't call the lower nfs_writepages first, then
- * NFS's nfs_writepage will dereference a NULL wbc->fs_private and cause an
- * OOPS. If, however, we implement a unionfs_writepages and then we do call
- * the lower nfs_writepages, then we "lose control" over the pages we're
- * trying to write to the lower file system: we won't be writing our own
- * new/modified data from the upper pages to the lower pages, and any
- * mmap-based changes are lost.
- *
- * This is a fundamental cache-coherency problem in Linux. The kernel isn't
- * able to support such stacking abstractions cleanly. One possible clean
- * way would be that a lower file system's ->writepage method have some sort
- * of a callback to validate if any upper pages for the same file+offset
- * exist and have newer content in them.
- *
- * This whole NULL ptr dereference is triggered at the lower file system
- * (NFS) because the wbc->for_writepages is set to 1. Therefore, to avoid
- * this NULL pointer dereference, we set this flag to 0 and restore it upon
- * exit. This probably means that we're slightly less efficient in writing
- * pages out, doing them one at a time, but at least we avoid the oops until
- * such day as Linux can better support address_space_ops in a stackable
- * fashion.
- */
static int unionfs_writepage(struct page *page, struct writeback_control *wbc)
{
int err = -EIO;
struct inode *inode;
struct inode *lower_inode;
struct page *lower_page;
- int saved_for_writepages = wbc->for_writepages;
struct address_space *lower_mapping; /* lower inode mapping */
gfp_t mask;
@@ -98,15 +70,11 @@
unlock_page(lower_page);
goto out_release;
}
- /* workaround for some lower file systems: see big comment on top */
- if (wbc->for_writepages && !wbc->fs_private)
- wbc->for_writepages = 0;
BUG_ON(!lower_mapping->a_ops->writepage);
wait_on_page_writeback(lower_page); /* prevent multiple writers */
clear_page_dirty_for_io(lower_page); /* emulate VFS behavior */
err = lower_mapping->a_ops->writepage(lower_page, wbc);
- wbc->for_writepages = saved_for_writepages; /* restore value */
if (err < 0)
goto out_release;
More information about the unionfs
mailing list