===================
Setting up NFS/RDMA
===================

:Author:
  NetApp and Open Grid Computing (May 29, 2008)

.. warning::
  This document is probably obsolete.

Overview
========

This document describes how to install and set up the Linux NFS/RDMA client
and server software.

The NFS/RDMA client was first included in Linux 2.6.24. The NFS/RDMA server
was first included in the following release, Linux 2.6.25.

In our testing, we have obtained excellent performance results (full 10 Gbit/s
wire bandwidth with minimal client CPU usage) under many workloads. The code
passes the full Connectathon test suite and operates over both InfiniBand and
iWARP RDMA adapters.

Getting Help
============

If you get stuck, you can ask questions on the
nfs-rdma-devel@lists.sourceforge.net mailing list.

Installation
============

These instructions are a step-by-step guide to building a machine for
use with NFS/RDMA.

- Install an RDMA device

  Any device supported by the drivers in drivers/infiniband/hw is acceptable.

  Testing has been performed using several Mellanox-based IB cards, the
  Ammasso AMS1100 iWARP adapter, and the Chelsio cxgb3 iWARP adapter.

- Install a Linux distribution and tools

  The first kernel release to contain both the NFS/RDMA client and server was
  Linux 2.6.25. Therefore, a distribution compatible with this and subsequent
  Linux kernel releases should be installed.

  The procedures described in this document have been tested with
  distributions from Red Hat's Fedora Project (http://fedora.redhat.com/).

- Install nfs-utils-1.1.2 or greater on the client

  An NFS/RDMA mount point can be obtained by using the mount.nfs command in
  nfs-utils-1.1.2 or greater (nfs-utils-1.1.1 was the first nfs-utils
  version with support for NFS/RDMA mounts, but for various reasons we
  recommend using nfs-utils-1.1.2 or greater). To see which version of
  mount.nfs you are using, type:

  .. code-block:: sh

    $ /sbin/mount.nfs -V

  If the version is less than 1.1.2 or the command does not exist,
  you should install the latest version of nfs-utils.

  Download the latest package from: https://www.kernel.org/pub/linux/utils/nfs

  Uncompress the package and follow the installation instructions.

  If you will not need the idmapper and gssd executables (you do not need
  these to create an NFS/RDMA enabled mount command), the installation
  process can be simplified by disabling these features when running
  configure:

  .. code-block:: sh

    $ ./configure --disable-gss --disable-nfsv4

  To build nfs-utils you will need the tcp_wrappers package installed. For
  more information on this see the package's README and INSTALL files.

  After building the nfs-utils package, there will be a mount.nfs binary in
  the utils/mount directory. This binary can be used to initiate NFS v2, v3,
  or v4 mounts. To initiate a v4 mount, the binary must be called
  mount.nfs4. The standard technique is to create a symlink called
  mount.nfs4 that points to mount.nfs.

  This mount.nfs binary should be installed at /sbin/mount.nfs as follows:

  .. code-block:: sh

    $ sudo cp utils/mount/mount.nfs /sbin/mount.nfs

  In this location, mount.nfs will be invoked automatically for NFS mounts
  by the system mount command.

  .. note::
    mount.nfs, and therefore nfs-utils-1.1.2 or greater, is needed only
    on the NFS client machine. You do not need this specific version of
    nfs-utils on the server. Furthermore, mount.nfs is the only command
    from nfs-utils-1.1.2 that is needed on the client.

- Install a Linux kernel with NFS/RDMA

  The NFS/RDMA client and server are both included in the mainline Linux
  kernel version 2.6.25 and later. This and other versions of the Linux
  kernel can be found at: https://www.kernel.org/pub/linux/kernel/

  Download the sources and place them in an appropriate location.

- Configure the RDMA stack

  Make sure your kernel configuration has RDMA support enabled. Under
  Device Drivers -> InfiniBand support, update the kernel configuration
  to enable InfiniBand support. (The option name is misleading: enabling
  InfiniBand support is required for all RDMA devices, including IB and
  iWARP.)

  Enable the appropriate IB HCA support (mlx4, mthca, ehca, ipath, etc.) or
  iWARP adapter support (amso, cxgb3, etc.).

  If you are using InfiniBand, be sure to enable IP-over-InfiniBand support.

- Configure the NFS client and server

  Your kernel configuration must also have NFS file system support and/or
  NFS server support enabled. These and other NFS-related configuration
  options can be found under File Systems -> Network File Systems.

- Build, install, reboot

  The NFS/RDMA code will be enabled automatically if NFS and RDMA
  are turned on. The NFS/RDMA client and server are configured via the hidden
  SUNRPC_XPRT_RDMA config option that depends on SUNRPC and INFINIBAND. The
  value of SUNRPC_XPRT_RDMA will be:

    #. N if either SUNRPC or INFINIBAND is N; in this case the NFS/RDMA client
       and server will not be built

    #. M if both SUNRPC and INFINIBAND are on (M or Y) and at least one is M;
       in this case the NFS/RDMA client and server will be built as modules

    #. Y if both SUNRPC and INFINIBAND are Y; in this case the NFS/RDMA client
       and server will be built into the kernel

  Therefore, if you have followed the steps above and turned on NFS and RDMA,
  the NFS/RDMA client and server will be built.

  Build a new kernel, install it, and boot it.
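
The nfs-utils version requirement and the SUNRPC_XPRT_RDMA rules described in
the steps above can be checked with a few shell functions before rebooting.
The following is a minimal sketch, not part of nfs-utils or the kernel tree.
The helper names (``version_ge``, ``extract_version``, ``config_value``,
``xprt_rdma_value``) are hypothetical, the version comparison assumes a
``sort`` that supports ``-V`` (GNU coreutils), and the parsing assumes that
``mount.nfs -V`` prints a dotted version number somewhere in its output.

.. code-block:: sh

  # Succeed if version $1 is greater than or equal to version $2.
  # Assumption: "sort -V" (version sort) is available.
  version_ge() {
      [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
  }

  # Print the first dotted version number found on standard input.
  # Assumption: the tool reports its version as digits separated by dots.
  extract_version() {
      grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1
  }

  # Print y, m, or n for kernel config symbol $1 (without the CONFIG_
  # prefix) in the .config file $2. Unset or missing symbols count as n.
  config_value() {
      val=$(sed -n "s/^CONFIG_$1=\([ym]\)\$/\1/p" "$2" | head -n 1)
      printf '%s\n' "${val:-n}"
  }

  # Print the value SUNRPC_XPRT_RDMA is expected to take for the .config
  # file $1, following the three rules listed in this document.
  xprt_rdma_value() {
      sunrpc=$(config_value SUNRPC "$1")
      ib=$(config_value INFINIBAND "$1")
      if [ "$sunrpc" = n ] || [ "$ib" = n ]; then
          echo n      # either dependency is off: nothing is built
      elif [ "$sunrpc" = m ] || [ "$ib" = m ]; then
          echo m      # both on, at least one modular: built as modules
      else
          echo y      # both built in: built into the kernel
      fi
  }

For example, ``version_ge "$(/sbin/mount.nfs -V | extract_version)" 1.1.2``
succeeds when the installed mount.nfs is recent enough, and
``xprt_rdma_value .config``, run from the top of the kernel source tree, prints
the value the hidden option should take for that configuration.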

Check RDMA and NFS Setup
========================

Before configuring the NFS/RDMA software, it is a good idea to test
your new kernel to ensure that the kernel is working correctly.
In particular, it is a good idea to verify that the RDMA stack
is functioning as expected and standard NFS over TCP/IP and/or UDP/IP
is working properly.

- Check RDMA Setup

  If you built the RDMA components as modules, load them at
  this time. For example, if you are using a Mellanox Tavor/Sinai/Arbel
  card:

  .. code-block:: sh

    $ modprobe ib_mthca
    $ modprobe ib_ipoib

  If you are using InfiniBand, make sure there is a Subnet Manager (SM)
  running on the network. If your IB switch has an embedded SM, you can
  use it. Otherwise, you will need to run an SM, such as OpenSM, on one
  of your end nodes.

  If an SM is running on your network, you should see the following:

  .. code-block:: sh

    $ cat /sys/class/infiniband/driverX/ports/1/state
    4: ACTIVE

  where driverX is mthca0, ipath5, ehca3, etc.

  To further test the InfiniBand software stack, use IPoIB (this
  assumes you have two IB hosts named host1 and host2):

  .. code-block:: sh

    host1$ ip link set dev ib0 up
    host1$ ip address add dev ib0 a.b.c.x
    host2$ ip link set dev ib0 up
    host2$ ip address add dev ib0 a.b.c.y
    host1$ ping a.b.c.y
    host2$ ping a.b.c.x

  For other device types, follow the appropriate procedures.

- Check NFS Setup

  For the NFS components enabled above (client and/or server),
  test their functionality over standard Ethernet using TCP/IP or UDP/IP.
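
The port state check shown above can be repeated for every port of every RDMA
device in one pass. This is a minimal sketch under a few assumptions: the
function name ``check_ib_ports`` is hypothetical, the sysfs layout is the
``<device>/ports/<n>/state`` layout used in the example above, and a port is
treated as healthy only when its state file reads exactly ``4: ACTIVE``.

.. code-block:: sh

  # Print the state of every port below the class directory $1
  # (default: /sys/class/infiniband). Succeed only if at least one
  # port was found and every port reports "4: ACTIVE".
  check_ib_ports() {
      root=${1:-/sys/class/infiniband}
      found=0
      inactive=0
      for state_file in "$root"/*/ports/*/state; do
          # An unmatched glob stays literal, so skip unreadable paths.
          [ -r "$state_file" ] || continue
          found=1
          state=$(cat "$state_file")
          printf '%s: %s\n' "$state_file" "$state"
          case $state in
              "4: ACTIVE") ;;
              *) inactive=1 ;;
          esac
      done
      [ "$found" -eq 1 ] && [ "$inactive" -eq 0 ]
  }

Calling ``check_ib_ports`` with no argument inspects the running system. A
failure status means that no port was found or that some port is not active,
which on InfiniBand usually points back to the Subnet Manager check above.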

NFS/RDMA Setup
==============

We recommend that you use two machines, one to act as the client and
one to act as the server.

One time configuration:
-----------------------

- On the server system, configure the /etc/exports file and start the NFS/RDMA server.

  Exports entries with the following formats have been tested::

    /vol0   192.168.0.47(fsid=0,rw,async,insecure,no_root_squash)
    /vol0   192.168.0.0/255.255.255.0(fsid=0,rw,async,insecure,no_root_squash)

  The IP address (or address range) is the client's IPoIB address for an
  InfiniBand HCA or the client's iWARP address for an RNIC.

  .. note::
    The "insecure" option must be used because the NFS/RDMA client does
    not use a reserved port.
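
Because a missing "insecure" option is easy to overlook, the exports file can
be scanned for entries that lack it. The sketch below is illustrative only:
the function name ``exports_missing_insecure`` is hypothetical, and the parser
assumes the simple one-line ``path client(options)`` form shown above. It does
not handle continuation lines, quoted paths, or default option lists.

.. code-block:: sh

  # Print "path client(options)" for every client entry in the exports
  # file $1 (default: /etc/exports) whose option list lacks "insecure".
  # Succeed only if no such entry exists.
  exports_missing_insecure() {
      awk '
          /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
          {
              # Field 1 is the exported path; the rest are client entries.
              for (i = 2; i <= NF; i++)
                  if ($i !~ /[(,]insecure[,)]/) { print $1, $i; bad = 1 }
          }
          END { exit bad }
      ' "${1:-/etc/exports}"
  }

Running ``exports_missing_insecure`` on the server prints nothing and succeeds
when every entry carries the option. Entries meant only for NFS over TCP or
UDP will also be listed, so the output is a hint rather than an error report.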

Each time a machine boots:
--------------------------

- Load and configure the RDMA drivers

  For InfiniBand using a Mellanox adapter:

  .. code-block:: sh

    $ modprobe ib_mthca
    $ modprobe ib_ipoib
    $ ip link set dev ib0 up
    $ ip address add dev ib0 a.b.c.d

  .. note::
    Use a different address on the client and on the server.

- Start the NFS server

  If the NFS/RDMA server was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
  kernel config), load the RDMA transport module:

  .. code-block:: sh

    $ modprobe svcrdma

  Regardless of how the server was built (module or built-in), start the
  server:

  .. code-block:: sh

    $ /etc/init.d/nfs start

  or

  .. code-block:: sh

    $ service nfs start

  Instruct the server to listen on the RDMA transport:

  .. code-block:: sh

    $ echo rdma 20049 > /proc/fs/nfsd/portlist

- On the client system

  If the NFS/RDMA client was built as a module (CONFIG_SUNRPC_XPRT_RDMA=m in
  kernel config), load the RDMA client module:

  .. code-block:: sh

    $ modprobe xprtrdma

  Regardless of how the client was built (module or built-in), use this
  command to mount the NFS/RDMA server:

  .. code-block:: sh

    $ mount -o rdma,port=20049 <IPoIB-server-name-or-address>:/<export> /mnt

  To verify that the mount is using RDMA, run "cat /proc/mounts" and check
  the "proto" field for the given mount.

  Congratulations! You're using NFS/RDMA!
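
The check of the "proto" field can also be scripted. The following is a
minimal sketch: the function name ``rdma_mounts`` is hypothetical, and it
assumes the usual /proc/mounts column order (device, mount point, file system
type, options) with the transport reported as ``proto=rdma`` in the options.

.. code-block:: sh

  # Print "device mountpoint" for each NFS mount in the mounts table $1
  # (default: /proc/mounts) whose options contain proto=rdma.
  # Succeed only if at least one such mount exists.
  rdma_mounts() {
      awk '
          $3 ~ /^nfs4?$/ && $4 ~ /(^|,)proto=rdma(,|$)/ { print $1, $2; n++ }
          END { exit (n == 0) }
      ' "${1:-/proc/mounts}"
  }

Running ``rdma_mounts`` on the client after the mount command above should
list the server export and /mnt. A failure status indicates that no NFS mount
is currently using the RDMA transport.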