=================================================
Linux API for read access to z/VM Monitor Records
=================================================

Date  : 2004-Nov-26

Author: Gerald Schaefer (geraldsc@de.ibm.com)


Description
===========
This item delivers a new Linux API in the form of a misc char device that is
usable from user space and allows read access to the z/VM Monitor Records
collected by the `*MONITOR` System Service of z/VM.


User Requirements
=================
The z/VM guest on which you want to access this API must be configured to
allow IUCV connections to the `*MONITOR` service, i.e. it needs the
IUCV `*MONITOR` statement in its user entry. If the monitor DCSS to be used is
restricted (likely), you also need the NAMESAVE <DCSS NAME> statement.
This item will use the IUCV device driver to access the z/VM services, so you
need a kernel with IUCV support. You also need z/VM version 4.4 or 5.1.

There are two options for loading the monitor DCSS (examples assume
that the monitor DCSS begins at 144 MB and ends at 152 MB). You can query the
location of the monitor DCSS with the Class E privileged CP command Q NSS MAP
(the values BEGPAG and ENDPAG are given in units of 4K pages).
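
The page arithmetic can be sketched as follows. This is a minimal sketch, not
CP output: it assumes that BEGPAG and ENDPAG are printed as hexadecimal page
numbers and that ENDPAG names the last page belonging to the segment. The
values 9000 and 97FF are hypothetical, chosen only to match the 144 MB to
152 MB example used in this document.

```python
PAGE_SIZE = 4096          # BEGPAG/ENDPAG are given in units of 4K pages
MB = 1024 * 1024


def dcss_range(begpag_hex, endpag_hex):
    """Return (start, end) byte addresses of the segment, end exclusive.

    Assumption: both arguments are hexadecimal page numbers and ENDPAG is
    the last page that still belongs to the segment.
    """
    start = int(begpag_hex, 16) * PAGE_SIZE
    end = (int(endpag_hex, 16) + 1) * PAGE_SIZE
    return start, end


# Hypothetical values matching the 144 MB .. 152 MB example.
start, end = dcss_range("9000", "97FF")
print(start // MB, end // MB)   # prints: 144 152
```

The resulting addresses are the ones that the storage layout of the guest has
to leave free, whichever of the two options is used.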

See also "CP Command and Utility Reference" (SC24-6081-00) for more information
on the DEF STOR and Q NSS MAP commands, as well as "Saved Segments Planning
and Administration" (SC24-6116-00) for more information on DCSSes.

1st option:
-----------
You can use the CP command DEF STOR CONFIG to define a "memory hole" in your
guest virtual storage around the address range of the DCSS.

Example::

        DEF STOR CONFIG 0.140M 200M.200M

This defines two blocks of storage: the first is 140MB in size and begins at
address 0MB, the second is 200MB in size and begins at address 200MB,
resulting in a total storage of 340MB. Note that the first block should
always start at 0 and be at least 64MB in size.
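
To double-check such a layout before issuing the command, the example can be
worked through in a few lines. This is only an illustrative sketch: the helper
names are hypothetical, and the parser handles just the "M" suffix and a bare
0 as they appear in the example above, not the full DEF STOR CONFIG syntax.

```python
MB = 1024 * 1024


def to_bytes(token):
    """Convert '140M' or a bare '0' to bytes (sketch, example syntax only)."""
    if token.endswith("M"):
        return int(token[:-1]) * MB
    if int(token) == 0:
        return 0
    raise ValueError("unsupported size token: " + token)


def parse_config(spec):
    """Parse 'start.size' pairs such as '0.140M 200M.200M'."""
    blocks = []
    for pair in spec.split():
        start, size = pair.split(".")
        blocks.append((to_bytes(start), to_bytes(size)))
    return blocks


def hole_covers(blocks, dcss_start, dcss_end):
    """True if no storage block overlaps [dcss_start, dcss_end)."""
    return all(start + size <= dcss_start or start >= dcss_end
               for start, size in blocks)


blocks = parse_config("0.140M 200M.200M")
print(sum(size for _, size in blocks) // MB)        # prints: 340
print(hole_covers(blocks, 144 * MB, 152 * MB))      # prints: True
```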

2nd option:
-----------
Your guest virtual storage has to end below the starting address of the DCSS
and you have to specify the "mem=" kernel parameter in your parmfile with a
value greater than the ending address of the DCSS.

Example::

        DEF STOR 140M

This defines 140MB storage size for your guest. In addition, the parameter
"mem=160M" is added to the parmfile.


User Interface
==============
The char device is implemented as a kernel module named "monreader",
which can be loaded via the modprobe command, or it can be compiled into the
kernel instead. There is one optional module (or kernel) parameter, "mondcss",
to specify the name of the monitor DCSS. If the module is compiled into the
kernel, the kernel parameter "monreader.mondcss=<DCSS NAME>" can be specified
in the parmfile.

The default name for the DCSS is "MONDCSS" if none is specified. If other
users are already connected to the `*MONITOR` service (e.g.
Performance Toolkit), the monitor DCSS is already defined and you have to use
the same DCSS. The CP command Q MONITOR (Class E privileged) shows the name
of the monitor DCSS, if already defined, and the users connected to the
`*MONITOR` service.
Refer to the "z/VM Performance" book (SC24-6109-00) on how to create a monitor
DCSS if your z/VM doesn't have one already. You need Class E privileges to
define and save a DCSS.

Example:
--------

::

        modprobe monreader mondcss=MYDCSS

This loads the module and sets the DCSS name to "MYDCSS".

NOTE:
-----
This API provides no interface to control the `*MONITOR` service, e.g. to
specify which data should be collected. This can be done by the CP command
MONITOR (Class E privileged), see "CP Command and Utility Reference".

Device nodes with udev:
-----------------------
After loading the module, a char device will be created along with the device
node /<udev directory>/monreader.

Device nodes without udev:
--------------------------
If your distribution does not support udev, a device node will not be created
automatically and you have to create it manually after loading the module.
To do so, you need to know the major and minor numbers of the device. These
numbers can be found in /sys/class/misc/monreader/dev.

Typing cat /sys/class/misc/monreader/dev will give an output of the form
<major>:<minor>. The device node can be created via the mknod command: enter
mknod <name> c <major> <minor>, where <name> is the name of the device node
to be created.

Example:
--------

::

        # modprobe monreader
        # cat /sys/class/misc/monreader/dev
        10:63
        # mknod /dev/monreader c 10 63

This loads the module with the default monitor DCSS (MONDCSS) and creates a
device node.
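
The same steps can also be scripted. The following is a minimal sketch that
parses the <major>:<minor> output and creates the node. The helper names and
the 0600 permission bits are assumptions made for illustration, and creating
the node requires root privileges, just like mknod.

```python
import os
import stat


def parse_dev(text):
    """Split '<major>:<minor>' as printed by the sysfs dev attribute."""
    major, minor = text.strip().split(":")
    return int(major), int(minor)


def create_node(path="/dev/monreader",
                sysfs="/sys/class/misc/monreader/dev"):
    """Create the char device node for monreader (needs root).

    The mode 0600 is an assumption; pick permissions that fit your setup.
    """
    with open(sysfs) as f:
        major, minor = parse_dev(f.read())
    os.mknod(path, 0o600 | stat.S_IFCHR, os.makedev(major, minor))
```

With the output from the example above, parse_dev("10:63") yields the pair
(10, 63).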

File operations:
----------------
The following file operations are supported: open, release, read, poll.
There are two alternative methods for reading: either non-blocking read in
conjunction with polling, or blocking read without polling. IOCTLs are not
supported.
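
The non-blocking method could look roughly like the sketch below. The device
path is taken from the mknod example in this document; the function names,
buffer size and timeout are arbitrary choices for illustration.

```python
import os
import select


def open_nonblocking(path="/dev/monreader"):
    """Open the device for non-blocking reads."""
    return os.open(path, os.O_RDONLY | os.O_NONBLOCK)


def read_when_ready(fd, bufsize=4096, timeout_ms=1000):
    """Poll fd for readable data, then issue a single read.

    Returns the bytes read (b"" for a 0 byte read), or None if the
    timeout expired before any data became available.
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    if not poller.poll(timeout_ms):
        return None
    return os.read(fd, bufsize)
```

For the blocking method, open the device without O_NONBLOCK and call
os.read() directly; no polling is needed in that case.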

Read:
-----
Reading from the device provides a 12-byte monitor control element (MCE),
followed by a set of one or more contiguous monitor records (similar to the
output of the CMS utility MONWRITE without the 4K control blocks). The MCE
contains information on the type of the following record set (sample/event
data), the monitor domains contained within it and the start and end address
of the record set in the monitor DCSS. The start and end address can be used
to determine the size of the record set; the end address is the address of the
last byte of data. The start address is needed to handle "end-of-frame" records
correctly (domain 1, record 13), i.e. it can be used to determine the record
start offset relative to a 4K page (frame) boundary.
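
The size and offset calculation can be sketched as follows. Note the
assumption: this sketch places the start address in bytes 4-7 and the end
address in bytes 8-11 of the MCE, both as big-endian 32-bit values. That field
layout is not stated in this document, so verify it against the MCE
description in "z/VM Performance" before relying on it. The type and domain
fields are not decoded here.

```python
import struct

MCE_SIZE = 12
PAGE_SIZE = 4096


def parse_mce(mce):
    """Decode the addresses from a 12 byte MCE.

    Assumed layout (check the z/VM documentation): start address at
    offset 4, end address at offset 8, both big-endian 32-bit.
    """
    if len(mce) != MCE_SIZE:
        raise ValueError("an MCE is exactly 12 bytes long")
    start, end = struct.unpack_from(">II", mce, 4)
    return {
        "start": start,
        "end": end,
        # The end address is the address of the last byte of data.
        "size": end - start + 1,
        # Record start offset relative to the 4K frame boundary.
        "frame_offset": start % PAGE_SIZE,
    }
```

The "+ 1" in the size follows directly from the end address being inclusive.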

See "Appendix A: `*MONITOR`" in the "z/VM Performance" document for a description
of the monitor control element layout. The layout of the monitor records can
be found here (z/VM 5.1): https://www.vm.ibm.com/pubs/mon510/index.html

The layout of the data stream provided by the monreader device is as follows::

        ...
        <0 byte read>
        <first MCE>              \
        <first set of records>    |
        ...                       |- data set
        <last MCE>                |
        <last set of records>    /
        <0 byte read>
        ...

There may be more than one combination of MCE and corresponding record set
within one data set and the end of each data set is indicated by a successful
read with a return value of 0 (0 byte read).
Any received data must be considered invalid until a complete set has been
read successfully, including the closing 0 byte read. Therefore you should
always read the complete set into a buffer before processing the data.
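
A buffering loop along these lines could be used. It is a minimal sketch: the
function name is hypothetical and the read callable is passed in, so the same
loop works with a blocking os.read() or with a poll-based reader.

```python
def read_data_set(read_chunk):
    """Collect data until the closing 0 byte read and return it as bytes.

    read_chunk is any callable that returns the bytes of one read, for
    example: lambda: os.read(fd, 4096) on a blocking descriptor.
    If read_chunk raises, the partial data is dropped with the exception,
    which matches the rule that an incomplete set is invalid.
    """
    chunks = []
    while True:
        chunk = read_chunk()
        if not chunk:               # 0 byte read: the data set is complete
            return b"".join(chunks)
        chunks.append(chunk)
```

Only the value returned by this function should be handed on for processing;
nothing is exposed before the closing 0 byte read has been seen.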

The maximum size of a data set can be as large as the size of the
monitor DCSS, so size the buffer accordingly or use dynamic memory allocation.
The size of the monitor DCSS will be printed into syslog after loading the
module. You can also use the (Class E privileged) CP command Q NSS MAP to
list all available segments and information about them.

As with most char devices, error conditions are indicated by returning a
negative value for the number of bytes read. In this case, the errno variable
indicates the error condition:

EIO:
        reply failed, read data is invalid and the application should
        discard the data read since the last successful read with 0 size.
EFAULT:
        copy_to_user failed, read data is invalid and the application should
        discard the data read since the last successful read with 0 size.
EAGAIN:
        occurs on a non-blocking read if there is no data available at the
        moment. There is no data missing or corrupted, just try again or rather
        use polling for non-blocking reads.
EOVERFLOW:
        message limit reached, the data read since the last successful
        read with 0 size is valid but subsequent records may be missing.

In the last case (EOVERFLOW) there may be missing data, in the first two cases
(EIO, EFAULT) there will be missing data. It's up to the application whether to
continue reading subsequent data or rather exit.
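
One way to turn these rules into code is a small dispatch on errno, as
sketched below. The function name and the returned action strings are
hypothetical; only the errno values and their meaning come from the list
above.

```python
import errno


def classify_read_error(exc):
    """Map an OSError raised by a read to the action described above."""
    if exc.errno == errno.EAGAIN:
        return "retry"           # no data yet, nothing missing or corrupted
    if exc.errno in (errno.EIO, errno.EFAULT):
        return "discard"         # drop data since the last 0 byte read
    if exc.errno == errno.EOVERFLOW:
        return "keep-with-gap"   # data is valid, later records may be missing
    return "unknown"
```

A caller would catch OSError around the read, drop its buffer on "discard",
and decide for itself whether to continue or exit after "keep-with-gap".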

Open:
-----
Only one user is allowed to open the char device. If it is already in use, the
open function will fail (return a negative value) and set errno to EBUSY.
The open function may also fail if an IUCV connection to the `*MONITOR` service
cannot be established. In this case errno will be set to EIO and an error
message with an IPUSER SEVER code will be printed into syslog. The IPUSER SEVER
codes are described in the "z/VM Performance" book, Appendix A.
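
The two documented failure cases can be told apart as in the following
sketch. The function name, the return convention and the message texts are
illustrative assumptions; the device path is the one from the mknod example.

```python
import errno
import os


def open_monreader(path="/dev/monreader"):
    """Open the device read-only.

    Returns (fd, None) on success or (None, reason) on failure.
    """
    try:
        return os.open(path, os.O_RDONLY), None
    except OSError as exc:
        if exc.errno == errno.EBUSY:
            return None, "device is already opened by another user"
        if exc.errno == errno.EIO:
            return None, ("no IUCV connection to *MONITOR, "
                          "see syslog for the IPUSER SEVER code")
        return None, os.strerror(exc.errno)
```

Any other errno, such as a missing device node, is passed through as the
plain system error text.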

NOTE:
-----
As soon as the device is opened, incoming messages will be accepted and they
will count against the message limit, i.e. opening the device without reading
from it will eventually trigger the "message limit reached" error (EOVERFLOW
error code).