================================
Application Data Integrity (ADI)
================================

The SPARC M7 processor adds the Application Data Integrity (ADI) feature.
ADI allows a task to set version tags on any subset of its address
space. Once ADI is enabled and version tags are set for ranges of
a task's address space, the processor compares the tag in pointers
to memory in these ranges with the version previously set by the
application. Access to memory is granted only if the tag in the given
pointer matches the tag set by the application. On a mismatch, the
processor raises an exception.

The following steps must be taken by a task to enable ADI fully:

1. Set the user mode PSTATE.mcde bit. This acts as the master switch
   that enables or disables ADI for the task's entire address space.

2. Set the TTE.mcd bit on any TLB entries that correspond to the range of
   addresses ADI is being enabled on. The MMU checks the version tag only
   on the pages that have the TTE.mcd bit set.

3. Set the version tag for virtual addresses using the stxa instruction
   and one of the MCD specific ASIs. Each stxa instruction sets the
   given tag for one ADI block size number of bytes. This step must
   be repeated across the entire page to set tags for the entire page.

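Step 3 amounts to walking the range in strides of one ADI block. The
following is a minimal, portable sketch of that walk, not the kernel's or
any library's API: adi_tag_range(), adi_set_tag_fn and the shadow-array
stand-in are hypothetical names introduced here for illustration. On real
hardware the callback would issue the stxa to an MCD ASI instead of
writing to an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: on SPARC M7 this is where the stxa would go. */
typedef void (*adi_set_tag_fn)(char *block, int version, void *ctx);

/*
 * Visit every ADI block in [start, start + len) once and return the
 * number of blocks visited. len is assumed to be a multiple of blksz.
 */
static size_t adi_tag_range(char *start, size_t len, size_t blksz,
                            int version, adi_set_tag_fn set_tag, void *ctx)
{
    size_t off, count = 0;

    assert(blksz != 0);
    for (off = 0; off < len; off += blksz) {
        set_tag(start + off, version, ctx);
        count++;
    }
    return count;
}

/* Stand-in for the hardware: one shadow byte per ADI block. */
struct adi_shadow {
    char *base;
    size_t blksz;
    unsigned char *tags;
};

static void adi_shadow_set_tag(char *block, int version, void *ctx)
{
    struct adi_shadow *s = ctx;

    s->tags[(size_t)(block - s->base) / s->blksz] = (unsigned char)version;
}
```

With a 64-byte block size, an 8192-byte page needs 128 such stores.
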
The ADI block size for the platform is provided by the hypervisor to the
kernel in machine description tables. The hypervisor also provides the
number of top bits in the virtual address that specify the version tag.
Once a version tag has been set for a memory location, the tag is stored
in the physical memory and the same tag must be present in the ADI
version tag bits of the virtual address being presented to the MMU. For
example, on the SPARC M7 processor, the MMU uses bits 63-60 for version
tags and the ADI block size is the same as the cacheline size, which is
64 bytes. A task that sets the ADI version to, say, 10 on a range of
memory must access that memory using virtual addresses that contain 0xa
in bits 63-60.

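The relationship between a plain address and its versioned form is simple
bit arithmetic. The sketch below assumes a 64-bit virtual address whose top
nbits hold the tag, as described above; adi_versioned_addr() and
adi_addr_version() are hypothetical helper names, not an existing API.

```c
#include <assert.h>
#include <stdint.h>

/* Replace the top nbits of addr with version; the other bits are kept. */
static uint64_t adi_versioned_addr(uint64_t addr, unsigned version,
                                   unsigned nbits)
{
    uint64_t tag_mask;

    assert(nbits > 0 && nbits < 64);
    tag_mask = ~(uint64_t)0 << (64 - nbits);
    return (addr & ~tag_mask) | (((uint64_t)version << (64 - nbits)) & tag_mask);
}

/* Extract the version tag held in the top nbits of addr. */
static unsigned adi_addr_version(uint64_t addr, unsigned nbits)
{
    assert(nbits > 0 && nbits < 64);
    return (unsigned)(addr >> (64 - nbits));
}
```

With nbits = 4 and version 10, the address 0x0000123456789000 becomes
0xa000123456789000, which matches the M7 example of 0xa in bits 63-60.
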
ADI is enabled on a set of pages using mprotect() with the PROT_ADI flag.
When ADI is enabled on a set of pages by a task for the first time, the
kernel sets the PSTATE.mcde bit for the task. Version tags for memory
addresses are set with an stxa instruction on the addresses using
ASI_MCD_PRIMARY or ASI_MCD_ST_BLKINIT_PRIMARY. The ADI block size is
provided by the hypervisor to the kernel. The kernel returns the value
of the ADI block size to userspace using the auxiliary vector along with
other ADI info. The following auxiliary vectors are provided by the
kernel:

    ============    ===========================================
    AT_ADI_BLKSZ    ADI block size. This is the granularity and
                    alignment, in bytes, of ADI versioning.
    AT_ADI_NBITS    Number of ADI version bits in the VA
    ============    ===========================================


IMPORTANT NOTES
===============

- Version tag values of 0x0 and 0xf are reserved. These values match any
  tag in the virtual address and never generate a mismatch exception.

- Version tags are set on virtual addresses from userspace even though
  tags are stored in physical memory. Tags are set on a physical page
  after it has been allocated to a task and a pte has been created for
  it.

- When a task frees a memory page it had set version tags on, the page
  goes back to the free page pool. When this page is re-allocated to a
  task, the kernel clears the page using the block initialization ASI,
  which clears the version tags for the page as well. If a page
  allocated to a task is freed and allocated back to the same task, old
  version tags set by the task on that page will no longer be present.

- ADI tag mismatches are not detected for non-faulting loads.

- The kernel does not set any tags for user pages and it is entirely a
  task's responsibility to set any version tags. The kernel does ensure
  the version tags are preserved if a page is swapped out to the disk
  and swapped back in. It also preserves the version tags if a page is
  migrated.

- ADI works for any size pages. A userspace task need not be aware of
  page size when using ADI. It can simply select a virtual address
  range, enable ADI on the range using mprotect() and set version tags
  for the entire range. mprotect() ensures the range is aligned to page
  size and is a multiple of page size.

- ADI tags can only be set on writable memory. For example, ADI tags
  cannot be set on read-only mappings.

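The matching rule in the first note can be expressed as a small software
model. This is only an illustration of the rule as stated above, assuming
the 4-bit tags of the SPARC M7; the comparison itself is done by the
hardware, and adi_tag_is_reserved() and adi_access_ok() are hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>

/* 0x0 and 0xf are the reserved version tag values (4-bit tags assumed). */
static bool adi_tag_is_reserved(unsigned tag)
{
    return tag == 0x0 || tag == 0xf;
}

/*
 * Model of the check: mem_tag is the version stored for the memory
 * block, ptr_tag is the version carried in the virtual address.
 */
static bool adi_access_ok(unsigned mem_tag, unsigned ptr_tag)
{
    if (adi_tag_is_reserved(mem_tag))
        return true;    /* reserved values match any pointer tag */
    return mem_tag == ptr_tag;
}
```

A practical consequence is that a task wanting mismatch detection should
pick version values other than 0x0 and 0xf.
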

ADI related traps
=================

With ADI enabled, the following new traps may occur:

Disrupting memory corruption
----------------------------

When a store accesses a memory location that has TTE.mcd=1,
the task is running with ADI enabled (PSTATE.mcde=1), and the ADI
tag in the address used (bits 63:60) does not match the tag set on
the corresponding cacheline, a memory corruption trap occurs. By
default, it is a disrupting trap and is sent to the hypervisor
first. The hypervisor creates a sun4v error report and sends a
resumable error (TT=0x7e) trap to the kernel. The kernel sends
a SIGSEGV to the task that resulted in this trap with the following
info::

    siginfo.si_signo = SIGSEGV;
    siginfo.si_errno = 0;
    siginfo.si_code = SEGV_ADIDERR;
    siginfo.si_addr = addr; /* PC where first mismatch occurred */
    siginfo.si_trapno = 0;

Precise memory corruption
-------------------------

When a store accesses a memory location that has TTE.mcd=1,
the task is running with ADI enabled (PSTATE.mcde=1), and the ADI
tag in the address used (bits 63:60) does not match the tag set on
the corresponding cacheline, a memory corruption trap occurs. If
MCD precise exception is enabled (MCDPERR=1), a precise
exception is sent to the kernel with TT=0x1a. The kernel sends
a SIGSEGV to the task that resulted in this trap with the following
info::

    siginfo.si_signo = SIGSEGV;
    siginfo.si_errno = 0;
    siginfo.si_code = SEGV_ADIPERR;
    siginfo.si_addr = addr; /* address that caused trap */
    siginfo.si_trapno = 0;

NOTE:
    ADI tag mismatch on a load always results in a precise trap.

MCD disabled
------------

When a task has not enabled ADI and attempts to set the ADI version
on a memory address, the processor sends an MCD disabled trap. This
trap is handled by the hypervisor first and the hypervisor vectors this
trap through to the kernel as a Data Access Exception trap with
fault type set to 0xa (invalid ASI). When this occurs, the kernel
sends the task a SIGSEGV signal with the following info::

    siginfo.si_signo = SIGSEGV;
    siginfo.si_errno = 0;
    siginfo.si_code = SEGV_ACCADI;
    siginfo.si_addr = addr; /* address that caused trap */
    siginfo.si_trapno = 0;

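A SIGSEGV handler installed with sigaction() and SA_SIGINFO can tell the
three cases apart by si_code. The sketch below only classifies the code;
enum adi_fault, adi_classify_segv() and adi_si_addr_is_pc() are
hypothetical names. The numeric fallbacks 5, 6 and 7 are assumed to match
the kernel's generic siginfo definitions and are used only when the libc
headers do not already provide the constants.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>

#ifndef SEGV_ACCADI
#define SEGV_ACCADI  5  /* assumed value: ADI not enabled for the mapping */
#endif
#ifndef SEGV_ADIDERR
#define SEGV_ADIDERR 6  /* assumed value: disrupting MCD error */
#endif
#ifndef SEGV_ADIPERR
#define SEGV_ADIPERR 7  /* assumed value: precise MCD exception */
#endif

enum adi_fault {
    ADI_FAULT_NONE,         /* SIGSEGV not caused by ADI */
    ADI_FAULT_DISABLED,     /* MCD disabled trap */
    ADI_FAULT_DISRUPTING,   /* disrupting memory corruption */
    ADI_FAULT_PRECISE       /* precise memory corruption */
};

/* Map siginfo.si_code of a SIGSEGV to one of the traps described above. */
static enum adi_fault adi_classify_segv(int si_code)
{
    switch (si_code) {
    case SEGV_ACCADI:
        return ADI_FAULT_DISABLED;
    case SEGV_ADIDERR:
        return ADI_FAULT_DISRUPTING;
    case SEGV_ADIPERR:
        return ADI_FAULT_PRECISE;
    default:
        return ADI_FAULT_NONE;
    }
}

/*
 * For the disrupting trap si_addr holds the PC where the first mismatch
 * occurred; for the other two it holds the address that caused the trap.
 */
static bool adi_si_addr_is_pc(enum adi_fault fault)
{
    return fault == ADI_FAULT_DISRUPTING;
}
```

A handler would typically call adi_classify_segv(info->si_code) and then
interpret info->si_addr accordingly.
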

Sample program to use ADI
-------------------------

The following sample program is meant to illustrate how to use the ADI
functionality::

    #include <unistd.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <elf.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>
    #include <sys/mman.h>
    #include <asm/asi.h>

    #ifndef AT_ADI_BLKSZ
    #define AT_ADI_BLKSZ    48
    #endif
    #ifndef AT_ADI_NBITS
    #define AT_ADI_NBITS    49
    #endif

    #ifndef PROT_ADI
    #define PROT_ADI        0x10
    #endif

    #define BUFFER_SIZE     (32*1024*1024UL)

    int main(int argc, char *argv[], char *envp[])
    {
        unsigned long i, adi_blksz, adi_nbits;
        char *shmaddr, *tmp_addr, *end, *veraddr;
        int shmid, version;
        Elf64_auxv_t *auxv;

        adi_blksz = 0;
        adi_nbits = 0;

        /* Skip past envp[]; the auxiliary vector follows its NULL */
        while (*envp++ != NULL)
            ;
        for (auxv = (Elf64_auxv_t *)envp; auxv->a_type != AT_NULL; auxv++) {
            switch (auxv->a_type) {
            case AT_ADI_BLKSZ:
                adi_blksz = auxv->a_un.a_val;
                break;
            case AT_ADI_NBITS:
                adi_nbits = auxv->a_un.a_val;
                break;
            }
        }
        if (adi_blksz == 0) {
            fprintf(stderr, "Oops! ADI is not supported\n");
            exit(1);
        }

        printf("ADI capabilities:\n");
        printf("\tBlock size = %lu\n", adi_blksz);
        printf("\tNumber of bits = %lu\n", adi_nbits);

        if ((shmid = shmget(2, BUFFER_SIZE,
                            IPC_CREAT | SHM_R | SHM_W)) < 0) {
            perror("shmget failed");
            exit(1);
        }

        shmaddr = shmat(shmid, NULL, 0);
        if (shmaddr == (char *)-1) {
            perror("shm attach failed");
            shmctl(shmid, IPC_RMID, NULL);
            exit(1);
        }

        /* Enable ADI on the whole segment */
        if (mprotect(shmaddr, BUFFER_SIZE, PROT_READ|PROT_WRITE|PROT_ADI)) {
            perror("mprotect failed");
            goto err_out;
        }

        /* Set the ADI version tag on the shm segment, one ADI block
         * per stxa (0x90 is ASI_MCD_PRIMARY)
         */
        version = 10;
        tmp_addr = shmaddr;
        end = shmaddr + BUFFER_SIZE;
        while (tmp_addr < end) {
            asm volatile(
                "stxa %1, [%0]0x90\n\t"
                :
                : "r" (tmp_addr), "r" (version)
                : "memory");
            tmp_addr += adi_blksz;
        }
        asm volatile("membar #Sync\n\t" : : : "memory");

        /* Create a versioned address from the normal address by placing
         * version tag in the upper adi_nbits bits
         */
        tmp_addr = (void *) ((unsigned long)shmaddr << adi_nbits);
        tmp_addr = (void *) ((unsigned long)tmp_addr >> adi_nbits);
        veraddr = (void *) (((unsigned long)version << (64-adi_nbits))
                            | (unsigned long)tmp_addr);

        printf("Starting the writes:\n");
        for (i = 0; i < BUFFER_SIZE; i++) {
            veraddr[i] = (char)(i);
            if (!(i % (1024 * 1024)))
                printf(".");
        }
        printf("\n");

        printf("Verifying data...");
        fflush(stdout);
        for (i = 0; i < BUFFER_SIZE; i++)
            if (veraddr[i] != (char)i)
                printf("\nIndex %lu mismatched\n", i);
        printf("Done.\n");

        /* Disable ADI and clean up
         */
        if (mprotect(shmaddr, BUFFER_SIZE, PROT_READ|PROT_WRITE)) {
            perror("mprotect failed");
            goto err_out;
        }

        if (shmdt((const void *)shmaddr) != 0)
            perror("Detach failure");
        shmctl(shmid, IPC_RMID, NULL);

        exit(0);

    err_out:
        if (shmdt((const void *)shmaddr) != 0)
            perror("Detach failure");
        shmctl(shmid, IPC_RMID, NULL);
        exit(1);
    }