From: Florian Krohm Date: Thu, 6 Mar 2025 17:42:05 +0000 (+0000) Subject: s390x: Add disassembly checker (Bug 498037) X-Git-Tag: VALGRIND_3_25_0~121 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=b98855efc89a402357469fbde06912a3bc0b0bfd;p=thirdparty%2Fvalgrind.git s390x: Add disassembly checker (Bug 498037) Add program disasm-test to check that s390_disasm generates the same disassembly for a given insn as objdump -d does. The focus is on insns that have extended mnemonics, most of which are vector insns. The checker resides in none/tests/s390x/disasm-test with comprehensive documentation in the README file there. It is integrated into the regression testing framework but currently disabled, because s390_disasm has not been fixed yet. Fixes https://bugs.kde.org/show_bug.cgi?id=498037 --- diff --git a/.gitignore b/.gitignore index de2306df7..c394d2717 100644 --- a/.gitignore +++ b/.gitignore @@ -2231,6 +2231,17 @@ /none/tests/riscv64/integer /none/tests/riscv64/muldiv +# /none/tests/s390x/disasm-test/ +/none/tests/s390x/disasm-test/*.dSYM +/none/tests/s390x/disasm-test/*.stderr.diff* +/none/tests/s390x/disasm-test/*.stderr.out +/none/tests/s390x/disasm-test/*.stdout.diff* +/none/tests/s390x/disasm-test/*.stdout.out +/none/tests/s390x/disasm-test/.deps +/none/tests/s390x/disasm-test/Makefile +/none/tests/s390x/disasm-test/Makefile.in +/none/tests/s390x/disasm-test/disasm-test + # /none/tests/scripts/ /none/tests/scripts/*.dSYM /none/tests/scripts/*.so diff --git a/Makefile.am b/Makefile.am index db8cfa382..14309dacf 100644 --- a/Makefile.am +++ b/Makefile.am @@ -33,6 +33,7 @@ SUBDIRS = \ perf \ gdbserver_tests \ memcheck/tests/vbit-test \ + none/tests/s390x/disasm-test \ auxprogs \ mpi \ solaris \ diff --git a/NEWS b/NEWS index ad58e7215..fd470fee7 100644 --- a/NEWS +++ b/NEWS @@ -48,6 +48,7 @@ are not entered into bugzilla tend to get forgotten about or ignored. 
497455 Update drd/scripts/download-and-build-gcc 497723 Enabling Ada demangling breaks callgrind differentiation between overloaded functions and procedures +498037 s390x: Add disassembly checker 498143 False positive on EVIOCGRAB ioctl 498317 FdBadUse is not a valid CoreError type in a suppression even though it's generated by --gen-suppressions=yes diff --git a/configure.ac b/configure.ac index e6ae0501b..5d3c6d02d 100755 --- a/configure.ac +++ b/configure.ac @@ -5722,6 +5722,7 @@ AC_CONFIG_FILES([ none/tests/arm/Makefile none/tests/arm64/Makefile none/tests/s390x/Makefile + none/tests/s390x/disasm-test/Makefile none/tests/mips32/Makefile none/tests/mips64/Makefile none/tests/nanomips/Makefile diff --git a/none/tests/s390x/disasm-test/Makefile.am b/none/tests/s390x/disasm-test/Makefile.am new file mode 100644 index 000000000..775848921 --- /dev/null +++ b/none/tests/s390x/disasm-test/Makefile.am @@ -0,0 +1,35 @@ +include $(top_srcdir)/Makefile.all.am + +EXTRA_DIST = disasm-test.vgtest disasm-test.stderr.exp disasm-test.stdout.exp + +dist_noinst_SCRIPTS = filter_stderr + +#---------------------------------------------------------------------------- +# Headers +#---------------------------------------------------------------------------- + +pkginclude_HEADERS = +noinst_HEADERS = main.h objdump.h vex.h + +#---------------------------------------------------------------------------- +# disasm_test +#---------------------------------------------------------------------------- + +noinst_PROGRAMS = disasm-test + +SOURCES = \ + main.c \ + generate.c \ + objdump.c \ + opcode.c \ + verify.c \ + vex.c + +disasm_test_SOURCES = $(SOURCES) +disasm_test_CPPFLAGS = $(AM_CPPFLAGS_PRI) \ + -I$(top_srcdir)/VEX/pub \ + -I$(top_srcdir)/VEX/priv +disasm_test_CFLAGS = $(AM_CFLAGS_PRI) +disasm_test_DEPENDENCIES = +disasm_test_LDADD = $(top_builddir)/VEX/libvex-@VGCONF_ARCH_PRI@-@VGCONF_OS@.a +disasm_test_LDFLAGS = $(AM_CFLAGS_PRI) @LIB_UBSAN@ diff --git 
a/none/tests/s390x/disasm-test/README b/none/tests/s390x/disasm-test/README new file mode 100644 index 000000000..54df05cf9 --- /dev/null +++ b/none/tests/s390x/disasm-test/README @@ -0,0 +1,210 @@ +disasm-test + +The purpose of this program is to ensure that the disassembly as +generated by s390_disasm matches what objdump -d produces for any +given insn. As such the program runs as part of "make regtest". + + +How it works +------------ +1) Given an opcode, generate a C file with testcases. +2) Compile the C file. +3) Run objdump -d on the object file and capture the result in a file. + This file will be referred to as "the objdump file". +4) Read the objdump file, extract the insn bytes and disassembly. +5) Feed the so-extracted insn bytes into VEX's decode-and-irgen + machinery with disassembly (= tracing frontend) being enabled. +6) Intercept the so-disassembled text and compare it with the + disassembly in the objdump file. + + +Invocation +---------- +See disasm-test --help for the most up-to-date instructions. + +disasm-test --all + For all opcodes known to disasm-test, generate testcases and + verify the disassembly. This is how disasm-test is invoked + during regression testing. + +disasm-test --generate OPCODE_NAMEs + For each specified opcode, generate a C file with testcases, + compile it and create the objdump file. + Useful when adding new opcodes. + +disasm-test --verify OBJDUMP_FILEs + For each specified objdump file, verify that the disassembly via VEX + matches. Useful when adding new opcodes. + +disasm-test --run OPCODE_NAMEs + Combines --generate and --verify. Useful when adding new opcodes. + + +Other non-debugging options +--------------------------- +--verbose + Reports activity and miscellaneous information deemed interesting. + +--summary + Write out test generation summary. This option is only observed in + combination with --all. + +--gcc=/path/to/gcc + Specify which GCC to use. Default is: gcc on $PATH. 
+ +--gcc-flags=FLAGS + Specify which flags to pass to GCC. Default is: "-c -march=arch14". + +--objdump=/path/to/objdump + Specify which objdump to use. Default is: objdump on $PATH. + +--keep-temp + Keep generated files: .c file with testcases, object file, objdump + file, and .vex file + +--show-spec-exc + Show generated insns that cause specification exceptions. + Because insns causing specification exceptions are typically accepted + by GCC and objdump, an objdump file may contain them. But comparing + them is pointless. + +--no-show-miscompares + Do not show miscomparing insns. + + +Debugging options +----------------- +--debug + Additional debugging output. + +--d12=INT + Use INT as the value for d12 fields. The given value is checked + for feasibility. + +--d20=INT + Use INT as the value for d20 fields. The given value is checked + for feasibility. + +--sint=INT + Use INT as the only value for signed integer fields. The given value + is NOT checked for feasibility (as the field width is not known). + The value is expected to fit in 32 bits (and is complained about + if it doesn't), as there are no opcodes with immediate operands + that have more than 32 bits. + +--uint=INT + Use INT as the only value for unsigned integer fields. The given value + is NOT checked for feasibility (as the field width is not known). + The value is expected to fit in 32 bits (and is complained about + if it doesn't), as there are no opcodes with immediate operands + that have more than 32 bits. + + +Testcase generation +------------------- +Testcase generation is not exhaustive. That would produce too large a +number of testcases. For example, testing CRB R1,R2,M3,D4(B4) +exhaustively would produce 16x16x16x12x16 = 786432 testcases. Instead, +we attempt to create "interesting" testcases. 
Like so: + +- for a VR operand pick a VR at random +- for a GPR operand that is not an index or base register pick a GPR + at random +- for a GPR that is base or index register, pick r0 and another GPR + at random +- for a 12-bit displacement pick 0, 1, 2, 4095 +- for a 20-bit displacement pick -524288, -2, -1, 0, 1, 2, 524287 +- for a signed integer field pick the boundary values of its domain and + -2,-1,0,1,2 +- for an unsigned integer field pick the boundary values of its domain + and 1, 2 +- for a mask field, pick all allowed values + +Why are we picking these values? Boundary cases are *always* +interesting, as is 0. '1' is picked because it is odd and '2' is picked +because it is even. + +Note: if an opcode has more than one GPR operand choose different +registers. We do this because we want to catch bugs due to mixed up +registers. +E.g. If the testcase is "brxh r1,r2,8" and s390_disasm produces +"brxh r2,r1,8" we want to catch that. Obviously, to do so the registers +need to be different. The same applies to VRs. + + +Adding a new opcode +------------------- +See extensive documentation at the top of file opcode.c +Any opcode can be added. It is not necessary for the opcode to have +extended mnemonics. + + +Integration into regression testing +----------------------------------- +1) Observe the exit code + + disasm-test --all --no-show-miscompares + + There will be no output to stdout and stderr. If there are no + miscompares in the disassembly the exit code will be 0. Otherwise, + it will be 1. + +2) Observe stderr + + disasm-test --all + + Miscomparing disassembly will be written to stderr. Exit code as + described above. + +3) Observe testcase summary + + disasm-test --all --no-show-miscompares --summary + + Will write information about #testcases as well as failing ones + to stdout. Exit code as described above. + + +Status +------ +Only opcodes with extended mnemonics as identified in Appendix J of the +Principles of Operation are considered. 
+ +There is only partial support for opcodes with optional operands. +In the sense that the generated testcases will always include the +optional operand. + + +TODO +---- +(1) Testcase generation for the "Rotate and ..." family of opcodes needs + to be improved. Several interesting cases involving the T- and Z-bit + are not considered. + +(2) Due to bugs and missing functionality the following miscompares are + observed at this point: + - all vector opcodes (missing functionality) + - c[l][g]rt (bug in s390_disasm) + - bc (bug in objdump 2.38) + +(3) Generated testcases may cause specification exceptions. This + happens iff a constraint involves more than one opcode field. + E.g.: for the VREP opcode the M4 value determines which I2 values + are allowed. This constraint cannot be expressed. However, the + disassembly for such insns is not compared. Use --show-spec-exc + to show those insns. + +(4) In s390_decode_and_irgen the code peeks past the current insn: + + /* If next instruction is execute, stop here */ + if (dis_res->whatNext == Dis_Continue && + (bytes[insn_length] == 0x44 || + (bytes[insn_length] == 0xc6 && (bytes[insn_length + 1] & 0xf) == 0))) { + + Because of this we need to make our insn buffer 7 bytes instead + of 6 and set insn_buf[6] = 0x0. This works because 0x0 != 0x44 + and 0x0 != 0xc6. + Perhaps disable the peek-ahead when inside disasm-test by means + of some global variable? Not pretty either, but explicit. + +(5) For D20XB and D12XB operands add a test with B == X and B != 0 + Not exactly easy to do. 
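As a hedged illustration of the README's "Testcase generation" section (not part of this commit): the sketch below computes the values picked for signed and unsigned integer fields of a given width. The helper names interesting_sint and interesting_uint are hypothetical; generate.c performs the same selection inline in iterate(). Field widths of 1 to 32 bits are assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of disasm-test: store the "interesting"
   values for a signed field of NUM_BITS bits in OUT and return how many
   values were written (always 7). Assumes 1 <= NUM_BITS <= 32. */
static size_t
interesting_sint(unsigned num_bits, long long out[7])
{
   assert(num_bits >= 1 && num_bits <= 32);

   const long long max = (1LL << (num_bits - 1)) - 1;   /* upper bound */
   const long long min = -(1LL << (num_bits - 1));      /* lower bound */
   const long long values[] = { 0, 1, 2, -1, -2, max, min };
   const size_t n = sizeof values / sizeof *values;

   for (size_t i = 0; i < n; ++i)
      out[i] = values[i];
   return n;
}

/* Hypothetical helper: same idea for an unsigned field (always 4 values).
   Assumes 1 <= NUM_BITS <= 32. */
static size_t
interesting_uint(unsigned num_bits, long long out[4])
{
   assert(num_bits >= 1 && num_bits <= 32);

   const long long values[] = { 0, 1, 2, (1LL << num_bits) - 1 };
   const size_t n = sizeof values / sizeof *values;

   for (size_t i = 0; i < n; ++i)
      out[i] = values[i];
   return n;
}
```

For a 20-bit signed field this yields 0, 1, 2, -1, -2, 524287, -524288, which matches the d20 displacement values listed in the README; for a 12-bit unsigned field the largest value is 4095.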
diff --git a/none/tests/s390x/disasm-test/disasm-test.stderr.exp b/none/tests/s390x/disasm-test/disasm-test.stderr.exp new file mode 100644 index 000000000..e6a4c4821 --- /dev/null +++ b/none/tests/s390x/disasm-test/disasm-test.stderr.exp @@ -0,0 +1,3 @@ + +One of --verify, --generate, --run, --all, or --unit-test is required + diff --git a/none/tests/s390x/disasm-test/disasm-test.stdout.exp b/none/tests/s390x/disasm-test/disasm-test.stdout.exp new file mode 100644 index 000000000..e69de29bb diff --git a/none/tests/s390x/disasm-test/disasm-test.vgtest b/none/tests/s390x/disasm-test/disasm-test.vgtest new file mode 100644 index 000000000..d7a1a409c --- /dev/null +++ b/none/tests/s390x/disasm-test/disasm-test.vgtest @@ -0,0 +1,6 @@ +prog: disasm-test +# args: --all # enable this eventually +# +# NOTE: there are extra newlines in the output which are *not* +# present when disasm-test is invoked by hand. +# Not sure where they are coming from. diff --git a/none/tests/s390x/disasm-test/filter_stderr b/none/tests/s390x/disasm-test/filter_stderr new file mode 100755 index 000000000..f4c67057e --- /dev/null +++ b/none/tests/s390x/disasm-test/filter_stderr @@ -0,0 +1,3 @@ +#!/bin/sh + +../../filter_stderr "$@" diff --git a/none/tests/s390x/disasm-test/generate.c b/none/tests/s390x/disasm-test/generate.c new file mode 100644 index 000000000..7e69085e3 --- /dev/null +++ b/none/tests/s390x/disasm-test/generate.c @@ -0,0 +1,596 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see . + + The GNU General Public License is contained in the file COPYING. +*/ + +#include <stdio.h> // fprintf +#include <stdlib.h> // system +#include <string.h> // strlen +#include <stdarg.h> // va_list +#include <ctype.h> // isdigit +#include <assert.h> // assert +#include "main.h" // error +#include "objdump.h" // MARK + +// FIXME: if more than one VR or GPR (non-base, non-index) are used in +// an opcode use different register! So we can recognise a mixup in +// register order. E.g. vctz %v2,%v2,3 will not allow to detect whether +// the two registers was mixed up. +static unsigned num_tests; // # generated tests + +static void run_cmd(const char *, ...); + + +static const char * +gpr_operand(unsigned regno) +{ + static const char *gprs[] = { + "%r0", "%r1", "%r2", "%r3", "%r4", "%r5", "%r6", "%r7", + "%r8", "%r9", "%r10", "%r11", "%r12", "%r13", "%r14", "%r15" + }; + + return gprs[regno]; +} + + +static const char * +vr_operand(unsigned regno) +{ + static const char *vrs[] = { + "%v0", "%v1", "%v2", "%v3", "%v4", "%v5", "%v6", "%v7", + "%v8", "%v9", "%v10", "%v11", "%v12", "%v13", "%v14", "%v15", + "%v16", "%v17", "%v18", "%v19", "%v20", "%v21", "%v22", "%v23", + "%v24", "%v25", "%v26", "%v27", "%v28", "%v29", "%v30", "%v31" + }; + + return vrs[regno]; +} + + +static unsigned +random_gpr(int allow_r0) +{ + unsigned regno; + + if (allow_r0) { + regno = rand() % 16; + } else { + do { + regno = rand() % 16; + } while (regno == 0); + } + + return regno; +} + + +static unsigned +random_vr(void) +{ + return rand() % 32; +} + + +#if 0 +/* These functions are currently unused. 
But may become useful in + alternate test generation strategies that use random values instead + of hardwired interesting ones. */ + +static unsigned +random_uint(unsigned num_bits) +{ + assert(num_bits <= 32); + + long long val = rand(); + return val % (1LL << num_bits); +} + + +static int +random_sint(unsigned num_bits) +{ + assert(num_bits <= 32); + + static int sign = 1; + + long long val = rand(); // positive value + int value = val % (1LL << (num_bits - 1)); + + /* alternate */ + if (sign == -1) + value = -value; + sign = -sign; + return value; +} + + +static unsigned +d12_value(void) +{ + return d12_val_specified ? d12_val : random_uint(12); +} + + +static int +d20_value(void) +{ + return d20_val_specified ? d20_val : random_sint(20); +} + + +static unsigned +uint_value(unsigned num_bits) +{ + if (num_bits > 32) + fatal("integer operand > 32 bits not supported\n"); + return uint_val_specified ? uint_val : random_uint(num_bits); +} + + +static int +sint_value(unsigned num_bits) +{ + if (num_bits > 32) + fatal("integer operand > 32 bits not supported\n"); + return sint_val_specified ? sint_val : random_sint(num_bits); +} +#endif + +/* MASK is a bitvector. For a GPR rk the k'th bit will be set. The + function returns a register number which has not been used and + adjusts the bitvector. 
*/ +static unsigned +unique_gpr(unsigned regno, unsigned *mask) +{ + assert(regno < 16); + assert(*mask != ~0U); // Paranoia: avoid infinite loop + + unsigned bit = 1 << regno; + while (*mask & bit) { + regno = random_gpr(/* allow_r0 */1); + bit = 1 << regno; + } + *mask |= bit; + return regno; +} + + +/* Like unique_gpr */ +static unsigned +unique_vr(unsigned regno, unsigned *mask) +{ + assert(regno < 32); + assert(*mask != ~0U); // Paranoia: avoid infinite loop + + unsigned bit = 1 << regno; + while (*mask & bit) { + regno = random_vr(); + bit = 1 << regno; + } + *mask |= bit; + return regno; +} + + +/* Field */ +typedef struct { + const opnd *operand; // the operand to which this field belongs + int is_displacement; // only relevant for OPND_D12/20[X]B operands + int is_length; // only relevant for OPND_D12LB operands + int is_vr; // only relevant for OPND_D12VB operands + long long assigned_value; +} field; + + +/* Write out a single ASM statement for OPC. */ +static void +write_asm_stmt(FILE *fp, const opcode *opc, const field fields[]) +{ + fprintf(fp, " asm volatile(\"%s ", opc->name); + + unsigned gpr_mask, vr_mask, regno; + int inc; + int needs_comma = 0; + + gpr_mask = vr_mask = 0; + for (int i = 0; i < opc->num_fields; i += inc) { + const opnd *operand = fields[i].operand; + + inc = 1; // for most operand kinds + + if (needs_comma++) + fputc(',', fp); + switch (operand->kind) { + case OPND_GPR: + regno = unique_gpr(fields[i].assigned_value, &gpr_mask); + fprintf(fp, "%s", gpr_operand(regno)); + break; + case OPND_VR: + regno = unique_vr(fields[i].assigned_value, &vr_mask); + fprintf(fp, "%s", vr_operand(regno)); + break; + case OPND_D12XB: + case OPND_D20XB: { + long long d = fields[i].assigned_value; + const char *x = gpr_operand(fields[i + 1].assigned_value); + const char *b = gpr_operand(fields[i + 2].assigned_value); + fprintf(fp, "%lld(%s,%s)", d, x, b); + inc = 3; + break; + } + case OPND_D12VB: { + long long d = fields[i].assigned_value; + const 
char *v = vr_operand(fields[i + 1].assigned_value); + const char *b = gpr_operand(fields[i + 2].assigned_value); + fprintf(fp, "%lld(%s,%s)", d, v, b); + inc = 3; + break; + } + case OPND_D12LB: { + long long d = fields[i].assigned_value; + unsigned l = fields[i + 1].assigned_value; + const char *b = gpr_operand(fields[i + 2].assigned_value); + fprintf(fp, "%lld(%u,%s)", d, l + 1, b); + inc = 3; + break; + } + case OPND_D12B: + case OPND_D20B: { + long long d = fields[i].assigned_value; + const char *b = gpr_operand(fields[i + 1].assigned_value); + fprintf(fp, "%lld(%s)", d, b); + inc = 2; + break; + } + case OPND_MASK: + case OPND_SINT: + case OPND_UINT: + fprintf(fp, "%lld", fields[i].assigned_value); + break; + case OPND_PCREL: + fprintf(fp, "%lld*2", fields[i].assigned_value); + break; + default: + assert(0); + } + } + fprintf(fp, "\");\n"); + + ++num_tests; +} + + +/* IX identifies the element of the FIELDS array to which a value + will be assigned in this iteration. */ +static void +iterate(FILE *fp, const opcode *opc, field fields[], unsigned ix) +{ + /* All fields are assigned. 
Write out the asm stmt */ + if (ix == opc->num_fields) { + write_asm_stmt(fp, opc, fields); + return; + } + + field *f = fields + ix; + const opnd *operand = f->operand; + + switch (operand->kind) { + case OPND_GPR: + if (operand->name[0] == 'b' || operand->name[0] == 'x') { + /* Choose r0 */ + f->assigned_value = 0; + iterate(fp, opc, fields, ix + 1); + /* Choose any GPR other than r0 */ + f->assigned_value = random_gpr(/* r0_allowed */ 0); + iterate(fp, opc, fields, ix + 1); + } else { + /* Choose any GPR */ + f->assigned_value = random_gpr(/* r0_allowed */ 1); + iterate(fp, opc, fields, ix + 1); + } + break; + + case OPND_VR: + f->assigned_value = random_vr(); // Choose any VR + iterate(fp, opc, fields, ix + 1); + break; + + case OPND_D12B: + case OPND_D12XB: + case OPND_D12LB: + case OPND_D12VB: + if (f->is_displacement) { + if (d12_val_specified) { + f->assigned_value = d12_val; + iterate(fp, opc, fields, ix + 1); + } else { + /* Choose these interesting values */ + static const long long values[] = { 0, 1, 2, 0xfff }; + + for (int i = 0; i < sizeof values / sizeof *values; ++i) { + f->assigned_value = values[i]; + iterate(fp, opc, fields, ix + 1); + } + } + } else if (f->is_length) { + /* Choose these interesting values */ + static const long long values[] = { 0, 1, 2, 255 }; + + for (int i = 0; i < sizeof values / sizeof *values; ++i) { + f->assigned_value = values[i]; + iterate(fp, opc, fields, ix + 1); + } + } else if (f->is_vr) { + /* v0 is not special AFAICT */ + f->assigned_value = random_vr(); + iterate(fp, opc, fields, ix + 1); + } else { + /* Base or index register */ + f->assigned_value = 0; + iterate(fp, opc, fields, ix + 1); + f->assigned_value = random_gpr(/* r0_allowed */ 0); + iterate(fp, opc, fields, ix + 1); + } + break; + + case OPND_D20B: + case OPND_D20XB: + if (f->is_displacement) { + if (d20_val_specified) { + f->assigned_value = d20_val; + iterate(fp, opc, fields, ix + 1); + } else { + /* Choose these interesting values */ + static 
const long long values[] = { + 0, 1, 2, -1, -2, 0x7ffff, -0x80000 + }; + + for (int i = 0; i < sizeof values / sizeof *values; ++i) { + f->assigned_value = values[i]; + iterate(fp, opc, fields, ix + 1); + } + } + } else { + /* base or index register */ + f->assigned_value = 0; + iterate(fp, opc, fields, ix + 1); + f->assigned_value = random_gpr(/* r0_allowed */ 0); + iterate(fp, opc, fields, ix + 1); + } + break; + + case OPND_SINT: + case OPND_PCREL: + if (sint_val_specified) { + f->assigned_value = sint_val; + iterate(fp, opc, fields, ix + 1); + } else { + if (operand->allowed_values == NULL) { + /* No constraint: Choose these interesting values */ + const long long values[] = { + 0, 1, 2, -1, -2, (1LL << (operand->num_bits - 1)) - 1, + -(1LL << (operand->num_bits - 1)) + }; + + for (int i = 0; i < sizeof values / sizeof *values; ++i) { + f->assigned_value = values[i]; + iterate(fp, opc, fields, ix + 1); + } + } else { + /* Constraint. Choose only allowed values */ + unsigned num_val = operand->allowed_values[0]; + for (int i = 1; i <= num_val; ++i) { + f->assigned_value = operand->allowed_values[i]; + iterate(fp, opc, fields, ix + 1); + } + } + } + break; + + case OPND_UINT: + if (uint_val_specified) { + f->assigned_value = uint_val; + iterate(fp, opc, fields, ix + 1); + } else { + if (operand->allowed_values == NULL) { + /* No constraint: Choose these interesting values */ + const long long values[] = { + 0, 1, 2, (1LL << operand->num_bits) - 1 + }; + + for (int i = 0; i < sizeof values / sizeof *values; ++i) { + f->assigned_value = values[i]; + iterate(fp, opc, fields, ix + 1); + } + } else { + /* Constraint. Choose only allowed values */ + unsigned num_val = operand->allowed_values[0]; + for (int i = 1; i <= num_val; ++i) { + f->assigned_value = operand->allowed_values[i]; + iterate(fp, opc, fields, ix + 1); + } + } + } + break; + + case OPND_MASK: + if (operand->allowed_values == NULL) { + /* No constraint. 
Choose all possible values */ + unsigned maxval = (1u << operand->num_bits) - 1; + for (int val = 0; val <= maxval; ++val) { + f->assigned_value = val; + iterate(fp, opc, fields, ix + 1); + } + } else { + /* Constraint. Choose only allowed values */ + unsigned num_val = operand->allowed_values[0]; + for (int i = 1; i <= num_val; ++i) { + f->assigned_value = operand->allowed_values[i]; + iterate(fp, opc, fields, ix + 1); + } + } + break; + + case OPND_INVALID: + default: + assert(0); + } +} + + +static void +generate(FILE *fp, const opcode *opc) +{ + /* Array of opcode fields to which we need to assign values. */ + field fields[opc->num_fields]; + field *f; + + int ix = 0; + for (int i = 0; i < opc->num_opnds; ++i) { + const opnd *operand = opc->opnds + i; + + switch (operand->kind) { + case OPND_GPR: + case OPND_VR: + case OPND_SINT: + case OPND_UINT: + case OPND_PCREL: + case OPND_MASK: + f = fields + ix++; + f->operand = operand; + break; + + case OPND_D12XB: + case OPND_D20XB: + for (int j = 1; j <= 3; ++j) { + f = fields + ix++; + f->operand = operand; + f->is_displacement = j == 1; + f->is_length = 0; + f->is_vr = 0; + } + break; + + case OPND_D12B: + case OPND_D20B: + for (int j = 1; j <= 2; ++j) { + f = fields + ix++; + f->operand = operand; + f->is_displacement = j == 1; + f->is_length = 0; + f->is_vr = 0; + } + break; + + case OPND_D12LB: + for (int j = 1; j <= 3; ++j) { + f = fields + ix++; + f->operand = operand; + f->is_displacement = j == 1; + f->is_length = j == 2; + f->is_vr = 0; + } + break; + + case OPND_D12VB: + for (int j = 1; j <= 3; ++j) { + f = fields + ix++; + f->operand = operand; + f->is_displacement = j == 1; + f->is_length = 0; + f->is_vr = j == 2; + } + break; + + case OPND_INVALID: + default: + assert(0); + } + } + assert(ix == opc->num_fields); + + iterate(fp, opc, fields, 0); +} + + +unsigned +generate_tests(const opcode *opc) +{ + srand(42); + + if (verbose) + printf("...generating testcases for '%s'\n", opc->name); + + num_tests = 
0; + + char file[strlen(opc->name) + 3]; + sprintf(file, "%s.c", opc->name); + + FILE *fp = fopen(file, "w"); + if (fp == NULL) { + error("%s: fopen failed\n", file); + return 0; + } + + fprintf(fp, "void\n"); + fprintf(fp, "main(void)\n"); + fprintf(fp, "{\n"); + fprintf(fp, " asm volatile(\"%s\");\n", MARK); + generate(fp, opc); + fprintf(fp, " asm volatile(\"%s\");\n", MARK); + fprintf(fp, "}\n"); + fclose(fp); + + if (verbose) + printf("...%u testcases generated for '%s'\n", num_tests, + opc->name); + + run_cmd("%s %s %s.c", gcc, gcc_flags, opc->name); + run_cmd("%s --disassemble=%s %s.o > %s.dump", objdump, FUNCTION, + opc->name, opc->name); + + return num_tests; +} + + +static void +run_cmd(const char *fmt, ...) +{ + va_list args; + va_start(args, fmt); + int need = vsnprintf((char []){ 0 }, 1, fmt, args); + va_end(args); + + char cmd[need + 1]; + va_list args2; + va_start(args2, fmt); + vsnprintf(cmd, sizeof cmd, fmt, args2); + va_end(args2); + + if (debug) + printf("Running command: %s\n", cmd); + + int rc = system(cmd); + + if (rc != 0) + error("Command '%s' failed\n", cmd); +} diff --git a/none/tests/s390x/disasm-test/main.c b/none/tests/s390x/disasm-test/main.c new file mode 100644 index 000000000..75d007b93 --- /dev/null +++ b/none/tests/s390x/disasm-test/main.c @@ -0,0 +1,371 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, see . + + The GNU General Public License is contained in the file COPYING. +*/ + +#include <stddef.h> // NULL +#include <stdlib.h> // exit, malloc +#include <stdio.h> // vfprintf +#include <stdarg.h> // va_list +#include <string.h> // strchr +#include <assert.h> // assert +#include <unistd.h> // unlink +#include "vex.h" // vex_init +#include "main.h" + +int verbose, debug, show_spec_exc, show_miscompares; +int d12_val, d20_val; +long long uint_val, sint_val; +int d12_val_specified, d20_val_specified; +int uint_val_specified, sint_val_specified; + +const char *gcc = "gcc"; // path to GCC +const char *objdump = "objdump"; // path to objdump +const char *gcc_flags = "-c -march=arch14"; + +#define CHECK_CLO(x, s) (strncmp(x, s, sizeof s - 1) == 0) + +static const char usage[] = + "Usage:\n\n" + "disasm-test --generate OPCODES\n" + " Generate testcases for the given opcodes and prepare objdump files.\n\n" + "disasm-test --verify FILES\n" + " Read specified objdump files and compare with VEX disassembly.\n\n" + "disasm-test --run OPCODES\n" + " Generate testcases for the given opcodes and compare the disassembly.\n\n" + "disasm-test --all\n" + " For all opcodes generate testcases and compare the disassembly.\n\n" + "disasm-test --unit-test\n" + " Run unit tests. 
All other command line options are ignored.\n\n" + "Additional options:\n" + " --verbose\n" + " --debug\n" + " --gcc=/path/to/gcc\n" + " --gcc-flags=FLAGS\n" + " --objdump=/path/to/objdump\n" + " --d12=INT - Use INT as value for d12 offsets\n" + " --d20=INT - Use INT as value for d20 offsets\n" + " --sint=INT - Use INT as value for signed integer fields\n" + " --uint=INT - Use INT as value for unsigned integer fields\n" + " --keep-temp - Do not remove temporary files\n" + " --summary - Output test generation summary (with --all)\n" + " --unit-test - Run unit tests\n" + " --show-spec-exc - Show insns causing specification exceptions\n" + " --no-show-miscompares - Do not show disassembly miscompares\n" + ; + +static long long get_clo_value(const char *, const char *, long long, + long long); +static void remove_temp_files(const char *); +static int opcode_has_errors(const opcode *); + +static int keep_temp = 0; +static int summary = 0; + + +/* Return code: 0 no disassembly mismatches + Return code: 1 at least one disassembly mismatch + + Specification exceptions do not influence the return code. 
*/ +int +main(int argc, char *argv[]) +{ + int all = 0, verify = 0, generate = 0, unit_test = 0; + int num_clargs = 0; + int run = 0; + const char *clargs[argc]; + + assert(sizeof(long long) == 8); + + /* Change to line buffering */ + setlinebuf(stdout); + setlinebuf(stderr); + + show_miscompares = 1; + + /* Collect options and arguments */ + for (int i = 1; i < argc; ++i) { + const char *clo = argv[i]; + + if (CHECK_CLO(clo, "--verify")) { + verify = 1; + } else if (CHECK_CLO(clo, "--generate")) { + generate = 1; + } else if (CHECK_CLO(clo, "--all")) { + all = 1; + } else if (CHECK_CLO(clo, "--verbose")) { + verbose = 1; + } else if (CHECK_CLO(clo, "--debug")) { + debug = 1; + } else if (CHECK_CLO(clo, "--summary")) { + summary = 1; + } else if (CHECK_CLO(clo, "--unit-test")) { + unit_test = 1; + } else if (CHECK_CLO(clo, "--show-spec-exc")) { + show_spec_exc = 1; + } else if (CHECK_CLO(clo, "--no-show-miscompares")) { + show_miscompares = 0; + } else if (CHECK_CLO(clo, "--keep-temp")) { + keep_temp = 1; + } else if (CHECK_CLO(clo, "--run")) { + run = 1; + } else if (CHECK_CLO(clo, "--help")) { + printf("%s\n", usage); + return 0; + } else if (CHECK_CLO(clo, "--gcc=")) { + gcc = strchr(clo, '=') + 1; + } else if (CHECK_CLO(clo, "--gcc-flags=")) { + gcc_flags = strchr(clo, '=') + 1; + } else if (CHECK_CLO(clo, "--objdump=")) { + objdump = strchr(clo, '=') + 1; + } else if (CHECK_CLO(clo, "--d12=")) { + d12_val = get_clo_value(clo, "d12", 0, 0xfff); + } else if (CHECK_CLO(clo, "--d20=")) { + d20_val = get_clo_value(clo, "d20", -0x80000, 0x7ffff); + } else if (CHECK_CLO(clo, "--sint=")) { + /* Integer field is restricted to 32-bit */ + long long max = 0x7fffffff; + sint_val = get_clo_value(clo, "sint", -max - 1, max); + } else if (CHECK_CLO(clo, "--uint=")) { + /* Integer field is restricted to 32-bit */ + uint_val = get_clo_value(clo, "uint", 0, 0xffffffffU); + } else { + if (strncmp(clo, "--", 2) == 0) + fatal("Invalid command line option '%s'\n", clo); + 
clargs[num_clargs++] = clo; + } + } + + /* Check consistency of command line options */ + if (verify + generate + run + all + unit_test == 0) + fatal("One of --verify, --generate, --run, --all, or --unit-test " + "is required\n"); + if (verify + generate + run + all + unit_test != 1) + fatal("At most one of --verify, --generate, --run, --all, or " + " --unit-test can be given\n"); + + vex_init(); + + if (generate) { + if (num_clargs == 0) + fatal("Missing opcode name[s]\n"); + + for (int i = 0; i < num_clargs; ++i) { + const char *name = clargs[i]; + + opcode *opc = get_opcode_by_name(name); + + if (opc == NULL) { + error("'%s' is not a recognised opcode\n", name); + } else if (opcode_has_errors(opc)) { + error("Opcode '%s' ignored due to syntax errors\n", name); + } else { + generate_tests(opc); + release_opcode(opc); + } + } + return 0; + } + + if (verify) { + if (num_clargs == 0) + fatal("Missing file name[s]\n"); + + int num_mismatch = 0; + + for (int i = 0; i < num_clargs; ++i) { + verify_stats stats = verify_disassembly(clargs[i]); + num_mismatch += stats.num_mismatch; + } + return num_mismatch != 0; + } + + if (run) { + if (num_clargs == 0) + fatal("Missing opcode name[s]\n"); + + unsigned num_mismatch = 0; + + for (int i = 0; i < num_clargs; ++i) { + const char *name = clargs[i]; + + opcode *opc = get_opcode_by_name(name); + + if (opc == NULL) { + error("'%s' is not a recognised opcode\n", name); + } else if (opcode_has_errors(opc)) { + error("Opcode '%s' ignored due to syntax errors\n", name); + } else { + generate_tests(opc); + + char file[strlen(name) + 10]; // large enough + sprintf(file, "%s.dump", name); + + verify_stats stats = verify_disassembly(file); + num_mismatch += stats.num_mismatch; + + if (! 
keep_temp) + remove_temp_files(name); + release_opcode(opc); + } + } + return num_mismatch != 0; + } + + if (all) { + if (num_clargs != 0) + fatal("Excess arguments on command line\n"); + + unsigned num_tests, num_verified, num_mismatch, num_spec_exc; + num_tests = num_verified = num_mismatch = num_spec_exc = 0; + + for (int i = 0; i < num_opcodes; ++i) { + opcode *opc = get_opcode_by_index(i); // never NULL + + if (opcode_has_errors(opc)) { + error("Opcode '%s' ignored due to syntax errors\n", + opc->name); + continue; + } + num_tests += generate_tests(opc); + + char file[strlen(opc->name) + 10]; + sprintf(file, "%s.dump", opc->name); + + verify_stats stats = verify_disassembly(file); + + num_verified += stats.num_verified; + num_mismatch += stats.num_mismatch; + num_spec_exc += stats.num_spec_exc; + + if (! keep_temp) + remove_temp_files(opc->name); + release_opcode(opc); + } + if (verbose || summary) { + printf("Total: %6u tests generated\n", num_tests); + printf("Total: %6u insns verified\n", num_verified); + printf("Total: %6u disassembly mismatches\n", num_mismatch); + printf("Total: %6u specification exceptions\n", num_spec_exc); + } + return num_mismatch != 0; + } + + if (unit_test) + run_unit_tests(); + + return 0; +} + + +static void +remove_temp_files(const char *op) +{ + char file[strlen(op) + 10]; // large enough + static const char *suffix[] = { ".c", ".o", ".dump", ".vex" }; + + for (int i = 0; i < sizeof suffix / sizeof *suffix; ++i) { + sprintf(file, "%s%s", op, suffix[i]); + unlink(file); + } +} + + +/* A few convenience utilities */ +void +error(const char *fmt, ...) +{ + va_list args; + va_start(args, fmt); + vfprintf(stderr, fmt, args); + va_end(args); +} + + +void +fatal(const char *fmt, ...) 
+{ + va_list args; + va_start(args, fmt); + vfprintf(stderr, fmt, args); + va_end(args); + exit(EXIT_FAILURE); +} + + +void * +mallock(unsigned n) +{ + void *p = malloc(n); + + if (p == NULL) + fatal("malloc failed\n"); + return p; +} + + +char * +strsave(const char *s) +{ + return strcpy(mallock(strlen(s) + 1), s); +} + + +char * +strnsave(const char *s, unsigned len) +{ + char *p = memcpy(mallock(len + 1), s, len); + + p[len] = '\0'; + return p; +} + + +static long long +get_clo_value(const char *clo, const char *field_name, long long min, + long long max) +{ + long long value; + + const char *p = strchr(clo, '=') + 1; // succeeds + + if (sscanf(p, "%lld", &value) != 1) + fatal("%s value '%s' is not an integer\n", field_name, p); + if (value < min || value > max) + fatal("%s value '%lld' is out of range\n", field_name, value); + return value; +} + + +/* Return 1, if the given opcode has at least one invalid operand. + This indicates that there were parse errors earlier. */ +static int +opcode_has_errors(const opcode *opc) +{ + const opnd *opnds = opc->opnds; + + for (int i = 0; i < opc->num_opnds; ++i) { + if (opnds[i].kind == OPND_INVALID) + return 1; + } + return 0; +} diff --git a/none/tests/s390x/disasm-test/main.h b/none/tests/s390x/disasm-test/main.h new file mode 100644 index 000000000..c988d8874 --- /dev/null +++ b/none/tests/s390x/disasm-test/main.h @@ -0,0 +1,101 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. 
+ + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. + + The GNU General Public License is contained in the file COPYING. +*/ + +#ifndef MAIN_H +#define MAIN_H + +/* The various kinds of operands. */ +typedef enum { + OPND_GPR, + OPND_VR, + OPND_D12LB, + OPND_D12XB, + OPND_D12VB, + OPND_D20XB, + OPND_D12B, + OPND_D20B, + OPND_SINT, + OPND_UINT, + OPND_MASK, + OPND_PCREL, + OPND_INVALID +} opnd_t; + +/* An operand */ +typedef struct { + char *name; + opnd_t kind; + unsigned num_bits; + int is_unsigned; + // NULL = no values specified. Otherwise, values[0] == #values + // and values[1..#values] are the values. + long long *allowed_values; +} opnd; + +/* An opcode */ +typedef struct { + char *name; + opnd *opnds; + unsigned num_opnds; + /* When generating a testcase this is the number of fields we + need to assign a value to */ + unsigned num_fields; +} opcode; + +typedef struct { + unsigned num_verified; + unsigned num_mismatch; + unsigned num_spec_exc; +} verify_stats; + +__attribute__((format(printf, 1, 2))) +void error(const char *, ...); +__attribute__((noreturn)) __attribute__((format(printf, 1, 2))) +void fatal(const char *, ...); + +verify_stats verify_disassembly(const char *); +unsigned generate_tests(const opcode *); +opcode *get_opcode_by_name(const char *); +opcode *get_opcode_by_index(unsigned); +void release_opcode(opcode *); +void run_unit_tests(void); + +void *mallock(unsigned); +char *strsave(const char *); +char *strnsave(const char *, unsigned); + +extern int verbose; +extern int debug; +extern int show_spec_exc; +extern int show_miscompares; +extern int d12_val, d20_val; +extern long long sint_val, uint_val; +extern int d12_val_specified, 
d20_val_specified; +extern int uint_val_specified, sint_val_specified; +extern unsigned num_opcodes; +extern const char *gcc; +extern const char *gcc_flags; +extern const char *objdump; + +#endif // MAIN_H diff --git a/none/tests/s390x/disasm-test/objdump.c b/none/tests/s390x/disasm-test/objdump.c new file mode 100644 index 000000000..62d834654 --- /dev/null +++ b/none/tests/s390x/disasm-test/objdump.c @@ -0,0 +1,237 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. + + The GNU General Public License is contained in the file COPYING. 
+*/ + +#include <stddef.h> // NULL +#include <stdio.h> // sprintf +#include <stdlib.h> // free +#include <string.h> // strchr +#include <ctype.h> // isdigit +#include "main.h" // error +#include "objdump.h" + +static int get_nibble(const char *); +static int get_byte(const char *); + + +objdump_file * +read_objdump(const char *file) +{ + const char *function = FUNCTION; + const char *mark = MARK; + + /* Slurp file into memory */ + FILE *fp = fopen(file, "rb"); + if (fp == NULL) { + error("%s: fopen failed\n", file); + return NULL; + } + + /* Determine file size */ + int rc = fseek(fp, 0, SEEK_END); + if (rc < 0) { + error("%s: fseek failed\n", file); + return NULL; + } + + long size = ftell(fp); + if (size < 0) { + error("%s: ftell failed\n", file); + return NULL; + } + rewind(fp); + + char *const buf = mallock(size + 1); + size_t num_read = fread(buf, 1, size, fp); + if (num_read != size) { + error("%s: fread failed\n", file); + free(buf); + return NULL; + } + buf[size] = '\0'; + + fclose(fp); + + /* Determine the number of lines in the file. This number + exceeds the number of lines containing insns. */ + unsigned num_lines = 0; + + for (char *p = buf; *p; ++p) { + if (*p == '\n') { + *p = '\0'; + ++num_lines; + } + } + + /* Allocate an objdump_file. */ + objdump_file *ofile = mallock(sizeof (objdump_file)); + + ofile->filebuf = buf; + ofile->lines = mallock(num_lines * sizeof(objdump_line)); + + /* Locate the line containing <FUNCTION>: */ + char string[strlen(function) + 3 + 1]; + sprintf(string, "<%s>:", function); + + char *cur, *next = 0; // shut up, GCC + const char *end = buf + num_read; + + for (cur = buf; cur != end; cur = next) { + const char *line = cur; + next = strchr(line, '\0') + 1; + if (strstr(line, string)) + break; + } + + /* Process the lines containing insns. These are the lines between + the 1st and 2nd MARK. 
*/ + unsigned linecnt = 0; + int marker_seen = 0; + for (cur = next; cur != end; cur = next) { + char *line = cur; + + next = strchr(line, '\0') + 1; + + char *p; + for (p = line; isspace(*p); ++p) + ; + + if (*p == '\0') continue; // blank line allowed + + unsigned address = 0; + while (*p != ':') { + address = (address << 4) + get_nibble(p); + ++p; + } + + ++p; // skip ':' + + while (isspace(*p)) + ++p; + + /* The leftmost two bits (0:1) encode the length of the insn + in bytes: + 00 -> 2 bytes, 01 -> 4 bytes, 10 -> 4 bytes, 11 -> 6 bytes. */ + unsigned char byte = get_byte(p); + unsigned insn_len = ((((byte >> 6) + 1) >> 1) + 1) << 1; + + /* Temporary buffer. */ + char insn_bytes[6] = { 0 }; + + for (int i = 0; i < insn_len; ++i) { + insn_bytes[i] = get_byte(p); + p += 3; + } + + while (isspace(*p)) // skip white space to disassembled text + ++p; + + char *dis_insn = p; + + /* Remove symbolic jump targets, if any. E.g. change + 1b68: c0 e5 ff ff fd a4 brasl %r14,16b0 to + 1b68: c0 e5 ff ff fd a4 brasl %r14,16b0 + */ + p = strchr(p, '<'); + if (p) { + *p-- = '\0'; + while (isspace(*p)) // remove trailing white space + *p-- = '\0'; + } + + if (strncmp(dis_insn, mark, strlen(mark)) == 0) { + if (marker_seen) + break; // we're done + marker_seen = 1; + } else { + if (marker_seen == 1) { + /* Add the line */ + objdump_line *oline = ofile->lines + linecnt++; + oline->address = address; + oline->insn_len = insn_len; + oline->disassembled_insn = dis_insn; + memcpy(oline->insn_bytes, insn_bytes, sizeof insn_bytes); + + /* Extra byte to allow the decoder to peek past the end of + the current insn */ + // FIXME: introduce global variable that is observed in + // FIXME: the decoder which disables peeking ahead ? 
+ oline->insn_bytes[insn_len] = 0x00; + } + } + } + + if (marker_seen == 0) { + error("%s is not a valid objdump -d file\n", file); + release_objdump(ofile); + return NULL; + } + + ofile->num_lines = linecnt; + + return ofile; +} + + +/* Free all memory allocated for the objdump file */ +void +release_objdump(objdump_file *ofile) +{ + free(ofile->filebuf); + free(ofile->lines); + free(ofile); +} + + +static int +get_nibble(const char *p) +{ + int c = *p; + + if (isdigit(c)) + return c - '0'; + + switch (tolower(c)) { + case 'a': return 10; + case 'b': return 11; + case 'c': return 12; + case 'd': return 13; + case 'e': return 14; + case 'f': return 15; + default: + break; + } + + error("%s: get_nibble failed; continuing with fingers crossed\n", p); + return 0; +} + + +static int +get_byte(const char *p) +{ + int n1 = get_nibble(p); + int n2 = get_nibble(p + 1); + + return (n1 << 4) + n2; +} diff --git a/none/tests/s390x/disasm-test/objdump.h b/none/tests/s390x/disasm-test/objdump.h new file mode 100644 index 000000000..388943f7a --- /dev/null +++ b/none/tests/s390x/disasm-test/objdump.h @@ -0,0 +1,55 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. + + The GNU General Public License is contained in the file COPYING. 
+*/ + +#ifndef OBJDUMP_H +#define OBJDUMP_H + +/* An opcode which marks the begin and end of a sequence of insns + in a testcase whose disassembly should be verified. */ +#define MARK "csch" + +/* Name of the C function containing the generated insns. */ +#define FUNCTION "main" + +/* INSN_BYTES needs an extra byte because s390_decode_and_irgen peeks + at the next instruction to handle some special case. And in case + of INSN_BYTES having only 6 elements that would be an out-of-bounds + memory access. insn_bytes[insn_len] will always be 0x00. */ +typedef struct { + unsigned address; + unsigned insn_len; + unsigned char insn_bytes[6 + 1]; + const char *disassembled_insn; // points into objdump_file::filebuf +} objdump_line; + +typedef struct { + char *filebuf; // contents of objdump file; will be modified ! + unsigned num_lines; // #lines containing insns + objdump_line *lines; +} objdump_file; + +objdump_file *read_objdump(const char *); +void release_objdump(objdump_file *); + +#endif // OBJDUMP_H diff --git a/none/tests/s390x/disasm-test/opcode.c b/none/tests/s390x/disasm-test/opcode.c new file mode 100644 index 000000000..daa489908 --- /dev/null +++ b/none/tests/s390x/disasm-test/opcode.c @@ -0,0 +1,964 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. + + The GNU General Public License is contained in the file COPYING. +*/ + +#include <stdio.h> // printf +#include <string.h> // strlen +#include <stdlib.h> // free +#include <assert.h> // assert +#include <ctype.h> // isdigit +#include "main.h" // error + +/* Each line in the initialiser is an opcode specification which + consists of two parts: opcode name and opcode operands. + Name and operands are separated by white space. Operands are + separated by ','. + + Operands should not be confused with fields. E.g. the "bc" + opcode has 2 operands (m1 and d12(x2,b2)) but 4 fields (m,d,x,b). + + The following naming conventions are observed: + + - 'r[0-9]+' denotes a GPR (4-bit wide unsigned value) + - 'b[0-9]+' denotes a GPR used as base register + - 'x[0-9]+' denotes a GPR used as index register + - 'v[0-9]+' denotes a VR (5-bit wide unsigned value) + - 'm[0-9]+' denotes a 4-bit mask (an unsigned value) + - 'l' denotes an 8-bit length (an unsigned value) + - 'l[0-9]+' denotes an 8-bit length (an unsigned value) + - 'i[0-9]+' denotes an integer + You must specify signedness and #bits like so: + i2:u8 meaning: 8-bit wide, unsigned integer contents + i3:s12 meaning: 12-bit wide, signed integer contents + - 'ri[0-9]+' denotes an integer that is used for PC-relative + addressing. You must specify signedness and #bits. + - 'd12' denotes a 12-bit wide displacement (an unsigned integer) + - 'd20' denotes a 20-bit wide displacement (a signed integer) + + It is also possible to restrict the values that may be assigned to an + operand (typically a mask). This is heavily used by vector opcodes. + + Example: + m4:{ 0..4 } values 0,1,2,3,4 are allowed + m4:{ 0,1,2,3,4 } same as above + m5:{ 0,1,3..7 } values 0,1,3,4,5,6,7 are allowed + + If you need to specify #bits, signedness and allowed values you + need to provide those in the order: signedness, #bits, allowed + values. E.g. 
i2:s8{-10..42} +*/ +static const char *opcodes[] = { + /* Unsorted list of opcodes (other than vector ops) with + extended mnemonics. See also: + Appendix J, Principles of Operation */ + "bic m1,d20(x2,b2)", + "bcr m1,r2", + "bc m1,d12(x2,b2)", + "bras r1,ri2:s16", + "brasl r1,ri2:s32", + "brc m1,ri2:s16", + "brcl m1,ri2:s32", + "brct r1,ri2:s16", + "brctg r1,ri2:s16", + "brcth r1,ri2:s32", + "brxh r1,r3,ri2:s16", + "brxhg r1,r3,ri2:s16", + "brxle r1,r3,ri2:s16", + "brxlg r1,r3,ri2:s16", + "crb r1,r2,m3,d12(b4)", + "cgrb r1,r2,m3,d12(b4)", + "crj r1,r2,m3,ri4:s16", + "cgrj r1,r2,m3,ri4:s16", + "crt r1,r2,m3", + "cgrt r1,r2,m3", + "cib r1,i2:s8,m3,d12(b4)", + "cgib r1,i2:s8,m3,d12(b4)", + "cij r1,i2:s8,m3,ri4:s16", + "cgij r1,i2:s8,m3,ri4:s16", + "cit r1,i2:s16,m3", + "cgit r1,i2:s16,m3", + "clrb r1,r2,m3,d12(b4)", + "clgrb r1,r2,m3,d12(b4)", + "clrj r1,r2,m3,ri4:s16", + "clgrj r1,r2,m3,ri4:s16", + "clrt r1,r2,m3", + "clgrt r1,r2,m3", + "clt r1,m3,d20(b2)", + "clgt r1,m3,d20(b2)", + "clib r1,i2:u8,m3,d12(b4)", + "clgib r1,i2:u8,m3,d12(b4)", + "clij r1,i2:u8,m3,ri4:s16", + "clgij r1,i2:u8,m3,ri4:s16", + "clfit r1,i2:u16,m3", + "clgit r1,i2:u16,m3", + "iilf r1,i2:u32", + "lochhi r1,i2:s16,m3", + "lochi r1,i2:s16,m3", + "locghi r1,i2:s16,m3", + "locfhr r1,r2,m3", + "locfh r1,d20(b2),m3", + "llilf r1,i2:u32", + "locr r1,r2,m3", + "locgr r1,r2,m3", + "loc r1,d20(b2),m3", + "locg r1,d20(b2),m3", + "nork r1,r2,r3", + "nogrk r1,r2,r3", + "rnsbg r1,r2,i3:u8,i4:u8,i5:u8", // FIXME un/signed i3/4/5 ? t-bit ? z-bit? + "rxsbg r1,r2,i3:u8,i4:u8,i5:u8", // FIXME ditto + "risbg r1,r2,i3:u8,i4:u8,i5:u8", // FIXME ditto + "risbgn r1,r2,i3:u8,i4:u8,i5:u8", // FIXME ditto + "risbhg r1,r2,i3:u8,i4:u8,i5:u8", // FIXME ditto + "risblg r1,r2,i3:u8,i4:u8,i5:u8", // FIXME ditto + "rosbg r1,r2,i3:u8,i4:u8,i5:u8", // FIXME ditto + "selr r1,r2,r3,m4", + "selgr r1,r2,r3,m4", + "selfhr r1,r2,r3,m4", + "stocfh r1,d20(b2),m3", + "stoc r1,d20(b2),m3", + "stocg r1,d20(b2),m3", + + /* Misc. 
other opcodes */ + "cksm r1,r2", // make sure it gets disassembled (BZ 495817) + "clcl r1,r2", // make sure it gets disassembled (BZ 495817) + "mvcl r1,r2", // make sure it gets disassembled (BZ 495817) + + // If a set of allowed values is specified for an operand this + // implies that any other value would cause a specification exception, + // unless otherwise noted. + + // Chapter 21: Vector Overview and Support Instructions + "vbperm v1,v2,v3", + "vgef v1,d12(v2,b2),m3:{0,1,2,3}", + "vgeg v1,d12(v2,b2),m3:{0,1}", + "vgbm v1,i2:u16", + "vgm v1,i2:u8,i3:u8,m4:{0..3}", + "vl v1,d12(x2,b2),m3:{0,3,4}", // no spec. exc + "vlr v1,v2", + "vlrep v1,d12(x2,b2),m3:{0..3}", + "vlebrh v1,d12(x2,b2),m3:{0..7}", + "vlebrf v1,d12(x2,b2),m3:{0..3}", + "vlebrg v1,d12(x2,b2),m3:{0,1}", + "vlbrrep v1,d12(x2,b2),m3:{1..3}", + "vllebrz v1,d12(x2,b2),m3:{1..3,6}", + "vlbr v1,d12(x2,b2),m3:{1..4}", + "vleb v1,d12(x2,b2),m3", + "vleh v1,d12(x2,b2),m3:{0..7}", + "vlef v1,d12(x2,b2),m3:{0..3}", + "vleg v1,d12(x2,b2),m3:{0,1}", + "vleib v1,i2:s16,m3", + "vleih v1,i2:s16,m3:{0..7}", + "vleif v1,i2:s16,m3:{0..3}", + "vleig v1,i2:s16,m3:{0,1}", + "vler v1,d12(x2,b2),m3:{0..3}", + "vlgv r1,v3,d12(b2),m4:{0..3}", + "vllez v1,d12(x2,b2),m3:{0..3,6}", + // "vlm v1,v3,d12(b2),m4", // cannot express constraint + "vlrlr v1,r3,d12(b2)", + "vlrl v1,d12(b2),i3:u8{0..15}", + "vlbb v1,d12(x2,b2),m3:{0..6}", + "vlvg v1,r3,d12(b2),m4:{0..3}", + "vlvgp v1,r2,r3", + "vll v1,r3,d12(b2)", + "vmrh v1,v2,v3,m4:{0..3}", + "vmrl v1,v2,v3,m4:{0..3}", + "vpk v1,v2,v3,m4:{1..3}", + "vpks v1,v2,v3,m4:{1..3},m5:{0,1}", // no spec. exception for m5 + "vpkls v1,v2,v3,m4:{1..3},m5:{0,1}", // no spec. 
exception for m5 + "vperm v1,v2,v3,v4", + "vpdi v1,v2,v3,m4", + "vrep v1,v3,i2:u16,m4:{0..3}", + "vrepi v1,i2:s16,m3:{0..3}", + "vscef v1,d12(v2,b2),m3:{0..3}", + "vsceg v1,d12(v2,b2),m3:{0,1}", + "vsel v1,v2,v3,v4", + "vseg v1,v2,m3:{0..2}", + "vst v1,d12(x2,b2),m3", + "vstebrh v1,d12(x2,b2),m3:{0..7}", + "vstebrf v1,d12(x2,b2),m3:{0..3}", + "vstebrg v1,d12(x2,b2),m3:{0,1}", + "vstbr v1,d12(x2,b2),m3:{1..4}", + "vsteb v1,d12(x2,b2),m3", + "vsteh v1,d12(x2,b2),m3:{0..7}", + "vstef v1,d12(x2,b2),m3:{0..3}", + "vsteg v1,d12(x2,b2),m3:{0,1}", + "vster v1,d12(x2,b2),m3:{1..3}", + // "vstm v1,v3,d12(b2),m4", // cannot express constraint + "vstrlr v1,r3,d12(b2)", + "vstrl v1,d12(b2),i3:u8{0..15}", + "vstl v1,r3,d12(b2)", + "vuph v1,v2,m3:{0..2}", + "vuplh v1,v2,m3:{0..2}", + "vupl v1,v2,m3:{0..2}", + "vupll v1,v2,m3:{0..2}", + + // Chapter 22: Vector Integer Instructions + "va v1,v2,v3,m4:{0..4}", + "vacc v1,v2,v3,m4:{0..4}", + "vac v1,v2,v3,v4,m5:{4}", + "vaccc v1,v2,v3,v4,m5:{4}", + "vn v1,v2,v3", + "vnc v1,v2,v3", + "vavg v1,v2,v3,m4:{0..3}", + "vavgl v1,v2,v3,m4:{0..3}", + "vcksm v1,v2,v3", + "vec v1,v2,m3:{0..3}", + "vecl v1,v2,m3:{0..3}", + "vceq v1,v2,v3,m4:{0..3},m5:{0,1}", // no spec. exception for m5 + "vch v1,v2,v3,m4:{0..3},m5:{0,1}", // no spec. exception for m5 + "vchl v1,v2,v3,m4:{0..3},m5:{0,1}", // no spec. 
exception for m5 + "vclz v1,v2,m3:{0..3}", + "vctz v1,v2,m3:{0..3}", + "vx v1,v2,v3", + "vgfm v1,v2,v3,m4:{0..3}", + "vgfma v1,v2,v3,v4,m5:{0..3}", + "vlc v1,v2,m3:{0..3}", + "vlp v1,v2,m3:{0..3}", + "vmx v1,v2,v3,m4:{0..3}", + "vmxl v1,v2,v3,m4:{0..3}", + "vmn v1,v2,v3,m4:{0..3}", + "vmnl v1,v2,v3,m4:{0..3}", + "vmal v1,v2,v3,v4,m5:{0..2}", + "vmah v1,v2,v3,v4,m5:{0..2}", + "vmalh v1,v2,v3,v4,m5:{0..2}", + "vmae v1,v2,v3,v4,m5:{0..2}", + "vmale v1,v2,v3,v4,m5:{0..2}", + "vmao v1,v2,v3,v4,m5:{0..2}", + "vmalo v1,v2,v3,v4,m5:{0..2}", + "vmh v1,v2,v3,m4:{0..2}", + "vmlh v1,v2,v3,m4:{0..2}", + "vml v1,v2,v3,m4:{0..2}", + "vme v1,v2,v3,m4:{0..2}", + "vmle v1,v2,v3,m4:{0..2}", + "vmo v1,v2,v3,m4:{0..2}", + "vmlo v1,v2,v3,m4:{0..2}", + "vmsl v1,v2,v3,v4,m5:{3},m6:{0,4,8,12}", // no spec. exception for m6 + "vnn v1,v2,v3", + "vno v1,v2,v3", + "vnx v1,v2,v3", + "vo v1,v2,v3", + "voc v1,v2,v3", + "vpopct v1,v2,m3:{0..3}", + "verllv v1,v2,v3,m4:{0..3}", + "verll v1,v3,d12(b2),m4:{0..3}", + "verim v1,v2,v3,i4:u8,m5:{0..3}", + "veslv v1,v2,v3,m4:{0..3}", + "vesl v1,v3,d12(b2),m4:{0..3}", + "vesrav v1,v2,v3,m4:{0..3}", + "vesra v1,v3,d12(b2),m4:{0..3}", + "vesrlv v1,v2,v3,m4:{0..3}", + "vesrl v1,v3,d12(b2),m4:{0..3}", + "vsl v1,v2,v3", + "vslb v1,v2,v3", + "vsld v1,v2,v3,i4:u8{0..7}", // spec exc. 
+ "vsldb v1,v2,v3,i4:u8{0..15}", // otherwise unpredictable + "vsra v1,v2,v3", + "vsrab v1,v2,v3", + "vsrd v1,v2,v3,i4:u8{0..7}", + "vsrl v1,v2,v3", + "vsrlb v1,v2,v3", + "vs v1,v2,v3,m4:{0..4}", + "vscbi v1,v2,v3,m4:{0..4}", + "vsbi v1,v2,v3,v4,m5:{4}", + "vsbcbi v1,v2,v3,v4,m5:{4}", + "vsumg v1,v2,v3,m4:{1,2}", + "vsumq v1,v2,v3,m4:{2,3}", + "vsum v1,v2,v3,m4:{0,1}", + "vtm v1,v2", + + // Chapter 23: Vector String Instructions + "vfae v1,v2,v3,m4:{0..2},m5", + "vfee v1,v2,v3,m4:{0..2},m5:{0..3}", + "vfene v1,v2,v3,m4:{0..2},m5:{0..3}", + "vistr v1,v2,m3:{0..2},m5:{0,1}", + "vstrc v1,v2,v3,v4,m5:{0..2},m6", + "vstrs v1,v2,v3,v4,m5:{0..2},m6:{0,2}", + + // Chapter 24: Vector Floating-Point Instructions + "vfa v1,v2,v3,m4:{2,3,4},m5:{0,8}", + "wfc v1,v2,m3:{2,3,4},m4:{0}", + "wfk v1,v2,m3:{2,3,4},m4:{0}", + "vfce v1,v2,v3,m4:{2,3,4},m5:{0,4,8,12},m6:{0,1}", + "vfch v1,v2,v3,m4:{2,3,4},m5:{0,4,8,12},m6:{0,1}", + "vfche v1,v2,v3,m4:{2,3,4},m5:{0,4,8,12},m6:{0,1}", + "vcfps v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vcdg + "vcdg v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vcfps + "vcfpl v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vcdlg + "vcdlg v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vcfpl + "vcsfp v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vcgd + "vcgd v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vcsfp + "vclfp v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vclgd + "vclgd v1,v2,m3:{2,3},m4:{0,4,8,12},m5:{0,1,3..7}", // aka vclfp + "vfd v1,v2,v3,m4:{2,3,4},m5:{0,8}", + "vfi v1,v2,m3:{2,3,4},m4:{0,4,8,12},m5:{0,1,3..7}", + "vfll v1,v2,m3:{2,3},m4:{0,8}", + "vflr v1,v2,m3:{3,4},m4:{0,4,8,12},m5:{0,1,3..7}", + "vfmax v1,v2,v3,m4:{2,3,4},m5:{0,8},m6:{0..4,8..12}", + "vfmin v1,v2,v3,m4:{2,3,4},m5:{0,8},m6:{0..4,8..12}", + "vfm v1,v2,v3,m4:{2,3,4},m5:{0,8}", + "vfma v1,v2,v3,v4,m5:{0,8},m6:{2,3,4}", + "vfms v1,v2,v3,v4,m5:{0,8},m6:{2,3,4}", + "vfnma v1,v2,v3,v4,m5:{0,8},m6:{2,3,4}", + "vfnms 
v1,v2,v3,v4,m5:{0,8},m6:{2,3,4}", + "vfpso v1,v2,m3:{2,3,4},m4:{0,8},m5:{0..2}", + "vfsq v1,v2,m3:{2,3,4},m4:{0,8}", + "vfs v1,v2,v3,m4:{2,3,4},m5:{0,8}", + "vftci v1,v2,i3:u12,m4:{2,3,4},m5:{0,8}", + + // Chapter 25: Vector Decimal Instructions + + // not implemented + + // Chapter 26: Specialized-Function-Assist Instructions + + // "kdsa r1,r2", // cannot express constraint + // "dfltcc r1,r2,r3", // cannot express constraint + // "nnpa", // cannot express constraint + // "sortl r1,r2", // not implemented + "vclfnh v1,v2,m3,m4", // FIXME: m3:{2} m4:{0} no spec but IEEE exc. + "vclfnl v1,v2,m3,m4", // FIXME: m3:{2} m4:{0} no spec but IEEE exc. + "vcrnf v1,v2,v3,m4,m5", // FIXME: m4:{0} m5:{2} no spec but IEEE exc. + "vcfn v1,v2,m3,m4", // FIXME: m3:{0} m4:{1} no spec but IEEE exc. + "vcnf v1,v2,m3,m4", // FIXME: m3:{0} m4:{1} no spec but IEEE exc. +}; + +unsigned num_opcodes = sizeof opcodes / sizeof *opcodes; + + +static const char * +skip_digits(const char *p) +{ + while (isdigit(*p)) + ++p; + return p; +} + + +/* Parse an integer. If not found return initial value of P. */ +static const char * +parse_int(const char *p, int is_unsigned, long long *val) +{ + if (sscanf(p, "%lld", val) != 1) { + error("integer expected\n"); + return p; + } + if (is_unsigned && *val < 0) { + error("unsigned value expected\n"); + return p; + } + return skip_digits(*val < 0 ? p + 1 : p); +} + + +/* Parse a range of integers. Upon errors return initial value of P. */ +static const char * +parse_range(const char *p, int is_unsigned, long long *from, + long long *to) +{ + if (sscanf(p, "%lld..%lld", from, to) == 2) { + if (*from > *to) { + error("range %lld..%lld is not ascending\n", *from, *to); + return p; + } + if (is_unsigned && (*from < 0 || *to < 0)) { + error("unsigned value expected\n"); + return p; + } + p = strstr(p, "..") + 2; + return skip_digits(*to < 0 ? 
p + 1 : p); + } + return NULL; // not a range of values +} + + +/* An element is either a single integer or a range of integers, + e.g. 3..6 + Note, the function recurses, which is a neat trick to avoid + having to realloc the array for the values. + P must point to an integer; anything else is a syntax error. + The function returns NULL upon encountering syntax errors. */ +static long long * +consume_elements(const char *p, int is_unsigned, int num_val) +{ + const char *begin = p; + long long from, to, val; + long long *values; + + /* Try range first. */ + p = parse_range(begin, is_unsigned, &from, &to); + if (p) { + if (p == begin) // errors + return NULL; + if (to < from) { + error("range %lld..%lld is not ascending\n", from, to); + return NULL; + } + num_val += to - from + 1; + + if (*p == '}') { + values = mallock((num_val + 1) * sizeof(long long)); + values[0] = num_val; + } else if (*p == ',') { + values = consume_elements(p + 1, is_unsigned, num_val); + if (values == NULL) + return NULL; + } else { + error("syntax error near '%s'\n", p); + return NULL; + } + for (long long v = to; v >= from; --v) + values[num_val--] = v; + return values; + } + + p = parse_int(begin, is_unsigned, &val); + if (p == begin) // errors + return NULL; + ++num_val; + if (*p == '}') { + values = mallock((num_val + 1) * sizeof(long long)); + values[0] = num_val; + } else if (*p == ',') { + values = consume_elements(p + 1, is_unsigned, num_val); + if (values == NULL) + return NULL; + } else { + error("syntax error near '%s'\n", p); + return NULL; + } + values[num_val] = val; + + return values; +} + + +/* This function is invoked upon encountering a '{' in an opcode + specification. It parses the set elements, explodes any value + ranges and returns an array with the values found. + The function returns NULL upon encountering syntax errors. 
*/ +static long long * +consume_set(const char *p, unsigned num_bits, int is_unsigned) +{ + assert(*p == '{'); + + if (p[1] == '}') { + error("empty value set not allowed\n"); + return NULL; + } + long long *values = consume_elements(p + 1, is_unsigned, 0); + + if (values == NULL) // there were errors + return NULL; + + long long max_val = is_unsigned ? (1LL << num_bits) - 1 + : (1LL << (num_bits - 1)) - 1; + long long min_val = is_unsigned ? 0 : -max_val - 1; + + /* Check for out-of-range values. */ + for (int i = 1; i <= values[0]; ++i) { + long long val = values[i]; + if (val < min_val) { + error("value %lld too small for %s %u bits\n", val, + is_unsigned ? "unsigned" : "signed", num_bits); + return NULL; + } + if (val > max_val) { + error("value %lld too large for %s %u bits\n", val, + is_unsigned ? "unsigned" : "signed", num_bits); + return NULL; + } + } + return values; +} + + +/* Construct an invalid operand. It is used to indicate that there + were parse errors reading an operand. */ +static opnd +invalid_opnd(char *name) +{ + return (opnd){ .name = name, .kind = OPND_INVALID, .num_bits = 0, + .is_unsigned = 1, .allowed_values = 0 }; +} + + +/* GPRs and VRs are understood. Specification not allowed. 
*/ +static opnd +register_operand(const char *opnd_string) +{ + char *name = strsave(opnd_string); + const char *p = skip_digits(opnd_string + 1); + + if (p == opnd_string + 1) { // no digits found + error("%s: invalid register name\n", opnd_string); + return invalid_opnd(name); + } + if (*p == ':') { + error("%s: specification is invalid for registers\n", opnd_string); + return invalid_opnd(name); + } + if (*p != '\0') { + error("'%s' is not understood\n", opnd_string); + return invalid_opnd(strsave(opnd_string)); + } + + opnd_t kind = OPND_GPR; + unsigned num_bits = 4; + + if (opnd_string[0] == 'v') { + kind = OPND_VR; + num_bits = 5; + } + return (opnd){ .name = name, .kind = kind, .num_bits = num_bits, + .is_unsigned = 1, .allowed_values = 0 }; +} + + +static opnd +mask_operand(const char *opnd_string) +{ + unsigned num_bits = 4; + + const char *p = skip_digits(opnd_string + 1); + + if (p == opnd_string + 1) { // no digits found + error("%s: invalid mask name\n", opnd_string); + return invalid_opnd(strsave(opnd_string)); + } + + char *name = strnsave(opnd_string, (unsigned)(p - opnd_string)); + long long *allowed_values = NULL; + + if (p[0] == ':') { + ++p; + /* Either e.g. 
m7:u2 or m7:{...} or m7:u2{...} */ + if (p[0] == 's') { + error("%s: cannot be a signed integer\n", name); + return invalid_opnd(name); + } + + if (p[0] != 'u' && p[0] != '{') { + error("%s: expected 'u' or '{'\n", name); + return invalid_opnd(name); + } + + if (p[0] == 'u') { + if (sscanf(p + 1, "%u", &num_bits) != 1) { + error("%s: missing #bits\n", name); + return invalid_opnd(name); + } + p = skip_digits(p + 1); + } + if (p[0] == '{') { + allowed_values = consume_set(p, num_bits, 1); + if (allowed_values == NULL) + return invalid_opnd(name); + p = strchr(p + 1, '}'); + if (p == NULL) { + error("%s: expected '}' not found\n", name); + return invalid_opnd(name); + } + ++p; + } + } + if (p[0] != '\0') { + error("'%s' is not understood\n", opnd_string); + return invalid_opnd(name); + } + return (opnd){ .name = name, .kind = OPND_MASK, .num_bits = num_bits, + .is_unsigned = 1, .allowed_values = allowed_values }; +} + + +static opnd +dxb_operand(const char *opnd_string) +{ + unsigned x, b, l, v; + opnd_t kind = OPND_INVALID; + + if (sscanf(opnd_string, "d12(x%u,b%u)", &x, &b) == 2) { + kind = OPND_D12XB; + } else if (sscanf(opnd_string, "d20(x%u,b%u)", &x, &b) == 2) { + kind = OPND_D20XB; + } else if (sscanf(opnd_string, "d12(b%u)", &b) == 1) { + kind = OPND_D12B; + } else if (sscanf(opnd_string, "d20(b%u)", &b) == 1) { + kind = OPND_D20B; + } else if (sscanf(opnd_string, "d12(l,b%u)", &b) == 1) { + kind = OPND_D12LB; + } else if (sscanf(opnd_string, "d12(l%u,b%u)", &l, &b) == 2) { + kind = OPND_D12LB; + } else if (sscanf(opnd_string, "d12(v%u,b%u)", &v, &b) == 2) { + kind = OPND_D12VB; + } + + if (kind == OPND_INVALID) { + error("%s: not a valid dxb operand\n", opnd_string); + return invalid_opnd(strsave(opnd_string)); + } + + const char *p = strchr(opnd_string, ')') + 1; + char *name = strnsave(opnd_string, (unsigned)(p - opnd_string)); + + if (p[0] == ':') { + error("%s: specification is invalid for dxb operands\n", name); + return invalid_opnd(name); + } + if 
(p[0] != '\0') { + error("'%s' is not understood\n", opnd_string); + return invalid_opnd(name); + } + + return (opnd){ .name = name, .kind = kind, .num_bits = 0, + .is_unsigned = 1, .allowed_values = 0 }; +} + + +static opnd +integer_operand(const char *opnd_string) +{ + const char *colon = strchr(opnd_string, ':'); + if (colon == NULL) { + error("%s: missing signedness specification\n", opnd_string); + return invalid_opnd(strsave(opnd_string)); + } + + opnd_t kind; + char *name = strnsave(opnd_string, (unsigned)(colon - opnd_string)); + if (colon[1] == 's') { + kind = OPND_SINT; + } else if (colon[1] == 'u') { + kind = OPND_UINT; + } else { + error("%s: invalid type spec '%c'\n", name, colon[1]); + return invalid_opnd(name); + } + + /* Check for PC-relative operand */ + if (strncmp(name, "ri", 2) == 0) { + if (kind == OPND_UINT) { + error("%s: PC-relative operand is unsigned\n", name); + return invalid_opnd(name); + } + kind = OPND_PCREL; + } + + unsigned num_bits; + if (sscanf(colon + 2, "%u", &num_bits) != 1) { + error("%s: missing #bits\n", name); + return invalid_opnd(name); + } + + long long *allowed_values = NULL; + const char *p = skip_digits(colon + 2); + if (p[0] == '{') { + allowed_values = consume_set(p, num_bits, kind == OPND_UINT); + if (allowed_values == NULL) + return invalid_opnd(name); + + p = strchr(p + 1, '}'); + if (p == NULL) { + error("%s: expected '}' not found\n", name); + return invalid_opnd(name); + } + ++p; + } + if (p[0] != '\0') { + error("'%s' is not understood\n", opnd_string); + return invalid_opnd(name); + } + + return (opnd){ .name = name, .kind = kind, .num_bits = num_bits, + .is_unsigned = kind == OPND_UINT, + .allowed_values = allowed_values }; +} + + +static opnd +get_operand(const char *opnd_string) +{ + switch (opnd_string[0]) { + case 'r': // GPR + if (! 
isdigit(opnd_string[1])) break; + /* fall through */ + case 'b': // GPR + case 'x': // GPR + case 'v': // VR + return register_operand(opnd_string); + + /* Masks without specification are understood: + 4-bit wide, unsigned integer value. In contrast to registers + a specification is allowed. However, the interpretation must + be 'unsigned'. */ + case 'm': + return mask_operand(opnd_string); + + /* Address computation using base register, optional index + register and displacement are understood. + d12 = 12-bit displacement = unsigned integer value + d20 = 20-bit displacement = signed integer value */ + case 'd': + return dxb_operand(opnd_string); + + default: + break; + } + + /* All other operands require specification of #bits and signedness. + E.g. ri3:s8, ri4:u24 */ + return integer_operand(opnd_string); +} + + +static const char * +opnd_kind_as_string(int kind) +{ + switch (kind) { + case OPND_GPR: return "gpr"; + case OPND_VR: return "vr"; + case OPND_D12XB: return "d12xb"; + case OPND_D20XB: return "d20xb"; + case OPND_D12B: return "d12b"; + case OPND_D20B: return "d20b"; + case OPND_D12LB: return "d12lb"; + case OPND_D12VB: return "d12vb"; + case OPND_SINT: return "sint"; + case OPND_UINT: return "uint"; + case OPND_MASK: return "mask"; + case OPND_PCREL: return "pcrel"; + case OPND_INVALID: return "INVALID"; + default: + assert(0); + } +} + + +opcode * +get_opcode_by_name(const char *name) +{ + unsigned len = strlen(name); + + for (int i = 0; i < num_opcodes; ++i) { + const char *op = opcodes[i]; + if (strncmp(op, name, len) == 0 && + (op[len] == ' ' || op[len] == '\0')) + return get_opcode_by_index(i); + } + return NULL; +} + + +/* Returns a block of information for the given opcode. 
*/ +static opcode * +parse_opcode(const char *spec) +{ + if (debug) + printf("spec: |%s|\n", spec); + + /* Make local copy */ + char copy[strlen(spec) + 1]; + strcpy(copy, spec); + + /* Skip over the opcode name */ + char *p; + for (p = copy; *p; ++p) { + if (*p == ' ') + break; + } + // *p == ' ' or *p == '\0' + + char *name = strnsave(copy, p - copy); + + while (*p == ' ') + ++p; + + /* Remove any blanks from operand list */ + char *opnd_string = p; + char *q; + for (q = p = opnd_string; *p; ++p) { + if (*p != ' ') + *q++ = *p; + } + *q = '\0'; + + /* Count number of operands by counting ','. That number may be + larger than the actual number of operands because of operands + like d12(x2,b2) with embedded comma. That's OK. */ + unsigned num_comma = 0; + for (p = opnd_string; *p; ++p) + if (*p == ',') + ++num_comma; + + ++num_comma; // 1 comma --> 2 operands + + unsigned need = num_comma * sizeof(opnd); + + opnd *opnds = mallock(need); + + /* Parse operand list */ + int in_paren = 0; + int in_brace = 0; + int num_opnds = 0; + + for (p = opnd_string; *p; ++p) { + int c = *p; + + if (c == '{') ++in_brace; + if (c == '}') --in_brace; + + if (in_paren == 0) { + if (c == '(') { + ++in_paren; + } else if (c == ',' && ! in_brace) { + *p = '\0'; + opnds[num_opnds++] = get_operand(opnd_string); + opnd_string = p + 1; + } + } else { + if (c == ')') + --in_paren; + } + } + if (*opnd_string) + opnds[num_opnds++] = get_operand(opnd_string); + + /* Determine the number of fields in this opcode. 
+ A field is an entity that test generation will assign a + value to */ + unsigned num_fields = 0; + + for (int i = 0; i < num_opnds; ++i) { + switch (opnds[i].kind) { + case OPND_GPR: + case OPND_VR: + case OPND_SINT: + case OPND_UINT: + case OPND_MASK: + case OPND_PCREL: + num_fields += 1; + break; + + case OPND_D12B: + case OPND_D20B: + num_fields += 2; + break; + + case OPND_D12XB: + case OPND_D12LB: + case OPND_D12VB: + case OPND_D20XB: + num_fields += 3; + break; + + case OPND_INVALID: + break; + + default: + assert(0); + } + } + + /* Finalise opcode */ + opcode *opc = mallock(sizeof(opcode)); + + opc->name = name; + opc->opnds = opnds; + opc->num_opnds = num_opnds; + opc->num_fields = num_fields; + + if (debug) { + printf("opcode: |%s|\n", opc->name); + for (int i = 0; i < opc->num_opnds; ++i) { + const opnd *d = opc->opnds + i; + printf("opnd %2d: %-8s type: %-5s", i, d->name, + opnd_kind_as_string(d->kind)); + if (d->kind != OPND_D12XB && d->kind != OPND_D12B && + d->kind != OPND_D20XB && d->kind != OPND_D20B && + d->kind != OPND_D12LB && d->kind != OPND_D12VB) + printf(" #bits: %2u", d->num_bits); + if (d->allowed_values) { + printf(" values:"); + unsigned nval = d->allowed_values[0]; + for (int j = 1; j <= nval; ++j) { + if (d->is_unsigned) + printf(" %u", (unsigned)d->allowed_values[j]); + else + printf(" %d", (int)d->allowed_values[j]); + } + } + printf("\n"); + } + } + + return opc; +} + + +/* Returns a block of information for the given opcode. 
*/ +opcode * +get_opcode_by_index(unsigned ix) +{ + assert(ix < num_opcodes); + + return parse_opcode(opcodes[ix]); +} + + +void +release_opcode(opcode *opc) +{ + for (int i = 0; i < opc->num_opnds; ++i) { + const opnd *q = opc->opnds + i; + free(q->name); + free(q->allowed_values); + } + free(opc->name); + free(opc->opnds); + free(opc); +} + + +/* Unit tests */ + +static const char *unit_tests[] = { + "ok1", "ok2 m2", "ok3 m3:u3", "ok4 m4:{1}", "ok5 m5:{1..4}", + "ok6 m6:{1,2,3}", "ok7 m7:{1..4,5,7..8}", "ok8 m8:{10,1..3,0}", + "ok9 m9:u7{3,2,10..11,15,16}", "ok10 m10:u2", "ok11 m11:{11..11}", + "ok12 r1,d12(l,b2)", "ok13 r3,d12(l2,b3)", "ok14 d12(v1,b3)", + "err1 m", "err2 m2:", "err3 m3:s", "err4 m4:s5", + "err5 m5{", "err6 m6:{", "err7 m7:{}", "err8 m8:{1", "err9 m9:{1,", + "err10 m0:{2..}", "err11 m11:{2..1}", "err12 m11:u{1,2}", + "err13 m13:r", "err14 m14:u" +}; + + +void +run_unit_tests(void) +{ + unsigned num_tests = sizeof unit_tests / sizeof unit_tests[0]; + + debug = 1; + + for (int i = 0; i < num_tests; ++i) + parse_opcode(unit_tests[i]); +} diff --git a/none/tests/s390x/disasm-test/verify.c b/none/tests/s390x/disasm-test/verify.c new file mode 100644 index 000000000..c219607d9 --- /dev/null +++ b/none/tests/s390x/disasm-test/verify.c @@ -0,0 +1,149 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. 
+ + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. + + The GNU General Public License is contained in the file COPYING. +*/ + +#include <stdio.h> // printf +#include <string.h> // strcpy +#include <ctype.h> // isspace +#include "objdump.h" // read_objdump +#include "main.h" // verbose +#include "vex.h" // vex_disasm + +static int disasm_same(const char *, const char *, unsigned); + + +/* Verify disassembly; return statistics incl. number of mismatches. */ +verify_stats +verify_disassembly(const char *file) +{ + verify_stats stats = { 0, 0, 0 }; // return value + + vex_reset(); + + objdump_file *ofile = read_objdump(file); + if (ofile == NULL) + return stats; + + if (verbose) + printf("...verifying %u insns in '%s'\n", ofile->num_lines, file); + + const char *p = strchr(file, '.'); + char vex_file[strlen(file) + 5]; + + if (p == NULL) { + sprintf(vex_file, "%s.vex", file); + } else { + int len = p - file; + strncpy(vex_file, file, len); + strcpy(vex_file + len, ".vex"); + } + FILE *fpvex = fopen(vex_file, "w"); + if (fpvex == NULL) + error("%s: fopen failed\n", vex_file); + + for (int i = 0; i < ofile->num_lines; ++i) { + const objdump_line *oline = ofile->lines + i; + int spec_exc = 0; + const char *disassembly_from_vex = + vex_disasm(oline->insn_bytes, &spec_exc); + + if (spec_exc) { + ++stats.num_spec_exc; + + if (show_spec_exc) { + fprintf(stderr, "*** specification exception for insn "); + for (int j = 0; j < oline->insn_len; ++j) + fprintf(stderr, "%02X", oline->insn_bytes[j]); + fprintf(stderr, " in %s\n", file); + } + /* Instructions causing specification exceptions are not + compared */ + continue; + } + + if (disassembly_from_vex == NULL) + disassembly_from_vex = "MISSING disassembly from VEX"; + if (fpvex) + fprintf(fpvex, "%s\n", disassembly_from_vex); + + /* Compare disassembled insns */ + ++stats.num_verified; + if (! 
disasm_same(oline->disassembled_insn, disassembly_from_vex, + oline->address)) { + ++stats.num_mismatch; + if (show_miscompares) { + int n = fprintf(stderr, "*** mismatch VEX: |%s|", + disassembly_from_vex); + fprintf(stderr, "%*c", 50 - n, ' '); + fprintf(stderr, "objdump: |%s|\n", oline->disassembled_insn); + } + } + } + if (fpvex) + fclose(fpvex); + release_objdump(ofile); + + if (verbose) { + printf("...%u insns verified\n", stats.num_verified); + printf("...%u disassembly mismatches\n", stats.num_mismatch); + printf("...%u specification exceptions\n", stats.num_spec_exc); + } + + return stats; +} + + +/* Compare two disassembled insns ignoring white space. Return 1 if + equal. */ +static int +disasm_same(const char *from_objdump, const char *from_vex, + unsigned address) +{ + const char *p1 = from_objdump; + const char *p2 = from_vex; + + while (42) { + if (*p1 == '\0' && *p2 == '\0') + return 1; + if (*p1 == '\0' || *p2 == '\0') + return 0; + while (isspace(*p1)) + ++p1; + while (isspace(*p2)) + ++p2; + if (*p1 != *p2) { + long long offset_in_bytes; + unsigned long long target_address; + + /* Consider the case where the VEX disassembly has ".+integer" + or ".-integer" and the objdump disassembly has an + address. */ + if (*p2++ != '.') return 0; + if (sscanf(p2, "%lld", &offset_in_bytes) != 1) return 0; + if (sscanf(p1, "%llx", &target_address) != 1) return 0; + return address + offset_in_bytes == target_address; + } + ++p1; + ++p2; + } +} diff --git a/none/tests/s390x/disasm-test/vex.c b/none/tests/s390x/disasm-test/vex.c new file mode 100644 index 000000000..fa7b48b7c --- /dev/null +++ b/none/tests/s390x/disasm-test/vex.c @@ -0,0 +1,145 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. 
+ + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. + + The GNU General Public License is contained in the file COPYING. +*/ + +#include "libvex_ir.h" // emptyIRSB +#include "libvex.h" // LibVEX_Init +#include "main_util.h" // GRRR for guest_s390_defs.h which needs STATIC_ASSERT +#include "guest_s390_defs.h" // disInstr_S390 +#include "host_s390_defs.h" // s390_host_hwcaps +#include "main_globals.h" // vex_traceflags + +/* Some VEX header defines this. Need to get rid of it before + including standard headers. */ +#undef NULL +#include <stddef.h> // NULL +#include <stdlib.h> // free +#include <string.h> // strlen +#include <ctype.h> // isspace +#include "main.h" // fatal +#include "vex.h" + +static IRSB *dis_irsb; +static char *last_vex_string; + + +/* This function is called from vfatal, vpanic, or due to a failed + assertion in VEX. vex_printf was called just before. */ +__attribute__((noreturn)) +static void +vex_exit(void) +{ + if (last_vex_string) + fatal("VEX: %s\n", last_vex_string); + else + fatal("vex_exit was called\n"); +} + + +/* This function is called from VEX whenever it wants to print a string. + We've arranged for the disassembled instruction to be printed. So we + intercept it here and stash it away. + However, this function may also be called when something unexpected + occurs in VEX. + Nb: strange function prototype. + nbytes == strlen(string) at all times. 
*/ +static void +vex_put_string(const char *string, unsigned long nbytes) +{ + static unsigned buf_size = 0; + static char *buf = NULL; + unsigned need = strlen(string) + 1; + + if (need > buf_size) { + free(buf); + buf = mallock(need); + } + + /* Copy the string and remove any trailing white space. */ + strcpy(buf, string); + + for (int i = strlen(buf) - 1; i >= 0; --i) { + if (! isspace(string[i])) + break; + buf[i] = '\0'; + } + + last_vex_string = buf; +} + + +/* Initialise the disassembly machinery. */ +void +vex_init(void) +{ + if (vex_initdone) return; + + VexControl vcon; + + LibVEX_default_VexControl(&vcon); + LibVEX_Init(vex_exit, vex_put_string, 0, &vcon); + + /* Enable disassembly. */ + vex_traceflags = VEX_TRACE_FE; + + /* Pretend all hardware extensions are available to avoid running + into an emulation failure */ + s390_host_hwcaps = VEX_HWCAPS_S390X_ALL; + + dis_irsb = emptyIRSB(); +} + + +/* Reset the VEX memory allocator. Otherwise, we'll run out of memory + with a suggestion to recompile valgrind. Yuck. */ +void +vex_reset(void) +{ + if (vex_initdone) { + vexSetAllocModeTEMP_and_clear(); + dis_irsb = emptyIRSB(); + } +} + + +/* Disassemble a single insn. + The returned string will be overwritten the next time vex_disasm + is called. The function may return NULL indicating that something + inside VEX went wrong. */ +const char * +vex_disasm(const unsigned char *codebuf, int *spec_exc) +{ + DisResult res; + + res = disInstr_S390(dis_irsb, codebuf, /* delta */0, /* guest_IA */0, + VexArchS390X, NULL, NULL, VexEndnessBE, 0); + + /* Check for specification exception. Cf. 
macro s390_insn_assert + in guest_s390_toIR.c */ + if (res.whatNext == Dis_StopHere && + res.jk_StopHere == Ijk_NoDecode) { + *spec_exc = 1; + } + + return last_vex_string; +} diff --git a/none/tests/s390x/disasm-test/vex.h b/none/tests/s390x/disasm-test/vex.h new file mode 100644 index 000000000..c96acea8b --- /dev/null +++ b/none/tests/s390x/disasm-test/vex.h @@ -0,0 +1,32 @@ +/* -*- mode: C; c-basic-offset: 3; -*- */ + +/* + This file is part of Valgrind, a dynamic binary instrumentation + framework. + + Copyright (C) 2024-2025 Florian Krohm + + This program is free software; you can redistribute it and/or + modify it under the terms of the GNU General Public License as + published by the Free Software Foundation; either version 2 of the + License, or (at your option) any later version. + + This program is distributed in the hope that it will be useful, but + WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + General Public License for more details. + + You should have received a copy of the GNU General Public License + along with this program; if not, see <http://www.gnu.org/licenses/>. + + The GNU General Public License is contained in the file COPYING. +*/ + +#ifndef VEX_H +#define VEX_H + +void vex_init(void); +void vex_reset(void); +const char *vex_disasm(const unsigned char *, int *); + +#endif // VEX_H
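
Reviewer note: the comparison rule implemented by disasm_same in verify.c (ignore white space; accept a VEX ".+N" or ".-N" operand where objdump prints the absolute target address) can be tried outside the Valgrind tree. The following is a minimal standalone sketch, not the code from the patch. The name same_disassembly is hypothetical, and unlike the patch this variant skips white space before testing for end of string, so trailing blanks are also ignored.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical standalone variant of disasm_same (name is an assumption).
   Returns 1 if both strings are equal once white space is ignored, or if
   they first differ where the VEX text has ".+N" / ".-N" and the objdump
   text has the matching hexadecimal target address. Returns 0 otherwise. */
static int
same_disassembly(const char *from_objdump, const char *from_vex,
                 unsigned address)
{
   const char *p1 = from_objdump;
   const char *p2 = from_vex;

   for (;;) {
      /* Skip white space first so that trailing blanks do not matter.
         This differs from the patch, which tests for '\0' first. */
      while (isspace((unsigned char)*p1))
         ++p1;
      while (isspace((unsigned char)*p2))
         ++p2;
      if (*p1 == '\0' && *p2 == '\0')
         return 1;
      if (*p1 == '\0' || *p2 == '\0')
         return 0;
      if (*p1 != *p2) {
         long long offset_in_bytes;
         unsigned long long target_address;

         /* PC-relative operand: VEX prints an offset relative to the
            insn address, objdump prints the resulting address. */
         if (*p2++ != '.') return 0;
         if (sscanf(p2, "%lld", &offset_in_bytes) != 1) return 0;
         if (sscanf(p1, "%llx", &target_address) != 1) return 0;
         return address + offset_in_bytes == target_address;
      }
      ++p1;
      ++p2;
   }
}
```

As in the patch, the function returns as soon as the PC-relative operand has been checked, so any text following that operand is not compared.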