]> git.ipfire.org Git - thirdparty/binutils-gdb.git/commitdiff
gdb/amd-dbgapi: add basic watchpoint support
authorSimon Marchi <simon.marchi@polymtl.ca>
Sat, 24 Jan 2026 05:15:00 +0000 (00:15 -0500)
committerSimon Marchi <simon.marchi@efficios.com>
Thu, 5 Feb 2026 18:37:39 +0000 (13:37 -0500)
Add basic watchpoint support for the amd-dbgapi target.  This means
placing write watchpoints on globally addressable memory.  More
complexity will come eventually to allow placing watchpoints on the
various other address spaces, but that will require adding proper
support for non-default address spaces first.

Implementation
--------------

I think the implementation is not too surprising, just adding the
required target methods.  But there are some things worthy of mention:

 - amd-dbgapi does not support read watchpoints.  If the core attempts
   to insert a read (or access, which means read/write) watchpoint,
   amd_dbgapi_target::insert_watchpoint returns an error.

   If we silently let the beneath target (linux-nat) install the read
   watchpoint, it would be potentially confusing.  Everything would look
   fine to the user, but a read from the GPU would not be caught, so it
   would look like the watchpoint doesn't work.

   There is a loophole though: read watchpoints created before the
   runtime is loaded (and therefore the amd-dbgapi target is pushed)
   will still be inserted.  Only when execution stops, and the user
   tries to resume again, will the check in
   amd_dbgapi_target::insert_watchpoint be hit.

   Another option would be to allow the host read watchpoint to go
   through, but warn that the reads from the AMD GPU device will not be
   watched.  We would need to be smart to avoid flooding the user with
   warnings.  But I decided to upstream the current ROCgdb behavior
   first, we can always change it later.

 - When the amd-dbgapi target gets pushed, we create amd-dbgapi
   watchpoints for any existing hardware write watchpoint location.

 - When the core asks the target to insert a watchpoint, we ask the
   target beneath to insert it first.  If the beneath target fails, we
   return immediately with an error.

 - When the core asks to remove a watchpoint, we ask the target beneath
   to to remove it first.  Even if it fails, we still try to remove the
   amd-dbgapi watchpoint.

 - When stopping after a watchpoint hit while the "precise-memory"
   setting is not enabled, it is possible for the wave to stop a few
   instructions later than the instruction that made the write that
   triggered the watchpoint.  We print a warning in that case, similar
   to what we do when a memory violation happens while "precis-memory"
   is disabled.

Testing
-------

 - Tests precise-memory-warning-watchpoint.exp and
   watchpoint-at-end-of-shader.exp are more or less brought as-is from
   downstream ROCgdb.  I modified precise-memory-warning-watchpoint.exp
   to watch a hipMalloc'ed region instead of a `__device__` global
   variable.  The latter doesn't work upstream, because we don't yet
   support the DWARF constructs that describe the variable location.

 - I added test watchpoint-basic.exp with various simple cases to
   exercises different code paths added by this patch.

Differences from downstream ROCgdb
----------------------------------

While extracting this code from ROCgdb, I made a few minor but possibly
significant (read: erroneous) changes.  Those should be reviewed
carefully.  I think that some code in ROCgdb was written at a time where
the amd-dbgapi target was always pushed at the very start of the
inferior execution, so assumptions were different.

 - The value type for the `amd_dbgapi_inferior_info::watchpoint_map` map
   is now a structure, instead of an std::pair, just because it makes
   the code more readable.

 - The insert_watchpoint and remove_watchpoint methods (and perhaps
   others) now assume that if they are called, the runtime is in the
   "enabled" state.

 - insert_initial_watchpoints has one more check (loc->owner->type !=
   bp_hardware_watchpoint), to filter out non-write watchpoints.
   Otherwise, I think that we could mistakenly insert some write
   watchpoints for some pre-existing read watchpoints.

 - Because it is possible for read watchpoints to be created before the
   target is pushed, remove_watchpoint returns early if it sees that the
   code asks for the removal of a read watchpoint, instead of asserting
   "type == hw_write" (this was caught by the new test).

 - In ROCgdb, remove_watchpoint does:

     if (addr < it->first || (addr + len) > it->second.first)
       return 1;

   I replaced it with some assertions.

   The first half of this condition should always be true, due to how
   std::upper_bound works.

   For the second part: if the watchpoint was created successfully, it
   is because it did fully cover the requested region (see
   insert_one_watchpoint).  I don't see why the core would ask us to
   remove a watchpoint that wasn't successfully inserted.  I am not 100%
   sure about that one, there might be some edge cases where this is not
   true.

 - I changed a manual free in stopped_by_watchpoint to a
   gdb::unique_xmalloc_ptr, even though it changes nothing functionally.

 - I merged some conditions in amd_dbgapi_target_normal_stop.

Change-Id: Ia15fb7434dc0c142a5a32997ada2e3a163c89f98
Approved-by: Lancelot Six <lancelot.six@amd.com>
Co-Authored-By: Laurent Morichetti <laurent.morichetti@amd.com>
gdb/amd-dbgapi-target.c
gdb/breakpoint.c
gdb/breakpoint.h
gdb/testsuite/gdb.rocm/precise-memory-warning-watchpoint.cpp [new file with mode: 0644]
gdb/testsuite/gdb.rocm/precise-memory-warning-watchpoint.exp [new file with mode: 0644]
gdb/testsuite/gdb.rocm/rocm-test-utils.h
gdb/testsuite/gdb.rocm/watchpoint-at-end-of-shader.cpp [new file with mode: 0644]
gdb/testsuite/gdb.rocm/watchpoint-at-end-of-shader.exp [new file with mode: 0644]
gdb/testsuite/gdb.rocm/watchpoint-basic.cpp [new file with mode: 0644]
gdb/testsuite/gdb.rocm/watchpoint-basic.exp [new file with mode: 0644]

index 471b7a7725ed5c9bfacea5193252463d0f92cfaf..4e52683dc55afccb1d040706590fc18cb257a8d9 100644 (file)
@@ -21,6 +21,7 @@
 #include "amd-dbgapi-target.h"
 #include "amdgpu-tdep.h"
 #include "async-event.h"
+#include "breakpoint.h"
 #include "cli/cli-cmds.h"
 #include "cli/cli-decode.h"
 #include "cli/cli-style.h"
@@ -34,6 +35,8 @@
 #include "solib.h"
 #include "target.h"
 
+#include <map>
+
 /* When true, print debug messages relating to the amd-dbgapi target.  */
 
 static bool debug_amd_dbgapi = false;
@@ -227,6 +230,19 @@ struct amd_dbgapi_inferior_info
                     struct breakpoint *>
     breakpoint_map;
 
+  /* Data associated to an inserted watchpoint.  */
+  struct watchpoint_info
+  {
+    /* End address of the watched region.  */
+    CORE_ADDR end_addr;
+
+    /* ID returned by amd-dbgapi.  */
+    amd_dbgapi_watchpoint_id_t id;
+  };
+
+  /* Ordered map of inserted watchpoints.  The key is the start address.  */
+  std::map<CORE_ADDR, watchpoint_info> watchpoint_map;
+
   /* List of pending events the amd-dbgapi target retrieved from the dbgapi.  */
   std::list<std::pair<ptid_t, target_waitstatus>> wave_events;
 
@@ -307,7 +323,12 @@ struct amd_dbgapi_target final : public target_ops
                                        ULONGEST offset, ULONGEST len,
                                        ULONGEST *xfered_len) override;
 
+  int insert_watchpoint (CORE_ADDR addr, int len, target_hw_bp_type type,
+                        expression *cond) override;
+  int remove_watchpoint (CORE_ADDR addr, int len, target_hw_bp_type type,
+                        expression *cond) override;
   bool stopped_by_watchpoint () override;
+  std::vector<CORE_ADDR> stopped_data_addresses () override;
 
   bool stopped_by_sw_breakpoint () override;
   bool stopped_by_hw_breakpoint () override;
@@ -711,13 +732,222 @@ amd_dbgapi_target::xfer_partial (enum target_object object, const char *annex,
   return TARGET_XFER_OK;
 }
 
+/* Ask amd-dbgapi to insert a watchpoint in [ADDR, ADDR + len).
+
+   Return 0 on success, 1 on failure.  */
+
+static int
+insert_one_watchpoint (amd_dbgapi_inferior_info *info, CORE_ADDR addr, int len)
+{
+  amd_dbgapi_watchpoint_id_t watch_id;
+
+  if (amd_dbgapi_set_watchpoint (info->process_id, addr, len,
+                                AMD_DBGAPI_WATCHPOINT_KIND_STORE_AND_RMW,
+                                &watch_id)
+      != AMD_DBGAPI_STATUS_SUCCESS)
+    return 1;
+
+  auto cleanup = make_scope_exit ([&] ()
+    { amd_dbgapi_remove_watchpoint (watch_id); });
+
+  /* A reduced range watchpoint may have been inserted, which would require
+     additional watchpoints to be inserted to cover the requested range.
+
+     For now, verify that the inserted watchpoint covers the requested range
+     and error out if not.  */
+  amd_dbgapi_global_address_t adjusted_address;
+
+  if (amd_dbgapi_watchpoint_get_info (watch_id,
+                                     AMD_DBGAPI_WATCHPOINT_INFO_ADDRESS,
+                                     sizeof (adjusted_address),
+                                     &adjusted_address)
+       != AMD_DBGAPI_STATUS_SUCCESS
+      || adjusted_address > addr)
+    return 1;
+
+  amd_dbgapi_size_t adjusted_size;
+
+  if (amd_dbgapi_watchpoint_get_info (watch_id,
+                                     AMD_DBGAPI_WATCHPOINT_INFO_SIZE,
+                                     sizeof (adjusted_size), &adjusted_size)
+       != AMD_DBGAPI_STATUS_SUCCESS
+      || (adjusted_address + adjusted_size) < (addr + len))
+    return 1;
+
+  using wp_info_t = amd_dbgapi_inferior_info::watchpoint_info;
+
+  if (!(info->watchpoint_map.emplace (addr, wp_info_t {addr + len, watch_id})
+       .second))
+    return 1;
+
+  cleanup.release ();
+  return 0;
+}
+
+/* Insert watchpoints for all existing watchpoint locations associated to
+   the program space of INFO.  */
+
+static void
+insert_initial_watchpoints (amd_dbgapi_inferior_info *info)
+{
+  gdb_assert (info->runtime_state == AMD_DBGAPI_RUNTIME_STATE_LOADED_SUCCESS);
+
+  for (bp_location *loc : all_bp_locations ())
+    {
+      /* Filter out other program spaces.  */
+      if (loc->pspace != info->inf->pspace)
+       continue;
+
+      /* Filter out non-hardware watchpoints.  */
+      if (loc->loc_type != bp_loc_hardware_watchpoint)
+       continue;
+
+      /* Filter out non-write watchpoints (access/read watchpoints might have
+        been created before the runtime got loaded).  */
+      if (loc->owner->type != bp_hardware_watchpoint)
+       continue;
+
+      if (insert_one_watchpoint (info, loc->address, loc->length) != 0)
+       warning (_("Failed to insert existing watchpoint after loading "
+                  "runtime."));
+    }
+}
+
+int
+amd_dbgapi_target::insert_watchpoint (CORE_ADDR addr, int len,
+                                     target_hw_bp_type type, expression *cond)
+{
+  amd_dbgapi_inferior_info &info
+    = get_amd_dbgapi_inferior_info (current_inferior ());
+
+  /* The amd-dbgapi target is not pushed when the runtime is not active.  */
+  gdb_assert (info.runtime_state == AMD_DBGAPI_RUNTIME_STATE_LOADED_SUCCESS);
+
+  if (type != hw_write)
+    {
+      /* We only allow write watchpoints when GPU debugging is active.  */
+      return 1;
+    }
+
+  if (int ret = beneath ()->insert_watchpoint (addr, len, type, cond);
+      ret != 0)
+    return ret;
+
+  if (int ret = insert_one_watchpoint (&info, addr, len);
+      ret != 0)
+    {
+      /* We failed to insert the GPU watchpoint, so remove the CPU watchpoint
+        before returning an error.  */
+      beneath ()->remove_watchpoint (addr, len, type, cond);
+      return ret;
+    }
+
+  return 0;
+}
+
+int
+amd_dbgapi_target::remove_watchpoint (CORE_ADDR addr, int len,
+                                     target_hw_bp_type type,
+                                     expression *cond)
+{
+  amd_dbgapi_inferior_info &info
+    = get_amd_dbgapi_inferior_info (current_inferior ());
+
+  /* The amd-dbgapi target is not pushed when the runtime is not active.  */
+  gdb_assert (info.runtime_state == AMD_DBGAPI_RUNTIME_STATE_LOADED_SUCCESS);
+
+  /* Try to remove the amd-dbgapi watchpoint even if the removal fails for the
+     target beneath.  */
+  int ret = beneath ()->remove_watchpoint (addr, len, type, cond);
+
+  /* We don't allow non-write watchpoints (see the insert_watchpoints method)
+     when the runtime is enabled (i.e. when the amd-dbgapi target is pushed).
+     But there is a loophole: non-write watchpoints can still be created by the
+     user before the runtime is enabled and the amd-dbgapi target is pushed.
+     In that case, there won't be an amd-dbgapi watchpoint to remove, so just
+     return.  */
+  if (type != hw_write)
+    return ret;
+
+  /* Find the watchpoint id for the [addr, addr + len) range.  */
+  auto it = info.watchpoint_map.upper_bound (addr);
+  if (it == info.watchpoint_map.begin ())
+    return 1;
+
+  std::advance (it, -1);
+
+  /* Not a reference, so that we can reference wp_info after erasing *it.  */
+  const auto [start_addr, wp_info] = *it;
+
+  /* Since upper_bound finds the first element greater than ADDR, the previous
+     element has to be less than or equal to ADDR.  */
+  gdb_assert (start_addr <= addr);
+
+  /* In insert_one_watchpoint, we ensured that the inserted watchpoint fully
+     covered the requested range.  It should be the same here.  */
+  gdb_assert (addr + len <= wp_info.end_addr);
+
+  info.watchpoint_map.erase (it);
+  if (amd_dbgapi_remove_watchpoint (wp_info.id) != AMD_DBGAPI_STATUS_SUCCESS)
+    return 1;
+
+  return ret;
+}
+
 bool
 amd_dbgapi_target::stopped_by_watchpoint ()
 {
   if (!ptid_is_gpu (inferior_ptid))
     return beneath ()->stopped_by_watchpoint ();
 
-  return false;
+  amd_dbgapi_watchpoint_list_t watchpoints;
+  if (amd_dbgapi_wave_get_info (get_amd_dbgapi_wave_id (inferior_ptid),
+                               AMD_DBGAPI_WAVE_INFO_WATCHPOINTS,
+                               sizeof (watchpoints), &watchpoints)
+      != AMD_DBGAPI_STATUS_SUCCESS)
+    return false;
+
+  /* Ensure watchpoints.watchpoint_ids is freed on exit.  */
+  gdb::unique_xmalloc_ptr<amd_dbgapi_watchpoint_id_t>
+    watchpoint_ids_holder (watchpoints.watchpoint_ids);
+
+  return watchpoints.count != 0;
+}
+
+std::vector<CORE_ADDR>
+amd_dbgapi_target::stopped_data_addresses ()
+{
+  amd_dbgapi_inferior_info &info
+    = get_amd_dbgapi_inferior_info (current_inferior ());
+
+  if (!ptid_is_gpu (inferior_ptid))
+    return beneath ()->stopped_data_addresses ();
+
+  amd_dbgapi_watchpoint_list_t watchpoints = {};
+  if (amd_dbgapi_wave_get_info (get_amd_dbgapi_wave_id (inferior_ptid),
+                               AMD_DBGAPI_WAVE_INFO_WATCHPOINTS,
+                               sizeof (watchpoints), &watchpoints)
+      != AMD_DBGAPI_STATUS_SUCCESS)
+    return {};
+
+  /* Ensure watchpoints.watchpoint_ids is freed on exit.  */
+  gdb::unique_xmalloc_ptr<amd_dbgapi_watchpoint_id_t>
+    watchpoint_ids_holder (watchpoints.watchpoint_ids);
+
+  std::vector<CORE_ADDR> watch_addr_hit;
+  for (amd_dbgapi_watchpoint_id_t watch_id
+       : gdb::make_array_view (watchpoints.watchpoint_ids, watchpoints.count))
+    {
+      auto it = std::find_if (info.watchpoint_map.begin (),
+                             info.watchpoint_map.end (),
+                             [watch_id] (auto &wp)
+                               { return wp.second.id == watch_id; });
+
+      if (it != info.watchpoint_map.end ())
+       watch_addr_hit.push_back (it->first);
+    }
+
+  return watch_addr_hit;
 }
 
 void
@@ -1031,9 +1261,12 @@ dbgapi_notifier_handler (int err, gdb_client_data client_data)
        case AMD_DBGAPI_RUNTIME_STATE_LOADED_SUCCESS:
          gdb_assert (info.runtime_state == AMD_DBGAPI_RUNTIME_STATE_UNLOADED);
          info.runtime_state = runtime_state;
+
          amd_dbgapi_debug_printf ("pushing amd-dbgapi target");
          info.inf->push_target (&the_amd_dbgapi_target);
 
+         insert_initial_watchpoints (&info);
+
          /* The underlying target will already be async if we are running, but not if
             we are attaching.  */
          if (info.inf->process_target ()->is_async_p ())
@@ -2297,6 +2530,42 @@ Warning: precise memory violation signal reporting is not enabled, reported\n\
 location may not be accurate.  See \"show amdgpu precise-memory\".\n"));
 }
 
+/* Observer callback for normal_stop.  Warn the user if a hardware watchpoint
+   was hit but precise memory is not enabled.  */
+
+static void
+amd_dbgapi_target_normal_stop (bpstat *bs_list, int print_frame)
+{
+  if (bs_list == nullptr
+      || !print_frame
+      || !ptid_is_gpu (inferior_thread ()->ptid))
+    return;
+
+  amd_dbgapi_inferior_info &info
+    = get_amd_dbgapi_inferior_info (current_inferior ());
+
+  if (info.process_id == AMD_DBGAPI_PROCESS_NONE
+      || info.precise_memory.enabled)
+    return;
+
+  bool found_hardware_watchpoint = false;
+
+  for (bpstat *bs = bs_list; bs != nullptr; bs = bs->next)
+    if (bs->breakpoint_at != nullptr
+       && is_hardware_watchpoint (bs->breakpoint_at))
+      {
+       found_hardware_watchpoint = true;
+       break;
+      }
+
+  if (!found_hardware_watchpoint)
+    return;
+
+  gdb_printf (_("\
+Warning: precise memory signal reporting is not enabled, watchpoint stop\n\
+location may not be accurate.  See \"show amdgpu precise-memory\".\n"));
+}
+
 /* Style for some kinds of messages.  */
 
 static cli_style_option fatal_error_style
@@ -2529,6 +2798,7 @@ INIT_GDB_FILE (amd_dbgapi_target)
   gdb::observers::inferior_exit.attach (amd_dbgapi_inferior_exited, "amd-dbgapi");
   gdb::observers::inferior_pre_detach.attach (amd_dbgapi_inferior_pre_detach, "amd-dbgapi");
   gdb::observers::thread_deleted.attach (amd_dbgapi_thread_deleted, "amd-dbgapi");
+  gdb::observers::normal_stop.attach (amd_dbgapi_target_normal_stop, "amd-dbgapi");
 
   add_basic_prefix_cmd ("amdgpu", no_class,
                        _("Generic command for setting amdgpu flags."),
index a4ccad32a8b90346d5c5f9ce5933c5361880b11c..da99ec27e19f02f90d13e5503c702876b29c6fe0 100644 (file)
@@ -2016,9 +2016,9 @@ is_breakpoint (const struct breakpoint *bpt)
          || bpt->type == bp_dprintf);
 }
 
-/* Return true if BPT is of any hardware watchpoint kind.  */
+/* See breakpoint.h.  */
 
-static bool
+bool
 is_hardware_watchpoint (const struct breakpoint *bpt)
 {
   return (bpt->type == bp_hardware_watchpoint
index 0d9111ba92e906ac4d1464bb9d9e9fe327de205f..6b5dbcfe8a3c036e77b3a84d0c58045f911518b9 100644 (file)
@@ -1085,6 +1085,10 @@ struct watchpoint : public breakpoint
 
 extern bool is_breakpoint (const struct breakpoint *bpt);
 
+/* Return true if BPT is of any hardware watchpoint kind.  */
+
+extern bool is_hardware_watchpoint (const struct breakpoint *bpt);
+
 /* Return true if BPT is of any watchpoint kind, hardware or
    software.  */
 
diff --git a/gdb/testsuite/gdb.rocm/precise-memory-warning-watchpoint.cpp b/gdb/testsuite/gdb.rocm/precise-memory-warning-watchpoint.cpp
new file mode 100644 (file)
index 0000000..dfa50be
--- /dev/null
@@ -0,0 +1,43 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2021-2026 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <hip/hip_runtime.h>
+
+#include "rocm-test-utils.h"
+
+__global__ void
+kernel (int *ptr)
+{
+  for (int i = 0; i < 1000; ++i)
+    (*ptr)++;
+}
+
+int *global_ptr;
+
+int
+main (int argc, char* argv[])
+{
+  CHECK (hipMalloc (&global_ptr, sizeof (int)));
+  CHECK (hipMemset (global_ptr, 0, sizeof (int)));
+
+  /* Break here.  */
+  hipLaunchKernelGGL (kernel, dim3 (1), dim3 (1), 0, 0, global_ptr);
+  CHECK (hipDeviceSynchronize ());
+
+  CHECK (hipFree (global_ptr));
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.rocm/precise-memory-warning-watchpoint.exp b/gdb/testsuite/gdb.rocm/precise-memory-warning-watchpoint.exp
new file mode 100644 (file)
index 0000000..9f8dac6
--- /dev/null
@@ -0,0 +1,52 @@
+# Copyright 2021-2026 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Test that when "amdgpu precise-memory" is off, hitting a watchpoint shows a
+# warning about the stop location maybe being inaccurate.
+
+load_lib rocm.exp
+
+require allow_hipcc_tests
+
+if { ![istarget "*-linux*"] } then {
+    continue
+}
+
+standard_testfile .cpp
+
+if {[build_executable "failed to prepare" $testfile $srcfile {debug hip}]} {
+    return
+}
+
+proc do_test { } {
+    clean_restart
+    gdb_load $::binfile
+
+    with_rocm_gpu_lock {
+       if {![runto kernel allow-pending qualified]} {
+           fail "can't run to main"
+           return
+       }
+       gdb_test "watch *((int *) global_ptr)" \
+           "Hardware watchpoint $::decimal: .*" \
+           "insert watchpoint"
+
+       gdb_test "continue" \
+           "hit Hardware watchpoint $::decimal.*Warning: precise memory signal reporting is not enabled.*" \
+           "continue to watchpoint"
+    }
+}
+
+do_test
index 70d232b873b86fd795daa78044ec6ae34a761e23..cbb1f654461739a8e0bbf0ce7ad5ab5b1ceba195 100644 (file)
     }                                                                  \
   while (0)
 
+/* Ensure that all memory operations are completed before continuing,
+   even when "precise-memory" is off.  */
+
+#define WAIT_MEM                                                       \
+  asm volatile (".if .amdgcn.gfx_generation_number < 10\n"             \
+               "  s_waitcnt 0\n"                                       \
+               ".elseif .amdgcn.gfx_generation_number < 11\n"          \
+               "  s_waitcnt_vscnt null, 0\n"                           \
+               ".else\n"                                               \
+               "  s_wait_idle\n"                                       \
+               ".endif")
+
 #endif /* ROCM_TEST_UTILS_H */
diff --git a/gdb/testsuite/gdb.rocm/watchpoint-at-end-of-shader.cpp b/gdb/testsuite/gdb.rocm/watchpoint-at-end-of-shader.cpp
new file mode 100644 (file)
index 0000000..e25c93e
--- /dev/null
@@ -0,0 +1,44 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2024-2026 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <hip/hip_runtime.h>
+
+#include "rocm-test-utils.h"
+
+__global__ void
+set_val (int *p, int v)
+{
+  *p = v;
+}
+
+int
+main ()
+{
+  int *v;
+  CHECK (hipMalloc (&v, sizeof (*v)));
+
+  /* First dispatch to initialize the memory.  */
+  set_val<<<1, 1>>> (v, 64);
+  CHECK (hipDeviceSynchronize ());
+
+  /* Break here.  */
+  set_val<<<1, 1>>> (v, 8);
+
+  CHECK (hipDeviceSynchronize ());
+
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.rocm/watchpoint-at-end-of-shader.exp b/gdb/testsuite/gdb.rocm/watchpoint-at-end-of-shader.exp
new file mode 100644 (file)
index 0000000..ff233c8
--- /dev/null
@@ -0,0 +1,114 @@
+# Copyright 2024-2026 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# This test checks that a write watchpoint is reported to the debugger
+# when the memory is modified by the last statement of a shader.
+
+load_lib rocm.exp
+
+require allow_hipcc_tests
+
+standard_testfile .cpp
+
+if {[build_executable "failed to prepare" $testfile $srcfile {debug hip}]} {
+    return
+}
+
+# This test is known to fail for some architectures when precise memory is
+# not enabled.  This proc finds out if the current target is expected to have
+# a fail or not.
+proc target_has_xfail {} {
+    # Check for targets where this test will fail without precise memory.
+    set targets [find_amdgpu_devices]
+    if { [llength $targets] == 0} {
+       # Can't determine GPU type, don't set up xfail.  The test will probably
+       # not run correctly anyway.
+       return 0
+    }
+
+    # The test will run on GPU-0, so it should be the first of the list.
+    set target [lindex $targets 0]
+
+    # Extract the target family by removing the last 2 chars from the target
+    # gfx number.
+    set target_family [string range $target 0 end-2]
+    verbose -log "Target family: $target_family"
+    return [expr {[lsearch -exact {gfx10 gfx11 gfx12} $target_family] != -1}]
+}
+
+proc do_test {precise_memory has_xfail} {
+    clean_restart
+    gdb_load $::binfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break here."] allow-pending]} {
+           return
+       }
+
+       gdb_test "p *v" "= 64"
+       gdb_test "watch -l *v" "Hardware watchpoint $::decimal: -location \\*v"
+
+       if {$precise_memory} {
+           # For architectures that does not support precise memory, a warning
+           # will be displayed.
+           gdb_test "set amdgpu precise-memory $precise_memory" \
+               "(warning: AMDGPU precise memory access reporting could not be enabled\\.)?"
+       }
+
+       if { !$precise_memory && $has_xfail } {
+           setup_xfail "*-*-*"
+       }
+       gdb_test "continue" \
+           [multi_line "Switching to thread $::decimal, lane 0.*" \
+               "" \
+               "Thread $::decimal \".*\" hit Hardware watchpoint $::decimal: -location \\*v" \
+               "" \
+               "Old value = 64" \
+               "New value = 8" \
+               ".*"]
+    }
+}
+
+set has_xfail [target_has_xfail]
+
+# First check if we support precise-memory.
+set supports_precise_memory 0
+clean_restart
+gdb_load $::binfile
+with_rocm_gpu_lock {
+    if {![runto_main]} {
+       return
+    }
+
+    gdb_test_multiple "set amdgpu precise-memory on" "" {
+       -re -wrap "warning: AMDGPU precise memory access reporting could not be enabled\\." {
+           set supports_precise_memory 0
+           pass $gdb_test_name
+       }
+       -re -wrap "^" {
+           set supports_precise_memory 1
+           pass $gdb_test_name
+       }
+    }
+}
+
+foreach_with_prefix precise_memory {on off} {
+    if { $precise_memory && !$supports_precise_memory } {
+       unsupported "target does not support precise memory"
+       continue
+    }
+
+    do_test $precise_memory $has_xfail
+}
diff --git a/gdb/testsuite/gdb.rocm/watchpoint-basic.cpp b/gdb/testsuite/gdb.rocm/watchpoint-basic.cpp
new file mode 100644 (file)
index 0000000..a89f336
--- /dev/null
@@ -0,0 +1,69 @@
+/* This testcase is part of GDB, the GNU debugger.
+
+   Copyright 2026 Free Software Foundation, Inc.
+
+   This program is free software; you can redistribute it and/or modify
+   it under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3 of the License, or
+   (at your option) any later version.
+
+   This program is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+   GNU General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with this program.  If not, see <http://www.gnu.org/licenses/>.  */
+
+#include <hip/hip_runtime.h>
+
+#include "rocm-test-utils.h"
+
+__global__ void
+kernel (int *val1, int *val2)
+{
+  *val1 += 10;
+  WAIT_MEM;
+  *val2 += 100;
+  WAIT_MEM;
+  *val1 += 20;
+  WAIT_MEM;
+  *val2 += 200;
+  WAIT_MEM;
+
+  /* Some devices that don't support "precise memory" miss watchpoints when
+     they would trigger near the end of the kernel.  Execute a bunch of sleeps
+     to make sure this doesn't happen.  Just a handful of instructions should
+     be enough, but this executes quickly anyway.  */
+  for (int i = 0; i < 100000; ++i)
+    __builtin_amdgcn_s_sleep (8);
+}
+
+/* Global pointers for the test to watch.  */
+int *global_ptr1;
+int *global_ptr2;
+int host_global = 5;
+
+int
+main ()
+{
+  /* Break before runtime load.  */
+  CHECK (hipMalloc (&global_ptr1, sizeof (int)));
+  CHECK (hipMalloc (&global_ptr2, sizeof (int)));
+  CHECK (hipMemset (global_ptr1, 0, sizeof (int)));
+  CHECK (hipMemset (global_ptr2, 0, sizeof (int)));
+
+  /* Break after malloc.  */
+  kernel<<<1, 1>>> (global_ptr1, global_ptr2);
+  CHECK (hipDeviceSynchronize ());
+
+  host_global += 12;
+
+  /* Break before second launch.  */
+  kernel<<<1, 1>>> (global_ptr1, global_ptr2);
+  CHECK (hipDeviceSynchronize ());
+
+  CHECK (hipFree (global_ptr1));
+  CHECK (hipFree (global_ptr2));
+  return 0;
+}
diff --git a/gdb/testsuite/gdb.rocm/watchpoint-basic.exp b/gdb/testsuite/gdb.rocm/watchpoint-basic.exp
new file mode 100644 (file)
index 0000000..cd4b9c6
--- /dev/null
@@ -0,0 +1,307 @@
+# Copyright 2026 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Basic watchpoint tests for AMD GPU targets.
+
+load_lib rocm.exp
+
+require allow_hipcc_tests
+
+standard_testfile .cpp
+
+if {[build_executable "failed to prepare" $testfile $srcfile {debug hip}]} {
+    return
+}
+
+proc continue_to_watchpoint_hit { old_value new_value test } {
+    gdb_test "continue" \
+       [multi_line \
+           "hit Hardware watchpoint $::decimal:.*" \
+           "" \
+           "Old value = $old_value" \
+           "New value = $new_value" \
+           ".*"] \
+       $test
+}
+
+# Test inserting a watchpoint on a host variable before the runtime loads, and
+# hitting hit after the runtime loads.
+
+proc_with_prefix test_host_watchpoint_before_runtime_load {} {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break before runtime load"]]} {
+           return
+       }
+
+       gdb_test "watch host_global" \
+           "Hardware watchpoint $::decimal: .*" \
+           "set watchpoint on host_global"
+
+       continue_to_watchpoint_hit 5 17 "continue to watchpoint hit"
+
+       gdb_test "continue" \
+           "Inferior 1 .* exited normally.*" \
+           "continue to end"
+    }
+}
+
+# Test inserting a watchpoint on a host variable after the runtime loads and
+# hitting hit.
+
+proc_with_prefix test_host_watchpoint_after_runtime_load {} {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break after malloc"]]} {
+           return
+       }
+
+       gdb_test "watch host_global" \
+           "Hardware watchpoint $::decimal: .*" \
+           "set watchpoint on host_global"
+
+       continue_to_watchpoint_hit 5 17 "continue to watchpoint hit"
+
+       gdb_test "continue" \
+           "Inferior 1 .* exited normally.*" \
+           "continue to end"
+    }
+}
+
+# Test inserting a watchpoint before the kernel is launched, then hitting
+# it when the kernel runs.
+
+proc_with_prefix test_watchpoint_before_kernel {} {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break after malloc"]]} {
+           return
+       }
+
+       gdb_test "watch *((int *) global_ptr1)" \
+           "Hardware watchpoint $::decimal: .*" \
+           "set watchpoint on *ptr1"
+
+       continue_to_watchpoint_hit 0 10 "continue to watchpoint hit 1"
+       continue_to_watchpoint_hit 10 30 "continue to watchpoint hit 2"
+       continue_to_watchpoint_hit 30 40 "continue to watchpoint hit 3"
+       continue_to_watchpoint_hit 40 60 "continue to watchpoint hit 4"
+
+       gdb_test "continue" \
+           "Inferior 1 .* exited normally.*" \
+           "continue to end"
+    }
+}
+
+# Test inserting a watchpoint while stopped inside the kernel.
+
+proc_with_prefix test_watchpoint_inside_kernel {} {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto_main]} {
+           return
+       }
+
+       gdb_breakpoint "kernel" allow-pending temporary
+       gdb_test "continue" \
+           "hit Temporary breakpoint.*, kernel .*" \
+           "continue to kernel"
+
+       gdb_test "watch *((int *) global_ptr2)" \
+           "Hardware watchpoint $::decimal: .*" \
+           "set watchpoint on *ptr2 from inside kernel"
+
+       continue_to_watchpoint_hit 0 100 "continue to watchpoint hit 1"
+       continue_to_watchpoint_hit 100 300 "continue to watchpoint hit 2"
+       continue_to_watchpoint_hit 300 400 "continue to watchpoint hit 3"
+       continue_to_watchpoint_hit 400 600 "continue to watchpoint hit 4"
+
+       gdb_test "continue" \
+           "Inferior 1 .* exited normally.*" \
+           "continue to end"
+    }
+}
+
+# Test removing a watchpoint while stopped inside the kernel, then
+# continuing to the end.
+
+proc_with_prefix test_remove_watchpoint_inside_kernel {} {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break after malloc"]]} {
+           return
+       }
+
+       gdb_test "watch *((int *) global_ptr1)" \
+           "Hardware watchpoint $::decimal: .*" \
+           "set watchpoint"
+
+       continue_to_watchpoint_hit 0 10 "continue to watchpoint hit"
+       gdb_test "with confirm off -- delete" "" "delete all breakpoints"
+
+       gdb_test "continue" \
+           "Inferior 1 .* exited normally.*" \
+           "continue to end"
+    }
+}
+
+# Test watchpoints on different memory locations.
+
+proc_with_prefix test_multiple_watchpoints {} {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break after malloc"]]} {
+           return
+       }
+
+       gdb_test "watch *((int *) global_ptr1)" \
+           "Hardware watchpoint $::decimal: .*" \
+           "set watchpoint on *ptr1"
+
+       gdb_test "watch *((int *) global_ptr2)" \
+           "Hardware watchpoint $::decimal: .*" \
+           "set watchpoint on *ptr2"
+
+       continue_to_watchpoint_hit 0 10 "continue to watchpoint hit 1"
+       continue_to_watchpoint_hit 0 100 "continue to watchpoint hit 2"
+       continue_to_watchpoint_hit 10 30 "continue to watchpoint hit 3"
+       continue_to_watchpoint_hit 100 300 "continue to watchpoint hit 4"
+       continue_to_watchpoint_hit 30 40 "continue to watchpoint hit 5"
+       continue_to_watchpoint_hit 300 400 "continue to watchpoint hit 6"
+       continue_to_watchpoint_hit 40 60 "continue to watchpoint hit 7"
+       continue_to_watchpoint_hit 400 600 "continue to watchpoint hit 8"
+
+       gdb_test "continue" \
+           "Inferior 1 .* exited normally.*" \
+           "continue to end"
+    }
+}
+
+# Test disabling and enabling watchpoints.
+
+proc_with_prefix test_disable_enable_watchpoint {} {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break after malloc"]]} {
+           return
+       }
+
+       set wp1_num -1
+       set wp2_num -2
+
+       gdb_test_multiple "watch *((int *) global_ptr1)" "set watchpoint on *ptr1" {
+           -re -wrap "Hardware watchpoint ($::decimal): .*" {
+               set wp1_num $expect_out(1,string)
+               pass $gdb_test_name
+           }
+       }
+
+       gdb_test_multiple "watch *((int *) global_ptr2)" "set watchpoint on *ptr2" {
+           -re -wrap "Hardware watchpoint ($::decimal): .*" {
+               set wp2_num $expect_out(1,string)
+               pass $gdb_test_name
+           }
+       }
+
+       continue_to_watchpoint_hit 0 10 "continue to watchpoint hit 1"
+
+       gdb_test_no_output "disable $wp1_num" "disable watchpoint"
+
+       continue_to_watchpoint_hit 0 100 "continue to watchpoint hit 2"
+       continue_to_watchpoint_hit 100 300 "continue to watchpoint hit 3"
+
+       gdb_test_no_output "enable $wp1_num" "enable wp1"
+       gdb_test_no_output "disable $wp2_num" "disable wp2"
+
+       continue_to_watchpoint_hit 30 40 "continue to watchpoint hit 4"
+       continue_to_watchpoint_hit 40 60 "continue to watchpoint hit 5"
+
+       gdb_test_no_output "enable $wp2_num" "enable wp2"
+
+       continue_to_watchpoint_hit 400 600 "continue to watchpoint hit 6"
+
+       gdb_test "continue" \
+           "Inferior 1 .* exited normally.*" \
+           "continue to end"
+    }
+}
+
+# Test that read and access watchpoints are rejected when GPU debugging is
+# active.  The amd-dbgapi target only supports write watchpoints.  WP_CMD
+# must either be "rwatch" or "awatch".
+
+proc_with_prefix test_non_write_watchpoint_rejected { wp_cmd } {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break after malloc"]]} {
+           return
+       }
+
+       gdb_test "$wp_cmd host_global" \
+           "Hardware .*watchpoint $::decimal: .*" \
+           "set watchpoint"
+
+       gdb_test "continue" \
+           "Could not insert hardware watchpoint $::decimal.*Command aborted\\." \
+           "watchpoint rejected on resume"
+    }
+}
+
+# Test setting a read/access watchpoint before the GPU runtime loads.  The
+# watchpoint is set successfully on the CPU, but when the runtime loads,
+# we expect some behavior (error or warning) since the GPU only supports
+# write watchpoints.  WP_CMD must either be "rwatch" or "awatch".
+
+proc_with_prefix test_non_write_watchpoint_before_runtime_load { wp_cmd } {
+    clean_restart $::testfile
+
+    with_rocm_gpu_lock {
+       if {![runto [gdb_get_line_number "Break before runtime load"]]} {
+           return
+       }
+
+       gdb_test "$wp_cmd host_global" \
+           "Hardware .*watchpoint $::decimal: .*" \
+           "set watchpoint"
+
+       gdb_test "continue" "hit Hardware (read|access \\(read/write\\)) watchpoint.*Value = 5\r\n.*" \
+           "continue after setting watchpoint"
+
+       gdb_test "continue" \
+           "Could not insert hardware watchpoint $::decimal.*Command aborted\\." \
+           "watchpoint rejected on resume"
+    }
+}
+
+test_host_watchpoint_before_runtime_load
+test_host_watchpoint_after_runtime_load
+test_watchpoint_before_kernel
+test_watchpoint_inside_kernel
+test_remove_watchpoint_inside_kernel
+test_multiple_watchpoints
+test_disable_enable_watchpoint
+
+foreach_with_prefix wp_cmd { "rwatch" "awatch" } {
+    test_non_write_watchpoint_rejected $wp_cmd
+    test_non_write_watchpoint_before_runtime_load $wp_cmd
+}