]> git.ipfire.org Git - thirdparty/gcc.git/blame - libitm/libitm.texi
sh.c: Do not include algorithm.
[thirdparty/gcc.git] / libitm / libitm.texi
CommitLineData
0a35513e
AH
1\input texinfo @c -*-texinfo-*-
2
3@c %**start of header
4@setfilename libitm.info
5@settitle GNU libitm
6@c %**end of header
7
8
9@copying
98db73df 10Copyright @copyright{} 2011-2014 Free Software Foundation, Inc.
0a35513e
AH
11
12Permission is granted to copy, distribute and/or modify this document
13under the terms of the GNU Free Documentation License, Version 1.2 or
14any later version published by the Free Software Foundation; with no
15Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts.
16A copy of the license is included in the section entitled ``GNU
17Free Documentation License''.
18@end copying
19
20@ifinfo
21@dircategory GNU Libraries
22@direntry
23* libitm: (libitm). GNU Transactional Memory Library
24@end direntry
25
26This manual documents the GNU Transactional Memory Library.
27
28@insertcopying
29@end ifinfo
30
31
32@setchapternewpage odd
33
34@titlepage
35@title The GNU Transactional Memory Library
36@page
37@vskip 0pt plus 1filll
38@comment For the @value{version-GCC} Version*
39@sp 1
40@insertcopying
41@end titlepage
42
43@summarycontents
44@contents
45@page
46
47
48@node Top
49@top Introduction
50@cindex Introduction
51
52This manual documents the usage and internals of libitm, the GNU Transactional
53Memory Library. It provides transaction support for accesses to a process'
54memory, enabling easy-to-use synchronization of accesses to shared memory by
55several threads.
56
57
58@comment
59@comment When you add a new menu item, please keep the right hand
60@comment aligned to the same column. Do not use tabs. This provides
61@comment better formatting.
62@comment
63@menu
64* Enabling libitm:: How to enable libitm for your applications.
65* C/C++ Language Constructs for TM::
66 Notes on the language-level interface supported
67 by gcc.
68* The libitm ABI:: Notes on the external ABI provided by libitm.
69* Internals:: Notes on libitm's internal synchronization.
70* GNU Free Documentation License::
71 How you can copy and share this manual.
c288ec8f 72* Library Index:: Index of this documentation.
0a35513e
AH
73@end menu
74
75
76@c ---------------------------------------------------------------------
77@c Enabling libitm
78@c ---------------------------------------------------------------------
79
80@node Enabling libitm
81@chapter Enabling libitm
82
83To activate support for TM in C/C++, the compile-time flag @option{-fgnu-tm}
84must be specified. This enables TM language-level constructs such as
034209bc
TR
85transaction statements (e.g., @code{__transaction_atomic}, @pxref{C/C++
86Language Constructs for TM} for details).
0a35513e
AH
87
88@c ---------------------------------------------------------------------
89@c C/C++ Language Constructs for TM
90@c ---------------------------------------------------------------------
91
92@node C/C++ Language Constructs for TM
93@chapter C/C++ Language Constructs for TM
94
034209bc
TR
95Transactions are supported in C++ and C in the form of transaction statements,
96transaction expressions, and function transactions. In the following example,
97both @code{a} and @code{b} will be read and the difference will be written to
98@code{c}, all atomically and isolated from other transactions:
99
100@example
101__transaction_atomic @{ c = a - b; @}
102@end example
103
104Therefore, another thread can use the following code to concurrently update
105@code{b} without ever causing @code{c} to hold a negative value (and without
106having to use other synchronization constructs such as locks or C++11
107atomics):
108
109@example
110__transaction_atomic @{ if (a > b) b++; @}
111@end example
112
113GCC follows the @uref{https://sites.google.com/site/tmforcplusplus/, Draft
114Specification of Transactional Language Constructs for C++ (v1.1)} in its
115implementation of transactions.
116
117The precise semantics of transactions are defined in terms of the C++11/C11
118memory model (see the specification). Roughly, transactions provide
119synchronization guarantees that are similar to what would be guaranteed when
120using a single global lock as a guard for all transactions. Note that like
121other synchronization constructs in C/C++, transactions rely on a
122data-race-free program (e.g., a nontransactional write that is concurrent
123with a transactional read to the same memory location is a data race).
0a35513e
AH
124
125@c ---------------------------------------------------------------------
126@c The libitm ABI
127@c ---------------------------------------------------------------------
128
129@node The libitm ABI
130@chapter The libitm ABI
131
132The ABI provided by libitm is basically equal to the Linux variant of Intel's
133current TM ABI specification document (Revision 1.1, May 6 2009) but with the
134differences listed in this chapter. It would be good if these changes would
135eventually be merged into a future version of this specification. To ease
136look-up, the following subsections mirror the structure of this specification.
137
138@section [No changes] Objectives
139@section [No changes] Non-objectives
140
141@section Library design principles
142@subsection [No changes] Calling conventions
143@subsection [No changes] TM library algorithms
144@subsection [No changes] Optimized load and store routines
145@subsection [No changes] Aligned load and store routines
146
147@subsection Data logging functions
148
149The memory locations accessed with transactional loads and stores and the
150memory locations whose values are logged must not overlap. This required
151separation only extends to the scope of the execution of one transaction
152including all the executions of all nested transactions.
153
154The compiler must be consistent (within the scope of a single transaction)
155about which memory locations are shared and which are not shared with other
156threads (i.e., data must be accessed either transactionally or
157nontransactionally). Otherwise, non-write-through TM algorithms would not work.
158
eb00e959
TR
159For memory locations on the stack, this requirement extends to only the
160lifetime of the stack frame that the memory location belongs to (or the
161lifetime of the transaction, whichever is shorter). Thus, memory that is
162reused for several stack frames could be target of both data logging and
163transactional accesses; however, this is harmless because these stack frames'
164lifetimes will end before the transaction finishes.
165
0a35513e
AH
166@subsection [No changes] Scatter/gather calls
167@subsection [No changes] Serial and irrevocable mode
168@subsection [No changes] Transaction descriptor
169@subsection Store allocation
170
171There is no @code{getTransaction} function.
172
173@subsection [No changes] Naming conventions
174
175@subsection Function pointer encryption
176
177Currently, this is not implemented.
178
179
180@section Types and macros list
181
182@code{_ITM_codeProperties} has changed, @pxref{txn-code-properties,,Starting a
183transaction}.
184@code{_ITM_srcLocation} is not used.
185
186
187@section Function list
188
189@subsection Initialization and finalization functions
190These functions are not part of the ABI.
191
192@subsection [No changes] Version checking
193@subsection [No changes] Error reporting
194@subsection [No changes] inTransaction call
195
196@subsection State manipulation functions
197There is no @code{getTransaction} function. Transaction identifiers for
198nested transactions will be ordered but not necessarily sequential (i.e., for
199a nested transaction's identifier @var{IN} and its enclosing transaction's
200identifier @var{IE}, it is guaranteed that @math{IN >= IE}).
201
202@subsection [No changes] Source locations
203
204@subsection Starting a transaction
205
206@subsubsection Transaction code properties
207
208@anchor{txn-code-properties}
209The bit @code{hasNoXMMUpdate} is instead called @code{hasNoVectorUpdate}.
210Iff it is set, vector register save/restore is not necessary for any target
211machine.
212
213The @code{hasNoFloatUpdate} bit (@code{0x0010}) is new. Iff it is set, floating
214point register save/restore is not necessary for any target machine.
215
216@code{undoLogCode} is not supported and a fatal runtime error will be raised
217if this bit is set. It is not properly defined in the ABI why barriers
218other than undo logging are not present; Are they not necessary (e.g., a
219transaction operating purely on thread-local data) or have they been omitted by
220the compiler because it thinks that some kind of global synchronization
221(e.g., serial mode) might perform better? The specification suggests that the
222latter might be the case, but the former seems to be more useful.
223
224The @code{readOnly} bit (@code{0x4000}) is new. @strong{TODO} Lexical or dynamic
225scope?
226
227@code{hasNoRetry} is not supported. If this bit is not set, but
228@code{hasNoAbort} is set, the library can assume that transaction
229rollback will not be requested.
230
231It would be useful if the absence of externally-triggered rollbacks would be
232reported for the dynamic scope as well, not just for the lexical scope
233(@code{hasNoAbort}). Without this, a library cannot exploit this together
234with flat nesting.
235
236@code{exceptionBlock} is not supported because exception blocks are not used.
237
238@subsubsection [No changes] Windows exception state
239@subsubsection [No changes] Other machine state
240
241@subsubsection [No changes] Results from beginTransaction
242
243@subsection Aborting a transaction
244
245@code{_ITM_rollbackTransaction} is not supported. @code{_ITM_abortTransaction}
246is supported but the abort reasons @code{exceptionBlockAbort},
247@code{TMConflict}, and @code{userRetry} are not supported. There are no
248exception blocks in general, so the related cases also do not have to be
249considered. To encode @code{__transaction_cancel [[outer]]}, compilers must
250set the new @code{outerAbort} bit (@code{0x10}) additionally to the
251@code{userAbort} bit in the abort reason.
252
253@subsection Committing a transaction
254
255The exception handling (EH) scheme is different. The Intel ABI requires the
256@code{_ITM_tryCommitTransaction} function that will return even when the
257commit failed and will have to be matched with calls to either
258@code{_ITM_abortTransaction} or @code{_ITM_commitTransaction}. In contrast,
259gcc relies on transactional wrappers for the functions of the Exception
260Handling ABI and on one additional commit function (shown below). This allows
261the TM to keep track of EH internally and thus it does not have to embed the
262cleanup of EH state into the existing EH code in the program.
263@code{_ITM_tryCommitTransaction} is not supported.
264@code{_ITM_commitTransactionToId} is also not supported because the
265propagation of thrown exceptions will not bypass commits of nested
266transactions.
267
268@example
269void _ITM_commitTransactionEH(void *exc_ptr) ITM_REGPARM;
270void *_ITM_cxa_allocate_exception (size_t);
271void _ITM_cxa_throw (void *obj, void *tinfo, void *dest);
272void *_ITM_cxa_begin_catch (void *exc_ptr);
273void _ITM_cxa_end_catch (void);
274@end example
275
276@code{_ITM_commitTransactionEH} must be called to commit a transaction if an
277exception could be in flight at this position in the code. @code{exc_ptr} is
278the current exception or zero if there is no current exception.
279The @code{_ITM_cxa...} functions are transactional wrappers for the respective
280@code{__cxa...} functions and must be called instead of these in transactional
281code.
282
283To support this EH scheme, libstdc++ needs to provide one additional function
284(@code{_cxa_tm_cleanup}), which is used by the TM to clean up the exception
285handling state while rolling back a transaction:
286
287@example
288void __cxa_tm_cleanup (void *unthrown_obj, void *cleanup_exc,
289 unsigned int caught_count);
290@end example
291
292@code{unthrown_obj} is non-null if the program called
293@code{__cxa_allocate_exception} for this exception but did not yet called
294@code{__cxa_throw} for it. @code{cleanup_exc} is non-null if the program is
295currently processing a cleanup along an exception path but has not caught this
296exception yet. @code{caught_count} is the nesting depth of
297@code{__cxa_begin_catch} within the transaction (which can be counted by the TM
298using @code{_ITM_cxa_begin_catch} and @code{_ITM_cxa_end_catch});
299@code{__cxa_tm_cleanup} then performs rollback by essentially performing
300@code{__cxa_end_catch} that many times.
301
302
303
304@subsection Exception handling support
305
306Currently, there is no support for functionality like
307@code{__transaction_cancel throw} as described in the C++ TM specification.
308Supporting this should be possible with the EH scheme explained previously
309because via the transactional wrappers for the EH ABI, the TM is able to
310observe and intercept EH.
311
312
313@subsection [No changes] Transition to serial--irrevocable mode
314@subsection [No changes] Data transfer functions
315@subsection [No changes] Transactional memory copies
316
317@subsection Transactional versions of memmove
318
319If either the source or destination memory region is to be accessed
320nontransactionally, then source and destination regions must not be
321overlapping. The respective @code{_ITM_memmove} functions are still
322available but a fatal runtime error will be raised if such regions do overlap.
323To support this functionality, the ABI would have to specify how the
324intersection of the regions has to be accessed (i.e., transactionally or
325nontransactionally).
326
327@subsection [No changes] Transactional versions of memset
328@subsection [No changes] Logging functions
329
330@subsection User-registered commit and undo actions
331
332Commit actions will get executed in the same order in which the respective
333calls to @code{_ITM_addUserCommitAction} happened. Only
334@code{_ITM_noTransactionId} is allowed as value for the
335@code{resumingTransactionId} argument. Commit actions get executed after
336privatization safety has been ensured.
337
338Undo actions will get executed in reverse order compared to the order in which
339the respective calls to @code{_ITM_addUserUndoAction} happened. The ordering of
340undo actions w.r.t. the roll-back of other actions (e.g., data transfers or
341memory allocations) is undefined.
342
343@code{_ITM_getThreadnum} is not supported currently because its only purpose
344is to provide a thread ID that matches some assumed performance tuning output,
345but this output is not part of the ABI nor further defined by it.
346
347@code{_ITM_dropReferences} is not supported currently because its semantics and
348the intention behind it is not entirely clear. The
349specification suggests that this function is necessary because of certain
350orderings of data transfer undos and the releasing of memory regions (i.e.,
351privatization). However, this ordering is never defined, nor is the ordering of
352dropping references w.r.t. other events.
353
354@subsection [New] Transactional indirect calls
355
356Indirect calls (i.e., calls through a function pointer) within transactions
357should execute the transactional clone of the original function (i.e., a clone
358of the original that has been fully instrumented to use the TM runtime), if
359such a clone is available. The runtime provides two functions to
360register/deregister clone tables:
361
362@example
363struct clone_entry
364@{
365 void *orig, *clone;
366@};
367
368void _ITM_registerTMCloneTable (clone_entry *table, size_t entries);
369void _ITM_deregisterTMCloneTable (clone_entry *table);
370@end example
371
372Registered tables must be writable by the TM runtime, and must be live
373throughout the life-time of the TM runtime.
374
375@strong{TODO} The intention was always to drop the registration functions
376entirely, and create a new ELF Phdr describing the linker-sorted table. Much
377like what currently happens for @code{PT_GNU_EH_FRAME}.
378This work kept getting bogged down in how to represent the @var{N} different
379code generation variants. We clearly needed at least two---SW and HW
380transactional clones---but there was always a suggestion of more variants for
381different TM assumptions/invariants.
382
383The compiler can then use two TM runtime functions to perform indirect calls in
384transactions:
385@example
386void *_ITM_getTMCloneOrIrrevocable (void *function) ITM_REGPARM;
387void *_ITM_getTMCloneSafe (void *function) ITM_REGPARM;
388@end example
389
390If there is a registered clone for supplied function, both will return a
391pointer to the clone. If not, the first runtime function will attempt to switch
392to serial--irrevocable mode and return the original pointer, whereas the second
393will raise a fatal runtime error.
394
395@subsection [New] Transactional dynamic memory management
396
397@example
398void *_ITM_malloc (size_t)
399 __attribute__((__malloc__)) ITM_PURE;
400void *_ITM_calloc (size_t, size_t)
401 __attribute__((__malloc__)) ITM_PURE;
402void _ITM_free (void *) ITM_PURE;
403@end example
404
405These functions are essentially transactional wrappers for @code{malloc},
406@code{calloc}, and @code{free}. Within transactions, the compiler should
407replace calls to the original functions with calls to the wrapper functions.
408
409
410@section [No changes] Future Enhancements to the ABI
411
412@section Sample code
413
414The code examples might not be correct w.r.t. the current version of the ABI,
415especially everything related to exception handling.
416
417
418@section [New] Memory model
419
420The ABI should define a memory model and the ordering that is guaranteed for
421data transfers and commit/undo actions, or at least refer to another memory
422model that needs to be preserved. Without that, the compiler cannot ensure the
423memory model specified on the level of the programming language (e.g., by the
424C++ TM specification).
425
426For example, if a transactional load is ordered before another load/store, then
427the TM runtime must also ensure this ordering when accessing shared state. If
428not, this might break the kind of publication safety used in the C++ TM
429specification. Likewise, the TM runtime must ensure privatization safety.
430
431
432
433@c ---------------------------------------------------------------------
434@c Internals
435@c ---------------------------------------------------------------------
436
437@node Internals
438@chapter Internals
439
440@section TM methods and method groups
441
442libitm supports several ways of synchronizing transactions with each other.
443These TM methods (or TM algorithms) are implemented in the form of
444subclasses of @code{abi_dispatch}, which provide methods for
445transactional loads and stores as well as callbacks for rollback and commit.
446All methods that are compatible with each other (i.e., that let concurrently
447running transactions still synchronize correctly even if different methods
448are used) belong to the same TM method group. Pointers to TM methods can be
449obtained using the factory methods prefixed with @code{dispatch_} in
450@file{libitm_i.h}. There are two special methods, @code{dispatch_serial} and
451@code{dispatch_serialirr}, that are compatible with all methods because they
452run transactions completely in serial mode.
453
454@subsection TM method life cycle
455
456The state of TM methods does not change after construction, but they do alter
457the state of transactions that use this method. However, because
458per-transaction data gets used by several methods, @code{gtm_thread} is
459responsible for setting an initial state that is useful for all methods.
460After that, methods are responsible for resetting/clearing this state on each
461rollback or commit (of outermost transactions), so that the transaction
462executed next is not affected by the previous transaction.
463
464There is also global state associated with each method group, which is
465initialized and shut down (@code{method_group::init()} and @code{fini()})
466when switching between method groups (see @file{retry.cc}).
467
468@subsection Selecting the default method
469
470The default method that libitm uses for freshly started transactions (but
471not necessarily for restarted transactions) can be set via an environment
472variable (@env{ITM_DEFAULT_METHOD}), whose value should be equal to the name
473of one of the factory methods returning abi_dispatch subclasses but without
474the "dispatch_" prefix (e.g., "serialirr" instead of
475@code{GTM::dispatch_serialirr()}).
476
477Note that this environment variable is only a hint for libitm and might not
478be supported in the future.
479
480
481@section Nesting: flat vs. closed
482
483We support two different kinds of nesting of transactions. In the case of
484@emph{flat nesting}, the nesting structure is flattened and all nested
485transactions are subsumed by the enclosing transaction. In contrast,
486with @emph{closed nesting}, nested transactions that have not yet committed
487can be rolled back separately from the enclosing transactions; when they
488commit, they are subsumed by the enclosing transaction, and their effects
489will be finally committed when the outermost transaction commits.
490@emph{Open nesting} (where nested transactions can commit independently of the
491enclosing transactions) are not supported.
492
493Flat nesting is the default nesting mode, but closed nesting is supported and
494used when transactions contain user-controlled aborts
495(@code{__transaction_cancel} statements). We assume that user-controlled
496aborts are rare in typical code and used mostly in exceptional situations.
497Thus, it makes more sense to use flat nesting by default to avoid the
498performance overhead of the additional checkpoints required for closed
499nesting. User-controlled aborts will correctly abort the innermost enclosing
500transaction, whereas the whole (i.e., outermost) transaction will be restarted
501otherwise (e.g., when a transaction encounters data conflicts during
502optimistic execution).
503
504
505@section Locking conventions
506
507This section documents the locking scheme and rules for all uses of locking
508in libitm. We have to support serial(-irrevocable) mode, which is implemented
509using a global lock as explained next (called the @emph{serial lock}). To
510simplify the overall design, we use the same lock as catch-all locking
511mechanism for other infrequent tasks such as (de)registering clone tables or
512threads. Besides the serial lock, there are @emph{per-method-group locks} that
513are managed by specific method groups (i.e., groups of similar TM concurrency
514control algorithms), and lock-like constructs for quiescence-based operations
515such as ensuring privatization safety.
516
517Thus, the actions that participate in the libitm-internal locking are either
518@emph{active transactions} that do not run in serial mode, @emph{serial
519transactions} (which (are about to) run in serial mode), and management tasks
520that do not execute within a transaction but have acquired the serial mode
521like a serial transaction would do (e.g., to be able to register threads with
522libitm). Transactions become active as soon as they have successfully used the
523serial lock to announce this globally (@pxref{serial-lock-impl,,Serial lock
524implementation}). Likewise, transactions become serial transactions as soon as
525they have acquired the exclusive rights provided by the serial lock (i.e.,
526serial mode, which also means that there are no other concurrent active or
527serial transactions). Note that active transactions can become serial
528transactions when they enter serial mode during the runtime of the
529transaction.
530
531@subsection State-to-lock mapping
532
533Application data is protected by the serial lock if there is a serial
534transaction and no concurrently running active transaction (i.e., non-serial).
535Otherwise, application data is protected by the currently selected method
536group, which might use per-method-group locks or other mechanisms. Also note
537that application data that is about to be privatized might not be allowed to be
538accessed by nontransactional code until privatization safety has been ensured;
539the details of this are handled by the current method group.
540
541libitm-internal state is either protected by the serial lock or accessed
542through custom concurrent code. The latter applies to the public/shared part
543of a transaction object and most typical method-group-specific state.
544
545The former category (protected by the serial lock) includes:
546@itemize @bullet
547@item The list of active threads that have used transactions.
548@item The tables that map functions to their transactional clones.
549@item The current selection of which method group to use.
550@item Some method-group-specific data, or invariants of this data. For example,
551resetting a method group to its initial state is handled by switching to the
552same method group, so the serial lock protects such resetting as well.
553@end itemize
554In general, such state is immutable whenever there exists an active
555(non-serial) transaction. If there is no active transaction, a serial
556transaction (or a thread that is not currently executing a transaction but has
557acquired the serial lock) is allowed to modify this state (but must of course
558be careful to not surprise the current method group's implementation with such
559modifications).
560
561@subsection Lock acquisition order
562
563To prevent deadlocks, locks acquisition must happen in a globally agreed-upon
564order. Note that this applies to other forms of blocking too, but does not
565necessarily apply to lock acquisitions that do not block (e.g., trylock()
566calls that do not get retried forever). Note that serial transactions are
567never return back to active transactions until the transaction has committed.
568Likewise, active transactions stay active until they have committed.
569Per-method-group locks are typically also not released before commit.
570
571Lock acquisition / blocking rules:
572@itemize @bullet
573
574@item Transactions must become active or serial before they are allowed to
575use method-group-specific locks or blocking (i.e., the serial lock must be
576acquired before those other locks, either in serial or nonserial mode).
577
578@item Any number of threads that do not currently run active transactions can
579block while trying to get the serial lock in exclusive mode. Note that active
580transactions must not block when trying to upgrade to serial mode unless there
581is no other transaction that is trying that (the latter is ensured by the
582serial lock implementation.
583
584@item Method groups must prevent deadlocks on their locks. In particular, they
585must also be prepared for another active transaction that has acquired
586method-group-specific locks but is blocked during an attempt to upgrade to
587being a serial transaction. See below for details.
588
589@item Serial transactions can acquire method-group-specific locks because there
590will be no other active nor serial transaction.
591
592@end itemize
593
594There is no single rule for per-method-group blocking because this depends on
595when a TM method might acquire locks. If no active transaction can upgrade to
596being a serial transaction after it has acquired per-method-group locks (e.g.,
597when those locks are only acquired during an attempt to commit), then the TM
598method does not need to consider a potential deadlock due to serial mode.
599
600If there can be upgrades to serial mode after the acquisition of
601per-method-group locks, then TM methods need to avoid those deadlocks:
602@itemize @bullet
603@item When upgrading to a serial transaction, after acquiring exclusive rights
604to the serial lock but before waiting for concurrent active transactions to
605finish (@pxref{serial-lock-impl,,Serial lock implementation} for details),
606we have to wake up all active transactions waiting on the upgrader's
607per-method-group locks.
608@item Active transactions blocking on per-method-group locks need to check the
609serial lock and abort if there is a pending serial transaction.
610@item Lost wake-ups have to be prevented (e.g., by changing a bit in each
611per-method-group lock before doing the wake-up, and only blocking on this lock
612using a futex if this bit is not group).
613@end itemize
614
615@strong{TODO}: Can reuse serial lock for gl-*? And if we can, does it make
616sense to introduce further complexity in the serial lock? For gl-*, we can
617really only avoid an abort if we do -wb and -vbv.
618
619
620@subsection Serial lock implementation
621@anchor{serial-lock-impl}
622
623The serial lock implementation is optimized towards assuming that serial
624transactions are infrequent and not the common case. However, the performance
625of entering serial mode can matter because when only few transactions are run
626concurrently or if there are few threads, then it can be efficient to run
627transactions serially.
628
629The serial lock is similar to a multi-reader-single-writer lock in that there
630can be several active transactions but only one serial transaction. However,
631we do want to avoid contention (in the lock implementation) between active
632transactions, so we split up the reader side of the lock into per-transaction
633flags that are true iff the transaction is active. The exclusive writer side
634remains a shared single flag, which is acquired using a CAS, for example.
635On the fast-path, the serial lock then works similar to Dekker's algorithm but
636with several reader flags that a serial transaction would have to check.
637A serial transaction thus requires a list of all threads with potentially
638active transactions; we can use the serial lock itself to protect this list
639(i.e., only threads that have acquired the serial lock can modify this list).
640
641We want starvation-freedom for the serial lock to allow for using it to ensure
642progress for potentially starved transactions (@pxref{progress-guarantees,,
643Progress Guarantees} for details). However, this is currently not enforced by
644the implementation of the serial lock.
645
646Here is pseudo-code for the read/write fast paths of acquiring the serial
647lock (read-to-write upgrade is similar to write_lock:
648@example
649// read_lock:
650tx->shared_state |= active;
651__sync_synchronize(); // or STLD membar, or C++0x seq-cst fence
652while (!serial_lock.exclusive)
653 if (spinning_for_too_long) goto slowpath;
654
655// write_lock:
656if (CAS(&serial_lock.exclusive, 0, this) != 0)
657 goto slowpath; // writer-writer contention
658// need a membar here, but CAS already has full membar semantics
659bool need_blocking = false;
660for (t: all txns)
661 @{
662 for (;t->shared_state & active;)
663 if (spinning_for_too_long) @{ need_blocking = true; break; @}
664 @}
665if (need_blocking) goto slowpath;
666@end example
667
668Releasing a lock in this spin-lock version then just consists of resetting
669@code{tx->shared_state} to inactive or clearing @code{serial_lock.exclusive}.
670
671However, we can't rely on a pure spinlock because we need to get the OS
672involved at some time (e.g., when there are more threads than CPUs to run on).
673Therefore, the real implementation falls back to a blocking slow path, either
674based on pthread mutexes or Linux futexes.
675
676
677@subsection Reentrancy
678
679libitm has to consider the following cases of reentrancy:
680@itemize @bullet
681
682@item Transaction calls unsafe code that starts a new transaction: The outer
683transaction will become a serial transaction before executing unsafe code.
684Therefore, nesting within serial transactions must work, even if the nested
685transaction is called from within uninstrumented code.
686
687@item Transaction calls either a transactional wrapper or safe code, which in
688turn starts a new transaction: It is not yet defined in the specification
689whether this is allowed. Thus, it is undefined whether libitm supports this.
690
691@item Code that starts new transactions might be called from within any part
692of libitm: This kind of reentrancy would likely be rather complex and can
693probably be avoided. Therefore, it is not supported.
694
695@end itemize
696
697@subsection Privatization safety
698
699Privatization safety is ensured by libitm using a quiescence-based approach.
700Basically, a privatizing transaction waits until all concurrent active
701transactions will either have finished (are not active anymore) or operate on
702a sufficiently recent snapshot to not access the privatized data anymore. This
703happens after the privatizing transaction has stopped being an active
704transaction, so waiting for quiescence does not contribute to deadlocks.
705
706In method groups that need to ensure publication safety explicitly, active
707transactions maintain a flag or timestamp in the public/shared part of the
708transaction descriptor. Before blocking, privatizers need to let the other
709transactions know that they should wake up the privatizer.
710
711@strong{TODO} Ho to implement the waiters? Should those flags be
712per-transaction or at a central place? We want to avoid one wake/wait call
713per active transactions, so we might want to use either a tree or combining
714to reduce the syscall overhead, or rather spin for a long amount of time
715instead of doing blocking. Also, it would be good if only the last transaction
716that the privatizer waits for would do the wake-up.
717
718@subsection Progress guarantees
719@anchor{progress-guarantees}
720
721Transactions that do not make progress when using the current TM method will
722eventually try to execute in serial mode. Thus, the serial lock's progress
723guarantees determine the progress guarantees of the whole TM. Obviously, we at
724least need deadlock-freedom for the serial lock, but it would also be good to
725provide starvation-freedom (informally, all threads will finish executing a
726transaction eventually iff they get enough cycles).
727
728However, the scheduling of transactions (e.g., thread scheduling by the OS)
729also affects the handling of progress guarantees by the TM. First, the TM
730can only guarantee deadlock-freedom if threads do not get stopped. Likewise,
731low-priority threads can starve if they do not get scheduled when other
732high-priority threads get those cycles instead.
733
734If all threads get scheduled eventually, correct lock implementations will
735provide deadlock-freedom, but might not provide starvation-freedom. We can
736either enforce the latter in the TM's lock implementation, or assume that
737the scheduling is sufficiently random to yield a probabilistic guarantee that
738no thread will starve (because eventually, a transaction will encounter a
739scheduling that will allow it to run). This can indeed work well in practice
740but is not necessarily guaranteed to work (e.g., simple spin locks can be
741pretty efficient).
742
743Because enforcing stronger progress guarantees in the TM has a higher runtime
744overhead, we focus on deadlock-freedom right now and assume that the threads
745will get scheduled eventually by the OS (but don't consider threads with
746different priorities). We should support starvation-freedom for serial
747transactions in the future. Everything beyond that is highly related to proper
748contention management across all of the TM (including with TM method to
749choose), and is future work.
750
751@strong{TODO} Handling thread priorities: We want to avoid priority inversion
752but it's unclear how often that actually matters in practice. Workloads that
753have threads with different priorities will likely also require lower latency
754or higher throughput for high-priority threads. Therefore, it probably makes
755not that much sense (except for eventual progress guarantees) to use
756priority inheritance until the TM has priority-aware contention management.
757
758
759@c ---------------------------------------------------------------------
760@c GNU Free Documentation License
761@c ---------------------------------------------------------------------
762
763@include fdl.texi
764
765@c ---------------------------------------------------------------------
766@c Index
767@c ---------------------------------------------------------------------
768
c288ec8f
JM
769@node Library Index
770@unnumbered Library Index
0a35513e
AH
771
772@printindex cp
773
774@bye