@@ -416,17 +416,16 @@ \subsection{Exchange}
 To see where this might be useful,
 let's tweak our example from \secref{atomicity}:
 instead of displaying the total number of processed files,
-the \textsc{ui} might want to show how many were processed per second.
-We could implement this by having the \textsc{ui} thread read the counter then zero it each second.
+the \textsc{UI} might want to show how many were processed per second.
+We could implement this by having the \textsc{UI} thread read the counter then zero it each second.
 But we could get the following race condition if reading and zeroing are separate steps:
 \begin{enumerate}
-\item The \textsc{ui} thread reads the counter.
-\item Before the \textsc{ui} thread has the chance to zero it,
-the worker thread increments it again.
-\item The \textsc{ui} thread now zeroes the counter, and the previous increment
-is lost.
+\item The \textsc{UI} thread reads the counter.
+\item Before the \textsc{UI} thread has the chance to zero it,
+the worker thread increments it again.
+\item The \textsc{UI} thread now zeroes the counter, and the previous increment is lost.
 \end{enumerate}
-If the \textsc{ui} thread atomically exchanges the current value with zero,
+If the \textsc{UI} thread atomically exchanges the current value with zero,
 the race disappears.
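As a minimal sketch of the exchange described in this hunk, the UI side can read and zero the counter with a single `std::atomic::exchange` call. The names `processed`, `on_file_processed`, and `take_processed_count` are hypothetical and do not come from the paper.

```cpp
#include <atomic>

// Hypothetical counter shared between the worker thread and the UI thread.
std::atomic<int> processed{0};

// Worker side: record one more processed file.
void on_file_processed()
{
    processed.fetch_add(1);
}

// UI side: read the current count and zero it in one indivisible step,
// so no increment can land between the read and the reset.
int take_processed_count()
{
    return processed.exchange(0);
}
```

Calling `take_processed_count()` once per second would give the per-second figure the text describes.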
 
 \subsection{Test and set}
@@ -462,10 +461,10 @@ \subsection{Fetch and…}
 all as part of a single atomic operation.
 You might recall from the exchange example that additions by the worker thread must be atomic to prevent races, where:
 \begin{enumerate}
-\item The worker thread loads the current counter value and adds one.
-\item Before that thread can store the value back,
-the \textsc{ui} thread zeroes the counter.
-\item The worker now performs its store, as if the counter was never cleared.
+\item The worker thread loads the current counter value and adds one.
+\item Before that thread can store the value back,
+the \textsc{UI} thread zeroes the counter.
+\item The worker now performs its store, as if the counter was never cleared.
 \end{enumerate}
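The three steps above can be replayed deterministically in a single thread to make the lost reset visible. This is only an illustrative sketch: the function names and the starting value of 5 are assumptions, and a real race would involve two threads interleaving in this order.

```cpp
#include <atomic>

// Replays the interleaving from the list, with the worker using a separate
// load and store instead of one read-modify-write.
int replay_lost_reset()
{
    std::atomic<int> counter{5};
    int next = counter.load() + 1; // 1. worker loads the value and adds one
    counter.exchange(0);           // 2. UI thread zeroes the counter
    counter.store(next);           // 3. worker stores, undoing the reset
    return counter.load();
}

// With fetch_add the worker's load, add, and store are one atomic step,
// so the UI thread's reset lands either wholly before or wholly after it.
int replay_with_fetch_add(bool ui_first)
{
    std::atomic<int> counter{5};
    if (ui_first) {
        counter.exchange(0);
        counter.fetch_add(1);
    } else {
        counter.fetch_add(1);
        counter.exchange(0);
    }
    return counter.load();
}
```

In the first function the counter ends at 6 as if it had never been cleared; in the second it ends at 1 or 0 depending on which side ran first, and neither update is silently overwritten.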
 
 \subsection{Compare and swap}
@@ -718,10 +717,10 @@ \subsection{Spurious LL/SC failures}
 Many lockless algorithms use \textsc{CAS} loops like this to atomically update a variable when calculating its new value is not atomic.
 They:
 \begin{enumerate}
-\item Read the variable.
-\item Perform some (non-atomic) operation on its value.
-\item \textsc{CAS} the new value with the previous one.
-\item If the \textsc{CAS} failed, another thread beat us to the punch, so try again.
+\item Read the variable.
+\item Perform some (non-atomic) operation on its value.
+\item \textsc{CAS} the new value with the previous one.
+\item If the \textsc{CAS} failed, another thread beat us to the punch, so try again.
 \end{enumerate}
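The four steps above can be sketched as a generic retry loop. The function `atomic_multiply` is a hypothetical example chosen for illustration (multiplication has no dedicated atomic operation in `std::atomic<int>`), and the sketch uses `compare_exchange_weak` on the assumption that the enclosing loop already handles retries.

```cpp
#include <atomic>

// Hypothetical helper: atomically replace `value` with `value * factor`
// and return the value that was stored.
int atomic_multiply(std::atomic<int> &value, int factor)
{
    int expected = value.load();     // 1. read the variable
    int desired;
    do {
        desired = expected * factor; // 2. non-atomic computation on the value
        // 3. CAS the new value against the one we read.
        // 4. On failure, `expected` is refreshed with the current value
        //    and the loop tries again.
    } while (!value.compare_exchange_weak(expected, desired));
    return desired;
}
```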
 If we use \monobox{compare\_exchange\_strong} for this family of algorithms,
 the compiler must emit nested loops:
@@ -988,14 +987,12 @@ \subsection{Acquire-Release}
 Order does not matter when incrementing the reference count since no action is taken as a result.
 However, when we decrement, we must ensure that:
 \begin{enumerate}
-\item All access to the referenced object happens
-\emph{before} the count reaches zero.
-\item Deletion happens \emph{after} the reference count reaches
-zero.\punckern\footnote{This can be optimized even further by
-making the acquire barrier only occur conditionally, when the reference
-count is zero.
-Standalone barriers are outside the scope of this paper,
-since they are almost always pessimal compared to a combined load-acquire or store-release.}
+\item All access to the referenced object happens \emph{before} the count reaches zero.
+\item Deletion happens \emph{after} the reference count reaches zero.\punckern\footnote{%
+This can be optimized even further by making the acquire barrier only occur conditionally,
+when the reference count is zero.
+Standalone barriers are outside the scope of this paper,
+since they are almost always pessimal compared to a combined load-acquire or store-release.}
 \end{enumerate}
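A minimal sketch of the two requirements above, assuming a hypothetical `RefCounted` type: the increment is relaxed because nothing is decided from its result, while the decrement uses acquire-release ordering so that earlier accesses to the object are ordered before the count reaches zero, and deletion is ordered after it.

```cpp
#include <atomic>

// Hypothetical reference-counted object; the count starts at one owner.
struct RefCounted {
    std::atomic<int> refs{1};
};

// Taking a reference: no action depends on the result, so relaxed suffices.
void retain(RefCounted &obj)
{
    obj.refs.fetch_add(1, std::memory_order_relaxed);
}

// Dropping a reference: returns true when the caller released the last one
// and is therefore responsible for deleting the object.
bool release(RefCounted &obj)
{
    // fetch_sub returns the previous value, so 1 means the count is now zero.
    return obj.refs.fetch_sub(1, std::memory_order_acq_rel) == 1;
}
```

A caller would typically write `if (release(*p)) delete p;` for a heap-allocated object.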
 
 Curious readers might be wondering about the difference between acquire-release and sequentially consistent operations.
@@ -1170,33 +1167,31 @@ \section{If concurrency is the question, \texttt{volatile} is not the answer.}
 (This is how most machines ultimately interact with the outside world.)
 \keyword{volatile} implies two guarantees:
 \begin{enumerate}
-\item The compiler will not elide loads and stores that seem ``unnecessary''\quotekern.
-For example, if I have some function:
-\begin{colfigure}
-\begin{minted}[fontsize=\codesize,autogobble]{cpp}
-void write(int *t)
-{
-    *t = 2;
-    *t = 42;
-}
-\end{minted}
-\end{colfigure}
-the compiler would normally optimize it to:
-\begin{minted}[fontsize=\codesize,autogobble]{cpp}
-void write(int *t)
-{
-    *t = 42;
-}
-\end{minted}
-\mintinline{cpp}{*t = 2} is often considered a \introduce{dead store},
-seemingly performing no function.
-However, when \texttt{t} is directed at an \textsc{MMIO} register,
-this assumption becomes unsafe.
-In such cases, each write operation could potentially influence the behavior of the associated hardware.
-
-\item The compiler will not reorder \keyword{volatile}
-reads and writes with respect to other \keyword{volatile} ones
-for similar reasons.
+\item The compiler will not elide loads and stores that seem ``unnecessary''\quotekern.
+For example, if I have some function:
+\begin{colfigure}
+\begin{minted}[fontsize=\codesize,autogobble]{cpp}
+void write(int *t)
+{
+    *t = 2;
+    *t = 42;
+}
+\end{minted}
+\end{colfigure}
+the compiler would normally optimize it to:
+\begin{minted}[fontsize=\codesize,autogobble]{cpp}
+void write(int *t)
+{
+    *t = 42;
+}
+\end{minted}
+\mintinline{cpp}{*t = 2} is often considered a \introduce{dead store},
+seemingly performing no function.
+However, when \texttt{t} is directed at an \textsc{MMIO} register,
+this assumption becomes unsafe.
+In such cases, each write operation could potentially influence the behavior of the associated hardware.
+
+\item The compiler will not reorder \keyword{volatile} reads and writes with respect to other \keyword{volatile} ones for similar reasons.
 \end{enumerate}
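For comparison with the `write` function in this hunk, here is a hedged sketch of a variant that takes a `volatile` pointer. The name `write_volatile` is hypothetical. With the pointee declared `volatile`, the compiler has to emit both stores, in program order; this does not make the function safe for communication between threads.

```cpp
// Hypothetical variant of write() whose target is volatile-qualified.
void write_volatile(volatile int *t)
{
    *t = 2;  // must be emitted: a volatile store cannot be dropped as dead
    *t = 42; // must be emitted after the store of 2
}
```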
 
 These rules fall short of providing the atomicity and order required for safe communication between threads.