A few weeks ago, the report Credit Suisse commissioned from Paul Weiss was released. It investigates how the bank’s relationship with Archegos Capital Management went so wrong–inflicting $5.5 billion of losses on Credit Suisse and a total of $10-billion in losses across the banking system as a whole. The crux of the debacle stemmed from “[Credit Suisse]’s relationship with Archegos Capital Management, the family office of Sung Kook ‘Bill’ Hwang, a former hedge fund manager:”

“Archegos, which used borrowed money from several banks to build massive equity positions, ultimately could not meet margin calls by its lenders when share prices dropped, triggering its default on March 25, 2021, costing [Credit Suisse] $5.5 billion in losses.”

This debacle for CS came fast on the heels of the early March meltdown of Greensill Capital, leaving CS holding a suite of funds nominally worth $10-billion, which bought securitized loans from Greensill, suddenly of indeterminate value. As Bloomberg summed up the mess:

Lex Greensill’s business has unraveled at a blistering pace, leaving a tangled trail of destruction all around it.
[In early March,] Greensill Capital filed for administration [bankruptcy] in the U.K., capping a stunning collapse for its founder. The bank that he owns in Germany has been shut down by regulators, the funds he ran in partnership with Credit Suisse are being liquidated and his firm is in the process of being broken up with its core perhaps sold to Apollo-backed Athene Holding Ltd.

Greensill himself has lost his billionaire status, and the myriad strands tangled up in the collapse involve everything from investment funds to the steel industry to Britain’s healthcare system.

As mind-boggling as these losses are, what both Archegos and Greensill were doing was entirely routine, even boring. These were everyday financial transactions (albeit at huge scale), and precisely the type of transactions a bank as well-staffed and sophisticated as CS should have been capable of monitoring and controlling for risk with the proverbial one hand tied behind its back:

Archegos was investing in widely traded and reasonably liquid securities using margin; and
Greensill was in the “factoring” business, a type of lending that has been around for centuries (some scholars even believe the “moneylenders” of the Bible were in the factoring business)–providing supply-chain finance to firms, accelerating payments to suppliers in return for a fee.

So the question imposes itself with some urgency: What could possibly have gone wrong? How could Credit Suisse have suffered these two massive losses within weeks of each other, both arising from fundamentally garden-variety activities for a bank with the scale and resources of CS?

To understand how CS blew itself up this way, we need to introduce the concept of “complicated” vs. “complex” systems, and to delve into the unusual failure modes of the complex variety.

This may seem far afield from the more mainstream content and coverage here on Adam Smith, Esq.–because it is!–but I’m publishing this column because I believe it exposes a systematic misapprehension in how our profession conceives of, analyzes, and assigns blame and responsibility for large-scale mishaps. To be specific, our profession’s concepts and assumptions about categories like “cause,” “responsibility,” and even “reasonable [behavior]” and “notice/warning” cannot grapple with complex systems.

It’s even worse than being inadequate to the situational analysis: By applying our familiar thought patterns developed and nurtured around “complicated” systems to “complex” ones, we employ profoundly unsuitable diagnostic tools. If we cannot recognize and comprehend complex systems, we will do our clients a disservice, twist truth, and repudiate justice.

Engineers who design systems and the forensic analysts who follow when things go suddenly and massively south distinguish between “complicated” and “complex” systems, because they each have quite distinct characteristics and quite different failure modes:

Complicated systems contain many components and dynamic, ever-shifting relationships among those components; they can operate across, potentially, an extremely wide range of circumstances, but their behavior is ultimately predictable (given XYZ set of inputs, the system will perform to the corresponding parameters designed into it for those inputs). And, if a complicated system fails, the cause can be tracked down and identified with confidence. In other words, with hindsight, the precise cause of failure can be pinpointed and an effective “fix” designed in. A high-performance jet engine is complicated, but the ways in which it can fail are not mysterious.
Complex systems, by contrast, can shift from “normal” performance to a disastrous mode for no immediately apparent reason. A power plant, say, or a large hospital is complex.

A decade ago the Harvard Business Review wrote about this topic in the business context, in “Learning to Live with Complexity.” They define complicated systems in a sound and straightforward way:

Complicated systems have many moving parts, but they operate in patterned ways. […] Practically speaking, the main difference between complicated and complex systems is that with the former, one can usually predict outcomes by knowing the starting conditions. In a complex system, the same starting conditions can produce different outcomes, depending on the interactions of the elements in the system.

Global commercial banks are squarely “complex” systems, as the HBR article recognizes in the context of Citi’s near-meltdown at the onset of the global financial crisis (GFC):

It is very difficult, if not impossible, for an individual decision maker to see an entire complex system. … Many have argued that Citigroup’s near collapse, in 2008, stemmed from an organizational design that locked people into silos; employees with information about the consequences of the bank’s involvement in subprime lending were not connected to those making strategic decisions. It didn’t help, of course, that the CEO at the time, Chuck Prince, conspicuously chose to ignore any warning signs of excessive leverage, as a now-famous remark to the Financial Times in 2007 demonstrates. “As long as the music is playing, you’ve got to get up and dance,” Prince said, adding, “We’re still dancing.”

The daunting challenge is that our minds are designed to think in terms of complicated and linear systems which conform to a Gaussian (bell) curve of behavior, and not in terms of complex or exponential systems where the “power curve” of behavior means extremely rare events take on outsize, even overwhelming, importance. As HBR puts it:

In complex systems, events far from the median may be more common than we think. Tools that assume outliers to be rare can obscure the wide variations contained in complex systems. In the U.S. stock market, the 10 biggest one-day moves accounted for half the market returns over the past 50 years.
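To make the contrast between a bell curve and a “power curve” concrete, here is a minimal sketch comparing how quickly the probability of an extreme event shrinks under each. The tail exponent `alpha = 3.0` and the scale `x_min = 1.0` are illustrative assumptions, not figures from HBR or from any market data, and the two distributions are not calibrated to each other; the point is only the shape of the decay.

```python
import math


def gaussian_tail(k: float) -> float:
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))


def power_law_tail(k: float, alpha: float = 3.0, x_min: float = 1.0) -> float:
    """P(X > k) for a Pareto-type tail: (x_min / k) ** alpha when k >= x_min.

    alpha and x_min are assumed values chosen purely for illustration.
    """
    if k < x_min:
        return 1.0
    return (x_min / k) ** alpha


if __name__ == "__main__":
    # Compare how likely an event "k units out" is under each model.
    for k in (2, 5, 10):
        g = gaussian_tail(k)
        p = power_law_tail(k)
        print(f"{k:>2} units out: gaussian {g:.2e}, power law {p:.2e}")
```

Under the bell curve the odds of an extreme event collapse almost immediately as you move outward; under the power-law assumption they shrink far more slowly, which is why tools that treat outliers as negligible can badly understate the risk.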

With that groundwork laid, let’s return to Paul Weiss’ executive summary of what happened at CS vis-a-vis Archegos, and hold the thought in your mind that CS is a complex system:

The Paul, Weiss team ultimately found no evidence of fraud or illegal conduct by individuals or the bank. Nor, the report concluded, was this a situation where the architecture of risk controls and processes was lacking, or the existing risk systems failed to operate sufficiently to identify critical risks and related concerns. Instead, the report found that senior managers persistently failed to address risks connected with trades made by Archegos. The losses “are the result of a fundamental failure of management and controls in [its] investment bank,” Paul, Weiss wrote. “The business was focused on maximizing short-term profits and failed to rein in and, indeed, enabled Archegos’ voracious risk-taking.”

Among the key conclusions in the report, the Paul, Weiss investigation found a failure to effectively manage risk in the investment bank’s prime services business by both the first and second lines of defense, as well as a lack of risk escalation. In the same business, it also found a failure to control limit excesses across both lines of defense as a result of an insufficient discharge of responsibilities in the investment bank and in risk, as well as a lack of prioritization of risk mitigation and enhancement measures, such as dynamic margining.

In other words, not only was there no illegal or even manifestly improper behavior, but the “architecture of risk controls, processes, and systems” was sound. The failures were compound ones of shirking or short-changing responsibilities at multiple levels in what became a disastrous cascade. The Wall Street Journal succinctly describes the escalating leeway CS extended to Archegos in retrospectively-chilling terms (“Credit Suisse Failed to Act on Archegos Risks, Report Says”):

Credit Suisse began waiving risk protections related to Mr. Hwang well before Archegos collapsed. In 2017, changes in Mr. Hwang’s trading prompted a 10% margin call, a common request by a bank to post more cash to back up positions as they became riskier. Credit Suisse waived the requirement and created a “bespoke weekly monitoring of Archegos.”

Then in 2019, Archegos asked to lower its margin requirement, saying competitors were offering a better deal. The margin on the stock-linked derivatives he liked to invest in, known as total return swaps, dropped to 7.5% of the total invested from around 20%.

In return, Archegos agreed to give Credit Suisse more power to close out its positions with little notice. But the report says these protections were “illusory, as the business appears to have had no intention of invoking them for fear of alienating the client.”
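The arithmetic behind the margin cut from around 20% to 7.5% is worth spelling out. The sketch below is a deliberately simplified model: it assumes a single position, static margin, and no intraday margin calls, and the notional of 100 is a hypothetical figure, not one taken from the report. Under those assumptions, the margin rate is both the inverse of the client’s leverage and the size of the price drop at which the lender starts absorbing losses.

```python
def leverage(margin_rate: float) -> float:
    """Exposure per unit of margin posted (simplified: 1 / margin rate)."""
    return 1.0 / margin_rate


def uncovered_loss(notional: float, margin_rate: float, price_drop: float) -> float:
    """Loss left with the lender after the posted margin is exhausted.

    Assumes static margin and a single position; real prime-brokerage
    margining is more involved.
    """
    posted_margin = notional * margin_rate
    position_loss = notional * price_drop
    return max(0.0, position_loss - posted_margin)


if __name__ == "__main__":
    notional = 100.0   # hypothetical position size
    price_drop = 0.15  # hypothetical 15% fall in the underlying shares
    for rate in (0.20, 0.075):
        print(
            f"margin {rate:.1%}: leverage {leverage(rate):.1f}x, "
            f"lender's uncovered loss {uncovered_loss(notional, rate, price_drop):.1f}"
        )
```

In this toy model a 15% fall is fully absorbed by the client’s collateral at a 20% margin, while at 7.5% half of the loss lands on the lender.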

The litany of omissions and neglect is a lengthy one:

In September 2020, a credit risk manager escalated concerns about the trades to his supervisor; nothing was done.
Early in 2021, credit risk managers cut Archegos’s internal credit rating citing the firm’s “high performance volatility, concentrated portfolio, and increased use of leverage.” CS “discussed” asking Archegos for more margin but never did.
In March, the counterparty oversight committee again discussed Archegos, by then the prime brokerage unit’s largest client in terms of position size. The committee decided Archegos would be moved to a dynamic margining system within the next couple of weeks. This never happened because–seriously–Archegos kept cancelling calls CS had set up to discuss the change.
Credit Suisse returned $2.4 billion in margin collateral to Archegos between March 11 and March 19.

What are we to make of these high-level findings (Paul Weiss) and timeline (the WSJ)?

Without bending the analogy past the breaking point, compare the CS/Archegos failure to, say, how the Three Mile Island nuclear plant could fail or the Deepwater Horizon disaster occur:

Paul Weiss’ finding that there was no fraud or illegality would be equivalent to a finding that there was no sabotage or intentional misconduct by the plant operators.
Similarly, the finding that “the architecture of risk controls and processes” at CS was fundamentally sound means the design of the power plant or the oil well was intrinsically sound and the embedded safeguards well designed and capable of functioning as expected.

But the Journal’s litany of relaxed, waived, neglected, and suspended safeguards–by a series of individuals doubtlessly well-intentioned in each moment at every seemingly innocent and inconsequential decision-point–nails our hypothesis that CS is a complex and not “merely” a complicated system.

From the engineering profession, what can we learn about complex systems and their failure modes?

My source material for what immediately follows is Richard Cook, MD’s How Complex Systems Fail, published in 2000 by the Cognitive Technologies Laboratory at the University of Chicago. I hope Dr. Cook forgives my extensive excerpt, but in consultation with a friend who is a lifelong engineer now at the top of his field, and a true pro, Cook’s is head and shoulders the most comprehensive and yet succinct treatment of complex systems out there, albeit, at first blush, counterintuitive (see our point above about this being an exotic and unfamiliar mode of thought for us humans).

I have edited Cook’s piece extensively but have not altered any of his words. As publisher of this column, I also have taken the liberty of highlighting issues that I think are especially germane to the lawyer audience (thus: ***)

Complex systems are intrinsically hazardous systems. ***

All of the interesting systems (e.g. transportation, healthcare, power generation) are inherently and unavoidably hazardous by their own nature. (Lawyers think “hazardous” means flawed and defective; it does not.)

Complex systems are heavily and successfully defended against failure

The high consequences of failure lead over time to the construction of multiple layers of defense against failure. The effect of these measures is to provide a series of shields that normally divert operations away from accidents.

Catastrophe requires multiple failures – single point failures are not enough ***

The array of defenses works. System operations are generally successful. Put another way, there are many more failure opportunities than overt system accidents. (Fingering individual “points of failure” for blame profoundly misapprehends how complex systems function; there is almost an infinite number of possible failure points but the system safeguards itself from any one causing disaster.)

Complex systems contain changing mixtures of failures latent within them. (See above)

The complexity of these systems makes it impossible for them to run without multiple flaws being present. Eradication of all latent failures is limited primarily by economic cost but also because it is difficult before the fact to see how such failures might contribute to an accident.

Post-accident attribution to a ‘root cause’ is fundamentally wrong. ***

Because overt failure requires multiple faults, there is no isolated ‘cause’ of an accident; [identifying] the ‘root cause’ of an accident is impossible. The evaluations based on such reasoning as ‘root cause’ do not reflect a technical understanding of the nature of failure but rather the social, cultural need to blame specific, localized forces or events for outcomes.

Hindsight biases post-accident assessments of human performance.***

Knowledge of the outcome makes it seem that events leading to the outcome should have appeared more salient to practitioners at the time than was actually the case. This means that ex post facto accident analysis of human performance is inaccurate. The outcome knowledge poisons the ability of after-accident observers to recreate the view of practitioners before the accident of those same factors. It seems that practitioners “should have known” that the factors would “inevitably” lead to an accident.

Human operators have dual roles: as producers & as defenders against failure.

The system practitioners operate the system in order to produce its desired product and also work to forestall accidents. Outsiders rarely acknowledge the duality of this role. In non-accident filled times, the production role is emphasized. After accidents, the defense against failure role is emphasized. At either time, the outsider’s view misapprehends the operator’s constant, simultaneous engagement with both roles. (Note CS’s desire to stay on Archegos’ good side and to accommodate its wishes–“[revenue] production mode.”)

All practitioner actions are gambles.***

After accidents, the overt failure often appears to have been inevitable and the practitioner’s actions as blunders. That practitioner actions are gambles appears clear after accidents; in general, post hoc analysis regards these gambles as poor ones. But the converse: that successful outcomes are also the result of gambles; is not widely appreciated.

Change introduces new forms of failure.

The low rate of overt accidents in reliable systems may encourage changes, especially the use of new technology. Because new, high consequence accidents occur at a low rate, multiple system changes may occur before an accident, making it hard to see the contribution of technology to the failure.

Views of ‘cause’ limit the effectiveness of defenses against future events.***

Post-accident remedies for “human error” are usually predicated on obstructing activities that can “cause” accidents. These end-of-the-chain measures do little to reduce the likelihood of further accidents. In fact that likelihood of an identical accident is already extraordinarily low because the pattern of latent failures changes constantly. Instead of increasing safety, post-accident remedies usually increase the coupling and complexity of the system. This increases the potential number of latent failures and also makes the detection and blocking of accident trajectories more difficult.

Failure free operations require experience with failure.***

Recognizing hazard and successfully manipulating system operations to remain inside the tolerable performance boundaries requires intimate contact with failure. More robust system performance is likely to arise in systems where operators can discern the “edge of the envelope”. It also depends on providing calibration about how their actions move system performance towards or away from the edge of the envelope. (Guardrails can help avoid catastrophe, but training drivers to deal with “edge” conditions is more universally powerful and preserves the system intact.)

Here endeth our reading.

Let’s bring it back to Paul Weiss’s autopsy of the CS/Archegos debacle. Understanding the complexity of CS as a “system,” it appears more justifiable than ever that they found no illegality, no intentional misconduct, and a host of decisions and non-decisions that collectively produced a $5.5-billion loss.

I submit that Paul Weiss’s analysis, premised on categorizing CS as “complex” and not merely “complicated,” gave permission to the lawyers working on the autopsy to reject our oh-so-familiar assumptions about cause, responsibility, notice, and so forth. Applying those comfortable and deeply ingrained thought patterns would lead directly to a profound misapprehension of the essence of the CS/Archegos debacle.

It’s safe to wager that in our 21st Century world, the preponderance of complex systems across our society and economy is only going to grow. Be prepared to see them for what they are.

GE9X jet engine (the most powerful on the commercial market): Complicated but not complex

1 Comment

Mark J Logsdon on September 16, 2021 at 4:55 pm

Excellent report, Bruce. In addition to the two valuable sources (HBR on p 2 and Cook’s report on p 3) you linked, there is a classic, accessible, book-length work, Charles Perrow’s Normal Accidents: Living with High-Risk Technologies (Princeton University Press, 1999) that some ASE readers may find interesting and useful. It turns out that there are solid grounds in the science of cognition why even “the smartest guys in the room” cannot intuit how nonlinear systems with feedback and close coupling will work, nor readily assign blame when consequences move from risk to an actual event.

If – more likely, when – we act in a fiduciary role, do we need to inform ourselves how complex systems may be involved and undertake to include our understanding and insights in our advice? Thanks to ASE for raising the issues and offering some useful background and guidance.

Credit Suisse and the Failure of Complex Systems