We Need a Just Culture at JFK

8 03 2010

JFK airport is finally in the news, again, for something other than delays or a crash. It managed to make the headlines after a controller let his young son repeat instructions on frequency to departing aircraft, which is something that both Don Brown, of Get The Flick, and Rob Mark, of JetWhine, feel posed no significant threat to airport operations. Don is a retired Atlanta Center controller with 25 years under his belt, including time as the union safety representative. Rob is a reputable consultant with just as much to say about piloting and controlling as he has experience doing each. I trust their judgement and think they deserve a little more than just a fleeting glimpse if you’re looking for something to read.

The argument against firing or suspending the controller and his supervisor seems to center around the fact that this is an isolated incident (I’m pretty sure there’s no underground, grass-roots effort by controllers to have their children key the mic until they can negotiate a better contract). But like any other recent problem with the aviation industry, the dilemma is more systemic and chronic than localized and short-term. The safety culture of the FAA is suffering at the hand of flawed public perception and higher-ups who are feeling the squeeze from people who can’t see the human fallibility in occurrences like this.

An organizational culture is a set of behavioral norms (“the way we do things around here”, says Reason). It lies at the intersection of the way people feel (shared values and beliefs) and how the system operates (structure and control). Sometimes with this there is a tense relationship between the top and the bottom of the hierarchy, which is why we’re seeing a lot of talk about how to develop an effective safety culture that can dampen the likelihood of additional risks popping up because people are not on the same page. An informed safety culture is generally considered to have four sub-cultures: a reporting culture that encourages honest disclosure; a flexible culture that can flatten the hierarchy and grant autonomy to people at the sharp-end; a learning culture that is proactive and competent in drawing smart conclusions; and a just culture that fosters trust and bolsters the other three subcultures while still drawing a line somewhere.

Just Culture is such an important, yet nebulous idea that it warranted a book by Dr. Sidney Dekker, a 737NG first officer and professor of human factors and flight safety at Lund University School of Aviation in Sweden. His book is not about diversity in Queens, but rather “Balancing Safety and Accountability” and how easily our society (or an organization, or the FAA) can shoot itself in the foot when it straddles the line between pragmatic and procedural reactions. Should the JFK controller be punished simply because he did something that is frowned upon? Or should he be punished only if there is something legitimate and valuable to be gained from his termination? I don’t even know if letting someone else speak over the radio is explicitly prohibited in the controller’s book of “Thou Shalt”.

In this instance, a smart and just culture would weigh the pros and cons of going after the controller by asking questions like “what is the informed community saying?” and “how will our decision impact our relationship with our workers?”. Before the pages in his book are even numbered with numbers (it’s on xii), Dr. Dekker says that “Unjust responses to failure…are not about bad performance. They are about bad relationships”.

No good parent would send a kid to his room for doing something that was neither bad nor against what he had been told to do. When I was younger, I was only ever punished for beating up my sister (now I just do it without getting punished). But mom and dad never used my middle name when I forgot to save some money during the week.

So maybe, instead of rationalizing the punishment, we need to reason with reality and develop professional tolerances that encourage cooperation for when the system really demands it. Severely punishing the JFK controller for something more dumb and petty than violent and egregious can leave the brotherhood of controllers bitter and resentful towards the system in which they work.

And who wants to be safe when you’re too busy being bitter and resentful?


P.S. – “Just Culture” was in the cockpit of US1549. Captain Sullenberger was reading it during his trip when he landed in the Hudson. He was given another copy – as well as the key to New York City – by Mayor Bloomberg.


Rep. Murtha’s Contribution to the Checklist and System Safety

11 02 2010

This was a headline earlier this week.  Read the first two paragraphs.

Normally I’d just gloss over the details of Rep. Murtha’s passing and accept the insinuation that operating room mishaps like this are not the norm.  But I just finished reading The Checklist Manifesto by Dr. Atul Gawande, and all of a sudden the term ‘complications’ means so much more.

A routine procedure like the laparoscopy Rep. Murtha was having is fallible because it is routine, and because the task at hand is so minimally-invasive and trite compared to, say, an emergency lobotomy.  It’s simple.  Or, at least, straightforward enough to not give surgeons stage fright in the operating room.  So why did this happen?

When we say that there were complications, we admit that the problem is one of complexity.  Complexity refers not only to there being many players, which is true – the proper tools, personnel, and preparation need to be in place – but also to the way they must interact for there to be a reliably successful outcome.  With people, this interaction is teamwork and the ability to manage all available resources cohesively and quickly.

A startling example Gawande gives of this is that one item on the medical checklists being used in many institutions around the world makes sure – yes, makes sure – that everybody in the operating room knows everybody else’s names.  Introductions.  Apparently, formalities like this were routinely ignored until something happened, at which point communication rose to little more than stumbling commands to nameless faces.

James Reason’s 1997 “Managing the Risks of Organizational Accidents” reminds us to ask not why the failure of the surgeon who “hit his intestines” happened, but how it failed to be corrected, especially when we already know that human nature will make good on any opportunity to err under pressure to succeed with only one chance, regardless of skill, know-how, or determination.  It happens that checklists allow an “activation phenomenon” to occur when each doctor, nurse, anesthesiologist, or resident is allowed to contribute.  People will begin to feel valuable and important to the cause (the patient on the operating table) and will be more inclined to speak up if they see something wrong.  If sharp-end operators are going into the workplace with the mindset that they are completely independent and fully capable of performing alone, then they are more ignorant of the existence of critical dependencies in the medical system than I originally thought.

Perhaps the hairy eyeball for things like checklists and communications comes from the fact that power, or at least the feeling of it, can be lost.  Which is true, granted.  The decentralization of authority is featured by Reason when he describes a “flexible culture” as a requirement for an organizational safety culture.  When emergencies arise, the hierarchy needs to collapse and front-line workers need to be autonomous and trusted to handle the situation promptly.  Seeking approval for quick corrective action wastes valuable time and can have dire consequences.  Gawande discusses this at length when he talks about the federal government and the Hurricane Katrina fiasco.

Also, many industries and professions become obsessed with improving individual components or addressing specific concerns to such a point that they can’t see the forest for the trees; they can’t see latent, systemic threats.  Or worse, they can, but they’ve got a bureaucratic wedgie big enough to keep them from being able to do anything about it in any effective sense.  An army of external distractions and chaotic variances has been encroaching on the safe and simple practices people and organizations have learned to take for granted.  And that’s why we can’t sweat the stupid stuff any longer.

How the nick on Rep. Murtha’s intestines failed to be corrected is probably a result of too little being done too late.  Reactive safety procedures are effective only in pacifying the devil and angel team on our shoulder who scream together, “Well, we tried!”.  When the patient is gushing blood is not the time to start thinking about what to do.  This is where Gawande makes his case for the checklist as a tool that can “instill a discipline of higher performance” consistent with predicting failure and preparing for the worst.  Had the surgeon preempted a slip of his hand (which I understand is a common surgical occurrence) with a plan for coordination and by briefing his staff, then we would begin to see the way each player acts as part of a collaborative unit instead of as just a collection of players.

“Man is fallible, but maybe men are less so” says Dr. G.

This is all to say that we can do with less professional arrogance (to be blunt) and fewer people who believe that “our jobs are too complicated to reduce to a checklist”.  Again, Gawande says of checklists: “They are quick and simple tools aimed to buttress the skills of expert professionals”, not to belittle or replace them.

I have a grand sum of zero experience with medicine, and yet this post was fairly easy for me to think about.  I just pretended that everything had to do with aviation.

It appears that like gall-bladder surgery, flying an airplane is simple, too.  The acting-as-a-crew part, or the focusing-in-an-emergency part is what’s difficult.  That’s why the first item on the emergency checklist for an engine failure in any single-engine aircraft is stupid: FLY THE AIRPLANE.

Rest in peace, Representative Murtha.


Assuring safety with “the other” SMS

5 02 2010

I’m currently reading a relatively recent advisory circular from the FAA, AC120-92 [PDF], “Introduction to Safety Management Systems for Air Operators”.  It’s for a research project, but I have no shame in admitting that I’d probably be browsing through it even if I didn’t have a grade pushing me along.

A safety management system is the regulatory way of saying to airlines, “Look – your guys’ operations have become so complex and diverse that we’re going to grant airline management some autonomy in the safety department.  Using this framework [plunks a 40-page document on the desk] we want you to establish a safety program that can adapt and evolve with time.  Report to us.  Due soon.  Thanks.”  That’s the quick and dirty of it.

James Reason is just a small mouse in a big world of Swiss Cheese models.

Nick Sabatini, the former associate administrator for aviation safety for the FAA, spoke at IASS (International Air Safety Seminar – I know…) in Beijing last fall.  He explained SMS in more eloquent terms as an “evolution of safety” that is “culture-driven, highly measurable, and analytical”.

Which is true.  The culture-driven part, especially, as it signals a shift in focus away from a sheltered Orwellian workplace that some airlines have unfortunately adopted to a more natural and approachable environment.  A culture comes to be from common beliefs and values (towards safety, say) that develop organically and authentically.  From that, behavioral norms begin to emerge to the point where behavior that is universally embraced is also behavior universally practiced.  But it’s a gradual process.  (See James Reason’s seminal “Managing the Risks of Organizational Accidents” p. 192 for more.  He made the Swiss Cheese model.)

So an SMS is constructed around four central ideas: safety policy (the commitment from management); safety risk management (identification, analysis, and control of hazards); safety assurance (monitoring of effectiveness); and safety promotion (culture).  What caught my attention was something on page seventeen of the AC about safety assurance and how to get the information for proper decision-making:

The highlighting is mine and the yellow quote bubble shows that I left a note in the margin.  My note has to do with the fact that, by this definition, safety assurance is still relatively unchanged.  The very safety intelligence that drives the whole SMS cannot be fenced in by the bureaucracy and regimen that is consistent with pre-SMS safety philosophy if the whole project hinges on a culture that is dynamic and constantly evolving.

Here, safety assurance essentially takes place in an echo chamber where decision-makers are exposed only to the information the system is designed to share.  Top-down analysis and conclusions may fail to recognize more systemic, yet casual and seldom-reported impediments to safety.  Employee reporting systems are, indeed, effective in collecting information, but they always have and always will amount to a broad channel that feeds from single employees to many in management.  With just the Aviation Safety Reporting System (ASRS) and other structured interactions, valuable insight from people at the sharp-end of the airplane is at risk of being stovepiped into obscurity.

For safety to truly be generative, people with information should not be bound by formal reporting systems.  Greater, broader platforms of communication can elicit productive discussions about a hazard and even shed light on new or potential threats.

This is where a discussion about Enterprise 2.0 and emergent media (that is, media that produces emergent phenomena) kicks off.  But I’d like to hear your reactions about what I’ve said so far.  Leave comments!


Fledgling and chaotic, what’s it all mean?

3 01 2010

It may strike some as odd that a blog about safety and security in aviation would shoot off references to chaos and disorder.

This blog’s reference to chaos is more scientific and quantifiable than naive and presumptive.  In the 1960s, Edward N. Lorenz coined the term “butterfly effect” to convey the gist of a theory called ‘chaos’.  Chaos theory takes a stab at explaining how the outcome of a system that changes with time – like a flight – can vary wildly if changes are made to the original circumstances surrounding that system.

The original connection was with the idea that the flap of a butterfly’s wings can forever change the course of weather (that should clear some things up), but for the purposes of this blog, you have to view aviation operations for what they are: a system that is dynamic with many external attractors and not linear or inherently prim and proper.  As meticulously-orchestrated, precisely-executed, and well-intentioned as every flight may appear to be, even that 45-minute jump from NYC to Boston is more than just sequential pulling and pushing on the yoke.  An “unsafe” event can still occur if connecting passengers are late and the crew is excessively interrupted during their preflight cockpit flows.  Or if a dispatcher calls in sick at the last minute.  Or if a butterfly skipped a beat in Argentina.
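For anyone who wants to see the butterfly at work, here is a minimal sketch in Python.  It is not Lorenz's weather model; it swaps in the logistic map, a much simpler textbook example of chaos, and the function names, the starting value of 0.2, and the one-in-a-billion nudge are all my own assumptions, chosen only for illustration.  Two runs begin almost identically and are then compared step by step.

```python
def logistic_trajectory(x0, r=4.0, steps=60):
    """Iterate x -> r * x * (1 - x) and return every value visited.

    The logistic map with r = 4 is a standard toy example of chaos.
    It stands in here for a system that changes with time.
    """
    values = [x0]
    for _ in range(steps):
        values.append(r * values[-1] * (1.0 - values[-1]))
    return values


def max_gap(a, b):
    """Largest absolute difference between two runs, compared step by step."""
    return max(abs(x - y) for x, y in zip(a, b))


# Two "flights" whose original circumstances differ by one part in a billion.
baseline = logistic_trajectory(0.2)
nudged = logistic_trajectory(0.2 + 1e-9)

early_gap = max_gap(baseline[:10], nudged[:10])  # first ten values only
late_gap = max_gap(baseline, nudged)             # the whole run

print(f"largest gap over the first ten values: {early_gap:.2e}")
print(f"largest gap over the whole run:        {late_gap:.2e}")
```

The two runs are practically indistinguishable at first, and then the tiny nudge compounds until they have nothing to do with each other.  That is the sense in which a small disturbance early on can matter far down the line.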


Many of the bumps with baggage carts, misinterpreted clearances, runway incursions, technical malfunctions, and sudden cushionless meetings of stone and metal (thanks, Mr. Gann) have palpable causes, an unsurprising chain of events, and conceivable mitigation strategies that are well within the realm of feasibility for aircraft operators but were never implemented or otherwise acted on.  A focus on human factors and risk management has never been far from the industry and there is no shortage of acronyms to describe the work that has already been done.  After ASAP, FOQA, IEP, CRM, and SMS, chaos can expand on the famed Swiss Cheese model by making us aware of the way imperfections in all of our safeguards interact and work together.

The aviation industry is not greater than the sum of its parts – it IS the sum of its parts.  As any poster in a flight school will say, safety is not an accident, and most definitely will not happen by any sort of ‘strength in numbers’ logic.  My effort here is to realize the inherent complexity of aviation systems and their vulnerability to shortcomings in any and all areas of a given operation.

Aviation happens to be impervious to safety because of how intricate it is.

Please join me with your thoughts, comments, and criticisms.  I’m really looking forward to discussing and learning.

Brian Futterman