A Trip to Monte Carlo

Sep. 19 2025 - Grant Pettit

DS-1

T H Hill. The name brings to mind years of oilfield experience and expertise, visions of blue-clad technicians bird-dogging inspectors and saving money and problems for customers, and memories of some of the most important technical advances made in the drilling industry.

Or, maybe more likely, a “huh? Who’s that?”

I’m proud, I think justifiably, of the work done for our industry by the company that hired me out of college. But we’ve always been a niche supplier, laser-focused on drilling and completions in oil & gas, and basically nothing else.

So when 2012 rolled around and T H Hill was bought by Bureau Veritas, it opened up a whole new world of industries and problems and techniques. I could do more things! I can read good, and do other things good, too!

For instance: let’s say you’re building an LNG plant. (Liquefied Natural Gas, that is.) And your main goal is to ~~bring energy to the people~~ make an obscene amount of money. You build the plant so that it is capable of manufacturing 15 million tons per annum (MTPA) of LNG. So you estimate the market price of LNG, multiply by 15 million, and mentally put that much money in your bank account and start looking at yachts.

Then some wiry accountant points out that there will be times when that plant shuts down due to maintenance, something like once a year. (Fine, I’ll subtract that much money.)

“Also, is the electricity grid infallibly stable?” (Hm, minus a bit more money.)

“What about unplanned outages?” (unngghh …)

“What about …” Ok look accountant-man, before you turn my yacht into a pool noodle, can we stop telling doomsday versions of what might happen? Can we actually have a realistic estimation of the production Imma get?

“I don’t know,” says accountant-man, “can we?”

Ah, I see. That would be a better investment story, if I could back up my expected production revenue with data and thought, wouldn’t it?

In a simple world we’d just have an uptime percentage—96% of the time the thing’s running, so the actual production will be 96% of its nameplate capacity.

In this world we have multiple trains, long lists of machines, various redundancies, variable repair times, variable spare-part availabilities … the mind boggles. There are really just too many branches to collapse all of that into one percentage using any reasonable analytical method.

So, we go to the casino.

(Ha! You’re back now, aren’t you?)

Specifically the Monte Carlo casino in Monaco, where Stan Ulam’s uncle used to go gamble. Stan thought of a way to solve really complicated problems using simple random-number simulations. It’s maybe not mathematically elegant, but with the staggering computing power we have at our disposal we can basically brute-force our way through any complex system. It’s called the Monte Carlo method.

See, if you have a system like this:

[Figure: block diagram of the example system]

We might be able to gather data about the reliability of each one of those boxes. (In real life there’d be way more boxes, and sub-boxes, and other stuff, but let’s keep this simple, shall we?) If we have a Mean Time To Failure (MTTF) and a Mean Time To Repair (MTTR) we can know the average availability of each block (= MTTF / (MTTF + MTTR)). But we can’t easily figure out how things are related in percentage terms—will the other train fail while we’re working on the first train? Does a failure mean the whole plant’s shut down, or just one train, or are we limping along with reduced production even through our busted train? Especially if we’re offshore or in a remote location, how long does it take the repair crew to even get there to start the repair process?
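The per-block availability formula above is simple enough to sketch in a few lines of Python. The function name, the `logistics_delay_hours` parameter, and every number below are made up for illustration; the delay term is just one assumed way to account for the time the crew spends getting to site before the repair clock starts.

```python
def availability(mttf_hours: float, mttr_hours: float,
                 logistics_delay_hours: float = 0.0) -> float:
    """Average availability of one block: MTTF / (MTTF + MTTR).

    logistics_delay_hours is a hypothetical extra term for crew travel
    or spare-part lead time; it is simply added to the repair time.
    """
    if mttf_hours <= 0 or mttr_hours < 0 or logistics_delay_hours < 0:
        raise ValueError("MTTF must be positive; repair times cannot be negative")
    downtime_per_failure = mttr_hours + logistics_delay_hours
    return mttf_hours / (mttf_hours + downtime_per_failure)


# Illustration values only, not real equipment data.
onshore = availability(mttf_hours=4000.0, mttr_hours=100.0)
offshore = availability(mttf_hours=4000.0, mttr_hours=100.0,
                        logistics_delay_hours=100.0)
print(f"onshore:  {onshore:.3f}")
print(f"offshore: {offshore:.3f}")
```

Same machine, same repair job, but the remote location comes out less available purely because of the wait. That is the kind of interaction a single percentage per block does not capture on its own.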

If we define percentages for each of those possibilities, then we can build this Huge, Disgusting Reliability Block Diagram (HDRBD … I’m joking, I’m getting tired of all these acronyms, it’s just RBD) that will describe the relationship algorithmically, but not analytically. That is, I know all the “if … then” paths that are possible, but I don’t know how to turn it into one big equation.

Good ol’ Stan says to just use a random number generator at each decision point along that RBD. You know, just pull the lever. You get some random thing based on the percent likelihood, which then sends you farther down the RBD path. When you get to the end … you have one possible outcome for next Tuesday. Do it 10,000 times (thank you computers), and you have 10,000 possible outcomes, distributed pretty much the way reality is likely to play out.
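Here is a minimal sketch of that lever-pulling in Python, under some loud assumptions: a toy plant with one shared power supply feeding two identical trains, one-day time steps, failures and repairs that happen at a constant rate, and MTTF/MTTR values (in days) that are pure invention. It is not how any particular RAM package works, just the idea in miniature.

```python
import math
import random
import statistics

# Hypothetical plant: every number here is an illustration value (days).
BLOCKS = {
    "power":   {"mttf": 200.0, "mttr": 1.0},
    "train_a": {"mttf": 60.0,  "mttr": 3.0},
    "train_b": {"mttf": 60.0,  "mttr": 3.0},
}

# Daily chance of failing (if up) or being repaired (if down),
# assuming constant failure and repair rates.
DAILY_ODDS = {
    name: (1.0 - math.exp(-1.0 / spec["mttf"]),
           1.0 - math.exp(-1.0 / spec["mttr"]))
    for name, spec in BLOCKS.items()
}


def simulate_year(rng: random.Random, days: int = 365) -> float:
    """One trial: fraction of nameplate production achieved over the period."""
    up = {name: True for name in BLOCKS}
    produced = 0.0
    for _ in range(days):
        # Pull the lever once per block per day.
        for name, (p_fail, p_repair) in DAILY_ODDS.items():
            if up[name]:
                if rng.random() < p_fail:
                    up[name] = False
            elif rng.random() < p_repair:
                up[name] = True
        # The "if ... then" logic of the toy RBD: no power means no
        # production; otherwise each running train gives half of nameplate.
        if up["power"]:
            produced += 0.5 * (up["train_a"] + up["train_b"])
    return produced / days


def run_trials(n_trials: int = 2000, seed: int = 42) -> list:
    """Repeat the one-year trial many times; return sorted results."""
    rng = random.Random(seed)
    return sorted(simulate_year(rng) for _ in range(n_trials))


results = run_trials()
mean = statistics.mean(results)
p10 = results[int(0.10 * len(results))]
p90 = results[int(0.90 * len(results))]
print(f"mean production: {mean:.1%} of nameplate")
print(f"10th to 90th percentile: {p10:.1%} to {p90:.1%}")
```

The default of 2,000 trials just keeps the run quick in plain Python; bump `n_trials` to 10,000 if you have the patience. The useful output is the spread between the percentiles, not only the mean.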

And that, my friends, is what we wanted. A description of the most likely spread of production values over the operational time that our accountant was interested in, leading to a pretty reasonable guess as to the amount of money I’m going to make.

That process (using Monte Carlo methods to estimate the overall performance of a given system modeled by a particular RBD) is basically a software program. There are several out there, and honestly your dorky 12-year-old nephew could build you one pretty quickly. The engineering comes, as it often does, when you build the model. If you build a complex-looking thing but plug in 100% reliability for every block, you get beautifully stupid results. If you insist that every failure will require an alien invasion and an act of Congress to fix, ditto. Garbage in, garbage out.

Luckily this type of RAM study (that’s what it’s called, by the way, a “Reliability, Availability, and Maintainability” study) has been around long enough that there are publicly accessible databases that will give you typical failure rates for most of the equipment processing plants use. Equipment manufacturers will also gather their own data hoping to beat the average so that they can claim higher levels of reliability or shorter times to repair. Smart RAM engineers will know the right questions to ask (who repairs this thing when it goes down at night, or on a weekend? How long does it take to get there? What spare parts are available, and which ones have a lead time?) to make sure their results are accurate.

And I get to tell people about their next yacht.

Grant has been at BVNA for 15 years, doing failure analysis, drill stem and casing design, standards writing, and teaching others to do it, too.

Grant Pettit
Director of Operations - Standards, Training, & Accreditation