Protein aggregation and formulation stability: when molecular scale meets industrial reality

By Estelle Deguillard

You have identified your candidate protein. Your vaccine antigen binds effectively to the targeted virus. Your new plant-based milk chocolate has the right nutritional profile. Your cosmetic peptide shows a promising anti-aging effect in vitro.

Now comes the real challenge: formulation stability! Ensuring formulation stability means understanding and controlling protein aggregation across real storage and processing conditions.

Will your protein denature if the temperature rises from 25 °C to 37 °C? Will the protein in your milk chocolate aggregate, leading to a weird texture? What’s the critical concentration for phase separation or gelation? Can we evaluate long-term stability under real storage conditions?

These aren’t academic questions; these are real-life challenges that protein-dependent industries face every day. Finding the right protein is “easy”; using it is difficult.

The computational dilemma

Proteins are notoriously difficult to model. They require a fine-grained representation to ensure that their interactions and behaviors are correctly reproduced. With improvements in computational capabilities, modeling full-atom protein formulations looks like an achievable dream.

But is it really necessary to model a protein at such a fine representation to predict mesoscale behaviors before going to the lab for final testing?

Consider the scale mismatch: a typical protein (tens of kDa) is roughly a hundred times heavier than a typical additive (~0.1-1 kDa). In formulation terms, this is like looking at an Airbus A380 next to a light aircraft: they share some principles but don’t play in the same league.

Moreover, protein aggregation, which is key to understanding formulation stability, is a complex, multiscale process. Early aggregation events can take anywhere from a few nanoseconds to microseconds to emerge in a simulation. At the molecular scale, that is a long time.

So, take a system containing a few protein molecules plus the additives known to prevent aggregation. Even if we are reasonable people with reasonable computational capabilities, we may already end up with a cubic simulation box 100 Å on a side.

That’s HUGE!

I mean, sure, we can model it, but I am not sure that waiting weeks (I am a very optimistic person) for one formulation is optimal when we need to get our cream or vaccine out within the quarter.
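To put a rough number on that box, here is a back-of-envelope sketch. It assumes that "100 Angstrom" refers to the edge length of the cube and that the box is essentially bulk water at room temperature; the function name is purely illustrative.

```python
# Back-of-envelope atom count for an all-atom, explicit-water box.
# Assumption: the box is filled with bulk liquid water near 25 °C.
AVOGADRO = 6.02214076e23            # molecules per mole
WATER_DENSITY_G_PER_CM3 = 0.997     # liquid water near 25 °C
WATER_MOLAR_MASS_G_PER_MOL = 18.015

def water_atoms_in_cubic_box(side_angstrom: float) -> int:
    """Estimate the number of atoms in a cubic box of pure water."""
    volume_cm3 = (side_angstrom * 1e-8) ** 3      # 1 Å = 1e-8 cm
    moles = volume_cm3 * WATER_DENSITY_G_PER_CM3 / WATER_MOLAR_MASS_G_PER_MOL
    molecules = moles * AVOGADRO
    return round(molecules * 3)                   # 3 atoms per H2O

print(water_atoms_in_cubic_box(100.0))  # on the order of 100,000 atoms
```

That is roughly a hundred thousand atoms to propagate over nanoseconds to microseconds, for a single formulation.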

And that’s not mentioning the list of tests we want to do:

Temperature dependence
pH dependence
Additive concentration
Protein concentration
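
To see how quickly such a list multiplies, here is a minimal sketch of a full-factorial screening grid. The levels below are hypothetical placeholders, not recommended values, and `build_grid` is an illustrative helper rather than part of any product.

```python
from itertools import product

# Hypothetical screening levels; replace with the values relevant to your formulation.
temperatures_c = [5, 25, 37]
ph_values = [5.0, 6.0, 7.0]
additive_wt_pct = [0.0, 0.5, 1.0]
protein_mg_per_ml = [1, 10, 50]

def build_grid(*axes):
    """Return every combination of the given screening axes."""
    return list(product(*axes))

grid = build_grid(temperatures_c, ph_values, additive_wt_pct, protein_mg_per_ml)
print(len(grid))  # 3 * 3 * 3 * 3 = 81 simulations
```

Even three levels per variable already means 81 separate simulations.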

As much as I’d like to run dozens of vaccine formulation simulations at scale and in parallel, monopolizing hundreds of CPU cores for weeks on a shared cluster is not an option. I doubt my colleagues would find it very amusing.

And that’s why, in collaboration with our customers, we have been looking into ways of optimizing protein formulation simulations at the coarse-grained level to accelerate such screening projects, whether the goal is a novel medicine, an effective eco-friendly detergent against chocolate stains, or a new anti-aging cream.

After a few years of sweat and head scratching, the Simcenter Culgi team is happy to introduce a coarse-grained workflow that allows for the fast screening of protein formulations.

First hurdle: fragmentation and parameterization

You know us, we like to automate things. Especially when one must deal with a few hundred atoms.

One advantage of proteins is that they are relatively simple. I mean, they are mostly made of amino acids and can have some non-standard residues, but nothing we can’t handle. Based on this understanding, we have introduced a protein fragmentation capability in Simcenter Culgi 2605. Provide your protein and let the software handle the fragmentation and parameterization. You end up with a protein that is fragmented according to the amino acids it is composed of. Additionally, we provide a robust workflow to parameterize the other species needed for the simulation.
This automated workflow simplifies coarse‑grained modeling of complex protein formulations, reducing setup time before aggregation studies.
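
To make the idea of residue-based fragmentation concrete, here is a minimal sketch. It assumes one bead per amino acid and a simple linear backbone; this is an illustration of the concept only, not the Simcenter Culgi implementation, and `fragment_sequence` and the `UNK` placeholder are hypothetical names.

```python
# Standard one-letter to three-letter amino acid codes.
ONE_TO_THREE = {
    "A": "ALA", "R": "ARG", "N": "ASN", "D": "ASP", "C": "CYS",
    "Q": "GLN", "E": "GLU", "G": "GLY", "H": "HIS", "I": "ILE",
    "L": "LEU", "K": "LYS", "M": "MET", "F": "PHE", "P": "PRO",
    "S": "SER", "T": "THR", "W": "TRP", "Y": "TYR", "V": "VAL",
}

def fragment_sequence(sequence):
    """Return (beads, bonds): one bead per residue, bonded along the chain."""
    beads = []
    for code in sequence.upper():
        # Non-standard residues get a placeholder type to be parameterized separately.
        beads.append(ONE_TO_THREE.get(code, "UNK"))
    bonds = [(i, i + 1) for i in range(len(beads) - 1)]
    return beads, bonds

beads, bonds = fragment_sequence("GAKX")
print(beads)  # ['GLY', 'ALA', 'LYS', 'UNK']
print(bonds)  # [(0, 1), (1, 2), (2, 3)]
```

Each bead type then needs its own interaction parameters, which is the part the automated workflow is meant to take off your hands.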

Step number 2: modeling the formulation itself

Armed with your coarse-grained proteins and list of ingredients, you will happily use our Mixture creator to perfectly match the experimental formulation and start your DPD simulation. And then you just need to wait… and wait… and wait some more.

Wait, what? I thought coarse-grained simulations were faster than atomistic ones? Where is the gain? And you are right.

What people don’t necessarily realize is that protein formulations, especially in vaccines, are 99% water! And the wait time is due to that 99% water. That’s when we realized that to properly screen protein formulations, we needed to get rid of the water, virtually.

And that’s why we have integrated support for colloidal molecules and long-range interactions within our Brownian Dynamics solver, or, more colloquially, our implicit solvent solver. As the name suggests, instead of modeling 99% water, we only model the molecules that we are interested in. Don’t worry, the water is still there, just implicitly.

With both new features, you can model a vaccine formulation 30x faster than you would with explicit solvent simulations. Nice, right?
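
For readers who have not met Brownian dynamics before, here is a minimal textbook sketch of an overdamped update step in reduced units. It is a generic illustration, not the Simcenter Culgi solver: the function names are hypothetical, and hydrodynamic and long-range interactions are left out. The solvent never appears as particles; it enters only through the diffusion coefficient and the random kicks.

```python
import math
import random

def brownian_step(positions, forces, diffusion, kT, dt, rng):
    """Advance every particle by one overdamped (Brownian) step."""
    noise_scale = math.sqrt(2.0 * diffusion * dt)  # thermal kick from the implicit solvent
    mobility = diffusion / kT                      # Einstein relation
    new_positions = []
    for pos, force in zip(positions, forces):
        new_positions.append([
            x + mobility * f * dt + noise_scale * rng.gauss(0.0, 1.0)
            for x, f in zip(pos, force)
        ])
    return new_positions

def mean_squared_displacement(start, end):
    """Average squared distance travelled per particle."""
    total = 0.0
    for a, b in zip(start, end):
        total += sum((xb - xa) ** 2 for xa, xb in zip(a, b))
    return total / len(start)

def run_free_diffusion(n_particles=2000, n_steps=100, diffusion=1.0,
                       kT=1.0, dt=0.01, seed=0):
    """Force-free sanity check: the MSD should approach 6 * D * t in 3D."""
    rng = random.Random(seed)
    start = [[0.0, 0.0, 0.0] for _ in range(n_particles)]
    zero_forces = [[0.0, 0.0, 0.0]] * n_particles
    positions = start
    for _ in range(n_steps):
        positions = brownian_step(positions, zero_forces, diffusion, kT, dt, rng)
    return mean_squared_displacement(start, positions)

msd = run_free_diffusion()
print(msd)  # should be close to 6 * D * t = 6.0 in reduced units
```

The cost of each step scales with the number of solute particles only, which is where the gain over an explicit-water box comes from.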

Benchmarking, the step one must always go through. Or do they?

You have the model, the parameters and the speed, but you’re still stuck trying to estimate how many steps you need to compute before your simulation is ready for production. Indeed, don’t forget that you must equilibrate before you measure.

That is valid for any computational chemistry project. You must estimate how long to run, and that is frankly, more often than not, somewhat of a guess: we set enough steps, plus some more just for good measure. Not very time effective, right?

Well, I have good news. No need for that anymore.

Introducing our Equilibrium Detection capability:

With Equilibrium Detection, set a required accuracy and a high number of steps, and let the system run. The software will automatically detect when it has reached equilibrium by monitoring relevant properties. You can then use that information to set the number of equilibration steps for all the systems in your screening campaign.

Still too complicated? No worries, just combine it with the Stop On Met Precision feature we introduced in the last release. With the two combined, the software will stop the simulation once equilibrium is reached, and the production step can start automatically.
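
To give a feel for the logic behind these two ideas, here is a minimal sketch on a synthetic time series. It uses simple, generic heuristics (comparing consecutive window averages, then tracking a naive standard error that ignores autocorrelation). These are assumptions chosen for illustration, not the algorithms used in Simcenter Culgi, and all function names and thresholds are hypothetical.

```python
import math
import random
import statistics

def detect_equilibrium(series, window=200, tolerance=0.05):
    """Return the first index where two consecutive window means agree
    within `tolerance` (relative), or None if that never happens."""
    for start in range(0, len(series) - 2 * window + 1, window):
        first = statistics.fmean(series[start:start + window])
        second = statistics.fmean(series[start + window:start + 2 * window])
        scale = max(abs(first), abs(second), 1e-12)
        if abs(second - first) / scale < tolerance:
            return start
    return None

def steps_until_precision(series, target_sem, min_samples=50):
    """Return how many samples are needed before the naive standard error
    of the mean drops below `target_sem`, or None if it never does."""
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations (Welford's method)
    for n, value in enumerate(series, start=1):
        delta = value - mean
        mean += delta / n
        m2 += delta * (value - mean)
        if n >= min_samples:
            sem = math.sqrt(m2 / (n - 1)) / math.sqrt(n)
            if sem < target_sem:
                return n
    return None

# Synthetic "monitored property": relaxes toward 1.0 with Gaussian noise.
rng = random.Random(1)
series = [1.0 + 2.0 * math.exp(-i / 300.0) + rng.gauss(0.0, 0.05)
          for i in range(3000)]

eq_index = detect_equilibrium(series)
production = series[eq_index:]
n_needed = steps_until_precision(production, target_sem=0.005)
print(eq_index, n_needed)
```

In a real run, the series would be a property sampled during the simulation, and correlated samples would call for a more careful error estimate, such as block averaging.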

Accelerating aggregation studies across complex formulation systems

To summarize, with Simcenter Culgi 2605, you can now accelerate protein aggregation studies and assess formulation stability by:

Automatically fragmenting and parameterizing protein formulation systems
Running implicit solvent simulations for aggregation studies in record time
Not worrying about simulation time, thanks to the Equilibrium Detection and Stop On Met Precision features

And many more things! Did you know you can safely map back any system nowadays? Perfect to understand how detergent molecules diffuse in chocolate stains to clean them…
