Lab-grown brain organoids have been called “mini-brains” by some of their more enthusiastic proponents, but the reality is that these clumps of neural cells are some way from replicating the performance of the human brain. Instead, they act as useful models of some brain processes for experimental use.

Still, these organoids have become increasingly complex. A new study shows that they can master a form of feedback learning that is a benchmark for adaptability and real-time processing in neural systems.

The study could lead to organoids better positioned to study how learning processes go awry in brain conditions such as Alzheimer’s disease.

The study, published in Cell Reports, begins with a cart and a pole.

A Moving Cart

Imagine trying to balance a ruler that is upright on your palm. You’d need to carefully learn to feel the weight of the ruler and how it is positioned on your hand to keep it from falling under the force of gravity. We need to learn a form of this balance from our early years, when we move from crawling to toddling.

The cartpole test is essentially a digital version of this task, where a virtual cart is placed on a track. Attached to the cart is, as you might have guessed, a pole, which will fall unless the cart’s movement along the track is carefully managed. The test is used as an engineering benchmark to ascertain whether a control system can adapt and respond to information.
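For readers who want to see the benchmark itself, here is a minimal sketch of a cartpole simulation. The physical constants, failure limits, and both example policies are the values commonly used in textbook versions of the task and are assumptions for illustration; they are not taken from the study.

```python
import math

# Textbook cartpole constants (assumed, not from the study).
GRAVITY = 9.8            # m/s^2
CART_MASS = 1.0          # kg
POLE_MASS = 0.1          # kg
POLE_HALF_LENGTH = 0.5   # m
PUSH_FORCE = 10.0        # N, applied left or right each step
DT = 0.02                # s per simulation step
ANGLE_LIMIT = math.radians(12)  # pole counts as fallen beyond this tilt
TRACK_LIMIT = 2.4               # cart counts as off the track beyond this

def step(state, push_right):
    """Advance (x, x_dot, theta, theta_dot) by one Euler step."""
    x, x_dot, theta, theta_dot = state
    force = PUSH_FORCE if push_right else -PUSH_FORCE
    total_mass = CART_MASS + POLE_MASS
    pole_ml = POLE_MASS * POLE_HALF_LENGTH
    sin_t, cos_t = math.sin(theta), math.cos(theta)
    temp = (force + pole_ml * theta_dot ** 2 * sin_t) / total_mass
    theta_acc = (GRAVITY * sin_t - cos_t * temp) / (
        POLE_HALF_LENGTH * (4.0 / 3.0 - POLE_MASS * cos_t ** 2 / total_mass)
    )
    x_acc = temp - pole_ml * theta_acc * cos_t / total_mass
    return (
        x + DT * x_dot,
        x_dot + DT * x_acc,
        theta + DT * theta_dot,
        theta_dot + DT * theta_acc,
    )

def run_episode(policy, start=(0.0, 0.0, 0.02, 0.0), max_steps=500):
    """Return how many steps the pole stays up under the given policy."""
    state = start
    for t in range(max_steps):
        if abs(state[2]) > ANGLE_LIMIT or abs(state[0]) > TRACK_LIMIT:
            return t
        state = step(state, policy(state))
    return max_steps

def always_right(state):
    """A controller that ignores the pole entirely."""
    return True

def lean_follower(state):
    """Push toward the side the pole is leaning or swinging to."""
    return state[2] + 0.5 * state[3] > 0

if __name__ == "__main__":
    print("always_right:", run_episode(always_right))
    print("lean_follower:", run_episode(lean_follower))
```

A controller that reacts to the pole's tilt keeps it upright for longer than one that ignores it, which is the gap the benchmark is designed to expose.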

“We’re trying to understand the fundamentals of how neurons can be adaptively tuned to solve problems,” said University of California, Santa Cruz engineer and study co-author Ash Robbins in a statement.

Applying Coaching Signals

In the new study, Robbins and their colleagues taught a brain organoid derived from mouse cells to control the cartpole task. According to the authors, the results are the first example of goal-directed learning in lab-grown brain organoids.

The researchers used a reinforcement learning algorithm that fed the results of each cartpole test back into the organoid via electrical signals. By modulating the strength of these signals, the team could inform the organoid of the virtual pole’s angle. The organoid responded by outputting signals to balance the pole. The authors measured how long it took for the pole to fall in tests they called episodes. At the end of every fifth episode, the reinforcement algorithm sent a learning signal to the organoid if its average performance had not improved compared with the previous 20 episodes. This signal would subtly modify which neurons within the organoid were stimulated.
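The timing rule for the learning signal can be sketched in a few lines of code. This is an illustrative reading of the description above, not the authors' implementation: the function name `should_coach` is hypothetical, and the choice to compare the mean of the five most recent episodes against the mean of the 20 episodes before them is an assumption about how the comparison was set up.

```python
from statistics import mean

CHECK_EVERY = 5        # evaluate at the end of every fifth episode
BASELINE_WINDOW = 20   # compare against the 20 episodes before that

def should_coach(durations):
    """Decide whether to send a learning signal after the latest episode.

    `durations` lists how long the pole stayed up in each episode so far.
    Assumed rule: coach only on every fifth episode, and only if the mean
    of the latest five episodes is no better than the mean of the 20
    episodes that preceded them.
    """
    n = len(durations)
    if n % CHECK_EVERY != 0 or n < CHECK_EVERY + BASELINE_WINDOW:
        return False
    recent = durations[-CHECK_EVERY:]
    baseline = durations[-(CHECK_EVERY + BASELINE_WINDOW):-CHECK_EVERY]
    return mean(recent) <= mean(baseline)

if __name__ == "__main__":
    flat = [10] * 25                  # no improvement: coach
    improving = [10] * 20 + [20] * 5  # recent episodes better: no coaching
    print(should_coach(flat), should_coach(improving))
```

Under this reading, the organoid is left alone while it is improving and nudged only when progress stalls.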

“You could think of it like an artificial coach that says, ‘you’re doing it wrong, tweak it a little bit in this way,’” Robbins said. “We’re learning how to best give it these coaching signals.”

A Better Memory in the Future

Over time, the reinforcement training produced results. If the organoid was randomly stimulated, it averaged a win rate of 4.5 percent, while the adaptive training regimen averaged a 46 percent win rate.

This improvement wasn’t enduring. After 15 minutes of training, the authors left the organoid to rest for 45 minutes. During this time, the neurons appeared to forget their training, and their win rate dropped back to a low baseline.

The team hopes that future iterations of their brain models will have slightly longer memories.

“It is likely that more sophisticated organoids, perhaps grown to include multiple brain regions involved in animal learning, will be needed to recapitulate the kind of long-term adaptive performance improvement we see in animals,” said UCSC genomicist and engineer David Haussler, who co-authored the study, in a statement.

“We’ll see,” Haussler concluded.

