The research in the field of machine learning and AI, now a key technology in virtually every industry and every company, is far too extensive for anyone to read in full. This column, Perceptron, aims to gather some of the most relevant recent discoveries and articles – particularly in, but not limited to, artificial intelligence – and explain why they matter.
This month, Meta engineers unveiled two recent innovations from the depths of the company’s research labs: an AI system that compresses audio files and an algorithm that can accelerate protein folding AI performance by 60x. Elsewhere, MIT scientists revealed that they are using spatial acoustic information to help machines better visualize their surroundings by simulating how a listener would hear a sound from any point in a room.
Meta’s compression work doesn’t exactly venture into uncharted territory. Last year, Google announced Lyra, a neural audio codec trained to compress low-bitrate speech. But Meta claims its system is the first to work for CD-quality stereo audio, making it useful for commercial applications like voice calling.
Meta’s AI-powered compression system, called Encodec, can compress and decompress audio in real time on a single CPU core at bitrates from around 1.5 kbps to 12 kbps. Compared to MP3, Encodec achieves roughly 10 times the compression at 64 kbps with no perceptible loss in quality.
The researchers behind Encodec say that human evaluators preferred the quality of Encodec-processed audio over Lyra-processed audio, suggesting that Encodec could eventually be used to deliver better-quality audio in situations where bandwidth is limited or expensive.
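Meta has open-sourced the work; a minimal sketch of compressing and reconstructing a clip, assuming the released `encodec` Python package and its published API (the file path is a placeholder), looks roughly like this:

```python
# A minimal sketch, assuming the open-sourced `encodec` package and its
# published API (pip install encodec); "input.wav" is a placeholder path.
import torch
import torchaudio
from encodec import EncodecModel
from encodec.utils import convert_audio

# The 48 kHz stereo model matches the CD-quality claim; target 6 kbps.
model = EncodecModel.encodec_model_48khz()
model.set_target_bandwidth(6.0)

wav, sr = torchaudio.load("input.wav")
wav = convert_audio(wav, sr, model.sample_rate, model.channels)

with torch.no_grad():
    frames = model.encode(wav.unsqueeze(0))  # discrete codes: the compressed audio
    restored = model.decode(frames)          # waveform rebuilt from those codes
```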
Meta’s protein folding work has less immediate commercial potential. But it could lay the groundwork for important scientific research in the field of biology.
According to Meta, its ESMFold AI system has predicted the structures of around 600 million proteins from bacteria, viruses and other microbes that have yet to be characterized. That’s nearly triple the 220 million structures that Alphabet-owned DeepMind was able to predict earlier this year, which covered nearly every protein from known organisms in DNA databases.
Meta’s system is not as accurate as DeepMind’s. Of the roughly 600 million structures it generated, only about a third were “high quality.” But it’s 60 times faster at prediction, which lets it scale structure prediction to far larger databases of proteins.
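Meta has also published ESMFold’s code and weights; a minimal sketch of folding a single sequence, assuming the `fair-esm` package’s published interface (the sequence below is just an arbitrary example):

```python
# A minimal sketch, assuming the fair-esm package's published interface
# (pip install "fair-esm[esmfold]"); the sequence is an arbitrary example.
import torch
import esm

model = esm.pretrained.esmfold_v1().eval().cuda()  # requires a GPU

sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"
with torch.no_grad():
    pdb_string = model.infer_pdb(sequence)  # predicted 3-D structure in PDB format

with open("prediction.pdb", "w") as f:
    f.write(pdb_string)
```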
Not to give Meta undue attention, but the company’s AI division also detailed this month a system designed for mathematical reasoning. Company researchers say their “neural problem solver” learned from a data set of successful mathematical proofs to generalize to new, different types of problems.
Meta is not the first to build such a system. OpenAI developed its own, built on the Lean proof assistant, which it announced in February. Separately, DeepMind has been experimenting with systems that can solve challenging mathematical problems in the study of symmetries and knots. But Meta claims that its neural problem solver solved five times more International Math Olympiad problems than any previous AI system, and that it outperformed other systems on widely used math benchmarks.
Meta notes that mathematical AI could benefit the fields of software verification, cryptography, and even aerospace.
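For a sense of the raw material involved: provers like Meta’s and OpenAI’s work with machine-checkable statements in the Lean proof assistant, and their job is to search for the proof term. A deliberately trivial example:

```lean
-- A deliberately trivial Lean statement; a neural prover's job is to find
-- the proof term on the right-hand side for far harder goals than this.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```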
Turning our attention to MIT, researchers there have developed a machine learning model that can capture how sounds in a room travel through space. By modeling the acoustics, the system can learn a room’s geometry from sound recordings, which it can then use to build visual renderings of the room.
The researchers say the technology could be applied to virtual and augmented reality software, or to robots that have to navigate complex environments. In the future, they plan to improve the system so it generalizes to new and larger scenes, such as whole buildings or even entire cities.
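MIT hasn’t published its model in this form, but the underlying idea can be sketched as a neural field that maps a pair of source and listener positions to an acoustic impulse response, with the room’s geometry captured implicitly in the network’s weights. All names and dimensions here are invented for illustration:

```python
# A toy sketch (not MIT's code): a neural field maps (source, listener)
# positions to an acoustic impulse response; shapes are invented.
import torch
import torch.nn as nn

class AcousticField(nn.Module):
    def __init__(self, ir_len=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, 256), nn.ReLU(),  # 3-D source + 3-D listener coordinates
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ir_len),        # predicted impulse response samples
        )

    def forward(self, source_xyz, listener_xyz):
        return self.net(torch.cat([source_xyz, listener_xyz], dim=-1))

field = AcousticField()
ir = field(torch.rand(1, 3), torch.rand(1, 3))  # query any pair of points
print(ir.shape)  # torch.Size([1, 256])
```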
Over at Berkeley, two separate robotics teams are accelerating the pace at which a four-legged robot can learn to walk and do other tricks. One team set out to combine best-of-breed work from numerous other advances in reinforcement learning to let a robot go from a blank slate to robust walking on uncertain terrain in just 20 minutes of real time.
“Perhaps surprisingly, we find that with several careful design decisions regarding the task and the algorithm implementation, it is possible for a four-legged robot to learn to walk with deep RL from scratch in less than 20 minutes, in a number of different environments and surface types. Crucially, it does not require novel algorithmic components or other unexpected innovations,” the researchers write.
Instead, they selected and combined several cutting-edge approaches and got remarkable results. You can read the paper here.
Another locomotion-learning project, from (TechCrunch pal) Pieter Abbeel’s lab, has been described as “training an imagination.” The team gave the robot the ability to predict how its actions will play out, and though it starts out quite helpless, it quickly gains knowledge of the world and how it works. That leads to better predictions, which lead to better knowledge, and so on in a feedback loop until the robot is up and walking in less than an hour. It learns just as quickly to recover from being pushed or otherwise “perturbed,” as the jargon has it. Their work is documented here.
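As a rough, hypothetical sketch of that loop (not the lab’s code): a learned world model predicts where each action will lead, and the walking policy is then improved on imagined rollouts through that model instead of on slow real-world trials. Every name and dimension below is made up:

```python
# A hypothetical sketch of learning in imagination; all shapes are made up.
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2

world_model = nn.Sequential(   # predicts the next state from (state, action)
    nn.Linear(state_dim + action_dim, 64), nn.Tanh(), nn.Linear(64, state_dim))
reward_model = nn.Sequential(  # predicts the reward earned in a state
    nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, 1))
policy = nn.Sequential(
    nn.Linear(state_dim, 64), nn.Tanh(), nn.Linear(64, action_dim), nn.Tanh())

model_opt = torch.optim.Adam(
    list(world_model.parameters()) + list(reward_model.parameters()), lr=1e-3)
policy_opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

def train_world_model(s, a, r, s_next):
    """Fit the dynamics and reward models to real transitions from the robot."""
    pred_next = world_model(torch.cat([s, a], dim=-1))
    pred_r = reward_model(s_next)
    loss = ((pred_next - s_next) ** 2).mean() + ((pred_r - r) ** 2).mean()
    model_opt.zero_grad(); loss.backward(); model_opt.step()

def train_policy_in_imagination(s, horizon=15):
    """Improve the policy on imagined rollouts instead of real-world trials."""
    imagined_return = 0.0
    for _ in range(horizon):
        a = policy(s)
        s = world_model(torch.cat([s, a], dim=-1))
        imagined_return = imagined_return + reward_model(s).mean()
    loss = -imagined_return  # maximize the return the model predicts
    policy_opt.zero_grad(); loss.backward(); policy_opt.step()
```

Better real data improves the model, and a better model makes imagined practice more useful, which is the feedback loop described above.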
Work with potentially more immediate applications came earlier this month out of Los Alamos National Laboratory, where researchers developed a machine learning technique to predict the friction that occurs during earthquakes, which offers a way to forecast them. Using a language model, the team say they were able to analyze the statistical characteristics of seismic signals emitted by a fault in a laboratory earthquake machine to predict the timing of the next tremor.
“The model is not constrained by physics, but it predicts the physics, the actual behavior of the system,” said Chris Johnson, one of the research leads on the project. “Now we are making a future prediction from past data, which goes beyond describing the instantaneous state of the system.”
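The Los Alamos model is described as language-model-based; as a rough illustration of the general lab-earthquake recipe it builds on (regressing time-to-the-next-slip from statistical features of short signal windows), here is a sketch with a stand-in scikit-learn regressor and placeholder data:

```python
# An illustrative sketch, not the Los Alamos code: statistical features of
# windows of the seismic signal are regressed against time to the next slip.
# The model is a stand-in regressor and the data is a random placeholder.
import numpy as np
from scipy.stats import kurtosis, skew
from sklearn.ensemble import RandomForestRegressor

def window_features(signal, win=4096):
    """Summary statistics for each non-overlapping window of the raw signal."""
    n = len(signal) // win
    windows = signal[: n * win].reshape(n, win)
    return np.column_stack([
        windows.var(axis=1),        # signal variance tracks fault slip
        kurtosis(windows, axis=1),  # impulsiveness of micro-failures
        skew(windows, axis=1),
        np.abs(windows).max(axis=1),
    ])

rng = np.random.default_rng(0)
signal = rng.normal(size=4096 * 100)       # placeholder acoustic emission data
time_to_failure = rng.uniform(0, 10, 100)  # placeholder labels, one per window

X = window_features(signal)
model = RandomForestRegressor(n_estimators=200).fit(X, time_to_failure)
print(model.predict(X[:5]))                # predicted seconds to the next slip
```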
It’s challenging to apply the technique in the real world, the researchers say, because it’s not clear if there’s enough data to train the prediction system. Nonetheless, they are optimistic about the applications, which could include predicting damage to bridges and other structures.
Last week brought a cautionary note from MIT researchers, who warned that neural networks used to simulate actual neural networks should be carefully examined for training bias.
Neural networks are, of course, modeled on the way our own brains process and signal information, reinforcing certain connections and combinations of nodes. But that doesn’t mean the synthetic ones and the real ones work the same way. In fact, the MIT team found that neural network-based simulations of grid cells (part of the nervous system) produced similar activity only when carefully constrained to do so by their creators. Left to govern themselves, as the actual cells do, they did not produce the desired behavior.
This does not mean that deep learning models are useless in this area; on the contrary, they are very valuable. But, as Professor Ila Fiete said in the school’s news post: “They can be a powerful tool, but one has to be very circumspect in interpreting them and in determining whether they are truly making de novo predictions, or even shedding light on what it is that the brain is optimizing.”