As the artificial intelligence game intensifies, Meta Platforms (META) is working on a state-of-the-art multi-modal large language model named Chameleon.
According to the company's research paper, the proposed LLM can single-handedly perform tasks previously handled by separate models and could integrate information better than earlier models.
The paper noted that Chameleon uses an 'early-fusion token-based mixed-modal' architecture, under which the model learns from a combination of images, code, text, and other inputs. It represents images, text, and code alike as tokens and combines them into mixed sequences.
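As a rough illustration of what a token-based, early-fusion input could look like, the sketch below merges text tokens and image tokens into one sequence drawn from a single shared vocabulary. It is not taken from the paper: the vocabulary sizes, the begin-of-image and end-of-image markers, and the function name `to_unified_sequence` are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical sizes, chosen only for illustration (not from the paper).
TEXT_VOCAB_SIZE = 1000       # text token ids occupy [0, 1000)
IMAGE_CODEBOOK_SIZE = 512    # image token ids occupy [1000, 1512)

# Hypothetical marker tokens placed after both id ranges.
BOI = TEXT_VOCAB_SIZE + IMAGE_CODEBOOK_SIZE  # begin-of-image
EOI = BOI + 1                                # end-of-image

Segment = Tuple[str, List[int]]  # ("text" | "image", local token ids)


def to_unified_sequence(segments: List[Segment]) -> List[int]:
    """Flatten interleaved text and image segments into one token sequence.

    Image ids are shifted past the text range so that both modalities
    share one id space and a single model can consume the result.
    """
    sequence: List[int] = []
    for kind, ids in segments:
        if kind == "text":
            if any(not 0 <= i < TEXT_VOCAB_SIZE for i in ids):
                raise ValueError("text token id out of range")
            sequence.extend(ids)
        elif kind == "image":
            if any(not 0 <= i < IMAGE_CODEBOOK_SIZE for i in ids):
                raise ValueError("image token id out of range")
            sequence.append(BOI)
            sequence.extend(TEXT_VOCAB_SIZE + i for i in ids)
            sequence.append(EOI)
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return sequence


if __name__ == "__main__":
    mixed = [("text", [1, 2]), ("image", [0, 5]), ("text", [3])]
    print(to_unified_sequence(mixed))
```

Because every segment ends up in the same id space, a model reading this sequence would need no separate image or text pathway, which is the general idea behind an early-fusion design.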
"Chameleon's unified token space allows it to seamlessly reason over and generate interleaved image and text sequences, without the need for modality-specific components," the research paper stated.
The latest model is trained in two stages using a dataset of 4.4 trillion tokens of text, image-text combinations, and sequences of interwoven text and images. The researchers trained two versions of Chameleon, one with 7 billion parameters and one with 34 billion parameters, for more than 5 million hours on Nvidia A100 80GB GPUs.
Meanwhile, among Meta's competitors, OpenAI has launched GPT-4o and Microsoft (MSFT) introduced its MAI-1 model a few weeks ago.
For comments and feedback contact: editorial@rttnews.com