Artificial Intelligence Timeline

2022 - Present

2022

February

Midjourney v1

March

OpenAI releases text-davinci-002 and code-davinci-002 with an API approach.

April

Midjourney v2

DALL-E 2 is announced for gradual release.

July

Midjourney v3 is launched.

August

Stable Diffusion 1.4 is released.

October

Stable Diffusion 1.5 becomes available.

November

ChatGPT, a chatbot by OpenAI using GPT-3.5, is released to the public and quickly becomes a viral sensation.

Midjourney v4 is released.

Stable Diffusion 2.0 is launched.

December

Stable Diffusion 2.1 is released.

2023

February

Meta releases the LLaMA language model as open-source for research purposes. The model is later leaked.

Microsoft gradually releases Bing AI, an AI chat based on an upgraded GPT model integrating internet search.

March

Midjourney v5 is launched.

OpenAI's GPT-4 model is partially released, featuring multimodal image analysis and improved multi-language support.

Google releases the AI chat Bard in a limited capacity, based on the LaMDA language model.

April

Adobe releases the Firefly image creation model as a beta version to a waiting list. The model allowed a variety of capabilities including text formatting.

May

Midjourney v5.1 is released.

Google announces an upgrade to Bard, moving it to the upgraded PaLM 2 language model. It will support 180 countries and many languages.

June

Midjourney v5.2 is launched.

July

Stable Diffusion XL 1.0 is released.

Anthropic announces a new version of their large language model - Claude 2.

Meta releases the LLaMA 2 open source language model to the general public in a variety of sizes.

October

DALL-E 3 is released.

Adobe releases Firefly 2.

November

Stable Diffusion XL Turbo is released - A fast model that allows the creation of an image in one step in real time.

December

Midjourney v6 is launched.

Google upgrades Bard in limited areas, moving it to be based on the upgraded Gemini Pro language model.

X Corporation launches Grok AI chatbot for paid subscribers in English language.

2024

February

Stability AI announces Stable Diffusion 3 (gradually released to waiting list).

Google upgrades the artificial intelligence chat in Bard, basing it on the new Gemini Pro model, in all available languages. Google replaces "Bard" with "Gemini".

Google announces the Gemini Pro 1.5 multimodal language model capable of parsing up to a million tokens, as well as parsing video and images. The model is gradually released to developers on a waiting list.

OpenAI announces the Sora model that produces videos up to a minute long. The model is not released to the public at this time.

March

X Corporation announces the upcoming release of the Grok 1.5 open source model.

Anthropic announces Claude 3, a new version of their large language model. The version is deployed in 3 different sizes, with the largest model performing better than GPT-4.

Suno AI, which develops a model for creating music, releases Suno v3 to the general public.

April

Stability AI releases a new update to the music creation model - Stable Audio 2.0.

X Corporation releases an upgrade to its language model, Grok-1.5V, which integrates high-level image recognition. In the test presented by the company, the model is the best in identifying and analyzing images compared to other models.

The Mistral company releases its new model Mixtral 8x22B as open source. This is the most powerful model among the open source models and it contains 141 billion parameters but uses a method that allows more economical use.

Meta releases the LLaMA 3 model as open source in sizes 8B and 70B parameters. The large model shows better performance than Claude 3 Sonnet and Gemini Pro 1.5 in several measures. Meta is expected to later release larger models with 400 billion parameters and more.

Microsoft releases the Phi-3-mini model in open source. The model comes in a reduced version of 3.8B parameters, which allows it to run on mobile devices as well, and it presents capabilities similar to GPT-3.5.

Adobe announces its new image creation model Firefly 3.

The startup Reka AI presents a series of multimodal language models in 3 sizes. The models are capable of processing video, audio and images. The large model featured similar capabilities to GPT-4.

Apple releases as full open source a series of small language models under the name OpenELM. The models are available in four weights between 270 million and 3 billion parameters.

May

OpenAI announces the GPT-4o model that presents full multimodal capabilities, including receiving and creating text, images, and audio. The model presents an impressive ability to speak with a high response speed and in natural language. The model is 2 times more efficient than the GPT-4 Turbo model, and has better capabilities for languages other than English.

Google announces a large number of AI features in its products. The main ones: increasing the token limit to 2 million for Gemini 1.5 to waiting list, releasing a smaller and faster Gemini Flash 1.5 model. Revealing the latest image creation model Imagen 3, music creation model Music AI and video creation model Veo. And the announcement of the Astra model with multimodal capabilities for realtime audio and video reception.

Microsoft announces Copilot+ for dedicated computers, which will allow a full search of the user's history through screenshots of the user's activity. The company also released as open source the SLMs that display impressive capabilities in a minimal size: Phi-3 Small, Phi-3 Medium, and Phi-3 Vision which includes image recognition capability.

Meta introduces Chameleon, a new multimodal model that seamlessly renders text and images.

Mistral AI releases a new open source version of its language model Mistral-7B-Instruct-v0.3.

Google announces AI Overviews intended to give a summary of the relevant information in Google search.

Suno AI releases an updated music creation model Suno v3.5.

Mistral AI releases a new language model designed for coding Codestral in size 22B.

June

Stability AI releases its updated image creation model Stable Diffusion 3 in a medium version in size 2B parameters.

Apple announces Apple Intelligence, an AI system that will be integrated into the company's devices and will combine AI models of different sizes for different tasks.

DeepSeekAI publishes the DeepSeekCoderV2 open source language model which presents similar coding capabilities to models such as GPT-4, Claude 3 Opus and more.

Runway introduces Gen3 Alpha, a new AI model for video generation.

Anthropic releases the Claude Sonnet 3.5 model, which presents better capabilities than other models with low resource usage.

Microsoft releases in open source a series of image recognition models called Florence 2.

Google announces Gemma 2 open source language models with 9B and 27B parameter sizes. Also, the company opens the context window capabilities to developers for up to 2 million tokens.

July

OpenAI has released a miniaturized model called GPT-4o mini that presents high capabilities at a low cost

Meta releases as open source the llama 3.1 model in sizes 8B, 70B and 405B. The large model features the same capabilities as the best closed source models

mistral ai releases three new models: Codestral Mamba, Mistral NeMo and Mathstral designed for mathematics

Google DeepMind has unveiled two new AI systems that won silver medals at this year's International Mathematical Olympiad (IMO), AlphaProof and AlphaGeometry 2.

OpenAI launched SearchGPT, an integrated web search

Startup Udio has released Udio v1.5, an updated version of its music creation model

Mistral AI has released a large language model Mistral Large 2 in size 123B, which presents capabilities close to the closed SOTA models.

Midjourney v6.1 is released

Google releases the Gemma 2 2B model as open source. The model demonstrates better capabilities than much larger models.

August

"Black Forest Labs" releases weights for an image creation model named Flux, which shows better performance than similar closedsource models.

OpenAI released a new version of its model, GPT-4o 0806, achieving 100% success in generating valid JSON output.

Google's image generation model, Imagen 3, has been released.

xAI Corporation has launched the models Grok 2 and Grok 2 mini, which demonstrate performance on par with leading SOTA models in the market.

Microsoft has introduced its small language models, Phi 3.5, in three versions, each showcasing impressive performance relative to their size.

Google has introduced three new experimental AI models: Gemini 1.5 Flash8B, Gemini 1.5 Pro Enhanced, and Gemini 1.5 Flash Updated.

Ideogram 2.0 has been released, offering image generation capabilities that surpass those of other leading models.

Luma has unveiled the Dream Machine 1.5 model for video creation.

September

The French AI company Mistral has introduced Pixtral12B, its first multimodal model capable of processing both images and text.

OPENAI has released two nextgeneration AI models to its subscribers: o1 preview and o1 mini. These models show a significant improvement in performance, particularly in tasks requiring reasoning, including coding, mathematics, GPQA, and more.

Chinese company Alibaba releases the Qwen 2.5 model in various sizes, ranging from 0.5B to 72B. The models demonstrate capabilities comparable to much larger models.

The video generation model KLING 1.5 has been released.

OpenAI launches the advanced voice mode of GPT4o for all subscribers.

Meta releases Llama 3.2 in sizes 1B, 3B, 11B and 90B, featuring image recognition capabilities for the first time.

Google has rolled out new model updates ready for deployment, Gemini Pro 1.5 002 and Gemini Flash 1.5 002, showcasing significantly improved longcontext processing.

Kyutai releases two opensource versions of its voicetovoice model, Moshi.

Google releases an update to its AI tool NotebookLM that enables users to create podcasts based on their own content.

Mistral AI launches a 22B model named Mistral Small.

October

Flux 1.1 Pro is released, showcasing advanced capabilities for image creation.

Meta unveils Movie Gen, a new AI model that generates videos, images, and audio from text input.

Pika introduces Video Model 1.5 along with "Pika Effects."

Adobe announces its video creation model, Firefly Video.

Startup Rhymes AI releases Aria, an opensource, multimodal model exhibiting capabilities similar to comparably sized proprietary models.

Meta releases an opensource speechtospeech language model named Meta Spirit LM.

Mistral AI introduces Ministral, a new model available in 3B and 8B parameter sizes.

Janus AI, a multimodal language model capable of recognizing and generating both text and images, is released as open source by DeepSeekAI.

Google DeepMind and MIT unveil Fluid, a texttoimage generation model with industryleading performance at a scale of 10.5B parameters.

Stable Diffusion 3.5 is released in three sizes as open source.

Anthropic launches Claude 3.5 Sonnet New, demonstrating significant advancements in specific areas over its previous version, and announces Claude 3.5 Haiku.

Anthropic announces an experimental feature for computer use with a public beta API.

The texttoimage model Recraft v3 has been released to the public, ranking first in benchmarks compared to similar models.

OpenAI has launched Search GPT, allowing users to perform web searches directly within the platform.

November

Alibaba released its new model, QwQ 32B Preview, which integrates reasoning capabilities before responding. The model competes with, and sometimes surpasses, OpenAI's o1-preview model.

Alibaba opensourced the model Qwen2.5 Coder 32B, which offers comparable capabilities to leading proprietary language models in the coding domain.

DeepSeek unveiled its new AI model, DeepSeek-R1-Lite-Preview, which incorporates reasoning capabilities and delivers impressive performance on the AIME and MATH benchmarks, matching the level of OpenAI's o1-preview.

Suno upgraded its AIpowered music generator to v4, introducing new features and performance improvements.

Mistral AI launched the Pixtral Large model, a multimodal language model excelling in image recognition and advanced performance metrics, and an update to Mistral Large, 2411.

Google introduced two experimental models, gemini-exp-1114 and gemini-exp-1121, currently leading the arena chatbot with enhanced performance.

Anthropic launches Claude 3.5 Haiku and Visual PDF Analysis in Claude.

December

Amazon introduced a new series of models called NOVA, designed for text, image, and video processing.

OpenAI released SORA, a video generation model, along with the full version of O1 and O1 Pro for advanced subscribers. Additionally, the company launched a live video mode for GPT4o.

Google unveiled the experimental model Gemini-Exp-1206, which ranked first in the chatbot leaderboard.

Google launched Gemini 2.0 Flash in beta. This model leads benchmarks and outperforms the previous version, Gemini Pro 1.5. Additionally, Google introduced live speech and video mode and announced built-in image generation capabilities within the model.

Google revealed Gemini-2.0-Flash-Thinking, a thinking model based on Gemini 2.0 Flash, which secured second place in the chatbot leaderboard.

Google introduced Veo 2, a beta version video generation model capable of producing 4K videos up to two minutes long. The model outperformed SORA in human evaluations. Additionally, Google updated Imagen 3, offering enhanced image quality and realism.

xAI integrated Aurora, a new model for generating high-quality and realistic images.

Microsoft open-sourced the Phi4 model, sized at 14B, showcasing impressive capabilities for its size.

Meta released Llama 3.3 70B, a model offering performance comparable to Llama 3.1 405B.

Google launched a multi-modal open-source model called PaliGemma 2, integrated with existing Gemma models.

Pika Labs released 2.0, the latest version of its AI-powered video generator.

Meta introduced Apollo, a video generation model available in three different sizes.

Deepseek open-sourced Deepseek V3, a model with 671B parameters that surpasses closed-source SOTA models across several benchmarks.

Alibaba unveiled QVQ-72B-Preview, a cutting-edge thinking model capable of analyzing images, featuring SOTA-level performance.

OpenAI announced O3, a groundbreaking AI model achieving 87.5% in the ARC-AGI benchmark, 25.2% in the Frontier Math Benchmark (compared to under 2% in previous models), and 87.7% in Ph.D.-level science questions. A cost-effective version, O3 Mini, is expected in January 2025, with performance similar to O1, alongside improved speed and efficiency.

The video generation model Kling 1.6 was released, offering significant performance enhancements.