Information is Beautiful - The Rise of AI-Based Large Language Models (LLMs)

	A	B	C	D	E	F
1	see the visualisation:	https://informationisbeautiful.net/visualizations/the-rise-of-generative-ai-large-language-models-llms-like-chatgpt/
2	last update 20th Mar 2023
3	name	owner	parameters (billions)	date	note (* = parameter count undisclosed)	link

4	BERT	Google	0.34	Oct 2018		https://en.wikipedia.org/wiki/BERT_(language_model)
5	GPT-2	OpenAI	1.5	Feb 2019	trained on Reddit only	https://en.wikipedia.org/wiki/GPT-2
6	T5	Google	11	Oct 2019		https://arxiv.org/abs/1910.10683
7	Megatron-11B	Meta / Facebook	11	Apr 2020		https://github.com/pytorch/fairseq/tree/main/examples/megatron_11b
8	BlenderBot1	Meta / Facebook	9.4	Apr 2020		https://cobusgreyling.medium.com/meta-ais-blender-bot-3-0-is-an-open-source-chatbot-with-long-term-memory-internet-search-ce024a5fe8aa
9	GPT-3	OpenAI	175	May 2020		https://en.wikipedia.org/wiki/GPT-3
10	Wu Dao 2.0	Beijing Academy of AI	1750	Jan 2021		https://en.wikipedia.org/wiki/Wu_Dao
11	GPT-J	EleutherAI	6	Jun 2021		https://huggingface.co/EleutherAI/gpt-j-6b
12	PanGu-Alpha	Huawei	200	Apr 2021		https://arxiv.org/abs/2104.12369
13	LaMDA	Google	137	Jun 2021		https://en.wikipedia.org/wiki/LaMDA
14	BlenderBot2.0	Meta / Facebook	9.4	Jul 2021		https://cobusgreyling.medium.com/meta-ais-blender-bot-3-0-is-an-open-source-chatbot-with-long-term-memory-internet-search-ce024a5fe8aa
15	Jurassic-1	AI21	178	Aug 2021		https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1
16	Codex	OpenAI	12	Aug 2021	Generates programming code	https://arxiv.org/abs/2107.03374
17	FLAN	Google	137	Sep 2021		https://arxiv.org/abs/2109.01652
18	PLATO-XL	Baidu	11	Sep 2021	chatbot	https://arxiv.org/abs/2109.09519
19	WeLM	WeChat	10	Sep 2022	87% Chinese language	https://arxiv.org/abs/2209.10372
20	xlarge	Cohere	52.4	Sep 2021	Trained on "ebooks and webpages"	https://arxiv.org/abs/2108.07790
21	Megatron-Turing NLG	Nvidia / Microsoft	530	Oct 2021		https://developer.nvidia.com/megatron-turing-natural-language-generation
22	MT-NLG	Microsoft	530	Oct 2021		https://arxiv.org/abs/2201.11990
23	BERT-200	Google	200	Nov 2021		https://cloud.google.com/blog/topics/tpus/google-showcases-cloud-tpu-v4-pods-for-large-model-training
24	BERT-480	Google	480	Nov 2021		https://cloud.google.com/blog/topics/tpus/google-showcases-cloud-tpu-v4-pods-for-large-model-training
25	Luminous	Aleph Alpha	200	Nov 2021	German-language	https://www.aleph-alpha.de/pricing
26	Ernie 3.0 Titan	Baidu	260	Dec 2021		https://www.marktechpost.com/2021/12/29/baidu-and-pcl-team-introduce-ernie-3-0-titan-a-pre-training-language-model-with-260-billion-parameters/
27	GLaM	Google	1200	Dec 2021		https://ai.googleblog.com/2021/12/more-efficient-in-context-learning-with.html
28	Gopher	Google DeepMind	280	Dec 2021		https://www.deepmind.com/blog/language-modelling-at-scale-gopher-ethical-considerations-and-retrieval
29	GPT-NeoX	EleutherAI	20	Feb 2022		https://huggingface.co/docs/transformers/model_doc/gpt_neox
30	GPT Neo	EleutherAI	2.7	Feb 2022		https://huggingface.co/docs/transformers/model_doc/gpt_neo
31	Chinchilla	DeepMind	70	Mar 2022		https://arxiv.org/abs/2203.15556v1
32	CodeGen	Salesforce	16	Mar 2022	Generates programming code	https://arxiv.org/abs/2203.13474
33	InCoder	Meta	6.7	Apr 2022	Generates Python and JavaScript	https://arxiv.org/abs/2204.05999
34	mGPT	Sber	13	Apr 2022	60 languages	https://arxiv.org/abs/2204.07580
35	PaLM	Google	540	Apr 2022		https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
36	OPT-IML	Meta AI	175	May 2022		https://arxiv.org/abs/2212.12017
37	Minerva	Google	540	Jun 2022		https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html
38	YaLM 100B	Yandex	100	Jun 2022	Russian / English	https://huggingface.co/yandex/yalm-100b
39	BLOOM	BigScience	175	Jul 2022		https://huggingface.co/bigscience/bloom
40	FIM 6.9B	OpenAI	6.9	Jul 2022		https://arxiv.org/pdf/2207.14255.pdf
41	NLLB-200	Meta AI	54.5	Jul 2022	200 language translation	https://ai.facebook.com/blog/nllb-200-high-quality-machine-translation/
42	GLM-130B	Tsinghua & Zhipu	130	Aug 2022		https://huggingface.co/spaces/THUDM/GLM-130B
43	Atlas	Meta	11	Aug 2022		https://arxiv.org/abs/2208.03299
44	BlenderBot3	Meta / Facebook	175	Aug 2022		https://cobusgreyling.medium.com/meta-ais-blender-bot-3-0-is-an-open-source-chatbot-with-long-term-memory-internet-search-ce024a5fe8aa
45	AlexaTM	Amazon	20	Aug 2022	trained on Wikipedia and mC4 only	https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning
46	PaLI	Google	17	Sep 2022	Vision model	https://arxiv.org/abs/2209.06794
47	Sparrow	Google	70	Sep 2022	powered by Chinchilla	https://en.wikipedia.org/wiki/Sparrow_(bot)
48	MT5	Google	13	Oct 2022	101 languages	https://huggingface.co/google/mt5-base
49	Galactica	Meta / Facebook	120	Nov 2022	scientific only	https://www.technologyreview.com/2022/11/18/1063487/meta-large-language-model-ai-only-survived-three-days-gpt-3-science/
50	ChatGPT	OpenAI	12	Nov 2022		https://en.wikipedia.org/wiki/ChatGPT
51	RL-CAI	Anthropic	52	Dec 2022		https://lifearchitect.ai/anthropic/
52	Exaone	LG	300	Dec 2022		https://sourceforge.net/software/product/EXAONE/
53	GPT 3.5	OpenAI	175	Dec 2022		https://openai.com/blog/chatgpt
54	WebGPT	OpenAI / Microsoft	175	Jan 2023		https://openai.com/research/webgpt
55	Claude	Anthropic	52	Jan 2023		https://arstechnica.com/information-technology/2023/03/anthropic-introduces-claude-a-more-steerable-ai-competitor-to-chatgpt/
56	LLaMa	Meta / Facebook	65	Feb 2023		https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
57	Luminous Supreme	Aleph Alpha	70	Feb 2023	German-language	https://docs.aleph-alpha.com/docs/introduction/prompting_and_completion/#zero-shot-learning-with-luminous-supreme-control
58	PanGu-Sigma	Huawei	1085	Mar 2023		https://arxiv.org/abs/2303.10845
59	Bard*	Google	0.7	Feb 2023	powered by LaMDA	https://techmonitor.ai/technology/ai-and-automation/google-i-o-bard-chatbot-llm-palm2-gemini
60	Alpaca	Stanford	7	Mar 2023		https://github.com/tatsu-lab/stanford_alpaca
61	BloombergGPT	Bloomberg	50	Mar 2023	Finance-focussed (of course)	https://arxiv.org/abs/2303.17564
62	Cerebras-GPT	Cerebras	13	Mar 2023	open-source	https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/
63	Ernie Bot	Baidu	200	Dec 2021		https://www.prnewswire.com/news-releases/baidu-unveils-ernie-bot-the-latest-generative-ai-mastering-chinese-language-and-multi-modal-generation-301774240.html
64	GPT-4*	OpenAI	1000	Mar 2023		https://en.wikipedia.org/wiki/GPT-4
65	GPT4All-LoRA	Nomic	7	Mar 2023	open source chatbot based on LLaMa	https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf
66	Jurassic-2*	AI21	200	Mar 2023		https://thenewstack.io/ai21-labs-releases-jurassic-2-its-new-large-language-model/
67	Koala-13B	Berkeley	13	Apr 2023	Based on LLaMA	https://bair.berkeley.edu/blog/2023/04/03/koala/
68	StableLM	Stability AI	65	Apr 2023	open-source from the makers of Stable Diffusion	https://github.com/stability-AI/stableLM/
69	Dolly 2.0	Databricks	12	Apr 2023	open-source	https://arstechnica.com/information-technology/2023/04/a-really-big-deal-dolly-is-a-free-open-source-chatgpt-style-ai-model/
70	SenseChat	SenseTime	200	Apr 2023		https://www.silicon.co.uk/e-innovation/artificial-intelligence/sensetime-ai-505764
71	Titan	Amazon	350	Apr 2023		https://aws.amazon.com/bedrock/titan/
72	Tongyi Qianwen	Alibaba	200	Apr 2023	name roughly translates to “truth from a thousand questions”	https://www.theregister.com/2023/04/11/alibaba_tongyi_qianwen_llm/
73	Hugging Chat	LAION	30	Apr 2023		https://techcrunch.com/2023/04/25/hugging-face-releases-its-own-version-of-chatgpt/
74	BingChat*	Microsoft / OpenAI	1000	Apr 2023	Microsoft's version of ChatGPT	https://www.zdnet.com/article/how-to-use-the-new-bing-and-how-its-different-from-chatgpt/
75	PaLM2	Google	540	May 2023	Trained on 100 languages and 20 programming languages. Google says the new model is better at common sense reasoning, mathematics and logic	https://techcrunch.com/2023/05/10/google-launches-palm-2-its-next-gen-large-language-model/
76	Vicuna-13B	Vicuna Team	13	Mar 2023	an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT	https://lmsys.org/blog/2023-03-30-vicuna/
77	Falcon LLM	Technology Innovation Institute	40	Jun 2023	foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens	https://falconllm.tii.ae/
78	Sail-7B	Open Language Safety Research	7	Jun 2023	search engine-grounded large language model based on LLaMA-7B	https://openlsr.org/sail-7b
79	Web LLM	Independent	7	Jun 2023	Browser-based LLM Chatbot	https://simonwillison.net/2023/Apr/16/web-llm/
80	OpenLLM	Independent	13	Jun 2023		https://huggingface.co/openlm-research/open_llama_13b_easylm
81	Ernie Bot 3.5	Baidu	200	Jul 2023	Surpasses ChatGPT (3.5) in comprehensive ability scores and outperforms GPT-4 in several Chinese language capabilities; supports plugins.	http://research.baidu.com/Blog/index-view?id=185
82	Claude 2	Anthropic	52	Jul 2023	Expanded input and output length (up to 100,000 tokens) allowing the AI model to analyze long documents such as technical guides or entire books	https://arstechnica.com/information-technology/2023/07/new-chatgpt-rival-claude-2-launches-for-open-beta-testing/
83	LLaMa2	Meta / Facebook	70	Jul 2023	Open source LLM comes in 3 parameter sizes - 7, 13, and 70 bn	https://venturebeat.com/ai/facebook-parent-meta-unveils-llama-2-open-source-ai-model-for-commercial-use/
84	Baichuan 2	Baichuan Intelligence	13	Jul 2023	Chinese open-access equivalent to Meta's Llama model	https://techcrunch.com/2023/07/11/chinas-search-engine-pioneer-unveils-open-source-large-language-model-to-rival-openai/
85	Claude Instant	Anthropic	52	Aug 2023	100,000 token window allowing analysis of up to 75,000 words	https://techcrunch.com/2023/08/09/anthropic-launches-improved-version-of-its-entry-level-llm/
86	IDEFICS	Independent	80	Aug 2023	Clone of Flamingo using Llama-1 65B; name stands for Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS	https://huggingface.co/blog/idefics
87	Jais Chat	Independent	13	Aug 2023	Arabic language LLM, trained in UAE	https://arxiv.org/abs/2309.03852
88	Japanese StableLM Alpha 7B	Stability AI	7	Aug 2023	Open source Japanese lang. model	https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b
89	InternLM	Independent	20	Sep 2023		https://github.com/InternLM/InternLM
90	Falcon 180B	TII	180	Sep 2023	Largest open-access model	https://huggingface.co/blog/falcon-180b
91	Bolt 2.5B	ThirdAI	2.5	Sep 2023	Notable for being trained only on CPUs rather than GPU arrays	https://medium.com/thirdai-blog/introducing-the-worlds-first-generative-llm-pre-trained-only-on-cpus-meet-thirdai-s-bolt2-5b-10c0600e1af4
92	DeciLM	Deci AI	5.7	Sep 2023	15x faster than Llama 2	https://deci.ai/blog/decilm-15-times-faster-than-llama2-nas-generated-llm-with-variable-gqa/
93	Mistral-7B	Mistral AI	7	Sep 2023	Open source, outperforms Llama2	https://mistral.ai/news/announcing-mistral-7b/
94	Persimmon-8B	Adept	8	Sep 2023	Open Apache license and publicly accessible weights.	https://github.com/persimmon-ai-labs/adept-inference
95	MoLM	IBM	8	Sep 2023	ModuleFormer is based on the Sparse Mixture of Experts (MoE).	https://github.com/ibm/moduleformer
96	Qwen	Alibaba	14	Sep 2023	'lags behind both GPT-3.5 and GPT-4'	https://huggingface.co/Qwen
97	AceGPT	KAUST/Shenzhen	13	Sep 2023	Arabic. Llama 2 + RLAIF	https://huggingface.co/FreedomIntelligence/AceGPT-13B
98	Retro48B	Nvidia	48	Sep 2023	'the largest LLM pretrained with retrieval before instruction tuning.'	https://i-genie.co.uk/researchers-from-nvidia-introduce-retro-48b-the-largest-llm-pretrained-with-retrieval-before-instruction-tuning/
99	Ernie 4.0	Baidu	1000	Oct 2023	Enhanced Representation through kNowledge IntEgration	https://slashdot.org/story/23/10/17/1156245/baidu-says-its-ai-as-good-as-chatgpt-in-big-claim-for-china
100	Fuyu	Adept	8	Oct 2023		https://huggingface.co/adept/fuyu-8b
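
The rows above follow a fixed six-column layout (name, owner, parameters in billions, date, note, link), with a trailing * on the name marking an undisclosed parameter count. The following is a minimal sketch of reading one row under that layout. It assumes the cells are tab-separated and that each data row begins with its spreadsheet row number; `ModelRow` and `parse_row` are hypothetical names used only for illustration and are not part of the sheet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelRow:
    name: str
    owner: str
    params_billion: float
    undisclosed: bool  # True when the name carries the sheet's "*" marker
    date: str
    note: str
    link: str


def parse_row(line: str) -> Optional[ModelRow]:
    """Parse one tab-separated data row; return None for non-data lines."""
    cells = line.rstrip("\n").split("\t")
    # Assumption: a data row is the row number followed by six columns.
    if len(cells) < 7 or not cells[0].strip().isdigit():
        return None
    _, name, owner, params, date, note, link = (c.strip() for c in cells[:7])
    try:
        # Some counts are written with a thousands separator, e.g. "1,000".
        value = float(params.replace(",", ""))
    except ValueError:
        return None  # e.g. the header row, whose third column is text
    return ModelRow(name.rstrip("*"), owner, value, name.endswith("*"), date, note, link)
```

Lines that do not match the layout (the title, the column letters, the header row) return None, so the function can be mapped over every line of the export and the None results filtered out.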