see the visualisation:
https://informationisbeautiful.net/visualizations/the-rise-of-generative-ai-large-language-models-llms-like-chatgpt/
last update 20th Mar 2023
name | owner | trained on x billion parameters | date | note / * = parameters undisclosed | link
BERT | Google | 0.34 | Oct 2018 | | https://en.wikipedia.org/wiki/BERT_(language_model)
GPT-2 | OpenAI | 1.5 | Feb 2019 | trained on Reddit only | https://en.wikipedia.org/wiki/GPT-2
T5 | Google | 11 | Oct 2019 | | https://arxiv.org/abs/1910.10683
Megatron-11B | Meta / Facebook | 11 | Apr 2020 | | https://github.com/pytorch/fairseq/tree/main/examples/megatron_11b
BlenderBot1 | Meta / Facebook | 9.4 | Apr 2020 | | https://cobusgreyling.medium.com/meta-ais-blender-bot-3-0-is-an-open-source-chatbot-with-long-term-memory-internet-search-ce024a5fe8aa
GPT-3 | OpenAI | 175 | May 2020 | | https://en.wikipedia.org/wiki/GPT-3
Wu Dao 2.0 | Beijing Academy of AI | 1750 | Jan 2021 | | https://en.wikipedia.org/wiki/Wu_Dao
GPT-J | EleutherAI | 6 | Jun 2021 | | https://huggingface.co/EleutherAI/gpt-j-6b
PanGu-Alpha | Huawei | 200 | Apr 2021 | | https://arxiv.org/abs/2104.12369
LaMDA | Google | 137 | Jun 2021 | | https://en.wikipedia.org/wiki/LaMDA
BlenderBot2.0 | Meta / Facebook | 9.4 | Jul 2021 | | https://cobusgreyling.medium.com/meta-ais-blender-bot-3-0-is-an-open-source-chatbot-with-long-term-memory-internet-search-ce024a5fe8aa
Jurassic-1 | AI21 | 178 | Aug 2021 | | https://www.ai21.com/blog/announcing-ai21-studio-and-jurassic-1
Codex | OpenAI | 12 | Aug 2021 | Generates programming code | https://arxiv.org/abs/2107.03374
FLAN | Google | 137 | Sep 2021 | | https://arxiv.org/abs/2109.01652
PLATO-XL | Baidu | 11 | Sep 2021 | chatbot | https://arxiv.org/abs/2109.09519
WeLM | WeChat | 10 | Sep 2022 | 87% Chinese language | https://arxiv.org/abs/2209.10372
xlarge | Cohere | 52.4 | Sep 2021 | Trained on "ebooks and webpages" | https://arxiv.org/abs/2108.07790
Megatron-Turing NLG | Microsoft / Nvidia | 530 | Oct 2021 | | https://developer.nvidia.com/megatron-turing-natural-language-generation
MT-NLG | Microsoft | 530 | Oct 2021 | | https://arxiv.org/abs/2201.11990
BERT-200 | Google | 200 | Nov 2021 | | https://cloud.google.com/blog/topics/tpus/google-showcases-cloud-tpu-v4-pods-for-large-model-training (same as above)
BERT-480 | Google | 480 | Nov 2021 | | https://cloud.google.com/blog/topics/tpus/google-showcases-cloud-tpu-v4-pods-for-large-model-training
Luminous | Aleph Alpha | 200 | Nov 2021 | German-language | https://www.aleph-alpha.de/pricing
Ernie 3.0 Titan | Baidu | 260 | Dec 2021 | | https://www.marktechpost.com/2021/12/29/baidu-and-pcl-team-introduce-ernie-3-0-titan-a-pre-training-language-model-with-260-billion-parameters/
GLaM | Google | 1200 | Dec 2021 | | https://ai.googleblog.com/2021/12/more-efficient-in-context-learning-with.html
Gopher | Google DeepMind | 280 | Dec 2021 | | https://www.deepmind.com/blog/language-modelling-at-scale-gopher-ethical-considerations-and-retrieval
GPT-NeoX | EleutherAI | 20 | Feb 2022 | | https://huggingface.co/docs/transformers/model_doc/gpt_neox
GPT Neo | EleutherAI | 2.7 | Feb 2022 | | https://huggingface.co/docs/transformers/model_doc/gpt_neo
Chinchilla | DeepMind | 70 | Mar 2022 | | https://arxiv.org/abs/2203.15556v1
CodeGen | Salesforce | 16 | Mar 2022 | Generates programming code | https://arxiv.org/abs/2203.13474
InCoder | Meta | 6.7 | Apr 2022 | generates Python and JavaScript | https://arxiv.org/abs/2204.05999
mGPT | Sber | 13 | Apr 2022 | 60 languages | https://arxiv.org/abs/2204.07580
PaLM | Google | 540 | Apr 2022 | | https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html
OPT-IML | Meta AI | 175 | May 2022 | | https://arxiv.org/abs/2212.12017
Minerva | Google | 540 | Jun 2022 | | https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html
YaLM 100B | Yandex | 100 | Jun 2022 | Russian / English | https://huggingface.co/yandex/yalm-100b
BLOOM | BigScience | 175 | Jul 2022 | | https://huggingface.co/bigscience/bloom
FIM 6.9B | OpenAI | 6.9 | Jul 2022 | | https://arxiv.org/pdf/2207.14255.pdf
NLLB-200 | Meta AI | 54.5 | Jul 2022 | 200 language translation | https://ai.facebook.com/blog/nllb-200-high-quality-machine-translation/
GLM-130B | Tsinghua & Zhipu | 130 | Aug 2022 | | https://huggingface.co/spaces/THUDM/GLM-130B
Atlas | Meta | 11 | Aug 2022 | | https://arxiv.org/abs/2208.03299
BlenderBot3 | Meta / Facebook | 175 | Aug 2022 | | https://cobusgreyling.medium.com/meta-ais-blender-bot-3-0-is-an-open-source-chatbot-with-long-term-memory-internet-search-ce024a5fe8aa
AlexaTM | Amazon | 20 | Aug 2022 | trained on Wikipedia and mC4 only | https://www.amazon.science/blog/20b-parameter-alexa-model-sets-new-marks-in-few-shot-learning
PaLI | Google | 17 | Sep 2022 | Vision model | https://arxiv.org/abs/2209.06794
Sparrow | Google | 70 | Sep 2022 | powered by Chinchilla | https://en.wikipedia.org/wiki/Sparrow_(bot)
MT5 | Google | 13 | Oct 2022 | 101 languages | https://huggingface.co/google/mt5-base
Galactica | Meta / Facebook | 120 | Nov 2022 | scientific only | https://www.technologyreview.com/2022/11/18/1063487/meta-large-language-model-ai-only-survived-three-days-gpt-3-science/
ChatGPT | OpenAI | 12 | Nov 2022 | | https://en.wikipedia.org/wiki/ChatGPT
RL-CAI | Anthropic | 52 | Dec 2022 | | https://lifearchitect.ai/anthropic/
Exaone | LG | 300 | Dec 2022 | | https://sourceforge.net/software/product/EXAONE/
GPT 3.5 | OpenAI | 175 | Dec 2022 | | https://openai.com/blog/chatgpt
WebGPT | OpenAI / Microsoft | 175 | Jan 2023 | | https://openai.com/research/webgpt
Claude | Anthropic | 52 | Jan 2023 | | https://arstechnica.com/information-technology/2023/03/anthropic-introduces-claude-a-more-steerable-ai-competitor-to-chatgpt/
LLaMa | Meta / Facebook | 65 | Feb 2023 | | https://ai.facebook.com/blog/large-language-model-llama-meta-ai/
Luminous Supreme | Aleph Alpha | 70 | Feb 2023 | German-language | https://docs.aleph-alpha.com/docs/introduction/prompting_and_completion/#zero-shot-learning-with-luminous-supreme-control
PanGu-Sigma | Huawei | 1085 | Mar 2023 | | https://arxiv.org/abs/2303.10845
Bard* | Google | 0.7 | Feb 2023 | powered by LaMDA | https://techmonitor.ai/technology/ai-and-automation/google-i-o-bard-chatbot-llm-palm2-gemini
Alpaca | Stanford | 7 | Mar 2023 | | https://github.com/tatsu-lab/stanford_alpaca
BloombergGPT | Bloomberg | 50 | Mar 2023 | Finance-focussed (of course) | https://arxiv.org/abs/2303.17564
Cerebras-GPT | Cerebras | 13 | Mar 2023 | open-source | https://www.cerebras.net/blog/cerebras-gpt-a-family-of-open-compute-efficient-large-language-models/
Ernie Bot | Baidu | 200 | Dec 2021 | | https://www.prnewswire.com/news-releases/baidu-unveils-ernie-bot-the-latest-generative-ai-mastering-chinese-language-and-multi-modal-generation-301774240.html
GPT-4* | OpenAI | 1,000 | Mar 2023 | | https://en.wikipedia.org/wiki/GPT-4
GPT4All-LoRA | Nomic | 7 | Mar 2023 | open source chatbot based on LLaMa | https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf
Jurassic-2* | AI21 | 200 | Mar 2023 | | https://thenewstack.io/ai21-labs-releases-jurassic-2-its-new-large-language-model/
Koala-13B | Berkeley | 13 | Apr 2023 | Based on LLaMA | https://bair.berkeley.edu/blog/2023/04/03/koala/
StableLM | Stability AI | 65 | Apr 2023 | open-source from the makers of Stable Diffusion | https://github.com/stability-AI/stableLM/
Dolly 2.0 | Databricks | 12 | Apr 2023 | open-source | https://arstechnica.com/information-technology/2023/04/a-really-big-deal-dolly-is-a-free-open-source-chatgpt-style-ai-model/
SenseChat | SenseTime | 200 | Apr 2023 | | https://www.silicon.co.uk/e-innovation/artificial-intelligence/sensetime-ai-505764
Titan | Amazon | 350 | Apr 2023 | | https://aws.amazon.com/bedrock/titan/
Tongyi Qianwen | Alibaba | 200 | Apr 2023 | name roughly translates to “truth from a thousand questions” | https://www.theregister.com/2023/04/11/alibaba_tongyi_qianwen_llm/
Hugging Chat | LAION | 30 | Apr 2023 | | https://techcrunch.com/2023/04/25/hugging-face-releases-its-own-version-of-chatgpt/?guccounter=1&guce_referrer=aHR0cHM6Ly9uZXdzLnNsYXNoZG90Lm9yZy8&guce_referrer_sig=AQAAAAykGMvXCA4mB45v7uwolZNOHKsD8v0oCXuvA_ODzNeQYDZSu_-gosaiEklXgzcJrzmgiNapj8m3WQ7gmE8auQxFEIKokjxYpdx7TXhOimIuz0Dww2I7ceB29AYZHtxkD4wfgA8BN4aB5CR3L9aVOLjXXiiCHDmCvhBr9I8xwLAo
BingChat* | Microsoft / OpenAI | 1,000 | Apr 2023 | Microsoft's version of ChatGPT | https://www.zdnet.com/article/how-to-use-the-new-bing-and-how-its-different-from-chatgpt/
PaLM2 | Google | 540 | May 2023 | Trained on 100 languages and 20 programming languages. Google says the new model is better at common sense reasoning, mathematics and logic | https://techcrunch.com/2023/05/10/google-launches-palm-2-its-next-gen-large-language-model/
Vicuna-13B | Vicuna Team | 65 | Mar 2023 | an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT | https://lmsys.org/blog/2023-03-30-vicuna/
Falcon LLM | Technology Innovation Institute | 40 | Jun 2023 | foundational large language model (LLM) with 40 billion parameters trained on one trillion tokens | https://falconllm.tii.ae/
Sail-7B | Open Language Safety Research | 7 | Jun 2023 | search engine-grounded large language model based on LLaMA-7B | https://openlsr.org/sail-7b
Web LLM | Independent | 7 | Jun 2023 | Browser-based LLM Chatbot | https://simonwillison.net/2023/Apr/16/web-llm/
OpenLLM | Independent | 13 | Jun 2023 | | https://huggingface.co/openlm-research/open_llama_13b_easylm
Ernie Bot 3.5 | Baidu | 200 | Jul 2023 | Surpasses ChatGPT (3.5) in comprehensive ability scores, outperforms GPT-4 in several Chinese language capabilities, and supports plugins. | http://research.baidu.com/Blog/index-view?id=185
Claude 2 | Anthropic | 52 | Jul 2023 | Expanded input and output length (up to 100,000 tokens) allowing the AI model to analyze long documents such as technical guides or entire books | https://arstechnica.com/information-technology/2023/07/new-chatgpt-rival-claude-2-launches-for-open-beta-testing/
LLaMa2 | Facebook | 70 | Jul 2023 | Open source LLM comes in 3 parameter sizes - 7, 13, and 70 bn | https://venturebeat.com/ai/facebook-parent-meta-unveils-llama-2-open-source-ai-model-for-commercial-use/
Baichuan 2 | Baichuan Intelligence | 13 | Jul 2023 | Chinese open-access equivalent to Meta's Llama model | https://techcrunch.com/2023/07/11/chinas-search-engine-pioneer-unveils-open-source-large-language-model-to-rival-openai/
Claude Instant | Anthropic | 52 | Aug 2023 | 100,000 token window allowing analysis of up to 75,000 words | https://techcrunch.com/2023/08/09/anthropic-launches-improved-version-of-its-entry-level-llm/
IDEFICS | Independent | 80 | Aug 2023 | Clone of Flamingo using Llama-1 65B | https://huggingface.co/blog/idefics
Jais Chat | Independent | 13 | Aug 2023 | Arabic language LLM, trained in UAE | https://arxiv.org/abs/2309.03852
Japanese StableLM Alpha 7B | Stability AI | 7 | Aug 2023 | Open source Japanese lang. model | https://huggingface.co/stabilityai/japanese-stablelm-base-alpha-7b
InternLM | Independent | 20 | Sep 2023 | Image-aware Decoder Enhanced à la Flamingo with Interleaved Cross-attentionS | https://github.com/InternLM/InternLM
Falcon 180B | TII | 180 | Sep 2023 | Largest open-access model | https://huggingface.co/blog/falcon-180b
Bolt 2.5B | ThirdAI | 3 | Sep 2023 | Notable for being trained only on CPUs rather than GPU arrays | https://medium.com/thirdai-blog/introducing-the-worlds-first-generative-llm-pre-trained-only-on-cpus-meet-thirdai-s-bolt2-5b-10c0600e1af4
DeciLM | Deci AI | 5.7 | Sep 2023 | 15x faster than Llama 2 | https://deci.ai/blog/decilm-15-times-faster-than-llama2-nas-generated-llm-with-variable-gqa/
Mistral-7B | Mistral AI | 7 | Sep 2023 | Open source, outperforms Llama2 | https://mistral.ai/news/announcing-mistral-7b/
Persimmon-8B | Adept | 8 | Sep 2023 | Open Apache license and publicly accessible weights. | https://github.com/persimmon-ai-labs/adept-inference
MoLM | IBM | 8 | Sep 2023 | ModuleFormer is based on the Sparse Mixture of Experts (MoE). | https://github.com/ibm/moduleformer
Qwen | Alibaba | 14 | Sep 2023 | 'lags behind both GPT-3.5 and GPT-4' | https://huggingface.co/Qwen
AceGPT | KAUST/Shenzhen | 13 | Sep 2023 | Arabic. Llama 2 + RLAIF | https://huggingface.co/FreedomIntelligence/AceGPT-13B
Retro48B | Nvidia | 48 | Sep 2023 | 'the largest LLM pretrained with retrieval before instruction tuning.' | https://i-genie.co.uk/researchers-from-nvidia-introduce-retro-48b-the-largest-llm-pretrained-with-retrieval-before-instruction-tuning/
Ernie 4.0 | Baidu | 1,000 | Oct 2023 | Enhanced Representation through kNowledge IntEgration | https://slashdot.org/story/23/10/17/1156245/baidu-says-its-ai-as-good-as-chatgpt-in-big-claim-for-china?utm_source=feedly1.0mainlinkanon&utm_medium=feed
Fuyu | Adept | 8 | Oct 2023 | | https://huggingface.co/adept/fuyu-8b