This musing is mostly born out of the challenges I faced while trying to load LLMs locally (for inference and fine-tuning) that are 1/1000th the parameter size of GPT-4.
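To make that challenge concrete, here is a minimal sketch of what loading a smaller, open-weights model locally can look like, assuming the Hugging Face transformers, accelerate and bitsandbytes libraries are installed; the model name is only an illustrative placeholder, not a recommendation.

```python
# A sketch of loading a small open-weights causal LM for local inference.
# Assumes: transformers, accelerate, bitsandbytes installed and a GPU available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "mistralai/Mistral-7B-v0.1"  # illustrative choice of a ~7B model

# 4-bit quantisation keeps the memory footprint within reach of a consumer GPU
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers on the available hardware
)

inputs = tokenizer("The blockchain trilemma states that", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Even with quantisation, this sits near the limit of a typical consumer machine, which is exactly the friction the rest of this musing is about.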
The dilemma around how transparent, democratised, scalable, and sustainable hugely popular LLM capabilities (such as ChatGPT) are for practitioners across the world reminded me faintly of the Blockchain Trilemma and the Blocksize War.
Well, this is more of an analogy than a theoretical similarity between the dilemmas these two very different technologies face when they attempt to scale. The Blockchain Trilemma states that any blockchain can only fully solve two of the three problems of security, scalability, and decentralisation (though there are now workarounds, yet to stabilise, such as sharding and roll-ups).
Granting myself creative license to stretch the analogy: if we view the LLM deployment issues faced today through a Blockchain Trilemma lens, we face questions of
scalability (can we keep adding more data and compute to improve the model?)
security (is your data at risk? If you're sending it over the OpenAI APIs, then yes, e.g. the Samsung data leaks to ChatGPT)
decentralisation (can anyone, anywhere, with some experience in this field, run such a model on a regular machine? Can training be decentralised?).
Another similarity is that both carry the strong spirit, and responsibility, of an open-source community (unless they're developed by big, private companies).
The Blockchain Trilemma is not resolved yet; however, the community is experimenting with promising methods to expand the scale of transactions without compromising on decentralisation.
Can we take inspiration from the Blockchain Trilemma to see how we can improve the decentralisation and democratisation of LLMs? Should they be smaller, smarter, distilled?
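For what "distilled" could mean in practice, here is a minimal sketch of the classic knowledge-distillation loss, where a small student model learns from a large teacher's softened outputs as well as the ground-truth labels. The temperature and weighting values are illustrative assumptions, not a recipe, and the toy tensors stand in for real model outputs.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend the teacher's softened distribution with the ground-truth labels."""
    # Soft targets: KL divergence between temperature-scaled distributions,
    # scaled by T^2 so gradient magnitudes stay comparable across temperatures
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the true labels
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# Toy usage: a batch of 4 examples over a 10-class output space
student_logits = torch.randn(4, 10, requires_grad=True)
teacher_logits = torch.randn(4, 10)
labels = torch.randint(0, 10, (4,))
loss = distillation_loss(student_logits, teacher_logits, labels)
loss.backward()
```

The appeal of this kind of approach for the trilemma framing is that the expensive teacher is trained once, centrally, while the cheap student is the thing anyone can actually run on a regular machine.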