Distributed Large Language Models
LLMs have proven their utility for generating small scripts. You'll often read comments along these lines on programming forums these days:
Sure, LLMs can generate little scripts, but they can't create real programs...
I think this is largely true; however, that "scripting" ability is still useful in "real" programs: combined with auto-complete, it can definitely contribute to developing a full-sized program. Tools like Copilot are already taking advantage of this ability, and it is generally useful, although in my opinion it comes with some major caveats.
Because LLMs are so good at condensing a large amount of information into a relatively small model that can be queried in natural language, I worry, like many others, about the loss of great sources of information such as Stack Overflow. Stack Overflow's advantage over an LLM is that it is constantly gaining new information, fed by feedback loops that encourage contributing new knowledge. That system has its own issues, but in general I think the advantage holds.
However, part of me feels like the philosophers of old, worried that this new-fangled writing technology would reduce human intelligence because we would no longer need to learn through discussion. So I think it's important to consider what comes next, now that we can in some ways have a miniature, static Stack Overflow on our PCs: will we have shared models that can send information back to a host, feeding in new tips, fixes, and so on? I fear this being centralized by commercial entities; Stack Overflow seems to be trying to build something like this, and OpenAI and Microsoft are already using feedback mechanisms.
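To make that idea concrete, here is a rough sketch of what a single feedback record sent back to a host might look like. The field names are my own invention for illustration, not any real API:

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical sketch of a feedback record a shared model could send back to
# a host. The field names here are my own invention, not any real API.
@dataclass
class FeedbackRecord:
    prompt: str      # what the user asked the model
    answer: str      # what the model produced
    correction: str  # the new tip or fix the user is contributing
    helpful: bool    # a simple thumbs-up/down signal

record = FeedbackRecord(
    prompt="How do I reverse a list in Python?",
    answer="Call xs.reverse()",
    correction="reversed(xs) returns an iterator without mutating xs",
    helpful=True,
)

# Serialize to JSON, the sort of payload a client might submit upstream.
print(json.dumps(asdict(record)))
```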
Would it be possible for the programming community to do P2P model training and inference? It sounds great, but how do we develop and use shared models that have minimal or no quality control? Can a community form around a P2P model and correct its training? How does a community pay for the training? Is it distributed, like Folding@home but for programming help? Building a DNN model isn't just a matter of dropping in a bunch of data and hitting the train button; it also involves a lot of decisions by the trainer: choosing an appropriate number of layers, deciding how many iterations to train for, and more. These decisions might need to be fixed to some degree in a shared community model.
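As a sketch of what "fixed to some degree" might mean in practice, here is a hypothetical frozen config capturing the decisions every peer would have to agree on before contributing training. All names and values are illustrative, not a real training recipe:

```python
from dataclasses import dataclass

# A hypothetical, frozen set of training decisions that every peer in a
# community-trained model would have to agree on up front. All names and
# values here are illustrative, not a real training recipe.
@dataclass(frozen=True)
class CommunityTrainingConfig:
    num_layers: int = 24        # network depth
    hidden_size: int = 1024     # width of each layer
    train_steps: int = 100_000  # how many iterations to train
    learning_rate: float = 3e-4
    seed: int = 0               # fixed so peers can reproduce each other

# Version the config: peers contribute compute against a pinned spec,
# Folding@home-style, rather than each choosing their own hyperparameters.
CONFIG_V1 = CommunityTrainingConfig()
print(CONFIG_V1)
```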
Note posted on Saturday, January 4, 2025 6:59 PM CST