@deranger @theunknownmuncher the US trying to stifle Chinese progress and stop chip exports has had exactly the effect anyone could have predicted. China is making leaps and bounds in all sorts of tech areas, innovating around obstacles.
They invented their own reinforcement learning framework called Group Relative Policy Optimization: https://arxiv.org/abs/2405.20304
EDIT: DeepSeek released the model and published its methods to the global community, and there is now an open effort by researchers to reproduce them (https://github.com/huggingface/open-r1). It is like the opposite of stealing.
I thought the innovative part was using more efficient code, not what it’s trained on.
That's capitalism's dark secret. It's only innovative when it has to be.
Yeah, the original comment in this chain describes US telcos and shit more than this particular instance.