All theoretical, but I would cut the decoder off a very smart chat model, then fine-tune the encoder to provide a score on the rationality test dataset under CoT prompting.
Well, I think you actually need to train a “discriminator” model on rationality tests. Probably an encoder-only model like BERT, just to assign a score to thoughts. Then you do Monte Carlo tree search over the candidate thoughts, using that score to guide the search.
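A minimal sketch of what I mean, assuming a HuggingFace-style setup; `bert-base-uncased` and the (thought, score) data format are stand-ins for whatever rationality-test data you'd actually use:

```python
# Sketch: fine-tune an encoder-only model (BERT) to regress a scalar
# "rationality score" for a given thought. Untrained as-is; training
# would be a standard regression fine-tune (e.g. MSE loss) on scored thoughts.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# num_labels=1 with problem_type="regression" gives a single scalar head
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=1, problem_type="regression"
)

def score_thought(thought: str) -> float:
    """Return the model's scalar rationality score for one thought."""
    inputs = tokenizer(thought, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, 1)
    return logits.item()
```

You'd then call `score_thought()` on candidate reasoning steps to decide which branches of the tree search to expand.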
They are fine at writing numpy and sympy code given an interface, and they are surprisingly good at explaining advanced symbolic math concepts. I wouldn’t expect them to be good at arithmetic, but a good reasoning model should be really good at mathematical reasoning.
TBH, I find the search feature for blocking instances on your profile tab useful. HOWEVER, it shouldn’t require the instance to already exist or show up in search; it should let me put in any string, just in case search is broken somehow.
I just don’t know what blocked means then. It looks like I can see hexbear communities, probably comment on them, and even subscribe to them. But I guess they can’t comment in programming.dev communities or see our stuff?
Actually, now that I think about it, LLMs are decoder-only these days. But decoders and encoders are architecturally very similar. You could probably cut off the “head” of the decoder, bolt on a few fully connected layers, and fine-tune them to provide a score.
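Roughly like this, as an untested PyTorch/HuggingFace sketch; `gpt2` is just a stand-in base model, and pooling on the last token is one of several reasonable choices:

```python
# Sketch: drop the LM head from a decoder-only model and attach a small
# fully connected scoring head that maps the final hidden state to a scalar.
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class ScoreHead(nn.Module):
    def __init__(self, base_name: str = "gpt2"):
        super().__init__()
        # AutoModel loads the transformer *without* its language-model head
        self.base = AutoModel.from_pretrained(base_name)
        hidden = self.base.config.hidden_size
        # a few fully connected layers: hidden state -> scalar score
        self.head = nn.Sequential(
            nn.Linear(hidden, hidden // 2),
            nn.ReLU(),
            nn.Linear(hidden // 2, 1),
        )

    def forward(self, input_ids, attention_mask):
        out = self.base(input_ids=input_ids, attention_mask=attention_mask)
        # summarize each sequence with its last non-padding token's hidden state
        last_idx = attention_mask.sum(dim=1) - 1
        pooled = out.last_hidden_state[
            torch.arange(input_ids.size(0)), last_idx
        ]
        return self.head(pooled).squeeze(-1)  # one score per sequence

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # gpt2 has no pad token by default
model = ScoreHead()
batch = tokenizer(["First, check the base rate..."],
                  return_tensors="pt", padding=True)
print(model(batch["input_ids"], batch["attention_mask"]))  # untrained score
```

Only the new head strictly needs training; you could freeze the base model at first and unfreeze the top layers later if the scores plateau.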