
Are there fine tuned models that perform better for structured / parsable outputs?


This isn't the answer to that question, but llama.cpp has a feature to constrain output to a provided grammar, such as https://github.com/ggerganov/llama.cpp/blob/master/grammars/...

Others should really implement that as well. You still need to guide the model to produce e.g. JSON to get good results, but the output is then guaranteed to be valid per the grammar.
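
As a sketch of what those grammar files look like: the linked directory contains GBNF definitions, and a minimal one (this example is mine, not from the repo) constraining output to a tiny JSON object might read:

```
root   ::= "{" ws "\"name\"" ws ":" ws string ws "}"
string ::= "\"" [a-zA-Z0-9 ]* "\""
ws     ::= [ \t\n]*
```

Passed via something like `--grammar-file`, the sampler then rejects any token that would violate the grammar, which is why validity is guaranteed even though prompting is still needed for sensible content.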


Agreed that others should implement it as well, but coercing llama.cpp to output results matching the grammar takes some work.


What kind of work? I only gave it a short try before moving to Ollama, which doesn't have it, but it seemed to work there. (With Ollama I need to use a retry system.)

edit: I researched a bit, and apparently it can reduce performance, and streaming mode fails to report invalid grammars. Overall these don't seem like deal-breakers.
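
For what it's worth, the retry system I mean is just a loop that re-prompts until the output parses. A minimal sketch (`generate` is a hypothetical callable wrapping whatever model API you use, e.g. Ollama's; it is not part of any library):

```python
import json

def generate_json(generate, prompt, retries=3):
    """Call a text-generation function until its output parses as JSON.

    generate: callable taking a prompt string and returning the model's
              raw text output (hypothetical wrapper, assumption of this sketch).
    """
    last = None
    for _ in range(retries):
        last = generate(prompt)
        try:
            return json.loads(last)
        except json.JSONDecodeError:
            continue  # model emitted invalid JSON; ask again
    raise ValueError(f"no valid JSON after {retries} attempts: {last!r}")
```

Crude compared to grammar-constrained sampling, since it burns extra generations instead of guaranteeing validity up front, but it works with any backend.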


Fireworks.ai Firefunction is pretty good. Not GPT-level, but it's an open model.



