
One problem that intrigues me is Chinese-to-English machine translation, specifically for a subset of Chinese martial arts novels (especially since there are plenty of human-translated versions to work with).

Google, Bing, etc. have their own pre-trained translation models.

How would I get access to one of those so I can fine-tune it on the domain-specific dataset I put together?



I don't think you could get access to the actual models that are being used to run e.g. Google Translate, but if you just want a big pretrained model as a starting point, their research departments release things pretty frequently.

For example, https://github.com/google-research/bert (the multilingual model) might be a pretty good starting point for a translator. BERT is only an encoder, though, so it will probably still be a lot of work to get it hooked up to a decoder and trained.

There's probably a better pretrained model out there specifically for translation, but I'm not sure where you'd find it.
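
Whichever pretrained model you end up starting from, the fine-tuning step will want your human-translated novels as aligned sentence pairs. Here's a rough sketch of that prep step in plain Python. The function names, the "zh"/"en" keys, the 10% validation split, and the sample sentences are all my own assumptions for illustration, not anything a particular toolkit requires, and it assumes you've already aligned the text line by line.

```python
import json
import random


def build_pairs(zh_lines, en_lines):
    """Pair aligned source/target lines, dropping pairs where either side is empty."""
    if len(zh_lines) != len(en_lines):
        raise ValueError("source and target must have the same number of lines")
    pairs = []
    for zh, en in zip(zh_lines, en_lines):
        zh, en = zh.strip(), en.strip()
        if zh and en:
            pairs.append({"zh": zh, "en": en})
    return pairs


def split_pairs(pairs, valid_fraction=0.1, seed=0):
    """Shuffle with a fixed seed and hold out a validation slice."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    # Keep at least one validation example whenever there is any data.
    n_valid = max(1, int(len(shuffled) * valid_fraction)) if shuffled else 0
    return shuffled[n_valid:], shuffled[:n_valid]


def to_jsonl(pairs):
    """Serialize pairs as JSON Lines, keeping the Chinese characters readable."""
    return "\n".join(json.dumps(p, ensure_ascii=False) for p in pairs)


if __name__ == "__main__":
    # Hypothetical sample lines; replace with your own aligned corpus.
    zh = ["他拔出了剑。", "师父点了点头。"]
    en = ["He drew his sword.", "The master nodded."]
    train, valid = split_pairs(build_pairs(zh, en), valid_fraction=0.5)
    print(to_jsonl(train))
    print(to_jsonl(valid))
```

The fixed seed keeps the train/validation split reproducible, which matters if you want to compare different starting models on the same held-out chapters.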



