Researchers from top US universities warn extending pre-training can be detrimental to performance
Too much pre-training can deliver worse performance due to something akin to the butterfly effect
The longer models are pre-trained, the more sensitive they become to small changes that can disrupt the end result
Researchers from Carnegie Mellon, Stanford, Harvard, and Princeton are challenging one of AI development's accepted core beliefs: that more pre-training data always means better performance.
As reported by HPCwire, a new paper discusses the concept of "catastrophic overtraining," whereby extended pre-training can harm a model's performance after fine-tuning.