Add gradient accumulation option for bicleaner-ai training #27

@radinplaid

The wiki recommends a batch size of 128 for 'stable training'.

It would be helpful to have an option to accumulate gradients, so that bicleaner-ai could be trained with a larger "effective batch size" on GPUs with a relatively small amount of memory.

Fairseq calls this option "--update-freq".
Sockeye calls this option "--update-interval".
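
To illustrate what is being requested, here is a minimal sketch of gradient accumulation on a toy one-parameter linear model in plain Python. It is not bicleaner-ai code and does not use its training loop; the function names (`grad_mse`, `step_full_batch`, `step_accumulated`) and the toy data are hypothetical. It assumes a mean-reduced loss and equally sized micro-batches, in which case one accumulated update over 8 micro-batches of 16 matches one update on a batch of 128.

```python
# Illustrative sketch only: gradient accumulation on a toy model y = w * x
# with mean squared error. All names here are hypothetical helpers.

def grad_mse(w, batch):
    """Gradient of mean((w*x - y)^2) over the batch with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in batch) / len(batch)


def step_full_batch(w, data, lr):
    """One SGD update computed on the whole batch at once."""
    return w - lr * grad_mse(w, data)


def step_accumulated(w, data, lr, micro_batch_size):
    """One SGD update built from several equally sized micro-batches.

    Each micro-batch gradient is divided by the number of micro-batches,
    so the accumulated sum equals the full-batch mean gradient.
    """
    if len(data) % micro_batch_size != 0:
        raise ValueError("this sketch assumes equally sized micro-batches")
    n_micro = len(data) // micro_batch_size
    accumulated = 0.0
    for i in range(n_micro):
        micro = data[i * micro_batch_size:(i + 1) * micro_batch_size]
        accumulated += grad_mse(w, micro) / n_micro
    # The weight changes only once, after all micro-batches have been seen.
    return w - lr * accumulated


# Toy data: 128 examples is the effective batch size; only 16 examples
# would need to be processed at a time in the accumulated variant.
xs = [float(i % 7) - 3.0 for i in range(128)]
data = [(x, 2.0 * x + 0.1 * (i % 3)) for i, x in enumerate(xs)]

w_full = step_full_batch(0.0, data, lr=0.01)
w_acc = step_accumulated(0.0, data, lr=0.01, micro_batch_size=16)
print(abs(w_full - w_acc) < 1e-9)
```

The sketch ignores layers whose behaviour depends on the batch itself (batch normalization, for example), where accumulation is not exactly equivalent to a larger batch.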

Labels: enhancement (New feature or request)
