
What is the biggest LLM model tinybox can pretrain from scratch #1

@banyan-god

Description


With 144GB of VRAM, what is the biggest model we can train from scratch using the tinybox? Would it be possible to train a model similar to, say, Llama 3 8B from scratch with it? Here are some model params:

    llama3(
        vocab_size=128_256,
        num_layers=32,
        num_heads=32,
        num_kv_heads=8,
        embed_dim=4096,
        max_seq_len=8192,
        intermediate_dim=14336,
        attn_dropout=0.0,
        norm_eps=1e-5,
        rope_base=500000.0,
    )

We also need to consider block size, batch size, etc.
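As a back-of-the-envelope check, the config above can be turned into a parameter count and a rough training-memory figure. This is a sketch, not tinygrad code: it assumes untied input/output embeddings and plain mixed-precision AdamW (bf16 weights and grads plus fp32 master weights and two fp32 moments, roughly 16 bytes per parameter), and it ignores activations entirely.

```python
# Rough parameter/memory estimate for the Llama-3-8B-style config above.
# llama_param_count is a hypothetical helper, not part of any library.

def llama_param_count(vocab_size, num_layers, num_heads, num_kv_heads,
                      embed_dim, intermediate_dim):
    head_dim = embed_dim // num_heads
    kv_dim = num_kv_heads * head_dim                        # GQA: shared KV heads
    attn = 2 * embed_dim * embed_dim + 2 * embed_dim * kv_dim  # Wq, Wo + Wk, Wv
    mlp = 3 * embed_dim * intermediate_dim                  # gate, up, down projections
    norms = 2 * embed_dim                                   # two RMSNorms per block
    per_layer = attn + mlp + norms
    embeddings = 2 * vocab_size * embed_dim                 # input + output (untied)
    return embeddings + num_layers * per_layer + embed_dim  # + final norm

params = llama_param_count(128_256, 32, 32, 8, 4096, 14336)
state_gib = params * 16 / 2**30  # weights + grads + optimizer states, no activations
print(f"{params / 1e9:.2f}B params, ~{state_gib:.0f} GiB of training state")
# -> 8.03B params, ~120 GiB of training state
```

So the weights, gradients, and optimizer states alone come to roughly 120 GiB under these assumptions, before any activation memory or batch-size headroom, which puts an 8B model right at the edge of 144GB unless the optimizer state is sharded, offloaded, or kept in lower precision.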
