How do I use YaRN? #274

@dawson-chen

I am training the Qwen3-32B model, whose native context window is 32k. How can I use context extrapolation (YaRN) in ROLL to train with a 64k context?
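
To make the request concrete, here is a minimal sketch of the scaling factor this setup implies. It assumes the `rope_scaling` format that Hugging Face `transformers` uses for YaRN in the model's `config.json` (keys `rope_type`, `factor`, `original_max_position_embeddings`). The helper name `yarn_rope_scaling` is hypothetical, and whether ROLL passes this setting through to its training backend is exactly what I am unsure about.

```python
import json


def yarn_rope_scaling(original_len: int, target_len: int) -> dict:
    """Build a Hugging Face style `rope_scaling` entry for YaRN.

    Hypothetical helper for illustration only; it is not part of ROLL.
    """
    if target_len <= original_len:
        raise ValueError("target_len must be larger than original_len")
    return {
        "rope_type": "yarn",
        # YaRN scaling factor: target context divided by native context.
        "factor": target_len / original_len,
        "original_max_position_embeddings": original_len,
    }


# 32k native context extended to 64k gives a factor of 2.0.
cfg = yarn_rope_scaling(32 * 1024, 64 * 1024)
print(json.dumps({"rope_scaling": cfg}, indent=2))
```

The open question is where a setting like this should go in a ROLL training config, and whether the maximum sequence length needs to be raised to 65536 alongside it.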
