the distortion and undistortion before and after generation #17

@Tcorpion

I downloaded some clips from nvidia/PhysicalAI-Autonomous-Vehicle-Cosmos-Drive-Dreams and ran Cosmos-Drive-Dreams/cosmos-drive-dreams-toolkits/render_from_rds_hq.py with the camera set to ftheta. I then applied hdmap and lidar control and got a synthesized clip like the one below. After undistortion with Cosmos-Drive-Dreams/cosmos-drive-dreams-toolkits/rectify_ftheta_to_pinhole.py, there is still visible distortion, such as curved traffic signs.

ClipID="0079aad7-0fc5-4722-804d-e7c8c1b84263_570745200000_570765200000"

hdmap and lidar control:

{
    "hdmap": {
        "control_weight": 0.5,
        "input_control": "placeholder",
        "ckpt_path": "checkpoints/nvidia/Cosmos-Transfer1-7B-Sample-AV/hdmap_control.pt"
    },
    "lidar": {
        "control_weight": 0.5,
        "input_control": "placeholder",
        "ckpt_path": "checkpoints/nvidia/Cosmos-Transfer1-7B-Sample-AV/lidar_control.pt"
    }
}

hdmap control video:

0079aad7-0fc5-4722-804d-e7c8c1b84263_570745200000_570765200000_0.mp4

lidar control video:

0079aad7-0fc5-4722-804d-e7c8c1b84263_570745200000_570765200000_0.mp4

the caption is:

{
    "Sunny": "The video shows a daytime highway scene from inside a vehicle. The road is brightly illuminated by the sun, with clear visibility of multiple lanes marked by white lines. Several vehicles ahead are visible, their brake lights glowing red. A large green road sign overhead indicates directions to various destinations such as Osnabr\u00fcck, Oldenburg, Bremen-Centrum, and Hanover. The surroundings are well-lit and open, with no pedestrians or other significant objects visible outside the vehicle. The environment appears calm, with minimal activity aside from the vehicles on the road."
}

I followed the Cosmos-Drive-Dreams SDG Pipeline (scripts/generate_video_single_view.py) and got the results below:

raw generation:

Image

undistorted (rectify_ftheta_to_pinhole.py):

Image

I have two questions:

  1. Does the open-sourced model only support generating raw output with lens distortion?
  2. Does the distortion in the second image above come from inaccurate intrinsic parameters, or from the model having been trained on raw, distorted dash cam images?
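
To make the second question concrete, here is a minimal sketch of how an intrinsic mismatch alone could leave straight edges curved after rectification. It assumes a polynomial f-theta model of the form r(theta) = sum(c_i * theta^i); this is not the code in rectify_ftheta_to_pinhole.py, and the coefficients, focal length, and helper names are all made up for illustration.

```python
import math

def poly(theta, coeffs):
    """Evaluate r(theta) = sum(c_i * theta**i), the assumed f-theta mapping."""
    return sum(c * theta**i for i, c in enumerate(coeffs))

def ftheta_project(point, coeffs):
    """Project a camera-frame 3D point to pixel offsets from the principal point."""
    x, y, z = point
    rho = math.hypot(x, y)
    if rho == 0.0:
        return (0.0, 0.0)
    theta = math.atan2(rho, z)   # angle between the ray and the optical axis
    r = poly(theta, coeffs)      # f-theta: image radius is a polynomial in theta
    return (r * x / rho, r * y / rho)

def rectify_to_pinhole(pixel, coeffs, focal):
    """Map an f-theta pixel offset to a pinhole pixel offset using `coeffs`."""
    px, py = pixel
    r = math.hypot(px, py)
    if r == 0.0:
        return (0.0, 0.0)
    # Invert r = poly(theta) by bisection; this assumes the polynomial is
    # increasing on [0, pi/2], which holds for the toy coefficients below.
    lo, hi = 0.0, math.pi / 2
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if poly(mid, coeffs) < r:
            lo = mid
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    r_pin = focal * math.tan(theta)  # pinhole: image radius is f * tan(theta)
    return (px * r_pin / r, py * r_pin / r)

def row_spread(true_coeffs, assumed_coeffs, focal=1000.0):
    """Vertical spread (in pixels) of a straight horizontal edge after rectification."""
    # A horizontal edge, e.g. the top of a road sign, at fixed height and depth.
    edge = [(x / 2.0, -1.0, 5.0) for x in range(-6, 7)]
    rows = [
        rectify_to_pinhole(ftheta_project(p, true_coeffs), assumed_coeffs, focal)[1]
        for p in edge
    ]
    return max(rows) - min(rows)

# Hypothetical coefficients: the "true" lens has a cubic term that the
# "assumed" intrinsics are missing.
TRUE_COEFFS = [0.0, 1000.0, 0.0, 50.0]
ASSUMED_COEFFS = [0.0, 1000.0, 0.0, 0.0]

spread_ok = row_spread(TRUE_COEFFS, TRUE_COEFFS)
spread_bad = row_spread(TRUE_COEFFS, ASSUMED_COEFFS)
print("spread with matching intrinsics:", spread_ok)
print("spread with mismatched intrinsics:", spread_bad)
```

With matching coefficients the edge stays on a single image row up to numerical error. With mismatched coefficients the row varies along the edge, which is the kind of curvature visible on the traffic sign. This only shows that a mismatch is one possible cause; it does not rule out the model itself having learned distortion from its training data.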
