Upload of big files (> 5Gb) to UC Volumes using multipart chunking#1621

Open
alexott wants to merge 8 commits into databricks:main from alexott:feature/multipart-chunked-upload

alexott (Contributor) commented Apr 13, 2026

Summary

Implement upload of large files (> 5 GB) to UC Volumes using multipart chunking. This functionality already exists in the Python SDK but is missing from the Go SDK.

It will require some modifications to the codegen to support the new functionality.
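To illustrate the chunking idea, here is a minimal sketch (not the code from this PR) of how a file of known size could be split into consecutive part ranges. The `part` type, the `planParts` function, and the choice of part size are hypothetical names and assumptions made for illustration only.

```go
package main

import "fmt"

// part describes one chunk of the source file.
// Hypothetical type, not taken from the Go SDK.
type part struct {
	Number int   // 1-based part number
	Offset int64 // byte offset of the chunk in the source file
	Size   int64 // number of bytes in the chunk
}

// planParts splits totalSize bytes into consecutive parts of at most
// partSize bytes each. The last part may be smaller than partSize.
func planParts(totalSize, partSize int64) ([]part, error) {
	if totalSize < 0 || partSize <= 0 {
		return nil, fmt.Errorf("invalid sizes: total=%d part=%d", totalSize, partSize)
	}
	var parts []part
	for offset := int64(0); offset < totalSize; offset += partSize {
		size := partSize
		if remaining := totalSize - offset; remaining < size {
			size = remaining
		}
		parts = append(parts, part{Number: len(parts) + 1, Offset: offset, Size: size})
	}
	return parts, nil
}
```

Each planned range could then be read with something like `io.NewSectionReader` and uploaded as one part, so that only one chunk has to be held in memory at a time.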

Why

databricks/terraform-provider-databricks#5521 asks for support for large files, but the Go SDK does not have this functionality.

What changed

Interface changes

New functions are added to the Files interface (requires codegen changes).

Behavioral changes

Internal changes

How is this tested?

Unit tests were added.

@alexott alexott temporarily deployed to test-trigger-is April 13, 2026 11:00 — with GitHub Actions Inactive
@github-actions

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 15 days if no further activity occurs. If this PR is still relevant, please leave a comment or push new changes to keep it open. Thank you for your contributions.

@github-actions github-actions Bot added the "stale" label May 14, 2026
alexott added 8 commits May 17, 2026 12:10
…tipart upload

Signed-off-by: Alex Ott <alexey.ott@databricks.com>
…rts, complete)

Co-authored-by: Isaac
Signed-off-by: Alex Ott <alexey.ott@databricks.com>
…FromFile

Signed-off-by: Alex Ott <alexey.ott@databricks.com>
Best-effort abort of incomplete multipart uploads when a part upload
fails, preventing orphaned upload sessions on the server.

Signed-off-by: Alex Ott <alexey.ott@databricks.com>
Signed-off-by: Alex Ott <alexey.ott@databricks.com>
- Fix URL expiration retry to fetch fresh presigned URLs instead of
  retrying the same expired URL
- Remove total http.Client timeout that would kill large part uploads;
  rely on context cancellation instead
- Add context cancellation checks in the main upload loop
- Extract partUploadError type for structured error handling

Signed-off-by: Alex Ott <alexey.ott@databricks.com>
…est coverage

Address review feedback: expose upload methods via FilesInterface, fall back
to single-shot upload when the first multipart chunk fails, validate
contentLength matches actual bytes read, and add tests for all new paths.

Co-authored-by: Isaac
Signed-off-by: Alex Ott <alexey.ott@databricks.com>
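The content-length validation mentioned in this commit message could look roughly like the sketch below, which compares a declared length with the number of bytes actually read from the source. `validateLength` and `checkBytes` are hypothetical helpers; the PR may perform this check differently, for example while streaming parts.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// validateLength drains r and reports an error when the number of bytes
// read differs from the declared contentLength.
func validateLength(r io.Reader, contentLength int64) error {
	n, err := io.Copy(io.Discard, r)
	if err != nil {
		return fmt.Errorf("reading content: %w", err)
	}
	if n != contentLength {
		return fmt.Errorf("content length mismatch: declared %d bytes, read %d", contentLength, n)
	}
	return nil
}

// checkBytes is a convenience wrapper for in-memory data.
func checkBytes(data []byte, contentLength int64) error {
	return validateLength(bytes.NewReader(data), contentLength)
}
```

In an actual upload the counting would happen while the data is being sent rather than in a separate pass, since the reader can usually be consumed only once.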
@alexott alexott changed the title from "Feature/multipart chunked upload" to "Upload of big files (> 5Gb) to UC Volumes using multipart chunking" May 17, 2026
@alexott alexott force-pushed the feature/multipart-chunked-upload branch from 05dc9ac to cdbdc26 Compare May 17, 2026 10:13
@github-actions

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-go

Inputs:

  • PR number: 1621
  • Commit SHA: cdbdc26b08287011e54d65ed276b778779529315

Checks will be approved automatically on success.

alexott (Contributor, Author) commented May 17, 2026

@chrisst - that would be useful for TF integration

@alexott alexott temporarily deployed to test-trigger-is May 17, 2026 10:15 — with GitHub Actions Inactive
@github-actions github-actions Bot removed the "stale" label May 18, 2026