Skip to content

Conversation

@JREastonMarks
Copy link
Contributor

Fix #10702

#10702

This PR fixes documentation and a bug issue with SAML in cBioPortal

vaibhavlachhwani and others added 30 commits December 2, 2024 13:37
...and fix some broken links
…#11264)

* Adds documentation for managing keycloak instance via API
* Add keycloak management doc to retype summary page
* add donate page to docs

* Update Donate.md

---------

Co-authored-by: JJ Gao <[email protected]>
Seems like we still use this standalone frontend mode
Add Clickhouse implementation of study view endpoints (off by default) with legacy api intact
---------

Co-authored-by: haynescd <[email protected]>

---------

Co-authored-by: haynescd <[email protected]>
Co-authored-by: Charles Haynes <[email protected]>
Co-authored-by: Onur Sumer <[email protected]>
Co-authored-by: Neel Kuila <[email protected]>
Co-authored-by: Bryan Lai <[email protected]>
Co-authored-by: Zhaoyuan (Ryan) Fu <[email protected]>
Co-authored-by: Qi-Xuan Lu <[email protected]>
Co-authored-by: Bryan Lai <[email protected]>
Co-authored-by: Gaofei Zhao <[email protected]>
Co-authored-by: Robert Sheridan <[email protected]>
Co-authored-by: Zain Nasir <[email protected]>
Co-authored-by: Zain Nasir <[email protected]>
Co-authored-by: haynescd <[email protected]>
* Add donate button config instructions
* add donate_button to example props
* add donate to example props
Add documentation for cBioPortal security measures.
update github dependency graph with docker builds
…tion

fix typo in github dependency graph action
dippindots and others added 24 commits June 20, 2025 10:06
* Added numberOfAlteredCasesOffPanel property to allow frontend to disregard offpanel events
* Refactor SessionServiceController to have all low-level http calls in SessionServiceHandler

* Fix sonarqube regex vulnerability for user-controlled data in SessionServiceRequestHandler

* Remov wildcard imports
…ict Resolution (cBioPortal#11603)

* Categorize attributes, map patients to samples on conflicts

* Add tests
* Split integration test out

* Add e2e test

* Add E2E test workflow

* Fix profile for e2e tests

* update to run integration and e2e test

* Fix env vars

* fix sonar workflow
Enable HTTP streaming by eliminating unnecessary response copying.

To enable streaming for Study Data Export (RFC95).
Unlike caching http requests `ContentCachingRequestWrapper` for enabling
multiple request body reads, wrapping responses in `ContentCachingResponseWrapper` is not necessary and not used anywhere.
The application caching behaviour (with `@Cachable` annotation) does not depend on any of these.

Port clinical export

- Introduce long table
- Convert records back to pojos. Record fields are mapped by position with mybatis
- Write tests for clinical data table
- Make conditional Java Config

Enable genetic profile and MAF data export

Enable case list export

Port POC unit tests. Make them pass

Port Export integration smoke test

Port import-export functionality test

Enable dynamic study export mode on CI for running tests

Fix import export study data differences

Refactor the code

Fix order of columsn for the clinical file

Refactor exporters layer

Make test to pass

Return 404 http status when no study exists

Make sure the rows are ordered by sample/patient IDs

We do defensieve assumption check to make sure clinical file rows will be formed correctly

Check clinical attributes for duplicates

Clean code from done TODO comments

Run study export in low priority custom thread pool

Specify the type of MAF exporter

Should we maybe currupt file if download fails?

Remove unneeded comments

Make sure input stream closes

Add support of MRNA Expression data type export

Add support for generic data types

Add README to the package

Add MRNA and Generic Assay data type to the test study

Remove patient level generic assay data to fix the build

Fix special attribute values not showing for the first line

toRow() gave different results on the first and subsequent calls.
The first calls were made for getting the header only.

Skip generic properties with mistmatching id

They were blocking for the followup rows

Fix skipping rows in generic assay data

MyBatis does not like if result does not have <id ...>
If it is not specified, it picks first field as a key.
Rows that would have the same key will be skipped!

Update import/export test study to new fixes

Fix metadata tests

Fix unit tests

Improve test coverage for all exporters

Expand MRNA export support to Z-SCORE and DISCRETE

Support export of cancer types

We export the cancer type of the study with all its parents

Do not export patient level generic assay data

We are not going to support patient level data export.
Without this fix the code exported sample header for patient level
data which would fail during the load to cBioPortal

Fix number of columns for cancer type file

Export clinical timeline aka events

Support protein level data export

Support more generic assay data types

Add suport for exporting mutation uncalled data type

Add CNA contineous and log2 data export

Lower case p in phosphosite to mark it as such

Support methylation data type export

Support CNA discrete data type

Support export of CNA Segment data

Support Structural Variant Data Export

Support exporting gene panel matrix data

Remove pipe output as not reliable

Use forward only cursors where possible for memory optimisation

Corrupt zip file intentionally in case of exception

We don't want to do partial export in case of exception

Warn about incorrect format for phosphoprotein, not crash

Provide a way to increase timeout for async requests

Increase it to 10 minutes by default.

Fix sonar cube reported issues

Add README.txt file to the exported study data

warn that there might be not all data types exported

Write NA instad of blank string for absent gene panel

Validator complains about blank strings

Move code to fail zip streaming to the factory

Add posibility to export study under alternative study id

Enable filtering exported data by sample id

Move pre authorization check to the service layer

Preparation to support export of virtual studies

Move check for presense of study into the service

As a preparation to support of Virtual Study export

Enable downloading virtual studies

Use CI session service instead of default remote one

Fixing the test for study impor export on the build

Improve splitting gene name to hugo symbol and phosphosite

Phosphosite can contain underscore in it's id

Override name, description, pmid and cancer type of virtual study

that have one physical study in the definition

Test Virtual Study download that is defined with multiple Materialised Studies

Refresh dynamic Virtual Studies before export

Add dynamic_study_export_mode property

Improve performance by parsing comma-separated values once

not in the loop! It becomes very slow when amount of samples are significant

Improve start-stop time logging around performance critical code

It helped to identify some performance issues

Document export request timeout

Restore HGVSp_Short from PROTEIN_CHANGE`

Export tumor seq allele 1 and 2 from the right db field

Account for blank strings in protein change

Export a file per clinical timeline event type

Apply spotless reformatting of the code

Export data with default style color and shape

if attributes present but there is no value.
The loader does not like empty values for these columns.

Change timeline circle color to the default used by the timeline widget

Substitute with default value even when nullable or blank value specified

Update study_es_0_import_export with multiple timeline files

A timeline file per EVENT_TYPE

Update import export study to test HGVSp_Short, Tumor_Seq_Allele1 and Tumor_Seq_Allele2 parsing

Expand unit test to check defaults for STYLE_COLOR and STYLE_SHAPE

Document supported file formats by export

Add Caveats section to the export README

Add notes about time line export particularities to README

Document export particularities of Hugo_Symbol and Entrez_Gene_Id columns

Add note about mutations filtering during the upload

Blanken negative entrez ids

Mention that we do not support namespaces meta tag for mutation, CNA and SV

Fix order of custom timeline attributes

so import/export test can pass after sorting rows and applying diff command to test equality

Blanken negative entrez ids in SV data type

Fix removing last row while exporting gene panel matrix

Was happening only if there is only one panel on the last row

Rename zip output stream writer from factory to service

and move it to services package

Remove unused local variable

Rename Virtual Study aware service. Add Decorator to the name

To document the pattern in the name

Break down long VS export function to smaller ones

Add javadoc to mappers of export fucntionality

Use read-only transaction for export functionality

It make code faster and ensure nothing gets changed accidently

Do not write Virtual Study definition file if user has no access to it

Fix order of samples when exporting data for sample ids

e.g. for Virtual Studies
Before order of samples in header changed arbitrary!

Introduce repositories layer

Do not prepend number of samples to the meta description

We decided that mentioning it in meta study file is enough

Export mutation strand as was imported

Apply spotless style corrections after conflict resolution in session service classes

Rename feature flag from dynamic_study_export_mode to feature.study.export

There is a discussion to have feature.<highlevel>.<featureName> format
for all feature flags

Fix static and dynamic virtual study tests

Break down long method in timeline exporter

Remove commented code in export config

Define ant throw an export exceptions with informative message

Create and use DESCRIPTION contant instead of repeating it 3 times

Make abstract exporter constructors protected

Organise constructors and closing methods

Reduce cognitive complexity of genetic alteration tsv exporter

Refactor write tsv data method

Add nested comment explaining empty methods

Remove obsolete TODOs

Refactore meta description update and path formation

Remove use of stream peak in VS service

Fix sonarqube reported issues for tests

Update study_es_0_import_export to be consistent with study_es_0

Add single-study virtual study integration test

Add multi-study virtual study integration test

Fix multi virtual study test

Fix multi virtual study test

Fix multi virtual study test

Do not re-load test studies to test VS export

Rely on data load in previous steps

Remove unnecessary reloading on gene panels and gene sets data
* Restore sample check endpoint for CIS

* Remove darwin legacy endpoint

---------

Co-authored-by: Gaofei Zhao <[email protected]>
* upgrade spring boot 3.5

* update dependencies to fix critical and high vulnerabilities

* upgrade testcontainers and selenium

* revert dependency updates

* update beanutils

* update tomcat

* upgrade velocity-engine

* fix indentation

* remove unused dependencies

* add velocity version variable

* remove unnecessary velocity scripting dep

---------

Co-authored-by: Bryan Lai <[email protected]>
* Update News-Genie.md

* Update News-Genie.md
@dippindots dippindots self-assigned this Jul 30, 2025
Copy link
Collaborator

@dippindots dippindots left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JREastonMarks Thanks for updating the documentation!

I don't fully understand the code change though, could you elaborate a little bit more on that?

## Why is SAML Relevant to cBioPortal?

The cBioPortal code has no means of storing user name and passwords and no means of directly authenticating users. If you want to restrict access to your instance of cBioPortal, you therefore have to consider an external authentication service. SAML is one means of doing so, and your larger institution may already provide SAML support. For example, at Sloan Kettering and Dana-Farber, users of the internal cBioPortal instances login with their regular credentials via SAML. This greatly simplifies user management.
cBioPortal has no means of directly authenticating users. If you want to restrict access to your instance of cBioPortal, you therefore have to consider an external authentication service. SAML is one means of doing so, and your larger institution may already provide SAML support. For example, at Memorial Sloan Kettering and Dana-Farber, users of the internal cBioPortal instances login with their regular credentials via SAML. This greatly simplifies user management.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove additional space

Suggested change
cBioPortal has no means of directly authenticating users. If you want to restrict access to your instance of cBioPortal, you therefore have to consider an external authentication service. SAML is one means of doing so, and your larger institution may already provide SAML support. For example, at Memorial Sloan Kettering and Dana-Farber, users of the internal cBioPortal instances login with their regular credentials via SAML. This greatly simplifies user management.
cBioPortal has no means of directly authenticating users. If you want to restrict access to your instance of cBioPortal, you therefore have to consider an external authentication service. SAML is one means of doing so, and your larger institution may already provide SAML support. For example, at Memorial Sloan Kettering and Dana-Farber, users of the internal cBioPortal instances login with their regular credentials via SAML. This greatly simplifies user management.

saml.keystore.password=apollo1
saml.keystore.private-key.key=secure-key
saml.keystore.private-key.password=apollo2
saml.keystore.default-key=secure-key
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we still have some outdated keystore contents here: https://github.com/cBioPortal/cbioportal/blob/master/docs/deployment/authorization-and-authentication/Authenticating-Users-via-SAML.md?plain=1#L88-L110, maybe we can replace these with this:

#### Example to generate cert and key
```shell
openssl req -newkey rsa:2048 -nodes -keyout local.key -x509 -days 365 -out local.crt
```

Comment on lines 107 to 113
UserAuthorities authorities = securityRepository.getPortalUserAuthorities(username);

Set<GrantedAuthority> mappedAuthorities = new HashSet<>();
if (!Objects.isNull(roles)) {
mappedAuthorities.addAll(GrantedAuthorityUtil.generateGrantedAuthoritiesFromRoles(roles));
} else {
mappedAuthorities.addAll(authentication.getAuthorities());

if (!Objects.isNull(authorities)) {
mappedAuthorities.addAll(AuthorityUtils.createAuthorityList(authorities.getAuthorities()));
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this change, are we going to read authories from database only? I think roles we get from Collection<String> roles = principal.getAttribute(this.roleAttributeName); is also important, right?

@dippindots dippindots removed their assignment Aug 26, 2025
@JREastonMarks JREastonMarks force-pushed the v6.3.1-dfci-FIX-10702-SAML branch from 41cbdf3 to 5e35b19 Compare August 29, 2025 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update Security Documenation for v6