DTO analysis #61

@cmatKhan

Description

  1. Select all binding samples (set up all binding samples in VirtualDB config)

    If a dataset cannot be configured (no annotatedfeature dataset),
    note it in the issue discussion and @cmatKhan will address it. Do the analysis
    with the available binding data.

  2. After selecting all binding samples from all datasets with DTO P<=0.01 compared
    to either Hackett-2020-ZEV or Kemmeren-2014-TFKO, investigate whether some TFs
    pass in most datasets while others pass in almost none.

    Challenge: Deciding which Hackett condition to use requires examining the Hackett data
    and either setting filters so that there is one Hackett sample per regulator, or explaining how
    multiple conditions per regulator affect the results. In particular, we are interested in the effect of
    time: which timepoint is best? The DTO distribution over time is a good output here. Additionally,
    it should be possible to set different filters on a per-regulator basis, so that if a regulator has both
    a ZEV and a GEV sample, we can choose the one that performs best.
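
    A rough sketch of the per-regulator choice and the over-time view, to make the
    challenge above concrete. Everything here is an assumption for illustration: the
    record fields (`regulator`, `mechanism`, `time`, `dto_pvalue`), the function names,
    the toy values, and the idea that "performs best" means the lowest DTO p-value.
    It does not use the VirtualDB API.

```python
from collections import defaultdict

# Toy records (made-up values): one dict per Hackett sample.
SAMPLES = [
    {"regulator": "REG_A", "mechanism": "ZEV", "time": 15, "dto_pvalue": 0.03},
    {"regulator": "REG_A", "mechanism": "ZEV", "time": 45, "dto_pvalue": 0.004},
    {"regulator": "REG_A", "mechanism": "GEV", "time": 45, "dto_pvalue": 0.02},
    {"regulator": "REG_B", "mechanism": "ZEV", "time": 15, "dto_pvalue": 0.2},
    {"regulator": "REG_B", "mechanism": "ZEV", "time": 45, "dto_pvalue": 0.008},
]


def best_per_regulator(samples):
    """Keep one sample per regulator: the one with the lowest DTO p-value.

    Assumption: lowest p-value is what "performs best" means here.
    """
    best = {}
    for s in samples:
        current = best.get(s["regulator"])
        if current is None or s["dto_pvalue"] < current["dto_pvalue"]:
            best[s["regulator"]] = s
    return best


def pvalues_by_time(samples):
    """Group DTO p-values by timepoint, as raw input for a distribution plot."""
    by_time = defaultdict(list)
    for s in samples:
        by_time[s["time"]].append(s["dto_pvalue"])
    return dict(sorted(by_time.items()))


best = best_per_regulator(SAMPLES)
by_time = pvalues_by_time(SAMPLES)
```

    Note that picking the best-scoring condition per regulator after looking at all of
    them is itself a selection step, so it may be worth reporting alongside a fixed-timepoint
    result.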

    Analysis steps:

    i. Select binding samples from all datasets with DTO vs. Hackett-2020-ZEV P<=0.01

    ii. Select binding samples from all datasets with DTO vs. Kemmeren-2014-TFKO P<=0.01

    iii. Intersect the previous two sets (this is probably a composition of the two filters above)

    iv. For each regulator with at least one active-set sample, report the number of active samples:
    - As a table: one row per regulator, with its count
    - As a distribution: of the count above, across TFs
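
    Steps i to iv could be sketched roughly as below. This assumes the DTO results have
    already been exported to plain records; the field names (`binding_sample`, `regulator`,
    `perturbation`, `dto_pvalue`), the function names, and the toy values are hypothetical,
    and "active set" is taken to mean the intersection from step iii.

```python
from collections import Counter

P_CUTOFF = 0.01

# Toy rows (made-up values): one row per binding sample x perturbation dataset.
ROWS = [
    {"binding_sample": "b1", "regulator": "REG_A", "perturbation": "Hackett-2020-ZEV", "dto_pvalue": 0.001},
    {"binding_sample": "b1", "regulator": "REG_A", "perturbation": "Kemmeren-2014-TFKO", "dto_pvalue": 0.005},
    {"binding_sample": "b2", "regulator": "REG_A", "perturbation": "Hackett-2020-ZEV", "dto_pvalue": 0.002},
    {"binding_sample": "b2", "regulator": "REG_A", "perturbation": "Kemmeren-2014-TFKO", "dto_pvalue": 0.003},
    {"binding_sample": "b3", "regulator": "REG_B", "perturbation": "Hackett-2020-ZEV", "dto_pvalue": 0.004},
    {"binding_sample": "b3", "regulator": "REG_B", "perturbation": "Kemmeren-2014-TFKO", "dto_pvalue": 0.2},
    {"binding_sample": "b4", "regulator": "REG_B", "perturbation": "Hackett-2020-ZEV", "dto_pvalue": 0.01},
    {"binding_sample": "b4", "regulator": "REG_B", "perturbation": "Kemmeren-2014-TFKO", "dto_pvalue": 0.01},
]


def passing_samples(rows, perturbation, cutoff=P_CUTOFF):
    """Steps i / ii: binding samples with DTO P <= cutoff against one perturbation dataset."""
    return {
        r["binding_sample"]
        for r in rows
        if r["perturbation"] == perturbation and r["dto_pvalue"] <= cutoff
    }


def active_counts(rows, cutoff=P_CUTOFF):
    zev = passing_samples(rows, "Hackett-2020-ZEV", cutoff)
    tfko = passing_samples(rows, "Kemmeren-2014-TFKO", cutoff)
    active = zev & tfko  # step iii: intersection of the two sets

    # Step iv: active samples per regulator, then the distribution of that count.
    regulator_of = {r["binding_sample"]: r["regulator"] for r in rows}
    per_regulator = Counter(regulator_of[s] for s in active)
    distribution = Counter(per_regulator.values())
    return per_regulator, distribution


per_regulator, distribution = active_counts(ROWS)
```

    Here `per_regulator` is the table (regulator to count) and `distribution` maps each
    count to the number of TFs with that count. Regulators with no active sample do not
    appear in either; swapping `zev & tfko` for `zev | tfko` would give the "either" variant
    from step 2.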
