Commit 4d95425

feat: update installation guidances
1 parent ba2bcf4 commit 4d95425

35 files changed: +4278 -533 lines changed
Lines changed: 86 additions & 61 deletions
@@ -1,161 +1,186 @@
11
---
22
weight: 200
33
title: "Explore Matches"
4-
description: "Once the files finished loading, you will be presented with the main BDIViz interface."
4+
description: "Explore schema alignment candidates using the BDIViz interactive heatmap interface."
55
icon: "zoom_in"
66
date: "2025-04-19T13:38:17-04:00"
7-
lastmod: "2025-04-19T13:38:17-04:00"
7+
lastmod: "2025-04-21T11:00:00-04:00"
88
draft: false
99
toc: true
1010
---
1111

12-
> Once your files are uploaded, BDIViz will load the full interface for exploring match candidates between your source and target datasets.
12+
## Overview
13+
14+
After uploading your files, BDIViz launches its main interface, allowing you to visually explore match candidates between your source dataset and the target schema.
15+
16+
---
1317

1418
## Interactive Heatmap
1519

16-
This is the central visual of BDIViz. Each cell in the heatmap represents a match candidate between a source attribute (on the y-axis) and a target attribute (on the x-axis).
20+
The heatmap is the primary visualization tool in BDIViz. Each cell represents a match candidate between a source attribute (y-axis) and a target attribute (x-axis).
1721

1822
![heatmap-gif](./images/interactive-heatmap.gif)
1923

20-
21-
### Heatmap Components
24+
### Key Features
2225

2326
{{< tabs tabTotal="4">}}
24-
{{% tab tabName="Heatmap Nodes" %}}
2527

26-
Each cell in the heatmap represents a match candidate between a source attribute (on the y-axis) and a target attribute (on the x-axis).
28+
{{% tab tabName="Heatmap Cells" %}}
2729

28-
- Color intensity represents match strength: darker = higher match score.
29-
- Green indicates accepted matches; red indicates rejected ones.
30-
- Click on a node to inspect further details.
30+
Each cell corresponds to a potential match:
3131

32-
{{% /tab %}}
33-
{{% tab tabName="X-Axis" %}}
32+
- **Color intensity** indicates similarity score (darker = stronger match).
33+
- **Green** = accepted match, **Red** = rejected match.
34+
- Click on a cell to inspect match details.
3435

35-
![schema-hierarchy-gif](./images/schema-hierarchy.gif)
36+
{{% /tab %}}
3637

37-
The x-axis of the heatmap represents the target schema’s attribute space in a structured, hierarchical layout—specifically designed to support the complexity of biomedical data models like the Genomic Data Commons (GDC).
38+
{{% tab tabName="X-Axis Hierarchy" %}}
3839

39-
This hierarchy includes:
40+
![schema-hierarchy-gif](./images/schema-hierarchy.gif)
4041

41-
- **Categories** (e.g., "clinical," "biospecimen")
42-
- **Nodes** (e.g., "diagnosis," "treatment")
43-
- **Target Attributes** as leaf nodes
42+
The x-axis displays the target schema's structure as a semantic hierarchy:
4443

45-
Color coding helps differentiate between categories, while curved connectors visually represent hierarchical relationships. Users can hover over any part of the hierarchy to highlight supercategories, specific categories, or individual columns, aiding navigation and contextual understanding.
44+
- **Category Level**: e.g., `clinical`, `biospecimen`
45+
- **Node Level**: e.g., `diagnosis`, `treatment`
46+
- **Leaf Nodes**: individual target attributes
4647

48+
Color and curved connectors help clarify relationships and improve navigation.
4749

4850
{{% /tab %}}
49-
{{% tab tabName="Expandable Node" %}}
5051

51-
Once you click on a heatmap node, the node expand and shows a stacked heatmap.
52+
{{% tab tabName="Expandable Nodes" %}}
5253

53-
- **The upper histogram** shows the distribution of values from the source column.
54+
Clicking a heatmap node reveals a stacked histogram panel:
5455

55-
- **The lower histogram** shows the distribution from the target column.
56+
- **Top Chart**: Source column value distribution
57+
- **Bottom Chart**: Target column value distribution
5658

57-
These visuals help assess whether the values in the two columns are statistically or categorically aligned. For example, if both columns share similar categories with comparable frequency distributions, it suggests a strong match.
59+
Use this to evaluate whether the two columns share meaningful overlap.
5860

59-
_**Interpretation Tip:** Matching distributions with similar peaks (e.g., a high frequency of "Male" and "Female" values) may indicate semantic alignment. Dissimilar distributions can signal a mismatch or a need for closer inspection._
61+
**Tip:** Similar distributions (e.g., shared categories like "Male" and "Female") often suggest semantic alignment.
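
One way to put a number on "similar distributions" is the overlap between the two normalized value histograms. The sketch below is illustrative only: `distribution_overlap` is a hypothetical helper, and histogram intersection is an assumed measure, not something BDIViz is documented to compute.

```python
from collections import Counter


def distribution_overlap(source_values, target_values):
    """Histogram intersection of two categorical columns (hypothetical helper).

    Returns a value in [0, 1]: 1.0 means identical relative frequencies,
    0.0 means the columns share no values.
    """
    if not source_values or not target_values:
        return 0.0
    source_counts = Counter(source_values)
    target_counts = Counter(target_values)
    source_total = len(source_values)
    target_total = len(target_values)
    # Sum, over shared categories, the smaller of the two relative frequencies.
    return sum(
        min(source_counts[value] / source_total, target_counts[value] / target_total)
        for value in source_counts.keys() & target_counts.keys()
    )


source_column = ["Male", "Male", "Female", "Female"]
target_column = ["Male", "Female", "Female", "Female"]
print(distribution_overlap(source_column, target_column))
```

A high overlap is only a hint of semantic alignment; columns with the same categories can still describe different concepts.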
6062

6163
![embedded-node](./images/embedded-node.png)
6264

6365
{{% /tab %}}
64-
{{% tab tabName="Top Tabs (above heatmap)" %}}
6566

66-
![upper-tab](./images/upper-tab.png)
67-
68-
- **All:** Displays all potential matches.
67+
{{% tab tabName="Heatmap Top Tabs" %}}
6968

70-
- **Accepted:** Shows only those that have been manually confirmed.
69+
![upper-tab](./images/upper-tab.png)
7170

72-
- **Unmatched:** Lists only those source attributes with no confirmed match.
71+
The top-level filter tabs help narrow your focus:
7372

74-
- **Expand On Hover:** _This toggle controls if the expandable node will be expanded on mouse hover or not(on click)._
73+
- **All**: View all candidate matches
74+
- **Accepted**: Only show confirmed matches
75+
- **Unmatched**: Only show source columns with no confirmed match
76+
- **Expand on Hover**: Toggle whether expanded histograms appear on hover or click
7577

7678
{{% /tab %}}
79+
7780
{{< /tabs >}}
7881

79-
## Filters and Search Tools
82+
---
83+
84+
## Filters and Search Controls
8085

81-
BDIViz provides several ways to customize and narrow your view:
86+
Fine-tune the heatmap view using a suite of filters:
8287

8388
{{< tabs tabTotal="4">}}
84-
{{% tab tabName="Source Attribute" %}}
8589

86-
Lets you choose which column to focus on.
90+
{{% tab tabName="Source Attribute Selector" %}}
91+
92+
Select a specific source column to examine.
8793

8894
![source-attribute](./images/source-attribute.png)
8995

90-
Once click you will able to see this dropdown menu showing all the source attributes:
91-
- **All:** Shows all source attributes in a paginatable manner.
92-
- **Attributes in green:** The attributes that already have at least one accpeted candidate.
93-
- **Attributes in grey:** The attributes that is discarded by user.
96+
Dropdown key:
97+
98+
- **Green**: Already matched
99+
- **Grey**: Manually discarded
100+
- **All**: Show all source attributes
94101

95102
![source-attribute-drop](./images/source-attribute-drop.png)
96103

97104
{{% /tab %}}
105+
98106
{{% tab tabName="Similarity Threshold" %}}
99107

100-
Adjusts the minimum score a candidate must meet to appear. Values range from 0 (show all) to 1 (only highest similarity).
108+
Set a minimum similarity score for visible candidates. Range: `0.0 – 1.0`.
109+
110+
- `0.0`: Show all matches
111+
- `1.0`: Show only perfect matches
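
The effect of the threshold can be sketched as a simple filter over scored candidates. This is a minimal illustration, assuming each candidate carries a score in the `0.0 – 1.0` range; the `Candidate` class, the attribute names, and `filter_by_threshold` are hypothetical and are not part of the BDIViz API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical match candidate: one heatmap cell."""
    source: str
    target: str
    score: float


def filter_by_threshold(candidates, threshold):
    """Keep candidates whose score is at least `threshold`."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    return [c for c in candidates if c.score >= threshold]


# Made-up example data for illustration only.
candidates = [
    Candidate("sex", "gender", 1.0),
    Candidate("sex", "ethnicity", 0.35),
    Candidate("age", "age_at_diagnosis", 0.8),
]

print(len(filter_by_threshold(candidates, 0.0)))  # every candidate passes
print([c.target for c in filter_by_threshold(candidates, 1.0)])  # only perfect scores
```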
101112

102113
{{% /tab %}}
103-
{{% tab tabName="Similar Attributes" %}}
104114

105-
Determines how many similar source columns appear in the y-axis for better comparative context.
115+
{{% tab tabName="Similar Attributes Displayed" %}}
106116

107-
**Note:** This only apply when **Source Attribute** is not set to **All**.
117+
Controls how many similar source attributes appear on the heatmap when a single source column is selected.
108118

119+
> Note: Only applies when Source Attribute ≠ "All".
109120
110121
{{% /tab %}}
111-
{{% tab tabName="Search Bar" %}}
112122

113-
Lets you highlight specific target attributes by name or keyword.
123+
{{% tab tabName="Search Bar" %}}
114124

125+
Quickly locate and highlight target attributes by name or keyword.
115126

116127
{{% /tab %}}
128+
117129
{{< /tabs >}}
118130

131+
---
119132

120-
## Lower Panel Visuals
133+
## Lower Panel: Match Details
121134

122-
When a node is selected from the heatmap, the bottom panels expand to provide deeper insights:
135+
Clicking on any heatmap node reveals deeper insights below:
123136

124137
{{< tabs tabTotal="2">}}
138+
125139
{{% tab tabName="Value Comparisons" %}}
126140

127141
![value-comparisons](./images/value-comparisons.png)
128142

129-
Visual representation of fuzzy-matched value pairs between the selected source and target columns.
130-
131-
Each row shows a unique source value and its closest string-matched counterpart(s) from the target.
143+
Visualizes string-based fuzzy matching between source and target values.
132144

133-
Helps validate or question whether two attributes should be considered a match.
145+
- Each row: one source value + its closest matches
146+
- Use to validate whether mapping is semantically and syntactically justified
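
The idea behind this panel can be approximated with the standard library. The sketch below uses `difflib.get_close_matches` as a stand-in; the matcher BDIViz actually uses is not specified here, and `closest_values`, the case-insensitive comparison, and the cutoff of `0.6` are assumptions made for illustration.

```python
from difflib import get_close_matches


def closest_values(source_values, target_values, n=3, cutoff=0.6):
    """Map each source value to its closest target values (hypothetical helper).

    Comparison is case-insensitive; the original target spelling is returned.
    """
    # Index targets by lowercase form, keeping the first spelling seen.
    by_lowercase = {}
    for target in target_values:
        by_lowercase.setdefault(target.lower(), target)

    result = {}
    for source in source_values:
        matches = get_close_matches(
            source.lower(), list(by_lowercase), n=n, cutoff=cutoff
        )
        result[source] = [by_lowercase[m] for m in matches]
    return result


pairs = closest_values(["male", "xyz"], ["Male", "Female"])
for source, targets in pairs.items():
    print(source, "->", targets)
```

A source value with an empty list has no sufficiently close target value, which is a reason to look at the candidate more closely before accepting it.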
134147

135148
{{% /tab %}}
149+
136150
{{% tab tabName="UpSet Plot" %}}
137151

138152
![upset-plot](./images/upset-plot.png)
139153

140-
the UpSet Plot provides a detailed breakdown of how each individual matcher contributed to the overall score of the selected candidate.
154+
Visualizes how different matchers contributed to the candidate match score.
141155

142-
Each matcher is represented as a row, and each column represents a candidate match. A dot is shown where a matcher supports a given match.
156+
- Each row: a matcher
157+
- Each column: a candidate
158+
- Dots indicate support from that matcher
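
The row, column, and dot layout corresponds to a boolean matrix of matchers by candidates. The sketch below builds such a matrix from made-up data; `support_matrix`, the matcher names, and the target names are hypothetical and only illustrate how to read the plot.

```python
def support_matrix(matcher_support, candidates):
    """Return {matcher: [bool per candidate]}; True means a dot in the plot."""
    return {
        matcher: [candidate in supported for candidate in candidates]
        for matcher, supported in matcher_support.items()
    }


def render(matrix, candidates):
    """Render the matrix as text, one row per matcher."""
    width = max(len(name) for name in matrix)
    lines = [" " * width + "  " + "  ".join(candidates)]
    for matcher, row in matrix.items():
        cells = [
            ("●" if dot else "·").center(len(candidate))
            for dot, candidate in zip(row, candidates)
        ]
        lines.append(matcher.ljust(width) + "  " + "  ".join(cells))
    return "\n".join(lines)


# Made-up example data for illustration only.
candidates = ["gender", "ethnicity"]
matcher_support = {
    "name_matcher": {"gender"},
    "value_matcher": {"gender", "ethnicity"},
}

matrix = support_matrix(matcher_support, candidates)
print(render(matrix, candidates))
```

A column with dots in most rows is a candidate that several matchers agree on.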
143159

144160
{{% /tab %}}
161+
145162
{{< /tabs >}}
146163

164+
---
165+
147166
## LLM Agent Panel
148167

149168
![agent-panel](./images/agent-panel.png)
150169

151-
- **Overview Diagnosis:** Summarizes whether the current match is likely valid or not, based on metadata, column names, and prior interactions.
170+
An embedded LLM-powered assistant provides contextual insights:
152171

153-
- **Explanation Cards:**
172+
- **Diagnosis Summary**: Determines if a match is likely valid based on metadata, column names, and user history
173+
- **Explanation Cards**: Highlight key reasons (e.g., name similarity, value overlap)
174+
- Click to expand for more detail
175+
- Provide feedback via 👍/👎 to refine model accuracy
176+
- **Target Schema Metadata**: Enriched descriptions pulled from sources such as GDC
154177

155-
- Each card explains one rationale (e.g., shared terms, similar distributions, matching patterns).
178+
---
156179

157-
- Click to expand each explanation.
180+
## What’s Next?
158181

159-
- Feedback Buttons: Let you provide a **thumbs-up** or **thumbs-down** to help improve the model’s future reasoning.
182+
After reviewing the matches:
160183

161-
- **Target Schema Descriptions:** Additional metadata from sources like GDC are displayed for further context.
184+
- Accept or reject individual match suggestions
185+
- Use filtering to prioritize high-confidence matches
186+
- Proceed to export, refine, or apply your matched schema for downstream harmonization tasks
