diff --git a/docs/api/plotting.md b/docs/api/plotting.md
index 4383853b1b3..301a3f62457 100644
--- a/docs/api/plotting.md
+++ b/docs/api/plotting.md
@@ -74,17 +74,26 @@ chart  # You can now access chart.value to get the selected data
 
 Altair has a concept of [data](https://altair-viz.github.io/user_guide/data_transformers.html) transformers, which can be used to improve performance.
 
-Such examples are:
-
-- pandas Dataframe has to be sanitized and serialized to JSON.
-- The rows of a Dataframe might need to be sampled or limited to a maximum number.
-- The Dataframe might be written to a `.csv` or `.json` file for performance reasons.
-
-By default, Altair uses the `default` data transformer, which is the slowest in marimo. It is limited to 5000 rows (although we increase this to `20_000` rows as marimo can handle this). This includes the data inside the HTML that is being sent over the network, which can also be limited by marimo's maximum message size.
-
-It is recommended to use the `marimo_csv` data transformer, which is the most performant and can handle the largest datasets: it converts the data to a CSV file which is smaller and can be sent over the network. This can handle up to +400,000 rows with no issues.
-
-When using `mo.ui.altair_chart`, we automatically set the data transformer to `marimo_csv` for you. If you are using Altair directly, you can set the data transformer using the following code:
+Some examples are:
+
+- pandas Dataframe has to be sanitized and serialized to JSON;
+- the rows of a Dataframe might need to be sampled or limited to a maximum number;
+- the Dataframe might be written to a `.csv` or `.json` file for performance reasons.
+
+By default, Altair uses the `default` data transformer, which is the slowest in
+marimo. It is limited to 5000 rows (although we increase this to `20_000` rows
+as marimo can handle this). This includes the data inside the HTML that is
+being sent over the network, which can also be limited by marimo's maximum
+message size.
+
+It is recommended to use the `marimo_csv` data transformer, which is the most
+performant and can handle the largest datasets: it converts the data to a CSV
+file which is smaller and can be sent over the network. This can handle up to
++400,000 rows with no issues.
+
+When using `mo.ui.altair_chart`, we automatically set the data transformer to
+`marimo_csv` for you. If you are using Altair directly, you can set the data
+transformer using the following code:
 
 ```python
 import altair as alt
diff --git a/docs/guides/working_with_data/plotting.md b/docs/guides/working_with_data/plotting.md
index 0de3ae7f3d8..d2735e0c98b 100644
--- a/docs/guides/working_with_data/plotting.md
+++ b/docs/guides/working_with_data/plotting.md
@@ -41,7 +41,7 @@ points inside the selection. When nothing is selected, `fig.value` is falsy and
 
 #### Example
 
-
+/// tab | code
 
 ```python
 import matplotlib.pyplot as plt
@@ -62,6 +62,37 @@ mask = ax.value.get_mask(x, y)
 selected_x, selected_y = x[mask], y[mask]
 ```
 
+///
+
+/// tab | live example
+
+/// marimo-embed
+    size: large
+
+```python
+@app.cell
+def __():
+    import matplotlib.pyplot as plt
+    import numpy as np
+
+    x = np.random.randn(500)
+    y = np.random.randn(500)
+    plt.scatter(x, y)
+    ax = mo.ui.matplotlib(plt.gca())
+    ax
+    return
+
+@app.cell
+def __():
+    mask = ax.value.get_mask(x, y)
+    np.column_stack([x[mask], y[mask]])
+    return
+```
+
+///
+
+///
+
 #### Debouncing
 
 By default, the selection streams to Python as you drag. For expensive
@@ -284,7 +315,7 @@ conda install -c conda-forge "vegafusion-python-embed>=1.4.0" "vegafusion>=1.4.0
 @app.cell(hide_code=True)
 async def __():
     import micropip
-    await micropip.install("plotly[express]")
+    await micropip.install(["plotly[express]", "pandas"])
     import plotly.express as px
     return px,