@@ -47,9 +47,9 @@ By default, an Actor keeps its state in the server's memory. During a server swi

## Implementing state persistence

-The [Apify SDKs](/sdk) handle state persistence automatically.
+To handle state persistence, use the [`Actor.useState()`](/sdk/js/reference/class/Actor#useState) method. This method automatically saves and retrieves your state during migrations.

-This is done using the `Actor.on()` method and the `migrating` event.
+For more control or when using Python, you can manually handle state persistence using the `Actor.on()` method and the `migrating` event.
Contributor

What control? More than what? And why do we jump straight from "more control" to Python? What is the connection?

Member

Unrelated to the PR: Hmm, the Python situation is kind of a bummer. It seems we have `use_state` in Crawlee but not in the SDK. We should add it there, @Pijukatel.

Contributor

I added the issue apify/apify-sdk-python#735

But we have to think this through, given the recent (and maybe upcoming) changes to `use_state`:
apify/crawlee#3309
apify/crawlee-python#1669

Should `Actor.use_state` point to a different state than `Crawler.use_state` unless they are explicitly set up to share the same state? If we separate the `use_state` of two different crawlers by default, then each of them will be separate from the Actor's state as well.

Member

This is determined by the key-value store record key, which is one of the parameters of `useState`.

If users use both this and `Crawler.useState`, I think those should be 2 separate instances with 2 different default keys. But there isn't much of a use case for this, and if users combine a lot of these, they are asking for trouble.
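
To illustrate the point about record keys, here is a minimal sketch. It uses a hypothetical in-memory `use_state` helper, not the Apify SDK or Crawlee API, and the two default key names are assumptions made up for the example. It only shows that the key decides whether two callers share a state or get separate ones.

```python
from typing import Any, Dict, Optional

# Hypothetical in-memory stand-in for a key-value store; not the Apify SDK or Crawlee API.
_store: Dict[str, Dict[str, Any]] = {}

# Assumed default keys, invented for this illustration only.
ACTOR_DEFAULT_KEY = "ACTOR_STATE"
CRAWLER_DEFAULT_KEY = "CRAWLER_STATE"


def use_state(key: str, default: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return the state object stored under `key`, creating it on first use."""
    if key not in _store:
        _store[key] = dict(default or {})
    return _store[key]


actor_state = use_state(ACTOR_DEFAULT_KEY, {"processed": 0})
crawler_state = use_state(CRAWLER_DEFAULT_KEY, {"processed": 0})

# Different keys: updating one state leaves the other untouched.
actor_state["processed"] += 1

# Same key: the caller gets the same object back, so the state is shared.
shared_state = use_state(ACTOR_DEFAULT_KEY)
```

With separate default keys, the two states only become shared when a caller passes the same key explicitly.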


- The `migrating` event is triggered just before a migration occurs, allowing you to save your state.
- To retrieve previously saved state, you can use the [`Actor.getValue`](/sdk/js/reference/class/Actor#getValue)/[`Actor.get_value`](/sdk/python/reference/class/Actor#get_value) methods.
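
The manual approach described in the two bullets above can be sketched as follows. This is a self-contained illustration under stated assumptions: `FakeActor` is a hypothetical stand-in, not the Apify SDK. Its `on` and `get_value` methods are named after the methods mentioned above, while `set_value`, `emit`, the `my-state` key, and the `migrate_after` parameter are assumptions added so the example can run locally.

```python
import asyncio
import copy
from typing import Any, Awaitable, Callable, Dict, List, Optional

Handler = Callable[[], Awaitable[None]]


class FakeActor:
    """Hypothetical stand-in for the Actor calls used here; not the Apify SDK."""

    def __init__(self, store: Dict[str, Any]) -> None:
        self._store = store  # plays the role of the key-value store
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, event: str, handler: Handler) -> None:
        self._handlers.setdefault(event, []).append(handler)

    async def get_value(self, key: str) -> Any:
        return copy.deepcopy(self._store.get(key))

    async def set_value(self, key: str, value: Any) -> None:
        self._store[key] = copy.deepcopy(value)

    async def emit(self, event: str) -> None:
        # Simulation helper: in a real run the event would come from the platform.
        for handler in self._handlers.get(event, []):
            await handler()


async def run(actor: FakeActor, items: List[str],
              migrate_after: Optional[int] = None) -> List[str]:
    # Restore previously saved state, or start fresh on the first run.
    state = await actor.get_value("my-state") or {"done": []}

    async def save_state() -> None:
        await actor.set_value("my-state", state)

    # Save the state just before the (simulated) migration happens.
    actor.on("migrating", save_state)

    for item in items:
        if item in state["done"]:
            continue  # already handled before the migration
        state["done"].append(item)
        if migrate_after is not None and len(state["done"]) == migrate_after:
            await actor.emit("migrating")
            return state["done"]  # this run ends here, as if the server went away
    return state["done"]


store: Dict[str, Any] = {}
first = asyncio.run(run(FakeActor(store), ["a", "b", "c"], migrate_after=2))
second = asyncio.run(run(FakeActor(store), ["a", "b", "c"]))
```

The second run starts from the state saved by the `migrating` handler, so it skips the items the first run already processed.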
Member

It is kinda bad to recommend many different ways to do this. How about:

  1. Let's wait for the Python team to add `use_state` to the SDK, if it is reasonably easy.
  2. Recommend it as the primary method of handling state; we don't even need to mention `getValue` then.
  3. Show `useState` in the first code example.
  4. Keep the "manual" usage for the 2nd example with the reboot (I want this to be automatic in the SDK too, but that won't happen soon).

Contributor Author

Ok, let's wait for the Python team. I've subscribed to the issue, will finalize this once the Python issue is resolved.
