Implement remote evaluation and distributed garbage collection#46

Merged
jonatanklosko merged 3 commits into main from jk-remote
Mar 2, 2026
@jonatanklosko
Member

@jonatanklosko jonatanklosko commented Mar 2, 2026

This adds Pythonx.remote_eval/4 to run Python code on another node. Remote eval returns "remote" %Pythonx.Object{} structs, and we ensure the corresponding Python objects are kept alive on the owner node for as long as necessary. There is also Pythonx.copy_remote_object/1, which serializes the remote object and then deserializes it locally. Pythonx.Object also implements the FLAME.Trackable protocol, which sets up proper lifetime tracking for objects returned from FLAME.call/3 (similarly to Pythonx.remote_eval/4).

More details on the API in Pythonx docs, and implementation details in Pythonx.ObjectTracker.

We could have a separate Pythonx.Remote module, but the API surface is small, so I am not sure if we necessarily need that.

@jonatanklosko
Member Author

Currently Pythonx.remote_eval/4 returns the globals map. The user may not necessarily care about all of those, so tracking those just to be GCed immediately is unnecessary work. That said, it's likely fine for the use cases where Pythonx.remote_eval/4 would be used, such as notebooks. When using Pythonx in an actual app, the evaluation should rather be behind a GenServer API, and remote objects should not be at play. If we ever discover it's a problem, we can also add an option to prune/pick the returned globals.
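
One way to read the "behind a GenServer API" suggestion is sketched below. It is only an illustration: the `EvalServer` module, its `run/2` function, and the injected `eval_fun` are hypothetical names, not part of Pythonx. The evaluator is passed in as a function so the sketch runs without Python. In a real app it would presumably call Pythonx.eval/3 followed by Pythonx.decode/1 inside the server, so that only plain Elixir terms are returned to callers.

```elixir
defmodule EvalServer do
  # Hypothetical wrapper for illustration; the name and API are assumptions.
  use GenServer

  # `eval_fun` is a 1-arity function that runs the code and returns a plain
  # Elixir term. With Pythonx it would call Pythonx.eval/3 and then
  # Pythonx.decode/1, so no %Pythonx.Object{} ever leaves this process.
  def start_link(eval_fun) when is_function(eval_fun, 1) do
    GenServer.start_link(__MODULE__, eval_fun)
  end

  # Callers, local or on another node, only ever see the decoded reply.
  def run(server, code) do
    GenServer.call(server, {:run, code})
  end

  @impl true
  def init(eval_fun), do: {:ok, eval_fun}

  @impl true
  def handle_call({:run, code}, _from, eval_fun) do
    {:reply, eval_fun.(code), eval_fun}
  end
end

# Stand-in evaluator so the sketch runs without Python installed.
fake_eval = fn code -> {:evaluated, code} end

{:ok, server} = EvalServer.start_link(fake_eval)
result = EvalServer.run(server, "1 + 1")
IO.inspect(result)
```

Because the reply is ordinary data, nothing needs lifetime tracking on the caller's side, which is why remote objects would not come into play with this shape of API.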

"""

- defexception [:type, :value, :traceback]
+ defexception [:lines]
Member Author

Technically it's a breaking change, though ideally I would consider the exception fields to be opaque in this case.

Member Author

FTR, I changed it so that we can easily pass Pythonx.Error across nodes without worrying about tracking and going back to the owner for formatting. I don't think storing the separate type, value, and traceback objects was a good idea in the first place. If someone is interested in those, they should just use try/except in the Python code.
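
To illustrate why this helps across nodes: an exception that holds only preformatted text is plain data, so it can be sent to another node and formatted there without referencing any Python object. A minimal sketch, assuming a hypothetical `DemoError` module (not the actual Pythonx.Error implementation) whose `lines` field is a list of already-formatted traceback strings:

```elixir
defmodule DemoError do
  # Hypothetical stand-in for illustration only; not Pythonx.Error itself.
  # The single field holds traceback lines already formatted as strings,
  # so the struct carries no references to objects living on another node.
  defexception [:lines]

  @impl true
  def message(%__MODULE__{lines: lines}) do
    # Formatting is plain string concatenation and works on any node.
    IO.iodata_to_binary(lines)
  end
end

error = %DemoError{
  lines: [
    "Traceback (most recent call last):\n",
    "  File \"<string>\", line 1, in <module>\n",
    "ZeroDivisionError: division by zero\n"
  ]
}

# Round-tripping through the external term format mimics sending the
# struct to another node: only atoms, lists, and binaries are involved.
copied = error |> :erlang.term_to_binary() |> :erlang.binary_to_term()

IO.puts(Exception.message(copied))
```

With separate type, value, and traceback objects, formatting the message would instead require going back to the node that owns those objects.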

@josevalim
Contributor

> Currently Pythonx.remote_eval/4 returns the globals map

Perhaps we should make it so it will only return the last line? And if they want everything, they can call globals()?

if (enif_whereis_pid(caller_env, janitor_name, &janitor_pid)) {
auto device = type == 0 ? eval_info.stdout_device : eval_info.stderr_device;
// The device term is from a different env, so we copy it into
// the message env, otherwise we may run into unexpected behaviour.
Contributor

Which sort of unexpected behaviour?

Member Author

So far it has been working fine. While testing remote eval, where stdout_device and stderr_device are remote pid terms, I ran into a weird issue where the message we send below would include a random term (e.g. :infinity, {}) instead of the actual pid. I then revisited the code and realised that those terms are from a different env, and doing the copy first fixed the issue. It's interesting that only remote pid terms revealed the issue.

@jonatanklosko
Member Author

> > Currently Pythonx.remote_eval/4 returns the globals map
>
> Perhaps we should make it so it will only return the last line? And if they want everything, they can call globals()?

Similarly, they could return a tuple with the specific values they need. However, the issue with that approach is that there is no way to return separate values to the Elixir side; it would be a single %Pythonx.Object{}. And this is relevant, because in some cases they may want to call Pythonx.decode/1, but only on one of those globals.

Contributor

@josevalim josevalim left a comment

Beautiful! 😍

@josevalim
Contributor

> Similarly, they could return a tuple with the specific values they need. However, the issue with that approach is that there is no way to return separate values to the Elixir side; it would be a single %Pythonx.Object{}.

Ok, so we are back to square one... with the difference we would be holding fewer things in memory?

@jonatanklosko
Member Author

> > Similarly, they could return a tuple with the specific values they need. However, the issue with that approach is that there is no way to return separate values to the Elixir side; it would be a single %Pythonx.Object{}.
>
> Ok, so we are back to square one... with the difference we would be holding fewer things in memory?

I'm not following. My point is that if we don't return globals, it becomes an actual limitation.

jonatanklosko and others added 2 commits March 2, 2026 22:39
Co-authored-by: José Valim <jose.valim@gmail.com>
@jonatanklosko jonatanklosko merged commit 2f725ff into main Mar 2, 2026
9 checks passed
@jonatanklosko jonatanklosko deleted the jk-remote branch March 2, 2026 21:53