Pyfunc code #261

Merged
merged 24 commits into from
Aug 15, 2024

awicenec
Contributor

This PR lets users enter a function definition as a plain string directly in the func_code field of the PyFuncAPP. Until now the string had to be base64 encoded; for backwards compatibility that is still supported. The pyext package is used to generate a module from the code string at runtime. The code string can be single-line or multi-line, ideally copied and pasted straight from an editor. There is only one restriction: the function has to be defined as 'f', i.e.

def f(<arguments>):
    <more lines>
    return <whatever>

and the arguments have to be defined in the Fields Table for that function.
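
For illustration, here is a minimal sketch of how a code string that defines 'f' could be turned into a callable at runtime, with a base64 fallback. It is not the code from this PR: the PR uses pyext, whereas this sketch swaps in the standard library (`types.ModuleType` plus `exec`), and the helper name `load_f` and its base64 detection heuristic are assumptions made only for this example.

```python
import base64
import binascii
import types


def load_f(func_code: str):
    """Return the function named 'f' defined by func_code.

    Hypothetical helper, not part of DALiuGE. Accepts plain source text
    or, for backwards compatibility, base64-encoded source text.
    """
    source = func_code
    if "def f" not in func_code:
        # Assumption: a string without a visible 'def f' may be base64.
        try:
            source = base64.b64decode(func_code, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            source = func_code  # not base64 after all, treat as plain text

    # Execute the source inside a fresh module namespace.
    module = types.ModuleType("user_func")
    exec(source, module.__dict__)

    f = getattr(module, "f", None)
    if not callable(f):
        raise ValueError("func_code must define a function named 'f'")
    return f


plain = "def f(a, b):\n    return a + b\n"
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")

add_plain = load_f(plain)
add_encoded = load_f(encoded)
```

Both `add_plain` and `add_encoded` are then ordinary Python functions taking the arguments `a` and `b`, which in the real app would correspond to entries in the Fields Table.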

Note that this functionality was previously regarded as a security risk. Since the engine usually runs in user space on a resource where the user can run arbitrary Python code anyway, it does not directly add risk. However, running a graph from an unknown source without checking its func_code entries could cause an issue, though no bigger an issue than installing code from an unknown source.
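
As a rough sketch of what "checking the func_code entries" could look like before running a graph from an unknown source, the snippet below walks a parsed JSON structure and collects every string stored under a key named `func_code`. The helper `find_func_code` and the sample graph are hypothetical, and the assumption that func_code appears as a dictionary key is made purely for illustration; the real graph layout may differ.

```python
import json


def find_func_code(node, path="$"):
    """Collect (path, source) pairs for every 'func_code' string found.

    Hypothetical helper: assumes func_code is stored as a dict key.
    """
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = f"{path}.{key}"
            if key == "func_code" and isinstance(value, str):
                found.append((child, value))
            else:
                found.extend(find_func_code(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            found.extend(find_func_code(item, f"{path}[{index}]"))
    return found


# Made-up graph fragment, only to exercise the walk.
sample = json.loads(
    '{"nodes": [{"name": "a", "func_code": "def f(x):\\n    return x"},'
    ' {"name": "b"}]}'
)
entries = find_func_code(sample)
for where, source in entries:
    print(where)
    print(source)
```

Printing the entries this way lets a user read each function body before deciding whether to deploy the graph.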

@awicenec awicenec requested a review from myxie May 29, 2024 11:43
Contributor

@sourcery-ai sourcery-ai bot left a comment

Hey @awicenec - I've reviewed your changes and they look great!

Here's what I looked at during the review
  • 🟡 General issues: 4 issues found
  • 🟢 Security: all looks good
  • 🟢 Testing: all looks good
  • 🟢 Complexity: all looks good


daliuge-engine/dlg/apps/pyfunc.py (outdated)
daliuge-engine/dlg/apps/pyfunc.py (outdated)
daliuge-engine/dlg/apps/pyfunc.py
daliuge-engine/dlg/apps/pyfunc.py
docs/conf.py
Collaborator

@myxie myxie left a comment

Hi Andreas, just a clarification comment and maybe a docstring change.

daliuge-engine/dlg/apps/pyfunc.py (outdated)
daliuge-engine/dlg/apps/pyfunc.py
daliuge-engine/dlg/apps/pyfunc.py (outdated)
@awicenec
Contributor Author

awicenec commented May 30, 2024 via email

@awicenec awicenec marked this pull request as draft June 6, 2024 17:49
awicenec and others added 13 commits July 18, 2024 13:32
- Fixed error in app_base.py where the incorrect track_current_drop was being used.
- Added unit test to make sure this regression doesn't occur in the future
- Improved logging of exceptions in session.py
This is the result of not testing after cleaning up code whilst preparing a commit...
Hopefully this will improve the likelihood of the tests passing on GitHub.

Also removed methods used during testing that are no longer of use.
- Also added `conftest.py` so the runtime is loaded properly when pytest runs the tests as a suite (as opposed to running them individually).
- Can now specify --rpc_port and --event_port.
- Verified this works and two NodeManagers can be run using localhost.
@coveralls

coveralls commented Aug 14, 2024

Coverage Status

coverage: 79.692% (+0.07%) from 79.625%
when pulling d8c11dc on pyfunc_code
into 5c1ba54 on master.

@@ -533,7 +533,7 @@ class ArrayGatherApp(BarrierAppDROP):
 [dlg_batch_output("binary/*", [])],
 [dlg_streaming_input("binary/*")],
 )
-value_list = dlg_list_param("value_list", [])
+# value_list = dlg_list_param("value_list", [])
Collaborator

This value is used in the readWriteData method; I recognise that the run is called before readWriteData, but we should either initialise it earlier, or enforce that readWriteData can't be called before run().
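
The two options mentioned in this comment can be sketched roughly as follows. The class `GatherSketch` is a hypothetical stand-in, not the real ArrayGatherApp; only the method names `run` and `readWriteData` and the attribute `value_list` come from the discussion.

```python
class GatherSketch:
    """Hypothetical stand-in for an app with run() and readWriteData()."""

    def __init__(self):
        # Option 1: initialise early so the attribute always exists.
        self.value_list = []
        self._has_run = False

    def run(self):
        # In the real app the values would come from the inputs.
        self.value_list = [1, 2, 3]
        self._has_run = True

    def readWriteData(self):
        # Option 2: refuse to proceed if run() has not been called yet.
        if not self._has_run:
            raise RuntimeError("readWriteData() called before run()")
        return list(self.value_list)


app = GatherSketch()
app.run()
values = app.readWriteData()
```

Either option alone would remove the ordering hazard; the sketch shows both so the trade-off is visible: early initialisation silently yields an empty list, while the guard fails loudly.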

 from dlg.manager.client import MasterManagerClient
 from dlg.manager.proc_daemon import DlgDaemon

-_TIMEOUT = 5
+_TIMEOUT = 3
Collaborator

3 is likely too low - I have success getting this to work locally with 10 second timeout (which I know you previously tried here).

Contributor Author

Yes, I tried 10s, but it did not change anything, except that the wait until the tests showed up as failed got incredibly long. The original value was 5s and I will change it back to that. There are actually two timeouts: one for the startup of the manager (this one) and another for the client trying to connect to the manager (10s). Usually the second one is the one that fails, but that is not the actual problem; the problem is that the manager does not start up for some reason. Locally I never have issues, since the startup takes much less than even 3s.
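
One way to make a startup wait less sensitive to the chosen timeout is to poll until the service accepts connections instead of sleeping for a fixed time. The sketch below is only an illustration of that idea using the standard library; `wait_for_port` is a hypothetical helper and is not how the DALiuGE tests are written.

```python
import socket
import time


def wait_for_port(host, port, timeout=5.0, interval=0.1):
    """Poll until host:port accepts a TCP connection or timeout expires.

    Hypothetical helper. Returns True on success, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet: wait briefly and try again.
            time.sleep(interval)
    return False


# Usage: a throwaway local listener stands in for the manager.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
server.listen(1)
listening_port = server.getsockname()[1]

started = wait_for_port("127.0.0.1", listening_port, timeout=2.0)
server.close()
```

With polling, a fast local startup returns almost immediately, while a slow CI machine can be given a generous timeout without making every passing run slower. It would not help if the manager never starts at all, which is the underlying problem described above.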

@awicenec
Contributor Author

This is now a merge of three branches LIU-395, LIU-396 and pyfunc_code into master.

@awicenec awicenec marked this pull request as ready for review August 15, 2024 06:01
@awicenec awicenec merged commit ce6e150 into master Aug 15, 2024
21 checks passed
@awicenec awicenec deleted the pyfunc_code branch August 15, 2024 06:02
awicenec added a commit that referenced this pull request Oct 10, 2024
Pyfunc code + LIU-395 and LIU-396
3 participants