Working with Lithops: Massively Parallel Computing Made Easy 🚀

Lithops is integrated into PyRun and lets you run Python functions in parallel in the cloud with very little setup. This guide shows you how to get started with Lithops in your PyRun workspace and use serverless computing for data processing, simulations, and more.

Example Lithops Code: Simple Map Function

When you create a Lithops workspace in PyRun, you'll find an example.py file containing a basic Lithops example to get you started. Here's the code:

python
# Welcome to PyRun!

# To help you get started, we have included a small example
# showcasing how to use lithops.

# To install more packages, please edit the environment.yml
# file found in the .pyrun directory.

import lithops
import time

def my_map_function(id, x):
    print(f"I'm activation number {id}")
    time.sleep(5)
    return x + 7

if __name__ == "__main__":
    iterdata = [10, 11, 12, 13]
    fexec = lithops.FunctionExecutor()
    fexec.map(my_map_function, range(2))
    fexec.map(my_map_function, iterdata)
    print(fexec.get_result())

What this code does:

  • my_map_function(id, x): A simple Python function that Lithops executes in parallel in the cloud. It takes two arguments: id and x. Lithops treats id as a reserved parameter name and fills it in with the number of the invocation, so only x comes from the input data. The function prints a message with the activation number, pauses for 5 seconds (time.sleep(5)), and returns x + 7.
  • lithops.FunctionExecutor(): This line creates a Lithops FunctionExecutor instance, which is your entry point to running functions in the cloud with Lithops.
  • fexec.map(my_map_function, range(2)): This line submits two invocations of my_map_function to Lithops, one for each value produced by range(2) (0 and 1). These will run in parallel.
  • fexec.map(my_map_function, iterdata): This line submits four more invocations of my_map_function, using the iterdata list [10, 11, 12, 13] as input. These will also run in parallel.
  • fexec.get_result(): This line retrieves the results from all the map invocations. Lithops will automatically collect the results from the cloud and return them as a list.
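If you would rather collect the results of each map call separately, one option is to keep the futures that map returns and pass them to get_result through its fs argument. The sketch below assumes it is called from inside a Lithops workspace. run_both_maps is a hypothetical helper name, and the sleep and print from the example are left out to keep it short.

```python
def my_map_function(id, x):
    # Same logic as the workspace example, without the print and sleep.
    return x + 7


def run_both_maps():
    # Imported here so my_map_function can be tried out without Lithops installed.
    import lithops

    fexec = lithops.FunctionExecutor()
    first = fexec.map(my_map_function, range(2))             # futures of the first map
    second = fexec.map(my_map_function, [10, 11, 12, 13])    # futures of the second map

    # Passing fs limits get_result to the given futures.
    return fexec.get_result(fs=first), fexec.get_result(fs=second)


# In a Lithops workspace:
# first_results, second_results = run_both_maps()
```

Keeping the two lists of futures apart is useful when the map calls feed different later steps of your program.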

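In total, this code executes my_map_function six times: two invocations from the first map call and four from the second. How many of them run at the same time depends on the backend and worker settings in your Lithops configuration.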
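To get a feel for the result list before running anything in the cloud, the same logic can be approximated locally. The sketch below is a stand-in built on Python's concurrent.futures, not Lithops itself: local_map is a hypothetical helper that numbers each call from 0 to imitate the id argument, and the 5-second sleep is omitted.

```python
from concurrent.futures import ThreadPoolExecutor


def my_map_function(id, x):
    # Same computation as the workspace example, without the sleep.
    return x + 7


def local_map(func, iterdata, pool):
    # Hypothetical stand-in for fexec.map: one task per input value,
    # with a 0-based call number passed as the id argument.
    return [pool.submit(func, i, x) for i, x in enumerate(iterdata)]


with ThreadPoolExecutor(max_workers=6) as pool:
    futures = local_map(my_map_function, range(2), pool)
    futures += local_map(my_map_function, [10, 11, 12, 13], pool)
    # Collect results in submission order.
    results = [f.result() for f in futures]

print(results)  # [7, 8, 17, 18, 19, 20]
```

If Lithops returns results in submission order, as this sketch assumes, the cloud run of the example should print the same six values.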

Running the Example

To run this example code in PyRun:

  1. Ensure you have a Lithops workspace created. (See Getting Started if you haven't created one yet).
  2. Open the example.py file in your workspace.
  3. Click the "Run" button.

PyRun will handle the rest, using Lithops to distribute and execute my_map_function in parallel in your cloud account. Check the Real-Time Monitoring dashboard to see the progress of your Lithops job!

Customizing Lithops Configuration

Need to adjust the backend (e.g., change from AWS Lambda to EC2), modify the number of workers, memory, or other Lithops settings? PyRun makes it easy:

  1. Go to your PyRun Dashboard.

  2. Look for the "Lithops config" button. This button is typically located within your workspace view or in a dedicated configuration section of the dashboard.

  3. Click "Lithops config". This will open the Lithops configuration panel.

  4. Adjust Settings: Modify the Lithops configuration parameters as needed. You can typically adjust settings related to:

    • Backend: Choose your preferred compute backend (e.g., AWS Lambda, AWS EC2, AWS Fargate, etc.).
    • Resource Allocation: Control the number of workers, CPU cores, memory per worker, and other resource settings.
    • Storage: Configure storage options for intermediate data (if needed).
    • And more: Explore the full range of Lithops configuration options available in the PyRun interface.
  5. Save Changes: Click "Save" or "Apply" to save your Lithops configuration. These settings will be applied to subsequent Lithops executions in your workspace.
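Lithops also accepts some settings in code when the executor is created. The sketch below is an assumption about how you might use that, not PyRun's documented workflow: backend and runtime_memory are arguments of the Lithops FunctionExecutor, but whether values passed in code take precedence over the dashboard configuration in PyRun is not confirmed here. EXECUTOR_OPTIONS, merged_options, and make_executor are hypothetical names, and the values are placeholders.

```python
# Placeholder values; adjust them to your account and workload.
EXECUTOR_OPTIONS = {
    "backend": "aws_lambda",   # Lithops backend name for AWS Lambda
    "runtime_memory": 1024,    # memory per worker, in MB
}


def merged_options(**overrides):
    # Combine the defaults above with per-call overrides.
    return {**EXECUTOR_OPTIONS, **overrides}


def make_executor(**overrides):
    # Imported here so the option helpers can be used without Lithops installed.
    import lithops

    return lithops.FunctionExecutor(**merged_options(**overrides))


# In a Lithops workspace:
# fexec = make_executor(runtime_memory=2048)
```

For settings that should apply to every run in the workspace, the configuration panel described above remains the simpler route.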

Installing Additional Packages for Lithops

If your Lithops functions require additional Python packages beyond the default environment, remember to customize your runtime by editing the environment.yml file. (See Customizing Your Runtime).

Important for Lithops users: Ensure that your environment.yml includes the $LITHOPS dependency as described in the Customizing Your Runtime guide.
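After editing environment.yml, it can help to confirm that a package is actually available where your functions run. The sketch below is one possible way to check. check_runtime is a hypothetical function name, the package names are only examples, and the commented usage assumes a Lithops workspace.

```python
import importlib.util


def check_runtime(package_name):
    # Report whether a top-level package can be found in the environment
    # where this function executes.
    found = importlib.util.find_spec(package_name) is not None
    return package_name, found


# Local check against the current environment:
print(check_runtime("json"))

# In a Lithops workspace, the same function could be mapped over package names:
# import lithops
# fexec = lithops.FunctionExecutor()
# fexec.map(check_runtime, ["numpy", "pandas"])
# print(fexec.get_result())
```

Each returned pair holds the package name and a boolean, so a missing dependency shows up as False instead of as an import error in the middle of a job.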

Unlock Massively Parallel Power with PyRun and Lithops

PyRun's integration with Lithops makes massively parallel computing accessible to everyone. Start experimenting with the example code, customize your Lithops configuration, and unleash the power of serverless for your most demanding workloads!

Next Steps:

  • Explore more advanced Lithops examples and use cases.
  • Experiment with different Lithops backends and configurations.
  • Learn about integrating Lithops with other PyRun features like the Data Cockpit.

Happy parallel computing!