abacusai.pipeline_step

Classes

CodeSource

Code source for python-based custom feature groups and models

PythonFunction

Customer created python function

AbstractApiClass

PipelineStep

A step in a pipeline.

Module Contents

class abacusai.pipeline_step.CodeSource(client, sourceType=None, sourceCode=None, applicationConnectorId=None, applicationConnectorInfo=None, packageRequirements=None, status=None, error=None, publishingMsg=None, moduleDependencies=None)

Bases: abacusai.return_class.AbstractApiClass

Code source for python-based custom feature groups and models

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • sourceType (str) – The type of the source, one of TEXT, PYTHON, FILE_UPLOAD, or APPLICATION_CONNECTOR

  • sourceCode (str) – If the type of the source is TEXT, the raw text of the function

  • applicationConnectorId (str) – The Application Connector to fetch the code from

  • applicationConnectorInfo (str) – Args passed to the application connector to fetch the code

  • packageRequirements (list) – The pip package dependencies required to run the code

  • status (str) – The status of the code and validations

  • error (str) – If the status is failed, an error message describing what went wrong

  • publishingMsg (dict) – Warnings in the source code

  • moduleDependencies (list) – The list of internal module dependencies required to run the code

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

import_as_cell()

Adds the source code as an unexecuted cell in the notebook.

class abacusai.pipeline_step.PythonFunction(client, notebookId=None, name=None, createdAt=None, functionVariableMappings=None, outputVariableMappings=None, functionName=None, pythonFunctionId=None, functionType=None, packageRequirements=None, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

Customer created python function

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • notebookId (str) – The unique identifier of the notebook used to spin up the notebook upon creation.

  • name (str) – The name to identify the algorithm, only uppercase letters, numbers, and underscores allowed (i.e. it must be a valid Python identifier)

  • createdAt (str) – The ISO-8601 string representing when the Python function was created.

  • functionVariableMappings (dict) – A description of the function variables.

  • outputVariableMappings (dict) – A description of the variables returned by the function

  • functionName (str) – The name of the Python function to be used.

  • pythonFunctionId (str) – The unique identifier of the Python function.

  • functionType (str) – The type of the Python function.

  • packageRequirements (list) – The pip package dependencies required to run the code

  • codeSource (CodeSource) – Information about the source code of the Python function.

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

add_graph_to_dashboard(graph_dashboard_id, function_variable_mappings=None, name=None)

Add a python plot function to a dashboard

Parameters:
  • graph_dashboard_id (str) – Unique string identifier for the graph dashboard to update.

  • function_variable_mappings (List) – List of arguments to be supplied to the function as parameters, in the format [{'name': 'function_argument', 'variable_type': 'FEATURE_GROUP', 'value': 'name_of_feature_group'}].

  • name (str) – Name of the added python plot

Returns:

An object describing the graph dashboard.

Return type:

GraphDashboard
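The `function_variable_mappings` format described above can be assembled and sanity-checked before the call is made. The following is a minimal sketch, not part of the library: the helpers `make_mapping` and `check_mappings`, the argument name `df`, and the feature group name `sales_fg` are hypothetical, and `FEATURE_GROUP` is the only `variable_type` value shown because it is the only one the description mentions.

```python
REQUIRED_KEYS = {"name", "variable_type", "value"}


def make_mapping(name, variable_type, value):
    """Build one entry in the documented name/variable_type/value shape."""
    return {"name": name, "variable_type": variable_type, "value": value}


def check_mappings(mappings):
    """Return True if every entry is a dict containing the three documented keys."""
    return all(isinstance(m, dict) and REQUIRED_KEYS <= set(m) for m in mappings)


# Hypothetical argument and feature group names, for illustration only.
mappings = [make_mapping("df", "FEATURE_GROUP", "sales_fg")]
print(check_mappings(mappings))

# With an existing PythonFunction object (not constructed here), the list would
# be passed as:
# python_function.add_graph_to_dashboard(
#     graph_dashboard_id=dashboard_id,
#     function_variable_mappings=mappings,
#     name="Sales plot",
# )
```

Checking the shape locally catches a misspelled key before a request is sent; it does not verify that the feature group itself exists.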

validate_locally(kwargs=None)

Validates a Python function by running it with the given input values in a local environment. Each input feature group can be passed in kwargs either by name (a string) or as a pandas DataFrame.

Parameters:

kwargs (dict) – A dictionary mapping function arguments to values to pass to the function. Feature group names will automatically be converted into pandas dataframes.

Returns:

The result of executing the python function

Return type:

any

Raises:
  • TypeError – If an Input Feature Group argument has an invalid type or argument is missing.

  • Exception – If an error occurs while validating the Python function.
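To illustrate what "a dictionary mapping function arguments to values" means, the sketch below runs a plain Python function from a kwargs dict and raises `TypeError` when a required argument is missing. It is a stand-in under stated assumptions, not the library's implementation: `add_margin` and `run_with_kwargs` are hypothetical names, and the conversion of feature group names into pandas DataFrames that `validate_locally` performs is not reproduced here.

```python
import inspect


def add_margin(price, margin=0.1):
    """Hypothetical user function standing in for a registered Python function."""
    return round(price * (1 + margin), 2)


def run_with_kwargs(func, kwargs=None):
    """Call func(**kwargs); raise TypeError if a required argument is absent."""
    kwargs = kwargs or {}
    parameters = inspect.signature(func).parameters
    missing = [
        name
        for name, param in parameters.items()
        if param.default is inspect.Parameter.empty and name not in kwargs
    ]
    if missing:
        raise TypeError(f"Missing arguments: {missing}")
    return func(**kwargs)


result = run_with_kwargs(add_margin, {"price": 100.0, "margin": 0.25})
print(result)

# With an existing PythonFunction object, the analogous call would be:
# python_function.validate_locally(kwargs={"price": 100.0, "margin": 0.25})
```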

class abacusai.pipeline_step.AbstractApiClass(client, id)

__eq__(other)

Return self==value.

_get_attribute_as_dict(attribute)

class abacusai.pipeline_step.PipelineStep(client, pipelineStepId=None, pipelineId=None, stepName=None, pipelineName=None, createdAt=None, updatedAt=None, pythonFunctionId=None, stepDependencies=None, cpuSize=None, memory=None, timeout=None, pythonFunction={}, codeSource={})

Bases: abacusai.return_class.AbstractApiClass

A step in a pipeline.

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • pipelineStepId (str) – The reference to this step.

  • pipelineId (str) – The reference to the pipeline this step belongs to.

  • stepName (str) – The name of the step.

  • pipelineName (str) – The name of the pipeline this step is a part of.

  • createdAt (str) – The date and time at which this step was created.

  • updatedAt (str) – The date and time when this step was last updated.

  • pythonFunctionId (str) – The python function_id.

  • stepDependencies (list[str]) – List of steps this step depends on.

  • cpuSize (str) – CPU size specified for the step function.

  • memory (int) – Memory in GB specified for the step function.

  • timeout (int) – Timeout for the step in minutes, default is 300 minutes.

  • pythonFunction (PythonFunction) – Information about the python function for the step.

  • codeSource (CodeSource) – Information about the source code of the step function.
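Because each step records the steps it depends on in `stepDependencies`, a set of steps forms a dependency graph. The sketch below uses the standard-library `graphlib` module (Python 3.9+) to compute one order in which every step comes after its dependencies. The step names are hypothetical, and this only illustrates the dependency relation; how the service actually schedules steps is not described in this reference.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: step name -> names of the steps it depends on,
# mirroring the shape of the stepDependencies field.
step_dependencies = {
    "load": [],
    "clean": ["load"],
    "train": ["clean"],
    "report": ["clean", "train"],
}

# static_order() yields each step only after all of its dependencies.
order = list(TopologicalSorter(step_dependencies).static_order())
print(order)
```

If the dependencies contain a cycle, `TopologicalSorter` raises `graphlib.CycleError`, which makes it a convenient local check before steps are created or updated.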

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

delete()

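Deletes this step from its pipeline.

Parameters:

pipeline_step_id (str) – The ID of the pipeline step. The method signature takes no arguments; the ID is taken from this object rather than passed by the caller.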

update(function_name=None, source_code=None, step_input_mappings=None, output_variable_mappings=None, step_dependencies=None, package_requirements=None, cpu_size=None, memory=None, timeout=None)

Updates a step in a given pipeline.

Parameters:
  • function_name (str) – The name of the Python function.

  • source_code (str) – Contents of a valid Python source code file. The source code should contain the transform feature group functions. A list of allowed imports and system libraries for each language is specified in the user functions documentation section.

  • step_input_mappings (List) – List of Python function arguments.

  • output_variable_mappings (List) – List of Python function outputs.

  • step_dependencies (list) – List of step names this step depends on.

  • package_requirements (list) – List of package requirement strings. For example: ['numpy==1.2.3', 'pandas>=1.4.0'].

  • cpu_size (str) – Size of the CPU for the step function.

  • memory (int) – Memory (in GB) for the step function.

  • timeout (int) – Timeout for the pipeline step in minutes; the default is 300 minutes.

Returns:

Object describing the pipeline step.

Return type:

PipelineStep
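Since every parameter of `update` is optional, one convenient pattern is to collect the intended changes in a dict and pass only the values that were actually set. The following is a minimal sketch: `build_update_kwargs` is a hypothetical helper, the function and step names are made up, and how the service treats parameters that are left out is an assumption not confirmed by this reference.

```python
def build_update_kwargs(**options):
    """Keep only the options that were explicitly set (anything not None)."""
    return {key: value for key, value in options.items() if value is not None}


update_kwargs = build_update_kwargs(
    function_name="transform_sales",  # hypothetical function name
    package_requirements=["numpy==1.2.3", "pandas>=1.4.0"],  # documented format
    step_dependencies=["load"],  # hypothetical step name
    memory=16,  # in GB
    timeout=None,  # not set, so it is dropped
)
print(sorted(update_kwargs))

# With an existing PipelineStep object, the call would be:
# step = step.update(**update_kwargs)
```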

rename(step_name)

Renames a step in a given pipeline.

Parameters:

step_name (str) – The new name of the step.

Returns:

Object describing the pipeline step.

Return type:

PipelineStep

refresh()

Calls describe and refreshes the current object’s fields

Returns:

The current object

Return type:

PipelineStep

describe()

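Describes a pipeline step.

Parameters:

pipeline_step_id (str) – The ID of the pipeline step. The method signature takes no arguments; the ID is taken from this object rather than passed by the caller.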

Returns:

An object describing the pipeline step.

Return type:

PipelineStep