abacusai.batch_prediction
Classes
- BatchPredictionArgs: An abstract class for Batch Prediction args specific to problem type.
- BatchPredictionVersion: Batch Prediction Version
- PredictionFeatureGroup: Batch Input Feature Group
- PredictionInput: Batch inputs
- RefreshSchedule: A refresh schedule for an object. Defines when the next version of the object will be created.
- BatchPrediction: Make batch predictions.
Module Contents
- class abacusai.batch_prediction.BatchPredictionArgs
Bases:
abacusai.api_class.abstract.ApiClass
An abstract class for Batch Prediction args specific to problem type.
- problem_type: abacusai.api_class.enums.ProblemType
- classmethod _get_builder()
- class abacusai.batch_prediction.BatchPredictionVersion(client, batchPredictionVersion=None, batchPredictionId=None, status=None, driftMonitorStatus=None, deploymentId=None, modelId=None, modelVersion=None, predictionsStartedAt=None, predictionsCompletedAt=None, databaseOutputError=None, totalPredictions=None, failedPredictions=None, databaseConnectorId=None, databaseOutputConfiguration=None, fileConnectorOutputLocation=None, fileOutputFormat=None, connectorType=None, legacyInputLocation=None, error=None, driftMonitorError=None, monitorWarnings=None, csvInputPrefix=None, csvPredictionPrefix=None, csvExplanationsPrefix=None, databaseOutputTotalWrites=None, databaseOutputFailedWrites=None, outputIncludesMetadata=None, resultInputColumns=None, modelMonitorVersion=None, algoName=None, algorithm=None, outputFeatureGroupId=None, outputFeatureGroupVersion=None, outputFeatureGroupTableName=None, batchPredictionWarnings=None, bpAcrossVersionsMonitorVersion=None, batchPredictionArgsType=None, batchInputs={}, inputFeatureGroups={}, globalPredictionArgs={}, batchPredictionArgs={})
Bases:
abacusai.return_class.AbstractApiClass
Batch Prediction Version
- Parameters:
client (ApiClient) – An authenticated API Client instance
batchPredictionVersion (str) – The unique identifier of the batch prediction version
batchPredictionId (str) – The unique identifier of the batch prediction
status (str) – The current status of the batch prediction
driftMonitorStatus (str) – The status of the drift monitor for this batch prediction version
deploymentId (str) – The deployment used to make the predictions
modelId (str) – The model used to make the predictions
modelVersion (str) – The model version used to make the predictions
predictionsStartedAt (str) – Predictions start date and time
predictionsCompletedAt (str) – Predictions completion date and time
databaseOutputError (bool) – If true, there were errors reported by the database connector while writing
totalPredictions (int) – Number of predictions performed in this batch prediction job
failedPredictions (int) – Number of predictions that failed
databaseConnectorId (str) – The database connector to write the results to
databaseOutputConfiguration (dict) – Contains information about where the batch predictions are written to
fileConnectorOutputLocation (str) – Contains information about where the batch predictions are written to
fileOutputFormat (str) – The format of the batch prediction output (CSV or JSON)
connectorType (str) – Null if writing to internal console, else FEATURE_GROUP | FILE_CONNECTOR | DATABASE_CONNECTOR
legacyInputLocation (str) – The location of the input data
error (str) – Relevant error if the status is FAILED
driftMonitorError (str) – Error message for the drift monitor of this batch prediction
monitorWarnings (str) – Relevant warning if there are issues found in drift or data integrity
csvInputPrefix (str) – A prefix to prepend to the input columns, only applies when output format is CSV
csvPredictionPrefix (str) – A prefix to prepend to the prediction columns, only applies when output format is CSV
csvExplanationsPrefix (str) – A prefix to prepend to the explanation columns, only applies when output format is CSV
databaseOutputTotalWrites (int) – The total number of rows attempted to write (may be less than total_predictions if write mode is UPSERT and multiple rows share the same ID)
databaseOutputFailedWrites (int) – The number of failed writes to the Database Connector
outputIncludesMetadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version
resultInputColumns (list[str]) – If present, will limit result files or feature groups to only include columns present in this list
modelMonitorVersion (str) – The version of the model monitor
algoName (str) – The name of the algorithm used to train the model
algorithm (str) – The algorithm that is currently deployed.
outputFeatureGroupId (str) – The Batch Prediction output feature group ID if applicable
outputFeatureGroupVersion (str) – The Batch Prediction output feature group version if applicable
outputFeatureGroupTableName (str) – The Batch Prediction output feature group name if applicable
batchPredictionWarnings (str) – Relevant warnings if any issues are found
bpAcrossVersionsMonitorVersion (str) – The version of the batch prediction across versions monitor
batchPredictionArgsType (str) – The type of the batch prediction args
batchInputs (PredictionInput) – Inputs to the batch prediction
inputFeatureGroups (PredictionFeatureGroup) – List of prediction feature groups
globalPredictionArgs (BatchPredictionArgs)
batchPredictionArgs (BatchPredictionArgs) – Argument(s) passed to every prediction call
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
dict
- download_batch_prediction_result_chunk(offset=0, chunk_size=10485760)
Returns a stream containing the batch prediction results.
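A minimal sketch of paging through the results with this method. It assumes, without confirmation from this reference, that the returned stream has a file-like read(), that offset counts bytes, and that an empty chunk marks the end of the results. download_in_chunks is a hypothetical helper, not part of the library.

```python
def download_in_chunks(version, out_file, chunk_size=10485760):
    """Copy batch prediction results to out_file one chunk at a time.

    Assumptions (not stated in the reference): the returned stream is
    file-like, offset is a byte offset, and an empty read means the
    results are exhausted.
    """
    offset = 0
    while True:
        stream = version.download_batch_prediction_result_chunk(
            offset=offset, chunk_size=chunk_size
        )
        data = stream.read()
        if not data:
            break
        out_file.write(data)
        offset += len(data)
    return offset  # total number of bytes written
```

Here version stands for a BatchPredictionVersion and out_file for any object opened for binary writing.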
- get_batch_prediction_connector_errors()
Returns a stream containing the batch prediction database connection write errors, if any writes failed for the specified batch prediction job.
- Parameters:
batch_prediction_version (str) – Unique string identifier of the batch prediction job to get the errors for.
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
BatchPredictionVersion
- describe()
Describes a Batch Prediction Version.
- Parameters:
batch_prediction_version (str) – Unique string identifier of the Batch Prediction Version.
- Returns:
The Batch Prediction Version.
- Return type:
BatchPredictionVersion
- get_logs()
Retrieves the batch prediction logs.
- Parameters:
batch_prediction_version (str) – The unique version ID of the batch prediction version.
- Returns:
The logs for the specified batch prediction version.
- Return type:
- download_result_to_file(file)
Downloads the batch prediction version in a local file.
- Parameters:
file (file object) – A file object opened in binary mode, e.g. file=open('/tmp/output', 'wb').
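A short sketch of calling this method with a file opened in binary mode. save_results is a hypothetical wrapper, and version is assumed to be a BatchPredictionVersion whose predictions have finished.

```python
def save_results(version, path):
    """Write the results of a batch prediction version to a local file."""
    # download_result_to_file expects a binary-mode file object; the
    # with-block closes the file even if the download raises.
    with open(path, "wb") as f:
        version.download_result_to_file(f)
    return path
```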
- wait_for_predictions(timeout=86400)
Blocks until the batch prediction version is ready.
- Parameters:
timeout (int) – The time allowed for the call to finish; if it does not finish within the allotted time, the call times out.
- wait_for_drift_monitor(timeout=86400)
Blocks until the drift monitor calculations for the batch prediction are ready.
- Parameters:
timeout (int) – The time allowed for the call to finish; if it does not finish within the allotted time, the call times out.
- get_status(drift_monitor_status=False)
Gets the status of the batch prediction version.
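The waiting and status calls can be combined as in the sketch below. wait_for_success is a hypothetical helper. The FAILED value comes from the description of the error parameter above; the attribute name error on the object is an assumption.

```python
def wait_for_success(version, timeout=86400):
    """Block until the version finishes, then return its status string.

    Raises RuntimeError when the reported status is FAILED.
    """
    version.wait_for_predictions(timeout=timeout)
    status = version.get_status()
    if status == "FAILED":
        # Assumption: the error message is exposed as `version.error`.
        message = getattr(version, "error", None)
        raise RuntimeError(f"Batch prediction failed: {message}")
    return status
```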
- load_results_as_pandas()
Loads the output feature groups into a python pandas dataframe.
- Returns:
A pandas dataframe with annotations and text_snippet columns.
- Return type:
DataFrame
- class abacusai.batch_prediction.PredictionFeatureGroup(client, featureGroupId=None, featureGroupVersion=None, datasetType=None, default=None, required=None)
Bases:
abacusai.return_class.AbstractApiClass
Batch Input Feature Group
- Parameters:
client (ApiClient) – An authenticated API Client instance
featureGroupId (str) – The unique identifier of the feature group
featureGroupVersion (str) – The unique identifier of the feature group version used for predictions
datasetType (str) – dataset type
default (bool) – If true, this feature group is the default feature group in the model
required (bool) – If true, this feature group is required for the batch prediction
- __repr__()
Return repr(self).
- class abacusai.batch_prediction.PredictionInput(client, featureGroupDatasetIds=None, datasetIdRemap=None, featureGroups={}, datasets={})
Bases:
abacusai.return_class.AbstractApiClass
Batch inputs
- Parameters:
client (ApiClient) – An authenticated API Client instance
featureGroupDatasetIds (list) – The list of dataset IDs to use as input
datasetIdRemap (dict) – Replacement datasets to swap as prediction input
featureGroups (PredictionFeatureGroup) – List of prediction feature groups
datasets (PredictionDataset) – List of prediction datasets
- __repr__()
Return repr(self).
- class abacusai.batch_prediction.RefreshSchedule(client, refreshPolicyId=None, nextRunTime=None, cron=None, refreshType=None, error=None)
Bases:
abacusai.return_class.AbstractApiClass
A refresh schedule for an object. Defines when the next version of the object will be created.
- Parameters:
client (ApiClient) – An authenticated API Client instance
refreshPolicyId (str) – The unique identifier of the refresh policy
nextRunTime (str) – The next run time of the refresh policy. If null, the policy is paused.
cron (str) – A cron-style string that describes when this refresh policy is to be executed, in UTC
refreshType (str) – The type of refresh that will be run
error (str) – An error message for the last pipeline run of a policy
- __repr__()
Return repr(self).
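A small sketch of reading a schedule, based on the note that a null nextRunTime means the policy is paused. describe_schedule is a hypothetical helper, and the snake_case attribute names next_run_time and cron are assumptions about how the constructor arguments are exposed on the object.

```python
def describe_schedule(schedule):
    """Return a one-line, human-readable summary of a refresh schedule."""
    # Assumption: constructor arguments are stored as snake_case attributes.
    next_run = getattr(schedule, "next_run_time", None)
    cron = getattr(schedule, "cron", None)
    if next_run is None:
        # Per the parameter description, a null next run time means paused.
        return f"paused (cron: {cron})"
    return f"next run at {next_run} (cron, UTC: {cron})"
```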
- class abacusai.batch_prediction.AbstractApiClass(client, id)
- __eq__(other)
Return self==value.
- _get_attribute_as_dict(attribute)
- class abacusai.batch_prediction.BatchPrediction(client, batchPredictionId=None, createdAt=None, name=None, deploymentId=None, fileConnectorOutputLocation=None, databaseConnectorId=None, databaseOutputConfiguration=None, fileOutputFormat=None, connectorType=None, legacyInputLocation=None, outputFeatureGroupId=None, featureGroupTableName=None, outputFeatureGroupTableName=None, summaryFeatureGroupTableName=None, csvInputPrefix=None, csvPredictionPrefix=None, csvExplanationsPrefix=None, outputIncludesMetadata=None, resultInputColumns=None, modelMonitorId=None, modelVersion=None, bpAcrossVersionsMonitorId=None, algorithm=None, batchPredictionArgsType=None, batchInputs={}, latestBatchPredictionVersion={}, refreshSchedules={}, inputFeatureGroups={}, globalPredictionArgs={}, batchPredictionArgs={})
Bases:
abacusai.return_class.AbstractApiClass
Make batch predictions.
- Parameters:
client (ApiClient) – An authenticated API Client instance
batchPredictionId (str) – The unique identifier of the batch prediction request.
createdAt (str) – When the batch prediction was created, in ISO-8601 format.
name (str) – Name given to the batch prediction object.
deploymentId (str) – The deployment used to make the predictions.
fileConnectorOutputLocation (str) – Contains information about where the batch predictions are written to.
databaseConnectorId (str) – The database connector to write the results to.
databaseOutputConfiguration (dict) – Contains information about where the batch predictions are written to.
fileOutputFormat (str) – The format of the batch prediction output (CSV or JSON).
connectorType (str) – Null if writing to internal console, else FEATURE_GROUP | FILE_CONNECTOR | DATABASE_CONNECTOR.
legacyInputLocation (str) – The location of the input data.
outputFeatureGroupId (str) – The Batch Prediction output feature group ID if applicable
featureGroupTableName (str) – The table name of the Batch Prediction output feature group.
outputFeatureGroupTableName (str) – The table name of the Batch Prediction output feature group.
summaryFeatureGroupTableName (str) – The table name of the metrics summary feature group output by Batch Prediction.
csvInputPrefix (str) – A prefix to prepend to the input columns, only applies when output format is CSV.
csvPredictionPrefix (str) – A prefix to prepend to the prediction columns, only applies when output format is CSV.
csvExplanationsPrefix (str) – A prefix to prepend to the explanation columns, only applies when output format is CSV.
outputIncludesMetadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version.
resultInputColumns (list) – If present, will limit result files or feature groups to only include columns present in this list.
modelMonitorId (str) – The model monitor for this batch prediction.
modelVersion (str) – The model instance used in the deployment for the batch prediction.
bpAcrossVersionsMonitorId (str) – The model monitor for this batch prediction across versions.
algorithm (str) – The algorithm that is currently deployed.
batchPredictionArgsType (str) – The type of batch prediction arguments used for this batch prediction.
batchInputs (PredictionInput) – Inputs to the batch prediction.
latestBatchPredictionVersion (BatchPredictionVersion) – The latest batch prediction version.
refreshSchedules (RefreshSchedule) – List of refresh schedules that dictate the next time the batch prediction will be run.
inputFeatureGroups (PredictionFeatureGroup) – List of prediction feature groups.
globalPredictionArgs (BatchPredictionArgs)
batchPredictionArgs (BatchPredictionArgs) – Argument(s) passed to every prediction call.
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
dict
- start()
Creates a new batch prediction version job for a given batch prediction job description.
- Parameters:
batch_prediction_id (str) – The unique identifier of the batch prediction to create a new version of.
- Returns:
The batch prediction version started by this method call.
- Return type:
BatchPredictionVersion
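A minimal sketch of starting a run and waiting on the version it returns. run_batch_prediction is a hypothetical helper; batch_prediction is assumed to be an existing BatchPrediction object.

```python
def run_batch_prediction(batch_prediction, timeout=86400):
    """Start a new batch prediction version and block until it is ready."""
    # start() returns the batch prediction version created by this call.
    version = batch_prediction.start()
    version.wait_for_predictions(timeout=timeout)
    return version
```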
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
BatchPrediction
- describe()
Describe the batch prediction.
- Parameters:
batch_prediction_id (str) – The unique identifier associated with the batch prediction.
- Returns:
The batch prediction description.
- Return type:
BatchPrediction
- list_versions(limit=100, start_after_version=None)
Retrieves a list of versions of a given batch prediction
- Parameters:
- Returns:
List of batch prediction versions.
- Return type:
List[BatchPredictionVersion]
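The two arguments suggest cursor-style paging, which could be sketched as follows. This assumes that start_after_version takes the identifier of the last version already seen and that each version exposes it as batch_prediction_version; neither is confirmed by this reference. iter_all_versions is a hypothetical helper.

```python
def iter_all_versions(batch_prediction, page_size=100):
    """Yield every version of a batch prediction, one page at a time."""
    start_after = None
    while True:
        page = batch_prediction.list_versions(
            limit=page_size, start_after_version=start_after
        )
        if not page:
            return
        yield from page
        if len(page) < page_size:
            return  # a short page means there is nothing left to fetch
        # Assumption: the version ID attribute is `batch_prediction_version`.
        start_after = page[-1].batch_prediction_version
```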
- update(deployment_id=None, global_prediction_args=None, batch_prediction_args=None, explanations=None, output_format=None, csv_input_prefix=None, csv_prediction_prefix=None, csv_explanations_prefix=None, output_includes_metadata=None, result_input_columns=None, name=None)
Update a batch prediction job description.
- Parameters:
deployment_id (str) – Unique identifier of the deployment.
batch_prediction_args (BatchPredictionArgs) – Batch Prediction args specific to problem type.
output_format (str) – If specified, sets the format of the batch prediction output (CSV or JSON).
csv_input_prefix (str) – Prefix to prepend to the input columns, only applies when output format is CSV.
csv_prediction_prefix (str) – Prefix to prepend to the prediction columns, only applies when output format is CSV.
csv_explanations_prefix (str) – Prefix to prepend to the explanation columns, only applies when output format is CSV.
output_includes_metadata (bool) – If True, output will contain columns including prediction start time, batch prediction version, and model version.
result_input_columns (list) – If present, will limit result files or feature groups to only include columns present in this list.
name (str) – If present, will rename the batch prediction.
global_prediction_args (Union[dict, abacusai.api_class.BatchPredictionArgs])
explanations (bool)
- Returns:
The batch prediction.
- Return type:
BatchPrediction
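As an illustration, the CSV-related options can be set in one update call. use_prefixed_csv_output is a hypothetical helper, and the prefix values are placeholders chosen for the example.

```python
def use_prefixed_csv_output(batch_prediction):
    """Switch the output to CSV and prefix input and prediction columns."""
    return batch_prediction.update(
        output_format="CSV",
        csv_input_prefix="input_",       # placeholder prefix
        csv_prediction_prefix="pred_",   # placeholder prefix
        output_includes_metadata=True,
    )
```

Arguments left out of the call keep their default of None.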
- set_file_connector_output(output_format=None, output_location=None)
Updates the file connector output configuration of the batch prediction
- Parameters:
- Returns:
The batch prediction description.
- Return type:
BatchPrediction
- set_database_connector_output(database_connector_id=None, database_output_config=None)
Updates the database connector output configuration of the batch prediction
- Parameters:
- Returns:
Description of the batch prediction.
- Return type:
BatchPrediction
- set_feature_group_output(table_name)
Creates a feature group and sets it as the batch prediction output.
- Parameters:
table_name (str) – Name of the feature group table to create.
- Returns:
Batch prediction after the output has been applied.
- Return type:
BatchPrediction
- set_output_to_console()
Sets the batch prediction output to the console, clearing both the file connector and database connector configurations.
- Parameters:
batch_prediction_id (str) – The unique identifier of the batch prediction.
- Returns:
The batch prediction description.
- Return type:
BatchPrediction
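A sketch of choosing between two of the output destinations described above. configure_output is a hypothetical helper that writes to a feature group when a table name is given and falls back to the console otherwise.

```python
def configure_output(batch_prediction, table_name=None):
    """Send results to a feature group table, or to the console if none is given."""
    if table_name:
        # Creates the feature group and sets it as the output.
        return batch_prediction.set_feature_group_output(table_name)
    # Clears file and database connector settings and writes to the console.
    return batch_prediction.set_output_to_console()
```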
- set_feature_group(feature_group_type, feature_group_id=None)
Sets the batch prediction input feature group.
- Parameters:
feature_group_type (str) – Enum string representing the feature group type to set. The type is based on the use case under which the feature group is being created (e.g. Catalog Attributes for personalized recommendation use case).
feature_group_id (str) – Unique identifier of the feature group to set as input to the batch prediction.
- Returns:
Description of the batch prediction.
- Return type:
BatchPrediction
- set_dataset_remap(dataset_id_remap)
Swaps out datasets in the training feature groups for the purpose of this batch prediction.
- Parameters:
dataset_id_remap (dict) – Key/value pairs of dataset ids to be replaced during the batch prediction.
- Returns:
Batch prediction object.
- Return type:
BatchPrediction
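A sketch of building the remap dictionary. The direction of the mapping (original dataset ID as key, replacement dataset ID as value) is an assumption, since the reference only describes key/value pairs of dataset IDs. remap_datasets is a hypothetical helper.

```python
def remap_datasets(batch_prediction, replacements):
    """Apply a dataset remap given (original_id, replacement_id) pairs.

    Assumption: keys are the original dataset IDs and values are the
    replacement dataset IDs.
    """
    remap = {str(old): str(new) for old, new in replacements}
    return batch_prediction.set_dataset_remap(remap)
```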
- delete()
Deletes a batch prediction and associated data, such as associated monitors.
- Parameters:
batch_prediction_id (str) – Unique string identifier of the batch prediction.
- wait_for_predictions(timeout=86400)
Blocks until the batch predictions are ready.
- Parameters:
timeout (int) – The time allowed for the call to finish; if it does not finish within the allotted time, the call times out.
- wait_for_drift_monitor(timeout=86400)
Blocks until the drift monitor calculations for the batch prediction are ready.
- Parameters:
timeout (int) – The time allowed for the call to finish; if it does not finish within the allotted time, the call times out.
- get_status()
Gets the status of the latest batch prediction version.
- Returns:
A string describing the status of the latest batch prediction version (e.g. pending, complete).
- Return type:
str
- create_refresh_policy(cron)
Creates a refresh policy for the batch prediction.
- Parameters:
cron (str) – A cron-style string to set the refresh time.
- Returns:
The refresh policy object.
- Return type:
RefreshPolicy
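A sketch of scheduling a daily refresh. It assumes the standard five-field cron layout (minute, hour, day of month, month, day of week) and, following the RefreshSchedule description, that the expression is evaluated in UTC. schedule_daily and DAILY_AT_06 are hypothetical names.

```python
# minute hour day-of-month month day-of-week: every day at 06:00
DAILY_AT_06 = "0 6 * * *"

def schedule_daily(batch_prediction, cron=DAILY_AT_06):
    """Create a refresh policy after a basic sanity check of the cron string."""
    # Assumption: the service expects five whitespace-separated fields.
    if len(cron.split()) != 5:
        raise ValueError(f"expected a five-field cron string, got {cron!r}")
    return batch_prediction.create_refresh_policy(cron)
```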
- list_refresh_policies()
Gets the refresh policies in a list.
- Returns:
A list of refresh policy objects.
- Return type:
List[RefreshPolicy]
- describe_output_feature_group()
Gets the results feature group for this batch prediction
- Returns:
A feature group object.
- Return type:
FeatureGroup
- load_results_as_pandas()
Loads the output feature groups into a python pandas dataframe.
- Returns:
A pandas dataframe with annotations and text_snippet columns.
- Return type:
DataFrame