abacusai.batch_prediction_version

Classes

BatchPredictionArgs

An abstract class for Batch Prediction args specific to problem type.

FileFormat

Generic enumeration.

PredictionFeatureGroup

Batch Input Feature Group

PredictionInput

Batch inputs

AbstractApiClass

BatchPredictionVersion

Batch Prediction Version

Module Contents

class abacusai.batch_prediction_version.BatchPredictionArgs

Bases: abacusai.api_class.abstract.ApiClass

An abstract class for Batch Prediction args specific to problem type.

_support_kwargs: bool
kwargs: dict
problem_type: abacusai.api_class.enums.ProblemType
classmethod _get_builder()
class abacusai.batch_prediction_version.FileFormat

Bases: ApiEnum

Generic enumeration.

Derive from this class to define new enumerations.

AVRO = 'AVRO'
PARQUET = 'PARQUET'
TFRECORD = 'TFRECORD'
TSV = 'TSV'
CSV = 'CSV'
ORC = 'ORC'
JSON = 'JSON'
ODS = 'ODS'
XLS = 'XLS'
GZ = 'GZ'
ZIP = 'ZIP'
TAR = 'TAR'
DOCX = 'DOCX'
PDF = 'PDF'
RAR = 'RAR'
JPEG = 'JPG'
PNG = 'PNG'
TIF = 'TIFF'
NUMBERS = 'NUMBERS'
PPTX = 'PPTX'
PPT = 'PPT'
HTML = 'HTML'
TXT = 'txt'
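
A member's name and its value can differ in the list above (JPEG has the value 'JPG', TIF has 'TIFF', and TXT has the lowercase 'txt'), so strings are best matched against values rather than names. The sketch below illustrates this with a stand-in enum that mirrors a few of the members; FileFormatSketch and parse_format are hypothetical names used only for illustration and are not part of the library.

```python
from enum import Enum


class FileFormatSketch(Enum):
    """Stand-in mirroring a few FileFormat members; not the library's class."""
    CSV = 'CSV'
    JSON = 'JSON'
    JPEG = 'JPG'   # name and value differ
    TIF = 'TIFF'   # name and value differ
    TXT = 'txt'    # value is lowercase


def parse_format(value: str) -> FileFormatSketch:
    """Hypothetical helper: look a member up by its value, not its name."""
    return FileFormatSketch(value)


jpeg = parse_format('JPG')
print(jpeg.name, jpeg.value)
```

Calling parse_format('JPEG') raises ValueError in this sketch, because 'JPEG' is a member name and not a member value.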
class abacusai.batch_prediction_version.PredictionFeatureGroup(client, featureGroupId=None, featureGroupVersion=None, datasetType=None, default=None, required=None)

Bases: abacusai.return_class.AbstractApiClass

Batch Input Feature Group

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupId (str) – The unique identifier of the feature group

  • featureGroupVersion (str) – The unique identifier of the feature group version used for predictions

  • datasetType (str) – The dataset type of the feature group

  • default (bool) – If true, this feature group is the default feature group in the model

  • required (bool) – If true, this feature group is required for the batch prediction

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.batch_prediction_version.PredictionInput(client, featureGroupDatasetIds=None, datasetIdRemap=None, featureGroups={}, datasets={})

Bases: abacusai.return_class.AbstractApiClass

Batch inputs

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • featureGroupDatasetIds (list) – The list of dataset IDs to use as input

  • datasetIdRemap (dict) – Replacement datasets to swap as prediction input

  • featureGroups (PredictionFeatureGroup) – List of prediction feature groups

  • datasets (PredictionDataset) – List of prediction datasets

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict

class abacusai.batch_prediction_version.AbstractApiClass(client, id)
__eq__(other)

Return self==value.

_get_attribute_as_dict(attribute)
class abacusai.batch_prediction_version.BatchPredictionVersion(client, batchPredictionVersion=None, batchPredictionId=None, status=None, driftMonitorStatus=None, deploymentId=None, modelId=None, modelVersion=None, predictionsStartedAt=None, predictionsCompletedAt=None, databaseOutputError=None, totalPredictions=None, failedPredictions=None, databaseConnectorId=None, databaseOutputConfiguration=None, fileConnectorOutputLocation=None, fileOutputFormat=None, connectorType=None, legacyInputLocation=None, error=None, driftMonitorError=None, monitorWarnings=None, csvInputPrefix=None, csvPredictionPrefix=None, csvExplanationsPrefix=None, databaseOutputTotalWrites=None, databaseOutputFailedWrites=None, outputIncludesMetadata=None, resultInputColumns=None, modelMonitorVersion=None, algoName=None, algorithm=None, outputFeatureGroupId=None, outputFeatureGroupVersion=None, outputFeatureGroupTableName=None, batchPredictionWarnings=None, bpAcrossVersionsMonitorVersion=None, batchPredictionArgsType=None, batchInputs={}, inputFeatureGroups={}, globalPredictionArgs={}, batchPredictionArgs={})

Bases: abacusai.return_class.AbstractApiClass

Batch Prediction Version

Parameters:
  • client (ApiClient) – An authenticated API Client instance

  • batchPredictionVersion (str) – The unique identifier of the batch prediction version

  • batchPredictionId (str) – The unique identifier of the batch prediction

  • status (str) – The current status of the batch prediction

  • driftMonitorStatus (str) – The status of the drift monitor for this batch prediction version

  • deploymentId (str) – The deployment used to make the predictions

  • modelId (str) – The model used to make the predictions

  • modelVersion (str) – The model version used to make the predictions

  • predictionsStartedAt (str) – Predictions start date and time

  • predictionsCompletedAt (str) – Predictions completion date and time

  • databaseOutputError (bool) – If true, there were errors reported by the database connector while writing

  • totalPredictions (int) – Number of predictions performed in this batch prediction job

  • failedPredictions (int) – Number of predictions that failed

  • databaseConnectorId (str) – The database connector to write the results to

  • databaseOutputConfiguration (dict) – Contains information about where the batch predictions are written to

  • fileConnectorOutputLocation (str) – Contains information about where the batch predictions are written to

  • fileOutputFormat (str) – The format of the batch prediction output (CSV or JSON)

  • connectorType (str) – Null if writing to the internal console; otherwise one of FEATURE_GROUP, FILE_CONNECTOR, or DATABASE_CONNECTOR

  • legacyInputLocation (str) – The location of the input data

  • error (str) – Relevant error if the status is FAILED

  • driftMonitorError (str) – Error message for the drift monitor of this batch prediction

  • monitorWarnings (str) – Relevant warning if there are issues found in drift or data integrity

  • csvInputPrefix (str) – A prefix to prepend to the input columns, only applies when output format is CSV

  • csvPredictionPrefix (str) – A prefix to prepend to the prediction columns, only applies when output format is CSV

  • csvExplanationsPrefix (str) – A prefix to prepend to the explanation columns, only applies when output format is CSV

  • databaseOutputTotalWrites (int) – The total number of row writes attempted (may be less than total_predictions if write mode is UPSERT and multiple rows share the same ID)

  • databaseOutputFailedWrites (int) – The number of failed writes to the Database Connector

  • outputIncludesMetadata (bool) – If true, output will contain columns including prediction start time, batch prediction version, and model version

  • resultInputColumns (list[str]) – If present, will limit result files or feature groups to only include columns present in this list

  • modelMonitorVersion (str) – The version of the model monitor

  • algoName (str) – The name of the algorithm used to train the model

  • algorithm (str) – The algorithm that is currently deployed.

  • outputFeatureGroupId (str) – The Batch Prediction output feature group ID if applicable

  • outputFeatureGroupVersion (str) – The Batch Prediction output feature group version if applicable

  • outputFeatureGroupTableName (str) – The Batch Prediction output feature group name if applicable

  • batchPredictionWarnings (str) – Relevant warnings if any issues are found

  • bpAcrossVersionsMonitorVersion (str) – The version of the batch prediction across versions monitor

  • batchPredictionArgsType (str) – The type of the batch prediction args

  • batchInputs (PredictionInput) – Inputs to the batch prediction

  • inputFeatureGroups (PredictionFeatureGroup) – List of prediction feature groups

  • globalPredictionArgs (BatchPredictionArgs)

  • batchPredictionArgs (BatchPredictionArgs) – Argument(s) passed to every prediction call

__repr__()

Return repr(self).

to_dict()

Get a dict representation of the parameters in this class

Returns:

The dict value representation of the class parameters

Return type:

dict
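
The totalPredictions and failedPredictions fields described above can be combined into a simple failure rate. The following is a minimal sketch that assumes the object exposes them as the snake_case attributes total_predictions and failed_predictions; failure_rate is a hypothetical helper, and a SimpleNamespace stands in for a real BatchPredictionVersion.

```python
from types import SimpleNamespace


def failure_rate(version) -> float:
    """Hypothetical helper: fraction of predictions that failed.

    Assumes `version` has total_predictions and failed_predictions
    attributes, either of which may be None.
    """
    total = version.total_predictions or 0
    failed = version.failed_predictions or 0
    return failed / total if total else 0.0


# Stand-in object; a real BatchPredictionVersion would be used instead.
stub = SimpleNamespace(total_predictions=200, failed_predictions=5)
print(f'{failure_rate(stub):.1%}')
```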

download_batch_prediction_result_chunk(offset=0, chunk_size=10485760)

Returns a stream containing the batch prediction results.

Parameters:
  • offset (int) – The offset to read from.

  • chunk_size (int) – The maximum amount of data to read.
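
The offset and chunk_size parameters suggest reading a large result in pieces. A minimal sketch of such a loop follows; it assumes the returned stream supports read() and that an empty read marks the end of the results. download_all and FakeVersion are hypothetical names, and FakeVersion only imitates the method signature so the example runs without a client.

```python
import io


def download_all(version, out, chunk_size=10485760):
    """Copy all result bytes to `out`, one chunk at a time.

    Returns the total number of bytes written.
    """
    offset = 0
    while True:
        stream = version.download_batch_prediction_result_chunk(
            offset=offset, chunk_size=chunk_size)
        data = stream.read()
        if not data:  # assumption: an empty chunk means no more data
            break
        out.write(data)
        offset += len(data)
    return offset


class FakeVersion:
    """Stand-in that serves slices of an in-memory payload."""

    def __init__(self, payload):
        self._payload = payload

    def download_batch_prediction_result_chunk(self, offset=0, chunk_size=10485760):
        return io.BytesIO(self._payload[offset:offset + chunk_size])


buf = io.BytesIO()
written = download_all(FakeVersion(b'id,prediction\n1,0.9\n'), buf, chunk_size=8)
print(written, buf.getvalue())
```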

get_batch_prediction_connector_errors()

Returns a stream containing the batch prediction database connection write errors, if any writes failed for the specified batch prediction job.

Parameters:

batch_prediction_version (str) – Unique string identifier of the batch prediction job to get the errors for.

refresh()

Calls describe and refreshes the current object’s fields.

Returns:

The current object

Return type:

BatchPredictionVersion

describe()

Describes a Batch Prediction Version.

Parameters:

batch_prediction_version (str) – Unique string identifier of the Batch Prediction Version.

Returns:

The Batch Prediction Version.

Return type:

BatchPredictionVersion

get_logs()

Retrieves the batch prediction logs.

Parameters:

batch_prediction_version (str) – The unique version ID of the batch prediction version.

Returns:

The logs for the specified batch prediction version.

Return type:

BatchPredictionVersionLogs

download_result_to_file(file)

Downloads the batch prediction version results to a local file.

Parameters:

file (file object) – A file object opened in binary mode, e.g., file=open('/tmp/output', 'wb').
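
Because the method expects a file object opened in binary mode, a with block is a convenient way to make sure the file is closed afterwards. The sketch below assumes nothing beyond the signature shown above; save_results and StubVersion are hypothetical names, and the stub writes fixed bytes in place of real results.

```python
import os
import tempfile


def save_results(version, path):
    """Write the results to `path` and return the file size in bytes."""
    with open(path, 'wb') as f:  # binary mode, as the method requires
        version.download_result_to_file(f)
    return os.path.getsize(path)


class StubVersion:
    """Stand-in for a BatchPredictionVersion; writes fixed bytes."""

    def download_result_to_file(self, file):
        file.write(b'id,prediction\n1,0.9\n')


with tempfile.TemporaryDirectory() as tmp:
    size = save_results(StubVersion(), os.path.join(tmp, 'output.csv'))
print(size)
```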

wait_for_predictions(timeout=86400)

Blocks until the batch prediction version is ready.

Parameters:

timeout (int) – The time allowed for the call to finish. If the call does not finish within this time, it times out.

wait_for_drift_monitor(timeout=86400)

Blocks until the batch prediction drift monitor calculations are ready.

Parameters:

timeout (int) – The time allowed for the call to finish. If the call does not finish within this time, it times out.
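
The general idea of a waiting call with a timeout can be sketched as a polling loop around get_status(). This is an illustration of the pattern only, not the library's implementation: wait_until_done, poll_interval, and the terminal status names are assumptions, and the timeout is assumed to be in seconds.

```python
import time


def wait_until_done(version, timeout=86400, poll_interval=5.0,
                    terminal=('COMPLETE', 'FAILED')):
    """Poll get_status() until a terminal status or until `timeout` elapses.

    The terminal status names are assumptions made for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = version.get_status()
        if status in terminal:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f'still {status} after {timeout} seconds')
        time.sleep(poll_interval)


class StubVersion:
    """Stand-in that replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def get_status(self):
        return next(self._statuses)


result = wait_until_done(StubVersion(['PENDING', 'PREDICTING', 'COMPLETE']),
                         timeout=10, poll_interval=0)
print(result)
```

Using time.monotonic() for the deadline keeps the loop unaffected by changes to the system clock.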

get_status(drift_monitor_status=False)

Gets the status of the batch prediction version.

Returns:

A string describing the status of the batch prediction version, e.g., pending or complete.

Return type:

str

Parameters:

drift_monitor_status (bool)

load_results_as_pandas()

Loads the output feature groups into a pandas DataFrame.

Returns:

A pandas dataframe with annotations and text_snippet columns.

Return type:

DataFrame