abacusai.feature_group_version
Classes
AnnotationConfig – Annotation config for a feature group
CodeSource – Code source for python-based custom feature groups and models
Feature – A feature in a feature group
IndexingConfig – The indexing config for a Feature Group
PointInTimeGroup – A point in time group containing point in time features
FeatureGroupVersion – A materialized version of a feature group
Module Contents
- class abacusai.feature_group_version.AnnotationConfig(client, featureAnnotationConfigs=None, labels=None, statusFeature=None, commentsFeatures=None, metadataFeature=None)
Bases:
abacusai.return_class.AbstractApiClass
Annotation config for a feature group
- Parameters:
client (ApiClient) – An authenticated API Client instance
featureAnnotationConfigs (list) – List of feature annotation configs
labels (list) – List of labels
statusFeature (str) – Name of the feature that contains the status of the annotation (Optional)
commentsFeatures (list) – Features that contain comments for the annotation (Optional)
metadataFeature (str) – Name of the feature that contains the metadata for the annotation (Optional)
- __repr__()
Return repr(self).
- class abacusai.feature_group_version.CodeSource(client, sourceType=None, sourceCode=None, applicationConnectorId=None, applicationConnectorInfo=None, packageRequirements=None, status=None, error=None, publishingMsg=None, moduleDependencies=None)
Bases:
abacusai.return_class.AbstractApiClass
Code source for python-based custom feature groups and models
- Parameters:
client (ApiClient) – An authenticated API Client instance
sourceType (str) – The type of the source, one of TEXT, PYTHON, FILE_UPLOAD, or APPLICATION_CONNECTOR
sourceCode (str) – If the type of the source is TEXT, the raw text of the function
applicationConnectorId (str) – The Application Connector to fetch the code from
applicationConnectorInfo (str) – Args passed to the application connector to fetch the code
packageRequirements (list) – The pip package dependencies required to run the code
status (str) – The status of the code and validations
error (str) – If the status is failed, an error message describing what went wrong
publishingMsg (dict) – Warnings in the source code
moduleDependencies (list) – The list of internal module dependencies required to run the code
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
dict
- import_as_cell()
Adds the source code as an unexecuted cell in the notebook.
- class abacusai.feature_group_version.Feature(client, name=None, selectClause=None, featureMapping=None, sourceTable=None, originalName=None, usingClause=None, orderClause=None, whereClause=None, featureType=None, dataType=None, detectedFeatureType=None, detectedDataType=None, columns={}, pointInTimeInfo={})
Bases:
abacusai.return_class.AbstractApiClass
A feature in a feature group
- Parameters:
client (ApiClient) – An authenticated API Client instance
name (str) – The unique name of the column
selectClause (str) – The SQL logic for creating this feature’s data
featureMapping (str) – The Feature Mapping of the feature
sourceTable (str) – The source table of the column
originalName (str) – The original name of the column
usingClause (str) – Nested Column Using Clause
orderClause (str) – Nested Column Ordering Clause
whereClause (str) – Nested Column Where Clause
featureType (str) – Feature Type of the Feature
dataType (str) – Data Type of the Feature
detectedFeatureType (str) – The detected feature type of the column
detectedDataType (str) – The detected data type of the column
columns (NestedFeature) – Nested Features
pointInTimeInfo (PointInTimeFeature) – Point in time column information
- __repr__()
Return repr(self).
- class abacusai.feature_group_version.IndexingConfig(client, primaryKey=None, updateTimestampKey=None, lookupKeys=None)
Bases:
abacusai.return_class.AbstractApiClass
The indexing config for a Feature Group
- Parameters:
- __repr__()
Return repr(self).
- class abacusai.feature_group_version.PointInTimeGroup(client, groupName=None, windowKey=None, aggregationKeys=None, lookbackWindow=None, lookbackWindowLag=None, lookbackCount=None, lookbackUntilPosition=None, historyTableName=None, historyWindowKey=None, historyAggregationKeys=None, features={})
Bases:
abacusai.return_class.AbstractApiClass
A point in time group containing point in time features
- Parameters:
client (ApiClient) – An authenticated API Client instance
groupName (str) – The name of the point in time group
windowKey (str) – Name of feature which contains the timestamp value for the point in time feature
aggregationKeys (list) – List of keys to use for joining the historical table and performing the window aggregation.
lookbackWindow (float) – Number of seconds in the past from the current time for start of the window.
lookbackWindowLag (float) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window. If it is negative, we are looking at the “future” rows in the history table.
lookbackCount (int) – If window is specified in terms of count, the start position of the window (0 is the current row)
lookbackUntilPosition (int) – Optional lag to offset the closest point for the window. If it is positive, we delay the start of window by that many rows. If it is negative, we are looking at those many “future” rows in the history table.
historyTableName (str) – The table to use for aggregating. If not provided, the source table will be used.
historyWindowKey (str) – Name of the feature to use for ordering the rows in the history table. If not provided, the windowKey from the source table will be used.
historyAggregationKeys (list) – List of keys to use for joining the historical table and performing the window aggregation. If not provided, the aggregationKeys from the source table will be used. Must be the same length and order as the source table’s aggregationKeys.
features (PointInTimeGroupFeature) – List of features in the Point in Time group
- __repr__()
Return repr(self).
- class abacusai.feature_group_version.AbstractApiClass(client, id)
- __eq__(other)
Return self==value.
- _get_attribute_as_dict(attribute)
- class abacusai.feature_group_version.FeatureGroupVersion(client, featureGroupVersion=None, featureGroupId=None, sql=None, sourceTables=None, sourceDatasetVersions=None, createdAt=None, status=None, error=None, deployable=None, cpuSize=None, memory=None, useOriginalCsvNames=None, pythonFunctionBindings=None, indexingConfigWarningMsg=None, materializationStartedAt=None, materializationCompletedAt=None, columns=None, templateBindings=None, features={}, pointInTimeGroups={}, codeSource={}, annotationConfig={}, indexingConfig={})
Bases:
abacusai.return_class.AbstractApiClass
A materialized version of a feature group
- Parameters:
client (ApiClient) – An authenticated API Client instance
featureGroupVersion (str) – The unique identifier for this materialized version of feature group.
featureGroupId (str) – The unique identifier of the feature group this version belongs to.
sql (str) – The SQL definition creating this feature group.
sourceTables (list[str]) – The source tables for this feature group.
sourceDatasetVersions (list[str]) – The dataset version ids for this feature group version.
createdAt (str) – The timestamp at which the feature group version was created.
status (str) – The current status of the feature group version.
error (str) – Relevant error if the status is FAILED.
deployable (bool) – Whether the feature group is deployable.
cpuSize (str) – CPU size specified for the python feature group.
memory (int) – Memory in GB specified for the python feature group.
useOriginalCsvNames (bool) – If true, the feature group will use the original column names in the source dataset.
pythonFunctionBindings (list) – Config specifying variable names, types, and values to use when resolving a Python feature group.
indexingConfigWarningMsg (str) – The warning message related to indexing keys.
materializationStartedAt (str) – The timestamp at which the feature group materialization started.
materializationCompletedAt (str) – The timestamp at which the feature group materialization completed.
columns (list[feature]) – List of resolved columns.
templateBindings (list) – Template variable bindings used for resolving the template.
features (Feature) – List of features.
pointInTimeGroups (PointInTimeGroup) – List of Point In Time Groups
codeSource (CodeSource) – If a python feature group, information on the source code
annotationConfig (AnnotationConfig) – The annotations config for the feature group.
indexingConfig (IndexingConfig) – The indexing config for the feature group.
- __repr__()
Return repr(self).
- to_dict()
Get a dict representation of the parameters in this class
- Returns:
The dict value representation of the class parameters
- Return type:
dict
- create_snapshot_feature_group(table_name)
Creates a Snapshot Feature Group corresponding to a specific Feature Group version.
- Parameters:
table_name (str) – Name for the newly created Snapshot Feature Group table. Can be up to 120 characters long and can only contain alphanumeric characters and underscores.
- Returns:
Feature Group corresponding to the newly created Snapshot.
- Return type:
FeatureGroup
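The naming rule for table_name can be checked locally before calling create_snapshot_feature_group. The following is a minimal sketch, assuming the rule is exactly as stated above (1 to 120 characters, ASCII letters, digits, and underscores). The helper name is_valid_snapshot_table_name is hypothetical and not part of the abacusai package, and the service may apply additional checks.

```python
import re

# Assumption: the documented rule is the complete rule, and "alphanumeric"
# means ASCII letters and digits only.
_TABLE_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,120}")


def is_valid_snapshot_table_name(table_name: str) -> bool:
    """Hypothetical helper: check a name against the documented constraints."""
    return _TABLE_NAME_PATTERN.fullmatch(table_name) is not None
```

A name that passes this check could then be passed as table_name; a name that fails it would be expected to be rejected by the service as well.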
- export_to_file_connector(location, export_file_format, overwrite=False)
Export Feature group to File Connector.
- Parameters:
- Returns:
The FeatureGroupExport instance.
- Return type:
FeatureGroupExport
- export_to_database_connector(database_connector_id, object_name, write_mode, database_feature_mapping, id_column=None, additional_id_columns=None)
Export Feature group to Database Connector.
- Parameters:
database_connector_id (str) – Unique string identifier for the Database Connector to export to.
object_name (str) – Name of the database object to write to.
write_mode (str) – Enum string indicating whether to use INSERT or UPSERT.
database_feature_mapping (dict) – Key/value pair JSON object of “database connector column” -> “feature name” pairs.
id_column (str) – Required if write_mode is UPSERT. Indicates which database column should be used as the lookup key.
additional_id_columns (list) – For database connectors which support it, additional ID columns to use as a complex key for upserting.
- Returns:
The FeatureGroupExport instance.
- Return type:
FeatureGroupExport
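The interaction between write_mode and id_column described above can be checked before the export is requested. The sketch below is illustrative only: build_database_export_args is a hypothetical helper (not part of the abacusai package), the literal strings "INSERT" and "UPSERT" are assumed to be the accepted enum values, and all identifiers in the example call are placeholders.

```python
from typing import Any, Dict, List, Optional


def build_database_export_args(
    database_connector_id: str,
    object_name: str,
    write_mode: str,
    database_feature_mapping: Dict[str, str],
    id_column: Optional[str] = None,
    additional_id_columns: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and collect the documented arguments."""
    # Assumption: the enum strings are exactly "INSERT" and "UPSERT".
    if write_mode not in ("INSERT", "UPSERT"):
        raise ValueError("write_mode must be 'INSERT' or 'UPSERT'")
    # Documented rule: id_column is required when write_mode is UPSERT.
    if write_mode == "UPSERT" and not id_column:
        raise ValueError("id_column is required when write_mode is 'UPSERT'")
    return {
        "database_connector_id": database_connector_id,
        "object_name": object_name,
        "write_mode": write_mode,
        "database_feature_mapping": database_feature_mapping,
        "id_column": id_column,
        "additional_id_columns": additional_id_columns,
    }


# Placeholder values, not real identifiers.
export_args = build_database_export_args(
    database_connector_id="my_connector_id",
    object_name="analytics.customers",
    write_mode="UPSERT",
    # Keys are database connector columns, values are feature names.
    database_feature_mapping={"CUSTOMER_ID": "customer_id", "TOTAL_SPEND": "total_spend"},
    id_column="CUSTOMER_ID",
)

# With a FeatureGroupVersion instance named `version`, the call would then be:
# export = version.export_to_database_connector(**export_args)
```

Keeping the validation separate from the API call means a missing id_column is caught locally rather than after a round trip to the service.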
- export_to_console(export_file_format)
Export Feature group to console.
- Parameters:
export_file_format (str) – File format to export to.
- Returns:
The FeatureGroupExport instance.
- Return type:
FeatureGroupExport
- get_materialization_logs(stdout=False, stderr=False)
Returns logs for a materialized feature group version.
- Parameters:
- Returns:
A function logs object.
- Return type:
- refresh()
Calls describe and refreshes the current object’s fields
- Returns:
The current object
- Return type:
FeatureGroupVersion
- describe()
Describe a feature group version.
- Parameters:
feature_group_version (str) – The unique identifier associated with the feature group version.
- Returns:
The feature group version.
- Return type:
FeatureGroupVersion
- get_metrics(selected_columns=None, include_charts=False, include_statistics=True)
Get metrics for a specific feature group version.
- Parameters:
- Returns:
The metrics for the specified feature group version.
- Return type:
- get_logs()
Retrieves the feature group materialization logs.
- Parameters:
feature_group_version (str) – The unique version ID of the feature group version.
- Returns:
The logs for the specified feature group version.
- Return type:
- wait_for_results(timeout=3600)
A blocking call that waits until the feature group version is materialized.
- Parameters:
timeout (int) – The waiting time given to the call to finish. If it doesn’t finish within the allotted time, the call is considered to have timed out.
- wait_for_materialization(timeout=3600)
A blocking call that waits until the feature group version is materialized.
- Parameters:
timeout (int) – The waiting time given to the call to finish. If it doesn’t finish within the allotted time, the call is considered to have timed out.
- get_status()
Gets the status of the feature group version.
- Returns:
A string describing the status of a feature group version (pending, complete, etc.).
- Return type:
str
- _download_avro_file(file_part, tmp_dir, part_index)
- load_as_pandas(max_workers=10)
Loads the feature group version into a pandas dataframe.
- Parameters:
max_workers (int) – The number of threads.
- Returns:
A pandas dataframe displaying the data in the feature group version.
- Return type:
DataFrame
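A common sequence is to wait for materialization, check the status, and only then load the data. The sketch below chains wait_for_materialization, get_status, and load_as_pandas as documented above. The helper name materialize_and_load is hypothetical, and the status string "COMPLETE" is an assumption inferred from the "(pending, complete, etc.)" wording; the exact values returned by the service may differ.

```python
def materialize_and_load(version, timeout=3600, max_workers=10):
    """Hypothetical helper: wait for a feature group version, then load it.

    `version` is expected to behave like a FeatureGroupVersion, i.e. to offer
    wait_for_materialization(), get_status(), and load_as_pandas().
    """
    # Block until materialization finishes or the timeout elapses.
    version.wait_for_materialization(timeout=timeout)

    status = version.get_status()
    # Assumption: a successful materialization reports "COMPLETE"
    # (compared case-insensitively because the exact casing is not confirmed).
    if str(status).upper() != "COMPLETE":
        error = getattr(version, "error", None)
        raise RuntimeError(f"Feature group version not ready: status={status}, error={error}")

    # Returns a pandas DataFrame; max_workers sets the number of threads.
    return version.load_as_pandas(max_workers=max_workers)
```

Raising on any status other than the assumed success value keeps a failed or still-pending version from being loaded silently.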
- load_as_pandas_documents(doc_id_column, document_column, max_workers=10)
Loads a feature group with documents data into a pandas dataframe.
- Parameters:
doc_id_column (str) – The name of the feature / column containing the document ID.
document_column (str) – The name of the feature / column which either contains the document data itself or page infos with path to remotely stored documents. This column will be replaced with the extracted document data.
max_workers (int) – The number of threads.
- Returns:
A pandas dataframe containing the extracted document data.
- Return type:
DataFrame