
    A transformation pipeline is a Directed Acyclic Graph (DAG) of transformation functions. Users interactively select the transformation functions they need through a drag-and-drop interface with configurable setting blocks, and connect the blocks to define how data flows through the pipeline. This setup lets users easily arrange and visualize the sequence and interaction of data transformations, making data manipulation and analysis clear and efficient.
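The DAG described above can be sketched in a few lines. This is a minimal, illustrative sketch only: the block names, functions, and edge structure here are assumptions, not the product's actual implementation. Each block is a transformation function, edges record which "father" blocks feed it, and blocks run in topological order.

```python
from graphlib import TopologicalSorter

# Hypothetical blocks: each is a transformation function in the DAG.
def load(_):
    return [1, 2, 3]               # stand-in for reading a tabular dataset

def scale(rows):
    return [r * 10 for r in rows]  # multiply every value by 10

def shift(rows):
    return [r + 1 for r in rows]   # add 1 to every value

blocks = {"load": load, "scale": scale, "shift": shift}
# DAG edges: each key depends on (receives data from) its listed predecessors.
# Here the flow is load -> scale -> shift.
edges = {"scale": {"load"}, "shift": {"scale"}}

def run_pipeline(blocks, edges):
    results = {}
    # static_order() yields blocks so every father runs before its child.
    for name in TopologicalSorter(edges).static_order():
        parents = edges.get(name, set())
        upstream = results[next(iter(parents))] if parents else None
        results[name] = blocks[name](upstream)
    return results

print(run_pipeline(blocks, edges)["shift"])  # [11, 21, 31]
```

Running the pipeline executes `load`, then `scale`, then `shift`, so the final block sees data that has passed through every connected predecessor.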

    To edit the configuration of a block, click the configuration icon on that block.

    The table below shows the configurable items for each block.

    Input Data frame
    • Get from dataset: Access and generate features from an existing dataset, represented in tabular format.
    • Features: Users can add rows to this table, with each row containing the following elements:
        • Feature name: specifies the name of the feature.
        • Data type: identifies the type of data, which may be text, bool, float, or int.
        • Feature type: designates whether the column is a feature or a target column.

    Transformation block
    • Features: This table displays the features passed in from connected preceding blocks (referred to as "father blocks"), indicating the sources of the data. Users can select specific rows for processing by ticking the checkbox at the beginning of each row. For example, if a block receives the features att0, att1, att2, and att3, the user can configure it to process only att0, att1, and att3.
    • Config: Some transformation functions include configurable variables, defined as environment variables, that control the block's behavior at runtime. Users can adjust these configurations to match the specific logic required by the block's function.

    Output Data frame
    • Features: Shows the expected columns of the whole pipeline under the current configuration, so users can preview what the pipeline will output.
    • Config: A JSON file containing information about the pipeline at runtime; the data pipeline passes information to the training job through this file.
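The behavior of a single transformation block can be sketched as follows. All names here (the frame layout, the `SCALE_FACTOR` variable, the `pipeline_config.json` filename) are illustrative assumptions, not the product's actual identifiers: the block processes only the ticked features, reads a configurable variable from the environment, and writes runtime information to a JSON file for the training job.

```python
import json
import os

# Hypothetical input from the father blocks: four features in tabular form.
frame = {
    "att0": [1.0, 2.0],
    "att1": [3.0, 4.0],
    "att2": [5.0, 6.0],
    "att3": [7.0, 8.0],
}
# The rows the user ticked in the Features table (att2 is left unchecked).
ticked = ["att0", "att1", "att3"]

# A configurable variable exposed as an environment variable (assumed name).
os.environ.setdefault("SCALE_FACTOR", "2")
factor = float(os.environ["SCALE_FACTOR"])

# Process only the selected features.
output = {name: [v * factor for v in frame[name]] for name in ticked}

# Runtime information handed to the training job via a JSON file.
with open("pipeline_config.json", "w") as f:
    json.dump({"features": ticked, "scale_factor": factor}, f)

print(sorted(output))  # ['att0', 'att1', 'att3']
```

The unchecked feature (att2) never reaches the output, and the training job can read `pipeline_config.json` to learn which features were processed and with what settings.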
    Generated by Asset Health Insights