Extreme XP

This project addresses the challenges of managing vast, complex, and uncertain data environments by developing a robust framework for experiment-driven analytics. Traditional deterministic models often fail to handle the variability and unpredictability inherent in such systems. To overcome these limitations, the project integrates advanced probabilistic methodologies, such as Bayesian Networks and Markov Decision Processes, to enhance decision-making accuracy and reliability. By aligning system configurations with user intents, the framework ensures adaptability and usability in dynamic contexts. This research also contributes novel algorithms and evaluation metrics to advance the theoretical foundations of experiment-driven analytics, ultimately providing practical tools for optimizing data-driven processes.

Introduction

In the era of big data and complex decision-making, managing vast amounts of data with varying velocity, volume, and heterogeneity presents a significant challenge. Traditional deterministic models often fall short in effectively handling this complexity and uncertainty. This project seeks to address these challenges by developing a robust framework for experiment-driven analytics that leverages advanced probabilistic methodologies.

Problem Statement

Modern data environments are characterized by unpredictability and variability. In such contexts, deterministic approaches fail to provide the required adaptability and reliability. Probabilistic models, such as Bayesian Networks and Markov Decision Processes (MDPs), offer a promising solution. These models enable effective representation and reasoning under uncertainty, facilitating improved accuracy and robustness in decision-making processes.

Furthermore, aligning user intents with system configurations remains a critical requirement. By integrating user preferences with probabilistic frameworks, this project aims to create systems that are not only adaptive but also tailored to specific user needs, thereby enhancing overall usability and performance.

Objectives

The primary goals of this project include: 1. Framework Development: Constructing a comprehensive system for managing complex analytics workflows. 2. Probabilistic Model Integration: Incorporating Bayesian Networks and MDPs to enhance decision-making reliability. 3. Theoretical Contributions: Proposing novel algorithms and evaluation metrics to advance experiment-driven analytics methodologies.

By addressing these objectives, the project aspires to provide practical tools and techniques for optimizing data analytics processes in unpredictable and dynamic environments.

Approach

This project will adopt a methodical approach involving: - The creation of meta-models to capture variability and user requirements. - Integration of probabilistic models to align system configurations with user intents. - Iterative learning from experimentation to continuously refine decision-making models.

Ultimately, this project aims to push the boundaries of experiment-driven analytics, offering innovative solutions for managing the complexities of modern data environments.

Installation

To set up the project, follow these steps:

  1. Clone the repository:

    git clone git@github.com:ExtremeXP/T2.2.-OptionsExplorer.git
    cd T2.2.-OptionsExplorer
  2. Configure the environment: the application reads its settings from a .env file in the main directory, which is created and populated in the next two steps.

  3. Create the .env file:

    nano .env

  4. Add these variables to .env:

    DB_NAME=""
    DB_USER=""
    DB_PASSWORD=""
    DB_HOST=""
    DB_PORT=""
    DATASET_FOLDER=""
    PROFILE_FOLDER=""
    JWT_SECRET_KEY=""
    HASH_SAULT=""
    
  5. Optional: to generate a value for JWT_SECRET_KEY, you can use:

    node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"

  6. Load the variables from .env into your current shell:

    source .env

  7. Run Docker Compose to set up the necessary services:

    docker-compose up -d

  8. Execute all SQL queries in /queries against the database:

    cd queries
    cat *.sql | PGPASSWORD=$DB_PASSWORD psql -U $DB_USER -h localhost -p $DB_PORT -d $DB_NAME
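If Node.js is not available for generating the JWT_SECRET_KEY in step 5, an equivalent 32-byte hex secret can be produced with openssl; this is an alternative sketch and assumes openssl is installed:

```shell
# Generate a random 32-byte (64 hex character) secret for JWT_SECRET_KEY
JWT_SECRET_KEY=$(openssl rand -hex 32)
echo "$JWT_SECRET_KEY"
```

Paste the printed value into the JWT_SECRET_KEY entry of your .env file.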
    

Usage

The Extreme XP Project provides a set of APIs to manage and analyze experiment-driven data effectively. Below is a list of available APIs along with their descriptions and usage examples.

To use the APIs, a user must first register in the system by calling the /user/register endpoint. A registered user can log in via the /user/login endpoint, which returns an access_token and a refresh_token. The access_token expires after one day, so it must be refreshed periodically by calling /user/refresh_token; the refresh_token remains valid for one week.
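As a sketch, the authentication flow above might look like the following with curl. The base URL, port, JSON field names, and token header scheme are all assumptions for illustration, not taken from the project's documentation:

```shell
# Assumed base URL of a local deployment -- adjust host and port to your setup.
BASE_URL="http://localhost:5000"

# Register a new user (field names are illustrative assumptions).
# "|| true" keeps the sketch from aborting when no server is running.
curl -s -X POST "$BASE_URL/user/register" \
  -H 'Content-Type: application/json' \
  -d '{"username": "alice", "password": "s3cret"}' || true

# Log in; the response contains an access_token and a refresh_token.
curl -s -X POST "$BASE_URL/user/login" \
  -H 'Content-Type: application/json' \
  -d '{"username": "alice", "password": "s3cret"}' || true

# The access_token expires after one day; refresh it using the refresh_token.
# REFRESH_TOKEN is a placeholder for the value returned by /user/login.
REFRESH_TOKEN="<refresh_token-from-login>"
curl -s -X POST "$BASE_URL/user/refresh_token" \
  -H "Authorization: Bearer $REFRESH_TOKEN" || true
```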

Before adding any experiments to the Options-Explorer database, all experiment description types must be defined. This is done using the /experiment/add_experience_description_type endpoint. To retrieve the list of available experiment description type IDs, the user can call /experiment/get_experiment_description_types. Once the experiment description types are set up, experiments can be added to the database either one by one using /experiment/add_experiment or in bulk via a CSV file with the /experiment/add-uc5-dataset endpoint. If needed, experiment details can be retrieved using /experiment/get_experiment.
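A hedged sketch of that setup sequence with curl follows; the base URL, authorization header, and payload field names are assumptions chosen for illustration:

```shell
BASE_URL="http://localhost:5000"          # assumed deployment URL
ACCESS_TOKEN="<access_token-from-login>"  # placeholder for a real token

# Define an experiment description type first (payload fields are assumptions).
# "|| true" keeps the sketch from aborting when no server is running.
curl -s -X POST "$BASE_URL/experiment/add_experience_description_type" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"name": "accuracy"}' || true

# List the available experiment description type IDs.
curl -s "$BASE_URL/experiment/get_experiment_description_types" \
  -H "Authorization: Bearer $ACCESS_TOKEN" || true

# Add a single experiment (field names are illustrative).
curl -s -X POST "$BASE_URL/experiment/add_experiment" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"name": "exp-001", "description_type_id": 1}' || true
```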

User interaction and feedback are crucial for the MDP calculation. Every time a user clicks on an experiment, the /experiment/select_experiment endpoint is called to record the selection. Additionally, user feedback can be logged with /experiment/add_user_feedback, helping to refine the ranking process. To check how many times experiments have been selected, the system provides /experiment/get_selected_experiments.
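The interaction-logging calls above could be sketched as follows; again, the base URL, authorization header, and payload shapes (experiment_id, rating) are assumptions, not confirmed field names:

```shell
BASE_URL="http://localhost:5000"          # assumed deployment URL
ACCESS_TOKEN="<access_token-from-login>"  # placeholder for a real token

# Record that the user clicked experiment 42 (payload shape is an assumption).
# "|| true" keeps the sketch from aborting when no server is running.
curl -s -X POST "$BASE_URL/experiment/select_experiment" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"experiment_id": 42}' || true

# Log explicit user feedback for the same experiment.
curl -s -X POST "$BASE_URL/experiment/add_user_feedback" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"experiment_id": 42, "rating": 5}' || true

# Inspect how often experiments have been selected so far.
curl -s "$BASE_URL/experiment/get_selected_experiments" \
  -H "Authorization: Bearer $ACCESS_TOKEN" || true
```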

Finally, once enough data is collected, the /experiment/call_mdp endpoint is used to calculate the MDP, ranking experiments based on soft and hard constraints provided as input. This ensures that experiment recommendations align with user preferences and system constraints.
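A minimal sketch of the ranking call, assuming the soft and hard constraints are passed as JSON objects in the request body (the field names soft_constraints and hard_constraints are illustrative assumptions):

```shell
BASE_URL="http://localhost:5000"          # assumed deployment URL
ACCESS_TOKEN="<access_token-from-login>"  # placeholder for a real token

# Rank experiments via the MDP, given soft and hard constraints.
# "|| true" keeps the sketch from aborting when no server is running.
curl -s -X POST "$BASE_URL/experiment/call_mdp" \
  -H "Authorization: Bearer $ACCESS_TOKEN" \
  -H 'Content-Type: application/json' \
  -d '{"soft_constraints": {"runtime": "low"},
       "hard_constraints": {"accuracy": 0.9}}' || true
```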