This project addresses the challenges of managing vast, complex, and uncertain data environments by developing a robust framework for experiment-driven analytics. Traditional deterministic models often fail to handle the variability and unpredictability inherent in such systems. To overcome these limitations, the project integrates advanced probabilistic methodologies, such as Bayesian Networks and Markov Decision Processes, to enhance decision-making accuracy and reliability. By aligning system configurations with user intents, the framework ensures adaptability and usability in dynamic contexts. This research also contributes novel algorithms and evaluation metrics to advance the theoretical foundations of experiment-driven analytics, ultimately providing practical tools for optimizing data-driven processes.
In the era of big data and complex decision-making, managing vast amounts of data with varying velocity, volume, and heterogeneity presents a significant challenge. Traditional deterministic models often fall short in effectively handling this complexity and uncertainty. This project seeks to address these challenges by developing a robust framework for experiment-driven analytics that leverages advanced probabilistic methodologies.
Modern data environments are characterized by unpredictability and variability. In such contexts, deterministic approaches fail to provide the required adaptability and reliability. Probabilistic models, such as Bayesian Networks and Markov Decision Processes (MDPs), offer a promising solution. These models enable effective representation and reasoning under uncertainty, facilitating improved accuracy and robustness in decision-making processes.
Furthermore, aligning user intents with system configurations remains a critical requirement. By integrating user preferences with probabilistic frameworks, this project aims to create systems that are not only adaptive but also tailored to specific user needs, thereby enhancing overall usability and performance.
The primary goals of this project include:
1. Framework Development: Constructing a comprehensive system for managing complex analytics workflows.
2. Probabilistic Model Integration: Incorporating Bayesian Networks and MDPs to enhance decision-making reliability.
3. Theoretical Contributions: Proposing novel algorithms and evaluation metrics to advance experiment-driven analytics methodologies.
By addressing these objectives, the project aspires to provide practical tools and techniques for optimizing data analytics processes in unpredictable and dynamic environments.
This project will adopt a methodical approach involving:
- The creation of meta-models to capture variability and user requirements.
- Integration of probabilistic models to align system configurations with user intents.
- Iterative learning from experimentation to continuously refine decision-making models.
Ultimately, this project aims to push the boundaries of experiment-driven analytics, offering innovative solutions for managing the complexities of modern data environments.
To set up the project, follow these steps:
bash
git clone git@github.com:ExtremeXP/T2.2.-OptionsExplorer.git
cd T2.2.-OptionsExplorer
Create the .env file in the main directory and put these variables in it:
Create .env file
nano .env
Add these variables to .env
DB_NAME=""
DB_USER=""
DB_PASSWORD=""
DB_HOST=""
DB_PORT=""
DATASET_FOLDER=""
PROFILE_FOLDER=""
JWT_SECRET_KEY=""
HASH_SAULT=""
Optional: For creating JWT_SECRET_KEY you can use this
bash
node -e "console.log(require('crypto').randomBytes(32).toString('hex'))"
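If Node.js is not available, a value of the same shape (32 random bytes, hex encoded) can be produced with Python's standard library. This is an alternative sketch and not part of the original setup; the function name is hypothetical.

```python
import secrets


def generate_jwt_secret(num_bytes: int = 32) -> str:
    """Return num_bytes of cryptographically secure randomness as a hex string."""
    return secrets.token_hex(num_bytes)


if __name__ == "__main__":
    # Paste the printed value into JWT_SECRET_KEY in .env.
    print(generate_jwt_secret())
```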
Load the variables from .env into the current shell:
bash
source .env
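Before starting the services, it may help to confirm that every variable listed above has a value. The following is a minimal sketch that assumes the .env file only contains simple KEY="value" lines; the helper names are hypothetical and the parser does not handle multi-line or escaped values.

```python
REQUIRED_KEYS = [
    "DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST", "DB_PORT",
    "DATASET_FOLDER", "PROFILE_FOLDER", "JWT_SECRET_KEY", "HASH_SAULT",
]


def parse_env(text: str) -> dict:
    """Parse simple KEY="value" lines; blank lines and # comments are skipped."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding double or single quotes, if any.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]
```

A possible usage is `missing_keys(parse_env(open(".env").read()))`, which returns an empty list when the file is complete.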
Run Docker Compose to set up the necessary services:
bash
docker-compose up -d
Execute all queries in the queries directory of the repository:
cd queries
cat *.sql | PGPASSWORD="$DB_PASSWORD" psql -U "$DB_USER" -h localhost -p "$DB_PORT" -d "$DB_NAME"
The Extreme XP Project provides a set of APIs to manage and analyze experiment-driven data effectively. Below is a list of available APIs along with their descriptions and usage examples.
To use the APIs effectively, the user must first register in the system by calling the /user/register endpoint. If the user has already registered, they can log in using the /user/login endpoint, which returns an access_token and refresh_token. Since the access_token expires daily, the user must refresh it periodically by calling /user/refresh_token, while the refresh_token remains valid for one week.
Before adding any experiments to the Option-Explorer database, all experiment description types must be defined. This is done using the /experiment/add_experience_description_type endpoint. To retrieve the list of available experiment description type IDs, the user can call /experiment/get_experiment_description_types. Once the experiment description types are set up, experiments can be added to the database either one by one using /experiment/add_experiment or in bulk via a CSV file with the /experiment/add-uc5-dataset endpoint. If needed, experiment details can be retrieved using /experiment/get_experiment.
User interaction and feedback are crucial for the MDP calculation. Every time a user clicks on an experiment, the /experiment/select_experiment endpoint is called to record the selection. Additionally, user feedback can be logged with /experiment/add_user_feedback, helping to refine the ranking process. To check how many times experiments have been selected, the system provides /experiment/get_selected_experiments.
Finally, once enough data is collected, the /experiment/call_mdp endpoint is used to calculate the MDP, ranking experiments based on soft and hard constraints provided as input. This ensures that experiment recommendations align with user preferences and system constraints.
This endpoint allows users to create a new account by providing their personal details, email, password, and profile picture. The password is hashed using MD5 with a salt before it is stored. If the registration is successful, the user’s information is stored and a success response is returned. If the profile picture is missing or not in an allowed format (PNG, JPG, JPEG), the request is rejected with a 400 Bad Request error. Security Note: MD5 is not a secure hashing algorithm; using a stronger hashing method such as bcrypt is recommended. Ensure that sensitive information is handled securely.
Method: Post
Endpoint:
/user/register
Request body:
{
name: <string>,
lastname: <string>,
email: <string>,
profile_pic: <string>,
address: <string>,
birth_date: <string>,
educational_level: <string>,
educational_field: <string>,
password: <string>
}
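The recommendation to move away from MD5 can be illustrated with the standard library. bcrypt is a third-party package, so this sketch swaps in PBKDF2-HMAC-SHA256 from `hashlib` instead. The function names, iteration count, and the idea of storing a per-user salt next to the derived key are assumptions, not the project's actual code.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 600_000  # assumed work factor; tune for your hardware


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256 with a per-user salt."""
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected_key: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected_key)
```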
This endpoint allows users to authenticate by providing their email and password. The password is hashed using MD5 with a salt before validation. If the credentials are correct, the system generates and returns an access token and a refresh token, which should be used for authenticated requests. If authentication fails, a 401 Unauthorized response is returned. Security Note: MD5 is not a secure hashing algorithm; using a stronger hashing method like bcrypt is recommended. Ensure tokens are securely stored and refreshed appropriately.
Method: Post
Endpoint:
/user/login
Request body:
{
email: <string>,
password: <string>
}
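A client-side login call could look like the sketch below. It assumes the server listens at a hypothetical `BASE_URL`, accepts a JSON body, and returns `access_token` and `refresh_token` as top-level JSON keys; adjust these to match the actual deployment.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: replace with your server address


def build_login_request(email: str, password: str) -> urllib.request.Request:
    """Build (but do not send) a POST /user/login request with a JSON body."""
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/user/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_tokens(response_text: str) -> tuple:
    """Extract (access_token, refresh_token) from a JSON response body."""
    payload = json.loads(response_text)
    return payload["access_token"], payload["refresh_token"]
```

To send the request, pass it to `urllib.request.urlopen` and feed the decoded response body to `parse_tokens`.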
This endpoint allows users to obtain a new access token using a valid refresh token. The request must include a valid refresh token in the Authorization header. If the refresh token is valid, a new access token is returned. If the refresh token is missing, expired, or invalid, the request will be rejected with a 401 Unauthorized error.
Method: Post
Authentication: Required (JWT)
Endpoint:
/user/refresh_token
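The refresh call differs from other authenticated calls in that the refresh token, not the access token, goes in the Authorization header. The sketch below assumes the common `Bearer <token>` header scheme and a hypothetical `BASE_URL`; neither is confirmed by this document.

```python
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: replace with your server address


def build_refresh_request(refresh_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST /user/refresh_token request."""
    return urllib.request.Request(
        f"{BASE_URL}/user/refresh_token",
        data=b"",  # empty body; the token travels in the header
        headers={"Authorization": f"Bearer {refresh_token}"},  # assumed scheme
        method="POST",
    )
```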
This endpoint allows authenticated users to add a new experiment. The user must provide experiment details, including the title, domain, intent, algorithm, method, model, and a list of descriptions. The request must be sent in JSON format and include a valid JWT access token in the Authorization header. If successful, the experiment is stored and a success response is returned. If authentication fails, a 401 Unauthorized response is returned.
Method: Post
Authentication: Required (JWT)
Endpoint:
/experiment/add_experiment
Request body:
{
"title" : <string>,
"domain" : <string>,
"intent" : <string>,
"algorithm" : <string>,
"method" : <string>,
"model" : <string>,
"descriptions" : [
{
"description_type_id": <int>,
"value": <string|int>
},
...
]
}
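The request body above can be assembled programmatically. This is a minimal sketch; the helper name and the example values in the usage note are hypothetical, and the description type IDs must already exist on the server.

```python
import json


def build_experiment_payload(title, domain, intent, algorithm, method, model,
                             descriptions):
    """Build the JSON body for /experiment/add_experiment.

    descriptions is an iterable of (description_type_id, value) pairs.
    """
    return {
        "title": title,
        "domain": domain,
        "intent": intent,
        "algorithm": algorithm,
        "method": method,
        "model": model,
        "descriptions": [
            {"description_type_id": int(type_id), "value": value}
            for type_id, value in descriptions
        ],
    }
```

For example, `json.dumps(build_experiment_payload("run-1", "vision", "classification", "cnn", "supervised", "resnet", [(1, 0.92), (2, "gpu")]))` yields a string that can be sent as the request body.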
This endpoint allows authenticated users to add a new experiment description type.
Users must provide the name, type, and reward for the description type.
The request should be sent using form-data. The name represents the type of description,
type represents the description's format (e.g., numerical, text), and reward represents
the associated reward or importance level for the description. A successful addition returns
a success response, while an unauthorized request results in a 401 error.
Method: Post
Authentication: Required (JWT)
Endpoint:
/experiment/add_experience_description_type
Request body:
{
"name" : <string>,
"type" : <string>,
"reward" : <int>
}
This endpoint allows users to upload a UC5 dataset in CSV format. The data is then processed, converted to JSON, and added as experiments.
Method: Post
Authentication: Required (JWT)
Endpoint:
/experiment/add-uc5-dataset
Request body:
{
"file" : <file.csv>
}
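Uploading the CSV from a script requires a file upload body. The sketch below assumes the endpoint expects `multipart/form-data` with the file under the field name `file`, a `Bearer` Authorization header, and a hypothetical `BASE_URL`; it encodes the body by hand to stay within the standard library.

```python
import urllib.request
import uuid

BASE_URL = "http://localhost:5000"  # assumption: replace with your server address


def encode_multipart_file(field: str, filename: str, content: bytes,
                          content_type: str = "text/csv"):
    """Return (body, content_type_header) for a single-file multipart upload."""
    boundary = uuid.uuid4().hex
    body = b"\r\n".join([
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{field}"; '
        f'filename="{filename}"'.encode(),
        f"Content-Type: {content_type}".encode(),
        b"",                        # blank line ends the part headers
        content,
        f"--{boundary}--".encode(),  # closing boundary
        b"",
    ])
    return body, f"multipart/form-data; boundary={boundary}"


def build_upload_request(filename: str, content: bytes,
                         access_token: str) -> urllib.request.Request:
    """Build (but do not send) a POST /experiment/add-uc5-dataset request."""
    body, content_type = encode_multipart_file("file", filename, content)
    return urllib.request.Request(
        f"{BASE_URL}/experiment/add-uc5-dataset",
        data=body,
        headers={
            "Content-Type": content_type,
            "Authorization": f"Bearer {access_token}",  # assumed scheme
        },
        method="POST",
    )
```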
Select an experiment by its ID and log the user’s selection in the search history.
Method: Get
Authentication: Required (JWT)
Endpoint:
/experiment/select_experiment?experiment_id=<int>
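Because this endpoint takes its input as a query parameter, a client only needs to encode the ID into the URL. The sketch assumes a hypothetical `BASE_URL` and the `Bearer` header scheme for the access token.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:5000"  # assumption: replace with your server address


def build_select_request(experiment_id: int,
                         access_token: str) -> urllib.request.Request:
    """Build (but do not send) a GET /experiment/select_experiment request."""
    query = urllib.parse.urlencode({"experiment_id": experiment_id})
    return urllib.request.Request(
        f"{BASE_URL}/experiment/select_experiment?{query}",
        headers={"Authorization": f"Bearer {access_token}"},  # assumed scheme
        method="GET",
    )
```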
Allows users to provide feedback for an experiment by rating it.
Method: Post
Authentication: Required (JWT)
Endpoint:
/experiment/add_user_feedback
Request body:
{
"experiment_id" : <int>,
"rating": <int>
}
This endpoint allows authenticated users to retrieve an experiment by its unique ID.
The experiment_id is passed as a query parameter. If the experiment is found,
its details are returned in the response. If the experiment cannot be found,
a 404 Not Found error is returned. The request must include a valid JWT access token in the Authorization header.
Method: Get
Authentication: Required (JWT)
Endpoint:
/experiment/get_experiment?experiment_id=<int>
This endpoint allows authenticated users to retrieve all available experiment description types. The response includes a list of all description types, including their names, types, rewards, and description type IDs. If no description types are found or the user is not authenticated, an error response will be returned.
Method: Post
Authentication: Required (JWT)
Endpoint:
/experiment/get_experiment_description_types
Fetch selected experiments by their IDs.
Method: Post
Authentication: Required (JWT)
Endpoint:
/experiment/get_selected_experiments
Request body:
{
"experiment_ids" :[
{
"id" : <int>
},
...
]
}
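Note that the body wraps each ID in its own object rather than sending a flat list. A small helper (hypothetical name) can build that shape from plain integers:

```python
def build_selected_experiments_payload(experiment_ids):
    """Wrap plain IDs in the {"id": ...} objects the request body expects."""
    return {"experiment_ids": [{"id": int(eid)} for eid in experiment_ids]}
```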
This endpoint accepts hard and soft constraints related to experiments, filters the experiments based on the provided criteria, and then processes them using a Markov Decision Process (MDP) model.
Method: Post
Authentication: Required (JWT)
Endpoint:
/experiment/call_mdp
Request body:
{
"domain": <string>,
"intent": <string>,
"algorithm": <string>,
"method": <string>,
"hard_constraints": [
{
"description_type_id": <int>,
"value": <string|int>,
"comparison_type": <categorical|numerical>,
"operator": <"<=" | ">=" | "=">
},
...
],
"soft_constraints":[
{
"name" : <string>,
"type" : <categorical|numerical>,
"value" :<string|int>
},
...
]
}
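To make the hard-constraint fields concrete, the sketch below shows one plausible reading of how they could filter experiments before ranking. It is an illustration only: how the server actually evaluates constraints is not specified here, the in-memory experiment shape (`descriptions` as a mapping from description type ID to value) is an assumption, and the MDP ranking step itself is not implemented.

```python
import operator

# Assumed mapping from the documented operator strings to comparisons.
OPERATORS = {"<=": operator.le, ">=": operator.ge, "=": operator.eq}


def satisfies(descriptions: dict, constraint: dict) -> bool:
    """Check one hard constraint against one experiment's descriptions."""
    actual = descriptions.get(constraint["description_type_id"])
    if actual is None:
        return False  # assumption: a missing description fails the constraint
    expected = constraint["value"]
    if constraint["comparison_type"] == "numerical":
        actual, expected = float(actual), float(expected)
    return OPERATORS[constraint["operator"]](actual, expected)


def filter_experiments(experiments: list, hard_constraints: list) -> list:
    """Keep only the experiments that satisfy every hard constraint."""
    return [
        exp for exp in experiments
        if all(satisfies(exp["descriptions"], c) for c in hard_constraints)
    ]
```

Under this reading, soft constraints would not remove experiments; they would only influence the ranking of the experiments that remain.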