Configuration

Setup & Configuration

Environment Setup

Install dependencies. Follow more detail instruction on Installation section.
```
pip install -r requirements.txt
```
Place your dataset (e.g., CoAt-Set.csv) in the ./dataset directory.
Update the dataset path in the code if you place the dataset in other directory or you use other dataset.
```
file_path = os.path.join("your_dataset_path", "your_dataset.csv")  # Adjust path as needed
```

Dataset Configuration

By default, the simulator run Binary Classification. The target label is using Attack column in the dataset.

df = df.drop(columns=['Label'])  # Drops multi-class labels
y_df = df['Attack']              # Target: 0 (benign) or 1 (attack)

If you want to run Multi-Class Classification, you need change the target label is using Label column in the dataset.

df = df.drop(columns=['Attack'])
y_df = df['Label']  # Retain 'Label' column for multi-class targets

Do not forget to edit the neural network architecture in create_model():
Extra code for doing Multi-Class Classification also need to provide. This version of simulator is not cover yet, maybe in the future.

Federated Learning Parameters

Configure in cids_federated_training():

# Example settings
num_nodes = 10     # Number of clients
num_rounds = 20    # Training rounds
epochs = 10        # Local epochs per round
batch_size = 1000  # Batch size per client

Data Distribution Settings

Modify in load_data():

# Non-IID partitioning example
def load_data():
    # Each client receives 2% of the dataset
    fraction = 0.02
    # Customize shuffling/sampling logic here

Model Customization

Neural Network Architecture

Edit create_model():

def create_model(input_shape):
 model = keras.Sequential([
     layers.Dense(20, activation='relu', input_shape=(input_shape,)),
     layers.Dense(10, activation='relu'), #Edit activation using other method. You can see here https://keras.io/api/layers/activations/#available-activations
     layers.Dense(5, activation='relu'), #Edit number of neuron in each layer (e.g. change 5 with 1000)
     layers.Dense(3, activation='relu'),
     layers.Dense(1, activation='sigmoid')
 ])
 #Edit loss with other method. You can see here https://keras.io/api/losses/#available-losses
 #Edit optimizer with other method. You can see here https://keras.io/api/optimizers/#available-optimizers
 model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy', Recall(), Precision()])
 return model

Preprocessing Adjustments

Replace scalers in the preprocessing pipeline:

from sklearn.preprocessing import StandardScaler

# Replace QuantileTransformer
preprocessor = StandardScaler()

Execution & Outputs

Run the Simulation

Run jupyter notebook first.

jupyter notebook

After that, you can open CIDS-Sim_Non-IID.ipynb and CIDS-Sim_Heterogeneous.ipynb in jupyter notebook.

Outputs Generated:

Logs: Real-time metrics (accuracy, F1-score, and etc.) in the console.
Visualizations: Graphic plots of metric in each rounds.
CSV Files: Detailed metrics in each round and save in files (e.g., global_metrics.csv).

Troubleshooting

Common Issues:

Dataset Not Found:
- Verify file_path points to the correct dataset file.
- Check filesystem permissions.
Poor Model Performance:
- Increase num_rounds or epochs.
- Add more layers to create_model().
High Memory Usage:
- Reduce batch_size or num_nodes.
- Disable resource tracking in the code.

Support

For further assistance, open an issue on the GitHub repository.