Python implementation of ECLAT algorithm for association rule mining.
This implementation mines rules that relate an item occurring in a transaction to an element of the hierarchy that the item belongs to.
Such a rule is mined only if there are transactions containing an itemset that belongs to that hierarchy element.
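For orientation, the core idea of ECLAT can be sketched as follows: items are stored in a vertical layout (item to set of transaction ids), and longer itemsets are grown by intersecting those sets. This is a minimal illustration only, not the code in this repository; the function name `eclat` and its signature are assumptions, and hierarchy-based rules are not covered.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def eclat(transactions: List[Set[str]], min_sup: int = 1) -> Dict[FrozenSet[str], int]:
    """Return every itemset whose support count is at least min_sup.

    Hypothetical helper for illustration; not the repository's API.
    """
    # Vertical layout: map each item to the ids of the transactions containing it.
    tidsets: Dict[str, Set[int]] = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    frequent: Dict[FrozenSet[str], int] = {}

    def extend(prefix: FrozenSet[str], candidates: List[Tuple[str, Set[int]]]) -> None:
        # Each candidate is an item plus the transaction ids shared with the prefix.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[itemset] = len(tids)
            # Intersect with later candidates to build longer itemsets.
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_sup:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_sup),
        key=lambda pair: pair[0],
    )
    extend(frozenset(), start)
    return frequent


# Three small transactions, minimum support count of 2.
result = eclat([{"1", "2", "3"}, {"1", "2"}, {"1", "3"}], min_sup=2)
for itemset, support in result.items():
    print(sorted(itemset), support)
```

With these three transactions and `min_sup=2`, the pair of items 2 and 3 is dropped because the two items share only one transaction.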
$ conda env create -f environment.yml
$ conda activate eclat
Execute with default parameters:
$ python main.py
To execute for a predefined dataset:
$ python main.py --dataset=<dataset_id>
Possible dataset_id values:
To execute for a custom dataset:
$ python main.py --data=<path/to/transactions.txt> --taxonomy=<path/to/taxonomy.txt>
The taxonomy file is optional. Rules based on the item hierarchy are not mined if no taxonomy is provided.
Example of transactions.txt file format:
1 2 3
1 2
1 3
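A file in this format could be read with a few lines of Python. The sketch below assumes one transaction per line with items separated by whitespace, as the example suggests; `read_transactions` is a hypothetical helper name and not necessarily how this repository loads data.

```python
from typing import Iterable, List, Set


def read_transactions(lines: Iterable[str]) -> List[Set[str]]:
    """Parse one transaction per line; items are whitespace-separated tokens."""
    transactions = []
    for line in lines:
        items = line.split()
        if items:  # skip blank lines
            transactions.append(set(items))
    return transactions


# The lines of the example file above; an open file object works the same way.
sample = ["1 2 3\n", "1 2\n", "1 3\n"]
transactions = read_transactions(sample)
print(len(transactions))
```

Items are kept as strings here, so identifiers do not have to be numeric.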
Example of taxonomy.txt file format:
1,11
2,11
3,22
11,111
22,111
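The example reads naturally as `child,parent` pairs (items 1 and 2 under 11, item 3 under 22, and both 11 and 22 under 111). That column order is an assumption based on the example, not something stated here. Under that assumption, a rough sketch for loading the taxonomy and listing the ancestors of an item might look like this; `read_taxonomy` and `ancestors` are hypothetical names.

```python
from typing import Dict, Iterable, List


def read_taxonomy(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'child,parent' lines into a child -> parent mapping (assumed column order)."""
    parents: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        child, parent = (part.strip() for part in line.split(","))
        parents[child] = parent
    return parents


def ancestors(item: str, parents: Dict[str, str]) -> List[str]:
    """Return the chain of hierarchy elements above item, nearest first."""
    chain: List[str] = []
    seen = {item}
    while item in parents:
        item = parents[item]
        if item in seen:  # guard against accidental cycles in the file
            break
        seen.add(item)
        chain.append(item)
    return chain


taxonomy = read_taxonomy(["1,11\n", "2,11\n", "3,22\n", "11,111\n", "22,111\n"])
print(ancestors("1", taxonomy))
print(ancestors("3", taxonomy))
```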
An example of execution with ECLAT parametrization:
$ python main.py --min_sup=5 --min_conf=0.8 --min_len=3 --max_len=10
The options are:
- min_sup - minimum support of the base of mined rules (type=int, default=1),
- min_conf - minimum confidence of mined rules (type=float, default=0.5),
- min_len - minimum length of mined rules (type=int, default=1),
- max_len - maximum length of mined rules (type=int, default=None - not limited by default).
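The options listed above map directly onto a standard `argparse` declaration. The sketch below is only an illustration of the documented types and defaults; whether `main.py` actually uses `argparse`, and the name `build_parser`, are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the documented options with their documented types and defaults."""
    parser = argparse.ArgumentParser(description="ECLAT association rule mining")
    parser.add_argument("--min_sup", type=int, default=1,
                        help="minimum support of the base of mined rules")
    parser.add_argument("--min_conf", type=float, default=0.5,
                        help="minimum confidence of mined rules")
    parser.add_argument("--min_len", type=int, default=1,
                        help="minimum length of mined rules")
    parser.add_argument("--max_len", type=int, default=None,
                        help="maximum length of mined rules (unlimited if omitted)")
    return parser


# Parse the same arguments as in the example command above.
args = build_parser().parse_args(
    ["--min_sup=5", "--min_conf=0.8", "--min_len=3", "--max_len=10"]
)
print(args.min_sup, args.min_conf, args.min_len, args.max_len)
```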
To execute unit tests, run the following command in the main directory:
$ python -m unittest test.test_eclat
To run efficiency experiments:
$ python -m test.test_efficiency