A mining-based approach for efficient enumeration of algebraic structures
Abstract
Algebraic structures are well-studied mathematical structures in abstract algebra with applications in many fields of computer security, such as cryptography and authentication. Generating such structures is computationally very expensive because of the huge number of permutations involved. Moreover, many of these permutations are redundant because they are symmetrically equivalent. Symmetry breaking (finding symmetrically equivalent structures) is itself a computationally challenging task. In this paper, we present a mining-based approach for symmetry breaking in algebraic structures. The approach reduces the number of redundant structures by identifying rules based on recurring patterns in previously known structures. These rules are then used as constraints in a constraint solver. The proposed approach is applied to IP loops, a special class of algebraic structures, and the deduced rules eliminate a large number of redundant solutions, resulting in a significant time improvement.
Keywords
Algebraic structures, Mining rules, Symmetry breaking, Inverse property loop
1 Introduction
Table 1 Constraints for generating algebraic structures
Name  Constraint

Latin square  \(\forall row:\forall i,j \in row, x_i=x_j\Rightarrow i=j\) and \(\forall col:\forall i,j \in col, y_i=y_j\Rightarrow i=j\)
Loop  \(x*e=x=e*x\)
IP loop  \(\forall x,y \in L: x^{-1}*(x*y)=(y*x)*x^{-1}=y\)
Basic symmetry breaking in IP loop  \(x-x^{-1}\le 1\)
Isomorphism  \(*_1\) Isom. \(*_2\Leftrightarrow \forall u, v \in *_1\), \(f(u*_1v)=f(u)*_2f(v)\)
A simple way to count and enumerate algebraic structures of any order is to model them as a finite domain constraint satisfaction problem (CSP), where the range of the binary operation \(*\) is a CSP variable whose domain consists of the elements of the algebra. Depending on the required algebraic structure, the corresponding constraints are then applied to the CSP variables. The CSP constraints for the Latin square, loop, and IP loop properties are shown in Table 1. The constraint solver explores the state space to find all solutions that satisfy the specified constraints.
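The CSP formulation described above can be imitated on a very small scale with plain backtracking. The following is a minimal Python sketch, not the JaCoP or OR-Tools model used in this work; the function name `count_latin_squares` and the `with_identity` flag are illustrative assumptions. Each cell of the table acts as a variable with domain 0..n-1, the row and column checks play the role of the Latin square constraint, and `with_identity=True` additionally fixes element 0 as the identity (the loop constraint).

```python
def count_latin_squares(n, with_identity=False):
    """Count n x n tables over symbols 0..n-1 satisfying the Latin square
    constraint, optionally with 0 fixed as a two-sided identity (loop)."""
    row_used = [set() for _ in range(n)]  # symbols already placed in each row
    col_used = [set() for _ in range(n)]  # symbols already placed in each column

    def candidates(i, j):
        # Loop constraint with e = 0: 0 * j = j and i * 0 = i.
        if with_identity and i == 0:
            return [j]
        if with_identity and j == 0:
            return [i]
        return range(n)

    def fill(cell):
        if cell == n * n:
            return 1  # every cell assigned: one complete solution
        i, j = divmod(cell, n)
        total = 0
        for v in candidates(i, j):
            if v in row_used[i] or v in col_used[j]:
                continue  # would repeat a symbol in a row or column
            row_used[i].add(v)
            col_used[j].add(v)
            total += fill(cell + 1)
            row_used[i].remove(v)
            col_used[j].remove(v)
        return total

    return fill(0)


if __name__ == "__main__":
    for order in (3, 4):
        print(order, count_latin_squares(order))
    for order in (4, 5):
        print(order, count_latin_squares(order, with_identity=True))
```

For orders 3 and 4 this counts 12 and 576 Latin squares; with the identity fixed, orders 4 and 5 give 4 and 56 tables. Exhaustive search of this kind is only practical for very small orders, which is why pruning symmetric branches matters.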
It is well known that constraint satisfaction problems have symmetries, that is, for every solution there are many equivalent solutions [5, 6]. For example, there are 161280 Latin squares of order 5, but they fall into only 1411 isomorphism classes. To show the enormity of redundant copies, the number of Latin squares and unique isomorphism classes up to order 10 are shown in Fig. 1 [2]. When enumerating algebraic structures, it is sufficient to find only one solution from each class of equivalent solutions. To reduce the search time of constraint solvers, it is better to break symmetries during the search so that redundant search effort can be avoided. Therefore, additional symmetry-breaking constraints, such as those proposed in [7], are added. Generating these symmetry-breaking constraints is a time-consuming process and requires good insight into the problem domain.
Even after applying symmetry-breaking constraints, the solutions generated by the constraint solver contain an enormous number of isomorphic copies. These redundant copies need to be eliminated in order to obtain the count of isomorphism classes. They are eliminated in a separate post-processing step (henceforth referred to as isomorphism detection) using tools such as Nauty [8]. In this paper, we use a mining-based approach to discover more symmetry-breaking constraints. To the best of our knowledge, mining-based approaches have never been used to generate symmetry-breaking rules. We demonstrate that new rules can be generated without expertise in specific algebraic structures. To demonstrate the effectiveness of our approach, we apply it to IP loops of order 13 and show performance improvements in enumerating IP loops. Including the constraints discovered by our mining process provides two benefits: (1) it cuts down the search space for the constraint solver, which reduces the search time; (2) it reduces the number of redundant copies, which reduces isomorphism detection time.
The rest of the paper is organized as follows. Section 2 describes the related background on constraint programming and the history of counting algebraic structures. Section 3 explains the methodology used to extract the symmetry-breaking constraints. Section 4 discusses the results obtained by applying our approach to one of the well-known algebraic structures. Section 5 concludes the paper and outlines future directions.
2 Background
2.1 Constraint solvers
Constraint programming (CP) is a paradigm where a problem can be modeled in terms of constraints in order to identify feasible solutions from a huge set of possible solutions. CP focuses on finding feasible solutions instead of optimal solutions. CP has been applied in several domains including computer graphics, natural language processing, scheduling, and planning. There are several free and commercial constraint programming solvers available which allow users to model problems in terms of constraints.
In this paper, we use JaCoP and Google's OR-Tools to enumerate algebraic structures. The constraint solver models the problem as a finite domain constraint satisfaction problem (CSP), where the range of the binary operation \(*\) is a CSP variable whose domain consists of the elements of the algebra. The relevant constraints for the algebraic structures (Latin square, loop, and IP loop), as shown in Table 1, are then applied to the CSP variables.
The Latin square constraint in Table 1 implies that each symbol (element) occurs exactly once in every row and in every column. An example of a Latin square of order 5 is shown in Fig. 2a. The loop constraint enforces the existence of an identity element (e) such that the binary operation (\(*\)) between e and any other element (x) results in the same element (x). For example, in Fig. 2b \(e=0\) and \((2*0) = (0*2) = 2\), whereas \((2*0) \ne (0*2) \ne 2\) in the Latin square shown in Fig. 2a. The IP loop constraint implies the existence of a two-sided inverse (\(x^{-1}\)) such that \(x^{-1}*x=e\) and \(x*x^{-1}=e\), and requires that the left inverse property (\(x^{-1}*(x*y)=y\)) and the right inverse property (\((y*x)*x^{-1}=y\)) hold for each element of the loop. For example, in Fig. 2c, the inverse of element 1 is 2, and \(2*(1*3) = (3*1)*2=3\). In contrast, in Fig. 2b, the inverse of element 1 is 1, but \(1*(1*3) \ne 3\) and \((3*1)*1 \ne 3\).
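The three properties described above can be checked directly on a multiplication table. The sketch below is illustrative only: the function names and the three example tables are assumptions made here and are not the tables of Fig. 2. A table is a list of rows with `t[x][y]` standing for \(x*y\).

```python
def is_latin_square(t):
    """Each symbol 0..n-1 occurs exactly once in every row and column."""
    n = len(t)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in t)
    cols_ok = all({t[i][j] for i in range(n)} == symbols for j in range(n))
    return rows_ok and cols_ok


def identity_element(t):
    """Return e with e*x == x == x*e for all x, or None if there is none."""
    n = len(t)
    for e in range(n):
        if all(t[e][x] == x and t[x][e] == x for x in range(n)):
            return e
    return None


def is_loop(t):
    return is_latin_square(t) and identity_element(t) is not None


def is_ip_loop(t):
    """Loop with two-sided inverses and the left/right inverse properties."""
    if not is_loop(t):
        return False
    n, e = len(t), identity_element(t)
    for x in range(n):
        inv = next((y for y in range(n) if t[x][y] == e and t[y][x] == e), None)
        if inv is None:
            return False
        for y in range(n):
            left_ok = t[inv][t[x][y]] == y    # x^-1 * (x * y) == y
            right_ok = t[t[y][x]][inv] == y   # (y * x) * x^-1 == y
            if not (left_ok and right_ok):
                return False
    return True


# Illustrative tables of order 5 (assumed examples, not Fig. 2).
latin_only = [[(2 * i + j) % 5 for j in range(5)] for i in range(5)]
loop_not_ip = [
    [0, 1, 2, 3, 4],
    [1, 0, 3, 4, 2],
    [2, 3, 4, 0, 1],
    [3, 4, 1, 2, 0],
    [4, 2, 0, 1, 3],
]
cyclic_group = [[(i + j) % 5 for j in range(5)] for i in range(5)]

if __name__ == "__main__":
    for name, t in [("latin_only", latin_only),
                    ("loop_not_ip", loop_not_ip),
                    ("cyclic_group", cyclic_group)]:
        print(name, is_latin_square(t), is_loop(t), is_ip_loop(t))
```

In `loop_not_ip`, element 1 is its own inverse but \(1*(1*2)=1*3=4\ne 2\), so the left inverse property fails, mirroring the kind of counterexample discussed for Fig. 2b.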
Table 2 History of counting Latin squares and loops
Key milestones in Latin square (LS) and loops counting  Historical study details 

Reduced LS up to N = 5  Euler (1782) [14] 
Reduced LS up to N = 6  Frolov (1890) [16] 
Isotopy classes up to N = 6  Fisher and Yates (1934) [15] 
Loops up to N = 6  Schönhardt (1930) [25], Albert (1944) [9] and Sade (1970) [23] 
Main & isotopy classes for N = 7  
Loops up to N = 7  Brant and Mullen (1985) [11] 
Reduced LS for N = 8  Wells (1967) [26] 
Reduced LS for N = 9  Bammel and Rothstein (1975) [10] 
Reduced LS for N = 10  McKay and Rogoyski (1995) [19] 
Reduced LS for N = 11  McKay and Wanless (2005) [20] 
Loops and LS up to N = 10  McKay, Meynert and Myrvold (2007) [2] 
Inverse property loops up to N = 13  Slaney and Ali (2008) [7] 
2.2 Isomorphism classes
Given two algebraic structures (e.g., Latin squares, loops or IP loops) (\(L_1,*_1\)) and (\(L_2,*_2\)), these structures are considered isomorphic to each other if there exists a bijective function \(f:A\rightarrow B\), where \(A=\{0,\ldots ,n-1\}\) and B is any permutation of A, such that for all indices u and v in \(L_1\): \(f(u*_1 v)=f(u)*_2 f(v)\). In our case, \(L_1\) (\(n \times n\)) is isomorphic to \(L_2\) (\(n \times n\)) if \(\forall i, j < n\), \(f(L_1[i][j])=L_2[f(i)][f(j)]\). All structures that are isomorphic to each other belong to one isomorphism class.
For example, Fig. 3 shows two IP loops \(L_1\) and \(L_2\), which look quite different from each other (as highlighted). But they are isomorphic to each other because there exists a bijective function, \(f:\{0,1,2,3,4,5,6\} \rightarrow \{0,1,2,4,3,5,6\}\) that satisfies the isomorphism property for each element of \(L_1\) and \(L_2\). Please note that \(f(0)=0\), \(f(1)=1\), \(f(2)=2\), \(f(3)=4\), \(f(4)=3\), \(f(5)=5\), and \(f(6)=6\). For example, it can be seen that at indices \((i,j)=(1,3)\), isomorphism property is satisfied because \(f(L_1[1][3])=L_2[f(1)][f(3)]=5\).
We can also describe the above bijective function f as \(f:(3 \ \ 4)\), which means that symbols 3 and 4 are swapped. Another way to check isomorphism between two algebraic structures \(L_1\) and \(L_2\) is to generate \(L_2\) from \(L_1\) by swapping particular rows, columns, and values according to the function f. For example, in Fig. 3, \(L_2\) can be generated from \(L_1\) by swapping rows 3 and 4, columns 3 and 4, and values 3 and 4.
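The isomorphism condition \(f(L_1[i][j])=L_2[f(i)][f(j)]\) and the row/column/value swap can be sketched in a few lines of Python. This is an illustration under assumptions: the helper names are invented here, the cyclic group of order 7 is used as a stand-in for the IP loops of Fig. 3, and the brute-force `canonical_form` (minimum over all \(n!\) relabelings) is a toy replacement for a dedicated tool such as Nauty, usable only for very small orders.

```python
from itertools import permutations, product


def relabel(table, f):
    """Build L2 from L1 so that L2[f(i)][f(j)] = f(L1[i][j])."""
    n = len(table)
    out = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            out[f[i]][f[j]] = f[table[i][j]]
    return out


def is_isomorphism(l1, l2, f):
    """Check f(L1[i][j]) == L2[f(i)][f(j)] for all indices i, j."""
    n = len(l1)
    return all(f[l1[i][j]] == l2[f[i]][f[j]]
               for i in range(n) for j in range(n))


def canonical_form(table):
    """Smallest relabeling over all permutations: equal iff isomorphic."""
    n = len(table)
    return min(tuple(map(tuple, relabel(table, f)))
               for f in permutations(range(n)))


# Stand-in example: cyclic group of order 7 and the swap f = (3 4).
l1 = [[(i + j) % 7 for j in range(7)] for i in range(7)]
f = [0, 1, 2, 4, 3, 5, 6]
l2 = relabel(l1, f)

# All Latin squares of order 3, grouped into isomorphism classes.
squares_3 = [rows for rows in product(permutations(range(3)), repeat=3)
             if all(len({r[j] for r in rows}) == 3 for j in range(3))]
classes_3 = {canonical_form(s) for s in squares_3}

if __name__ == "__main__":
    print(l1 != l2, is_isomorphism(l1, l2, f))
    print(len(squares_3), len(classes_3))
```

The two tables differ entry by entry (for instance at index (1, 2)) yet share one canonical form, and the 12 Latin squares of order 3 collapse into 5 isomorphism classes, a small-scale version of the redundancy discussed above.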
2.3 History of algebraic structures enumeration
Researchers have been interested in counting and enumerating algebraic structures for more than two centuries. Latin squares, loops, and IP loops are some of the well-studied algebraic structures. The earliest history of counting Latin squares (LS) goes back to at least 1782, as the number of reduced LS of order 5 was known to Euler [14] and Cayley [13]. Since then, researchers have been trying to enumerate algebraic structures of the next order. However, there have been considerable delays between consecutive milestones because of the computational complexity of the problem. The history of counting reduced Latin squares and loops is summarized in Table 2, which shows the main achievements and the related studies. Additionally, there are numerous other studies [12, 17, 18, 21] on counting algebraic structures that produced incorrect counts.
In this paper, we demonstrate the application of mining techniques in order to reduce the time in enumerating algebraic structures.
3 Proposed methodology
We propose an approach to find symmetry-breaking constraints by mining rules from the previously known solutions of lower-order algebraic structures. These constraints can then be used for efficient enumeration of algebraic structures of higher order. For example, we can find rules by applying a mining technique to the known set of matrices for algebraic structures of order 1 to n. The discovered rules can then be used for enumerating the solutions for algebraic structures of order \((n+1)\).
Thus, the first step in our proposed methodology is to enumerate the required algebraic structures and their respective isomorphism classes. These algebraic structures and isomorphism classes are then used in the second step to identify rules in the form of symmetry-breaking constraints. These rules can then be used to enumerate algebraic structures of higher order. The following subsections describe the details of these steps.
3.1 Enumerating algebraic structures
Table 3 Some of the association rules extracted from known isomorphism classes of IP loops
Association rule (\(\forall mat_i \in R_p,\ldots ,R_n\))  Support (%)  Confidence (%)  Lift 

\(mat_i[3][1] = 4 \implies mat_i[4][4] = 1\)  69.24  100  1.44 
\(mat_i[4][4] = 1 \implies mat_i[3][1] = 4\)  69.24  100  1.44 
\(mat_i[3][1] = 4 \implies mat_i[1][3] = 4\)  69.24  100  1.44 
\(mat_i[1][3] = 4 \implies mat_i[3][1] = 4\)  69.24  100  1.44 
\(mat_i[3][1] = 4 \implies mat_i[4][2] = 3\)  69.24  100  1.44 
\(mat_i[4][2] = 3 \implies mat_i[3][1] = 4\)  69.24  100  1.44 
\(mat_i[3][1] = 4 \wedge mat_i[4][4] = 1 \implies mat_i[1][3] = 4\)  69.24  100  1.44 
3.2 Mining symmetry breaking rules
In order to identify rules that restrict the number of symmetries, we first need to generate all the algebraic structures of orders p to n (i.e., \(S_p, \ldots , S_n\)). The value of p is chosen such that \(S_p\) has a considerably large number of structures. We also determine the corresponding sets of isomorphism classes (i.e., \(R_p,\ldots , R_n\)). Our first attempt at mining rules was to generate association rules from the known algebraic structures in \(R_p,\ldots , R_n\). Each matrix was considered an itemset. Each position in the matrix along with its value was considered a unique item, \(I_{i,j,v}\). So, an item \(I_{3,4,4}\) means that the matrix had value 4 at index (3,4). Thus, an \(n \times n\) matrix resulted in \(n^2\) items in each itemset.
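The encoding of a matrix as an itemset can be sketched as follows. This is a minimal Python illustration, not the R code used in this work; the function name `matrix_to_itemset` and the order-3 example table are assumptions. Each item is represented as a tuple `(i, j, v)` standing for \(I_{i,j,v}\).

```python
def matrix_to_itemset(matrix):
    """Encode an n x n matrix as the set of items (i, j, v), where v is the
    value stored at index (i, j). The result always has n*n items."""
    n = len(matrix)
    return frozenset((i, j, matrix[i][j])
                     for i in range(n) for j in range(n))


# Illustrative order-3 table (the cyclic group of order 3).
example = [[(i + j) % 3 for j in range(3)] for i in range(3)]
example_items = matrix_to_itemset(example)

if __name__ == "__main__":
    print(len(example_items))            # number of items: n^2
    print((1, 2, 0) in example_items)    # value 0 is stored at index (1, 2)
```

Because the position is part of every item, two matrices share an item only when they agree on the value at that exact index, which is what allows rules of the form "value at one index implies value at another" to be mined.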
We applied this approach to the known isomorphism classes of inverse property loop (IP loop) algebraic structures of orders 11 and 13, which consisted of 10391 matrices. The Apriori algorithm [27] was then applied to these itemsets using the R programming language [29]. Note that we did not need a more advanced association rule mining algorithm such as Eclat [28], as the standard Apriori algorithm took only about 10 seconds to extract all the association rules from the 10391 matrices.
We considered rules with support larger than 60 % and confidence equal to 100 %. This provided us with 186 different rules. Some of these association rules are shown in Table 3. For example, the first rule specifies that whenever any matrix in \(R_p,\ldots , R_n\) had value 4 at index (3, 1), the matrix also had value 1 at index (4, 4). This was always true (confidence = 100 %) and was observed in about 70 % of the matrices. The lift of the rule (greater than 1) indicated a positive correlation between the antecedent and the consequent of the rule.
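The three measures used above can be computed directly from the itemsets. The following is a minimal sketch under assumptions: the function name `rule_metrics` and the four toy itemsets are invented for illustration and are not the 10391 matrices of this study. Support is the fraction of itemsets containing both sides of the rule, confidence is that count divided by the count of itemsets containing the antecedent, and lift is confidence divided by the support of the consequent.

```python
def rule_metrics(itemsets, antecedent, consequent):
    """Return (support, confidence, lift) of the rule antecedent => consequent.
    Assumes the antecedent and consequent each occur in at least one itemset."""
    a, c = frozenset(antecedent), frozenset(consequent)
    n = len(itemsets)
    count_a = sum(1 for s in itemsets if a <= s)
    count_c = sum(1 for s in itemsets if c <= s)
    count_both = sum(1 for s in itemsets if (a | c) <= s)
    support = count_both / n
    confidence = count_both / count_a
    lift = confidence / (count_c / n)
    return support, confidence, lift


# Toy data: items are (row, column, value) tuples.
A = (3, 1, 4)   # value 4 at index (3, 1)
C = (4, 4, 1)   # value 1 at index (4, 4)
toy_itemsets = [
    frozenset({A, C, (0, 0, 0)}),
    frozenset({A, C}),
    frozenset({A, C, (1, 1, 2)}),
    frozenset({(3, 1, 2), (4, 4, 3)}),   # neither A nor C
]
support, confidence, lift = rule_metrics(toy_itemsets, {A}, {C})

if __name__ == "__main__":
    print(support, confidence, round(lift, 3))
```

The same arithmetic is consistent with Table 3: with confidence 100 % in both directions, the consequent has the same support as the rule (69.24 %), so the lift is \(1/0.6924 \approx 1.44\).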
All of the rules were also symmetric, with exactly the same support and confidence in both directions (e.g., \(mat_i[3][1] = 4 \Leftrightarrow mat_i[4][4] = 1\)). The presence of such rules in these structures was significant, as they could potentially be used to add symmetry-breaking constraints to the constraint solver. These constraints would discard as redundant any matrix that did not satisfy the rules. For example, a constraint based on the first rule would discard all matrices that had value 4 at index (3, 1) but did not have value 1 at index (4, 4).
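How such a rule would act as a filter can be sketched as follows. This is an illustrative post-hoc check written in Python, whereas in a solver the rule would be posted as an implication constraint; the function name `violates_rule` and the three order-5 example tables are assumptions made for this sketch.

```python
def violates_rule(matrix, antecedent, consequent):
    """True if the antecedent holds but the consequent does not.
    Both sides are (row, column, value) triples."""
    ai, aj, av = antecedent
    ci, cj, cv = consequent
    return matrix[ai][aj] == av and matrix[ci][cj] != cv


RULE = ((3, 1, 4), (4, 4, 1))   # mat[3][1] = 4  =>  mat[4][4] = 1

# Three illustrative Latin squares of order 5 (not taken from the paper).
breaks_rule = [[(i + j) % 5 for j in range(5)] for i in range(5)]
satisfies_rule = [[(i + 2 * j + 4) % 5 for j in range(5)] for i in range(5)]
not_applicable = [[(i + 2 * j) % 5 for j in range(5)] for i in range(5)]

candidates = [breaks_rule, satisfies_rule, not_applicable]
kept = [m for m in candidates if not violates_rule(m, *RULE)]

if __name__ == "__main__":
    print(len(candidates), len(kept))
```

Only the first table is discarded: it has value 4 at index (3, 1) but value 3 at index (4, 4). The third table is kept because the antecedent does not apply to it.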

(1, 2, 1, 150) has significance 3.35

(1, 1, 2, 100) has significance 3.1

(10, 10, 1, 200) has significance 1.9
4 Results
Table 4 Time taken and number of solutions for IP loops
Order (n)  Total solutions (with known constraints)  Isomorphism classes  Time (s) 

5  1  1  0 
7  4  2  0.023 
9  64  7  0.024 
11  6464  49  5.86 
13  7853368  10342  103636 
Table 5 Rules extracted from IP loops of order 7, 9, and 11
Rules \((\forall mat_i \in S_p,\ldots , S_n)\)  Support count  Significance 

\(mat_i[3][3] \ne 1\)  420  2.58 
\(mat_i[1][5] \ne 3\)  109  2.18 
\(mat_i[5][5] \ne 1\)  259  2.09 
\(mat_i[3][5] \ne 6\)  640  2.08 
Table 6 Performance gains for IP loops of order 13
Measure  Without using mined constraints  After using mined constraints  Perf. gain

Total solutions  7853368  6392816  18.6 %
Time (s)  103636  81124  21 %
Isomorphism classes  10342  10342  -
These discovered rules were then used as additional constraints for enumerating IP loops of order 13. Table 6 shows the performance improvements. The total number of solutions decreased from 7853368 to 6392816, an 18.6 % reduction. The time taken to determine the isomorphism classes was reduced by 21 %, from 103636 s to 81124 s. This shows a considerable performance gain in terms of both time and the number of solutions using the mining approach.
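The reported gains follow from simple arithmetic on the figures in Table 6. The short Python sketch below recomputes them; the helper name `percent_reduction` is an assumption introduced here.

```python
def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


solutions_before, solutions_after = 7_853_368, 6_392_816
time_before, time_after = 103_636, 81_124   # seconds

solutions_gain = percent_reduction(solutions_before, solutions_after)
time_gain = percent_reduction(time_before, time_after)
discarded = solutions_before - solutions_after

if __name__ == "__main__":
    print(f"solutions: {solutions_gain:.1f} % fewer ({discarded} discarded)")
    print(f"time:      {time_gain:.1f} % less")
```

The reduction in solutions rounds to 18.6 %, and the time reduction comes out slightly under 22 %, in line with the roughly 21 % reported.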
4.1 Rules evaluation
It should be noted that inclusion of these additional constraints (based on rules discovered using our proposed rule mining approach) did not cause any loss of information as all the representative isomorphism classes (i.e., 10342 isomorphism classes) were identified successfully.
We conducted a further evaluation of the rules to obtain a breakdown of the number of solutions discarded by each rule and their corresponding isomorphism classes.
Table 7 Rules evaluation: the number of solutions discarded by each rule, the corresponding number of isomorphism classes, and the number of different mappings
Rules  Number of isomorphic solutions discarded  Number of isomorphism classes  Number of mappings 

\(mat_i[3][3] \ne 1\)  609408  6322  8853 
\(mat_i[1][5] \ne 3\)  304160  6322  6146 
\(mat_i[5][5] \ne 1\)  150448  3263  4116 
\(mat_i[3][5] \ne 6\)  396536  8583  14156 
Total solutions (unique)  1460552  8853  21529 
It was observed that the sets of isomorphism classes representing the solutions discarded by different rules are not mutually exclusive. For example, in Fig. 7, the matrix (b1), which was discarded due to the \(mat_i[5][5] \ne 1\) rule, and the matrix (b2), which was discarded due to the \(mat_i[3][5] \ne 6\) rule, are represented by the same isomorphism class (i.e., matrix (b)). It was also observed that the same set of 6322 isomorphism classes represented all the solutions (matrices) discarded by the \(mat_i[3][3] \ne 1\) and \(mat_i[1][5] \ne 3\) rules. Overall, all of the 1460552 discarded solutions were represented by 8853 isomorphism classes using 21529 different mappings.
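The overlap can also be read off the reported figures by simple counting. The Python sketch below uses only the numbers from the rules evaluation table together with the solution counts for order 13; the variable names are illustrative, and the reading that each discarded solution is counted under exactly one rule is an inference from the totals rather than a statement made in the text.

```python
# Per-rule figures from the rules evaluation table.
discarded_per_rule = {
    "mat[3][3] != 1": 609_408,
    "mat[1][5] != 3": 304_160,
    "mat[5][5] != 1": 150_448,
    "mat[3][5] != 6": 396_536,
}
classes_per_rule = {
    "mat[3][3] != 1": 6_322,
    "mat[1][5] != 3": 6_322,
    "mat[5][5] != 1": 3_263,
    "mat[3][5] != 6": 8_583,
}
total_discarded = 1_460_552
total_classes = 8_853

sum_discarded = sum(discarded_per_rule.values())
sum_classes = sum(classes_per_rule.values())

if __name__ == "__main__":
    # Discarded solutions add up exactly to the reported total.
    print(sum_discarded, total_discarded)
    # Class counts add up to far more than the number of distinct classes,
    # so many classes are shared between rules.
    print(sum_classes, total_classes)
```

The per-rule discard counts add up exactly to 1460552, which also equals the drop from 7853368 to 6392816 solutions. The per-rule class counts, in contrast, add up to 24490, well above the 8853 distinct classes, which is what one expects when the class sets overlap.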
5 Conclusion
Studying algebraic structures is an important area of research in mathematics with applications in many areas of computer science. However, generating these structures is computationally expensive because of the overwhelmingly large number of symmetries present in them. In this paper, we presented a mining-based approach to discover symmetry-breaking constraints in algebraic structures. We demonstrated the effectiveness of our approach by applying it to the enumeration of IP loops. We found new symmetry-breaking constraints that resulted in a significant reduction in the number of redundant solutions, thereby reducing the computational time needed to generate these structures.
To the best of our knowledge, this is the first time a mining-based approach has been applied to discover symmetry-breaking constraints. This work can be extended in multiple directions. A similar approach can be applied to other algebraic structures such as C loops and flexible loops. Applying other mining approaches, such as clustering and classification, needs further investigation.
Acknowledgments
This work was supported by Prince Mohammad Bin Fahd University (PMU) internal research grant. The views and conclusions herein are those of the authors and do not represent the official policies of the university.
References
1. Khan, M.A., Mohammad, N., Muhammad, S., Ali, A.: A mining based approach for efficient enumeration of algebraic structures. In: IEEE International Conference on Data Science and Advanced Analytics (DSAA) (2015)
2. McKay, B.D., Meynert, A., Myrvold, W.: Small latin squares, quasigroups, and loops. J. Comb. Des. 15, 98–119 (2007)
3. Battey, M., Parakh, A.: An efficient quasigroup block cipher. Wirel. Pers. Commun. 73(1), 63–76 (2013)
4. Krapez, A.: An application of quasigroups in cryptology. Math. Maced 8, 47–52 (2010)
5. Gent, I.P., Barbara, S.: Symmetry Breaking During Search in Constraint Programming. University of Leeds, School of Computer Studies, Leeds (1999)
6. Gent, I.P., Harvey, W., Kelsey, T.: Groups and Constraints: Symmetry Breaking During Search. Principles and Practice of Constraint Programming - CP 2002. Springer, Berlin (2002)
7. Ali, A., Slaney, J.: Counting loops with the inverse property. Quasigroups Relat. Syst. 16, 13 (2008)
8. McKay, B.D.: Practical graph isomorphism. Congr. Numer. 30, 3587 (1981)
9. Albert, A.A.: Quasigroups. II. Trans. Am. Math. Soc. 55, 401–409 (1944)
10. Bammel, S.E., Rothstein, J.: The number of 9 \(\times \) 9 latin squares. Discret. Math. 11, 83–95 (1975)
11. Brant, L.J., Mullen, G.L.: A note on isomorphism classes of reduced latin squares of order 7. Util. Math. 27, 261–263 (1985)
12. Brown, J.W.: Enumeration of latin squares with application to order 8. J. Comb. Theory 5, 177–184 (1972)
13. Cayley, A.: On latin squares. Oxf. Camb. Dublin Messenger Math. 19, 85–239 (1890)
14. Euler, L.: Recherches sur une nouvelle espèce de quarrés magiques. Verhandelingen/uitgegeven door het zeeuwsch Genootschap der Wetenschappen te Vlissingen 9, 85–239 (1782)
15. Fisher, R.A., Yates, F.: The 6 \(\times \) 6 latin squares. Proc. Camb. Philos. Soc. 30, 492–507 (1934)
16. Frolov, M.: Sur les permutations carrées. J. Math. Spéc IV, 8–11 (1890)
17. Jacob, S.M.: The enumeration of the latin rectangle of depth three by means of a formula of reduction, with other theorems relating to non-clashing substitutions and latin squares. Proc. Lond. Math. Soc. 31, 329–354 (1930)
18. MacMahon, P.A.: Combinatory Analysis, vol. 1. Cambridge University Press, Cambridge (1915)
19. McKay, B.D., Rogoyski, E.: Latin squares of order 10. Electron. J. Combin. 2, N3 (1995)
20. McKay, B.D., Wanless, I.M.: On the number of latin squares. Ann. Combin. 9, 335–344 (2005)
21. Norton, H.W.: The 7 \(\times \) 7 squares. Ann. Eugenics 9, 269–307 (1939)
22. Sade, A.: An omission in Norton's list of 7 \(\times \) 7 squares. Ann. Math. Stat. 22, 306–307 (1951)
23. Sade, A.: Morphismes de quasigroupes: Tables. Revista da Faculdade de Ciências de Lisboa, 2: A, Ciências Matemáticas, 13, 149–172 (1970/71)
24. Saxena, P.N.: A simplified method of enumerating latin squares by MacMahon's differential operators; II. The 7 \(\times \) 7 latin squares. J. Indian Soc. Agric. Stat. 3, 24–79 (1951)
25. Schönhardt, E.: Über lateinische Quadrate und Unionen. J. Reine Angew. Math. 163, 183–230 (1930)
26. Wells, M.B.: The number of latin squares of order eight. J. Comb. Theory 3, 98–99 (1967)
27. Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proceedings of 20th International Conference on Very Large Data Bases, VLDB 1215, 487–499 (1994)
28. Zaki, M.J.: Scalable algorithms for association mining. IEEE Trans. Knowl. Data Eng. 12(3), 372–390 (2000)
29. R: A Language and Environment for Statistical Computing. http://www.R-project.org