Non-dominated sorting based multi-page photo collage

The development of social networking services (SNSs) has brought a surge in image sharing. Multi-page photo collage (MPC), a sharing mode that posts several image collages at a time, is common on many social network platforms because it lets users upload many images and arrange them in a logical order. This study focuses on the construction of MPC for an image collection and formulates it as a joint optimization problem, which involves not only the arrangement within a single collage but also the arrangement among different collages. Novel balance-aware measurements, which merge graphic features and findings from psychology, are introduced. The non-dominated sorting genetic algorithm is adopted to optimize the MPC guided by the measurements. Experiments demonstrate that the proposed method can lead to diverse, visually pleasant, and logically clear MPC results, which are comparable to manually designed MPC results.


Introduction
The popularity of digital cameras and mobile devices has made it easy for people to take photos and record their lives. However, the limit on the number of images per post (for example, 10 for Facebook and 9 for WeChat) makes it difficult to share a large image collection on social networks at once. Moreover, viewing a set of images one by one is time consuming and makes it hard for the viewer to grasp the main information of the images. To address these problems, many users post their images in a multi-page photo collage (MPC) mode: the images are arranged into multiple collages with simple templates, and the collages are posted to social media as shown in Fig. 1. In this way, photos can be uploaded within the limit, and viewers can easily obtain information regarding the image collection.
Collages generally tile images along the horizontal and vertical directions, which enlarges the collage capacity effectively while keeping a simple and straightforward organizational format, as shown in Fig. 1. However, ensuring the visually pleasing appearance and effective information transmission of collages in a disorganized arrangement is difficult, and designing an MPC is time and energy consuming for non-professional users. Image collage methods are carefully designed to manage and explore the visualization of large image collections on a canvas [1][2][3][4]. These methods focus on information aggregation and presentation to help users analyze image collections. The layout of the collage is usually generated automatically, and each sub-image is mapped to a part of the canvas by assigning a position coordinate, a rotation angle, and a scale factor. However, these methods allocate all images onto one canvas, and the layout varies from collage to collage. Therefore, the final collage is complicated, with a large number of sub-modules and severe variation in sub-module size. Moreover, collaging all images on one canvas may lead to an information explosion for social media sharing, and the frequent changing of collage layouts makes it harder to acquire information across different collages.
Album design is a visualization form that merges image assignment and collage. These methods assign a certain number of images to each page and provide a configuration that decides the image appearance. However, existing album page designs tend to contain blank spaces and rotated images, which decrease the delivery effectiveness and legibility of information. These techniques therefore hardly work well on the MPC, which is the application scenario this paper focuses on. The largest challenge of the MPC is to determine the deep relationship between the entire collage choreography and user experience. Psychologists have explored the inner connection between the characteristics of image elements and the visual experience of users. They revealed that people perceive elements gathered together as a whole rather than as individuals [5]. Meaningful conclusions have been drawn, e.g., the visual appreciation of a picture largely depends on the perceptual balance of its elements [6], and a positive relationship exists between balance and liking for a multi-element picture [7]. Most psychological literature refers to the assessment of preference for balance (APB) [8] and deviation of the center of mass (DCM) [6] to measure balance. However, these measures cannot provide sufficient guidance for MPC because they consider neither the high-level features of each single page nor the coordination of images among multiple pages, both of which are important for evaluating whether an MPC achieves balance perception.
The visual data of social media are large in scale, differ considerably from user to user, and are updated at high speed. Therefore, the page number of an MPC cannot be excessive, and the layout of each page cannot be overly complicated. The task is split into the following three issues: assignment of images to separate pages, arrangement of the images within a given page, and resolution of conflicts between the internal optimization of a page and the optimization between pages.
Novel objective functions based on information gathering, which measure the compactness degree of image information within a page and the dispersion degree among pages, are proposed to solve the first issue. The engaged measure is formulated by refining the high-level semantic information of images and combining the image features of color, content, and object size. A novel uniqueness function is proposed to solve the second issue, and the collage per page is established in a balanced form at the feature level. As a process of multi-objective optimization, a non-dominated sorting genetic algorithm-II (NSGA-II) [9] is employed to solve the third issue, and the measurements of single and multiple collage pages are treated as equally important. A genetic algorithm is utilized to find a solution set that effectively balances all optimization objectives. Overall, the major contributions of this work are as follows:
• The formulation of MPC, which is a new type of multi-image joint presentation task and a new form of photo sharing on social media.
• Novel image scatter measurements based on information aggregation, which formulate not only the evaluation within a collage page but also among multiple collage pages, are proposed.
• A new form of image balance at the image feature level and a uniqueness function are proposed, which can lead to visually pleasing image arrangements on a page.
• An optimization strategy based on NSGA-II, which jointly applies the two aforementioned measures to construct an image distribution with a clear information outline, is conducted.

Image collage
Image collage is a traditional technique for multi-image presentation, which lays multiple images on one canvas. Highlighting salient information and speeding up browsing of an image set are the common core tasks. Tan et al. [10] arranged images globally by graph formulation and refined the final result by Voronoi tessellation. Cao et al. [11] focused on stylized comic layout generation through user-specified semantics and an artwork dataset. They further utilized a probabilistic graphical model to synthesize comic elements, which aims to guide the attention of the reader [12]. Chen et al. [13] leveraged image collage techniques to visualize video summarization in a static way. Yu et al. [1] treated the photo collage as a circle packing issue to show the salient areas of each image. Jing et al. [14] extracted video key frames and arranged them in a manga-style layout based on salience. Han et al. [15] collaged images into different interesting shapes via a tree-based approach. Liu et al. [2] generated compact collages by splitting the canvas into irregular partitions. Liang et al. [3] proposed sample-based image arrangement and Voronoi tree map-based layout generation methods. Wu and Aizawa [16] generated image correlation preserved collages by binary tree-based page analysis. Zheng et al. [17] used a deep generative model for sketching with user semantics information input. Pan et al. [4] visualized image collection summarization and generated the layout via tree-based image content analysis. Gan et al. [18] managed albums in a comic-like layout based on image classification. These collage methods can have an outstanding effect on image collection information arrangement. However, placing all images on one canvas leads to a final result with a complex layout, which limits its application in daily life. With the popularity of mobile devices, some works focus on application scenarios. Kong et al. [19] proposed a strategy to collage phone images into a centroidal Voronoi diagram. Song et al. [20] introduced a balance-based photo-posting strategy, which is tailored for the social media scenario.
Overall, collage research is trending toward methods designed for real application scenarios such as mobile devices and social platforms. This paper focuses on the task of MPC design, which is a comprehensive exploration of collage presentation as well as a real form of daily image sharing with realistic application scenarios.

Multi-objective optimization
Multi-objective optimization (MOO) arises in many research areas. Multi-angle analysis of an issue often introduces multiple optimization objectives that conflict with each other, so optimizing all objectives simultaneously is difficult. Instead, one seeks Pareto-optimal solutions, for which no objective can be improved without degrading at least one of the others. Many multi-objective evolutionary algorithms are used to search for approximations of the Pareto-optimal solutions and have been actively applied to several areas [9,21,22]. Li and Zhong [23] generated a realistic crowd model by considering both static and dynamic features. Lee et al. [24] modeled free-form surfaces by optimizing multiple objectives of design goal and cost efficiency to reduce the usage of curved panels. Rejeesh and Thejaswini [25] proposed a denoising algorithm based on optimal trilateral filtering. Su and Yin [26] solved the super-resolution problem of a single image by merging the MOO strategy into GAN training. This paper employs one such multi-objective optimization method, the non-dominated sorting genetic algorithm-II (NSGA-II), to search for solutions. The operators of the genetic algorithm are utilized to generate the solutions, and non-dominated sorting is utilized to select the excellent solutions.

Measurement formulation
This paper proposes a new formulation tailored for MPC, considering the demand for a clear information outline and a visually pleasing experience. Numerous studies have worked on image quality [27][28][29][30] or aesthetics assessment [31][32][33]. However, all of these methods are designed for single images, and a measurement for MPC is still lacking. The MPC is decomposed into a two-step task, namely image scatter to pages and image arrangement per page, as shown in Fig. 2. Specific measures, which jointly contribute to the final generation of collages, have been designed for each step.

Feature characterization
Three kinds of feature items are used in this paper: color, content, and size. Color sets the tone of images and is characterized as dominant colors in RGB space. Each image is described by three dominant color descriptors, represented by the symbol $\mathrm{fea}_{col}$. Content, which is the core information carrier of an image, is characterized as the last-layer output of VGG-16 [34] pre-trained on ImageNet. The content item is represented by the symbol $\mathrm{fea}_{con}$. Size is an important descriptor for image objects; it is characterized by a computed saliency map and represented by the symbol $\mathrm{fea}_{size}$. Overall, the three characterizations ($\mathrm{fea}_{col}$, $\mathrm{fea}_{con}$, $\mathrm{fea}_{size}$) describe an image from different angles and are treated as the image information in the following calculations.

Image scatter to pages
Image scattering is the chief task for a photo collage, and the information difference is the key factor. Gathering similar images into a page can illustrate a clear information outline among pages, which can ease viewer browsing, especially for the image set with serious information variation. The scattering quality is measured by determining the similarity and dissimilarity degrees of images on a page and those between pages, respectively.

Measurement in a page
The inner measurement of pages aims to determine whether the images within a page have similar information. The value is calculated by feature subtraction among images, which is implemented per feature in Eqs. (1)-(3), where $P$ is the set of images in a page; $\mathrm{Mea}_{in}^{col}$, $\mathrm{Mea}_{in}^{con}$, and $\mathrm{Mea}_{in}^{size}$ are the page inner measurements of color, content, and size, respectively; $\mathrm{fea}_{col}(I_m)$, $\mathrm{fea}_{con}(I_m)$, and $\mathrm{fea}_{size}(I_m)$ are the feature values of the color, content, and size of image $I_m$, respectively. Overall, the entire page inner measurement can be written as follows:

$$\mathrm{Mea}_{in} = \mathrm{Mea}_{in}^{col} + \mathrm{Mea}_{in}^{con} + \mathrm{Mea}_{in}^{size} \qquad (4)$$
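The in-page measurement can be illustrated with a minimal sketch. It assumes that "feature subtraction among images" means an L1 difference summed over every unordered pair of images on the page; the norm, any normalization, and the names `l1_difference` and `inner_measure` are illustrative assumptions, not the paper's definitions.

```python
from itertools import combinations

FEATURE_KINDS = ("col", "con", "size")  # color, content, size


def l1_difference(u, v):
    """Sum of absolute component differences (an assumed choice of norm)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def inner_measure(page, feats):
    """Sum the per-feature differences over all image pairs on one page.

    page:  list of image ids placed on the page (the set P).
    feats: dict mapping image id -> {"col": [...], "con": [...], "size": [...]}.
    """
    total = 0
    for kind in FEATURE_KINDS:  # Mea_in^col + Mea_in^con + Mea_in^size
        for m, n in combinations(page, 2):
            total += l1_difference(feats[m][kind], feats[n][kind])
    return total


# Toy feature vectors, purely illustrative.
toy_feats = {
    "a": {"col": [0, 0, 0], "con": [1, 0], "size": [2]},
    "b": {"col": [1, 0, 0], "con": [1, 1], "size": [4]},
    "c": {"col": [0, 2, 0], "con": [0, 0], "size": [2]},
}
print(inner_measure(["a", "b"], toy_feats))
```

Under these assumptions, a page whose images are similar yields a small value, and a page holding a single image yields zero.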

Measurement between pages
The outer measurement of pages aims to determine the image difference among pages. The feature differences of all pairs of pictures that are not on the same page are summed as the page outer measurement, and the implementation details are given per feature in Eqs. (5)-(7), where $S$ is the set of all images, and $P$ is the set of images in a page; $\mathrm{Mea}_{out}^{col}$, $\mathrm{Mea}_{out}^{con}$, and $\mathrm{Mea}_{out}^{size}$ are the respective measurements of color, content, and size between pages; $\mathrm{fea}_{col}(I_m)$, $\mathrm{fea}_{con}(I_m)$, and $\mathrm{fea}_{size}(I_m)$ are the feature values of the color, content, and size of image $I_m$, respectively. Overall, the entire measurement between pages can be written as follows:

$$\mathrm{Mea}_{out} = \mathrm{Mea}_{out}^{col} + \mathrm{Mea}_{out}^{con} + \mathrm{Mea}_{out}^{size} \qquad (8)$$

The final objective functions can then be written in terms of $\mathrm{Mea}_{in}$ and $\mathrm{Mea}_{out}$, which is an issue of multi-objective optimization, where $M$ is the collage page number.
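A small sketch of how the two objective values might be evaluated for one candidate MPC follows. It assumes that the in-page term is summed over all M pages and is to be minimized while the between-page term is to be maximized, as the stated goal of grouping similar images suggests. For brevity each image carries one concatenated feature vector, and `diff` and `objectives` are hypothetical names.

```python
from itertools import combinations


def diff(u, v):
    """L1 difference between two feature vectors (an assumed choice of norm)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def objectives(pages, feats):
    """Return (total in-page difference, total between-page difference).

    pages: list of pages, each a list of image ids (the M pages of one MPC).
    feats: dict mapping image id -> one concatenated feature vector.
    """
    page_of = {img: k for k, page in enumerate(pages) for img in page}
    inner = outer = 0
    for m, n in combinations(page_of, 2):  # every unordered pair in S
        if page_of[m] == page_of[n]:
            inner += diff(feats[m], feats[n])  # same page: should be small
        else:
            outer += diff(feats[m], feats[n])  # different pages: should be large
    return inner, outer


toy = {"a": [0, 0], "b": [1, 0], "c": [5, 5]}
good = objectives([["a", "b"], ["c"]], toy)   # similar images share a page
bad = objectives([["a"], ["b", "c"]], toy)    # dissimilar images share a page
print(good, bad)
```

In this toy case the first grouping is better on both objectives at once; in general the two objectives conflict, which is why a multi-objective search is needed.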

Image arrangement per page
The image arrangement for each page is the final decision step once the page scattering of images is decided. Researchers have found that users prefer images with "balanced" characteristics [6,35]. Thus, a balance formulation at the image level and a uniqueness mechanism are proposed to guide the image arrangement per page. The most unique image is generally laid on the most evident location. The uniqueness mechanism describes the degree of difference between one image and the other images on a page, and it is calculated from the feature differences between that image and the others. Thus, each image on a determined page owns a uniqueness value, and the unique image owns the largest value. In this paper, the unique image is placed at the page center location (if the image number is even, this step is skipped because no center location exists). Then, the similarity between the remaining images is calculated. The two images that are closer to each other than to any other image on the page are set as a pair. Given the unique image and the image pairs, the images on a page are arranged as follows. First, the unique image is put in the center. Second, the two images of each pair are placed at symmetric positions on the page. Therefore, the image arrangement on a page naturally conveys clear information, and the image-level balance can lead to a visually pleasing result.
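The arrangement rule above can be sketched as follows. The sketch assumes a single row of slots, an L1 feature difference, greedy pairing of the closest remaining images, and that the closest pair is placed nearest the center; none of these details are fixed by the text, and all function names are hypothetical.

```python
from itertools import combinations


def diff(u, v):
    """L1 difference between two feature vectors (assumed norm)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def uniqueness(page, feats):
    """Uniqueness of each image: total difference to the other images on the page."""
    return {i: sum(diff(feats[i], feats[j]) for j in page if j != i) for i in page}


def arrange_page(page, feats):
    """Return the images ordered by slot, with pairs at mirrored positions."""
    n = len(page)
    slots = [None] * n
    remaining = list(page)
    if n % 2 == 1:  # a center slot exists only for an odd image count
        scores = uniqueness(page, feats)
        unique = max(page, key=lambda i: scores[i])
        slots[n // 2] = unique
        remaining.remove(unique)
    left = n // 2 - 1  # fill mirrored slots from the center outward
    while remaining:
        a, b = min(combinations(remaining, 2),
                   key=lambda p: diff(feats[p[0]], feats[p[1]]))
        slots[left], slots[n - 1 - left] = a, b
        remaining.remove(a)
        remaining.remove(b)
        left -= 1
    return slots


toy = {"a": [0], "b": [1], "c": [10], "d": [12], "e": [50]}
layout = arrange_page(["a", "b", "c", "d", "e"], toy)
print(layout)
```

Here image `e` differs most from the rest and takes the center, while the pairs (`a`, `b`) and (`c`, `d`) occupy mirrored slots around it.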

Approach
The approach design faces two challenges. The first challenge is the solution space design. A solution must correspond to an MPC result, of which the page number, the image scatter result, and the image arrangement must be accurately expressed. The second challenge is the guidance design, which helps search for the final result in the solution space. For the first challenge, the MPC is represented by introducing predefined templates. The second challenge is half solved by the measurements in Section 3, which can be used as the guidance. However, two objective functions are involved, and handling these two coupled yet conflicting objectives is the core difficulty. The NSGA-II, which can directly leverage the solution representation form and searches for solutions in which one objective cannot be improved without degrading the other, is adopted for this challenge. For clarity, this section discusses the template definition and NSGA-II.

Image set characterization and template definition
This paper characterizes the image set first and equips it with an MPC template to achieve the detailed distribution of the image set. Each image set is characterized as an image sequence. The template is set in accordance with the total number of images in the image set; that is, the numbers of images on the individual pages of a template must sum to the size of the image set. Given an image set, every possible configuration (how many pages and how many images per page) is seen as a template, and all templates form the template space. Specifically, the template mentioned herein is sequential. That is, as shown in Fig. 3, the results generated from the same image sequence but different templates, namely 3 × 3 × 2 and 3 × 2 × 3, will be different. Notably, each page is a full-line collage (learned from the existing multi-page form in Fig. 1), and all templates are designed under the assumption that upper limits exist for the total page number and the image number per page. Taking the image set containing eight images as an example, 47 templates in total exist under a page number upper limit of 4 and an upper limit of 4 images per page. An MPC is fully specified when the image set is sequentially characterized and a template is given. The final characterization of the MPC is a sequence that starts with the template index, followed by the image set characterization, as shown in Fig. 3.
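Because a template is an ordered list of page sizes that sums to the image count, the template space can be enumerated as integer compositions. The sketch below is one way to do this; `templates` and `split_by_template` are hypothetical helper names, and the optional per-page cap is an assumed parameter.

```python
def templates(n_images, n_pages, max_per_page=None):
    """Yield every ordered tuple of page sizes with n_pages pages summing to n_images."""
    cap = n_images if max_per_page is None else max_per_page
    if n_pages == 1:
        if 1 <= n_images <= cap:
            yield (n_images,)
        return
    # Leave at least one image for each of the remaining pages.
    for first in range(1, min(cap, n_images - (n_pages - 1)) + 1):
        for rest in templates(n_images - first, n_pages - 1, max_per_page):
            yield (first,) + rest


def split_by_template(sequence, template):
    """Cut an image sequence into consecutive pages of the given sizes."""
    pages, start = [], 0
    for size in template:
        pages.append(list(sequence[start:start + size]))
        start += size
    return pages


images = list(range(8))
two_page = list(templates(8, 2))
three_page = list(templates(8, 3))
print(len(two_page), len(three_page))
print(split_by_template(images, (3, 3, 2)))
print(split_by_template(images, (3, 2, 3)))
```

With no per-page cap, eight images give 7 two-page and 21 three-page templates. The last two lines show that the same image sequence yields different pages under 3 × 3 × 2 and 3 × 2 × 3, which is why the template is treated as sequential.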

Non-dominated sorting genetic algorithm-II
As previously mentioned, the MPC is an issue of multi-objective optimization, which is difficult to solve directly by common methods. This paper employs the NSGA-II to search the solution space. In this algorithm, the operators of the genetic algorithm generate candidate solutions, and the non-dominated sorting mechanism selects excellent solutions. Notably, the search space includes two parts: the template space and the image space. The MPC can be represented according to Section 4.1, in which the sequential characterization naturally matches the form of a gene sequence. The measurement can be calculated in accordance with Section 3, and its value naturally serves as the fitness value.

Genetic operators
Gene mutation and gene segment exchange can effectively search the solution space over successive rounds. This paper presents two gene types, namely template and image genes, and three genetic operators: template gene mutation, image gene exchange, and image segment exchange, as shown in Fig. 4.
Template-gene-mutation. The template gene mutation is an operation that tests additional templates on the identical image arrangement. The template gene mutates into another template index with a given probability.
Image-gene-exchange. Because all images in an image set must remain distinct, image gene exchange is adopted instead of image gene mutation. Two image genes exchange their locations with a given probability.
Image-segment-exchange. For efficiency, the single-image exchange is extended to an exchange of image segments. Two (or three) image gene segments exchange their locations with a given probability.
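The three operators can be sketched on a chromosome of the form `[template_index, image_0, image_1, ...]`. The probabilities, the segment length, and the function names below are illustrative assumptions; only the two-segment variant of the segment exchange is shown.

```python
import random


def mutate_template(chrom, n_templates, rng, p=0.1):
    """With probability p, replace the template gene by a different template index."""
    child = list(chrom)
    if n_templates > 1 and rng.random() < p:
        others = [t for t in range(n_templates) if t != child[0]]
        child[0] = rng.choice(others)
    return child


def exchange_images(chrom, rng, p=0.5):
    """With probability p, swap two image genes (keeps every image exactly once)."""
    child = list(chrom)
    if len(child) > 2 and rng.random() < p:
        i, j = rng.sample(range(1, len(child)), 2)
        child[i], child[j] = child[j], child[i]
    return child


def exchange_segments(chrom, rng, seg_len=2, p=0.5):
    """With probability p, swap two non-overlapping image segments of equal length."""
    child = list(chrom)
    n = len(child) - 1  # number of image genes, stored at indices 1..n
    if n >= 2 * seg_len and rng.random() < p:
        i = rng.randrange(1, n - 2 * seg_len + 2)        # start of first segment
        j = rng.randrange(i + seg_len, n - seg_len + 2)  # start of second segment
        child[i:i + seg_len], child[j:j + seg_len] = (
            child[j:j + seg_len], child[i:i + seg_len])
    return child


rng = random.Random(0)
parent = [5, 0, 1, 2, 3, 4, 5, 6, 7]  # template index 5, then eight image genes
c1 = mutate_template(parent, n_templates=47, rng=rng, p=1.0)
c2 = exchange_images(parent, rng, p=1.0)
c3 = exchange_segments(parent, rng, seg_len=2, p=1.0)
print(c1, c2, c3)
```

Both exchange operators only permute the image genes, so every child still contains each image exactly once, which is the reason given above for preferring exchange over mutation on image genes.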

Non-dominated sorting
The evolution of genetic algorithms is generally based on a single determined fitness value. Directly selecting a generation by one such value is impractical here because different objective functions have different optimization directions. The non-dominated sorting strategy is therefore adopted to select among the generated solutions [22]. Non-dominated sorting stratifies the solution space into levels by comparing solutions with each other. With every objective expressed so that smaller is better, if all the objective function values of a solution a are no more than the corresponding values of a solution b, and at least one of them is strictly smaller, then solution a dominates solution b. A non-dominated solution is one that no other solution dominates; such a solution is comprehensive in that it performs well across all objectives.
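A minimal sketch of dominance and front extraction follows, assuming every objective is written in minimization form (for instance, the between-page measurement could be negated). It uses a simple repeated-filtering scheme rather than the bookkeeping-based fast non-dominated sort of NSGA-II, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated_fronts(points):
    """Group solution indices into successive non-dominated fronts."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        # A solution joins the current front if nothing still remaining dominates it.
        front = [i for i in remaining
                 if not any(dominates(points[j], points[i])
                            for j in remaining if j != i)]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


# Objective vectors (both minimized), purely illustrative.
population = [(1, 5), (2, 2), (3, 1), (3, 3), (4, 4)]
fronts = non_dominated_fronts(population)
print(fronts)
```

The first front holds the mutually non-dominated trade-offs; later fronts contain solutions dominated by at least one member of an earlier front, so selection can favor lower-numbered fronts.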

Experiments
First, the previously proposed measurement is analyzed to discuss the effects of the coefficients on the result. Then, three experiments are designed to examine the performance of the proposed method in different settings. A user study is also conducted to examine the differences between the results of the proposed method and those of human designers. Each experiment has two results: image scatter to pages and image arrangement per page. An image collection with eight images is used for the convenience of illustration. The image collection is randomly shuffled 10 times, and each shuffle result is equipped with a candidate template. This yields the initial image sequences for the experiments.

Coefficient analysis
Three items are involved in the definition of MPC measurement, namely content, size, and color, which are given equal importance. Experiments with item bias, which is unfolded into ablation study and enhancement experiments, are conducted for further insights.

Ablation study
The ablation study comprises two steps: measurement with a single item and measurement with two items. Number sequences are used to represent the item coefficients in the order content, size, and color; for example, (1 0 0) means that only the content item is kept, with a coefficient of 1. Figure 5 shows example collages, which become increasingly visually pleasing as the number of iterations rises.
Single item. Only one item is left for the final measurement in this experiment. The measurement values are recorded in Table 1, and the outliers are marked in red. First, the measurement varies only slightly, and numerous identical values appear when only one item exists. Second, the collages of the final iterations have many contradictory values. Take template 5 × 2 × 1 for example. The Mea out value is almost unchanged in (0 0 1) mode, and the final value at r = 5 is smaller than that at r = 2. Therefore, a single item can hardly describe the collages; exceptional values and insensitivity of the values are common.
Two items. Only two items are left for the final measurement in this experiment. The measurement values are recorded in Table 2, and the outliers are marked in red. The results show that unchanged and contradictory values still exist but to a lesser extent. With two items, template 5 × 2 × 1 can almost provide a reasonable description, which may be because of the difference in images. The item omission for template 2 × 2 × 2 × 2 still has serious effects.

Enhancement experiment
All items are retained in this experiment, but one of them is emphasized in the final measurement. One item is chosen as the enhanced item and given a higher coefficient. The measurement values are recorded in Table 3, and outliers are also marked in red. The results reveal that template 5 × 2 × 1 is seriously affected by the enhancement, whereas template 2 × 2 × 2 × 2 can maintain a normal level of collage description.

Fig. 5 Pipeline of template-fixed collage generation. The fixed template is sketched out with red boxes, and the unique image is labeled with a red star. As the number of iterations (r) increases, the information in the collages gradually becomes clear and tidy, which leads to a visually easy and pleasant experience.

Template fixed MPC generation
The template is the direct factor that affects collage generation. Therefore, the generation result of the proposed method is first examined under fixed templates. Experiments are performed on the image collection by using a uniform template (2 × 2 × 2 × 2) and a non-uniform template (5 × 2 × 1). Figure 5 shows the template sketch and the process of collage generation. Starting with a disorganized image scatter initialization (r = 0), the proposed method reassigns the images into pages in an information-tidy form. For example, for template (2 × 2 × 2 × 2), images with similar color information are already close at r = 2, and the entire image scatter result converges at r = 5. The final collage result must arrange the images in each page based on the scatter result. For template (2 × 2 × 2 × 2), every page holds two images, so the scatter result can serve directly as the final collage result. For template (5 × 2 × 1), the page with five images needs arrangement. Figure 6 illustrates the uniqueness value distribution, in which unique images and image pairs are respectively labeled by a red star and color boxes.
The final result in Fig. 5 is easy to read due to its clear information gathering effect, thus leading to a visually easy and pleasant experience. However, the generation of collages by a definite template is inflexible. Thus, the proposed method is further evaluated by presetting the collage page numbers.

Page number fixed MPC generation
Page number affects collage generation by indirectly limiting the template types. An exact page number setting generally corresponds to a template set of limited size. For example, a collection of eight images admits four template types, namely (1 × 7), (2 × 6), (3 × 5), and (4 × 4), under a fixed page number of 2. The experiments on two image collections (collection a and collection b) are conducted under three different page number settings (2, 3, and 4).

Two-page collage generation
The two-page collage has seven individual templates: (1 × 7), (7 × 1), (2 × 6), (6 × 2), (3 × 5), (5 × 3), (4 × 4). Intuitively, a collage page with a large image volume is visually unfriendly to read, not only because of the amount of information itself but also because of the variance in image volume among pages. Figure 7 shows the pipeline of two-page collage generation. With a rough scatter initialization (r = 0), the proposed method searches for solutions in the image and template spaces. The results vary during the optimization process, and the final scatter result converges within a limited number of iterations, that is, r = 10 for collection a and r = 2 for collection b. The page volume of the final template is 3 and 4, which is an ideal configuration obtained without any pre-design. Based on the image scatter result, images on a page are arranged as in Section 5.2. The final collage result is at the rightmost in Fig. 7, which is clear and visually pleasing.

Three-page collage generation
The total template number of a three-page collage is 21. Figure 8 shows the pipeline of three-page collage generation. Compared with the configuration of two pages, the setting of three provides additional space to information gathering among pages. The final result of three pages demonstrates information aggregation and is easy to read whether for collection a or collection b.

Four-page collage generation
The four-page configuration has a total of 20 templates. Figure 9 shows the pipeline of four-page collage generation, and the result is optimized step by step with iteration. Sections 5.3.1 and 5.3.2 reveal that the information becomes specific per page as the page number increases. However, this has the opposite effect when the page number is 4. Figure 9 shows that collection a must split two similar images into different pages to satisfy the page setting, and collection b must let two similar images become separate pages. Increasing the page number blindly is an ineffective means to improve the final quality. Thus, the proposed method is further evaluated by removing the page preset, and the solution is searched among all templates.

Fig. 7 Pipeline of two-page collage generation. The entire generation follows the timeline, yellow blocks represent the iteration number, and collages nearby are the corresponding image scatter results. The image scatter result is surrounded by blue boxes. Then, the images in each page are rearranged in accordance with the mechanism of uniqueness value computation and image similarity matching. The final collage result is labeled by a red box.

MPC generation without fixed condition
The template space is built under the assumption that the page number is no more than four to avoid an excessive number of templates for iteration. The size of the entire space is 47, which includes the two- and three-page templates. Figure 10 compares the results generated by the proposed method (after 20 iterations) with random initialization and human results. The findings reveal that collages generated by the proposed method outperform random collages and are comparable to manual results.

User study
A total of 30 participants with different backgrounds (15 males and 15 females) were invited. Twenty new image collections, which were released on the website according to the designed arrangement, were obtained. Participants were shown three different results for each image set: 1) generated randomly, 2) generated by the proposed method, and 3) arranged by photographers. The participants were then asked to choose one from these results. Table 4 shows that the collages generated by the proposed method are superior to the random result and have votes close to the photographer result.

Limitation and discussion
There are two main limitations of our work. One is that the image number of each template is fixed (8 in our experiments); thus it is hard to deal with image sets that have more or fewer images, which makes the method less versatile. The other limitation is that our template form is fixed, so it cannot collage the images flexibly according to their content. In the future, we will take advantage of template generation methods like Ref. [4] to enhance the generality of the algorithm. Figure 11 illustrates a failure case of our method. We can observe that our method prefers to scatter images that contain different visual information to individual pages (6 pages in Fig. 11(a)). However, images with different visual cues may contain similar scene semantics and should be scattered to one page (see Fig. 11(b)), so that the whole MPC can be briefer and clearer. Moreover, Fig. 12 shows a special case. The result generated by our method is different from the human result, while both results are visually pleasing and reasonable.

Conclusions
This study focuses on the issue of MPC generation, which is common in social networks. Compared with traditional collage studies, MPC must organize images into pages (scatter images into pages and arrange images per page) to generate visually pleasing results. Measurements based on information aggregation, which can guide image scattering, are proposed; color, content, and size of images are leveraged as basic items of image information. Furthermore, the measurements are optimized by searching for solutions with the non-dominated sorting genetic algorithm-II (NSGA-II). A balance strategy at the image level and a uniqueness value computation mechanism are proposed to arrange images within a page. Experiments show that the proposed measurements, balance strategy, and uniqueness mechanism can lead to clear information and visually pleasing results. The results generated by the proposed method are compared with those of photographers. Merging different kinds of templates into MPC in a subtle manner will be considered in the future.