Measuring our Universe from Galaxy Redshift Surveys

Galaxy redshift surveys have achieved significant progress over the last couple of decades. Those surveys tell us in the most straightforward way what our local Universe looks like. While the galaxy distribution traces the bright side of the Universe, detailed quantitative analyses of the data have even revealed the dark side of the Universe dominated by non-baryonic dark matter as well as more mysterious dark energy (or Einstein’s cosmological constant). We describe several methodologies of using galaxy redshift surveys as cosmological probes, and then summarize the recent results from the existing surveys. Finally we present our views on the future of redshift surveys in the era of precision cosmology.


Introduction
Nowadays the exploration of the Universe can be performed by a variety of observational probes and methods over a wide range of wavelengths: the temperature anisotropy map of the cosmic microwave background (CMB), the Hubble diagrams of nearby galaxies and distant Type Ia supernovae, wide-field photometric and spectroscopic surveys of galaxies, the power spectrum and abundances of galaxy clusters in optical and X-ray bands combined with the radio observation through the Sunyaev-Zel'dovich effect, deep surveys of galaxies in sub-mm, infrared, and optical bands, quasar surveys in radio and optical, strong and weak lensing of distant galaxies and quasars, high-energy cosmic rays, and so on. Undoubtedly gamma-rays, neutrinos, and gravitational radiation will join this already crowded list.
Among those, optical galaxy redshift surveys are the most classical. Indeed, one may say that modern observational cosmology started with a sort of galaxy redshift survey by Edwin Hubble. Galaxy redshift surveys remain of vital importance in cosmology in the 21st century for various reasons:

Redshift surveys have unprecedented quantity and quality: The numbers of galaxies and quasars in the spectroscopic sample of the Two Degree Field (2dF) survey are ∼ 250,000 and ∼ 30,000, and those of the ongoing Sloan Digital Sky Survey (SDSS) will reach ∼ 800,000 and ∼ 100,000 upon its completion. These unprecedented numbers of objects, as well as the homogeneous selection criteria, enable a precise statistical analysis of their distribution.
The Universe at z ≈ 1000 is well specified: The first-year WMAP (Wilkinson Microwave Anisotropy Probe) data [6] among others have established a set of cosmological parameters. This may be taken as the initial condition of the Universe from the point-of-view of the structure evolution toward z = 0. In a sense, the origin of the Universe at z ≈ 1000 and the evolution of the Universe after the epoch are now equally important, but they constitute well separable questions that particle and observational cosmologists focus on, respectively.

Gravitational growth of dark matter component is well understood:
In addition, extensive numerical simulations of structure formation in the Universe have significantly advanced our understanding of the gravitational evolution of the dark matter component in the standard gravitational instability picture. In fact, we even have very accurate and useful analytic formulae to describe the evolution deep in its nonlinear regime. Thus we can now directly address the evolution of visible objects from the analysis of their redshift surveys separately from the nonlinear growth of the underlying dark matter gravitational potentials.
Formation and evolution of galaxies: In the era of precision cosmology, the scientific goals of research using galaxy redshift surveys are gradually shifting from inferring a set of values of cosmological parameters using galaxies as probes to understanding the origin and evolution of the galaxy distribution, given a set of parameters accurately determined by other probes like the CMB and supernovae.
With the above in mind, we will attempt to summarize what we have learned so far from galaxy redshift surveys, and then describe what will be done with future data. The review is organized as follows. We first present a brief overview of the Friedmann model and gravitational instability theory in Section 2. Then we describe the non-Gaussian nature of density fluctuations generated by the nonlinear gravitational evolution of the primordial Gaussian field in Section 3. Next we discuss the spatial biasing of galaxies relative to the underlying dark matter distribution in Section 4. Our understanding of biasing is still far from complete, and its description is necessarily empirical and very approximate. Nevertheless this is one of the most important ingredients for proper interpretation of galaxy redshift surveys. Section 5 introduces general relativistic effects which become important especially for galaxies at high redshifts. We present the latest results from the two currently largest galaxy redshift surveys, 2dF (Two Degree Field) and SDSS (Sloan Digital Sky Survey), in Section 6. Finally, Section 7 is devoted to a summary of the present knowledge of our Universe and our personal view of the future direction of cosmological research in the new millennium.
hence to test for consistency with the underlying hypothesis. If the assumption of homogeneity turns out to be wrong, then there are numerous possibilities for inhomogeneous models, and each of them must be tested against the observations. For that purpose, one needs observational data with good quality and quantity extending up to high redshifts. Let us mention some of those:

CMB fluctuations
Ehlers, Geren, and Sachs [18] showed that by combining the CMB isotropy with the Copernican principle one can deduce homogeneity. More formally their theorem (based on the Liouville theorem) states that "If the fundamental observers in a dust space-time see an isotropic radiation field, then the space-time is locally given by the Friedmann-Robertson-Walker (FRW) metric". The COBE (COsmic Background Explorer) measurements of temperature fluctuations ($\Delta T/T = 10^{-5}$ on scales of $10^\circ$) give, via the Sachs-Wolfe effect ($\Delta T/T = \frac{1}{3}\Delta\phi/c^2$) and the Poisson equation, r.m.s. density fluctuations of $\delta\rho/\rho \sim 10^{-4}$ on $1000\,h^{-1}$ Mpc (see, e.g., [99]), which implies that the deviations from a smooth Universe are tiny.

Galaxy redshift surveys
The distribution of galaxies in local redshift surveys is highly clumpy, with the Supergalactic Plane seen in full glory. However, deeper surveys like 2dF and SDSS (see Section 6) show that the fluctuations decline as the length-scales increase. Peebles [69] has shown that the angular correlation functions for the Lick and APM (Automatic Plate Measuring) surveys scale with magnitude as expected in a Universe which approaches homogeneity on large scales. While redshift surveys can provide interesting estimates of the fluctuations on intermediate scales (see, e.g., [72]), the problems of biasing, evolution, and K-correction would limit the ability of those redshift surveys to 'prove' the cosmological principle. Despite these worries the measurement of the power spectrum of galaxies derived on the assumption of an underlying FRW metric shows good agreement with the Λ-CDM (cold dark matter) model.

Radio sources
Radio sources in surveys have a typical median redshift of z ∼ 1, and hence are useful probes of clustering at high redshift. Unfortunately, it is difficult to obtain distance information from these surveys: The radio luminosity function is very broad, and it is difficult to measure optical redshifts of distant radio sources. Earlier studies claimed that the distribution of radio sources supports the cosmological principle. However, the wide range in intrinsic luminosities of radio sources would dilute any clustering when projected on the sky. Recent analyses of new deep radio surveys suggest that radio sources are actually clustered at least as strongly as local optical galaxies. Nevertheless, on very large scales the distribution of radio sources seems nearly isotropic.

X-ray background
The X-ray background (XRB) is likely to be due to sources at high redshift. The XRB sources are probably located at redshift z < 5, making them convenient tracers of the mass distribution on scales intermediate between those in the CMB as probed by COBE, and those probed by optical and IRAS redshift surveys. The interpretation of the results depends somewhat on the nature of the X-ray sources and their evolution. By comparing the predicted multipoles to those observed by HEAO1, Scharf et al. [75] estimate the amplitude of fluctuations for an assumed shape of the density fluctuations. The observed fluctuations in the XRB are roughly as expected from interpolating between the local galaxy surveys and the COBE and other CMB experiments. The r.m.s. fluctuations $\delta\rho/\rho$ on a scale of $\sim 600\,h^{-1}$ Mpc are less than 0.2%.
Since the (generalized) cosmological principle is now well supported by the above observations, we shall assume below that it holds over scales $l > 100\,h^{-1}$ Mpc.
The rest of the current section is devoted to a brief review of the homogeneous and isotropic cosmological model. Further details may be easily found in standard cosmology textbooks [96,62,69,64,10,63].
The cosmological principle is mathematically paraphrased as the statement that the metric of the Universe (in its zero-th order approximation) is given by

$$ds^2 = -dt^2 + a(t)^2 \left[ \frac{dx^2}{1 - K x^2} + x^2\, d\theta^2 + x^2 \sin^2\theta\, d\phi^2 \right], \tag{1}$$

where $x$ is the comoving coordinate, and where we use units in which the light velocity $c = 1$. The above Robertson-Walker metric is specified by a constant $K$, the spatial curvature, and a function of time $a(t)$, the scale factor. The homogeneous and isotropic assumption also implies that $T_{\mu\nu}$, the energy-momentum tensor of the matter field, should take the form of the ideal fluid:

$$T_{\mu\nu} = (\rho + p)\, u_\mu u_\nu + p\, g_{\mu\nu}, \tag{2}$$

where $u^\mu$ is the 4-velocity of the matter, $\rho$ is the mean energy density, and $p$ is the mean pressure.

From the Einstein equation to the Friedmann equation
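The next task is to write down the Einstein equation,

$$R_{\mu\nu} - \frac{1}{2} R\, g_{\mu\nu} + \Lambda g_{\mu\nu} = 8\pi G\, T_{\mu\nu}, \tag{3}$$

using Equations (1) and (2). In this case one is left with the following two independent equations:

$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{K}{a^2} + \frac{\Lambda}{3}, \tag{4}$$

$$\frac{1}{a}\frac{d^2 a}{dt^2} = -\frac{4\pi G}{3}(\rho + 3p) + \frac{\Lambda}{3} \tag{5}$$

for the three independent functions $a(t)$, $\rho(t)$, and $p(t)$. Differentiating Equation (4) with respect to $t$ and eliminating $\ddot a$ with Equation (5) gives the energy conservation law

$$\dot\rho = -3\,\frac{\dot a}{a}\,(\rho + p), \tag{7}$$

so that Equation (5) may equally be regarded as a consequence of Equations (4) and (7).

In either case, however, one needs another independent equation to solve for $a(t)$. This is usually given by an equation of state of the form $p = p(\rho)$. In cosmology, the following simple relation is assumed:

$$p = w\rho. \tag{8}$$

While the value of $w$ may in principle change with redshift, it is often assumed for simplicity that $w$ is independent of time. Then substituting this equation of state into Equation (7) immediately yields

$$\rho \propto a^{-3(1+w)}. \tag{9}$$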
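The non-relativistic matter (or dust), ultra-relativistic matter (or radiation), and the cosmological constant correspond to $w = 0$, $1/3$, and $-1$, respectively. If the Universe consists of different fluid species with $w_i$ ($i = 1, \ldots, N$), Equation (9) still holds independently as long as the species do not interact with each other. If one denotes the present energy density of the $i$-th component by $\rho_{i,0}$, then the total energy density of the Universe at the epoch corresponding to the scale factor of $a(t)$ is given by

$$\rho = \sum_{i=1}^{N} \frac{\rho_{i,0}}{a^{3(1+w_i)}}, \tag{10}$$

where the present value of the scale factor, $a_0$, is set to be unity without loss of generality. Thus, Equation (4) becomes

$$\left(\frac{\dot a}{a}\right)^2 = \frac{8\pi G}{3} \sum_{i=1}^{N} \frac{\rho_{i,0}}{a^{3(1+w_i)}} - \frac{K}{a^2} + \frac{\Lambda}{3}. \tag{11}$$

Note that those components with $w_i = -1$ may be equivalent to the conventional cosmological constant $\Lambda$ at this level, although they may exhibit spatial variation unlike $\Lambda$. Evaluating Equation (11) at the present epoch, one finds

$$H_0^2 = \frac{8\pi G}{3} \sum_{i=1}^{N} \rho_{i,0} - K + \frac{\Lambda}{3}, \tag{12}$$

where $H_0$ is the Hubble constant at the present epoch. The above equation is usually rewritten as follows:

$$K = H_0^2 \left( \sum_{i=1}^{N} \Omega_i + \Omega_\Lambda - 1 \right), \tag{13}$$

where the density parameter for the $i$-th component is defined as

$$\Omega_i \equiv \frac{8\pi G}{3 H_0^2}\, \rho_{i,0}, \tag{14}$$

and similarly the dimensionless cosmological constant is

$$\Omega_\Lambda \equiv \frac{\Lambda}{3 H_0^2}. \tag{15}$$

Incidentally Equation (13) clearly illustrates the Mach principle in the sense that the space curvature is simply determined by the amount of matter components in the Universe. In particular, the flat Universe ($K = 0$) implies that the sum of the density parameters is unity:

$$\sum_{i=1}^{N} \Omega_i + \Omega_\Lambda = 1. \tag{16}$$

Finally the cosmic expansion is described by

$$\left(\frac{\dot a}{a}\right)^2 = H_0^2 \left[ \sum_{i=1}^{N} \frac{\Omega_i}{a^{3(1+w_i)}} - \frac{\sum_{i=1}^{N} \Omega_i + \Omega_\Lambda - 1}{a^2} + \Omega_\Lambda \right]. \tag{17}$$

As will be shown below, the present Universe is supposed to be dominated by non-relativistic matter (baryons and collisionless dark matter) and the cosmological constant. So in the present review, we approximate Equation (17) as

$$\left(\frac{\dot a}{a}\right)^2 = H_0^2 \left[ \frac{\Omega_m}{a^3} - \frac{\Omega_m + \Omega_\Lambda - 1}{a^2} + \Omega_\Lambda \right] \tag{18}$$

unless otherwise stated.

Living Reviews in Relativity http://www.livingreviews.org/lrr-2004-8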
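The multi-component expansion law can be sketched numerically. The snippet below is a minimal illustration, assuming the expansion rate takes the form $(H/H_0)^2 = \sum_i \Omega_i a^{-3(1+w_i)} - (\sum_i \Omega_i + \Omega_\Lambda - 1)/a^2 + \Omega_\Lambda$ with $a_0 = 1$; the function name and the parameter values ($\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$) are illustrative choices, not values taken from the text.

```python
def hubble_ratio_squared(a, omegas, omega_lambda=0.0):
    """Return (H/H0)^2 at scale factor a (with a0 = 1).

    omegas is a list of (Omega_i, w_i) pairs for non-interacting fluids,
    each diluting as a^{-3(1+w_i)}. This form is an assumption of the sketch.
    """
    omega_total = sum(om for om, _ in omegas) + omega_lambda
    fluids = sum(om / a ** (3.0 * (1.0 + w)) for om, w in omegas)
    curvature = (omega_total - 1.0) / a ** 2  # K / (H0^2 a^2)
    return fluids - curvature + omega_lambda


# Illustrative (assumed) parameters: spatially flat matter + Lambda model.
components = [(0.3, 0.0)]  # non-relativistic matter, w = 0
for a in (1.0, 0.5, 0.1):
    print(a, hubble_ratio_squared(a, components, omega_lambda=0.7))
```

By construction the function returns unity at $a = 1$ for any choice of density parameters, which is a quick consistency check on the curvature term.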

Expansion law and age of the Universe
Equation (18) has the following analytic solutions in several simple but practically important cases.
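For parameter choices without a convenient closed form, the age of the Universe can be obtained by direct quadrature. The sketch below assumes the matter plus $\Lambda$ expansion law $(\dot a/a)^2 = H_0^2[\Omega_m/a^3 - (\Omega_m + \Omega_\Lambda - 1)/a^2 + \Omega_\Lambda]$ and evaluates $H_0 t_0 = \int_0^1 da/(a\,H/H_0)$; the helper names and the Simpson integrator are illustrative choices of this sketch.

```python
import math


def simpson(f, lo, hi, n=1000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def age_in_hubble_times(omega_m, omega_lambda):
    """H0 * t0 for a matter + Lambda model (assumed expansion law).

    The substitution a = u^2 removes the sqrt(a) behaviour of the
    integrand at a -> 0, so the quadrature converges quickly.
    """
    omega_k = omega_m + omega_lambda - 1.0

    def integrand(u):
        return 2.0 * u * u / math.sqrt(
            omega_m - omega_k * u ** 2 + omega_lambda * u ** 6
        )

    return simpson(integrand, 0.0, 1.0)


print(age_in_hubble_times(1.0, 0.0))  # Einstein-de Sitter: 2/3
print(age_in_hubble_times(0.3, 0.7))  # illustrative flat Lambda model
```

For the Einstein-de Sitter case the result reproduces $t_0 = 2/(3H_0)$, and for a spatially flat model it can be compared with the closed form $H_0 t_0 = \frac{2}{3\sqrt{\Omega_\Lambda}} \sinh^{-1}\sqrt{\Omega_\Lambda/\Omega_m}$.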

Einstein's static model and Lemaître's model
So far we have shown that solutions of the Einstein equation are dynamical in general, i.e., the scale factor $a$ is time-dependent. As a digression, let us examine why Einstein once introduced the Λ-term to obtain a static cosmological solution. This is mainly important for historical reasons, but it is also interesting to observe how the operationally identical parameter (the Λ-term, the cosmological constant, the vacuum energy, the dark energy) shows up in completely different contexts in the course of the development of cosmological physics.

Consider first the case of $\Lambda = 0$ in Equations (4) and (5). Clearly the necessary and sufficient condition that the equations admit the solution of $a = \text{const.}$ is given by

$$\rho = -3p = \frac{3K}{8\pi G a^2}. \tag{27}$$

Namely, any static model requires that the Universe is dominated by matter with either negative pressure or negative density. This is physically unacceptable as long as one considers normal matter in the standard model of particle physics. If $\Lambda \neq 0$ on the other hand, the condition for the static solution is

$$\rho = \frac{1}{8\pi G}\left(\frac{3K}{a^2} - \Lambda\right), \qquad p = \frac{1}{8\pi G}\left(\Lambda - \frac{K}{a^2}\right). \tag{28}$$

Thus both $\rho$ and $p$ can be positive if

$$\frac{K}{a^2} \le \Lambda \le \frac{3K}{a^2}. \tag{29}$$

In particular, if $p = 0$,

$$\rho = \frac{K}{4\pi G a^2} > 0, \qquad \Lambda = \frac{K}{a^2} > 0. \tag{30}$$

This represents the closed Universe (with positive spatial curvature), and corresponds to Einstein's static model.

The above static model is a special case of Lemaître's Universe model with $\Lambda > 0$ and $K > 0$. For simplicity, let us assume that the Universe is dominated by non-relativistic matter with negligible pressure, and consider the behavior of Lemaître's model. First we define the values of the density and the scale factor corresponding to Einstein's static model:

$$\rho_E = \frac{\Lambda}{4\pi G}, \qquad a_E = \sqrt{\frac{K}{\Lambda}}. \tag{31}$$

In order to study the stability of the model around the static model, consider a model in which the density at $a = a_E$ is a factor of $\alpha\,(> 1)$ larger than $\rho_E$. Then

$$\rho = \alpha \rho_E \left(\frac{a_E}{a}\right)^3 = \frac{\alpha \Lambda}{4\pi G} \left(\frac{a_E}{a}\right)^3, \tag{32}$$

and Equations (4) and (5) reduce to

$$\dot a^2 = \frac{\Lambda}{3} \left( \frac{2\alpha a_E^3}{a} + a^2 - 3 a_E^2 \right), \tag{33}$$

$$\ddot a = \frac{\Lambda}{3} \left( a - \frac{\alpha a_E^3}{a^2} \right). \tag{34}$$

For the period of $a \ll a_E$, Equation (33) indicates that $a \propto t^{2/3}$ and the Universe is decelerating ($\ddot a < 0$). When $a$ reaches $\alpha^{1/3} a_E$, $\dot a^2$ takes the minimum value $\Lambda a_E^2 (\alpha^{2/3} - 1)$ and the Universe becomes accelerating ($\ddot a > 0$). Finally the Universe approaches the exponential expansion or de Sitter model: $a \propto \exp(t\sqrt{\Lambda/3})$. As $\alpha$ approaches unity, the minimum value approaches zero and the expansion of the Universe is effectively frozen. This phase is called the coasting period, and the case with $\alpha = 1$ corresponds to Einstein's static model in which the coasting period continues forever. A similar consideration for $\alpha < 1$ indicates that the Universe starts collapsing ($\dot a^2 = 0$) before $\ddot a = 0$. Thus the behavior of Lemaître's model is crucially different depending on whether $\alpha$ is larger or smaller than unity. This suggests that Einstein's static model ($\alpha = 1$) is unstable.
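The coasting behavior described above can be checked with a few lines of code. This is a minimal sketch assuming that, for pressureless matter with $\rho = \alpha\rho_E (a_E/a)^3$, Equations (4) and (5) take the forms $\dot a^2 = (\Lambda/3)(2\alpha a_E^3/a + a^2 - 3a_E^2)$ and $\ddot a = (\Lambda/3)(a - \alpha a_E^3/a^2)$; the function names and the units $\Lambda = a_E = 1$ are choices made for illustration.

```python
def adot_squared(a, alpha, lam=1.0, a_e=1.0):
    """(da/dt)^2 in the perturbed Lemaitre model (assumed form, see lead-in)."""
    return lam / 3.0 * (2.0 * alpha * a_e ** 3 / a + a * a - 3.0 * a_e ** 2)


def addot(a, alpha, lam=1.0, a_e=1.0):
    """d^2a/dt^2 in the same model; it changes sign at a = alpha^(1/3) a_E."""
    return lam / 3.0 * (a - alpha * a_e ** 3 / a ** 2)


def coasting_point(alpha, lam=1.0, a_e=1.0):
    """Scale factor where the acceleration vanishes, and (da/dt)^2 there."""
    a_min = alpha ** (1.0 / 3.0) * a_e
    return a_min, adot_squared(a_min, alpha, lam, a_e)


for alpha in (2.0, 1.1, 1.0, 0.9):
    a_min, minimum = coasting_point(alpha)
    # A negative "minimum" means da/dt reaches zero first: the model recollapses.
    print(alpha, a_min, minimum)
```

The printed minimum equals $\Lambda a_E^2(\alpha^{2/3} - 1)$: positive for $\alpha > 1$, zero for $\alpha = 1$, and negative for $\alpha < 1$, which is the instability argument in numerical form.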

Vacuum energy as an effective cosmological constant
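So far we discussed the cosmological constant introduced in the l.h.s. of the Einstein equation. Formally one can move the Λ-term to the r.h.s. by assigning

$$\rho_\Lambda = \frac{\Lambda}{8\pi G}, \qquad p_\Lambda = -\frac{\Lambda}{8\pi G}. \tag{35}$$

This effective matter field, however, should satisfy an equation of state of $p = -\rho$. The following is a specific example of an effective cosmological constant. Consider a real scalar field whose Lagrangian density is given by

$$\mathcal{L} = -\frac{1}{2} g^{\mu\nu} \partial_\mu\phi\, \partial_\nu\phi - V(\phi) \tag{36}$$

(written here for the metric signature $(-,+,+,+)$). Its energy-momentum tensor is

$$T_{\mu\nu} = \partial_\mu\phi\, \partial_\nu\phi + g_{\mu\nu} \mathcal{L}, \tag{37}$$

and if the field is spatially homogeneous, its energy density and pressure are

$$\rho_\phi = \frac{1}{2}\dot\phi^2 + V(\phi), \qquad p_\phi = \frac{1}{2}\dot\phi^2 - V(\phi). \tag{38}$$

Clearly if the evolution of the field is negligible, i.e., $\dot\phi^2 \ll V(\phi)$, then $p_\phi \approx -\rho_\phi$ and the field acts as a cosmological constant. Of course this model is one of the simplest examples, and one may play with much more complicated models if needed.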
If the Λ-term is introduced in the l.h.s., it should be constant to satisfy the energy-momentum conservation T µν ;ν = 0. Once it is regarded as a sort of matter field in the r.h.s., however, it does not have to be constant. In fact, the above example shows that the equation of state for the field has w = −1 only in special cases. This is why recent literature refers to the field as dark energy instead of the cosmological constant.
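The statement that the scalar field has $w = -1$ only in special cases can be made concrete. The sketch below assumes the standard expressions for a spatially homogeneous field, $\rho_\phi = \dot\phi^2/2 + V$ and $p_\phi = \dot\phi^2/2 - V$; the function name and the sample values are illustrative.

```python
def scalar_field_w(phi_dot, potential):
    """Equation-of-state parameter w = p/rho of a homogeneous scalar field.

    Assumes rho = phi_dot^2 / 2 + V and p = phi_dot^2 / 2 - V.
    """
    kinetic = 0.5 * phi_dot ** 2
    return (kinetic - potential) / (kinetic + potential)


# Slowly rolling field: w is close to, but not exactly, -1.
print(scalar_field_w(0.1, 1.0))
# Frozen field: exactly a cosmological constant.
print(scalar_field_w(0.0, 1.0))
# Kinetic-energy dominated field: w approaches +1.
print(scalar_field_w(10.0, 1.0))
```

The value of $w$ ranges continuously between $-1$ and $+1$ as the kinetic term grows relative to the potential, so a time-dependent $w$ is the generic case.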

Gravitational instability
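We have presented the zero-th order description of the Universe neglecting the inhomogeneity or spatial variation of matter inside. Now we are in a position to consider the evolution of matter in the Universe. For simplicity we focus on the non-relativistic regime where the Newtonian approximation is valid. Then the basic equations for the self-gravitating fluid, with density $\rho$, proper velocity $\mathbf{u}$, pressure $p$, and gravitational potential $\Phi$, are given by the continuity equation, Euler's equation, and the Poisson equation:

$$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{u}) = 0, \tag{39}$$

$$\frac{\partial \mathbf{u}}{\partial t} + (\mathbf{u} \cdot \nabla)\mathbf{u} = -\frac{1}{\rho}\nabla p - \nabla\Phi, \tag{40}$$

$$\nabla^2 \Phi = 4\pi G \rho. \tag{41}$$

We would like to rewrite those equations in the comoving frame. For this purpose, we introduce the position $\mathbf{x}$ in the comoving coordinate, the peculiar velocity $\mathbf{v}$, density fluctuations $\delta(t, \mathbf{x})$, and the gravitational potential $\phi(t, \mathbf{x})$ which are defined as

$$\mathbf{x} = \frac{\mathbf{r}}{a(t)}, \tag{42}$$

$$\mathbf{v} = a(t)\, \dot{\mathbf{x}}, \tag{43}$$

$$\delta(t, \mathbf{x}) = \frac{\rho(t, \mathbf{x})}{\bar\rho(t)} - 1, \tag{44}$$

$$\phi(t, \mathbf{x}) = \Phi + \frac{1}{2}\, a \ddot a\, |\mathbf{x}|^2, \tag{45}$$

respectively. Then Equations (39) to (41) reduce to

$$\dot\delta + \frac{1}{a} \nabla \cdot [(1 + \delta)\mathbf{v}] = 0, \tag{46}$$

$$\dot{\mathbf{v}} + \frac{1}{a}(\mathbf{v} \cdot \nabla)\mathbf{v} + \frac{\dot a}{a}\mathbf{v} = -\frac{1}{\rho a}\nabla p - \frac{1}{a}\nabla\phi, \tag{47}$$

$$\nabla^2 \phi = 4\pi G \bar\rho\, a^2 \delta, \tag{48}$$

where the dot and $\nabla$ in the above equations are the time derivative for a given $\mathbf{x}$ and the spatial derivative with respect to $\mathbf{x}$, i.e., defined in the comoving coordinate (while those in Equations (39,40,41) are defined in the proper coordinate).

A standard picture of the cosmic structure formation assumes that the initially tiny density fluctuations grow according to Equations (46,47,48). Also the Universe smoothed over large scales approaches a homogeneous model. Thus at early epochs and/or on large scales, the nonlinear effect is small and one can linearize those equations with respect to $\delta$ and $\mathbf{v}$:

$$\dot\delta + \frac{1}{a} \nabla \cdot \mathbf{v} = 0,$$

$$\dot{\mathbf{v}} + \frac{\dot a}{a}\mathbf{v} + \frac{c_s^2}{a}\nabla\delta + \frac{1}{a}\nabla\phi = 0,$$

where $c_s^2 \equiv (\partial p/\partial \rho)$ is the sound velocity squared. As usual, we transform the above equations in $k$ space using

$$\delta_{\mathbf{k}}(t) \equiv \frac{1}{V} \int \delta(t, \mathbf{x}) \exp(i \mathbf{k} \cdot \mathbf{x})\, d^3x.$$

Then the equation for $\delta_{\mathbf{k}}$ reduces to

$$\ddot\delta_{\mathbf{k}} + 2\frac{\dot a}{a}\dot\delta_{\mathbf{k}} + \left( \frac{c_s^2 k^2}{a^2} - 4\pi G \bar\rho \right) \delta_{\mathbf{k}} = 0. \tag{54}$$

If the coefficient in the third term is negative, $\delta_{\mathbf{k}}$ has an unstable, or monotonically increasing, solution. This condition is equivalent to the Jeans criterion:

$$\lambda \equiv \frac{2\pi a}{k} > \lambda_J \equiv c_s \sqrt{\frac{\pi}{G \bar\rho}},$$

namely, the (proper) wavelength of the fluctuation is larger than the Jeans length $\lambda_J$, which characterizes the distance that a sound wave can propagate within the dynamical time of the fluctuation, $\sqrt{\pi/(G\bar\rho)}$. Below that scale, the pressure wave can suppress the gravitational instability, and the fluctuation amplitude oscillates.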

Linear growth rate of the density fluctuation
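Most likely our Universe is dominated by collisionless dark matter, and thus $\lambda_J$ is negligibly small. Thus, at most scales of cosmological interest, Equation (54) is well approximated as

$$\ddot\delta_{\mathbf{k}} + 2\frac{\dot a}{a}\dot\delta_{\mathbf{k}} - 4\pi G \bar\rho\, \delta_{\mathbf{k}} = 0. \tag{55}$$

For a given set of cosmological parameters, one can solve the above equation by substituting the expansion law for $a(t)$ as described in Section 2.3. Since Equation (55) is a second-order differential equation with respect to $t$, there are two independent solutions: a decaying mode and a growing mode, which monotonically decrease and increase with $t$, respectively. The former mode becomes negligibly small as the Universe expands, and thus one is usually interested in the growing mode alone.

More specifically those solutions are explicitly obtained as follows. First note that the l.h.s. of Equation (18) is the square of the Hubble parameter at $t$, $H(t) = \dot a/a$:

$$H^2 = H_0^2 \left[ \frac{\Omega_m}{a^3} - \frac{\Omega_m + \Omega_\Lambda - 1}{a^2} + \Omega_\Lambda \right]. \tag{56}$$

The first and second differentiation of Equation (56) with respect to $t$ yields

$$\dot H = H_0^2 \left[ -\frac{3\Omega_m}{2a^3} + \frac{\Omega_m + \Omega_\Lambda - 1}{a^2} \right], \qquad \ddot H = H_0^2 \left[ \frac{9\Omega_m}{2a^3} - \frac{2(\Omega_m + \Omega_\Lambda - 1)}{a^2} \right] H,$$

respectively. Since $4\pi G \bar\rho = 3 H_0^2 \Omega_m/(2a^3)$, the differential equation for $H$ reduces to

$$\ddot H + 2H\dot H - 4\pi G \bar\rho\, H = 0.$$

This coincides with the linear perturbation equation for $\delta_{\mathbf{k}}$, Equation (55). Since $H(t)$ is a decreasing function of $t$, this implies that $H(t)$ is the decaying solution for Equation (55). Then the corresponding growing solution $D(t)$ can be obtained according to the standard procedure: Combining Equation (55), written for $D$, with the above equation for $H$ yields

$$\frac{d}{dt}\left[ a^2 H^2 \frac{d}{dt}\left( \frac{D}{H} \right) \right] = 0,$$

and therefore the formal expression for the growing solution in linear theory is

$$D(t) \propto H(t) \int_0^t \frac{dt'}{a^2(t') H^2(t')}.$$

It is often more useful to rewrite $D(t)$ in terms of the redshift $z$ as follows:

$$D(z) = \frac{5\, \Omega_m H_0^2}{2}\, H(z) \int_z^\infty \frac{1 + z'}{H^3(z')}\, dz',$$

where the proportional factor is chosen so as to reproduce $D(z) \to 1/(1 + z)$ for $z \to \infty$.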
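Linear growth rates for the models described in Section 2.3 are summarized below:

• Einstein-de Sitter model ($\Omega_m = 1$, $\Omega_\Lambda = 0$):

$$D(z) = \frac{1}{1 + z}.$$

• Open model with vanishing cosmological constant ($\Omega_m < 1$, $\Omega_\Lambda = 0$):

$$D(z) \propto 1 + \frac{3}{x} + 3\sqrt{\frac{1 + x}{x^3}}\, \ln\left( \sqrt{1 + x} - \sqrt{x} \right), \qquad x \equiv \frac{1 - \Omega_m}{\Omega_m (1 + z)}.$$

• Spatially-flat model with cosmological constant ($\Omega_m < 1$, $\Omega_\Lambda = 1 - \Omega_m$):

$$D(z) \propto \sqrt{1 + \frac{2}{x^3}} \int_0^x \left( \frac{u}{2 + u^3} \right)^{3/2} du, \qquad x \equiv \frac{2^{1/3} (\Omega_m^{-1} - 1)^{1/3}}{1 + z}.$$

For most purposes, the following fitting formulae [67] provide sufficiently accurate approximations:

$$D(z) = \frac{g(z)}{1 + z}, \qquad g(z) = \frac{5\, \Omega(z)}{2} \left[ \Omega^{4/7}(z) - \lambda(z) + \left( 1 + \frac{\Omega(z)}{2} \right)\left( 1 + \frac{\lambda(z)}{70} \right) \right]^{-1},$$

where

$$\Omega(z) = \frac{\Omega_m (1 + z)^3}{\Omega_m (1 + z)^3 - (\Omega_m + \Omega_\Lambda - 1)(1 + z)^2 + \Omega_\Lambda}, \qquad \lambda(z) = \frac{\Omega_\Lambda}{\Omega_m (1 + z)^3 - (\Omega_m + \Omega_\Lambda - 1)(1 + z)^2 + \Omega_\Lambda}.$$

Note that $\Omega_m$ and $\Omega_\Lambda$ refer to the present values of the density parameter and the dimensionless cosmological constant, respectively, which will be frequently used in the rest of the review. Figure 2 shows the comparison of the numerically computed growth rate (thick lines) against the above fitting formulae (thin lines), which are practically indistinguishable.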
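The comparison between the numerically computed growth rate and a fitting formula can be reproduced in a few lines. The sketch below assumes the integral form $D(z) = \frac{5}{2}\Omega_m H_0^2 H(z)\int_z^\infty (1+z')/H^3(z')\,dz'$ for the growing mode, and assumes that the fitting formula takes the widely used form $D = g(z)/(1+z)$ with $g = \frac{5}{2}\Omega(z)/[\Omega(z)^{4/7} - \lambda(z) + (1 + \Omega(z)/2)(1 + \lambda(z)/70)]$; all function names and parameter values are illustrative.

```python
import math


def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def growth_integral(z, om, ol):
    """Growing mode from the integral expression (assumed form, see lead-in).

    Rewritten as D = (5/2) Om E(a) * int_0^a da' / (a' E(a'))^3 with E = H/H0,
    and integrated in u = sqrt(a') so that the integrand is smooth at 0.
    """
    a = 1.0 / (1.0 + z)
    ok = om + ol - 1.0
    e_of_a = math.sqrt(om / a ** 3 - ok / a ** 2 + ol)

    def integrand(u):
        return 2.0 * u ** 4 / (om - ok * u * u + ol * u ** 6) ** 1.5

    return 2.5 * om * e_of_a * simpson(integrand, 0.0, math.sqrt(a))


def growth_fit(z, om, ol):
    """Fitting formula D = g(z) / (1 + z) (assumed form, see lead-in)."""
    zp1 = 1.0 + z
    denom = om * zp1 ** 3 - (om + ol - 1.0) * zp1 ** 2 + ol
    omega_z = om * zp1 ** 3 / denom
    lambda_z = ol / denom
    g = 2.5 * omega_z / (
        omega_z ** (4.0 / 7.0) - lambda_z
        + (1.0 + omega_z / 2.0) * (1.0 + lambda_z / 70.0)
    )
    return g / zp1


for om, ol in ((1.0, 0.0), (0.3, 0.0), (0.3, 0.7)):
    print(om, ol, growth_integral(0.0, om, ol), growth_fit(0.0, om, ol))
```

For the Einstein-de Sitter model the integral reproduces $D = 1/(1+z)$ exactly, and for the two low-density models the fit should track the integral at about the percent level at $z = 0$.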

Gaussian random field
Consider the density contrast $\delta_i \equiv \delta(\mathbf{x}_i) = \rho(\mathbf{x}_i)/\bar\rho - 1$ defined at the comoving position $\mathbf{x}_i$. The density field is regarded as a stochastic variable, and thus forms a random field. The conventional assumption is that the primordial density field (in its linear regime) is Gaussian, i.e., its $m$-point joint probability distribution obeys the multi-variate Gaussian,

$$P(\delta_1, \delta_2, \ldots, \delta_m)\, d\delta_1 d\delta_2 \cdots d\delta_m = \frac{1}{\sqrt{(2\pi)^m \det(M)}} \exp\left[ -\sum_{i,j=1}^{m} \frac{1}{2}\, \delta_i (M^{-1})_{ij}\, \delta_j \right] d\delta_1 d\delta_2 \cdots d\delta_m, \tag{71}$$

for an arbitrary positive integer $m$. Here $M_{ij} \equiv \langle \delta_i \delta_j \rangle$ is the covariance matrix, and $M^{-1}$ is its inverse. Since $M_{ij} = \xi(\mathbf{x}_i, \mathbf{x}_j)$, Equation (71) implies that the statistical nature of the Gaussian density field is completely specified by the two-point correlation function $\xi$ and its linear combination (including its derivative and integral). For an extensive discussion of the cosmological Gaussian density field, see [4].
The Gaussian nature of the primordial density field is preserved in its linear evolution stage, but this is not the case in the nonlinear stage. This is clear even from the definition of the Gaussian distribution: Equation (71) formally assumes that the density contrast distributes symmetrically in the range of −∞ < δ i < ∞, but in the real density field δ i cannot be less than −1. This assumption does not make any practical difference as long as the fluctuations are (infinitesimally) small, but it is invalid in the nonlinear regime where the typical amplitude of the fluctuations exceeds unity.
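In describing linear theory of cosmological density fluctuations, the Fourier transform of the spatial density contrast $\delta(\mathbf{x}) \equiv \rho(\mathbf{x})/\bar\rho - 1$ is the most basic variable:

$$\delta_{\mathbf{k}} \equiv \frac{1}{V} \int \delta(\mathbf{x}) \exp(i \mathbf{k} \cdot \mathbf{x})\, d^3x.$$

Since $\delta_{\mathbf{k}}$ is a complex variable, it is decomposed by a set of two real variables, the amplitude $D_{\mathbf{k}}$ and the phase $\phi_{\mathbf{k}}$:

$$\delta_{\mathbf{k}} \equiv D_{\mathbf{k}} \exp(i \phi_{\mathbf{k}}).$$

Then the linear perturbation equation reads

$$\ddot D_{\mathbf{k}} + 2\frac{\dot a}{a}\dot D_{\mathbf{k}} - \left( 4\pi G \bar\rho + \dot\phi_{\mathbf{k}}^2 \right) D_{\mathbf{k}} = 0,$$

$$\ddot\phi_{\mathbf{k}} + 2\left( \frac{\dot a}{a} + \frac{\dot D_{\mathbf{k}}}{D_{\mathbf{k}}} \right) \dot\phi_{\mathbf{k}} = 0.$$

The second equation implies $\dot\phi_{\mathbf{k}} \propto a^{-2} D_{\mathbf{k}}^{-2}$, and $\phi_{\mathbf{k}}(t)$ rapidly converges to a constant value. Thus $D_{\mathbf{k}}$ evolves following the growing solution in linear theory.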
The most popular statistic of clustering in the Universe is the power spectrum of the density fluctuations, which measures the amplitude of the mode of the wavenumber $k$. This is the Fourier transform of the two-point correlation function,

$$P(\mathbf{k}) = \frac{1}{V} \int\!\!\int \xi(\mathbf{x}_1, \mathbf{x}_2) \exp[i \mathbf{k} \cdot (\mathbf{x}_1 - \mathbf{x}_2)]\, d^3x_1\, d^3x_2. \tag{77}$$

If the density field is globally homogeneous and isotropic (i.e., no preferred position or direction), Equation (77) reduces to

$$P(k) = 4\pi \int_0^\infty \xi(x)\, \frac{\sin kx}{kx}\, x^2\, dx. \tag{78}$$

Since the above expression is obtained after the ensemble average, $x$ does not denote an amplitude of the position vector, but a comoving wavelength $2\pi/k$ corresponding to the wavenumber $k = |\mathbf{k}|$. It should be noted that neither the power spectrum nor the two-point correlation function contains information on the phase $\phi_{\mathbf{k}}$. Thus in principle two clustering patterns may be completely different even if they have identical two-point correlation functions. This implies the practical importance of describing the statistics of the phases $\phi_{\mathbf{k}}$ in addition to the amplitude $D_{\mathbf{k}}$ of clustering.
In the Gaussian field, however, one can directly show that Equation (71) reduces to the probability distribution function of $\phi_{\mathbf{k}}$ and $D_{\mathbf{k}}$, which is explicitly written as

$$P(D_{\mathbf{k}}, \phi_{\mathbf{k}})\, dD_{\mathbf{k}}\, d\phi_{\mathbf{k}} = \frac{2 D_{\mathbf{k}}}{P(k)} \exp\left( -\frac{D_{\mathbf{k}}^2}{P(k)} \right) dD_{\mathbf{k}}\, \frac{d\phi_{\mathbf{k}}}{2\pi}, \tag{79}$$

mutually independently for each $\mathbf{k}$. The phase distribution is uniform, and thus does not carry information. The above probability distribution function is also derived when the real and imaginary parts of the Fourier components $\delta_{\mathbf{k}}$ are uncorrelated and Gaussian distributed (with the dispersion $P(k)/2$) independently of $\mathbf{k}$. As is expected, the distribution function (79) is completely fixed if $P(k)$ is specified. This rephrases the previous statement that the Gaussian field is completely specified by the two-point correlation function in real space.

Incidentally the one-point phase distribution turns out to be essentially uniform even in a strongly non-Gaussian field [81,21]. Thus one is unlikely to extract useful information directly out of it, mainly due to the cyclic property of the phase. Very recently, however, Matsubara [51] and Hikage et al. [31] succeeded in detecting a signature of phase correlations in Fourier modes of mass density fields induced by nonlinear gravitational clustering using the distribution function of the phase sum of the Fourier modes for triangle wavevectors. Several different statistics which carry the phase information have also been proposed in cosmology, including the void probability function [97], the genus statistics [26], and the Minkowski functionals [57,76].
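The statement that independent Gaussian real and imaginary parts with dispersion $P(k)/2$ lead to a Rayleigh-distributed amplitude with $\langle D_{\mathbf{k}}^2 \rangle = P(k)$ and a uniform phase can be verified by direct sampling. The following is a minimal sketch; the function names, the value of the power, the number of modes, and the seed are all arbitrary illustrative choices.

```python
import math
import random


def draw_mode(power, rng):
    """Draw one Fourier mode of a Gaussian field with power spectrum value P.

    Real and imaginary parts are independent Gaussians of variance P / 2.
    Returns the amplitude D_k and the phase phi_k in (-pi, pi].
    """
    sigma = math.sqrt(power / 2.0)
    re = rng.gauss(0.0, sigma)
    im = rng.gauss(0.0, sigma)
    return math.hypot(re, im), math.atan2(im, re)


def sample_statistics(power, n_modes, seed=0):
    """Sample means of D_k^2 and of cos(phi_k) over n_modes draws."""
    rng = random.Random(seed)
    amp2_sum = 0.0
    cos_sum = 0.0
    for _ in range(n_modes):
        amplitude, phase = draw_mode(power, rng)
        amp2_sum += amplitude * amplitude
        cos_sum += math.cos(phase)
    return amp2_sum / n_modes, cos_sum / n_modes


mean_amp2, mean_cos = sample_statistics(power=2.0, n_modes=20000)
print(mean_amp2, mean_cos)  # expected close to P = 2 and to 0, respectively
```

The sample mean of $D_{\mathbf{k}}^2$ converges to $P(k)$, while the vanishing mean of $\cos\phi_{\mathbf{k}}$ reflects the uniform phase distribution, which is why the phases of a Gaussian field carry no information.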

Log-normal distribution
A probability distribution function (PDF) of the cosmological density fluctuations is the most fundamental statistic characterizing the large-scale structure of the Universe. As long as the density fluctuations are in the linear regime, their PDF remains Gaussian. Once they reach the nonlinear stage, however, their PDF significantly deviates from the initial Gaussian shape due to the strong nonlinear mode-coupling and the non-locality of the gravitational dynamics. The functional forms of the resulting PDFs in nonlinear regimes are not known exactly, and a variety of phenomenological models have been proposed [34,74,9,25].
Kayo et al. [40] showed that the one-point log-normal PDF

$$P^{(1)}_{\rm LN}(\delta) = \frac{1}{\sqrt{2\pi\sigma_1^2}} \exp\left[ -\frac{\{\ln(1 + \delta) + \sigma_1^2/2\}^2}{2\sigma_1^2} \right] \frac{1}{1 + \delta} \tag{80}$$

describes very accurately the cosmological density distribution even in the nonlinear regime (the r.m.s. variance $\sigma_{\rm nl} \lesssim 4$ and the over-density $\delta \lesssim 100$). The above function is characterized by a single parameter $\sigma_1$ which is related to the variance of $\delta$. Since we use $\delta$ to represent the density fluctuation field smoothed over $R$, its variance is computed from its power spectrum $P_{\rm nl}$ explicitly as

$$\sigma_{\rm nl}^2(R) \equiv \frac{1}{2\pi^2} \int_0^\infty P_{\rm nl}(k)\, \tilde W^2(kR)\, k^2\, dk. \tag{81}$$

Here we use subscripts "lin" and "nl" to distinguish the variables corresponding to the primordial (linear) and the evolved (nonlinear) density fields, respectively. Then $\sigma_1$ depends on the smoothing scale $R$ alone and is given by

$$\sigma_1^2(R) = \ln\left[ 1 + \sigma_{\rm nl}^2(R) \right]. \tag{82}$$

Given a set of cosmological parameters, one can compute $\sigma_{\rm nl}(R)$ and thus $\sigma_1(R)$ very accurately using a fitting formula for $P_{\rm nl}(k)$ (see, e.g., [67]). In this sense, the above log-normal PDF is completely specified without any free parameter.

From an empirical point of view, Hubble [34] first noted that the galaxy distribution in angular cells on the celestial sphere may be approximated by a log-normal distribution, rather than a Gaussian. Theoretically the above log-normal function may be obtained from the one-to-one mapping between the linear random-Gaussian and the nonlinear density fields [9]. We define a linear density field $g$ smoothed over $R$ obeying the Gaussian PDF,

$$P^{(1)}_{\rm G}(g) = \frac{1}{\sqrt{2\pi\sigma_{\rm lin}^2}} \exp\left( -\frac{g^2}{2\sigma_{\rm lin}^2} \right), \tag{83}$$

where the variance is computed from its linear power spectrum:

$$\sigma_{\rm lin}^2(R) \equiv \frac{1}{2\pi^2} \int_0^\infty P_{\rm lin}(k)\, \tilde W^2(kR)\, k^2\, dk. \tag{84}$$

If one introduces a new field $\delta$ from $g$ as

$$1 + \delta = \frac{1}{\sqrt{1 + \sigma_{\rm nl}^2}} \exp\left\{ \frac{g}{\sigma_{\rm lin}} \sqrt{\ln(1 + \sigma_{\rm nl}^2)} \right\}, \tag{85}$$

the PDF for $\delta$ is simply given by $(dg/d\delta)\, P^{(1)}_{\rm G}(g)$, which reduces to Equation (80). At this point, the transformation (85) is nothing but a mathematical procedure to relate the Gaussian and the log-normal functions. Thus there is no physical reason to believe that the new field $\delta$ should be regarded as a nonlinear density field evolved from $g$ even in an approximate sense.
In fact it is physically unacceptable since the relation, if taken at face value, implies that the nonlinear density field is completely determined by its linear counterpart locally. We know, on the other hand, that the nonlinear gravitational evolution of cosmological density fluctuations proceeds in a quite nonlocal manner, and is sensitive to the surrounding mass distribution. Nevertheless the fact that the log-normal PDF provides a good fit to the simulation data, empirically implies that the transformation (85) somehow captures an important aspect of the nonlinear evolution in the real Universe.
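The mapping between the Gaussian and log-normal fields is easy to test by sampling. The sketch below assumes that the transformation takes the form $1 + \delta = \exp(\sigma_1 g/\sigma_{\rm lin} - \sigma_1^2/2)$ with $\sigma_1^2 = \ln(1 + \sigma_{\rm nl}^2)$; the function names and the values of $\sigma_{\rm lin}$, $\sigma_{\rm nl}$, sample size, and seed are illustrative assumptions.

```python
import math
import random


def lognormal_delta(g, sigma_lin, sigma_nl):
    """Map a Gaussian variable g (variance sigma_lin^2) to a log-normal delta.

    Assumed transformation: 1 + delta = exp(sigma_1 g / sigma_lin - sigma_1^2 / 2),
    where sigma_1^2 = ln(1 + sigma_nl^2).
    """
    sigma_1 = math.sqrt(math.log(1.0 + sigma_nl ** 2))
    return math.exp(sigma_1 * g / sigma_lin - 0.5 * sigma_1 ** 2) - 1.0


def simulate(sigma_lin, sigma_nl, n_samples, seed=1):
    """Return the sample mean, variance, and minimum of the mapped field."""
    rng = random.Random(seed)
    deltas = [
        lognormal_delta(rng.gauss(0.0, sigma_lin), sigma_lin, sigma_nl)
        for _ in range(n_samples)
    ]
    mean = sum(deltas) / n_samples
    variance = sum((d - mean) ** 2 for d in deltas) / n_samples
    return mean, variance, min(deltas)


mean, variance, smallest = simulate(sigma_lin=0.3, sigma_nl=0.5, n_samples=50000)
print(mean, variance, smallest)  # expected: mean near 0, variance near 0.25, min > -1
```

Unlike the Gaussian field, the mapped field has zero mean, variance $\sigma_{\rm nl}^2$, and never falls below $\delta = -1$, which is the physical constraint that the Gaussian PDF violates in the nonlinear regime.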

Higher-order correlation functions
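One of the most direct methods to evaluate the deviation from Gaussianity is to compute the higher-order correlation functions. Suppose that $\mathbf{x}_i$ now labels the position of the $i$-th object (galaxy). Then the two-point correlation function $\xi_{12} \equiv \xi(\mathbf{x}_1, \mathbf{x}_2)$ is defined also in terms of the joint probability of the pair of objects located in the volume elements of $\delta V_1$ and $\delta V_2$,

$$\delta P_{12} = \bar n^2\, \delta V_1\, \delta V_2\, [1 + \xi_{12}],$$

where $\bar n$ is the mean number density of the objects. This definition is generalized to three- and four-point correlation functions, $\zeta_{123} \equiv \zeta(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3)$ and $\eta_{1234} \equiv \eta(\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3, \mathbf{x}_4)$, in a straightforward manner:

$$\delta P_{123} = \bar n^3\, \delta V_1\, \delta V_2\, \delta V_3\, [1 + \xi_{12} + \xi_{23} + \xi_{31} + \zeta_{123}],$$

$$\delta P_{1234} = \bar n^4\, \delta V_1\, \delta V_2\, \delta V_3\, \delta V_4\, [1 + \xi_{12} + \xi_{13} + \xi_{14} + \xi_{23} + \xi_{24} + \xi_{34} + \zeta_{123} + \zeta_{124} + \zeta_{134} + \zeta_{234} + \xi_{12}\xi_{34} + \xi_{13}\xi_{24} + \xi_{14}\xi_{23} + \eta_{1234}].$$

Clearly $\xi_{12}$, $\zeta_{123}$, and $\eta_{1234}$ are symmetric with respect to the change of the indices. Define the following quantities with the same symmetry properties: Then it is not unreasonable to suspect that the following relations hold: where $Q$, $R_a$, $R_b$, and $R_c$ are constants. In fact, the analysis of the two-dimensional galaxy catalogues [68] revealed

The generalization of those relations for $N$-point correlation functions is suspected to hold generally, and is called the hierarchical clustering ansatz. Cosmological $N$-body simulations approximately support the validity of the above ansatz, but also detect a finite deviation from it [82].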

Genus statistics
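A complementary approach to characterize the clustering of the Universe beyond the two-point correlation functions is the genus statistics [26]. This is a mathematical measure of the topology of the isodensity surface. For definiteness, consider the density contrast field $\delta(\mathbf{x})$ at the position $\mathbf{x}$ in the survey volume $V_{\rm all}$. This may be evaluated, for instance, by taking the ratio of the number of galaxies $n(\mathbf{x}, V_f)$ in the volume $V_f$ centered at $\mathbf{x}$ to its average value $\bar n(V_f)$:

$$\delta(\mathbf{x}, V_f) = \frac{n(\mathbf{x}, V_f)}{\bar n(V_f)} - 1, \qquad \sigma(V_f) = \langle |\delta(\mathbf{x}, V_f)|^2 \rangle^{1/2},$$

where $\sigma(V_f)$ is its r.m.s. value. Consider the isodensity surface parameterized by a value of $\nu \equiv \delta(\mathbf{x}, V_f)/\sigma(V_f)$. Genus is one of the topological numbers characterizing the surface defined as

$$g \equiv -\frac{1}{4\pi} \int \kappa\, dA, \tag{99}$$

where $\kappa$ is the Gaussian curvature of the isolated surface. The Gauss-Bonnet theorem implies that the value of $g$ is indeed an integer and equal to the number of holes minus 1. This is qualitatively understood as follows: Expand an arbitrary two-dimensional surface around a point as

$$z = \frac{1}{2}\kappa_1 x^2 + \frac{1}{2}\kappa_2 y^2. \tag{100}$$

Then the Gaussian curvature of the surface is defined by $\kappa = \kappa_1 \kappa_2$. For a surface topologically equivalent to a sphere (a torus), the integral of $\kappa$ over the surface equals $4\pi$ ($0$), as for the unit sphere with $\kappa = 1$ (the flat torus with $\kappa = 0$), and thus Equation (99) yields $g = -1$ ($g = 0$), which coincides with the number of holes minus 1.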
In reality, there are many disconnected isodensity surfaces for a given $\nu$, and thus it is more convenient to define the genus density in the survey volume $V_{\rm all}$ using the additivity of the genus:

$$G(\nu) \equiv \frac{1}{V_{\rm all}} \sum_{i=1}^{I} g_i = -\frac{1}{4\pi V_{\rm all}} \sum_{i=1}^{I} \int_{A_i} \kappa\, dA, \tag{101}$$

where the $A_i$ ($i = 1, \ldots, I$) denote the disconnected isodensity surfaces with the same value of $\nu = \delta(\mathbf{x}, V_f)/\sigma(V_f)$. Interestingly the Gaussian density field has an analytic expression for Equation (101):

$$G(\nu) = \frac{1}{4\pi^2} \left( \frac{\langle k^2 \rangle}{3} \right)^{3/2} e^{-\nu^2/2}\, (1 - \nu^2), \tag{102}$$

where

$$\langle k^2 \rangle \equiv \frac{\int k^2 P(k)\, \tilde W^2(kR)\, d^3k}{\int P(k)\, \tilde W^2(kR)\, d^3k}$$

is the moment of $k^2$ weighted over the power spectrum of fluctuations $P(k)$ and the smoothing function $\tilde W^2(kR)$ (see, e.g., [4]). It should be noted that in the Gaussian density field the information of the power spectrum shows up only in the proportional constant of Equation (102), and its functional form is determined uniquely by the threshold value $\nu$. This $\nu$-dependence reflects the phase information which is ignored in the two-point correlation function and power spectrum. In this sense, genus statistics is a complementary measure of the clustering pattern of the Universe.

Even if the primordial density field obeys the Gaussian statistics, the subsequent nonlinear gravitational evolution generates significant non-Gaussianity. To distinguish the initial non-Gaussianity from that acquired by the nonlinear gravity is of fundamental importance in inferring the initial condition of the Universe in a standard gravitational instability picture of structure formation. In a weakly nonlinear regime, Matsubara derived an analytic expression for the non-Gaussianity emerging from the primordial Gaussian field [49]: where are the Hermite polynomials: denote the third-order moments of δ. This expression plays a key role in understanding if the non-Gaussianity in galaxy distribution is ascribed to the primordial departure from the Gaussian statistics.
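The shape of the Gaussian genus curve is simple enough to tabulate directly. The sketch below assumes the analytic form $G(\nu) = \frac{1}{4\pi^2}(\langle k^2\rangle/3)^{3/2} e^{-\nu^2/2}(1 - \nu^2)$ for a Gaussian random field; the function name and the value of $\langle k^2 \rangle$ are illustrative.

```python
import math


def gaussian_genus_density(nu, k2_mean):
    """Genus per unit volume of a Gaussian random field at threshold nu.

    Assumed form: G = (1 / 4 pi^2) (<k^2>/3)^{3/2} exp(-nu^2 / 2) (1 - nu^2).
    Only the amplitude depends on the power spectrum, through <k^2>.
    """
    amplitude = (k2_mean / 3.0) ** 1.5 / (4.0 * math.pi ** 2)
    return amplitude * math.exp(-0.5 * nu * nu) * (1.0 - nu * nu)


k2_mean = 1.0  # illustrative value, in units of (smoothing length)^-2
for nu in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(nu, gaussian_genus_density(nu, k2_mean))
```

The curve is symmetric in $\nu$, positive for $|\nu| < 1$ (a sponge-like, multiply connected surface), and negative for $|\nu| > 1$ (isolated clusters or voids). Departures from this symmetric shape are what signal non-Gaussianity.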

Minkowski functionals
In fact, genus is one member of a complete set of $N + 1$ quantities, known as the Minkowski functionals (MFs), which determine the morphological properties of a pattern in $N$-dimensional space. In the analysis of galaxy redshift survey data, one considers isodensity contours from the three-dimensional density contrast field $\delta$ by taking its excursion set $F_\nu$, i.e., the set of all points where the density contrast $\delta$ exceeds the threshold level $\nu$, as was done for the genus in the previous subsection. All MFs can be expressed as integrals over the excursion set. While the first MF is simply given by the volume integration of a Heaviside step function $\Theta$ normalized to the total volume $V_{\rm tot}$,
$$ V_0(\nu) = \frac{1}{V_{\rm tot}} \int_V d^3x \, \Theta(\nu(\mathbf{x}) - \nu), $$
the other MFs $V_k$ ($k = 1, 2, 3$) are calculated by the surface integration of the local MFs $v_k^{\rm loc}$. The general expression is
$$ V_k(\nu) = \frac{1}{V_{\rm tot}} \int_{\partial F_\nu} d^2S(\mathbf{x}) \, v_k^{\rm loc}(\nu, \mathbf{x}), \tag{108} $$
with the local Minkowski functionals for $k = 1, 2, 3$ given by
$$ v_1^{\rm loc} = \frac{1}{6}, \qquad v_2^{\rm loc} = \frac{1}{6\pi} \left( \frac{1}{R_1} + \frac{1}{R_2} \right), \qquad v_3^{\rm loc} = \frac{1}{4\pi} \frac{1}{R_1 R_2}, $$
where $R_1$ and $R_2$ are the principal radii of curvature of the isodensity surface. For a 3-D Gaussian random field, the average MFs per unit volume can be expressed analytically as follows:
$$ V_0(\nu) = \frac{1}{2} - \frac{1}{\sqrt{2\pi}} \int_0^\nu \exp\left( -\frac{x^2}{2} \right) dx, $$
$$ V_1(\nu) = \frac{2}{3} \sqrt{\frac{\lambda}{2\pi}} \exp\left( -\frac{\nu^2}{2} \right), $$
$$ V_2(\nu) = \frac{2}{3} \frac{\lambda}{\sqrt{2\pi}} \, \nu \exp\left( -\frac{\nu^2}{2} \right), $$
$$ V_3(\nu) = \frac{\lambda^{3/2}}{\sqrt{2\pi}} (\nu^2 - 1) \exp\left( -\frac{\nu^2}{2} \right), \tag{115} $$
where $\lambda = \sigma_1^2 / (6\pi\sigma^2)$, $\sigma \equiv \langle \delta^2 \rangle^{1/2}$, $\sigma_1 \equiv \langle |\nabla\delta|^2 \rangle^{1/2}$, and $\delta$ is the density contrast. The above MFs can indeed be interpreted as well-known geometric quantities: the volume fraction $V_0(\nu)$, the total surface area $V_1(\nu)$, the integral mean curvature $V_2(\nu)$, and the integral Gaussian curvature, i.e., the Euler characteristic $V_3(\nu)$. In our current definitions (see Equations (101, 108), or Equations (102, 115)), one can easily show that $V_3(\nu)$ reduces simply to $-G(\nu)$. The MFs were first introduced to cosmological studies by Mecke et al. [57], and further details may be found in [57, 32]. Analytic expressions of MFs in weakly non-Gaussian fields are derived in [52].
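The relation $V_3(\nu) = -G(\nu)$ for a Gaussian field can be checked numerically. The sketch below assumes the standard Gaussian-field expressions for the four MFs in terms of $\lambda = \sigma_1^2/(6\pi\sigma^2)$ and the Gaussian genus density with $\langle k^2 \rangle = \sigma_1^2/\sigma^2$; the function names are hypothetical.

```python
import math


def gaussian_minkowski(nu, sigma, sigma1):
    """Average Minkowski functionals per unit volume of a 3-D Gaussian field.

    sigma is the r.m.s. of the density contrast, sigma1 the r.m.s. of its
    gradient. Returns (V0, V1, V2, V3) at threshold nu.
    """
    lam = sigma1 ** 2 / (6.0 * math.pi * sigma ** 2)
    gauss = math.exp(-0.5 * nu ** 2)
    root_2pi = math.sqrt(2.0 * math.pi)
    v0 = 0.5 * math.erfc(nu / math.sqrt(2.0))  # volume fraction above nu
    v1 = (2.0 / 3.0) * math.sqrt(lam / (2.0 * math.pi)) * gauss
    v2 = (2.0 / 3.0) * lam / root_2pi * nu * gauss
    v3 = lam ** 1.5 / root_2pi * (nu ** 2 - 1.0) * gauss
    return v0, v1, v2, v3


def gaussian_genus(nu, sigma, sigma1):
    """Gaussian genus density with <k^2> = sigma1^2 / sigma^2."""
    k2_mean = (sigma1 / sigma) ** 2
    amplitude = (k2_mean / 3.0) ** 1.5 / (4.0 * math.pi ** 2)
    return amplitude * (1.0 - nu ** 2) * math.exp(-0.5 * nu ** 2)


if __name__ == "__main__":
    for nu in (-1.5, 0.0, 0.3, 2.0):
        v3 = gaussian_minkowski(nu, sigma=1.3, sigma1=0.7)[3]
        print(nu, v3, -gaussian_genus(nu, sigma=1.3, sigma1=0.7))
```

The two printed columns agree because $(2\pi\lambda)^{3/2}/4\pi^2 = \lambda^{3/2}/\sqrt{2\pi}$.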

Concepts and definitions of biasing
As discussed above, luminous objects, such as galaxies and quasars, are not direct tracers of the mass in the Universe. In fact, a difference of the spatial distribution between luminous objects and dark matter, or a bias, has been indicated from a variety of observations. Galaxy biasing clearly exists. The fact that galaxies of different types cluster differently (see, e.g., [16]) implies that not all of them are exact tracers of the underlying mass distribution (see also Section 6).
In order to confront theoretical model predictions for the mass distribution with observational data, one needs a relation between the density fields of mass and luminous objects. The biasing of density peaks in a Gaussian random field is well formulated [37, 4], and it provides the first theoretical framework for the origin of galaxy density biasing. In this scheme, the galaxy-galaxy and mass-mass correlation functions are related in the linear regime via
$$ \xi_{gg}(r) = b^2 \xi_{mm}(r), \tag{116} $$
where the biasing parameter $b$ is a constant independent of scale $r$. However, a much more specific linear biasing model is often assumed in common applications, in which the local density fluctuation fields of galaxies and mass are assumed to be deterministically related via the relation
$$ \delta_g(\mathbf{x}) = b \, \delta_m(\mathbf{x}). \tag{117} $$
Note that Equation (116) follows from Equation (117), but the reverse is not true.
The above deterministic linear biasing is not based on a reasonable physical motivation. If b > 1, it must break down in deep voids because values of δ g below −1 are forbidden by definition. Even in the simple case of no evolution in comoving galaxy number density, the linear biasing relation is not preserved during the course of fluctuation growth. Non-linear biasing, where b varies with δ m , is inevitable.
Indeed, an analytical model for biasing of halos on the basis of the extended Press-Schechter approximation [59] predicts that the biasing is nonlinear and provides a useful approximation for its behavior as a function of scale, time, and mass threshold. N -body simulations provide a more accurate description of the nonlinearity of the halo biasing confirming the validity of the Mo and White model [35,103].

Modeling biasing
Biasing is likely to be stochastic, not deterministic [15]. An obvious part of this stochasticity can be attributed to the discrete sampling of the density field by galaxies, i.e., the shot noise. In addition, a statistical, physical scatter in the efficiency of galaxy formation as a function of $\delta_m$ is inevitable in any realistic scenario. For example, random variations in the density on smaller scales are likely to be reflected in the efficiency of galaxy formation. As another example, the local geometry of the background structure, via the deformation tensor, must play a role too. Such 'hidden variables' would show up as physical scatter in the density-density relation [87].
Consider the density contrasts of visible objects and mass, $\delta_{\rm obj}(\mathbf{x}, z|R)$ and $\delta_m(\mathbf{x}, z|R)$, at a position $\mathbf{x}$ and a redshift $z$ smoothed over a scale $R$ [86]. In general, the former should depend on various other auxiliary variables $\vec{A}$ defined at different locations $\mathbf{x}'$ and redshifts $z'$ smoothed over different scales $R'$, in addition to the mass density contrast at the same position, $\delta_m(\mathbf{x}, z|R)$. While this relation can be schematically expressed as
$$ \delta_{\rm obj}(\mathbf{x}, z|R) = F[\mathbf{x}, z, R, \delta_m(\mathbf{x}, z|R), \vec{A}(\mathbf{x}', z'|R'), \ldots], $$
it is impossible even to specify the list of the astrophysical variables $\vec{A}$, and thus hopeless to predict the functional form in a rigorous manner. Therefore, if one simply focuses on the relation between $\delta_{\rm obj}(\mathbf{x}, z|R)$ and $\delta_m(\mathbf{x}, z|R)$, the relation inevitably becomes stochastic and nonlinear due to the dependence on the unspecified auxiliary variables $\vec{A}$.
For illustrative purposes, we define the biasing factor as the ratio of the density contrasts of luminous objects and mass:
$$ B_{\rm obj}(\mathbf{x}, z|R) \equiv \frac{\delta_{\rm obj}(\mathbf{x}, z|R)}{\delta_m(\mathbf{x}, z|R)}. $$
Only in very idealized situations may the above nonlocal stochastic nonlinear factor in terms of $\delta_m$ be approximated by
• a local stochastic nonlinear bias,
• a local deterministic nonlinear bias, or
• a local deterministic linear bias.
From the above point of view, the local deterministic linear bias is obviously unrealistic, but it is still a widely used conventional model for biasing. In fact, the time- and scale-dependence of the linear bias factor $b_{\rm obj}(z, R)$ was neglected in many previous studies of biased galaxy formation until very recently. Currently, however, various models beyond the deterministic linear biasing have been seriously considered, with particular emphasis on the nonlinear and stochastic aspects of the biasing [71, 15, 87, 86].

Density peaks and dark matter halos as toy models for galaxy biasing
Let us illustrate the biasing from numerical simulations by considering two specific and popular models: primordial density peaks and dark matter halos [86]. We use the $N$-body simulation data of $L = 100\,h^{-1}$ Mpc again for this purpose [36]. We select density peaks with the threshold of the peak height $\nu_{\rm th} = 1.0$, 2.0, and 3.0. As for the dark matter halos, these are identified using the standard friends-of-friends algorithm with a linking length of 0.2 in units of the mean particle separation. We select halos of mass larger than the threshold $M_{\rm th} = 2.0 \times$ [...]
We locate a fiducial observer in the center of the circle. Then the comoving position vector $\mathbf{r}$ for a particle with a comoving peculiar velocity $\mathbf{v}$ at a redshift $z$ is observed at the position $\mathbf{s}$ in redshift space:
$$ \mathbf{s} = \mathbf{r} + \frac{1}{H(z)} \frac{\mathbf{v} \cdot \mathbf{r}}{|\mathbf{r}|} \frac{\mathbf{r}}{|\mathbf{r}|}, $$
where $H(z)$ is the Hubble parameter at $z$. The right panels in Figures 5 and 6 plot the observed distribution in redshift space, where the redshift-space distortion is quite visible: The coherent velocity field enhances the structure perpendicular to the line-of-sight of the observer (squashing), while the virialized clump becomes elongated along the line-of-sight (finger-of-God).
We use two-point correlation functions to quantify stochasticity and nonlinearity in biasing of peaks and halos, and explore the signature of the redshift-space distortion. Since we are interested in the relation of the biased objects and the dark matter, we introduce three different correlation functions: the auto-correlation functions of dark matter and the objects, $\xi_{mm}$ and $\xi_{oo}$, and their cross-correlation function $\xi_{om}$. In the present case, the subscript $o$ refers to either $h$ (halos) or $\nu$ (peaks). We also use the superscripts R and S to distinguish quantities defined in real and redshift spaces, respectively. We estimate those correlation functions using the standard pair-count method.
The correlation function ξ (S) is evaluated under the distant-observer approximation.
Those correlation functions are plotted in Figures 7 and 8 for peaks and halos, respectively. The correlation functions of biased objects generally have larger amplitudes than those of mass. In nonlinear regimes (ξ > 1) the finger-of-God effect suppresses the amplitude of ξ (S) relative to ξ (R) , while ξ (S) is larger than ξ (R) in linear regimes (ξ < 1) due to the coherent velocity field.
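The mapping from real space to redshift space used in this subsection shifts each particle along the observer's line of sight by its line-of-sight peculiar velocity divided by $H(z)$. A minimal sketch of that mapping follows, assuming the observer sits at the origin and that the velocity and Hubble parameter are given in mutually consistent units (so that velocity divided by `hubble` is a comoving length); the function name is hypothetical.

```python
import math


def to_redshift_space(r, v, hubble):
    """Map a real-space position r to its redshift-space position s.

    r and v are 3-component sequences (position and peculiar velocity);
    the observer is at the origin. Only the velocity component along the
    line of sight r/|r| displaces the particle.
    """
    dist = math.sqrt(sum(x * x for x in r))
    if dist == 0.0:
        raise ValueError("line of sight is undefined at the observer")
    r_hat = [x / dist for x in r]
    v_los = sum(vi * ri for vi, ri in zip(v, r_hat))
    return [x + (v_los / hubble) * rh for x, rh in zip(r, r_hat)]


if __name__ == "__main__":
    # A particle receding along the line of sight appears farther away;
    # the transverse velocity component has no effect.
    print(to_redshift_space([0.0, 0.0, 10.0], [5.0, 0.0, 100.0], hubble=100.0))
```

Infall toward an overdensity therefore compresses structure along the line of sight on large scales, while random virial motions stretch it on small scales.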

Biasing of galaxies in cosmological hydrodynamic simulations
Popular models of the biasing based on the peak or the dark halos are successful in capturing some essential features of biasing. None of the existing models of bias, however, seems to be sophisticated enough for the coming precision cosmology era. The development of a more detailed theoretical model of bias is needed. A straightforward next step is to resort to numerical simulations which take account of galaxy formation even if phenomenological at this point. We show an example of such approaches from Yoshikawa et al. [103] who apply cosmological smoothed particle hydrodynamic (SPH) simulations in the LCDM model with particular attention to the comparison of the biasing of dark halos and simulated galaxies (see also [78]).
Galaxies in their simulations are identified as clumps of cold and dense gas particles which satisfy the Jeans condition and have an SPH density more than 100 times the mean baryon density at each redshift. Dark halos are identified with a standard friends-of-friends algorithm; the linking length is 0.164 times the mean separation of dark matter particles, for instance, at $z = 0$. In addition, they identify the surviving high-density substructures in dark halos, DM cores (see [103] for further details). Figure 9 illustrates the distribution of dark matter particles, gas particles, dark halos, and galaxies at $z = 0$, where galaxies are more strongly clustered than dark halos. Figure 10 depicts a close-up snapshot of the most massive cluster at $z = 0$ with a mass $M \simeq 8 \times 10^{14} M_\odot$. The circles in the lower panels indicate the positions of galaxies identified in our simulation.
Figure 11 shows the joint distribution of $\delta_h$ and $\delta_g$ with the mass density field $\delta_m$ at redshift $z = 0$, 1, and 2 smoothed over $R_s = 12\,h^{-1}$ Mpc. The conditional mean relation $\bar{\delta}_i(\delta_m)$ computed directly from the simulation is plotted in solid lines, while dashed lines indicate theoretical predictions of halo biasing by Taruya and Suto [87]. For a given smoothing scale, the simulated halos exhibit positive biasing for relatively small $\delta_m$, in agreement with the predictions. On the other hand, they tend to be underpopulated for large $\delta_m$, or anti-biased. This is mainly due to the exclusion effect of dark halos due to their finite volume, which is not taken into account in the theoretical model. Since our simulated galaxies have smaller spatial extent than the halos, the exclusion effect is not so serious. This is clearly illustrated in the lower panels in Figure 11, and indeed they show much better agreement with the theoretical model.
We turn next to a more conventional biasing parameter defined through the two-point statistics:
$$ b_{\xi,i}(r) \equiv \sqrt{\frac{\xi_{ii}(r)}{\xi_{mm}(r)}}, $$
where $\xi_{ii}(r)$ and $\xi_{mm}(r)$ are two-point correlation functions of objects $i$ and of dark matter, respectively. While the above biasing parameter is ill-defined where either $\xi_{ii}(r)$ or $\xi_{mm}(r)$ becomes negative, this is not the case at clustering scales of interest ($< 10\,h^{-1}$ Mpc). Figure 12 shows two-point correlation functions of dark matter, galaxies, dark halos, and DM cores (upper and middle panels), and the profiles of biasing parameters $b_\xi(r)$ for those objects (lower panels) at $z = 0$, 1, and 2. In the lower panels, we also plot the parameter $b_{{\rm var},i} \equiv \sigma_i/\sigma_m$, which is defined in terms of the one-point statistics (variance), for comparison on smoothing scales $R_s = 4\,h^{-1}$ Mpc, $8\,h^{-1}$ Mpc, and $12\,h^{-1}$ Mpc at $r = R_s$ for each kind of object by different symbols. In the upper panels, we show the correlation functions of DM cores identified with two different maximum linking lengths, $l_{\rm max} = 0.05$ and $l_{\rm max} = b_h/2$. Correlation functions of DM cores identified with $l_{\rm max} = 0.05$ are similar to those of galaxies. On the other hand, those identified with $l_{\rm max} = b_h/2$ exhibit much weaker correlation, and are rather similar to those of dark halos. This is due to the fact that the present algorithm of group identification with larger $l_{\rm max}$ tends to pick up lower mass halos which are poorly resolved at our numerical resolution.
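The two-point biasing parameter is only defined where both correlation functions are positive. The following sketch, with a hypothetical function name and made-up input values, computes $b_\xi(r) = \sqrt{\xi_{ii}/\xi_{mm}}$ bin by bin and marks undefined bins with `None` rather than guessing a value.

```python
import math


def bias_from_correlations(xi_obj, xi_mass):
    """Scale-dependent bias sqrt(xi_obj / xi_mass) for each separation bin.

    xi_obj and xi_mass are equal-length sequences of correlation-function
    values. Bins where either value is not positive are returned as None,
    since the square-root definition is ill-defined there.
    """
    if len(xi_obj) != len(xi_mass):
        raise ValueError("correlation functions must share the same binning")
    bias = []
    for xo, xm in zip(xi_obj, xi_mass):
        if xo > 0.0 and xm > 0.0:
            bias.append(math.sqrt(xo / xm))
        else:
            bias.append(None)
    return bias


if __name__ == "__main__":
    # Illustrative numbers only, not simulation output.
    print(bias_from_correlations([4.0, 1.0, -0.1], [1.0, 4.0, 0.2]))
```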
The correlation functions of galaxies are almost unchanged with redshift, and the correlation functions of dark halos evolve only slightly between $z = 0$ and 2. By contrast, the amplitude of the dark matter correlation function evolves rapidly, by a factor of $\sim 10$ from $z = 2$ to $z = 0$. The biasing parameter $b_{\xi,g}$ is larger at a higher redshift, for example, $b_{\xi,g} \sim 2$ to 2.5 at $z = 2$. The biasing parameter $b_{\xi,h}$ for dark halos is systematically lower than that of galaxies and DM cores, again due to the volume exclusion effect. At $z = 0$, galaxies and DM cores are slightly anti-biased relative to dark matter at $r \sim 1\,h^{-1}$ Mpc. In lower panels, we also plot the one-point biasing parameter $b_{{\rm var},i} \equiv \sigma_i/\sigma_m$ at $r = R_s$ for comparison. In general we find that $b_{\xi,i}$ is very close to $b_{{\rm var},i}$ at $z \sim 0$, but systematically lower than $b_{{\rm var},i}$ at higher redshifts.
For each galaxy identified at $z = 0$, we define its formation redshift $z_f$ by the epoch when half of its cooled gas particles satisfy our criteria of galaxy formation. Roughly speaking, $z_f$ corresponds to the median formation redshift of stars in the present-day galaxies. We divide all simulated galaxies at $z = 0$ into two populations (the young population with $z_f < 1.7$ and the old population with $z_f > 1.7$) so as to approximate the observed number ratio of 3/1 for late-type and early-type galaxies. The difference in clustering amplitude can also be quantified by their two-point correlation functions at $z = 0$, as plotted in Figure 13. The old population indeed clusters more strongly than the mass, and the young population is anti-biased. The relative bias between the two populations, $b^{\rm rel}_{\xi,g} \equiv \sqrt{\xi_{\rm old}/\xi_{\rm young}}$, ranges between 1.5 and 2 for $1\,h^{-1}$ Mpc $< r < 20\,h^{-1}$ Mpc, where $\xi_{\rm young}$ and $\xi_{\rm old}$ are the two-point correlation functions of the young and old populations.
Figure 13: Two-point correlation functions for the old and young populations of galaxies at $z = 0$ as well as that of the dark matter distribution. The profiles of bias parameters $b_\xi(r)$ for both of the two populations are also shown in the lower panel. (Figure taken from [103].)

Halo occupation function approach for galaxy biasing
Since the clustering of dark matter halos is well understood now, one can describe the galaxy biasing if the halo model is combined with the relation between the halos and luminous objects. This is another approach to galaxy biasing, halo occupation function (HOF), which has become very popular recently. Indeed the basic idea behind HOF has a long history, but the model predictions have been significantly improved with the recent accurate models for the mass function, the biasing and the density profile of dark matter halos. We refer the readers to an extensive review on the HOF by Cooray and Sheth [13]. Here we briefly outline this approach.
We adopt a simple parametric form for the average number of a given galaxy population as a function of the hosting halo mass:
$$ N_g(M) = \begin{cases} (M/M_1)^\alpha & (M > M_{\rm min}), \\ 0 & (M < M_{\rm min}). \end{cases} $$
The above statistical and empirical relation is the essential ingredient in the current modeling. It is characterized by the minimum mass $M_{\rm min}$ of halos which host the population of galaxies, a normalization parameter which can be interpreted as the critical mass $M_1$ above which halos typically host more than one galaxy (note that $M_1$ may exceed $M_{\rm min}$ since the above relation represents the statistical expected value of the number of galaxies), and the power-law index $\alpha$ of the mass dependence of the efficiency of galaxy formation. We will put constraints on the three parameters from the observed number density and clustering amplitude for each galaxy population. In short, the number density of galaxies is most sensitive to $M_1$, which changes the average number of galaxies per halo. The clustering amplitude on large scales is determined by the hosting halos and is thus very sensitive to the mass of those halos, $M_{\rm min}$. The clustering on smaller scales, on the other hand, depends on those three parameters in a fairly complicated fashion; roughly speaking, $M_{\rm min}$ changes the amplitude, while $\alpha$, and to a lesser extent $M_1$ as well, change the slope.
With the above relation, the number density of the corresponding galaxy population at redshift $z$ is given by
$$ n_g(z) = \int_{M_{\rm min}}^{\infty} dM \, n_{\rm halo}(M) N_g(M), $$
where $n_{\rm halo}(M)$ denotes the halo mass function at that redshift. The galaxy two-point correlation function on small scales is dominated by contributions of galaxy pairs located in the same halo. For instance, Bullock et al. [8] adopted the mean number of galaxy pairs $\langle N_g(N_g - 1) \rangle(M)$ within a halo of mass $M$ of the form:
$$ \langle N_g(N_g - 1) \rangle(M) = \begin{cases} N_g^2(M) & (N_g(M) > 1), \\ N_g^2(M) \log(4 N_g(M)) / \log 4 & (1 > N_g(M) > 0.25), \\ 0 & (N_g(M) < 0.25). \end{cases} $$
In the framework of the halo model, the galaxy power spectrum consists of two contributions, one from galaxy pairs located in the same halo (1-halo term) and the other from galaxy pairs located in two different halos (2-halo term):
$$ P_g^{\rm tot}(k) = P_g^{\rm 1h}(k) + P_g^{\rm 2h}(k). $$
The 1-halo term is an integral over the halo mass function of the mean pair count $\langle N_g(N_g - 1) \rangle(M)$ weighted by $|y(k, M)|^p$ and normalized by $n_g^2$; Seljak [77] chose $p = 2$ for $\langle N_g(N_g - 1) \rangle > 1$ and $p = 1$ for $\langle N_g(N_g - 1) \rangle < 1$. The 2-halo term, on the assumption of the linear halo bias model [59], reduces to $P_{\rm lin}(k)$ multiplied by the square of the average of $b(M)\,y(k, M)$ over halos, weighted by $n_{\rm halo}(M) N_g(M)/n_g$. Here $P_{\rm lin}(k)$ is the linear dark matter power spectrum, $b(M)$ is the halo bias factor, and $y(k, M)$ is the Fourier transform of the halo dark matter profile normalized by its mass, $y(k, M) = \tilde{\rho}(k, M)/M$ [77].
The halo occupation formalism, although simple, provides a useful framework for deriving constraints on galaxy formation models from large data sets of the upcoming galaxy redshift surveys. For example, Zehavi et al. [105] used the halo occupation formalism to model departures from a power law in the SDSS galaxy correlation function. They demonstrated that this is due to the transition from a large-scale regime dominated by galaxy pairs in different halos to a small-scale regime dominated by those in the same halo. Magliocchetti and Porciani [47] applied the halo occupation formalism to the 2dFGRS clustering results per spectral type of Madgwick et al. [45]. This provides constraints on the distribution of late-type and early-type galaxies within dark matter halos of different mass.
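The role of the three parameters $(M_{\rm min}, M_1, \alpha)$ can be illustrated with a short sketch that integrates a power-law occupation number over a halo mass function. The mass function used below is a purely hypothetical toy ($n_{\rm halo} \propto M^{-2}$), chosen only because the integral then has a closed form to compare against; it is not one of the accurate mass functions referred to in the text, and all function names are assumptions.

```python
import math


def mean_occupation(mass, m_min, m_1, alpha):
    """Mean number of galaxies in a halo: (M/M_1)^alpha above M_min, else 0."""
    if mass < m_min:
        return 0.0
    return (mass / m_1) ** alpha


def galaxy_number_density(mass_function, m_min, m_1, alpha, m_max, steps=2000):
    """Integrate n_halo(M) * N_g(M) dM from m_min to m_max.

    Uses the trapezoidal rule in ln(M), which suits the wide dynamic
    range of halo masses. m_max stands in for the infinite upper limit.
    """
    h = math.log(m_max / m_min) / steps
    total = 0.0
    for i in range(steps + 1):
        mass = m_min * math.exp(i * h)
        weight = 0.5 if i in (0, steps) else 1.0
        # dM = M d(ln M), hence the extra factor of mass.
        total += weight * mass_function(mass) * mean_occupation(mass, m_min, m_1, alpha) * mass
    return total * h


def toy_mass_function(mass):
    """Hypothetical toy mass function, n_halo(M) = M^-2 in arbitrary units."""
    return mass ** -2.0


if __name__ == "__main__":
    n_g = galaxy_number_density(toy_mass_function, m_min=1.0, m_1=1.0, alpha=0.5, m_max=1.0e4)
    print(n_g)  # closed form for this toy case: 2 * (1 - 1e-2) = 1.98
```

Raising $M_1$ at fixed $M_{\rm min}$ lowers the number density without changing which halos host galaxies, which is the sense in which the number density constrains $M_1$.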

Relativistic Effects Observable in Clustering at High Redshifts
Redshift surveys of galaxies definitely serve as the central database for observational cosmology.
In addition to the existing shallower surveys ($z < 0.2$), clustering in the Universe in the range $z = 1$ to 3 has been partially revealed by, for instance, the Lyman-break galaxies and X-ray selected AGNs. In particular, the 2dF and SDSS QSO redshift surveys promise to extend the observable scale of the Universe by an order of magnitude, up to a few Gpc. A proper interpretation of such redshift surveys in terms of the clustering evolution, however, requires an understanding of many cosmological effects which can be neglected for $z \ll 1$ and thus have not been considered seriously so far. These cosmological contaminations include linear redshift-space (velocity) distortion, nonlinear redshift-space (velocity) distortion, cosmological redshift-space (geometrical) distortion, and the cosmological light-cone effect.
We describe a theoretical formalism to incorporate those effects, in particular the cosmological redshift-distortion and light-cone effects, and present several specific predictions in CDM models. The details of the material presented in this section may be found in [83,101,100,46,28,29].

Cosmological light-cone effect on the two-point correlation functions
Observing a distant patch of the Universe is equivalent to observing the past. Due to the finite light velocity, a line-of-sight direction of a redshift survey is along the time, as well as spatial, coordinate axis. Therefore the entire sample does not consist of objects on a constant-time hypersurface, but rather on a light-cone, i.e., a null hypersurface defined by observers at z = 0. This implies that many properties of the objects change across the depth of the survey volume, including the mean density, the amplitude of spatial clustering of dark matter, the bias of luminous objects with respect to mass, and the intrinsic evolution of the absolute magnitude and spectral energy distribution. These aspects should be properly taken into account in order to extract cosmological information from observed samples of redshift surveys.
In order to predict quantitatively the two-point statistics of objects on the light-cone, one must take account of 1. nonlinear gravitational evolution, 2. linear redshift-space distortion, 3. nonlinear redshift-space distortion, 4. weighted averaging over the light-cone, 5. cosmological redshift-space distortion due to the geometry of the Universe, and 6. object-dependent clustering bias.
Effect 5 comes from our ignorance of the correct cosmological parameters, and Effect 6 is rather sensitive to the objects which one has in mind. Thus the latter two effects will be discussed in Section 5.2.
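Nonlinear gravitational evolution of mass density fluctuations is now well understood, at least for two-point statistics. In practice, we adopt an accurate fitting formula [67] for the nonlinear power spectrum $P^{(R)}_{\rm nl}(k, z)$ in terms of its linear counterpart. If one assumes a scale-independent deterministic linear bias, furthermore, the power spectrum distorted by the peculiar velocity field is known to be well approximated by the following expression:
$$ P^{(S)}(k_\perp, k_\parallel; z) = b^2(z) P^{(R)}_{\rm mass}(k; z) \left[ 1 + \beta(z) \left( \frac{k_\parallel}{k} \right)^2 \right]^2 D_{\rm vel}[k_\parallel \sigma_P(z)], \tag{131} $$
where $k_\perp$ and $k_\parallel$ are the comoving wavenumbers perpendicular and parallel to the line-of-sight of an observer, and $P^{(R)}_{\rm mass}(k; z)$ is the mass power spectrum in real space. The factor in square brackets comes from the linear redshift-space distortion [38], and the last factor is a phenomenological correction for the non-linear velocity effect [67]. In the above, we introduce
$$ \beta(z) \equiv \frac{1}{b(z)} \frac{d \ln D(z)}{d \ln a} \simeq \frac{1}{b(z)} \left[ \Omega_m^{0.6}(z) + \frac{\Omega_\Lambda(z)}{70} \left( 1 + \frac{\Omega_m(z)}{2} \right) \right], $$
with $D(z)$ being the linear growth rate. We assume that the pair-wise velocity distribution in real space is approximated by
$$ f_v(v_{12}) = \frac{1}{\sqrt{2} \sigma_P} \exp\left( -\frac{\sqrt{2} |v_{12}|}{\sigma_P} \right), $$
with $\sigma_P$ being the 1-dimensional pair-wise peculiar velocity dispersion. Then the finger-of-God effect is modeled by the damping function $D_{\rm vel}[k_\parallel \sigma_P(z)]$:
$$ D_{\rm vel}[k_\parallel \sigma_P(z)] = \frac{1}{1 + \kappa^2 \mu^2}, $$
where $\mu$ is the direction cosine in $k$-space, and the dimensionless wavenumber $\kappa$ is related to the peculiar velocity dispersion $\sigma_P$ in the physical velocity units:
$$ \kappa(z) = \frac{k (1 + z) \sigma_P(z)}{\sqrt{2} H(z)}. $$
Since we are mainly interested in the scales around $1\,h^{-1}$ Mpc, we adopt throughout the analysis below a fitting formula for $\sigma_P(z)$ which better approximates the small-scale dispersions in physical units. Integrating Equation (131) over $\mu$, one obtains the direction-averaged power spectrum in redshift space:
$$ P^{(S)}(k; z) = b^2(z) P^{(R)}_{\rm mass}(k; z) \left[ A(\kappa) + \frac{2}{3} \beta(z) B(\kappa) + \frac{1}{5} \beta^2(z) C(\kappa) \right], $$
where
$$ A(\kappa) = \frac{\arctan \kappa}{\kappa}, \qquad B(\kappa) = \frac{3}{\kappa^2} \left[ 1 - \frac{\arctan \kappa}{\kappa} \right], \qquad C(\kappa) = \frac{5}{3\kappa^2} \left[ 1 - \frac{3}{\kappa^2} + \frac{3 \arctan \kappa}{\kappa^3} \right]. $$
Adopting those approximations, the direction-averaged correlation functions on the light-cone are finally computed as
$$ \xi^{\rm LC}(x_s) = \frac{\int_{z_{\rm min}}^{z_{\rm max}} dz \, \frac{dV_c}{dz} [\phi(z) n_0(z)]^2 \, \xi(x_s; z)}{\int_{z_{\rm min}}^{z_{\rm max}} dz \, \frac{dV_c}{dz} [\phi(z) n_0(z)]^2}, $$
where $z_{\rm min}$ and $z_{\rm max}$ denote the redshift range of the survey, $\phi(z)$ is the selection function, $n_0(z)$ is the comoving number density of the objects, and
$$ \xi(x_s; z) \equiv \frac{1}{2\pi^2} \int_0^\infty P^{(S)}(k; z) \frac{\sin k x_s}{k x_s} k^2 \, dk. $$
Throughout the present analysis, we assume a standard Robertson-Walker metric of the form
$$ ds^2 = -dt^2 + a(t)^2 \left\{ d\chi^2 + S_K(\chi)^2 \left[ d\theta^2 + \sin^2\theta \, d\phi^2 \right] \right\}, $$
where $S_K(\chi)$ is determined by the sign of the curvature $K$ as
$$ S_K(\chi) = \begin{cases} \sin(\sqrt{K}\chi)/\sqrt{K} & (K > 0), \\ \chi & (K = 0), \\ \sinh(\sqrt{-K}\chi)/\sqrt{-K} & (K < 0), \end{cases} $$
where the present scale factor $a_0$ is normalized as unity, and the spatial curvature $K$ is given as
$$ K = H_0^2 (\Omega_m + \Omega_\Lambda - 1) $$
(see Equation (13)). The radial comoving distance $\chi(z)$ is computed by
$$ \chi(z) = \int_t^{t_0} \frac{dt}{a(t)} = \int_0^z \frac{dz'}{H(z')}. $$
The comoving angular diameter distance $D_c(z)$ at redshift $z$ is equivalent to $S_K(\chi(z))$, and, in the case of $\Omega_\Lambda = 0$, is explicitly given by Mattig's formula:
$$ D_c(z) = \frac{1}{H_0} \frac{z}{1 + z} \frac{1 + z + \sqrt{1 + \Omega_m z}}{1 + \Omega_m z/2 + \sqrt{1 + \Omega_m z}}. $$
Then $dV_c/dz$, the comoving volume element per unit solid angle, is explicitly given as
$$ \frac{dV_c}{dz} = S_K^2(\chi) \frac{d\chi}{dz} = \frac{S_K^2(\chi)}{H_0 \sqrt{\Omega_m (1 + z)^3 + (1 - \Omega_m - \Omega_\Lambda)(1 + z)^2 + \Omega_\Lambda}}. $$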
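The direction-averaged redshift-space factor discussed above can be checked by brute force. The sketch below assumes a Kaiser factor $(1 + \beta\mu^2)^2$ multiplied by a Lorentzian damping $1/(1 + \kappa^2\mu^2)$, which is the damping implied by an exponential pair-wise velocity distribution, and compares the closed-form average over $\mu$ with a midpoint-rule integration. The function names are hypothetical.

```python
import math


def damping_coefficients(kappa):
    """Closed-form mu-averages A, B, C for a Lorentzian damping (kappa > 0)."""
    ratio = math.atan(kappa) / kappa
    a = ratio
    b = 3.0 / kappa ** 2 * (1.0 - ratio)
    c = 5.0 / (3.0 * kappa ** 2) * (1.0 - 3.0 / kappa ** 2 + 3.0 * math.atan(kappa) / kappa ** 3)
    return a, b, c


def angle_averaged_factor(beta, kappa):
    """Analytic average of (1 + beta mu^2)^2 / (1 + kappa^2 mu^2) over mu in [0, 1]."""
    a, b, c = damping_coefficients(kappa)
    return a + (2.0 / 3.0) * beta * b + (1.0 / 5.0) * beta ** 2 * c


def angle_averaged_factor_numeric(beta, kappa, steps=20000):
    """Same average evaluated with the midpoint rule, as an independent check."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        mu = (i + 0.5) * h
        total += (1.0 + beta * mu * mu) ** 2 / (1.0 + (kappa * mu) ** 2)
    return total * h


if __name__ == "__main__":
    for kappa in (0.1, 1.5, 10.0):
        print(kappa, angle_averaged_factor(0.5, kappa), angle_averaged_factor_numeric(0.5, kappa))
```

For small $\kappa$ the factor tends to the pure Kaiser value $1 + 2\beta/3 + \beta^2/5$, while for large $\kappa$ the damping dominates and the factor falls below unity, which is the competition between linear enhancement and finger-of-God suppression described in the text.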

Evaluating two-point correlation functions from N-body simulation data
The theoretical modeling described above was tested against simulation results by Hamana, Colombi, and Suto [28]. Using cosmological $N$-body simulations in SCDM and Λ-CDM models, they generated light-cone samples as follows: First, they adopt the distant-observer approximation and assume that the line-of-sight direction is parallel to the $Z$-axis regardless of the $(X, Y)$ position. Second, they periodically duplicate the simulation box along the $Z$-direction so that at a redshift $z$, the positions and velocities of those particles located within an interval $\chi(z) \pm \Delta\chi(z)$ are dumped, where $\Delta\chi(z)$ is determined by the output time-interval of the original $N$-body simulation. Finally they extract five independent (non-overlapping) cone-shaped samples with an angular radius of 1 degree (a field-of-view of $\pi$ deg$^2$). In this manner, they have generated mock data samples on the light-cone continuously extending up to $z = 0.4$ (relevant for galaxy samples) and $z = 2.0$ (relevant for QSO samples) from the small and large boxes, respectively.
The two-point correlation function is estimated by the conventional pair-count adopting the following estimator [43]:
$$ \xi(x) = \frac{DD(x) - 2DR(x) + RR(x)}{RR(x)}, $$
where $DD$, $DR$, and $RR$ are the normalized counts of data-data, data-random, and random-random pairs at separation $x$. The comoving separation $x_{12}$ of two objects located at $z_1$ and $z_2$ with an angular separation $\theta_{12}$ is given by
$$ x_{12}^2 = x_1^2 + x_2^2 - K x_1^2 x_2^2 (1 + \cos^2\theta_{12}) - 2 x_1 x_2 \sqrt{1 - K x_1^2} \sqrt{1 - K x_2^2} \cos\theta_{12}, $$
where $x_1 \equiv D_c(z_1)$ and $x_2 \equiv D_c(z_2)$. In redshift space, the observed redshift $z_{\rm obs}$ for each object differs from the "real" one $z_{\rm real}$ due to the velocity distortion effect:
$$ z_{\rm obs} = z_{\rm real} + (1 + z_{\rm real}) \frac{v_{\rm pec}}{c}, $$
where $v_{\rm pec}$ is the line-of-sight relative peculiar velocity between the object and the observer in physical units. Then the comoving separation $s_{12}$ of two objects in redshift space is computed as
$$ s_{12}^2 = s_1^2 + s_2^2 - K s_1^2 s_2^2 (1 + \cos^2\theta_{12}) - 2 s_1 s_2 \sqrt{1 - K s_1^2} \sqrt{1 - K s_2^2} \cos\theta_{12}, $$
where $s_1 \equiv D_c(z_{\rm obs,1})$ and $s_2 \equiv D_c(z_{\rm obs,2})$.
In properly predicting the power spectra on the light-cone, the selection function should be specified. For galaxies, we adopt a B-band luminosity function of the APM galaxies fitted to the Schechter function [44]. For quasars, we adopt the B-band luminosity function from the 2dF QSO survey data [7]. To compute the B-band apparent magnitude from a quasar of absolute magnitude $M_B$ at $z$ (with the luminosity distance $d_L(z)$), we applied the K-correction,
$$ B = M_B + 5 \log\left( \frac{d_L(z)}{10\,{\rm pc}} \right) - 2.5 (1 - p) \log(1 + z), $$
for the quasar energy spectrum $L_\nu \propto \nu^{-p}$ (we use $p = 0.5$). In practice, we adopt the galaxy selection function $\phi_{\rm gal}(< B_{\rm lim}, z)$ with $B_{\rm lim} = 19$ and $z_{\rm min} = 0.01$ for the small box realizations, and the QSO selection function $\phi_{\rm QSO}(< B_{\rm lim}, z)$ with $B_{\rm lim} = 21$ and $z_{\rm min} = 0.2$ for the large box realizations. We do not introduce the spatial biasing between selected particles and the underlying dark matter. Figures 14 and 15 show the two-point correlation functions in SCDM and Λ-CDM, respectively, taking account of the selection functions. It is clear that the simulation results and the predictions are in good agreement.
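The two geometric ingredients of the pair count, the comoving separation of a pair and the shift of the observed redshift by the peculiar velocity, can be sketched as follows. The separation formula is the law of cosines of a constant-curvature space written in terms of the areal radii $x_i = D_c(z_i)$; it is stated here as an assumption about the intended expression, and the function names are hypothetical. The curvature $K$ must be given in units of inverse length squared consistent with $x_1$ and $x_2$.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def comoving_separation(x1, x2, theta, curvature=0.0):
    """Separation of two points at areal radii x1, x2 and angle theta (radians).

    For curvature = 0 this is the ordinary law of cosines; for non-zero
    curvature it follows from the spherical or hyperbolic law of cosines.
    """
    cos_t = math.cos(theta)
    root1 = math.sqrt(1.0 - curvature * x1 * x1)
    root2 = math.sqrt(1.0 - curvature * x2 * x2)
    sq = (x1 * x1 + x2 * x2
          - curvature * x1 * x1 * x2 * x2 * (1.0 + cos_t ** 2)
          - 2.0 * x1 * x2 * root1 * root2 * cos_t)
    return math.sqrt(max(sq, 0.0))  # guard against tiny negative round-off


def observed_redshift(z_real, v_pec_kms):
    """Redshift including the line-of-sight peculiar velocity (km/s, physical)."""
    return z_real + (1.0 + z_real) * v_pec_kms / C_KMS


if __name__ == "__main__":
    print(comoving_separation(3.0, 4.0, math.pi / 2.0))  # flat space, right angle
    print(observed_redshift(1.0, 300.0))
```

In a closed model with $K = 1$, two points on the same line of sight at $\chi = 0.3$ and $\chi = 0.5$ have areal radii $\sin 0.3$ and $\sin 0.5$, and the function returns $\sin 0.2$, the areal radius corresponding to their radial separation.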

Cosmological redshift-space distortion
Consider a spherical object at high redshift. If the wrong cosmology is assumed in interpreting the distance-redshift relation along the line of sight and in the transverse direction, the sphere will appear distorted. Alcock and Paczynski [2] pointed out that this curvature effect could be used to estimate the cosmological constant. Matsubara and Suto [54] and Ballinger, Peacock, and Heavens [3] developed a theoretical framework to describe the geometrical distortion effect (cosmological redshift distortion) in the two-point correlation function and the power spectrum of distant objects, respectively. Certain studies were less optimistic than others about the possibility of measuring this Alcock-Paczynski effect. For example, Ballinger, Peacock, and Heavens [3] argued that the geometrical distortion could be confused with the dynamical redshift distortions caused by peculiar velocities and characterized by the linear theory parameter $\beta \equiv \Omega_m^{0.6}/b$. Matsubara and Szalay [55, 56] showed that the typical SDSS and 2dF samples of normal galaxies at low redshift ($z \sim 0.1$) have sufficiently high signal-to-noise, but they are too shallow to detect the Alcock-Paczynski effect. On the other hand, the SDSS and 2dF quasar surveys are at a useful redshift, but they are too sparse. A more promising sample is the SDSS Luminous Red Galaxies survey (out to redshift $z \sim 0.5$), which turns out to be optimal in terms of both depth and density.
Figure 14: Mass two-point correlation functions on the light-cone for particles with redshift-dependent selection functions in the SCDM model, for $z < 0.4$ (upper panels) and $0.2 < z < 2.0$ (lower panels). Left panels: with selection function whose shape is the same as that of the B-band magnitude limit of 19 for galaxies (upper) and 21 for QSOs (lower); right panels: randomly selected $N \sim 10^4$ particles from the particles in the results from the left panels. (Figure taken ...)
While this analysis is promising, it remains to be tested if non-linear clustering and complicated biasing (which is quite plausible for red galaxies) would not 'contaminate' the measurement of the equation of state. Even if the Alcock-Paczynski test turns out to be less accurate than other cosmological tests (e.g., CMB and SN Ia), the effect itself is an interesting and important ingredient in analyzing the clustering pattern of galaxies at high redshifts. We shall now present the formalism for this effect.
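Due to a general-relativistic effect through the geometry of the Universe, the observable separations perpendicular and parallel to the line-of-sight direction, $x_{s\perp} = (c/H_0) z \, \delta\theta$ and $x_{s\parallel} = (c/H_0) \, \delta z$, are mapped differently to the corresponding comoving separations in real space $x_\perp$ and $x_\parallel$:
$$ x_{s\perp}(z) = x_\perp \frac{cz}{H_0 (1 + z) d_A(z)} \equiv \frac{x_\perp}{c_\perp(z)}, \qquad x_{s\parallel}(z) = x_\parallel \frac{H(z)}{H_0} \equiv \frac{x_\parallel}{c_\parallel(z)}, \tag{157} $$
with $d_A(z)$ being the angular diameter distance. The difference between $c_\perp(z)$ and $c_\parallel(z)$ generates an apparent anisotropy in the clustering statistics, which should be isotropic in the comoving space. Then the power spectrum in cosmological redshift space $P^{(\rm CRD)}$ is related to $P^{(S)}$ defined in the comoving redshift space as
$$ P^{(\rm CRD)}(k_{s\perp}, k_{s\parallel}; z) = \frac{1}{c_\perp(z)^2 c_\parallel(z)} P^{(S)}\left( \frac{k_{s\perp}}{c_\perp(z)}, \frac{k_{s\parallel}}{c_\parallel(z)}; z \right), $$
where the first factor comes from the Jacobian of the volume element $dk_{s\perp}^2 \, dk_{s\parallel}$, and $k_{s\perp} = c_\perp(z) k_\perp$ and $k_{s\parallel} = c_\parallel(z) k_\parallel$ are the wavenumbers perpendicular and parallel to the line-of-sight direction.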
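The two geometrical factors can be evaluated for any $(\Omega_m, \Omega_\Lambda)$ once the radial comoving distance is known. The sketch below works in units $c = H_0 = 1$ and assumes the definitions $c_\parallel(z) = H_0/H(z)$ and $c_\perp(z) = H_0 (1+z) d_A(z)/(cz)$, i.e. the comoving angular diameter distance divided by $z$. The function names and the use of Simpson's rule are choices made here, not taken from the source.

```python
import math


def hubble_ratio(z, omega_m, omega_l):
    """H(z)/H_0 for matter, curvature, and a cosmological constant."""
    omega_k = 1.0 - omega_m - omega_l
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_k * (1.0 + z) ** 2 + omega_l)


def radial_distance(z, omega_m, omega_l, steps=1000):
    """Radial comoving distance chi(z) in units of c/H_0 (Simpson's rule)."""
    h = z / steps  # steps must be even
    total = 1.0 / hubble_ratio(0.0, omega_m, omega_l) + 1.0 / hubble_ratio(z, omega_m, omega_l)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) / hubble_ratio(i * h, omega_m, omega_l)
    return total * h / 3.0


def transverse_distance(z, omega_m, omega_l):
    """Comoving angular diameter distance S_K(chi) in units of c/H_0."""
    chi = radial_distance(z, omega_m, omega_l)
    k = omega_m + omega_l - 1.0  # curvature in units of (H_0/c)^2
    if abs(k) < 1e-12:
        return chi
    if k > 0.0:
        return math.sin(math.sqrt(k) * chi) / math.sqrt(k)
    return math.sinh(math.sqrt(-k) * chi) / math.sqrt(-k)


def alcock_paczynski_factors(z, omega_m, omega_l):
    """Return (c_perp, c_par) at redshift z > 0."""
    c_perp = transverse_distance(z, omega_m, omega_l) / z
    c_par = 1.0 / hubble_ratio(z, omega_m, omega_l)
    return c_perp, c_par


if __name__ == "__main__":
    for om, ol in ((1.0, 0.0), (0.3, 0.0), (0.3, 0.7)):
        print(om, ol, alcock_paczynski_factors(1.0, om, ol))
```

The ratio $c_\perp/c_\parallel$ differs between cosmological models at fixed $z$, and it is this ratio that turns an intrinsically isotropic clustering pattern into an apparently anisotropic one.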
Next we decompose the power spectrum into harmonics,
$$ P(k, \mu_k; z) = \sum_{l: \, {\rm even}} P_l(k) L_l(\mu_k), \qquad P_l(k; z) \equiv \frac{2l + 1}{2} \int_{-1}^{1} d\mu_k \, P(k, \mu_k; z) L_l(\mu_k), \tag{159} $$
where $L_l(\mu_k)$ are the $l$-th order Legendre polynomials. Similarly, the two-point correlation function is decomposed as
$$ \xi(x, \mu_x; z) = \sum_{l: \, {\rm even}} \xi_l(x) L_l(\mu_x), \qquad \xi_l(x; z) \equiv \frac{2l + 1}{2} \int_{-1}^{1} d\mu_x \, \xi(x, \mu_x; z) L_l(\mu_x), $$
using the direction cosine $\mu_x$ between the separation vector and the line-of-sight. The above multipole moments satisfy the following relations:
$$ P_l(k; z) = 4\pi i^l \int_0^\infty \xi_l(x; z) j_l(kx) x^2 \, dx, \qquad \xi_l(x; z) = \frac{1}{2\pi^2 i^l} \int_0^\infty P_l(k; z) j_l(kx) k^2 \, dk, $$
with $j_l(kx)$ being spherical Bessel functions. Substituting $P^{(\rm CRD)}(k_s, \mu_k; z)$ in Equation (159) yields the multipoles $P^{(\rm CRD)}_l(k_s; z)$.
[...] We randomly selected $N = 5 \times 10^3$ (upper panels), $N = 5 \times 10^4$ (middle panels), and $N = 5 \times 10^5$ (lower panels) particles from $N$-body simulation. The value of $\sigma_8$ is adopted from the cluster abundance. (Figure taken from [46].)
A comparison of the monopoles and quadrupoles from simulations and model predictions exhibits how the results are sensitive to the cosmological parameters, which in turn may put potentially useful constraints on $(\Omega_m, \Omega_\Lambda)$. Figure 17 indicates the feasibility, which interestingly results in a constraint fairly orthogonal to that from the supernovae Ia Hubble diagram.
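The harmonic decomposition can be illustrated on the simplest anisotropic spectrum, the linear Kaiser factor $(1 + \beta\mu^2)^2$ with no damping and no geometrical distortion, for which the monopole, quadrupole, and hexadecapole are known in closed form: $1 + 2\beta/3 + \beta^2/5$, $4\beta/3 + 4\beta^2/7$, and $8\beta^2/35$. The sketch below projects onto Legendre polynomials by Simpson integration; the function names are hypothetical, and a realistic application would replace the Kaiser factor by the full $P^{(\rm CRD)}$.

```python
def legendre(l, mu):
    """Legendre polynomial L_l(mu) via the Bonnet recurrence."""
    if l == 0:
        return 1.0
    prev, cur = 1.0, mu
    for n in range(1, l):
        prev, cur = cur, ((2 * n + 1) * mu * cur - n * prev) / (n + 1)
    return cur


def multipole(angular_function, l, steps=2000):
    """(2l+1)/2 times the integral of f(mu) L_l(mu) over mu in [-1, 1].

    Uses Simpson's rule; steps must be even.
    """
    h = 2.0 / steps
    total = 0.0
    for i in range(steps + 1):
        mu = -1.0 + i * h
        if i in (0, steps):
            weight = 1.0
        else:
            weight = 4.0 if i % 2 else 2.0
        total += weight * angular_function(mu) * legendre(l, mu)
    return (2 * l + 1) / 2.0 * total * h / 3.0


def kaiser_factor(beta):
    """Return f(mu) = (1 + beta mu^2)^2 as a function of mu."""
    return lambda mu: (1.0 + beta * mu * mu) ** 2


if __name__ == "__main__":
    beta = 0.5
    for l in (0, 2, 4):
        print(l, multipole(kaiser_factor(beta), l))
```

Odd multipoles vanish for any spectrum that is even in $\mu$, which is why only even $l$ appear in the sums above.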

Two-point clustering statistics on a light-cone in cosmological redshift space
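In order to explore the relation between the two-point statistics on a constant-time hypersurface in real space and those on a light-cone hypersurface in cosmological redshift space, we simply consider the case of deterministic, linear, and scale-independent bias:

$$\delta(\mathbf{x}, z) = b(z)\, \delta_{\mathrm{mass}}(\mathbf{x}, z).$$

In what follows, we explicitly use the subscript 'mass' to indicate the quantities related to the mass density field, while those without the subscript correspond to objects satisfying Equation (163). Using Equation (157), the two-point correlation function in cosmological redshift space, $\xi^{(\mathrm{CRD})}(x_{s\perp}, x_{s\parallel}; z)$, is computed as

$$\xi^{(\mathrm{CRD})}(x_{s\perp}, x_{s\parallel}; z) = \xi^{(S)}\!\left(c_\perp(z)\, x_{s\perp},\; c_\parallel(z)\, x_{s\parallel};\; z\right),$$

where $\xi^{(S)}(x_\perp, x_\parallel; z)$ is the redshift-space correlation function defined through Equation (131). Since $P^{(\mathrm{CRD})}$ and $\xi^{(\mathrm{CRD})}$ are defined in redshift space, the proper weight in averaging over the light-cone is

$$d^3 s^{(\mathrm{CRD})}\, [\phi(z)\, n_0^{\mathrm{CRD}}(z)]^2 = d^3 x^{(\mathrm{com})}\, [\phi(z)\, n_0^{\mathrm{com}}(z)]^2\, c_\perp(z)^2\, c_\parallel(z),$$

where $n_0^{\mathrm{CRD}}(z)$ and $n_0^{\mathrm{com}}(z)$ denote the number densities of the objects in cosmological redshift space and comoving space, respectively, and $\phi(z)$ is the selection function determined by the observational target selection and the luminosity function of the objects. Then, the final expressions [84] reduce to

$$P_l^{(\mathrm{LC,CRD})}(k_s) = \frac{\displaystyle\int_{z_{\min}}^{z_{\max}} dz\, \frac{dV_c}{dz}\, [\phi(z)\, n_0^{\mathrm{com}}(z)]^2\, c_\perp(z)^2 c_\parallel(z)\, P_l^{(\mathrm{CRD})}(k_s; z)}{\displaystyle\int_{z_{\min}}^{z_{\max}} dz\, \frac{dV_c}{dz}\, [\phi(z)\, n_0^{\mathrm{com}}(z)]^2\, c_\perp(z)^2 c_\parallel(z)},$$

and the analogous expression for $\xi_l^{(\mathrm{LC,CRD})}(x_s)$ with $P_l^{(\mathrm{CRD})}(k_s; z)$ replaced by $\xi_l^{(\mathrm{CRD})}(x_s; z)$, where $z_{\min}$ and $z_{\max}$ denote the redshift range of the survey, and $dV_c/dz = d_C^2(z)/H(z)$ is the comoving volume element per unit solid angle.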
Note that $k_s$ and $x_s$, defined in $P$ [...]

Figure 18 compares the predictions for the angle-averaged (monopole) power spectra under various approximations. The upper and lower panels adopt the selection functions appropriate for galaxies in $0 < z < z_{\max} = 0.2$ and QSOs in $0 < z < z_{\max} = 5$, respectively. The left and right panels present the results in the SCDM and LCDM models. For simplicity we adopt a scale-independent linear bias model [23]:

$$b(z) = 1 + \frac{1}{D(z)}\, [b(k, z = 0) - 1],$$

where $D(z)$ is the linear growth factor normalized to unity at $z = 0$, with $b(k, z = 0) = 1$ and 1.5 for galaxies and quasars, respectively.

Figure 18: Light-cone and cosmological redshift-space distortion effects on angle-averaged power spectra. (Figure taken from [84].) The upper and lower panels correspond to magnitude-limited samples of galaxies ($B < 19$ in $0 < z < z_{\max} = 0.2$; no bias model) and QSOs ($B < 20$ in $0 < z < z_{\max} = 5$; Fry's linear bias model), respectively. We present the results normalized by the real-space power spectrum in linear theory $P^{(R,\mathrm{lin})}(k; z)$ [4], and the $P(k_s)$ curves are computed using the nonlinear power spectrum [67].

Consider first the results for the galaxy sample (upper panels). On linear scales ($k < 0.1\,h\,\mathrm{Mpc}^{-1}$), $P_0^{(S)}(k; z = 0)$, plotted in dashed lines, is enhanced relative to that in real space, mainly due to the linear redshift-space distortion (the Kaiser factor in Equation (131)). On nonlinear scales, the nonlinear gravitational evolution increases the power spectrum in real space, while the finger-of-God effect suppresses that in redshift space. Thus, the net result is sensitive to the shape and the amplitude of the fluctuation spectrum itself. In the LCDM model that we adopted, the nonlinear gravitational growth in real space is stronger than the suppression due to the finger-of-God effect, so $P_0^{(S)}(k; z = 0)$ becomes larger than its real-space counterpart in linear theory. In the SCDM model the situation is reversed and $P_0^{(S)}(k; z = 0)$ becomes smaller.
The power spectra at $z = 0.2$ (dash-dotted lines) are smaller than those at $z = 0$ by the corresponding growth factor of the fluctuations, and one might expect that the amplitude of the power spectra on the light-cone (solid lines) would lie between the two. While this is correct if we use the comoving wavenumber, the actual observation on the light-cone in cosmological redshift space should be expressed in terms of $k_s$ (see Equation (158)). If we plot the power spectra at $z = 0.2$ taking into account the geometrical distortion, $P(k_s; z = 0.2)$ [...]. This explains the qualitative features shown in the upper panels of Figure 18. As a result, both the cosmological redshift-space distortion and the light-cone effect substantially change the predicted shape and amplitude of the power spectra, even for the galaxy sample [60]. The results for the QSO sample can be understood in a similar manner, except that the evolution of the bias makes a significant difference, since the sample extends to much higher redshifts.

Figure 19 shows the results for the angle-averaged (monopole) two-point correlation functions, corresponding exactly to those in Figure 18. The results in this figure can also be understood by analogy with those presented in Figure 18 at $k \sim 2\pi/x$. Unlike the power spectra, however, two-point correlation functions are not positive definite. The peculiar features in Figure 19 on scales larger than $30\,h^{-1}$ Mpc ($100\,h^{-1}$ Mpc) in SCDM (LCDM) originate from the fact that $\xi^{(R,\mathrm{lin})}(x, z = 0)$ becomes negative there.
In fact, since the resulting predictions are sensitive to the bias, which is unlikely to be specified quantitatively by theory, the present methodology will find two completely different applications. For relatively shallow catalogues, like galaxy samples, the evolution of bias is not expected to be strong. Thus, one may estimate the cosmological parameters from the observed degree of the redshift distortion, as has been done conventionally. Most importantly, we can correct for the systematics due to the light-cone and geometrical distortion effects, which affect the estimate of the parameters by ∼ 10%. Alternatively, for deeper catalogues like high-redshift quasar samples, one can extract information on the object-dependent bias only by correcting the observed data on the basis of our formulae.
In a sense, the former approach uses the light-cone and geometrical distortion effects as real cosmological signals, while the latter regards them as inevitable, but physically removable, noise. In both cases, the present methodology is important in properly interpreting the observations of the Universe at high redshifts.
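The statement that the light-cone amplitude lies between the constant-time amplitudes at the two ends of the survey can be illustrated with a toy calculation. The sketch below treats the light-cone spectrum at fixed comoving wavenumber as a weighted redshift average of constant-time spectra. The uniform weight and the Einstein-de Sitter growth law $P(z) \propto (1+z)^{-2}$ are simplifying assumptions made only for illustration; a realistic weight would combine the comoving volume element with the squared, selection-weighted number density.

```python
def light_cone_average(spectrum_at_z, weight, z_min, z_max, steps=1000):
    """Weighted redshift average of spectrum_at_z(z) using the trapezoidal rule."""
    h = (z_max - z_min) / steps
    numerator = 0.0
    denominator = 0.0
    for i in range(steps + 1):
        z = z_min + i * h
        edge = 0.5 if i in (0, steps) else 1.0   # trapezoidal end-point weights
        w = edge * weight(z)
        numerator += w * spectrum_at_z(z)
        denominator += w
    return numerator / denominator

def toy_spectrum(z):
    """Amplitude at fixed comoving k, relative to z = 0 (assumed growth law)."""
    return 1.0 / (1.0 + z) ** 2

def uniform_weight(z):
    """Placeholder weight; a survey-specific weight would go here."""
    return 1.0

averaged = light_cone_average(toy_spectrum, uniform_weight, 0.0, 0.2)
print(f"P(z=0) = {toy_spectrum(0.0):.3f}, "
      f"light-cone average = {averaged:.3f}, "
      f"P(z=0.2) = {toy_spectrum(0.2):.3f}")
```

Because the result is an average with non-negative weights, it is bounded by the extreme values of the integrand. The geometrical distortion breaks this simple ordering only because the observed wavenumber $k_s$ differs from the comoving one.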

The latest galaxy redshift surveys
Redshift surveys in the 1980s and the 1990s (e.g., the CfA, IRAS, and Las Campanas surveys) measured thousands to tens of thousands of galaxy redshifts. Multifibre technology now allows us to measure redshifts of millions of galaxies. Below we briefly summarize the properties of the main new surveys (2dFGRS, SDSS, 6dF, VIRMOS, and DEEP2), and we discuss key results from 2dFGRS and SDSS. Further analysis of these surveys is currently underway.

The 2dF galaxy redshift survey
The Anglo-Australian 2-degree Field Galaxy Redshift Survey (2dFGRS) [89] has recently been completed with redshifts for 230,000 galaxies selected from the APM catalogue (December 2002) down to an extinction-corrected magnitude limit of $b_J < 19.45$. The main survey regions are two declination strips, in the northern and southern Galactic hemispheres, and also 100 random fields, covering in total about 1800 deg² (see Figures 20 and 21). The median redshift of the 2dFGRS is $\bar{z} \sim 0.1$ (see [11,65] for reviews).

The SDSS galaxy redshift survey
The SDSS (Sloan Digital Sky Survey) is a U.S.-Japan-Germany joint project to image a quarter of the Celestial Sphere at high Galactic latitude as well as to obtain spectra of galaxies and quasars from the imaging data [93]. The dedicated 2.5 meter telescope at Apache Point Observatory is equipped with a multi-CCD camera with five broad bands centered at 3561, 4676, 6176, 7494, and 8873 Å. For further details of SDSS, see [102,80]. The latest map of the SDSS galaxy distribution, together with a typical slice, is shown in Figures 22 and 23 (see also [32]). The three-dimensional map centered on us in the equatorial coordinate system is shown in Figure 22. Redshift slices of galaxies centered around the equatorial plane with various redshift limits and thicknesses of planes are shown in Figure 23.

The 6dF galaxy redshift survey
The 6dF (6-degree Field) [91] is a survey of redshifts and peculiar velocities of galaxies selected primarily in the Near Infrared from the new 2MASS (Two Micron All Sky Survey) catalogue [90]. One goal is to measure redshifts of more than 170,000 galaxies over nearly the entire Southern sky. Another exciting aim of the survey is to measure peculiar velocities (using 2MASS photometry and 6dF velocity dispersions) of about 15,000 galaxies out to 150 h −1 Mpc. The high quality data of this survey could revive peculiar velocities as a cosmological probe (which was very popular about 10-15 years ago). Observations have so far obtained nearly 40,000 redshifts and completion is expected in 2005.

The DEEP galaxy redshift survey
The DEEP survey is a two-phased project using the Keck telescopes to study the properties and distribution of high redshift galaxies [92]. Phase 1 used the LRIS spectrograph to study a sample of ∼ 1000 galaxies to a limit of I = 24.5. Phase 2 of the DEEP project will use the new DEIMOS spectrograph to obtain spectra of ∼ 65,000 faint galaxies with redshifts z ∼ 1. The scientific goals are to study the evolution of properties of galaxies and the evolution of the clustering of galaxies compared to samples at low redshift. The survey is designed to have the fidelity of local redshift surveys such as the LCRS survey, and to be complementary to ongoing large redshift surveys such as the SDSS project and the 2dF survey. The DEIMOS/DEEP or DEEP2 survey will be executed with resolution R ∼ 4000, and we therefore expect to measure linewidths and rotation curves for a substantial fraction of the target galaxies. DEEP2 will thus also be complementary to the VLT/VIRMOS project, which will survey more galaxies in a larger region of the sky, but with much lower spectral resolution and with fewer objects at high redshift.

The VIRMOS galaxy redshift survey
The on-going Franco-Italian VIRMOS project [94] has delivered the VIMOS spectrograph for the European Southern Observatory Very Large Telescope (ESO-VLT). VIMOS is a VIsible imaging Multi-Object Spectrograph with outstanding multiplex capabilities: With 10 arcsec slits, spectra can be taken of 600 objects simultaneously. In integral field mode, a 6400-fibre Integral Field Unit (IFU) provides spectroscopy for all objects covering a 54 × 54 arcsec² area. VIMOS therefore provides unsurpassed efficiency for large surveys. The VIRMOS project consists of:
• Construction of VIMOS, and a Mask Manufacturing Unit for the ESO-VLT.
• The VIRMOS-VLT Deep Survey (VVDS), a comprehensive imaging and redshift survey of the deep Universe based on more than 150,000 redshifts in four 4 square-degree fields.

The power spectrum of 2dF Galaxies on large scales
An initial estimate of the convolved, redshift-space power spectrum of the 2dFGRS was determined by Percival et al. [72] for a sample of 160,000 redshifts. On scales 0.02 h Mpc −1 < k < 0.15 h Mpc −1 , the data are fairly robust and the shape of the power spectrum is not significantly affected by redshift-space distortion or non-linear effects, while its overall amplitude is increased due to the linear redshift-space distortion effect (see Section 5).
If one fits the Λ-CDM model predictions to the 2dFGRS power spectrum (see Figure 24) over the above range in $k$, one can constrain the cosmological parameters. For instance, assuming a Gaussian prior on the Hubble constant $h = 0.7 \pm 0.07$ (from [22]), Percival et al. [72] obtained the 68 percent confidence limits on the shape parameter $\Omega_m h = 0.20 \pm 0.03$, and a baryon fraction $\Omega_b/\Omega_m = 0.15 \pm 0.07$. For a fixed set of cosmological parameters, i.e., $n = 1$, $\Omega_m = 1 - \Omega_\Lambda = 0.3$, $\Omega_b h^2 = 0.02$, and $h = 0.70$, the r.m.s. mass fluctuation amplitude of 2dFGRS galaxies smoothed over a top-hat radius of $8\,h^{-1}$ Mpc in redshift space turned out to be $\sigma^{S}_{8g}(L_s, z_s) \approx 0.94$.
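As a quick consistency check on the quoted central values, the arithmetic below converts the shape parameter and baryon fraction into $\Omega_m$, $\Omega_b$, and $\Omega_b h^2$ for the prior mean $h = 0.7$. It uses central values only and ignores the quoted uncertainties and their correlations, so it is an illustration rather than a reanalysis; the variable names are hypothetical.

```python
# Central values quoted in the text (uncertainties deliberately ignored).
shape_parameter = 0.20      # Omega_m * h
baryon_fraction = 0.15      # Omega_b / Omega_m
h = 0.70                    # mean of the Gaussian prior on the Hubble constant

omega_m = shape_parameter / h
omega_b = baryon_fraction * omega_m
omega_b_h2 = omega_b * h ** 2

print(f"Omega_m   = {omega_m:.3f}")
print(f"Omega_b   = {omega_b:.3f}")
print(f"Omega_b h^2 = {omega_b_h2:.3f}")
```

The implied $\Omega_m \approx 0.29$ and $\Omega_b h^2 \approx 0.021$ sit close to the fixed values $\Omega_m = 0.3$ and $\Omega_b h^2 = 0.02$ adopted for the amplitude estimate.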

An upper limit on neutrino masses
The recent results of atmospheric and solar neutrino oscillations [24,1] imply non-zero mass-squared differences of the three neutrino flavours. While these oscillation experiments do not directly determine the absolute neutrino masses, a simple assumption of the neutrino mass hierarchy suggests a lower limit on the neutrino mass density parameter, $\Omega_\nu = m_{\nu,\mathrm{tot}}\, h^{-2}/(94\ \mathrm{eV}) \approx 0.001$. Large scale structure data can put an upper limit on the ratio $\Omega_\nu/\Omega_m$ due to the neutrino 'free streaming' effect [33]. By comparing the 2dF galaxy power spectrum of fluctuations with a four-component model (baryons, cold dark matter, a cosmological constant, and massive neutrinos) it was estimated that $\Omega_\nu/\Omega_m < 0.13$ (95% CL), or with a concordance prior of $\Omega_m = 0.3$, $\Omega_\nu < 0.04$, or an upper limit of ∼ 2 eV on the total neutrino mass, assuming a prior of $h \approx 0.7$ [20,19] (see Figure 24). In order to minimize systematic effects due to biasing and non-linear growth, the analysis was restricted to the range $0.02 < k < 0.15\,h\,\mathrm{Mpc}^{-1}$. Additional cosmological data sets bring down this upper limit by a factor of two [79].
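The conversion between the density-parameter limits and the mass limit is simple arithmetic, sketched here with the relation $\Omega_\nu = m_{\nu,\mathrm{tot}}\, h^{-2}/(94\ \mathrm{eV})$ quoted above. The function names are illustrative.

```python
EV_PER_UNIT_OMEGA_H2 = 94.0  # eV, from Omega_nu h^2 = m_tot / (94 eV)

def total_mass_from_omega(omega_nu, h):
    """Total neutrino mass in eV implied by a density parameter Omega_nu."""
    return omega_nu * EV_PER_UNIT_OMEGA_H2 * h ** 2

def omega_from_total_mass(mass_ev, h):
    """Density parameter Omega_nu implied by a total neutrino mass in eV."""
    return mass_ev / (EV_PER_UNIT_OMEGA_H2 * h ** 2)

h = 0.7
upper_from_ratio = total_mass_from_omega(0.13 * 0.3, h)  # Omega_nu/Omega_m < 0.13
upper_rounded = total_mass_from_omega(0.04, h)           # Omega_nu < 0.04
lower = total_mass_from_omega(0.001, h)                  # oscillation-motivated floor

print(f"upper limit from ratio: {upper_from_ratio:.2f} eV")
print(f"upper limit from 0.04:  {upper_rounded:.2f} eV")
print(f"lower limit:            {lower:.3f} eV")
```

Both versions of the upper limit come out at roughly 1.8 eV, which is the origin of the "∼ 2 eV" figure, while the lower limit corresponds to a total mass of about 0.05 eV.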

Combining 2dFGRS and CMB
While the CMB probes the fluctuations in matter, the galaxy redshift surveys measure the perturbations in the light distribution of a particular tracer (e.g., galaxies of a certain type). Therefore, a combination of the two can better constrain the cosmological parameters, and it can also provide important information on the way galaxies are 'biased' relative to the mass fluctuations. The CMB fluctuations are commonly represented by the spherical harmonics $C_\ell$. The connection between the harmonic $\ell$ and $k$ is roughly

$$\ell \simeq k\, d_A \simeq k\, \frac{2c}{H_0\, \Omega_m^{0.4}}$$

for a spatially-flat Universe. For $\Omega_m = 0.3$, the 2dFGRS range $0.02 < k < 0.15\,h\,\mathrm{Mpc}^{-1}$ corresponds approximately to $200 < \ell < 1500$, which is well covered by the recent CMB experiments.
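The quoted multipole range can be reproduced with a few lines of arithmetic. The sketch below assumes the commonly used flat-universe approximation $\ell \approx k\, d_A$ with $d_A \approx 2c/(H_0 \Omega_m^{0.4})$; this scaling and the function name are assumptions of the example, and the result is only meant to be accurate at the level of the rounded numbers in the text.

```python
C_OVER_H0 = 2997.92458  # c/H0 in h^-1 Mpc, for H0 = 100 h km/s/Mpc

def multipole_from_wavenumber(k, omega_m):
    """Approximate CMB multipole for comoving wavenumber k (in h/Mpc).

    Assumes a spatially flat universe and d_A ~ 2c / (H0 * omega_m**0.4).
    """
    d_a = 2.0 * C_OVER_H0 / omega_m ** 0.4
    return k * d_a

for k in (0.02, 0.15):
    ell = multipole_from_wavenumber(k, 0.3)
    print(f"k = {k:.2f} h/Mpc  ->  l ~ {ell:.0f}")
```

For $\Omega_m = 0.3$ the two ends of the 2dFGRS range map to multipoles just under 200 and just under 1500, consistent with the approximate range stated above.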
Recent CMB measurements have been used in combination with the 2dF power spectrum. Efstathiou et al. [17] showed that 2dFGRS+CMB provide evidence for a positive cosmological constant $\Omega_\Lambda \sim 0.7$ (assuming $w = -1$), independently of the studies of Type Ia supernovae. As explained in [72], the shapes of the CMB and the 2dFGRS power spectra are insensitive to Dark Energy. The main effect of the dark energy is to alter the angular diameter distance to the last scattering surface, and thus the position of the first acoustic peak. Indeed, the latest result from a combination of WMAP with 2dFGRS and other probes gives $h = 0.71^{+0.04}_{-0.03}$, $\Omega_b h^2 = 0.0224 \pm 0.0009$, $\Omega_m h^2 = 0.135^{+0.008}_{-0.009}$, $\sigma_8 = 0.84 \pm 0.04$, $\Omega_{\mathrm{tot}} = 1.02 \pm 0.02$, and $w < -0.78$ (95% CL, assuming $w \geq -1$) [79].

The bi-spectrum and higher moments
It is well established that important information on the non-linear growth of structure is encoded in the higher-order moments, e.g., the skewness or its Fourier version, the bi-spectrum. Verde et al. [95] computed the bi-spectrum of 2dFGRS and used it to measure the bias parameter of the galaxies. They assumed a specific quadratic biasing model:

$$\delta_g = b_1\, \delta_m + \frac{1}{2}\, b_2\, \delta_m^2.$$

By analysing 80 million triangle configurations in the wavenumber range $0.1 < k < 0.5\,h\,\mathrm{Mpc}^{-1}$ they found $b_1 = 1.04 \pm 0.11$ and $b_2 = -0.054 \pm 0.08$, in support of no biasing on large scales. This is a non-trivial result, as the analysis covers non-linear scales. Baugh et al. [5] and Croton et al. [14] measured the moments of the galaxy count probability distribution function in 2dFGRS up to order $p = 6$ (order $p = 2$ is the variance, $p = 3$ is the skewness, etc.). They demonstrated the hierarchical scaling of the averaged $p$-point galaxy correlation functions. However, they found that the higher moments are strongly affected by the presence of two massive superclusters in the 2dFGRS volume. This poses the question of whether 2dFGRS is a 'fair sample' for high order moments.

Luminosity and spectral-type dependence of galaxy clustering
Although biasing was commonly neglected until the early 1980s, it has become evident observationally that on scales $\lesssim 10\,h^{-1}$ Mpc different galaxy populations exhibit different clustering amplitudes, the so-called morphology-density relation [16]. As discussed in Section 4, galaxy biasing is naturally predicted from a variety of theoretical considerations as well as direct numerical simulations [37,59,15,87,86,103]. Thus, in this section we summarize the extent to which the galaxy clustering depends on the luminosity, spectral type, and color of the galaxy sample from the 2dFGRS and SDSS.

2dFGRS: Clustering per luminosity and spectral type
Madgwick et al. [45] applied Principal Component Analysis to compress each galaxy spectrum into one quantity, $\eta \approx 0.5\,\mathrm{pc}_1 + \mathrm{pc}_2$. Qualitatively, $\eta$ is an indicator of the ratio of the present to the past star formation activity of each galaxy. This allows one to divide the 2dFGRS into $\eta$-types, and to study, e.g., luminosity functions and clustering per type. Norberg et al. [61] showed that, at all luminosities, early-type galaxies have a higher bias than late-type galaxies, and that the biasing parameter, defined here through the ratio of the galaxy to matter correlation functions, $b \equiv \sqrt{\xi_g/\xi_m}$, varies as $b/b^* = 0.85 + 0.15\,L/L^*$. Figure 25 indicates that for $L^*$ galaxies, the real-space correlation function amplitude of $\eta$ early-type galaxies is ∼ 50% higher than that of late-type galaxies.

Figure 26 shows the redshift-space correlation function $\xi(\sigma, \pi)$ in terms of the separations perpendicular ($\sigma$) and parallel ($\pi$) to the line of sight. The correlation function is calculated separately for the most passively ('red', for which the present rate of star formation is less than 10% of its past averaged value) and the most actively ('blue') star-forming galaxies. The clustering properties of the two samples are clearly distinct on scales $\lesssim 10\,h^{-1}$ Mpc. The 'red' galaxies display a prominent finger-of-God effect and also have a higher overall normalization than the 'blue' galaxies. This is a manifestation of the well-known morphology-density relation. By fitting $\xi(\sigma, \pi)$ over the separation range 8 - 20 $h^{-1}$ Mpc for each class, it was found that $\beta_{\mathrm{active}} = 0.49 \pm 0.13$ and $\beta_{\mathrm{passive}} = 0.48 \pm 0.14$, with corresponding pairwise velocity dispersions $\sigma_P$ of $416 \pm 76$ km s$^{-1}$ and $612 \pm 92$ km s$^{-1}$ [45]. At small separations, the real-space clustering of passive galaxies is stronger than that of active galaxies: The slopes $\gamma$ are respectively 1.93 and 1.50 (see Figure 27) and the relative bias between the two classes is a declining function of separation.
On scales larger than 10 h −1 Mpc the biasing ratio is approaching unity.
Another statistic was applied recently by Wild et al. [98] and Conway et al. [12], of a joint counts-in-cells on 2dFGRS galaxies, classified by both color and spectral type. Exact linear bias is ruled out on all scales. The counts are better fitted to a bivariate log-normal distribution. On small scales there is evidence for stochasticity. Further investigation of galaxy formation models is required to understand the origin of the stochasticity.
Zehavi et al. [104] analyzed the Early Data Release (EDR) sample of 30,000 SDSS galaxies to explore the clustering per luminosity and color. The inferred real-space correlation function is well described by a single power law: $\xi(r) = (r/r_0)^{-\gamma}$ with $r_0 = 6.1 \pm 0.2\,h^{-1}$ Mpc and $\gamma = 1.75 \pm 0.03$ for $0.1\,h^{-1}\,\mathrm{Mpc} \leq r \leq 16\,h^{-1}\,\mathrm{Mpc}$. The galaxy pairwise velocity dispersion is $\sigma_{12} \approx 600 \pm 100$ km s$^{-1}$ for projected separations $0.15\,h^{-1}\,\mathrm{Mpc} \leq r_p \leq 5\,h^{-1}\,\mathrm{Mpc}$. When divided by color, the red galaxies exhibit a stronger and steeper real-space correlation function and a higher pairwise velocity dispersion than do the blue galaxies. In agreement with 2dFGRS there is clear evidence for a scale-independent luminosity bias at $r \sim 10\,h^{-1}$ Mpc. Subsamples with absolute magnitude ranges centered on $M^* - 1.5$, $M^*$, and $M^* + 1.5$ have real-space correlation functions that are parallel power laws of slope ≈ −1.8 with correlation lengths of approximately $7.4\,h^{-1}$ Mpc, $6.3\,h^{-1}$ Mpc, and $4.7\,h^{-1}$ Mpc, respectively. Figures 27 and 28 pose an interesting challenge to the theory of galaxy formation: to explain why the correlation functions per luminosity bin have similar slopes, while the slope for early-type galaxies is steeper than for late-type galaxies.
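The quoted correlation lengths translate directly into relative bias factors when the power laws are parallel. The sketch below assumes $\xi(r) = (r/r_0)^{-\gamma}$ with a common slope and defines the relative bias as the square root of the ratio of two correlation functions, which is then independent of $r$. Function names are illustrative, and the calculation uses the quoted central values only.

```python
def power_law_xi(r, r0, gamma):
    """Power-law correlation function xi(r) = (r / r0) ** (-gamma)."""
    return (r / r0) ** (-gamma)

def relative_bias(r0_a, r0_b, gamma):
    """sqrt(xi_a / xi_b) for two power laws sharing the same slope gamma."""
    return (r0_a / r0_b) ** (gamma / 2.0)

slope = 1.8
r0_reference = 6.3  # h^-1 Mpc, subsample centered on M*
for label, r0 in [("M* - 1.5", 7.4), ("M*", 6.3), ("M* + 1.5", 4.7)]:
    bias = relative_bias(r0, r0_reference, slope)
    print(f"{label:9s}: r0 = {r0} h^-1 Mpc, b / b* ~ {bias:.2f}")
```

With these numbers the brighter subsample is more strongly clustered than the $M^*$ subsample by roughly 15% in bias, and the fainter one is less clustered by roughly 25%, which is the scale-independent luminosity bias referred to above.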

SDSS: Three-point correlation functions and the nonlinear biasing of galaxies per luminosity and color
Let us move next to the three-point correlation functions (3PCF) of galaxies, which are the lowest-order unambiguous statistic to characterize non-Gaussianities due to nonlinear gravitational evolution of dark matter density fields, formation of luminous galaxies, and their subsequent evolution. The determination of the 3PCF of galaxies was pioneered by Peebles and Groth [70] and Groth and Peebles [27] using the Lick and Zwicky angular catalogs of galaxies. They found that the 3PCF $\zeta(r_{12}, r_{23}, r_{31})$ obeys the hierarchical relation

$$\zeta(r_{12}, r_{23}, r_{31}) = Q_r\, [\xi(r_{12})\xi(r_{23}) + \xi(r_{23})\xi(r_{31}) + \xi(r_{31})\xi(r_{12})],$$

with $Q_r$ being a constant. The value of $Q_r$ in real space deprojected from these angular catalogues is $1.29 \pm 0.21$ for $r < 3\,h^{-1}$ Mpc. Subsequent analyses of redshift catalogs confirmed the hierarchical relation, at least approximately, but the value of $Q_z$ (in redshift space) appears to be smaller, $Q_z \sim 0.5$ - 1.
As we have seen in Section 6.3.2, galaxy clustering is sensitive to the intrinsic properties of the galaxy samples under consideration, including their morphological types, colors, and luminosities. However, previous analyses were not able to examine those dependences of 3PCFs because of the limited number of galaxies. Indeed Kayo et al. [39] were the first to perform a detailed analysis of 3PCFs explicitly taking account of the morphology, color, and luminosity dependence. They constructed volume-limited samples from a subset of the SDSS galaxy redshift data, 'Large-scale Structure Sample 12'. Specifically they divided each volume-limited sample into color subsamples of red (blue) galaxies, which consist of 7949 (8329), 8930 (8155), and 3706 (3829) galaxies for $-22 < M_r - 5\log h < -21$, $-21 < M_r - 5\log h < -20$, and $-20 < M_r - 5\log h < -19$, respectively. Figure 29 indicates the dimensionless amplitude of the 3PCFs of SDSS galaxies in redshift space,

$$Q_z(s_{12}, s_{23}, s_{31}) \equiv \frac{\zeta(s_{12}, s_{23}, s_{31})}{\xi(s_{12})\xi(s_{23}) + \xi(s_{23})\xi(s_{31}) + \xi(s_{31})\xi(s_{12})},$$

for the equilateral triplets of galaxies. The overall conclusion is that $Q_z$ is almost scale-independent and ranges between 0.5 and 1.0, and that no systematic dependence is noticeable on luminosity and color. This implies that the 3PCF itself does depend on the galaxy properties, since two-point correlation functions (2PCFs) exhibit clear dependence on luminosity and color. Previous simulations and theoretical models [82,53,50,85] indicate that $Q$ decreases with scale in both real and redshift spaces. This trend is not seen in the observational results.

Figure 29: Dimensionless amplitude of the three-point correlation functions of SDSS galaxies in redshift space. The galaxies are classified according to their colors; all galaxies in open circles, red galaxies in solid triangles, and blue galaxies in crosses. (Figure taken from [39].)
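The logic of the last inference, that a population-independent $Q_z$ together with population-dependent 2PCFs forces the 3PCF itself to differ, can be made explicit with a small numerical sketch. The amplitudes below are hypothetical numbers chosen for illustration, not measurements, and the function name is an assumption of the example.

```python
def reduced_amplitude(zeta, xi12, xi23, xi31):
    """Dimensionless 3PCF amplitude Q = zeta / (xi12*xi23 + xi23*xi31 + xi31*xi12)."""
    return zeta / (xi12 * xi23 + xi23 * xi31 + xi31 * xi12)

# Hypothetical equilateral configuration: two populations share Q = 0.8 but the
# second has a 2PCF amplitude larger by a factor 1.5**2 (relative bias of 1.5).
q_common = 0.8
xi_a = 0.5
xi_b = xi_a * 1.5 ** 2

zeta_a = q_common * 3.0 * xi_a ** 2   # hierarchical form for equal sides
zeta_b = q_common * 3.0 * xi_b ** 2

print(f"Q_a = {reduced_amplitude(zeta_a, xi_a, xi_a, xi_a):.2f}")
print(f"Q_b = {reduced_amplitude(zeta_b, xi_b, xi_b, xi_b):.2f}")
print(f"zeta_b / zeta_a = {zeta_b / zeta_a:.4f}")
```

With equal $Q$, the ratio of the 3PCFs equals the fourth power of the relative bias (here $1.5^4 \approx 5.06$), so a modest difference in the 2PCF corresponds to a large difference in the 3PCF.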
In order to demonstrate the expected dependence in the current samples, they compute the biasing parameters estimated from the 2PCFs,

$$b_{g,i}(s) \equiv \sqrt{\frac{\xi_{z,i}(s)}{\xi_{z,\Lambda\mathrm{CDM}}(s)}},$$

where the index $i$ runs over each sample of galaxies with different colors and luminosities. The predictions of the mass 2PCFs in redshift space, $\xi_{z,\Lambda\mathrm{CDM}}(s)$, in the Λ cold dark matter model are computed following [28].
As an illustrative example, consider a simple bias model in which the galaxy density field $\delta_{g,i}$ for the $i$-th population of galaxies is given by

$$\delta_{g,i} = b_{g,i(1)}\, \delta_{\mathrm{mass}} + \frac{1}{2}\, b_{g,i(2)}\, \delta_{\mathrm{mass}}^2.$$

If both $b_{g,i(1)}$ and $b_{g,i(2)}$ are constant and the mass density field satisfies $|\delta_{\mathrm{mass}}| \ll 1$, Equation (174) implies that

$$Q_{g,i} = \frac{1}{b_{g,i(1)}}\, Q_{\mathrm{mass}} + \frac{b_{g,i(2)}}{b_{g,i(1)}^2}.$$

Thus the linear bias model ($b_{g,i(2)} = 0$) simply implies that $Q_{g,i}$ is inversely proportional to $b_{g,i(1)}$, which is plotted in Figure 30. A comparison of Figures 29 and 30 indicates that the biasing in the 3PCFs seems to compensate the difference of $Q_g$ purely due to that in the 2PCFs. Such behavior is unlikely to be explained by any simple model inspired by the perturbative expansion like Equation (176). Rather it points to a kind of regularity or universality of the clustering hierarchy behind galaxy formation and evolution processes. Thus the galaxy biasing seems much more complex than the simple deterministic and linear model. More precise measurements of 3PCFs and even higher-order statistics with future SDSS datasets would be valuable for gaining more specific insights into the empirical biasing model.
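The perturbative expectation that the observations fail to follow can be sketched numerically. The example assumes the leading-order relation $Q_g = Q_{\mathrm{mass}}/b_1 + b_2/b_1^2$, which holds for a local bias written as $\delta_g = b_1\delta_{\mathrm{mass}} + \tfrac{1}{2} b_2 \delta_{\mathrm{mass}}^2$; with a different normalization of the quadratic term the second coefficient changes accordingly. The value of $Q_{\mathrm{mass}}$ and the bias values are hypothetical.

```python
def galaxy_q(q_mass, b1, b2=0.0):
    """Leading-order Q_g for local bias delta_g = b1*delta + 0.5*b2*delta**2."""
    return q_mass / b1 + b2 / b1 ** 2

q_mass = 0.8  # hypothetical mass amplitude
for b1 in (0.8, 1.0, 1.2, 1.5):
    print(f"b1 = {b1:.1f}: Q_g (linear bias) = {galaxy_q(q_mass, b1):.3f}")

# A positive quadratic term can offset the 1/b1 suppression for a biased sample.
b1 = 1.5
b2_needed = (q_mass - q_mass / b1) * b1 ** 2
print(f"b2 giving Q_g = Q_mass at b1 = {b1}: {b2_needed:.3f}")
```

Under linear bias alone, samples whose 2PCF-derived bias differs by tens of percent would show a correspondingly different $Q$. An observed $Q$ that is the same for all samples therefore requires a compensating nonlinear term that varies with the linear bias in just the right way, which is the fine-tuning that motivates the remark about regularity above.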

Topology of the Universe: Analysis of SDSS galaxies in terms of Minkowski functionals
All the observational results presented in the preceding Sections 6.1, 6.2, and 6.3 were restricted to the two-point statistics. As emphasized in Section 3, the clustering pattern of galaxies has much richer content than the two-point statistics can probe. Historically the primary goal of the topological analysis of galaxy catalogues was to test Gaussianity of the primordial density fluctuations. Although the major role for that goal has been superseded by the CMB map analysis [41], the proper characterization of the morphology of large-scale structure beyond the two-point statistics is of fundamental importance in cosmology. In order to illustrate a possibility to explore the topology of the Universe by utilizing the new large surveys, we summarize the results of the Minkowski Functionals (MF) analysis of SDSS galaxy data [32].
In an apparent-magnitude limited catalogue of galaxies, the average number density of galaxies decreases with distance because only increasingly bright galaxies are included in the sample at larger distance. With the large redshift surveys it is possible to avoid this systematic change in both density and galaxy luminosity by constructing volume-limited samples of galaxies, with cuts on both absolute magnitude and redshift. This is particularly useful for analyses such as MF and was carried out in the analysis shown here. Figure 31 shows the MFs as a function of $\nu_f$ defined from the volume fraction $f$ [26]:

$$f = \frac{1}{\sqrt{2\pi}} \int_{\nu_f}^{\infty} e^{-x^2/2}\, dx.$$

This is intended to map the threshold so that the volume fraction on the high-density side of the isodensity surface is identical to the volume in regions with density contrast $\delta = \nu_f \sigma$, for a Gaussian random field with r.m.s. density fluctuations $\sigma$. If the evolved density field has a good one-to-one correspondence with the initial random-Gaussian field, then this transformation removes the effect of evolution of the PDF of the density field. Under this assumption, the MFs as a function of volume fraction would be sensitive only to the topology of the isodensity contours rather than to the evolution with time of the density threshold assigned to a contour. While the limitations of the approximation of monotonicity in the relation between initial and evolved density fields are well recognized [40], we plot the result in this way for simplicity. The good match between the observed MFs and the mock predictions based on the LCDM model with initial random-Gaussianity, as illustrated in Figure 31, might be interpreted to imply that the primordial Gaussianity is confirmed. A more conservative interpretation is that, given the size of the estimated uncertainties, these data do not provide evidence for initial non-Gaussianity, i.e., the data are consistent with primordial Gaussianity.
Unfortunately, due to the statistical limitation of the current SDSS data, it is not easy to make a more quantitative statement concerning the initial Gaussianity. Moreover, in order to go further and place more quantitative constraints on primordial Gaussianity with upcoming data, one needs a more precise and reliable theoretical model for the MFs, which properly describes the nonlinear gravitational effect and possibly also galaxy biasing beyond the simple mapping on the basis of the volume fraction. In fact, galaxy biasing is a major source of uncertainty for relating the observed MFs to those obtained from the mock samples for dark matter distributions. If LCDM is the correct cosmological model, the good match of the MFs for mock samples from the LCDM simulations to the observed SDSS MFs may indicate that nonlinearity in the galaxy biasing is relatively small, at least small enough that it does not significantly affect the MFs (the MFs as a function of $\nu_\sigma$ remain unchanged for the linear biasing).
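The volume-fraction threshold used for the MFs is straightforward to compute. The sketch below assumes the standard Gaussian definition, in which the fraction of volume above the threshold is $f = (2\pi)^{-1/2}\int_{\nu_f}^{\infty} e^{-x^2/2}\,dx$, and uses only the Python standard library. The function names are illustrative.

```python
import math
from statistics import NormalDist

def volume_fraction(nu_f):
    """Fraction of volume above threshold nu_f for a Gaussian random field."""
    return 0.5 * math.erfc(nu_f / math.sqrt(2.0))

def threshold_from_volume_fraction(fraction):
    """Invert volume_fraction: the nu_f whose high-density side fills `fraction`."""
    return NormalDist().inv_cdf(1.0 - fraction)

for fraction in (0.5, 0.16, 0.023):
    nu = threshold_from_volume_fraction(fraction)
    print(f"f = {fraction:5.3f}  ->  nu_f = {nu:+.2f}")
```

In practice one would rank the smoothed density values in the survey volume, pick the density at which the desired fraction of cells lies above it, and label that isodensity surface with the corresponding $\nu_f$. The labelling is thus unchanged by any monotonic transformation of the density field, which is why it is insensitive to the evolution of the one-point PDF.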

Other statistical measures
In this section, we have presented the results on the basis of particular statistical measures including the two-point correlation functions, power spectrum, redshift distortion, and Minkowski functionals. Of course there are other useful approaches to analysing redshift surveys: the void probability function, counts-in-cells, Voronoi cells, percolation, and minimal spanning trees. Another area not covered here is optimal reconstruction of the density field (e.g., using Wiener filtering). The reader is referred to a good summary of those and other methodologies in the book by Martinez and Saar [48].
Admittedly the results that we presented here are rather observational and phenomenological, and far from being well-understood theoretically. It is quite likely that when other on-going and future surveys are being analysed in great detail, the nature of galaxy clustering will be revealed in a much more quantitative manner. They are supposed to act as a bridge between cosmological framework and galaxy formation operating in the Universe. While the proper understanding of physics of galaxy formation is still far away, the future redshift survey data will present interesting challenges for constructing models of galaxy formation.

Discussion
As a classical probe, galaxy redshift surveys still remain an important tool for studying cosmology and galaxy formation. On large scales (> 10 h −1 Mpc or so) they nicely complement the cosmic microwave background, supernovae Ia, and gravitational lensing in quantifying in detail the cosmological model. On small scales (< 10 h −1 Mpc) the clustering patterns of different galaxy types (defined by structural or spectral properties) provide important constraints on models of biased galaxy formation.
The redshift surveys mainly constrain Ω m via both redshift distortion (which also depends on biasing) and the shape of the Λ-CDM power spectrum, which depends on the primordial spectrum, the product Ω m h, and also the baryon density Ω b . Redshift surveys at a given epoch are not sensitive to the Dark Energy (or the cosmological constant, in a specific case), but combined with the CMB they can constrain the cosmic equation of state.
A good example of the importance of the redshift surveys in cosmology is given by the recent WMAP analysis of cosmological parameters, where the estimation of certain parameters was much improved by adding the 2dF power spectrum of fluctuations [79] or the SDSS power spectrum [88]. This is illustrated in Table 2 by contrasting the parameters derived from WMAP alone with those derived from WMAP+SDSS [88]. Such results are sensitive to the assumed parameter space and priors, but for simplicity we quote here the results for the simple six-parameter model. In this analysis it was assumed that the Universe is flat, the fluctuations are adiabatic, there are no gravity waves, there is no running tilt of the spectral index, the neutrino masses are negligible, and the dark energy is in the form of Einstein's cosmological constant ($w = -1$). Within the Λ-CDM model this scenario can be characterised by the six parameters given in Table 2 based on [88]. The WMAP data used are both the temperature and polarization fluctuations. It can be seen that adding the SDSS information more than halves the WMAP-only error bars on some of the parameters. These results are in good agreement with the joint analysis WMAP+2dF [79].

Table 2: Derived 6 cosmological parameters from WMAP alone (temperature and polarization) versus WMAP+SDSS (from Tables 2 and 3 of Tegmark et al. [88]). The quoted error bars correspond to 1-σ. [Table body not recovered; the only surviving row label is "Re-ionization optical depth".]
We emphasize that these parameters were fitted assuming the Λ-CDM model. While the phenomenological success of the Λ-CDM model is truly amazing, many fundamental questions remain open:
• Neither component of the model, Λ or CDM, has been directly measured. Are they 'real' entities or just 'epicycles'? Historically, epicycles were actually quite useful in forcing observers to improve their measurements and theoreticians to think about better models!
• 'The Old Cosmological Constant problem': Why is $\Omega_\Lambda$ at present so small relative to what is expected from Early Universe physics?
• 'The New Cosmological Constant problem': Why is $\Omega_m \sim \Omega_\Lambda$ at the present epoch? Why is $w \sim -1$? Do we need to introduce new physics or to invoke the anthropic principle to explain it?
• There are still open problems in Λ-CDM on small scales, e.g., galaxy profiles and satellites.
• Could other (yet unknown) models fit the data equally well?
• Where does the field go from here? Should the activity focus on refinement of the cosmological parameters within Λ-CDM, or on introducing entirely new paradigms?
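The 'New Cosmological Constant problem' listed above can be made concrete with a short calculation. In a flat model with a cosmological constant ($w = -1$), the matter density scales as $(1+z)^3$ while the vacuum energy density stays constant, so their ratio changes by a factor of 1000 between $z = 9$ and today. The sketch below assumes illustrative values $\Omega_m = 0.3$ and $\Omega_\Lambda = 0.7$ (round numbers, not the fitted values of Table 2); the function names are hypothetical.

```python
def lambda_to_matter_ratio(z, omega_m=0.3, omega_lambda=0.7):
    """Ratio rho_Lambda / rho_m at redshift z for a cosmological constant (w = -1).

    Matter dilutes as (1 + z)^3 while the vacuum energy density is constant.
    The default density parameters are illustrative assumptions.
    """
    return (omega_lambda / omega_m) / (1.0 + z) ** 3


def equality_redshift(omega_m=0.3, omega_lambda=0.7):
    """Redshift at which the matter and vacuum energy densities were equal."""
    return (omega_lambda / omega_m) ** (1.0 / 3.0) - 1.0


if __name__ == "__main__":
    for z in (0.0, 0.5, 1.0, 3.0, 9.0):
        print(f"z = {z:4.1f}: rho_Lambda / rho_m = {lambda_to_matter_ratio(z):.4f}")
    print(f"equality at z = {equality_redshift():.3f}")
```

With these assumed values the two densities were equal at a redshift of about 0.33, which is why the near equality observed today looks like a coincidence in cosmic time.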
Even if Λ-CDM turns out to be the correct model, it is not yet the "end of cosmology". Beyond the 'zero-th order' task of finding the cosmological parameters of the FRW model, we would like to understand the non-linear growth of mass density fluctuations, and then the formation and evolution of luminous objects. The wealth of galaxy images and spectra in the new surveys calls for the development of more detailed models of galaxy formation. This is important so that measurements (e.g., the correlation function per spectral type or colour) and models can be compared on an equal footing, with the goal of constraining scenarios of galaxy formation. There is also room for new statistical methods to quantify the 'cosmic web' of filaments, clusters, and voids, for effective comparison with simulations. It may well be that in the future the cosmological parameters will be fixed by the CMB, SN Ia, and other probes. Then, for fixed cosmological parameters, one may use redshift surveys primarily to study galaxy biasing and its evolution with cosmic epoch.

Acknowledgements
We thank Joachim Wambsganss and Bernard Schutz for inviting us to write the present review article. O.L. thanks members of the 2dFGRS team and the Leverhulme Quantitative Cosmology group for helpful discussions. Y.S. thanks all his students and collaborators over many years, in particular Thomas Buchert, Takashi Hamana, Chiaki Hikage, Yipeng Jing, Issha Kayo, Hiromitsu Magira, Takahiko Matsubara, Hiroaki Nishioka, Jens Schmalzing, Atsushi Taruya, Kazuhiro Yamamoto, and Kohji Yoshikawa among others, for enjoyable and fruitful collaborations whose results form important elements of this review. Y.S. is also grateful for the hospitality of the Institute of Astronomy, University of Cambridge, where most of the present review was put together and written up. O.L. acknowledges a PPARC Senior Research Fellowship. We also thank Idit Zehavi for permitting us to use Figure 28.
Numerical simulations were carried out at the ADAC (Astronomical Data Analysis Center) of the National Astronomical Observatory, Japan (project ID: mys02, yys08a). This research was also supported in part by Grants-in-Aid from Monbu-Kagakusho and the Japan Society for the Promotion of Science.