The Astrophysical Journal Supplement Series, 148:195-211, 2003
© 2003. The American Astronomical Society. All rights reserved. Printed in U.S.A.

 

First-Year Wilkinson Microwave Anisotropy Probe (WMAP)1 Observations: Parameter Estimation Methodology

L. Verde ,2,3 H. V. Peiris ,2 D. N. Spergel ,2 M. R. Nolta ,4 C. L. Bennett ,5 M. Halpern ,6 G. Hinshaw ,5 N. Jarosik ,4 A. Kogut ,5 M. Limon ,5,7 S. S. Meyer ,8 L. Page ,4 G. S. Tucker ,5,7,9 E. Wollack ,5 and E. L. Wright 10

Received 2003 February 11; accepted 2003 May 30

ABSTRACT

We describe our methodology for comparing the Wilkinson Microwave Anisotropy Probe (WMAP) measurements of the cosmic microwave background (CMB) and other complementary data sets to theoretical models. The unprecedented quality of the WMAP data and the tight constraints on cosmological parameters that are derived require a rigorous analysis so that the approximations made in the modeling do not lead to significant biases. We describe our use of the likelihood function to characterize the statistical properties of the microwave background sky. We outline the use of the Monte Carlo Markov Chains to explore the likelihood of the data given a model to determine the best-fit cosmological parameters and their uncertainties. We add to the WMAP data the ℓ ≳ 700 Cosmic Background Imager (CBI) and Arcminute Cosmology Bolometer Array Receiver (ACBAR) measurements of the CMB, the galaxy power spectrum at z ∼ 0 obtained from the Two-Degree Field Galaxy Redshift Survey (2dFGRS), and the matter power spectrum at z ∼ 3 as measured with the Lyα forest. These last two data sets complement the CMB measurements by probing the matter power spectrum of the nearby universe. Combining CMB and 2dFGRS requires that we include in our analysis a model for galaxy bias, redshift distortions, and the nonlinear growth of structure. We show how the statistical and systematic uncertainties in the model and the data are propagated through the full analysis.

Subject headings: cosmic microwave background; cosmological parameters; cosmology: observations; methods: data analysis; methods: statistical
     1 WMAP is the result of a partnership between Princeton University and the NASA Goddard Space Flight Center. Scientific guidance is provided by the WMAP Science Team.
     2 Department of Astrophysical Sciences, Princeton University, Peyton Hall, Princeton, NJ 08544; lverde@astro.princeton.edu.
     3 Chandra Fellow.
     4 Department of Physics, Princeton University, Jadwin Hall, P.O. Box 708, Princeton, NJ 08544.
     5 NASA Goddard Space Flight Center, Code 685, Greenbelt, MD 20771.
     6 Department of Physics and Astronomy, University of British Columbia, 6224 Agricultural Road, Vancouver, BC V6T 1Z1, Canada.
     7 National Research Council (NRC) Fellow.
     8 Departments of Astrophysics and Physics, EFI, and CfCP, University of Chicago, 5640 South Ellis Avenue, Chicago, IL 60637.
     9 Department of Physics, Brown University, Providence, RI 02912.
     10 Department of Astronomy, UCLA, P.O. Box 951562, Los Angeles, CA 90095-1562.

1. INTRODUCTION

     Cosmic microwave background (CMB) experiments are powerful cosmological probes because the early universe is particularly simple and because the fluctuations over angular scales θ > 0.2° are described by linear theory (Peebles & Yu 1970; Bond & Efstathiou 1984; Zaldarriaga & Seljak 2000). Exploiting this simplicity to obtain precise constraints on cosmological parameters requires that we accurately characterize the performance of the instrument (Jarosik et al. 2003b; Page et al. 2003b; Barnes et al. 2003; Hinshaw et al. 2003a), the properties of the foregrounds (Bennett et al. 2003a), and the statistical properties of the microwave sky.

     The primary goal of this paper is to present our approach to extracting the cosmological parameters from the temperature-temperature angular power spectrum (TT) and the temperature-polarization angular cross-power spectrum (TE). In companion papers, we present the TT (Hinshaw et al. 2003a) and TE (Kogut et al. 2003) angular power spectra and show that the CMB fluctuations may be treated as Gaussian (Komatsu et al. 2003).

     Our basic approach is to constrain cosmological parameters with a likelihood analysis first of the Wilkinson Microwave Anisotropy Probe (WMAP) TT and TE spectra alone, then jointly with other CMB angular power spectrum determinations at higher angular resolution, and finally of all CMB power spectra data jointly with the power spectrum of the large-scale structure (LSS). In § 2 we describe the use of the likelihood function for the analysis of microwave background data. This builds on the Hinshaw et al. (2003b) methodology for determining the TT spectrum and its curvature matrix and Kogut et al. (2003), who describe our methodology for determining the TE spectrum. In § 3 we describe our use of Markov Chain Monte Carlo (MCMC) techniques to evaluate the likelihood function of model parameters. While WMAP's measurements are a powerful probe of cosmology, we can significantly enhance their scientific value by combining the WMAP data with other astronomical data sets. This paper also presents our approach for including external CMB data sets (§ 4), LSS data (§ 5), and Lyα forest data (§ 6). When including external data sets, the reader should keep in mind that the physics and the instrumental effects involved in the interpretation of these external data sets (especially 2dFGRS and Lyα) are much more complicated and less well understood than for WMAP data. Nevertheless, we aim to match the rigorous treatment of uncertainties in the WMAP angular power spectrum with the inclusion of known statistical and systematic effects (of the data and of the theory), in the complementary data sets.

2. LIKELIHOOD ANALYSIS OF WMAP ANGULAR POWER SPECTRA

     The first goal of our analysis program is to determine the values and confidence levels of the cosmological parameters that best describe the WMAP data for a given cosmological model. We also wish to discriminate between different classes of cosmological models, in other words, to assess whether a cosmological model is an acceptable fit to WMAP data.

     The ultimate goal of the likelihood analysis is to find a set of parameters that give an estimate of ⟨𝒞ℓ⟩, the ensemble average of which the realization on our sky11 is Ĉℓ. The likelihood function, ℒ[Ĉℓ|𝒞ℓ(α)], yields the probability of the data given a model and its parameters (α). In our notation, Ĉℓ denotes our best estimator of ⟨𝒞ℓ⟩ (Hinshaw et al. 2003a) and 𝒞ℓ is the theoretical prediction for the angular power spectrum. From Bayes's theorem, we can split the expression for the probability of a model given the data as

\[
\mathcal{P}(\alpha \,|\, \hat{\mathcal{C}}_\ell) \propto \mathcal{L}[\hat{\mathcal{C}}_\ell \,|\, \mathcal{C}_\ell(\alpha)] \, \mathcal{P}(\alpha) ,
\]

where 𝒫(α) describes our priors on cosmological parameters and we have neglected a normalization factor that does not depend on the parameters. Once the choice of the priors is specified, our estimator of ⟨𝒞ℓ⟩ is given by 𝒞ℓ evaluated at the maximum of 𝒫(α|Ĉℓ).


     11 Throughout this paper we use the convention that 𝒞ℓ = ℓ(ℓ + 1)Cℓ/(2π).

2.1. Likelihood Function

     One of the generic predictions of inflationary models is that fluctuations in the gravitational potential have Gaussian random phases. Since the physics that governs the evolution of the temperature and metric fluctuations is linear, the temperature fluctuations are also Gaussian. If we ignore the effects of nonlinear physics at z < 10 and the effect of foregrounds, then all of the cosmological information in the microwave sky is encoded in the temperature and polarization power spectra. The leading-order low-redshift astrophysical effect is expected to be gravitational lensing of the CMB by foreground structures. We ignore this effect here as it generates a less than 1% covariance in the TT angular power spectrum on WMAP angular scales (Hu 2001; see also Spergel et al. 2003, § 3).

     There are several expected sources of noncosmological signal and of non-Gaussianity in the microwave sky. The most significant sources on the full sky are Galactic foreground emission, radio sources, and galaxy clusters. Bennett et al. (2003b) show that these contributions are greatly reduced if we restrict our analysis to a cut sky that masks bright sources and regions of bright Galactic emission. The residual contribution of these foregrounds is further reduced by the use of external templates to subtract foreground emission from the Q-, V-, and W-band maps. Komatsu et al. (2003) find no evidence for deviations from Gaussianity on this template-cleaned cut sky. While the sky cut greatly reduces foreground emission, it has the unfortunate effect of coupling multipole modes on the sky so that the power spectrum covariance matrix is no longer diagonal. The goal of this section is to include this covariance in the likelihood function.

     The likelihood function for the temperature fluctuations observed by a noiseless experiment with full-sky coverage has the form

\[
\mathcal{L}(T \,|\, S) \propto \frac{1}{\sqrt{\det S}} \exp\!\left( -\frac{1}{2} \, T^{t} S^{-1} T \right) ,
\]

where T denotes our temperature map and Sij = Σℓ (2ℓ + 1)CℓPℓ(n̂i · n̂j)/(4π), where the Pℓ are the Legendre polynomials and n̂i is the pixel position on the map. If we expand the temperature map in spherical harmonics, T(n̂) = Σℓm aℓmYℓm(n̂), then the likelihood function for each aℓm has a simple form:

\[
\mathcal{L}(a_{\ell m} \,|\, C_\ell) \propto \frac{1}{\sqrt{C_\ell}} \exp\!\left( -\frac{|a_{\ell m}|^2}{2 C_\ell} \right) .
\]
Since we assume that the universe is isotropic, the likelihood function is independent of m. Thus, we can sum over m and rewrite the likelihood function as

\[
-2 \ln \mathcal{L} = \sum_{\ell} (2\ell + 1) \left[ \frac{\hat{\mathcal{C}}_\ell}{\mathcal{C}_\ell} + \ln\!\left( \frac{\mathcal{C}_\ell}{\hat{\mathcal{C}}_\ell} \right) - 1 \right] ,
\]

up to an irrelevant additive constant. Here, for a full-sky, noiseless experiment, we have identified Σm |aℓm|²/(2ℓ + 1) with Ĉℓ. Note that the likelihood function depends only on the angular power spectrum. In this limit, the angular power spectrum encodes all of the cosmological information in the CMB.

     Characteristics of the instrument are also included in the likelihood analysis. Jarosik et al. (2003a) show that the detector noise is Gaussian (see their Fig. 6 and § 3.4); consequently, the pixel noise in the sky map is also Gaussian (Hinshaw et al. 2003b). The resolution of WMAP is quantified with a window function, wℓ (Page et al. 2003a). Thus, the likelihood function for our CMB map has the same form as equation (2), but with S replaced by C = S + N, where N is the nearly diagonal noise correlation matrix12 and Sij = Σℓ (2ℓ + 1)CℓwℓPℓ(n̂i · n̂j)/(4π).

     If foreground removal did not require a sky cut and if the noise were uniform and purely diagonal, then the likelihood function for the WMAP experiment would have the form (Bond, Jaffe, & Knox 2000)

\[
-2 \ln \mathcal{L} = \sum_{\ell} (2\ell + 1) \left[ \frac{\hat{\mathcal{C}}_\ell + \mathcal{N}_\ell}{\mathcal{C}_\ell + \mathcal{N}_\ell} + \ln\!\left( \frac{\mathcal{C}_\ell + \mathcal{N}_\ell}{\hat{\mathcal{C}}_\ell + \mathcal{N}_\ell} \right) - 1 \right] ,
\]

where the effective bias 𝒩ℓ is related to the noise bias Nℓ as

\[
\mathcal{N}_\ell = \frac{\ell(\ell + 1)}{2\pi} \, \frac{N_\ell}{w_\ell} .
\]

Note that 𝒩ℓ and 𝒞 appear together in equation (5) because the noise and cosmological fluctuations have the same statistical properties: they both are Gaussian random fields.

     Because of the foreground sky cut, different multipoles are correlated and only a fraction of the sky, fsky, is used in the analysis. In this case, it becomes computationally prohibitive to compute the exact form of the likelihood function. There are several different approximations used in the CMB literature for the likelihood function. At large ℓ, equation (5) is often approximated as Gaussian:

\[
-2 \ln \mathcal{L} \approx \sum_{\ell \ell'} \left( \mathcal{C}_\ell - \hat{\mathcal{C}}_\ell \right) Q_{\ell \ell'} \left( \mathcal{C}_{\ell'} - \hat{\mathcal{C}}_{\ell'} \right) ,
\]

where Q, the curvature matrix, is the inverse of the power spectrum covariance matrix.

     The power spectrum covariance encodes the uncertainties in the power spectrum due to cosmic variance, detector noise, point sources, the sky cut, and systematic errors. Hinshaw et al. (2003a) and § 2.2 describe the various terms that enter into the power spectrum covariance matrix.

     Since the likelihood function for the power spectrum is slightly non-Gaussian, equation (6) is a systematically biased estimator. Bond et al. (2000) suggest using a lognormal distribution, ℒLN (Bond et al. 2000; Sievers et al. 2002):

\[
-2 \ln \mathcal{L}_{\rm LN} = \sum_{\ell \ell'} \left( z_\ell - \hat{z}_\ell \right) \mathcal{Q}_{\ell \ell'} \left( z_{\ell'} - \hat{z}_{\ell'} \right) ,
\]

where zℓ = ln(𝒞ℓ + 𝒩ℓ), ẑℓ = ln(Ĉℓ + 𝒩ℓ), and 𝒬 is the local transformation of the curvature matrix Q to the lognormal variables zℓ,



We find that, for the WMAP data, both equations (6) and (7) are biased estimators. We use an alternative approximation of the likelihood function for the 𝒞ℓ values (eq. [11]) motivated by the following argument.

     We can expand the exact expression for the likelihood (eq. [4]) around its maximum by writing Ĉℓ = 𝒞ℓ(1 + ε). Then, for a single multipole ℓ,

\[
-2 \ln \mathcal{L} = (2\ell + 1) \left[ \epsilon - \ln(1 + \epsilon) \right] \simeq (2\ell + 1) \left( \frac{\epsilon^2}{2} - \frac{\epsilon^3}{3} + \frac{\epsilon^4}{4} - \cdots \right) .
\]

We note that the Gaussian likelihood approximation is equivalent to the above expression truncated at ε²:

\[
-2 \ln \mathcal{L}_{\rm Gauss} = (2\ell + 1) \, \frac{\epsilon^2}{2} .
\]

     The Bond, Jaffe, & Knox (1998) expression for the lognormal likelihood for the equal variance approximation is

\[
-2 \ln \mathcal{L}_{\rm LN} = \frac{2\ell + 1}{2} \left[ \ln(1 + \epsilon) \right]^2 \simeq (2\ell + 1) \left( \frac{\epsilon^2}{2} - \frac{\epsilon^3}{2} + \cdots \right) .
\]

Thus, our approximation of the likelihood function is given by the form

\[
\ln \mathcal{L} = \frac{1}{3} \ln \mathcal{L}_{\rm Gauss} + \frac{2}{3} \ln \mathcal{L}_{\rm LN} ,
\]

where ℒLN has the form of equation (7), except that 𝒬 is not given by equation (8) but by



We tested this form of the likelihood by making 100,000 full-sky realizations Ĉℓ of the TT angular power spectrum. For each realization, the maximum likelihood amplitude of fluctuations in the underlying model was found and the mean value was computed. Since we kept all other model parameters fixed, this one-dimensional maximization was computationally trivial. The Gaussian approximation (eq. [6]) was found to systematically overestimate the amplitude of the fluctuations by ≃0.8%, while the lognormal approximation underestimates it by ≃0.2%. Equation (11) was found to be accurate to better than 0.1%.
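
     The accuracy of this combination can also be checked analytically for a single multipole. The short sketch below is our own illustration (it is not the WMAP likelihood software), written for the noiseless case; the multipole and fluctuation values are arbitrary. It evaluates the exact, Gaussian, lognormal, and combined forms of −2 ln ℒ and shows that the one-third/two-thirds combination tracks the exact expression to third order in ε.

    import numpy as np

    def minus2lnL(eps, ell):
        """Exact, Gaussian, lognormal, and combined -2 ln(L) for a single
        multipole as functions of eps = Chat_l / C_l - 1 (noiseless case)."""
        nu = 2 * ell + 1
        exact = nu * (eps - np.log1p(eps))
        gauss = nu * eps**2 / 2.0
        lognorm = nu * np.log1p(eps)**2 / 2.0
        combined = gauss / 3.0 + 2.0 * lognorm / 3.0   # ln L = (1/3) ln L_Gauss + (2/3) ln L_LN
        return exact, gauss, lognorm, combined

    # Arbitrary illustration: +/- 10% fluctuations at ell = 50.
    for eps in (0.1, -0.1):
        exact, gauss, lognorm, combined = minus2lnL(eps, ell=50)
        print(f"eps={eps:+.2f}  exact={exact:.4f}  Gauss={gauss:.4f}  "
              f"LN={lognorm:.4f}  combined={combined:.4f}")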


     12 1/f noise makes a non–random-phase contribution to the detector noise and leads to off-diagonal terms in the noise matrix. By making the noise N0 a function of ℓ (denoted by Nℓ) we include this effect to leading order (Hinshaw et al. 2003b).

2.2. Curvature Matrix

     We obtain the curvature matrix in a form that can be used in the likelihood analysis from the power spectrum covariance matrix for Ĉℓ computed in Hinshaw et al. (2003a). The matrix is composed of several terms of the following form:



where εℓℓ′ is the coupling introduced by the beam uncertainties and point-source subtraction (εℓℓ′ = 0 if ℓ = ℓ′), δK denotes the Kronecker delta function, and Dℓ denotes the diagonal terms,

\[
D_\ell = \frac{2 \left( \mathcal{C}_\ell + \mathcal{N}_\ell \right)^2}{(2\ell + 1) \, f_{\rm sky}^2} .
\]

The quantity rℓℓ′ encodes the mode coupling due to the sky cut and is the dominant off-diagonal term (it is set to 0 if ℓ = ℓ′). The mode-coupling coefficient, rℓℓ′, is most easily defined in terms of the curvature matrix, Qℓℓ′ = δKℓℓ′/Dℓ + rℓℓ′/√(DℓDℓ′) (see Hinshaw et al. 2003b).13

     The sky cut has two significant effects on the power spectrum covariance matrix. Because less data are used, the covariance matrix is increased by a factor of 1/fsky. An additional factor of 1/fsky arises from the coupling to nearby ℓ-modes. The additional term does not lead to a loss of information, as nearby ℓ-modes are slightly anticorrelated.

     Hinshaw et al. (2003b) describe the beam uncertainty and point-source terms included in 𝒩ℓ and εℓℓ′. The beam and calibration uncertainties depend on the realization of the angular power spectrum on the sky, Ĉℓ, not on the theoretical angular power spectrum 𝒞ℓ; thus, they should not change as, in exploring the likelihood surface, we change 𝒞ℓ in the expression for Dℓ. This differs from other approaches (e.g., Bridle et al. 2002). Rescaling all the contributions to the off-diagonal terms in the covariance matrix with 𝒞ℓ is not correct and leads to a 2% bias in our estimator of ⟨𝒞ℓ⟩, which propagates, for example, into a ∼2% error on the matter density parameter Ωm or a ∼2% error on the spectral slope ns.

     We find the curvature matrix by inverting equation (13):



where we have assumed that the off-diagonal terms are small. For cosmological models that have 𝒞ℓ very different from the best-fit 𝒞ℓ, equation (15) does not yield the inverse of equation (13): in these cases the inversion of Σ needs to be computed explicitly.

     We do not propagate the WMAP 0.5% calibration uncertainty in the covariance matrix as this uncertainty does not affect cosmological parameter determinations. This systematic only affects the power spectrum amplitude constraint at the 0.5% level, while the statistical error on this quantity is ∼10%.


     13 In this equation we have set to zero the beam and point-source uncertainties. This is because the coupling coefficient is computed for an ideal cut sky.

2.2.1. Calibration with Monte Carlo Simulations

     The angular power spectrum is computed using three different weightings: uniform weighting in the signal-dominated regime (ℓ < 200), an intermediate weighting scheme for 200 < ℓ < 450, and Nobs weighting (for the noise-dominated regime 450 < ℓ ≤ 900; Hinshaw et al. 2003b). Uniform weighting is a minimum variance weighting in the signal-dominated regime, and Nobs weighting is minimum variance in the noise-dominated regime. However, in the intermediate regime the weighting schemes are not necessarily optimal, and the analytic expression for the covariance matrix might thus underestimate the errors. To ensure that we have the appropriate errors, we calibrate the covariance matrix from 100,000 Monte Carlo realizations of the sky with the WMAP noise level, symmetrized beams, and the Kp2 sky cut. A good approximation of the curvature matrix can be obtained by using equations (13)–(15), but substituting 𝒩ℓ and fsky with 𝒩ℓ^eff and fsky^eff calibrated from the Monte Carlo simulations, as shown in Figures 1 and 2.


Fig. 1.—   Ratio of the effective sky coverage to the actual sky coverage. This correction factor calibrates the expression for the Fisher matrix to the value obtained from the Monte Carlo approach. Here we show the ratio obtained from 100,000 simulations (jagged line); the smooth curve shows the fit we use, eq. (16). Note that, since we are switching between weighting schemes, the correction factors are not expected to smoothly interpolate between regimes.


Fig. 2.—   Correction factor for the noise. The lines are as in Fig. 1. Note that, since we are switching between weighting schemes, the correction factors are not expected to smoothly interpolate between regimes.

     We find that for ℓ < 200 the weighting scheme is nearly optimal. The power spectrum covariance matrix (eq. [13]) gives a correct estimate of the error bars; thus, we do not need to calibrate 𝒩ℓ or fsky. We have computed an effective reduced χ²,14 χ²eff/ν ≡ -2 ln ℒ/ν, where ν is the number of degrees of freedom. The effective reduced χ² from the Monte Carlo simulations in this ℓ range is consistent with unity.

     In the intermediate regime our Ansatz power spectrum covariance matrix (eq. [13]) slightly underestimates the errors. This can be corrected by computing the covariance matrix for an effective fraction of the sky fsky^eff, as shown in Figure 1. The jagged line is the ratio obtained from the Monte Carlo simulations, while the smooth curve shows the fit to fsky^eff we adopt,



for 200 < ℓ < 450.

     For ℓ > 450, in the noise-dominated regime, the weighting is asymptotically optimal for ℓ → ∞. However, since we are using a smaller fraction of the sky, we need again to correct the fsky factor. This numerical factor describes the reduction in effective sky coverage due to weighting the well-observed ecliptic poles more heavily than the ecliptic plane (see Fig. 3 of Bennett et al. 2003a). We fit this factor to the numerical simulations of the TT spectrum covariance matrix. Kogut et al. (2003) note that this same factor is also a good fit to the Monte Carlo simulations of the TE spectrum covariance matrix. For the noise-dominated regime, we define an effective sky fraction fsky^eff = fsky/1.14 and an effective noise given by



which can be obtained from the noise bias of the maps 𝒩ℓ by a noise correction factor 𝒩ℓ^eff/𝒩ℓ. This is shown in Figure 2, where the smooth curve is the fit we adopt to this correction factor,



for ℓ > 450.

     This calibration of the covariance matrix from the Monte Carlo simulations allows us to use the effective reduced χ2 as a tool to assess goodness of fit. It can also be used to determine the relative likelihood of different models (e.g., Peiris et al. 2003).


     14 This is not exactly the reduced χ², because the likelihood is non-Gaussian, especially at low ℓ.

2.3. Likelihood for the TE Angular Power Spectrum

     Since the TE signal is noise dominated, we adopt a Gaussian likelihood, where the curvature matrix is given by



The expression for Dℓ is given by equation (10) of Kogut et al. (2003), and the coupling coefficient due to the sky cut, rℓℓ′, is obtained from 100,000 Monte Carlo realizations of the sky with the WMAP mask and noise level. The TE spectrum is computed with noise inverse weighting; in this regime rℓℓ′ depends only on the difference Δℓ = ℓ - ℓ′ and is set to 0 at separations Δℓ > 15. We use all multipoles 2 ≤ ℓ ≤ 450, as comparison with the Monte Carlo realizations shows that in this regime equation (18) correctly estimates the TE uncertainties. We have also verified on the simulations that the Gaussian likelihood is an unbiased estimator and that the effective reduced χ² is centered around 1.

     The amplitude of the covariance between the TT and TE power spectra is ∼ρℓ(Cℓ/nℓ)/(1 + Cℓ/nℓ), where ρℓ is the correlation term (CℓTE)²/(CℓTT CℓEE) ≃ 0.2. Since Cℓ/nℓ ≪ 0.25 for 1 yr data, we neglect this term, but we will include it in the 2+ yr analysis as it becomes increasingly important.

     We provide a subroutine15 that reads in a set of 𝒞ℓ (TT, or TE, or both) and returns the likelihood for the WMAP data set, including all the effects described in this section.


     15 The routine is available at http://lambda.gsfc.nasa.gov.

3. MARKOV CHAIN MONTE CARLO LIKELIHOOD ANALYSIS

     The analysis described in Spergel et al. (2003) and Peiris et al. (2003) is numerically demanding. At each point in the six-dimensional (or more) parameter space a new model is computed with CMBFAST16 (Seljak & Zaldarriaga 1996). Our version of the code incorporates a number of corrections and uses the RECFAST (Seager, Sasselov, & Scott 1999) recombination routine. Most of the likelihood calculations were done on four shared-memory, 32-CPU SGI Origin 300 machines with 600 MHz processors. With eight processors per calculation, each evaluation of CMBFAST for ℓ < 1500 for a flat reionized Λ-dominated universe requires 3.6 s. (The scaling is not linear; with 32 processors each evaluation requires 1.62 s.)

     A grid-based likelihood analysis would have required prohibitive amounts of CPU time. For example, a coarse grid (∼20 grid points per dimension) with six parameters requires ∼6.4 × 10⁷ evaluations of the power spectra. At 1.6 s per evaluation, the calculation would take ∼1200 days. Christensen & Meyer (2000) proposed using MCMC to investigate the likelihood space. This approach has become the standard tool for CMB analyses (e.g., Christensen et al. 2001; Knox, Christensen, & Skordis 2001; Lewis & Bridle 2002; Kosowsky, Milosavljevic, & Jimenez 2002) and is the backbone of our analysis effort. For a flat reionized Λ-dominated universe, we can evaluate the likelihood ∼120,000 times in less than 2 days using four sets of eight processors. As we explain below, this is adequate for finding the best-fit model and for reconstructing the 1 and 2 σ confidence levels for the cosmological parameters.

     We refer the reader to Gilks, Richardson, & Spiegelhalter (1996) for more information about MCMC. Here we only provide a brief introduction to the subject and concentrate on the issue of convergence.


     16 We used the parallelized ver. 4.1 of CMBFAST developed in collaboration with Uros Seljak and Matias Zaldarriaga.

3.1. Markov Chain Monte Carlo

     MCMC is a method to simulate posterior distributions. In particular, we simulate observations from the posterior distribution 𝒫(α|x) of a set of parameters α given event x, obtained via Bayes's theorem,

\[
\mathcal{P}(\alpha \,|\, x) = \frac{\mathcal{P}(x \,|\, \alpha) \, \mathcal{P}(\alpha)}{\int \mathcal{P}(x \,|\, \alpha) \, \mathcal{P}(\alpha) \, d\alpha} ,
\]

where 𝒫(x|α) is the likelihood of event x given the model parameters α and 𝒫(α) is the prior probability density. For our application with WMAP, α denotes a set of cosmological parameters (e.g., for the standard, flat ΛCDM model these could be the cold dark matter density parameter Ωc, the baryon density parameter Ωb, the spectral slope ns, the Hubble constant h [in units of 100 km s⁻¹ Mpc⁻¹], the optical depth τ, and the power spectrum amplitude A), and event x will be the set of observed Ĉℓ.

     The MCMC generates random draws (i.e., simulations) from the posterior distribution that are a "fair" sample of the likelihood surface. From this sample, we can estimate all of the quantities of interest about the posterior distribution (mean, variance, confidence levels). The MCMC method scales approximately linearly with the number of parameters, thus allowing us to perform likelihood analysis in a reasonable amount of time.

     A properly derived and implemented MCMC draws from the joint posterior density 𝒫(α|x) once it has converged to the stationary distribution. The primary consideration in implementing MCMC is determining when the chain has converged. After an initial "burn-in" period, all further samples can be thought of as coming from the stationary distribution. In other words, the chain has no dependence on the starting location.

     Another fundamental problem of inference from Markov chains is that there are always areas of the target distribution that have not been covered by a finite chain. If the MCMC is run for a very long time, the ergodicity of the Markov chain guarantees that eventually the chain will cover all the target distribution, but in the short term the simulations cannot tell us about areas where they have not been. It is thus crucial that the chain achieves good "mixing." If the Markov chain does not move rapidly throughout the support of the target distribution because of poor mixing, it might take a prohibitive amount of time for the chain to fully explore the likelihood surface. Thus, it is important to have a convergence criterion and a mixing diagnostic. Plots of the sampled MCMC parameters or likelihood values versus iteration number are commonly used to provide such criteria (Fig. 3, left-hand panel). However, samples from a chain are typically serially correlated; very high autocorrelation leads to little movement of the chain and thus makes the chain "appear" to have converged. For a more detailed discussion see Gilks et al. (1996). Using an MCMC that has not fully explored the likelihood surface for determining cosmological parameters will yield wrong results. We describe below the method we use to ensure convergence and good mixing.


Fig. 3.—   Unconverged Markov chains. The left-hand panel shows a trace plot of the likelihood values vs. iteration number for one MCMC (these are the first 3000 steps from one of our ΛCDM model runs). Note the burn-in for the first ∼100 steps. In the right-hand panel, red dots are points of the chain in the (ns, A)-plane after discarding the burn-in. Green dots are from another MCMC for the same data set and the same model. It is clear that, although the trace plot may appear to indicate that the chain has converged, it has not fully explored the likelihood surface. Using either of these two chains at this stage will give incorrect results for the best-fit cosmological parameters and their errors.

3.2. Convergence and Mixing

     We use the method proposed by Gelman & Rubin (1992) to test for convergence and mixing. They advocate comparing several sequences drawn from different starting points and checking to see that they are indistinguishable. This method not only tests convergence but can also diagnose poor mixing. For any analysis of the WMAP data, we strongly encourage the use of a convergence criterion.

     Let us consider M chains (the analyses in Spergel et al. 2003 and Peiris et al. 2003 use four chains unless otherwise stated) starting at well-separated points in parameter space; each has 2N elements, of which we consider only the last N: {y^j_i}, where i = 1, …, N and j = 1, …, M; i.e., y^j_i denotes a chain element (a point in parameter space), the index i runs over the elements in a chain, and the index j runs over the different chains. We define the mean of the chain

\[
\bar{y}^j = \frac{1}{N} \sum_{i=1}^{N} y_i^j ,
\]

and the mean of the distribution

\[
\bar{y} = \frac{1}{NM} \sum_{i=1}^{N} \sum_{j=1}^{M} y_i^j .
\]

We then define the variance between chains as

\[
B_n = \frac{1}{M - 1} \sum_{j=1}^{M} \left( \bar{y}^j - \bar{y} \right)^2 ,
\]

and the variance within a chain as

\[
W = \frac{1}{M (N - 1)} \sum_{i=1}^{N} \sum_{j=1}^{M} \left( y_i^j - \bar{y}^j \right)^2 .
\]

The quantity

\[
\hat{R} = \frac{ \dfrac{N - 1}{N} \, W + B_n \left( 1 + \dfrac{1}{M} \right) }{ W }
\]

is the ratio of two estimates of the variance in the target distribution: the numerator is an estimate of the variance that is unbiased if the distribution is stationary, but it is otherwise an overestimate. The denominator is an underestimate of the variance of the target distribution if the individual sequences did not have time to converge.

     The convergence of the Markov chain is then monitored by recording the quantity R̂ for all the parameters and running the simulations until the values of R̂ are always less than 1.1. A. Gelman (Kass et al. 1997) suggests using values of R̂ < 1.2. Here we conservatively adopt the criterion R̂ < 1.1 as our definition of convergence. We have found that the four chains will sometimes go in and out of convergence as they explore the likelihood surface, especially if the number of points already in the chain is small. To avoid this, one could run many chains simultaneously or run one chain for a very long time (e.g., Panter, Heavens, & Jimenez 2002). Because of CPU-time constraints, we run four chains until they fulfill both of the following criteria: (1) they have reached convergence and (2) each chain contains at least 30,000 points. In addition to minimizing chance deviations from convergence, we find that this many points are needed to be able to robustly reconstruct the 1 and 2 σ levels of the marginalized likelihood for all the parameters. For most chains, the burn-in time is relatively rapid, so that we only discard the first 200 points in each chain; however, the results are not sensitive to this procedure.
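
     For reference, the convergence statistic defined above reduces to a few lines of code. The sketch below is our own illustration (not the pipeline implementation) for a single parameter traced by M chains with N post-burn-in elements each; the toy chains at the end are purely illustrative.

    import numpy as np

    def gelman_rubin(chains):
        """Gelman & Rubin (1992) statistic R_hat for one parameter.
        `chains` is an (M, N) array: M chains, N post-burn-in samples each.
        Values below ~1.1 indicate convergence and good mixing."""
        chains = np.asarray(chains, dtype=float)
        M, N = chains.shape
        chain_means = chains.mean(axis=1)                   # mean of each chain
        grand_mean = chain_means.mean()                     # mean of the distribution
        B_n = ((chain_means - grand_mean) ** 2).sum() / (M - 1)           # between-chain variance
        W = ((chains - chain_means[:, None]) ** 2).sum() / (M * (N - 1))  # within-chain variance
        return ((N - 1) / N * W + B_n * (1 + 1 / M)) / W

    # Toy example: four chains drawn from the same distribution (so R_hat ~ 1).
    rng = np.random.default_rng(0)
    print(gelman_rubin(rng.normal(0.7, 0.05, size=(4, 30000))))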

3.3. Markov Chains in Practice

     In this section we explain the necessary steps to run an MCMC for the CMB temperature power spectrum. It is straightforward to generalize these instructions to include the temperature-polarization power spectrum and other data sets. The MCMC is essentially a random walk in parameter space, where the probability of being at any position in the space is proportional to the posterior probability.

     Here is our basic approach:

  1. Start with a set of cosmological parameters {α1} and compute the 𝒞ℓ and the likelihood ℒ1 = ℒ(Ĉℓ|𝒞ℓ).

  2. Take a random step in parameter space to obtain a new set of cosmological parameters {α2}. The probability distribution of the step is taken to be Gaussian in each direction i with rms given by σi. We will refer below to σi as the "step size." The choice of the step size is important to optimize the chain efficiency (see § 3.4.2).

  3. Compute the 𝒞ℓ for the new set of cosmological parameters and their likelihood ℒ2.

  4a. If ℒ2/ℒ1 ≥ 1, "take the step," i.e., save the new set of cosmological parameters {α2} as part of the chain, then go to step 2 after the substitution {α1} → {α2}.

  4b. If ℒ2/ℒ1 < 1, draw a random number x from a uniform distribution from 0 to 1. If x ≥ ℒ2/ℒ1, "do not take the step," i.e., save the parameter set {α1} as part of the chain and return to step 2. If x < ℒ2/ℒ1, "take the step," i.e., do as in step 4a.

  5. For each cosmological model run four chains starting at randomly chosen, well-separated points in parameter space. When the convergence criterion is satisfied and the chains have enough points to provide reasonable samples from the a posteriori distributions (i.e., enough points to be able to reconstruct the 1 and 2 σ levels of the marginalized likelihood for all the parameters), stop the chains.

It is clear that the MCMC approach is easily generalized to compute the joint likelihood of WMAP data with other data sets.
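
     A minimal sketch of the sampling loop in steps 1-5 is given below for a generic log-likelihood (our own illustration; the actual analysis evaluates CMBFAST at every step and uses the parameters, step sizes, and convergence tests of §§ 3.2-3.5). The toy Gaussian target and the starting values are placeholders.

    import numpy as np

    def metropolis_chain(loglike, start, step_sizes, nsteps, seed=0):
        """Metropolis sampler: Gaussian proposal with per-parameter rms
        `step_sizes`; the current point is saved again whenever a step is rejected."""
        rng = np.random.default_rng(seed)
        current = np.asarray(start, dtype=float)
        current_lnL = loglike(current)
        chain = []
        for _ in range(nsteps):
            trial = current + np.asarray(step_sizes) * rng.standard_normal(current.size)
            trial_lnL = loglike(trial)
            # Steps 4a/4b: accept if more likely, else with probability L2/L1.
            if np.log(rng.uniform()) < trial_lnL - current_lnL:
                current, current_lnL = trial, trial_lnL
            chain.append(current.copy())
        return np.array(chain)

    # Toy target: two independent Gaussians standing in for a real likelihood.
    loglike = lambda p: -0.5 * np.sum(((p - np.array([0.12, 0.023])) / np.array([0.01, 0.001])) ** 2)
    chain = metropolis_chain(loglike, start=[0.15, 0.02],
                             step_sizes=[0.01, 0.001], nsteps=5000)
    print(chain.mean(axis=0), chain.std(axis=0))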

3.4. Improving MCMC Efficiency

     The Markov chain efficiency can be improved in different ways. We have tuned our algorithm by reparameterization and optimization of the step size.

3.4.1. Reparameterization

     Degeneracies and poor parameter choices slow the rate of convergence and mixing of the Markov chain. There is one near-exact degeneracy (the geometric degeneracy) and several approximate degeneracies in the parameters describing the CMB power spectrum (Bond et al. 1994; Efstathiou & Bond 1999). The numerical effects of these degeneracies are reduced by finding a combination of cosmological parameters (e.g., Ωc, Ωb, h) that have essentially orthogonal effects on the angular power spectrum. The use of such parameter combinations removes or reduces degeneracies in the MCMC and hence speeds up convergence and improves mixing because the chain does not have to spend time exploring degeneracy directions. Kosowsky et al. (2002) introduced a set of reparameterizations to do just this. In addition, these new parameters reflect the underlying physical effects determining the form of the CMB power spectrum (we refer to these as physical parameters). This leads to particularly intuitive and transparent parameter dependencies of the CMB power spectrum.

     Following Kosowsky et al. (2002), we use a core set of six physical parameters. There are two parameters for the physical energy densities of cold dark matter, ωc ≡ Ωch², and baryons, ωb ≡ Ωbh². There is a parameter for the characteristic angular scale of the acoustic peaks,

\[
\theta_A = \frac{r_s(a_{\rm dec})}{d_A(a_{\rm dec})} ,
\]

where adec is the scale factor at decoupling,

\[
r_s(a_{\rm dec}) = \frac{c}{H_0 \sqrt{3}} \int_0^{a_{\rm dec}} \frac{da}{\sqrt{\left( 1 + \dfrac{3\Omega_b}{4\Omega_\gamma} a \right) \left( \Omega_{\rm rad} + \Omega_m a + \Omega_\Lambda a^{1 - 3w} + (1 - \Omega) a^2 \right)}}
\]

is the sound horizon at decoupling, and

\[
d_A(a_{\rm dec}) = \frac{c}{H_0 \sqrt{|1 - \Omega|}} \, \mathrm{sinn}\!\left[ \sqrt{|1 - \Omega|} \int_{a_{\rm dec}}^{1} \frac{da}{\sqrt{\Omega_{\rm rad} + \Omega_m a + \Omega_\Lambda a^{1 - 3w} + (1 - \Omega) a^2}} \right] ,
\qquad
\mathrm{sinn}(x) =
\begin{cases}
\sin x & \Omega > 1 \\
x & \Omega = 1 \\
\sinh x & \Omega < 1
\end{cases}
\]

is the angular diameter distance at decoupling, where H0 denotes the Hubble constant and c is the speed of light. Here Ωm = Ωc + Ωb, ΩΛ denotes the dark energy density parameter, w is the equation of state of the dark energy component, Ω = Ωm + ΩΛ, and the radiation density parameter Ωrad = Ωγ + Ων, where Ωγ and Ων are the photon and neutrino density parameters, respectively. For reionization we use the physical parameter 𝒵 ≡ exp(-2τ), where τ denotes the optical depth to the last scattering surface (not the decoupling surface). The remaining two core parameters are the spectral slope of the scalar primordial density perturbation power spectrum, ns, and the overall amplitude of the primordial power spectrum, A. Both are normalized at k = 0.05 Mpc⁻¹ (ℓ ∼ 700).
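
     To make the reparameterization concrete, the sketch below (our own illustration, not the pipeline code) evaluates ωb, ωc, 𝒵, and θA from (Ωb, Ωc, h, τ) for a flat universe with w = -1, integrating the expressions for rs and dA numerically. The fiducial parameter values and the radiation densities (Ωγh² ≃ 2.47 × 10⁻⁵, Ωradh² ≃ 4.15 × 10⁻⁵) are assumptions made purely for illustration.

    import numpy as np
    from scipy.integrate import quad

    C_KM_S  = 2.998e5       # speed of light [km/s]
    OMEGA_G = 2.47e-5       # Omega_gamma h^2 (assumed photon density)
    OMEGA_R = 4.15e-5       # Omega_rad h^2, photons + massless neutrinos (assumed)
    A_DEC   = 1.0 / 1090.0  # scale factor at decoupling, z_dec ~ 1089

    def physical_parameters(Omega_b=0.046, Omega_c=0.224, h=0.72, tau=0.17):
        wb, wc = Omega_b * h**2, Omega_c * h**2
        Omega_m = Omega_b + Omega_c
        Omega_L = 1.0 - Omega_m - OMEGA_R / h**2             # flatness assumed
        H0 = 100.0 * h                                       # [km/s/Mpc]

        def H(a):                                            # Hubble rate [km/s/Mpc]
            return H0 * np.sqrt(OMEGA_R / h**2 / a**4 + Omega_m / a**3 + Omega_L)

        # Sound horizon: integrate the sound speed over c dt / a up to a_dec.
        R = lambda a: 0.75 * (wb / OMEGA_G) * a              # 3 rho_b / (4 rho_gamma)
        rs, _ = quad(lambda a: C_KM_S / np.sqrt(3.0 * (1.0 + R(a))) / (a**2 * H(a)),
                     1e-8, A_DEC)
        # Comoving angular diameter distance to decoupling (flat case).
        dA, _ = quad(lambda a: C_KM_S / (a**2 * H(a)), A_DEC, 1.0)

        return {"omega_b": wb, "omega_c": wc, "Z": np.exp(-2.0 * tau),
                "theta_A": rs / dA}

    print(physical_parameters())    # theta_A comes out near 0.01 rad for these values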

     For more complex models we add other parameters as described in Spergel et al. (2003) and Peiris et al. (2003) and in § 5. To investigate nonflat models, we use the vacuum energy, ωΛ ≡ ΩΛh². Other examples include the tensor index, nt, the tensor-to-scalar ratio, r, and the running of the scalar spectral index, dns/d ln k.

     Here we relate the input parameter for the overall normalization, A, as in the CMBFAST code (ver. 4.1 with UNNORM option), to the amplitude of primordial comoving curvature perturbations ℛ, Δ²ℛ(k0) ≡ (k³/2π²)⟨|ℛk|²⟩. We also relate our convention for the tensor perturbations to the one in the code. CMBFAST calculates



where Ψ is the Newtonian potential, gTℓ(k) is the radiation transfer function, and T0 = 2.725 × 10⁶ is the CMB temperature in units of μK. The tilde denotes that Δ̃²(k) is the quantity used in CMBFAST; it differs from our convention, Δ²(k), by Δ̃² = Δ²/16. The comoving curvature perturbation, ℛ, is related to Ψ by Ψ = -(3/5)ℛ; thus, Δ²ℛ(k) = (25/9)Δ²Ψ(k). Note that this relation holds from radiation domination to matter domination with accuracy better than 0.5%.

     CMBFAST uses A to parameterize Δ²ℛ(k0). The tensor perturbations are calculated accordingly. The relations are



Therefore, one obtains



     The amplitude A is normalized at k1 = 0.05 Mpc⁻¹ and the tensor-to-scalar ratio r is evaluated at k0 = 0.002 Mpc⁻¹, unless otherwise specified. To convert A(k0) to A(k1), we use

\[
A(k_1) = A(k_0) \left( \frac{k_1}{k_0} \right)^{n_s - 1 + \frac{1}{2} (d n_s / d \ln k) \ln(k_1 / k_0)} .
\]

3.4.2. Step Size Optimization

     The choice of the step size in the Markov chain is crucial to improve the chain efficiency and speed up convergence. If the step size is too big, the acceptance rate will be very small; if the step size is too small, the acceptance rate will be high, but the chain will exhibit poor mixing. Both situations will lead to slow convergence. For our initial step sizes for each parameter we use the standard deviation for each parameter when all the other parameters are held fixed at the maximum likelihood value. These are easy to find once a preliminary chain has been run and the likelihood surface has been fitted, as explained in § 3.4.3. If a given parameter is roughly orthogonal to all the other parameters, it is not necessary to adjust the step size further; in the presence of severe degeneracies the step size estimate needs to be increased by a "banana correction" factor, which is approximately the ratio of the projection of the 1 σ error along the degeneracy to the projection perpendicular to the degeneracy.

     With these optimizations the convergence criterion is satisfied for the four chains after roughly 30,000 steps each (2N = 30,000) for a model with six parameters. On an Origin 300 machine this takes roughly 32 hr running each chain on eight processors. These numbers serve only as a rough indication: convergence speed depends on the model and on the data set. For a fixed number of parameters, convergence can be significantly slower if there are severe degeneracies among the parameters. Adding more data sets might slow down the evaluation of a single step in the chain but can also speed up convergence by breaking degeneracies.

3.4.3. Likelihood Surface Fitting

     The likelihood surface explored by the MCMC was found to be well approximated by a quartic expansion in the cosmological parameters (for example, {αi} = {ωb, ωc, ns, θA, 𝒵, A}):

\[
-2 \ln \mathcal{L} \simeq q_0 + \sum_i q_i \, \delta_i + \sum_{i \le j} q_{ij} \, \delta_i \delta_j + \sum_{i \le j \le k} q_{ijk} \, \delta_i \delta_j \delta_k + \sum_{i \le j \le k \le l} q_{ijkl} \, \delta_i \delta_j \delta_k \delta_l .
\]

Here the q's are fit coefficients and the δi are related to the cosmological parameters via δi = (αi - αi*)/αi*, where αi* is the maximum likelihood value of the parameter. Lower order expansions were unable to reproduce the likelihood surface. With six parameters there are Mf = 210 fit coefficients. Writing equation (34) as y = X · q, the minimum least-squares estimator for q is

\[
\hat{q} = \left( X^{t} X \right)^{-1} X^{t} y ,
\]

where X is the N × Mf matrix of the monomials in δi evaluated at the chain points and N is the number of unique points in the chain.

     We run preliminary MCMC chains with "guesstimated" step sizes until there are ∼1000 unique points in total. Then we use equation (34) to cut through the likelihood surface at the maximum likelihood value to find the 1 σ level in each parameter direction (see § 3.4.2). This defines our "step size" for subsequent chains.
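
     Schematically, the fit amounts to building the design matrix of all monomials of degree ≤ 4 in the δi (210 columns for six parameters) and solving the normal equations by least squares. The sketch below is our own illustration; the toy chain and the quadratic stand-in for -2 ln ℒ are placeholders.

    import numpy as np
    from itertools import combinations_with_replacement

    def quartic_design_matrix(delta):
        """All monomials of degree 0..4 in the columns of `delta`
        (shape: Npoints x Nparams); gives 210 columns for 6 parameters."""
        npts, npar = delta.shape
        cols = [np.ones(npts)]
        for degree in (1, 2, 3, 4):
            for idx in combinations_with_replacement(range(npar), degree):
                cols.append(np.prod(delta[:, list(idx)], axis=1))
        return np.column_stack(cols)

    def fit_likelihood_surface(alpha, alpha_ml, y):
        """Least-squares fit of y = -2 ln(L) as a quartic in delta_i = (alpha_i - alpha_ml_i)/alpha_ml_i."""
        delta = (alpha - alpha_ml) / alpha_ml
        X = quartic_design_matrix(delta)
        q, *_ = np.linalg.lstsq(X, y, rcond=None)   # same as (X^t X)^{-1} X^t y
        return q

    # Toy example: ~1000 unique points in six parameters.
    rng = np.random.default_rng(1)
    alpha_ml = np.array([0.023, 0.12, 0.99, 0.0104, 0.7, 0.8])
    alpha = alpha_ml * (1.0 + 0.05 * rng.standard_normal((1000, 6)))
    y = np.sum(((alpha - alpha_ml) / (0.05 * alpha_ml)) ** 2, axis=1)   # stand-in for -2 ln L
    print(fit_likelihood_surface(alpha, alpha_ml, y).size)              # -> 210 coefficients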

3.5. The Choice of Priors

     From Bayes's theorem (eq. [19]) we can infer 𝒫(αi|x), the probability of the model parameters αi given the event x (i.e., our observation of the power spectra), from the likelihood function once the prior is specified. It is reasonable to take prior probabilities to be equal when nothing is known to the contrary (Bayes's postulate). Unless otherwise stated, we assume uniform priors on the parameters given in the following list:

  1. 0 ≤ ωc ≤ 1.

  2. 0 ≤ ωb ≤ 1.

  3. 0.005 ≤ θA ≤ 0.1.

  4. 0 ≤ τ ≤ 0.3.

  5. 0.5 ≤ A ≤ 2.5.

  6. 0 ≤ ns ≤ 2.

  7. 0 ≤ niso ≤ 2.

  8. 0 ≤ fiso ≤ 5000.

  9. -0.5 ≤ dns/d ln k ≤ 0.5.

  10. 0 ≤ r ≤ 2.5.

  11. 0 ≤ ων ≤ 1.

  12. -3.2 (-1.2) ≤ w ≤ 0 (we will present two sets of results, one with the prior w ≥ -1.2 and the other with w ≥ -3.2).

  13. 0 ≤ ωΛ ≤ 1.

Note that we assume uniform priors on ωc, ωb, and θA rather than uniform priors on Ωm, Ωb, and H0.

     Except for the priors on τ and w (the equation of state of the dark energy component), the MCMCs never hit the imposed boundaries; thus, most of our choices for priors have no effect on the outcome. A detailed discussion about the prior on τ is presented in Spergel et al. (2003).

     We set a lower bound on w at -3.2 (-1.2), but we discard the region of parameter space where w < -3 (w < -1). This is necessary because our best-fit value for this parameter is close to the boundary. If we had instead set the prior to be w ≥ -3 (w ≥ -1), then the chains would fail to be a fair representation of the posterior distribution in the region of parameter space where the distance from the boundary is comparable to the step size.

3.6. MCMC Output Analysis

     We merge the four converged MCMCs (≳120,000 points) into one. From this we give the cosmological parameters that yield our best estimate of 𝒞ℓ and the marginalized distribution of the parameters. We compute the marginalized distribution for one parameter and the joint distribution for two parameters, obtained marginalizing over all the other parameters. Since the MCMC passes objective tests for convergence and mixing, the density of points in parameter space is proportional to the posterior probability of the parameters.

     The marginalized distribution is obtained by projecting the MCMC points. For the marginalized parameter values, Spergel et al. (2003) quote the expectation value of the marginalized likelihood, ⟨αi⟩ = ∫ αi ℒ(αi) dαi = (1/N) Σt αt,i. Here N is the number of points in the merged chain and αt,i denotes the value of parameter αi at the tth step of the chain. The last equality becomes clear if we consider that the MCMC gives to each point in parameter space a "weight" proportional to the number of steps the chain has spent at that particular location. The 100(1 - 2p)% confidence interval [cp, c1-p] for a parameter is estimated by setting cp to the pth quantile of αt,i, t = 1, …, N, and c1-p to the (1 - p)th quantile. The procedure is similar for multidimensional constraints: the density of points in the n-dimensional space is proportional to the likelihood, and multidimensional confidence levels can be found as illustrated in § 15.6 of Press et al. (1992).
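
     In code, these marginalized statistics reduce to a mean and two quantiles of the merged chain. The sketch below is our own illustration, with a toy chain standing in for the real samples.

    import numpy as np

    def marginalized_summary(samples, p=0.16):
        """Mean and 100(1 - 2p)% confidence interval of one parameter from a
        merged, converged chain. Each chain element already carries its MCMC
        weight through repetition, so a plain mean and plain quantiles suffice."""
        lo, hi = np.quantile(samples, [p, 1.0 - p])
        return np.mean(samples), (lo, hi)

    # Toy merged chain of 120,000 samples for one parameter (say, n_s).
    rng = np.random.default_rng(2)
    mean, (lo, hi) = marginalized_summary(rng.normal(0.99, 0.04, size=120000))
    print(f"n_s = {mean:.3f}  (68% interval: {lo:.3f} - {hi:.3f})")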

     We note that the global maximum likelihood value for the parameters does not necessarily coincide with the expectation value of their marginalized distribution if the likelihood surface is not a multivariate Gaussian. We find that, for most of the parameters, the maximum likelihood values of the global joint fit are consistent with the expectation values of the marginalized distribution.

     A virtue of the MCMC method is that the addition of extra data sets in the joint analysis can efficiently be done with minimal computational effort from the MCMC output if the inclusion of extra data sets does not require the introduction of extra parameters or does not drive the parameters significantly away from the current best fit. For example, we add Lyα power spectrum constraints to the MCMC outputs, but we cannot do this for the 2dFGRS, since this requires the introduction of two extra parameters (β and σp; see § 5.1 for more details).

     If the likelihood surface for a subset of parameters from an external (independent) data set is known, or if a prior needs to be added a posteriori, the joint likelihood surface can be obtained by multiplying the likelihood with the posterior distribution of the MCMC output. In Spergel et al. (2003) we follow this method to obtain the joint constraint of CMB with Type Ia supernova data (Riess et al. 1998, 2001) and of CMB with the Hubble Key Project determination of the Hubble constant (Freedman et al. 2001).

     There is yet another advantage of the MCMC technique. The current version of CMBFAST with the nominal interpolation settings is accurate to 1%, but random numerical errors can sometimes exceed this. As the precision of the CMB measurements improves, these effects can become problematic for any approach that calculates derivatives as a function of parameters. Because MCMC calculations average over ∼100,000 CMB calculations, the MCMC technique is much less sensitive to these errors than either grid-based likelihood calculations or methods that numerically calculate the Fisher matrix.

4. EXTERNAL CMB DATA SETS

     The Cosmic Background Imager (CBI; Mason et al. 2002; Sievers et al. 2002; Pearson et al. 2002) and the Arcminute Cosmology Bolometer Array Receiver (ACBAR; Kuo et al. 2002) experiments complement WMAP by probing the amplitude of CMB temperature power spectrum at ℓ > 900. These observations probe the Silk damping tail and improve our analysis in two ways: (1) they improve our ability to constrain the baryon density, the amplitude of fluctuations, and the slope of the matter power spectrum, and (2) they improve convergence by preventing the chains from spending long periods of time in large, moderately low likelihood regions of parameter space.

     The CBI data set is described in Mason et al. (2002), in Pearson et al. (2002), and on their Web site.17 We use data from the CBI mosaic data set (Pearson et al. 2002) and do not include the deep data set as the two data sets are not independent. We use the three band powers from the even binning at central ℓ-values of 876, 1126, and 1301, thus ensuring that the chosen band powers can be considered independent from the WMAP data. At ℓ ≳ 1500, the CBI experiment detected excess power. If the rms amplitude of mass fluctuations on scales of 8 h⁻¹ Mpc is σ8 ∼ 1, then this excess power can be interpreted as due to Sunyaev-Zel'dovich distortion from undetected galaxy clusters (Mason et al. 2002; Bond et al. 2002; Komatsu & Seljak 2002). We simplify our analyses by not using the CBI data on scales where this effect can be important. The correlations between different band powers are taken into account with the full covariance matrix; we use the lognormal form of the likelihood (as in Pearson et al. 2002). In addition, we marginalize over a 10% calibration uncertainty (CBI beam uncertainties are negligible).

     The ACBAR data set is described in Kuo et al. (2002). We use the seven band powers at multipoles 842, 986, 1128, 1279, 1426, 1580, and 1716. As shown in Figure 4, these points do not overlap with the WMAP power spectrum except at ℓ ∼ 800, where WMAP is noise dominated. As shown in Figure 4, the ACBAR experiment is less sensitive to Sunyaev-Zel'dovich contamination than CBI. We perform the likelihood analysis for cosmological parameters for the ACBAR data set following Goldstein et al. (2002) and using the error bars given on the ACBAR Web site.18 In addition, we marginalize over conservative beam and calibration uncertainties (B. Holzapfel 2002, private communication). In particular, we assume a calibration uncertainty of 20% (double the nominal value) and a 5% beam uncertainty (60% larger than the nominal value).
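
     One simple way to carry out such a marginalization numerically is sketched below (our own illustration of the general idea, not the specific scheme used for CBI or ACBAR: a Gaussian band-power likelihood and a Gaussian calibration prior are assumed, and the band-power values are placeholders).

    import numpy as np

    def marginalize_calibration(bandpowers, cov, theory, sigma_cal=0.10, ngrid=201):
        """-2 ln L for a set of band powers after numerically marginalizing over
        a calibration factor with a Gaussian prior of fractional width sigma_cal."""
        icov = np.linalg.inv(cov)
        cals = np.linspace(1.0 - 5 * sigma_cal, 1.0 + 5 * sigma_cal, ngrid)
        chi2 = np.empty(ngrid)
        for i, cal in enumerate(cals):
            d = bandpowers - cal * theory
            chi2[i] = d @ icov @ d + (cal - 1.0) ** 2 / sigma_cal ** 2
        # Integrate exp(-chi2/2) over the calibration grid.
        weight = np.sum(np.exp(-0.5 * (chi2 - chi2.min()))) * (cals[1] - cals[0])
        return chi2.min() - 2.0 * np.log(weight)

    # Illustrative numbers only (not the measured band powers).
    theory = np.array([2400.0, 1800.0, 1200.0])
    data = 1.05 * theory
    cov = np.diag((0.15 * theory) ** 2)
    print(marginalize_calibration(data, cov, theory, sigma_cal=0.10))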


Fig. 4.—   CMB angular power spectrum (in μK²) for our best-fit ΛCDM model for ℓ > 800 and the Sunyaev-Zel'dovich contribution for σ8 = 0.98 for the CBI wavelengths (dotted line) and for ACBAR (dashed line). The vertical line shows the adopted cutoff for CBI and ACBAR.

     The ACBAR and CBI data are completely independent from each other (they map different regions of the sky) and from the WMAP data (the band powers we consider span different ℓ ranges). To perform the joint likelihood analysis, we simply multiply the individual likelihoods.


     17 See http://www.astro.caltech.edu/~tjp/CBI/data/index.html (last update 2002 August).
     18 See http://cosmology.berkeley.edu/group/swlh/acbar/data.

5. ANALYSIS OF LARGE-SCALE STRUCTURE DATA

     We can enhance the scientific value of the CMB data from z ∼ 1089 by combining it with measurements of the low-redshift universe. Galaxy redshift surveys allow us to measure the galaxy power spectrum at z ∼ 0, and observations of Lyα absorption of about 50 quasar spectra (Lyα forest) allow us to probe the dark matter power spectrum at redshift z ∼ 3.

     We use the Anglo-Australian Telescope Two-Degree Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2001) as compiled in 2001 February. This survey probes the universe at redshift zeff ∼ 0.1 and probes the power spectrum on scales corresponding to 0.022 < k < 0.2 (where k is in units of h Mpc⁻¹). The anticipated Sloan Digital Sky Survey (Gunn & Knapp 1993) power spectrum will be an important complement to 2dFGRS. We also use the linear matter power spectrum as recovered by Croft et al. (2002) from Lyα forest observations. This power spectrum is reconstructed at an effective redshift z ∼ 2.72 and probes scales k > 0.2 h Mpc⁻¹. Together these data sets allow us to probe not only a wide range of physical scales, from k ∼ 1 × 10⁻⁴ (30,000 Mpc h⁻¹) to k ∼ 1 (3 Mpc h⁻¹) (see Fig. 5), but also the evolution of a given scale with redshift.


Fig. 5.—   Combined CMB and LSS data set. Top: CMB angular power spectrum in μK² as a function of k, where k is related to ℓ by ℓ = η0k (where η0 ∼ 14,400 Mpc is the distance to the last scattering surface). Black points are the WMAP data, red points CBI, blue points ACBAR. Bottom: LSS data. Black points are the 2dFGRS measurements, and green points are the Lyα measurements. Both LSS power spectra are in units of (Mpc h⁻¹)³ and have been rescaled to z = 0. This plot only illustrates the scale coverage of all the data sets we consider. The various LSS power spectra as plotted here cannot be directly compared with the theory because of the effects outlined in § 5 (e.g., redshift-space distortions, nonlinearities, bias, and window function effects).

     When including LSS data sets, one should keep in mind that the underlying physics for these data sets is much more complicated and less well understood than for WMAP data, and systematic and instrumental effects are much more important. We attempt here to include all the known (up-to-date) uncertainties and systematics in our analysis. In what follows, we illustrate our modeling of the "real-world" effects of LSS surveys and how we propagate systematic and statistical uncertainties into the parameter estimation. The goal of our modeling is to relate not just the shape but also the amplitude of the observed power spectrum to that of the linear matter power spectrum as constrained by CMB data. The reason for this will be clear in § 5.1.3; by using the information in the power spectrum amplitude, we can break some of the degeneracies among cosmological parameters.

5.1. The 2dFGRS Power Spectrum

     The 2dFGRS power spectrum, as released in 2002 June, has been calculated from the 2001 February catalog, which includes 140,000 galaxies (Percival et al. 2001). The full survey comprises 220,000 galaxies but is not yet available. The sample is magnitude limited at bJ = 19.45 and thus probes the universe at zeff ∼ 0.1 and the power spectrum on scales corresponding to k > 0.015 h Mpc⁻¹. The input catalog is an extended version of the Automatic Plate Measuring (APM) galaxy catalog (Maddox et al. 1990b; Maddox, Efstathiou, & Sutherland 1990a, 1996), which includes about 5 million galaxies to bJ = 20.5. The APM catalog was used previously to recover the three-dimensional power spectrum of galaxies by inverting the clustering properties of the two-dimensional galaxy distribution (Baugh & Efstathiou 1993; Efstathiou & Moody 2001). These techniques, however, are affected by sample variance and uncertainties in the photometry; a full three-dimensional analysis is thus more reliable.

     The power spectrum of the galaxy distribution as measured by LSS surveys, such as the 2dFGRS, cannot be directly compared to that of the initial density fluctuations, as predicted by theory, or recovered from WMAP or the combination of WMAP+CBI+ACBAR data sets. This is due to a number of intervening effects that can be broadly divided into two classes: effects due to the survey geometry (i.e., window function, selection function effects) and effects intrinsic to the galaxy distribution (e.g., redshift-space distortions, bias, nonlinearities).

5.1.1. Survey Geometry

     Galaxy surveys such as the 2dFGRS are magnitude limited rather than volume limited; thus, most nearby galaxies are included in the catalog while only the brighter of the more distant galaxies are selected. The selection function accounts for the fact that fewer galaxies are included in the survey as the distance (or the redshift) increases. An additional effect arises from the fact that the clustering properties of bright galaxies might be different from the average clustering properties of the galaxy population as a whole. The selection function does not take this into account (we return to this point in § 5.1.2).

     Moreover, the completeness across the sky is not constant, and the survey can only cover a fraction of the whole sky, sometimes with a very complicated geometry described by the window function. In particular, for the data we use, unobserved fields make the survey completeness a strongly varying function of position. The measured Fourier coefficients are therefore the true coefficients of the galaxy distribution convolved by the Fourier transform of the selection function (in the direction of the line of sight) and of the window function (on the plane of the sky). In this section we follow the standard notation used in LSS analyses and refer to all of these effects as window effects.

     The window not only modifies the measured power spectrum but also introduces spurious correlations between Fourier modes (see Percival et al. 2001 for more details). For the 2dFGRS these effects have been quantified by Monte Carlo simulations of mock catalogs of the survey.19 We include them in our analysis by convolving the theory power spectrum with the window "kernel" and by including off-diagonal terms in the covariance matrix.
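
     Schematically, the comparison of a theory power spectrum with the observed band powers then takes the form below (our own sketch: the window kernel, band powers, and covariance here are random placeholders with the right shapes, not the released 2dFGRS products).

    import numpy as np

    def convolved_chi2(p_theory, window_kernel, p_obs, cov):
        """Convolve the theory power spectrum with the survey window and compare
        it with the observed band powers using the full covariance matrix."""
        p_conv = window_kernel @ p_theory            # convolution with the window "kernel"
        d = p_obs - p_conv
        return d @ np.linalg.solve(cov, d)           # chi^2 including off-diagonal terms

    # Placeholder arrays with plausible shapes (nbins band powers, nk theory bins).
    rng = np.random.default_rng(3)
    nk, nbins = 200, 32
    k = np.logspace(-2, 0, nk)
    p_theory = 3.0e4 * (k / 0.05) ** -1.2            # toy theory P(k)
    window_kernel = rng.random((nbins, nk))
    window_kernel /= window_kernel.sum(axis=1, keepdims=True)   # each row integrates to unity
    p_obs = (window_kernel @ p_theory) * (1.0 + 0.05 * rng.standard_normal(nbins))
    cov = np.diag((0.1 * p_obs) ** 2)
    print(convolved_chi2(p_theory, window_kernel, p_obs, cov))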


     19 For WMAP data, we deconvolve the raw measured Ĉℓ by the effect of the window (the mask), thus leaving the effect of the window function and the mask only in the Fisher matrix. For LSS we will convolve the theory with the window, project the power spectrum into redshift space, and compare this to the observed power spectrum.

5.1.2. Effects Intrinsic to the Galaxy Distribution

     Linear gravitational evolution modifies the amplitude but not the shape of the underlying power spectrum. However, in the nonlinear regime (where the amplitude of fluctuations is δρ/ρ ∼ 1) this is no longer the case. Nonlinear gravitational evolution changes the shape of the power spectrum and introduces correlations between Fourier modes. This effect becomes important on scales k ∼ 0.1 h Mpc⁻¹, but the exact scale at which it appears and its detailed characteristics depend on cosmological parameters. Most of the clustering signal from galaxy surveys such as 2dFGRS comes from the regime where nonlinearities are nonnegligible, because shot noise is the dominant source of error at k ≳ 0.5 h Mpc⁻¹ and the number density of modes scales as k³. These nonlinearities encode additional information about cosmology and motivate their inclusion in the present analysis. This approach is complicated by the fact that an accurate description of the fully nonlinear evolution of the galaxy power spectrum is difficult to obtain. In the literature, there are several different approaches to modeling the nonlinear evolution of the underlying dark matter power spectrum in real space: (1) linear (and extended) perturbation theory, (2) semianalytical modeling, and (3) numerical simulation. All of these approaches yield consistent results on the scales used in our analysis. We will use the semianalytical approach developed by Hamilton et al. (1991) and Peacock & Dodds (1996). In particular, we use the Ma et al. (1999) formulation of the nonlinear power spectrum. Figure 6 shows the effect of nonlinearities on the matter power spectrum on the scales of interest (compare solid and dashed lines).


Fig. 6.—   Matter power spectrum in units of (Mpc h⁻¹)³: linear in real space (solid line), nonlinear in real space (dashed line), and nonlinear in redshift space (dotted line). The error bars on the dotted line show the size of the statistical error bars per k bin of the 2dFGRS galaxy power spectrum.

     Theory predicts the statistical properties of the continuous matter distribution, while observations are concerned with the galaxy distribution, which is discrete. Moreover, galaxies might not be faithful tracers of the mass distribution (i.e., the galaxy distribution might be biased). In the analysis of galaxy surveys it is assumed that galaxies form a Poisson sampling of an underlying continuous field that is related to the matter fluctuation field via the bias. It is possible to formally relate the discrete galaxy field and its continuous counterpart. For the power spectrum, this consists of subtracting the shot-noise contribution from the measured galaxy power spectrum. The published power spectra from galaxy surveys already have this contribution subtracted but are still biased with respect to the underlying mass power spectra.

     The idea that galaxies are biased tracers of the mass distribution even on large scales was introduced by Kaiser (1984) to explain the properties of Abell clusters. Nevertheless, the fact that galaxies of different morphologies have different clustering properties (hence different power spectra) was known well before (e.g., Hubble 1936; Dressler 1980; Postman & Geller 1984). Since the clustering properties of different types of galaxies are different, they cannot all be good tracers of the underlying mass distribution.20

     In the simplest biasing model, the linear bias model, the mass and galaxy fractional overdensity fields δ and δg are related by δg(x) = bδ(x). This implies that on all scales

     Pg(k) = b²P(k).     (36)

This simple model (although justified by the Kaiser 1984 assumption that galaxies form on the highest peaks of the mass distribution) cannot be true in detail, for two reasons. The first is that, on a fundamental level, on small smoothing scales the galaxy fluctuation field could become δg < -1 (wherever b > 1 and δ approaches -1), which corresponds to an unphysical negative galaxy density. The second is that, from an observational point of view, this scheme leaves the shape of the power spectrum unchanged, while not all galaxy populations have the same observed power spectrum shape, although the differences are not large (e.g., Peacock & Dodds 1994; Norberg et al. 2001). Many different and more complicated biasing schemes have been introduced in the literature. For our purposes it is important to note that the bias of a sample of galaxies depends on the sample selection criteria and on the weighting scheme used in the analysis. Thus, different surveys will have different biases, and care must be taken when comparing different galaxy power spectra.

     There are several indications that galaxy bias is scale independent on large scales (e.g., Hoekstra et al. 2002; Verde et al. 2002). This justifies adopting equation (36). For the 2dFGRS, the bias of galaxies has been measured by Verde et al. (2002) using higher-order correlations of the galaxy fluctuation field. They assume a generalization of the simple linear biasing scheme, δg = b1δ + (b2/2)δ². They find no evidence for scale-dependent bias, at least on linear and mildly nonlinear scales (i.e., k < 0.4 h Mpc-1), and find b2 consistent with 0. This further supports the use of equation (36). In particular, they find b1 = 1.04 ± 0.11. In our analysis we assume linear biasing.

     The Verde et al. (2002) bias measurement has to be interpreted with care. It applies to 2dFGRS galaxies weighted with a modification of the Feldman, Kaiser, & Peacock (1994) weighting scheme, as described in Percival et al. (2001). Close to the observer, dim galaxies are included in the survey: the galaxy density is high, but only a small volume is covered. Far away from the observer, only very bright galaxies are included: a large volume is probed, but the galaxy density is low. As a consequence, the clustering of dim galaxies in a small volume close to the observer contains most of the signal for the power spectrum on small scales, while rare, bright galaxies in a large volume carry most of the information about the power spectrum on large scales. An "optimal" weighting scheme would thus give more weight to dim galaxies on small scales and to bright galaxies on large scales. Such a weighting scheme is, unfortunately, biased: bright galaxies are more strongly clustered (i.e., more biased) than dim ones, an effect known as "luminosity bias." The power spectrum recovered from such a weighting scheme would have optimal error bars but would exhibit a scale-dependent bias. The weighting scheme used in Percival et al. (2001) is not optimal but is virtually unaffected by luminosity bias (Percival 2003). The power spectrum so obtained is effectively that of galaxies of luminosity ∼2 L* on virtually all scales, and the effective redshift for the power spectrum is zeff = 0.17, slightly larger than the effective redshift of the survey as defined by the selection function (Percival et al. 2001; Peacock et al. 2001).

     The final complication is that galaxy catalogs use the redshift as the third spatial coordinate. In a perfectly homogeneous Friedmann universe, redshift would be an accurate distance indicator. Inhomogeneities, however, perturb the Hubble flow and introduce peculiar velocities. As Kaiser (1987) emphasized, the peculiar velocities distort the clustering pattern not only on small scales, where virialized objects produce "Fingers of God," but also on large scales, where coherent flows produce large-scale distortions.

     On large (linear) scales the redshift-space effect on an individual Fourier component of the density fluctuation field δ can be modeled by

     δ^s(k) = δ(k)(1 + βμ²),     (37)

where the superscript s refers to the quantity in redshift space and μ is the cosine of the angle between the k-vector and the line of sight. The Kaiser factor, β, is the linear redshift-space distortion parameter. One defines β = f/b, where f = d ln δ/d ln a, with δ = δρ/ρ and a = (1 + z)-1; b is the linear bias parameter. The expression for f(z) is a known function of Ωm, Λ, and z (Lahav et al. 1991),

     f(z) ≃ [Ωm(1 + z)/X]^0.6 + (Λa²/X)(1/70)[1 + Ωm(1 + z)/(2X)],     (38)

where X = 1 + Ωmz + Λ(a² - 1); β can then be approximated21 by β ≃ Ωm^0.6/b. The analysis of the 2dFGRS (Peacock et al. 2001; Percival et al. 2001) constrains f at the effective redshift of the survey, which depends on the galaxy weighting scheme adopted to compute the power spectrum (zeff ∼ 0.17 for the above work). This peculiar velocity infall causes overdensities to appear squashed along the line of sight. The net effect on the angle-averaged power spectrum in the small-angle approximation is

     P^s(k) = (1 + 2β/3 + β²/5)P(k).     (39)

Thus, on large scales the redshift-space distortions boost the power spectrum if β > 0.
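
     As a simple illustrative check of these relations (not part of the analysis itself, which uses the exact expression for f; see footnote 21), the Lahav et al. (1991) approximation for the growth rate can be evaluated directly. The numerical values below are purely illustrative.

def f_growth_lahav(z, omega_m, omega_lambda):
    # Approximate linear growth rate f = dln(delta)/dln(a) at redshift z,
    # using the Lahav et al. (1991) approximation
    #   f ~ Omega_m(z)^0.6 + (Lambda(z)/70) * (1 + Omega_m(z)/2).
    a = 1.0 / (1.0 + z)
    X = 1.0 + omega_m * z + omega_lambda * (a ** 2 - 1.0)   # = a^2 H^2(z)/H0^2
    om_z = omega_m * (1.0 + z) / X                           # Omega_m(z)
    ol_z = omega_lambda * a ** 2 / X                         # Lambda(z)
    return om_z ** 0.6 + (ol_z / 70.0) * (1.0 + om_z / 2.0)

# Illustrative values only: for Omega_m = 0.3, Lambda = 0.7 at z_eff = 0.17,
# f is about 0.59, so a bias b of order unity gives beta = f/b of about 0.6,
# and the large-scale boost of eq. (39), 1 + 2*beta/3 + beta**2/5, is about 1.47.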

     On smaller scales, virialized motions produce a radial smearing and the associated "Fingers of God" effect contaminates the wavelengths we are interested in. This is difficult to treat exactly, but as it is a smearing effect, it produces a mild damping of the power, acting in the opposite direction to the large-scale boosting by the Kaiser effect (see, e.g., Matsubara 1994). On these scales, the redshift-space correlation function is well modeled as a convolution of the real-space isotropic correlation function with some distribution function for the line-of-sight velocities (e.g., Davis & Peebles 1983; Cole, Fisher, & Weinberg 1994; Fisher 1995). Since the convolution in real space is equivalent to multiplication in Fourier space, the redshift-space power spectrum on small scales is multiplied by the square of the Fourier transform of the velocity distribution function (e.g., Peacock & Dodds 1994),

     P^s(k, μ) = P(k)(1 + βμ²)²F(k, μ, σp),     (40)

where σp denotes the line-of-sight pairwise velocity dispersion. If the pairwise velocity distribution is taken to be an exponential (e.g., Ballinger, Heavens, & Taylor 1995; Ballinger, Peacock, & Heavens 1996; Hatton & Cole 1998), which seems to be supported by simulations (e.g., Zurek et al. 1994) and observations (e.g., Marzke et al. 1995), then the damping factor is a Lorentzian (see also Kang et al. 2002),

     F(k, μ, σp) = [1 + k²μ²σp²/2]^-1.     (41)

We adopt this functional form as it is used by Peacock et al. (2001) in determining the redshift-space distortion parameters β and σp from the 2dFGRS. The overall effect for the power spectrum in a thin shell in k-space is given by

     P^s(k) = P(k) ∫₀¹ dμ (1 + βμ²)²F(k, μ, σp),     (42)

obtained by averaging over μ in equation (40) with the damping factor given by equation (41). Figure 6 shows the effect of redshift-space distortions (eq. [42]) on the scales of interest.
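
     The following minimal Python sketch evaluates the μ average of equation (42) numerically, assuming the Kaiser factor of equation (40) and the Lorentzian damping of equation (41); P_real is any real-space power spectrum in (Mpc h-1)3 evaluated at the wavenumbers k (in h Mpc-1), and σp must be converted from km s-1 to h-1 Mpc. This is an illustration of the model described above, not the code used in the analysis.

import numpy as np

def redshift_space_pk(k, P_real, beta, sigma_p, n_mu=201):
    # Angle-averaged redshift-space power spectrum in the small-angle approximation:
    #   P^s(k) = P_real(k) * Integral_0^1 dmu (1 + beta mu^2)^2 / (1 + k^2 mu^2 sigma_p^2 / 2).
    # sigma_p is in the same length units as 1/k (h^-1 Mpc); e.g. 385 km/s
    # corresponds to roughly 3.85 h^-1 Mpc at low redshift for H0 = 100 h km/s/Mpc.
    k = np.atleast_1d(np.asarray(k, dtype=float))
    mu = np.linspace(0.0, 1.0, n_mu)
    kaiser = (1.0 + beta * mu[None, :] ** 2) ** 2
    damping = 1.0 / (1.0 + 0.5 * (k[:, None] * mu[None, :] * sigma_p) ** 2)
    boost = np.trapz(kaiser * damping, mu, axis=1)
    return np.asarray(P_real) * boost

# Consistency check: as sigma_p -> 0 the boost tends to the large-scale Kaiser
# factor 1 + 2*beta/3 + beta**2/5 (about 1.32 for beta = 0.43).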

     This model is simplistic for several reasons. The most important is that, because of the complicated geometry of the survey, the simple angle average operation performed to obtain equation (42) might not be strictly correct. In addition, equation (42) is obtained in the plane-parallel (also known as small-angle) approximation (i.e., as if the lines of sight to different galaxies on the sky were parallel).

     We have performed extensive testing of equation (42) using mock 2dFGRS catalogs obtained from the Hubble volume simulation. We find that the simulations' redshift-space power spectrum is consistent, given the errors, with equation (42), where P(k) is the simulations' real-space power spectrum, up to k < 0.4 h Mpc-1, even for the complicated geometry of the 2dFGRS. This means that up to k ∼ 0.4 the systematic errors introduced by equation (42) are smaller than the statistical errors; in the analysis we use only k ≲ 0.15. However, the value of β in equation (42) needs to be calibrated on Monte Carlo realizations of the survey; we find βeff = 0.85β. We have verified that our results for the cosmological parameters are insensitive to the exact choice of this correction factor. Peacock et al. (2001) measured the parameters β and σp and their joint probability distribution from the survey, obtaining β = 0.43 and σp = 385 km s-1. This measurement was obtained by using the full angular dependence of the power spectrum and therefore recovers β directly, not βeff. Hawkins et al. (2002) measured these parameters from a larger sample than that of Peacock et al. (2001), obtaining a slightly different result, mostly because of a shift in the recovered value of σp. Since most of the galaxies in the Hawkins et al. (2002) sample are also in the Peacock et al. (2001) sample, we conservatively extend our error bars on β and σp by 10% and 30%, respectively, to include the new values within the 1 σ marginalized confidence contour and to allow for a possible error in the determination of βeff. Figure 6 illustrates the importance of including all the above effects in our analysis.

     In our analysis we consider data in the k range 0.02 h Mpc-1 < k < 0.2 h Mpc-1. On large scales the limit is set by the accuracy of the window function model; on small scales the limit is set by where the covariance matrix has been extensively tested. In this regime we also have a weak dependence on the velocity dispersion parameter σp, the parameter with the largest systematic uncertainty.


     20 Galaxies are likely to form in the highest density regions of the matter fluctuation field; thus, they are highly biased at the epoch of formation, z ≫ 0 (e.g., Lyman break galaxies), but gravitational evolution should then make the galaxy distribution less and less biased as time goes on (e.g., Fry 1996).
     21 In our analysis we use the exact expression for β as in eq. (38).

5.1.3. Motivation for this Modeling

     The motivation behind the complicated modeling of §§ 5.1.1 and 5.1.2 is to be able to infer the amplitude of the matter power spectrum from the observed galaxy clustering properties.

     Figures 7 and 8 illustrate how the modeling of §§ 5.1.1 and 5.1.2 helps in breaking degeneracies among cosmological parameters. For illustration, we consider two cases below: the degeneracy of the dark energy equation of state, w (Huey et al. 1999), with Ωb and ns and the ων - h degeneracy, where ων = Ωνh2.


Fig. 7.—   Two cosmological models: ωb = 0.0235, ωm = 0.143, ns = 0.978, τ = 0.11, w = -0.426, h = 0.53 (solid line) and ωb = 0.0254, ωm = 0.137, ns = 1.024, τ = 0.08, w = -1, h = 0.77 (dotted line). The two models are indistinguishable within current error bars from the CMB angular power spectrum (left-hand panel; units for the power spectrum are μK2). However, they can easily be distinguished if we can relate the observed power spectrum to the underlying matter power spectrum [right-hand panel; units for the power spectrum are (Mpc h-1)3]. The error bars on the solid line are examples of the size of the 2dFGRS and Lyα power spectra statistical error bars for one data point at different scales. There are four error bars for 2dFGRS and four for Lyα .


Fig. 8.—   Two cosmological models: Ωm = 0.26, ωb = 0.02319, τ = 0.12, ns = 0.953, ων = 0, h = 0.714 (solid line) and Ωm = 0.26, ωb = 0.02319, τ = 0.12, ns = 0.953, ων = 0.02, h = 0.6 (dashed line). As before the two models are virtually indistinguishable from the CMB angular power spectrum (left-hand panel; units for the power spectrum are μK2), but they can easily be distinguished if the matter power spectrum amplitude is known [right-hand panel; units for the power spectrum are (Mpc h-1)3]. The error bars on the solid line are examples of the size of the 2dFGRS and Lyα power spectra statistical error bars for one data point. There are four error bars for 2dFGRS and four for Lyα .

     Figure 7 shows two models that are virtually indistinguishable with CMB data, but which predict different amplitudes for the matter power spectra at z ∼ 0. This is because the linear growth factor and the shape parameter Γ are different for the two cases. The two models differ in the values of ωb, ns, and w. The solid line is a model with w = -0.4, while the dotted line is a model with w = -1.

     In Figure 8 we show two sets of cosmological parameters that differ only in the values of the neutrino mass and the Hubble constant. These two models are virtually indistinguishable with CMB observations. However, the matter power spectrum in the two cases differs in both shape and amplitude. Since redshift-space distortions and the window function affect the power spectrum shape, extra information about cosmological parameters is encoded in its amplitude. By using this information, Spergel et al. (2003) obtain a cosmological upper bound on the neutrino mass that is ∼4 times better than current cosmological constraints (Elgarøy et al. 2002).

     For completeness, we have shown the power spectrum also for scales probed by the Lyα forest (see § 6). The error bars in Figures 7 and 8 are examples of the size of the 2dFGRS and Lyα power spectra statistical uncertainties in one data point, showing that the two models can be distinguished if the observed power spectrum can be related to the linear matter power spectrum without introducing large additional uncertainties.

5.1.4. Practical Approach

     The procedure we adopt to compare the observed galaxy power spectrum with the theoretical predictions is outlined below (the published 2dFGRS galaxy power spectrum has already been corrected for shot noise). For a given set of cosmological parameters and a pairwise velocity dispersion parameter we proceed as follows:

  1. The MCMC selects a set of cosmological parameters and values for β and σp. CMBFAST computes the theoretical linear matter power spectrum at z = 0.

  2. We evolve the theoretical linear matter power spectrum to obtain the nonlinear matter power spectrum at the effective redshift of the survey, following the prescription of Ma et al. (1999).

  3. We then obtain the redshift-space power spectrum for the mass by using equation (42) with βeff calibrated from Monte Carlo realizations of the catalog.

  4. The bias is computed from β and Ωm using equation (38). The galaxy power spectrum is obtained by correcting for bias, equation (36).

  5. The resulting power spectrum is convolved with the galaxy window function. We use the routine provided on the 2dFGRS Web site to perform this numerically. This is the power spectrum that can be compared with the quantity measured from a galaxy survey.

  6. We can now evaluate the likelihood using the full covariance matrix as provided by the 2dFGRS team. We approximate the likelihood as Gaussian, as was done by the 2dFGRS team. In principle, this is not strictly correct, since in the linear regime the power spectrum estimates follow an exponential distribution and in the nonlinear regime the distribution has contributions from higher-order correlations. However, because of the size of the survey we are considering, the central limit theorem ensures that the likelihood is well described by the Gaussian approximation (e.g., Scoccimarro, Zaldarriaga, & Hui 1999). Moreover, the covariance matrix for the 2dFGRS power spectrum has been computed by the 2dFGRS team under the assumption that the likelihood is Gaussian.

     We assume that the likelihood for the bias parameter is Gaussian, centered on b = 1.04 with dispersion σb = 0.11. This is a conservative overestimate of the error on the bias parameter, as noted in Verde et al. (2002). The determination of b is correlated with β and σp, and the error quoted has already been marginalized over the uncertainties in these two parameters. In practice, for each step in the Markov chain we compute the likelihood according to items 1–6 above. The bias parameter is determined once β, σp, and the other cosmological parameters are chosen. We then multiply the likelihood by the joint likelihood for β and σp, as in Figure 4 of Peacock et al. (2001), and by the likelihood for the bias parameter. In effect, we use the determination of β, σp, and b as priors. By multiplying the likelihood, we assume that the measurements of the redshift-space distortion parameters, bias, and the 2dFGRS power spectrum are independent. We justify this assumption below.
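
     A schematic Python sketch of items 1-6 above and of the prior multiplication follows. All of the helper functions (nonlinear_pk, redshift_space_pk, f_growth, convolve_with_window) are placeholders for the steps described in the text (the Ma et al. 1999 mapping, eq. [42] with the calibrated βeff, eq. [38], and the 2dFGRS window routine); they are not actual library calls, and the sketch is illustrative rather than a reproduction of the analysis code.

def lnlike_2dfgrs(k, P_lin, beta, sigma_p, omega_m, z_eff,
                  P_obs, invcov,
                  nonlinear_pk, redshift_space_pk, f_growth, convolve_with_window):
    # Items 1-6 of the text, for one point in the Markov chain.
    # P_lin  : linear matter power spectrum at z = 0 from CMBFAST (item 1).
    # P_obs, invcov and the power spectra are assumed to be NumPy arrays.
    P_nl = nonlinear_pk(k, P_lin, z_eff)                 # item 2: Ma et al. (1999) mapping
    beta_eff = 0.85 * beta                               # calibration factor from mock catalogs
    P_s = redshift_space_pk(k, P_nl, beta_eff, sigma_p)  # item 3: eq. (42)
    b = f_growth(omega_m, z_eff) / beta                  # item 4: b from beta and Omega_m,
    P_gal = b ** 2 * P_s                                 #          then eq. (36)
    P_model = convolve_with_window(k, P_gal)             # item 5: 2dFGRS window convolution
    resid = P_obs - P_model
    lnL = -0.5 * float(resid @ invcov @ resid)           # item 6: Gaussian likelihood
    # Priors: the joint (beta, sigma_p) likelihood of Peacock et al. (2001) is not
    # reproduced here; the Gaussian prior on the bias, b = 1.04 +/- 0.11, is.
    lnL += -0.5 * ((b - 1.04) / 0.11) ** 2
    return lnL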

     The parameters needed to map the real-space nonlinear matter power spectrum onto the redshift-space galaxy spectrum are β, σp, and b. These three parameters are not independent: not only is β ∝ 1/b, but, more importantly, the three parameters are measured from the same catalog that we are using to constrain other cosmological parameters. However, the information we use to constrain cosmological parameters is all encoded in the shape and amplitude of the angle-averaged power spectrum. The information used to measure β and σp is all encoded in the dependence of the Fourier coefficients (i.e., of the power spectrum) on the angle from the line of sight. Thus, we can treat the determinations of β and σp as independent from the likelihood for cosmological parameters. The analysis of Verde et al. (2002) to measure the bias parameter from the 2dFGRS uses both information about the amplitude of the Fourier coefficients and their angular dependence. This dependence, however, is not that introduced by redshift-space distortions but is the configuration dependence of the bispectrum. Thus, in principle, we should not treat this measurement as completely independent. However, most of the signal for the bias measurement comes from the k range of 0.2 h Mpc-1 < k < 0.4 h Mpc-1, while the signal for the present analysis comes from k < 0.2 h Mpc-1. Note that the configuration dependence of the bispectrum is largely independent of cosmology. This allows us to easily include a prior for the bias parameter in the analysis.

6. Lyα FOREST DATA

     The Lyα forest traces the fluctuations in the neutral gas density along the line of sight to distant quasars. Since most of this absorption is produced by low-density unshocked gas in the voids or in mildly overdense regions that are thought to be in ionization equilibrium, this gas is assumed to be an accurate tracer of the large-scale distribution of dark matter. In this epoch and on these scales the clustering of dark matter is still in the linear regime.

     Since the Lyα forest observations are probing the distribution of matter at z ∼ 3, they are an important complement to the CMB data and the galaxy survey data. Because of their importance, there has been extensive numerical and observational work testing the notion that they trace the LSS. In our analyses, we find that the addition of Lyα forest data appears to confirm trends seen in other data sets and tightens cosmological constraints. However, more observational and theoretical work is still needed to confirm the validity of the emerging consensus that the Lyα forest data trace the LSS.

     Recent papers use two different approaches for analysis of the Lyα forest power spectrum data. McDonald et al. (2000) and Zaldarriaga, Hui, & Tegmark (2001) directly compare the observed transmission spectra to the predictions from cosmological models. We follow the approach of Croft et al. (2002) and Gnedin & Hamilton (2002, hereafter GH), who use an analytical fitting function to recover the matter power spectrum from the transmission spectrum.22

     GH factorize the linear power spectrum into four terms,

     PL(k) = Pfact(k) QΩ QT Qτ,     (43)

where Pfact(k) is a quantity that is independent of modeling and is almost directly measured. The other parameters convert this quantity into the linear matter power spectrum and encode the dependence on cosmology and the modeling of the intergalactic medium (IGM). In our treatment, we use the values of Pobs(k) (the estimator from Lyα forest observations of Pfact) from GH and their parameterization in terms of equation (43) because it allows us to explicitly include the dependence of the recovered linear matter power spectrum on the cosmological parameters. QΩ encodes the dependence of the recovered linear power spectrum on the matter density parameter at z = 2.72. For QΩ we use the GH Ansatz of



QT = 20,000 K/T0 (T0 ∼ 20,000 K) parameterizes the dependence on the mean temperature of the IGM, and Qτ ∼ 1.11 parameterizes the dependence on the assumed mean optical depth. In addition to the statistical errors, GH quote a systematic uncertainty that we add to the statistical one. Finally, the uncertainties in QΩ, QT, and Qτ contribute to the overall normalization uncertainty. We use the Croft et al. (2002) prescription to parameterize this uncertainty as ln 𝒫(A) = -(A - 1)²/(2σ²Lyα), where σLyα = 0.25 if A ≤ 1 and σLyα = 0.29 if A > 1.

     N-body simulations are used to convert the flux power spectrum to the dark matter power spectrum and calibrate the form of equation (43). The two different groups, GH and Croft et al. (2002), have done this independently. The resulting power spectra agree well within the 1 σ errors for all data points except the last three. We thus increase the 1 σ uncertainties on the last three data points to make the two determinations of PL(k) consistent and use this as a rough measure of the intrinsic systematic uncertainties in the Lyα data.

     GH point out that the correlation in flux measured from the Lyα forest samples power over a finite band of wavenumbers. The effective band-power windows are rather broad as a result of the peculiar velocities that smear power on scales comparable to the one-dimensional velocity dispersion. Thus, the recovered linear power spectrum is effectively smoothed with a window that becomes broader at smaller scales. In principle, the resulting covariance between estimates of power at different k needs to be taken into account to do a full likelihood analysis to extract cosmological parameters. However, the full covariance matrix is not available. Since the Lyα data are such a powerful tool, we just perform a simple χ2 fit and caution the reader that interpreting the reduced χ2 as a measure of goodness of fit for this data set is not meaningful since the data are strongly correlated.

     To marginalize over the overall normalization uncertainty, we take advantage of the MCMC approach. In principle, we could marginalize over it analytically, as we do for the calibration uncertainty. Instead, at each step of the chain we compute the best-fit amplitude as done for point sources (Hinshaw et al. 2003b),

     Â = [Σi Pobs(ki)PL(ki)/σi²]/[Σi PL(ki)²/σi²].     (45)

The likelihood for the Lyα data for the model is given by ln ℒLyα = ln ℒ(Pobs|Â, PL) + ln 𝒫(Â). The marginalization is then automatically obtained from the MCMC output. In other words, the analytic marginalization computes ∫ 𝒫(data|model, A)𝒫(A) dA, while we compute an estimator of this given by 𝒫(data|model, Â)𝒫(Â).
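
     A minimal sketch of this amplitude step, assuming uncorrelated points (the simple χ² treatment described above) and the standard weighted least-squares estimator for the best-fit amplitude, together with the Croft et al. (2002) normalization prior quoted earlier:

import numpy as np

def lnlike_lya(P_obs, sigma, P_L):
    # P_obs, sigma : recovered linear power and its (statistical + systematic) errors
    # P_L          : model linear matter power spectrum at the same k, at z ~ 3
    P_obs, sigma, P_L = (np.asarray(x, dtype=float) for x in (P_obs, sigma, P_L))
    # Best-fit overall amplitude: minimizes sum[(P_obs - A*P_L)^2 / sigma^2].
    A_hat = np.sum(P_obs * P_L / sigma ** 2) / np.sum(P_L ** 2 / sigma ** 2)
    chi2 = np.sum((P_obs - A_hat * P_L) ** 2 / sigma ** 2)
    # Asymmetric Gaussian prior on the normalization (sigma_Lya = 0.25 below A = 1,
    # 0.29 above), following the Croft et al. (2002) prescription quoted in the text.
    s = 0.25 if A_hat <= 1.0 else 0.29
    return -0.5 * chi2 - 0.5 * ((A_hat - 1.0) / s) ** 2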


     22 After the present paper was submitted, a preprint appeared (Seljak et al. 2003) claiming that the treatment of GH and Croft et al. (2002) significantly underestimates the errors. Given the importance of this data set to tighten cosmological constraints, the Lyα forest community should reach a consensus on the interpretation of these observations and on the level of systematic contamination.

7. CONCLUSIONS

     In this paper we have presented the basic formalism that we use for our likelihood analysis. This paper describes the final step on the path from time-ordered data to cosmological parameters: it provides the framework for determining the cosmological parameters and for exploring their implications for cosmology.

     The unprecedented quality of the WMAP data and the tight constraints on cosmological parameters that are derived require a rigorous analysis so that the approximations made in the modeling do not propagate into significant biases and systematic errors. We have derived an approximation to the exact likelihood function for the 𝒞ℓ that is accurate to better than 0.1%, and we have carefully calibrated the temperature power spectrum covariance matrix with Monte Carlo simulations. This enables us to use the effective χ2 per degree of freedom as a tool to test whether or not a model is an acceptable fit to the data.

     We implement our likelihood analysis with MCMC methods. We have concentrated on the issues of convergence and mixing, emphasizing how important they are for recovering cosmological parameter values and their confidence levels from the MCMC output.

     To the WMAP data sets (TT and TE angular power spectra) we have added the CBI and ACBAR measurements of the CMB on smaller angular scales, the 2dFGRS galaxy power spectrum at z ∼ 0, and the Lyα forest matter power spectrum at z ∼ 3. These external data sets significantly enhance the scientific value of the WMAP measurement by allowing us to break parameter degeneracies. While the underlying physics for these data sets is much more complicated and less well understood than for the WMAP data, and systematic and instrumental effects are much more important, we feel we have made a significant step toward improving the rigor of the analysis of these data sets. We have included a detailed modeling of galaxy bias, redshift distortions, and the nonlinear growth of structure, and we also include the systematic and statistical uncertainties intrinsic to these other data sets, as known to date.

     We thank Bill Holzapfel for invaluable discussions about the ACBAR data. We thank the 2dFGRS team for giving us access to the Monte Carlo realizations of the 2dFGRS. The mock catalogs of the 2dFGRS were constructed at the Institute for Computational Cosmology at Durham. We thank Will Percival for discussions and for providing us with the covariance matrix of the 2dFGRS power spectrum. L. V. is supported by NASA through Chandra Fellowship PF2-30022 issued by the Chandra X-Ray Observatory Center, which is operated by the Smithsonian Astrophysical Observatory for and on behalf of NASA under contract NAS 8-39073. L. V. also acknowledges Rutgers University for support during the initial stages of this work. H. V. P. is supported by a Dodds Fellowship granted by Princeton University. The WMAP mission is made possible by the support of the Office of Space Sciences at NASA Headquarters and by the hard and capable work of scores of scientists, engineers, technicians, machinists, data analysts, budget analysts, managers, administrative staff, and reviewers.

REFERENCES