<center><h2>
Class 4
</h2></center>

<center><img SRC="../../../colline.gif" ></center>

<ul>

<li>The homework was very easy. All we need to do to check that the MINUIT
parameter error estimates are about right is:

<pre>
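x=[1:10]'
p=[0.5,1]              % true parameters: slope 0.5, offset 1
y=polyval(p,x)
e(1:5,1)=0.1;          % small errors on the first five points
e(6:10,1)=1;           % large errors on the last five

% fit 100 sim realizations of the data
for i=1:100
  yd=y+e.*randn(size(y));
  [pf(i,:),pfe(i,:)]=matmin('chisq',p,[],'polyval',yd,e,x);
end

% histogram the fitted values of each parameter
subplot(2,1,1); hfill(pf(:,1));
subplot(2,1,2); hfill(pf(:,2));

% compare the spread of the fitted values with the MINUIT error estimate
[std(pf(:,1)) pfe(1,1)]
[std(pf(:,2)) pfe(1,2)]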

</pre>

<p>Note that pfe(:,1) is a column of equal numbers: the parameter errors
depend only on the data errors and hence are the same for each sim realization.
Also note that the fit values scatter around the true values and that
the spread (std) of the scatter matches the parameter error estimate.
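
<p>The claim that the parameter errors depend only on the data errors can
be checked without MINUIT, because a straight line is linear in its
parameters. The sketch below is an illustration only and is not part of
matmin: it assumes the standard weighted least squares covariance formula,
and the names A, W, C and pe_analytic are introduced just for this example.
If the chisq fit is behaving, pe_analytic should come out close to pfe(1,:).

<pre>
x=(1:10)';
e=[0.1*ones(5,1); ones(5,1)];   % same data errors as above

% design matrix for y = p(1)*x + p(2), same parameter order as polyval
A=[x, ones(size(x))];
W=diag(1./e.^2);                % inverse variance weights

% parameter covariance matrix; note that no data values y enter here
C=inv(A'*W*A);
pe_analytic=sqrt(diag(C))'
</pre>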

<li>What if the problem is more general still? Suppose that for any given
set of trial parameters we can calculate the likelihood of
a given data point.
We then want to seek the set of model parameters which
maximizes the joint likelihood of the observations.
MINUIT is an all purpose minimizer: if we make the figure of merit
function the negative log likelihood it will find these most likely
parameters for us.
It has to be the log likelihood because the joint likelihood is a product
of many small numbers, so it is always very small and numerical precision
problems would arise if we used the straight likelihood.
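
<p>To see the precision problem concretely, here is a small sketch with
made-up numbers: 1000 data points which each have a likelihood of 0.01.
The values are purely illustrative and do not come from any fit.

<pre>
p=0.01*ones(1,1000);   % 1000 made-up per-point likelihoods

Ljoint=prod(p)         % 1e-2000 underflows to exactly 0 in double precision
logL=sum(log(p))       % the sum of logs is fine, about -4605
</pre>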

<p>The example we will use is counts in a histogram.
If we have n counts in a bin people often take sqrt(n) as the error on
the bin and do the fit chisq style, passing x,n,sqrt(n) as x,y,e.
But what is the error on a count of 1? 1+/-1?
No: if we see even a single event, the probability that the underlying
expectation value is zero is clearly zero!

<p>What is the probability to get 1 event if the model says the
expectation value in that bin is 0.6 events?

<pre>
poisspdf(1,0.6)
</pre>
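
<p>poisspdf lives in the Statistics toolbox. If you do not have it, the
Poisson probability can be written out directly, as in this sketch for the
expectation value of 0.6 events quoted above (the variable names are just
for illustration):

<pre>
mu=0.6;                 % model expectation value for the bin
k=0:3;                  % possible observed counts
P=mu.^k.*exp(-mu)./factorial(k)
</pre>

<p>Here P(2) is the probability to get exactly 1 event, and P(1) is the
probability to see nothing at all, which is still more than half.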

<p>Matlab has an excellent set of probability functions.
Look at the 
<a href="http://www.mathworks.com/access/helpdesk/help/toolbox/stats/">
Statistics toolbox help</a> under "By Category".

<li>First let's generate a Gaussian distribution and fit it chisq style:

<pre>
[x,n]=hfill(randn(1,1000),100,-3,3);
hplot(x,n);
[p,pe]=matmin('chisq',[30,0,1],[],'gauss',n,sqrt(n),x);
hold on
plot(x,gauss(p,x),'r');
hold off
</pre>

<li>To fit the histogram maximum likelihood style we need a function which
returns -2 times the log of the joint probability of the bin contents.
i.e. we need L=logl(pars,func,n,x...).
I'll give you the skeleton of this function:

<pre>
function L = logl(pars,func,n,varargin)
% L = logl(pars,func,n,x...)
%
% Calculate -2 times the log of the joint probability of a 
% set of observations n at values x if they are Poisson deviates from
% mean values described by function func with parameters
% pars.
%
% eg: [x,n]=hfill(randn(1,1000),100,-3,3);
%     L=logl([30,0,1],'gauss',n,x)
%     [p,pe]=matmin('logl',[30,0,1],[],'gauss',n,x)

% get model value for each n
mu=feval(func,pars,varargin{:});

% use poisspdf to get probability of each n given model

% combine these to get L

return
</pre>
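
<p>For reference, here is one possible way to fill in the skeleton. Treat
it as a sketch rather than the official answer: instead of calling
poisspdf and taking the log, it writes out the log of the Poisson
probability directly using gammaln, which gives the same number but does
not need the Statistics toolbox and does not underflow for large counts.
It assumes the model value mu is greater than zero in every bin.

<pre>
function L = logl(pars,func,n,varargin)
% L = logl(pars,func,n,x...)
% One possible completion of the skeleton (illustrative sketch).

% get model value for each n
mu=feval(func,pars,varargin{:});

% log of the Poisson probability of each n given mu:
% log(P) = n*log(mu) - mu - log(n!), with log(n!) = gammaln(n+1)
logp=n.*log(mu)-mu-gammaln(n+1);

% the joint probability is the product, so the logs add
L=-2*sum(logp(:));

return
</pre>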

<p>Now we fit using logl:

<pre>
[p,pe]=matmin('logl',[30,0,1],[],'gauss',n,x)
hold on
plot(x,gauss(p,x),'g');
hold off
</pre>

<p>In this case the results are similar, but it can make a big difference
when there are many bins with zero contents: chisq has to be
made to ignore them or a divide by zero occurs.
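
<p>A tiny sketch of the divide by zero problem, using made-up bin contents
and model values (nothing here comes from the fit above):

<pre>
n=[0 3 5 0 1];            % made-up bin contents, two of them empty
mu=[0.2 2.5 4.0 0.3 1.1]; % made-up model values
e=sqrt(n);                % chisq style errors, zero for the empty bins

chi2_all=sum(((n-mu)./e).^2)   % Inf because of the division by zero
keep=n>0;                      % mask which drops the empty bins
chi2_kept=sum(((n(keep)-mu(keep))./e(keep)).^2)
</pre>

<p>Dropping the empty bins makes chisq finite, but it also throws away the
information that nothing was seen there. The log likelihood uses those
bins without any special treatment.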

<li>For a concrete example let's consider the power law flux spectrum
of radio point sources.
Go to the <a href="http://lambda.gsfc.nasa.gov/product/map/current/">
WMAP site</a> and get their point source catalog - file is called
wmap_ptsrc_catalog_p2_3yr_v2.txt.
We can read in this data table using Matlab's textread function.
To read the first two columns as cell arrays of strings:

<pre>
[ra dec]=textread('wmap_ptsrc_catalog_p2_3yr_v2.txt','%s %s %*[^\n]','commentstyle','shell')
</pre>

<p>textread reads the entire file in one shot.
It is very powerful but rapidly becomes convoluted.
Check the Matlab help and write a function wmap_read which returns a
structure s with fields l and b (the galactic longitude and latitude,
glon and glat) and s (an n x 5 array of fluxes):

<pre>
s=wmap_read('wmap_ptsrc_catalog_p2_3yr_v2.txt')
plot(s.l,s.b,'.'); xlim([0,360]); ylim([-90,90]);
</pre>

<li>That's not very pretty - we want to plot a spherical projection.
Matlab knows every projection under the Sun - check the
<a href="http://www.mathworks.com/access/helpdesk/help/toolbox/map/">
Mapping toolbox help</a>.

<pre>
clf
axesm('MapProjection','Hammer','Grid','on');
plotm(s.b,s.l,'.')
</pre>

<p>Now let's plot the histogram of source fluxes.

<pre>
hfill(s.s(:,1),100);
</pre>

<p>The form of this becomes clearer if we plot in log bins
with a log y axis:

<pre>
s.s(s.s==0)=NaN;
[x,nl]=hfill(log10(s.s(:,1)),30);
hplot(x,nl);
set(gca,'YScale','log'); ylim([0.3,100]);
</pre>

<p>Now let's overplot the highest frequency:

<pre>
[x,nh]=hfill(log10(s.s(:,end)),30,x(1),x(end));
hold on; hplot(x,nh,'r'); hold off;
ylim([0.3,100]);
</pre>

<p>At low flux we get no hits because the experiment
is not sensitive enough.
If we try to fit down there without telling the fitter
about the selection efficiency we will get a silly
answer.
We need to limit the fit to the region above the flux at which the
efficiency appears to reach 100%.

<pre>
[x,nl]=hfill(log10(s.s(:,1)),30,0);
hplot(x,nl);
set(gca,'YScale','log'); ylim([0.3,100]);
</pre>

<li>Let's fit this distribution to a power law
dN/dlog(S) = p(1)*S^p(2).
Make a function y = powlaw(par,x) which implements
this, plug it into matmin using logl and see what you get...

<pre>
</pre>
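
<p>As a starting point, powlaw might look something like the sketch below.
It assumes that x is the log10 of the flux, as it is for the bin centers
of the log binned histogram above, so the flux has to be recovered as
10.^x before the power law is applied.

<pre>
function y = powlaw(par,x)
% y = powlaw(par,x)
%
% Power law in flux S evaluated at bin centers x.
% ASSUMPTION: x = log10(S), as in the log binned histogram above.
% par(1) is the normalization, par(2) the power law index.

S=10.^x;
y=par(1)*S.^par(2);

return
</pre>

<p>It can then be passed to matmin by name in the same way as 'gauss' was
earlier, with starting values of your own choosing.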

<li>Homework is to refine the fit making the following improvements:

<ul>
<li>Limit the galactic latitude to b>20 where coverage
appears to be uniform.
<li>Repeat the fit to the log binned histogram.
<li>Scale the y axis to dN/dS in units
of sources per steradian per Jansky.
(Remember: we must do the fit in unscaled units and then
scale the results as well for overplotting.)
<li>Calculate the upper and lower limits of the 68% confidence
range on the n in each bin using the formula in
Gehrels N., 1986, ApJ vol 303 page 336 and use these
to plot proper error bars on your final plot.
You only need the equations in section 2 b and the
Matlab function chi2inv.
You can find the paper easily on
<a href="http://adswww.harvard.edu/">ADS</a>.
Again you will need to calculate the upper/lower limits
in the base histogram units and scale to final plot.
</ul>
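
<p>For the last item, the sketch below shows the general shape of the
calculation. It assumes the usual chi-square form of the exact Poisson
limits, and it assumes that the 68% range is built from two one-sided
limits at the 1 sigma Gaussian level of 0.8413, so check both against the
paper before relying on it. The names CL, nlo and nup are introduced here
for illustration, and chi2inv needs the Statistics toolbox.

<pre>
n=[0 1 2 5 10];        % example bin contents
CL=0.8413;             % one-sided level of a 1 sigma Gaussian bound

% upper limit: half the chi-square quantile with 2(n+1) degrees of freedom
nup=0.5*chi2inv(CL,2*(n+1));

% lower limit: half the chi-square quantile with 2n degrees of freedom,
% taken to be zero for an empty bin
nlo=zeros(size(n));
pos=n>0;
nlo(pos)=0.5*chi2inv(1-CL,2*n(pos));

[n; nlo; nup]'
</pre>

<p>Note that the limits are not symmetric about n, and that an empty bin
still gets a non-zero upper limit.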

</ul>

<hr>