I have been reading a paper: Wang X, Golbandi N, Bendersky M, Metzler D, Najork M. Position bias estimation for unbiased learning to rank in personal search.
I am wondering how they derive the M-step in an EM algorithm that has been applied to the problem in their paper.
This is the data log-likelihood function:
where observed data are (c:click, q:query, d:document, k:result position) rows, and is the probability a search result at rank k is examined by the user (assumed to be independent of query-document pair), and is the probability a search result, i.e., a query-document pair is actually relevant (assumed to be independent of result position).
Here are some hidden variable probability expressions (E means examination and R means relevant, they assume that a click event implies a search result is both examined and relevant):
The paper has derived the M-step update formulas:
however, how to get these two formulas?
Note: M-step is usually derived from this formula:
where θ are parameters (i.e., and in this case), X are data (c,q,d,k in this case), and Z are hidden variables (E and R?).