For the second part of your question (estimating the entropy difference between distributions), you may be able to use the identity
$$F = \langle E \rangle - TS,$$
where $\langle E \rangle$ is the average energy, $T$ is the temperature (in $p \propto e^{\theta E}$, the parameter $\theta$ plays the role of $-1/(k_B T)$, so $T$ is inversely proportional to $\theta$), and $S$ is the entropy. For details, see: Jaynes, E. T. (1957). Information Theory and Statistical Mechanics. Physical Review, 106(4), 620–630. http://doi.org/10.1103/PhysRev.106.620
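To see the identity in action, here is a minimal sketch on a toy discrete Gibbs distribution (the energies are made up for illustration, with $k_B = 1$), verifying that $F = -T \log Z$ equals $\langle E \rangle - TS$:

```python
import numpy as np

# Toy discrete Gibbs distribution (energies made up for illustration).
rng = np.random.default_rng(0)
T = 1.5                          # temperature (k_B = 1)
E = rng.uniform(0.0, 3.0, 8)     # energies of 8 states
w = np.exp(-E / T)
Z = w.sum()                      # partition function
p = w / Z                        # Gibbs probabilities

F = -T * np.log(Z)               # free energy
avg_E = (p * E).sum()            # average energy <E>
S = -(p * np.log(p)).sum()       # Gibbs/Shannon entropy (k_B = 1)

assert np.isclose(F, avg_E - T * S)   # F = <E> - T S
```

The assertion holds exactly (up to floating-point error) because substituting $p_i = e^{-E_i/T}/Z$ into $-\sum_i p_i \log p_i$ reproduces $(\langle E \rangle - F)/T$.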
The idea, then, is to use one of the methods from the computational statistical physics literature (see the links in the sidebar of that page) to estimate the free energy difference $\Delta F$, and then obtain $\Delta S$ from $\Delta F$ and $\Delta\langle E \rangle$ via the formula above. Keep in mind that restricting to a subset $A_1$ of $A$ is equivalent to modifying the energy function $E$ so that it becomes infinite on the complement of $A_1$.
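The restriction trick has a nice closed form on a discrete toy system (states, energies, and the subset $A_1$ below are all made up for illustration): setting $E = +\infty$ outside $A_1$ shrinks the partition function to $Z' = Z \cdot P(A_1)$, so $\Delta F = -T \log P(A_1)$, and $\Delta S = (\Delta\langle E \rangle - \Delta F)/T$ matches the direct entropy difference:

```python
import numpy as np

# Toy discrete system; energies and the subset A1 are made up.
rng = np.random.default_rng(0)
T = 1.5
E = rng.uniform(0.0, 3.0, 8)
p = np.exp(-E / T)
p /= p.sum()

A1 = np.array([0, 2, 3])         # hypothetical subset of the state space A
pA1 = p[A1].sum()

# With E set to +infinity outside A1, the partition function becomes
# Z' = Z * P(A1), so the free energy difference is known in closed form:
dF = -T * np.log(pA1)

p_r = p[A1] / pA1                            # restricted, renormalized distribution
dE = (p_r * E[A1]).sum() - (p * E).sum()     # change in <E>
dS = (dE - dF) / T                           # entropy difference via F = <E> - T S

# Cross-check against the direct entropy difference.
S_full = -(p * np.log(p)).sum()
S_restr = -(p_r * np.log(p_r)).sum()
assert np.isclose(dS, S_restr - S_full)
```

In a real problem $\Delta F$ would come from a sampling-based estimator rather than the exact sum, but the bookkeeping is the same.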
Here are two additional references on algorithms for computing free energy:
Lelièvre, T., Rousset, M., & Stoltz, G. (2010). Free Energy Computations. Imperial College Press. http://doi.org/10.1142/9781848162488
Chipot, C., & Pohorille, A. (Eds.). (2007). Free Energy Calculations (Vol. 86). Berlin, Heidelberg: Springer. http://doi.org/10.1007/978-3-540-38448-9
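As a minimal illustration of the simplest such algorithm, free energy perturbation via Zwanzig's identity $\Delta F = -T \log \langle e^{-(E_1 - E_0)/T} \rangle_0$ (covered in both books), here is a sketch on a made-up discrete toy system, comparing the sampling estimate against the exact value:

```python
import numpy as np

# Toy discrete system; energies are made up for illustration.
rng = np.random.default_rng(1)
T = 1.0
E0 = rng.uniform(0.0, 2.0, 6)          # reference energies
E1 = E0 + rng.uniform(0.0, 0.5, 6)     # perturbed energies

w0 = np.exp(-E0 / T)
p0 = w0 / w0.sum()
dF_exact = -T * np.log(np.exp(-E1 / T).sum() / w0.sum())

# Zwanzig's identity: dF = -T log < exp(-(E1 - E0)/T) >_0,
# estimated by sampling states from the reference distribution p0.
states = rng.choice(len(E0), size=200_000, p=p0)
dF_est = -T * np.log(np.exp(-(E1[states] - E0[states]) / T).mean())

assert abs(dF_est - dF_exact) < 0.01
```

In practice the estimator's variance blows up when the two distributions overlap poorly, which is exactly the situation the more sophisticated methods in these references (thermodynamic integration, stratification, adaptive biasing) are designed to handle.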