The question invites us to characterize the concept of "mean" in a sufficiently broad sense to encompass all the usual means--power means, Lp means, medians, trimmed means--but not so broadly that it becomes almost useless for data analysis. This reply discusses some of the axiomatic properties that any reasonably useful definition of "mean" should have.
Basic axioms
A usefully broad definition of "mean" for the purpose of data analysis would be any sequence of well-defined, deterministic functions fn:An→A for A⊂R and n=1,2,… such that
(1) min(x)≤fn(x)≤max(x) for all x=(x1,x2,…,xn)∈An (a mean lies between the extremes),
(2) fn is invariant under permutations of its arguments (means do not care about the order of the data), and
(3) each fn is nondecreasing in each of its arguments (as the numbers increase, their mean cannot decrease).
We must allow for A to be a proper subset of real numbers (such as all positive numbers) because plenty of means, such as geometric means, are defined only on such subsets.
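As a quick numerical illustration (a Python sketch of my own; the helper names are not part of the axioms), properties (1)–(3) can be checked directly for a few familiar means, taking A to be the positive reals so the geometric mean is defined:

```python
import itertools
import random
import statistics

# A few familiar means on A = positive reals.
def arithmetic(x):
    return sum(x) / len(x)

def geometric(x):
    prod = 1.0
    for v in x:
        prod *= v
    return prod ** (1.0 / len(x))

median = statistics.median

random.seed(0)
x = [random.uniform(1, 10) for _ in range(5)]

for f in (arithmetic, geometric, median):
    # (1) internality: the mean lies between the extremes.
    assert min(x) <= f(x) <= max(x)
    # (2) symmetry: invariant under permutations of the arguments.
    base = f(x)
    for p in itertools.permutations(x):
        assert abs(f(list(p)) - base) < 1e-9
    # (3) monotonicity: increasing one argument cannot decrease the mean.
    y = x.copy()
    y[2] += 1.0
    assert f(y) >= f(x) - 1e-12

print("axioms (1)-(3) hold for these examples")
```

Of course such spot checks prove nothing in general; they only illustrate what the axioms assert.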
We might also want to add that
(1') there exists at least some x∈An for which min(x)≠fn(x)≠max(x) (means are not extremes). (We cannot require that this always hold. For instance, the median of (0,0,…,0,1) equals 0, which is the minimum.)
These properties seem to capture the idea behind a "mean" being some kind of "middle value" of a set of (unordered) data.
Consistency axioms
I am further tempted to stipulate the rather less obvious consistency criterion
(4.a) The range of fn+1(t,x1,x2,…,xn), as t varies throughout the interval [min(x),max(x)], includes fn(x). In other words, it is always possible to leave the mean unchanged by adjoining an appropriate value t to a dataset. In conjunction with (3), this implies that adjoining extreme values to a dataset will pull the mean towards those extremes.
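To see (4.a) concretely, here is a small Python check (my own illustration, using the median): sweeping an adjoined value t across [min(x), max(x)] produces a range of fn+1 values that contains fn(x), so some t leaves the mean unchanged.

```python
import statistics

x = [1.0, 2.0, 3.0, 6.0]
m = statistics.median(x)  # f_n(x) = 2.5

# Sweep t through [min(x), max(x)] and record f_{n+1}(t, x1, ..., xn).
lo, hi = min(x), max(x)
values = [statistics.median([lo + k * (hi - lo) / 100] + x) for k in range(101)]

# The range of f_{n+1} as t varies covers f_n(x); adjoining a suitable
# t (here, any t in [2, 3]) leaves the median unchanged.
assert min(values) <= m <= max(values)
assert statistics.median([2.5] + x) == m
print("adjoining t = 2.5 leaves the median at", m)
```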
If we wish to apply the concept of mean to a distribution or "infinite population", then one way would be to obtain it in the limit of arbitrarily large random samples. Of course the limit might not always exist (it does not exist for the arithmetic mean when the distribution has no expectation, for instance). Therefore I do not want to impose any additional axioms to guarantee the existence of such limits, but the following seems natural and useful:
(4.b) Whenever A is bounded and X(n) is a growing sequence of random samples from a distribution F supported on A, the limit of fn(X(n)) almost surely exists. This prevents the mean from forever "bouncing around" within A even as sample sizes get larger and larger.
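As an informal illustration of (4.b) (a simulation sketch; the choice of F = Uniform(0,1) and the median is mine, not part of the axiom), the sample median of a growing uniform sample settles down near the population median 0.5 rather than wandering:

```python
import random
import statistics

random.seed(1)

# One growing sample from F = Uniform(0, 1); A = [0, 1] is bounded.
draws = [random.random() for _ in range(100_000)]
for n in (100, 1_000, 10_000, 100_000):
    print(n, statistics.median(draws[:n]))

# The medians approach 0.5 instead of bouncing around indefinitely.
assert abs(statistics.median(draws) - 0.5) < 0.01
```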
Along the same lines, we could further narrow the idea of a mean to insist that it become a better estimator of "location" as sample sizes increase:
(4.c) Whenever A is bounded, the variance of the sampling distribution of fn(X(n)) for a random sample X(n)=(X1,X2,…,Xn) of F is nonincreasing in n.
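A small Monte Carlo sketch of (4.c) (again my own illustration, with the median and F = Uniform(0,1)): the estimated sampling variance of the median shrinks as the sample size grows.

```python
import random
import statistics

random.seed(2)

def sampling_variance(n, reps=2000):
    # Variance of the sample median over many independent samples of size n.
    meds = [statistics.median(random.random() for _ in range(n))
            for _ in range(reps)]
    return statistics.pvariance(meds)

v10, v100, v1000 = (sampling_variance(n) for n in (10, 100, 1000))
print(v10, v100, v1000)
# For the uniform median the variance is roughly 1/(4n), so it drops
# by about a factor of 10 at each step.
assert v10 > v100 > v1000
```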
Continuity axiom
We might consider asking means to vary "nicely" with the data:
(5) fn is separately continuous in each argument (a small change in the data values should not induce a sudden jump in their mean).
This requirement might eliminate some strange generalizations, but it does not rule out any well-known mean. It will rule out some aggregation functions.
An invariance axiom
We can conceive of means as applying to either interval or ratio data (in Stevens' well-known sense). We cannot demand they be invariant under shifts of location (the geometric mean is not), but we can require
(6) fn(λx)=λfn(x) for all x∈An and all λ>0 for which λx∈An. This says only that we are free to compute fn using any units of measurement we like.
All the means mentioned in the question satisfy this axiom except for some aggregation functions.
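A quick numeric check of (6) (my own illustration) for two of these means, the geometric mean and the median:

```python
import statistics

def geometric(x):
    prod = 1.0
    for v in x:
        prod *= v
    return prod ** (1.0 / len(x))

x = [1.0, 2.0, 3.0, 6.0]
for lam in (0.5, 2.0, 1000.0):  # e.g., changing the units of measurement
    scaled = [lam * v for v in x]
    assert abs(geometric(scaled) - lam * geometric(x)) < 1e-9 * lam
    assert abs(statistics.median(scaled) - lam * statistics.median(x)) < 1e-9 * lam

print("f(lambda * x) == lambda * f(x) for these means")
```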
Discussion
General aggregation functions f2, as described in the question, do not necessarily satisfy axioms (1'), (2), (3), (5), or (6). Whether they satisfy any consistency axioms may depend on how they are extended to n>2.
The usual sample median enjoys all these axiomatic properties.
We could augment the consistency axioms to include
(4.d) f2n(x;x)=fn(x) for all x∈An.
This implies that when all elements of a dataset are repeated equally often, the mean does not change. This may be too strong, though: the Winsorized mean does not have this property (except asymptotically). The purpose of Winsorizing at the 100α% level is to provide resistance against changes in up to 100α% of the data at either extreme. For instance, the 10% Winsorized mean of (1,2,3,6) is the arithmetic mean of (2,2,3,3), equal to 2.5, but the 10% Winsorized mean of (1,1,2,2,3,3,6,6) is 3, because Winsorizing one value at each end leaves the doubled dataset unchanged.
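These numbers can be reproduced with a short Python sketch. I use the convention of replacing the k = ⌈αn⌉ values at each end; rounding conventions differ between implementations, so other software may give different numbers.

```python
import math

def winsorized_mean(x, alpha):
    """100*alpha% Winsorized mean: replace the k = ceil(alpha * n)
    smallest values by the (k+1)st smallest and the k largest by the
    (k+1)st largest, then average. (Rounding conventions for k vary.)"""
    s = sorted(x)
    n = len(s)
    k = math.ceil(alpha * n)
    s[:k] = [s[k]] * k
    s[n - k:] = [s[n - k - 1]] * k
    return sum(s) / n

x = [1, 2, 3, 6]
print(winsorized_mean(x, 0.10))      # mean of (2, 2, 3, 3) = 2.5
print(winsorized_mean(x + x, 0.10))  # doubled data unchanged: mean 3.0
```

Since 2.5 ≠ 3.0, this Winsorized mean fails (4.d).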
I do not know which of the consistency axioms (4.a), (4.b), or (4.c) would be most desirable or useful. They appear to be independent: I don't think any two of them imply the third.