DCOPs can be addressed by a series of algorithms described in \cite{fioretto18jair} and presented in Figure~\ref{fioretto_taxo}.
The first distinction concerns solution optimality: complete algorithms guarantee that the optimal solution is found, while incomplete ones offer no such guarantee but have shorter execution times.
Then, they are classified according to whether or not they are centralised and synchronous.
Finally, they are divided based on their exploration process, which revolves around three main approaches, namely search, inference and sampling.
...
...
In our case, we focus on MGM and its coordinated variant MGM-2, which are incomplete.
\subsection{MGM \& MGM-2}
Both MGM and MGM-2 are extensively described by \cite{maheswaran04pdcs}.
That paper details the 2-coordinated algorithm and gives hints at k-coordinated versions.
Historically, MGM evolved from DBA, with the difference that MGM does not modify constraint costs to escape local minima and does not require DBA's global knowledge of solution quality.
...
...
Another algorithm MGM is often compared with is DSA. The difference lies in how agents decide to move: DSA agents change value stochastically rather than electing the neighbourhood's best gain.
In terms of solution quality, both MGM and MGM-2 are provably monotone: intuitively, since the global utility is the sum of the utilities over all constraints and no two neighbouring agents move simultaneously, any move that strictly improves an agent's local sum can only increase the global utility.
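This monotonicity argument can be made slightly more explicit (the notation here is ours, not the paper's). Writing the global utility as a sum over constraints,
\[
U(x) \;=\; \sum_{(i,j) \in E} u_{ij}(x_i, x_j),
\]
only one agent per neighbourhood moves in a round, so the moving agents share no constraints and their announced gains simply add up:
\[
U(x^{t+1}) - U(x^{t}) \;=\; \sum_{a \in M_t} g_a \;\geq\; 0,
\]
where $M_t$ is the set of agents allowed to move in round $t$ and each $g_a > 0$ is the gain agent $a$ announced. Hence $U$ never decreases.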
\subsubsection{MGM-2 overview}
Their main focus of interest is the application of DCOPs to large-scale problems, where fully-connected networks of agents (complete graphs) become a severe limitation.
The computational costs caused by such topologies are prohibitive, and a possible solution is the local-knowledge approach of distributed algorithms.
The principle is one of coordinated negotiation and distributed control of variables.
``\textit{The optimal solution of a DCOP is a Nash Equilibrium in an appropriate game}''.
Coordination and cooperation are key since selfish behaviours can result in unstable dynamics, which means a structure needs to be imposed on how values get updated.
The 2-coordinated algorithm keeps improving until neither a unilateral nor a bilateral move can improve the utility function.
Agents can perform either unilateral or bilateral (2-coordinated) moves, whereby they update their values according to the computed improvement in global utility.
As opposed to coalition scenarios, where a manager handles the agents' decisions and thereby effectively centralises the behaviour, MGM-2 aims at allowing coordination while maintaining a distributed decision-making process.
A notion of solidarity is necessary here, hinting at a cooperative environment; it could be replaced by compensations between agents in a competitive one.
In MGM-2, coordinated pairs consider the overall gain they and their partner can achieve together, irrespective of whether their own individual gain improves.
...
...
This is possible because we base the interaction on a cooperative framework, where a joint action is considered useful if the sum of the two partnered agents' utility functions increases, even if one of them diminishes.
\subsubsection{MGM-2 principle}
Globally speaking, the MGM process can be summed up as follows:
at the beginning of each round, agents inform their neighbours of their current value.
Using this information, each agent then computes how each possible change to its own value would affect its utility, given its neighbours' current values.
Once this is done, each agent selects the best move it can make, together with the corresponding gain, and informs each of its neighbours.
Within each neighbourhood, only the single agent that announced the best gain is allowed to act.
The agent thus selected updates its value and another round can begin.
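The round just described can be sketched on a toy problem. This is a minimal illustration under our own assumptions: the names (`Agent`, `cost`, `mgm_round`), the equality constraint, and the deterministic tie-break by agent name are ours, not the paper's implementation.

```python
# Minimal, self-contained sketch of one MGM round on a toy problem:
# two agents, binary domains, one "prefer equal values" constraint.
# All names here are illustrative assumptions, not the paper's code.

def cost(v1, v2):
    """Toy binary constraint: cost 0 when values agree, 1 otherwise."""
    return 0 if v1 == v2 else 1

class Agent:
    def __init__(self, name, value, domain):
        self.name, self.value, self.domain = name, value, domain
        self.neighbours = []

    def best_move(self):
        """Best unilateral move and the gain (cost reduction) it yields."""
        current = sum(cost(self.value, n.value) for n in self.neighbours)
        best_v, best_gain = self.value, 0
        for v in self.domain:
            gain = current - sum(cost(v, n.value) for n in self.neighbours)
            if gain > best_gain:
                best_v, best_gain = v, gain
        return best_v, best_gain

def mgm_round(agents):
    """One synchronous round: every agent announces its best gain, and
    only the neighbourhood winner actually moves (ties broken by name
    here, one possible deterministic choice)."""
    proposals = {ag.name: ag.best_move() for ag in agents}
    for ag in agents:
        v, g = proposals[ag.name]
        wins = all((g, ag.name) > (proposals[n.name][1], n.name)
                   for n in ag.neighbours)
        if g > 0 and wins:
            ag.value = v

a, b = Agent("a", 0, [0, 1]), Agent("b", 1, [0, 1])
a.neighbours, b.neighbours = [b], [a]
mgm_round([a, b])
print(a.value, b.value)  # → 0 0 (only "b", the tie-break winner, moved)
```

Note how monotonicity shows up even in this tiny case: both agents announce the same gain, but only one of them is allowed to act, so the global cost drops from 1 to 0 rather than oscillating.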
In MGM-2, the process is slightly more complex since coordinated moves come into play.
The difference starts from the beginning of the round, where agents are randomly split into 2 sets, \emph{offerers} and \emph{receivers}.
Each set will have a very different behaviour.
Offerers select a neighbour at random, make an offer to said neighbour and wait for their neighbour's response.
If the neighbour declines the offer, they switch back to an MGM-like behaviour where they compute their best solo move (as in MGM) and so on.
If their neighbour accepts, they are from then on \textit{committed}, just like their neighbour.
Receivers merely wait for potential offers: they may receive none, in which case they fall back to an MGM-like solo move, or receive offers and choose among them.
If they do get offers, they will choose the best one among the acceptable ones.
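The joint-move computation behind an offer can be sketched as follows. This is a simplification under our own assumptions (the names and the single toy constraint are ours): it keeps only the shared constraint between the two partners, whereas a full MGM-2 offer would also account for each partner's costs with agents outside the pair.

```python
# Self-contained sketch of the joint-move computation behind an MGM-2
# offer: the offerer enumerates joint assignments with one neighbour and
# keeps the pair with the best *combined* gain. Names are illustrative
# assumptions, not the paper's implementation.

def cost(v1, v2):
    """Toy binary constraint: cost 0 when values agree, 1 otherwise."""
    return 0 if v1 == v2 else 1

def best_joint_move(offerer_val, receiver_val, domain):
    """Joint assignment maximising the pair's summed gain. The criterion
    is the combined gain: an offer can be worthwhile even if one
    partner's individual utility diminishes."""
    current = cost(offerer_val, receiver_val)
    best = (offerer_val, receiver_val, 0)
    for vo in domain:
        for vr in domain:
            gain = current - cost(vo, vr)
            if gain > best[2]:
                best = (vo, vr, gain)
    return best

vo, vr, gain = best_joint_move(0, 1, [0, 1])
print(vo, vr, gain)  # → 0 0 1: aligning the pair removes the conflict
```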
...
...
A No-Go means neither of them will make a move this round, and so they move on to the next round.
Since variables are in fact agents in a DCOP game, the optimal solution corresponds to a Nash Equilibrium in the specified game.
The notion of vicinity is crucial and should be considered fixed once and for all: an agent's neighbours do not vary during the game.
A round starts with all agents broadcasting their current value to their vicinity.
This means each agent sends one message and receives $|\text{\emph{vicinity}}|$ messages, each containing one particular neighbour's current value.
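These per-agent counts can be checked on a small hypothetical graph (the graph below is our own example): across the whole system, the broadcast step costs one send per agent and, in total, one receive per directed edge, i.e. $2|E|$ receives.

```python
# Toy check of per-round message counts in the broadcast step: each
# agent sends 1 message and receives one per neighbour, so the whole
# graph performs N sends and 2|E| receives. The graph is hypothetical.

neighbours = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}  # 3 agents, 2 edges

sent = len(neighbours)                               # one broadcast each
received = sum(len(v) for v in neighbours.values())  # |vicinity| each
print(sent, received)  # → 3 4
```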