DCOPs are a multi-agent approach to optimisation problems. Among existing DCOP algorithms, Maximum-Gain-Message (MGM) and its 2-coordinated counterpart MGM-2 are popular options.
Other similar algorithms also exist, such as DSA, which can be considered a stochastic variant of MGM \cite{fioretto18jair}, and BE-Rebid \cite{taylor2010}, which extends MGM by calculating and communicating expected gains \cite{fioretto18jair}.
Several open-source implementations of MGM exist, in Python with the \href{https://github.com/Orange-OpenSource/pyDcop}{pyDCOP framework} developed by \cite{rust19} and in Java with the \href{https://api.frodo-ai.tech/d3/d4c/a01724.html}{Frodo project} offered by \cite{Leaute2009}.
Though not always the best performers, MGM and its variants are deemed robust and efficient algorithms on average, making them suitable benchmark options \cite{fioretto18jair}.
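To fix ideas, here is a minimal, self-contained sketch of one synchronous MGM round on a binary-cost DCOP, written for a minimisation objective where an agent's gain is the decrease of its local cost. It is purely illustrative: the data structures and function names are ours and do not follow the pyDCOP or Frodo APIs.

\begin{verbatim}
# Illustrative one-round MGM sketch on a binary-cost DCOP (minimisation).
# Not the pyDCOP or Frodo API: names and data structures are ours.
# Variables/agents are 0..n-1, domains[i] lists the values of x_i, and
# costs[(i, j)] (with i < j) is the binary cost function f_ij(x_i, x_j).

def local_cost(i, value, assignment, costs):
    """Sum of the cost functions involving variable i, with x_i = value."""
    total = 0
    for (a, b), f in costs.items():
        if a == i:
            total += f(value, assignment[b])
        elif b == i:
            total += f(assignment[a], value)
    return total

def mgm_round(assignment, domains, costs):
    """One synchronous MGM round: every agent proposes its best unilateral
    move, and only agents whose gain is the largest in their neighbourhood
    (ties broken by index) actually change their value."""
    gains, moves = {}, {}
    for i, dom in enumerate(domains):
        current = local_cost(i, assignment[i], assignment, costs)
        best_value = min(dom, key=lambda v: local_cost(i, v, assignment, costs))
        gains[i] = current - local_cost(i, best_value, assignment, costs)
        moves[i] = best_value

    new_assignment = list(assignment)
    for i in range(len(domains)):
        neighbours = [b if a == i else a for (a, b) in costs if i in (a, b)]
        wins = all(gains[i] > gains[j] or (gains[i] == gains[j] and i < j)
                   for j in neighbours)
        if gains[i] > 0 and wins:
            new_assignment[i] = moves[i]
    return new_assignment

# Toy run: two variables linked by a single "must differ" cost function.
domains = [[0, 1], [0, 1]]
costs = {(0, 1): lambda a, b: 0 if a != b else 10}
assignment = [0, 0]
for _ in range(3):
    assignment = mgm_round(assignment, domains, costs)
print(assignment)  # [1, 0], a zero-cost assignment
\end{verbatim}

MGM-2, its 2-coordinated counterpart, additionally lets pairs of neighbouring agents offer and evaluate joint value changes before the same kind of gain comparison takes place.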
We proceed to the description of the MGM-2 automaton and finally move on to two examples of execution on toy problems.
...
\section{Conclusion}
\label{sec:conc}
We detailed the MGM and, more particularly, MGM-2 algorithms and presented examples of their execution on toy DCOPs.
We likewise detailed the automaton representing MGM-2 and offered two examples of its execution on toy problems.
A Distributed Constraint Optimization Problem (DCOP) framework is
the distributed version of constraint optimization problems. In this
multi-agent paradigm, agents representing variables in the problem communicate so that each of them
can gradually update the value of the variable it controls to improve the global objective function, be it a maximisation or a minimisation.
There are at least as many variables as agents, and possibly more, implying that a single agent might control several variables but always controls at least one.
However, in most DCOPs each agent is assumed to control a single variable.
The optimal solution is the minimum (resp. maximum) of the global objective function.
Each variable has a domain of values it can take; this domain is only known to the agent controlling that variable.
The notion of neighbourhood is key because DCOPs rely on locality.
Therefore, each agent can only communicate with its neighbours and only knows about the cost functions that involve at least one of the variables it controls.
\subsection{Origins}
DCOPs are decentralised versions of COPs, which in turn extend CSPs.
In order to better understand DCOPs, we give here a brief overview of the types of problems they originate from.
A CSP is a problem where the goal is to determine whether a set of variables, each ranging over a given domain, can satisfy a set of constraints.
These are typically combinatorial problems \cite{fioretto18jair}.
\begin{definition}[CSP]
A Constraint Satisfaction Problem is a tuple
$\langle X,D,C \rangle$ where
\begin{itemize}
\item$X =\{x_1, \ldots, x_n \}$ is the set of variables;
\item$D =\{d_1, \ldots, d_n \}$ is the set of corresponding non-empty domains;
\item$C =\{c_1, \ldots, c_m\}$ is the set of constraints over the variables.
\end{itemize}
\end{definition}
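For instance, graph 3-colouring is a classical CSP: $X$ contains one variable per vertex, every domain in $D$ is the same set of three colours, and $C$ contains one difference constraint $x_i \neq x_j$ per edge $\{i,j\}$; such an instance is satisfiable exactly when the graph admits a proper 3-colouring.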
COPs, sometimes called Weighted Constraint Satisfaction Problems, take a step further than CSPs.
Here the solution is not a binary answer simply stating whether the constraints can be satisfied or not: results are quantifiable.
COPs are akin to CSPs with an additional objective function, either a maximisation or a minimisation. In this setting, constraints can be either hard or soft, depending on whether respecting them is vital or merely preferable.
The objective is a maximisation in the case of rewards (utilities) and a minimisation in the case of costs.
Typically, a constraint which \textit{needs} to be satisfied is called \textit{hard}, while a constraint which \textit{should} be satisfied is called \textit{soft}.
\begin{definition}[COP]
A Constraint Optimization Problem is a tuple
$\langle X,D,F,\alpha\rangle$ where:
\begin{itemize}
\item$X =\{ x_1, \ldots, x_n \}$ is a set of $n$ variables;
\item$D =\{ D_1, \ldots, D_n \}$ is the set of finite domains for the variables in $X$,
with $D_i$ being the set of possible values for the variable $x_i$;
\item$F$ is a set of constraints. A constraint $f_i \in F$ is a
...
and $\bot$ is a special element used to denote that a
given combination of values for the variables in $\mathbf{x^i}$ is not
allowed, and it has the property that
$a +\bot=\bot+ a=\bot, \forall~a \in\mathds{R}$;
\item$\alpha : X \rightarrow A$ is a surjective function, from
variables to agents, which assigns the control of each variable
$x \in X$ to an agent $\alpha(x)$.
\end{itemize}
\end{definition}
An assignment $\sigma$ is a value assignment to some or all of the
variables of $X$. An assignment is complete if it assigns a value to
each variable in $X$. For a given complete assignment $\sigma$, we say
that a constraint (also called cost function) $f_i$ is satisfied by
$\sigma$ if $f_i(\sigma(\mathbf{x^i}))\neq\bot$. A complete assignment is a
solution of a DCOP if it satisfies all its cost functions.
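As a small illustration, suppose a single hard constraint is modelled as $f_1(x_1,x_2)=\bot$ whenever $x_1 = x_2$, and $f_1(x_1,x_2)=0$ otherwise; a complete assignment giving the same value to $x_1$ and $x_2$ is then not a solution, since $f_1$ evaluates to $\bot$, while any assignment giving them different values satisfies $f_1$ with cost $0$.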
In the case of anytime algorithms, each assignment is complete and gradually improves over time, which makes it possible to stop the algorithm at any point.
The optimization objective is represented by the function $\Obj$, which
can be of different nature, either a minimisation or a maximisation. A solution is a complete assignment with
cost different from $\bot$, and an optimal solution is a solution with
minimal cost (resp. maximal utility). In general, this function is a sum of cost (resp. utility) constraints:
$\Obj =\Sigma_i f_i$; but some approaches can use other kinds of aggregation.
Additionally, in the following we consider that:
\begin{itemize}
\item$n = m$,
i.e. each agent controls only one variable. This restriction is often considered in the literature;
\item all constraints are binary, i.e. each cost function involves exactly two variables.
\end{itemize}
With these assumptions, a DCOP can be easily represented as a graph where vertices are variables and
edges are binary constraints. Note that since constraints are binary and at most one constraint involves any given pair of variables, the resulting graph is never a multigraph.
\begin{definition}
Let $\langle A,X,D,F,\alpha\rangle$ be a DCOP, such that
\begin{itemize}
\item$A =\{a_1 , \ldots, a_m \}$;
\item$X =\{ x_1, \ldots, x_m \}$;
\item$F$ contains at most one function $f_{ij}(x_i,x_j)$ per pair $\{i,j\}$, $1\leq i < j \leq m$; to simplify, $f_{ij}(x_i,x_j)$ and $f_{ji}(x_j,x_i)$ denote the same function;
\item$\alpha(x_i)= a_i$, i.e. each agent $a_i$ controls the variable $x_i$.
\end{itemize}
In this context, we define from $F$ the global cost function and the local cost function of an agent depending on its neighbourhood:
\begin{itemize}
\item the global cost function $\Obj=\Sigma_{f_{ij}\in F} f_{ij}(x_i,x_j)$;
\item the local cost function of an agent $a_i$, i.e. the sum of all cost functions $f_{ij}$ involving its variable $x_i$.
\end{itemize}
\end{definition}
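As an illustration, consider the following toy instance of our own making: three agents $a_1, a_2, a_3$ control variables $x_1, x_2, x_3$ with domains $D_1=D_2=D_3=\{0,1\}$, and $F$ contains two cost functions $f_{12}$ and $f_{23}$, each equal to $0$ when its two arguments differ and to $1$ otherwise. The associated graph is the path on $a_1, a_2, a_3$ with edges $\{a_1,a_2\}$ and $\{a_2,a_3\}$. The global cost function is $\Obj = f_{12}(x_1,x_2) + f_{23}(x_2,x_3)$; the local cost function of $a_2$ contains both terms, while that of $a_1$ only contains $f_{12}(x_1,x_2)$. For the complete assignment $(x_1,x_2,x_3)=(0,0,1)$ we get $\Obj = 1 + 0 = 1$, whereas $(0,1,0)$ is an optimal solution with $\Obj = 0$.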
This presentation of DCOPs is based on cost functions, so the objective is to minimize the global cost function.
Symmetrically, one can use utility functions and the objective becomes to maximize the global utility function.
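To make the global cost function concrete in code, here is a minimal Python sketch (illustrative only, unrelated to the pyDCOP or Frodo APIs, and deliberately centralised rather than distributed) that enumerates all complete assignments of the toy instance above and returns an optimal one together with its global cost.

\begin{verbatim}
# Brute-force evaluation of the toy DCOP above (illustrative only).
# This is NOT a DCOP algorithm: it assumes central knowledge of all the
# cost functions, which distributed algorithms such as MGM precisely avoid.
from itertools import product

domains = [[0, 1], [0, 1], [0, 1]]           # D_1, D_2, D_3
costs = {                                    # binary cost functions f_ij
    (0, 1): lambda a, b: 0 if a != b else 1,
    (1, 2): lambda a, b: 0 if a != b else 1,
}

def global_cost(assignment):
    """Global cost: the sum of all binary cost functions."""
    return sum(f(assignment[i], assignment[j]) for (i, j), f in costs.items())

best = min(product(*domains), key=global_cost)
print(best, global_cost(best))  # (0, 1, 0) with global cost 0
\end{verbatim}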
\subsection{Classification}
In the multi-agent domain, DCOPs are classified according to their set of characteristics, namely:
\begin{itemize}
\item Action effects: stochastic or deterministic.
Refers to the results obtained by actions: in a deterministic setting, an action performed in a particular context always yields the same result, whereas in a stochastic one the same action performed in the same context can lead to different results.
\item Knowledge: total or incomplete.
Refers to the extent of an agent's awareness of the other agents' variables.
With incomplete knowledge, locality is usually the rule, while with total knowledge any agent possesses information about any other agent.
\item Group behaviour: cooperative or competitive.
Depends on whether agents consider their own welfare first or the general well-being of the system.
\item Environment type: deterministic or stochastic.
Refers to whether events in the environment follow a certain pattern or not.
\item Environment dynamics: static or dynamic.
Depends on whether the environment remains identical from the beginning to the end of a process or evolves.
\end{itemize}
Generally speaking, most DCOP algorithms rely on deterministic action effects, a cooperative group behaviour and total knowledge.
MGM and MGM-2 are deterministic both in terms of actions and environment, with a cooperative group behaviour and incomplete (local) knowledge.
DCOP algorithms are used for several applications, for instance: