Hybrid Two-Phase Task Allocation for Mobile Crowd Sensing

Citation

LIU Jiahao, JIN Hanxin, QIANG Lei, et al. Hybrid Two-Phase Task Allocation for Mobile Crowd Sensing[J]. Computer Engineering, 2022, 48(3), 139-145. DOI: 10.19678/j.issn.1000-3428.0061752.

Foundation item

National Natural Science Foundation of China(62102275, U20A20182, 61873177, 62072322);Natural Science Foundation of Jiangsu Province in China (BK20210704);Natural Science Foundation of the Jiangsu Higher Education Institutions of China (21KJB520025)

Author

LIU Jiahao (2000―), male, undergraduate, main research directions are crowd sensing, knowledge graph; JIN Hanxin, undergraduate; QIANG Lei, undergraduate; GAO Guoju, assistant professor, Ph.D; DU Yang, Ph.D., postdoctoral fellow; HUANG He, professor, Ph.D.

History

Receipt Date: 2021-05-25
Revised Data: 2021-07-05

Contents Abstract Full text Figures/Tables PDF

Hybrid Two-Phase Task Allocation for Mobile Crowd Sensing

LIU Jiahao , JIN Hanxin , QIANG Lei , GAO Guoju , DU Yang , HUANG He

School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China

Receipt Date: 2021-05-25; Revised Data: 2021-07-05

Foundation item: National Natural Science Foundation of China(62102275, U20A20182, 61873177, 62072322);Natural Science Foundation of Jiangsu Province in China (BK20210704);Natural Science Foundation of the Jiangsu Higher Education Institutions of China (21KJB520025)

Biography: LIU Jiahao (2000―), male, undergraduate, main research directions are crowd sensing, knowledge graph;
JIN Hanxin, undergraduate;
QIANG Lei, undergraduate;
GAO Guoju, assistant professor, Ph.D;
DU Yang, Ph.D., postdoctoral fellow;
HUANG He, professor, Ph.D.

Abstract: As a result of the popularity of mobile devices, Mobile Crowd Sensing(MCS) has attracted a lot of attention.Task allocation is a significant problem in MCS.Most previous studies mainly focused on stationary spatial tasks while neglecting the changes of tasks and workers.In this paper, the proposed hybrid two-phase task allocation algorithm considers heterogeneous tasks and diverse workers.For heterogeneous tasks, there are different start times and deadlines.In each round, the tasks are divided into urgent and non-urgent tasks.The diverse workers are classified into opportunistic and participatory workers.The former complete tasks on their way, so they only receive a fixed payment as employment compensation, while the latter commute a certain distance that a distance fee is paid to complete the tasks in each round as needed apart from basic employment compensation.The task allocation stage is divided into multiple rounds consisting of the opportunistic worker phase and the participatory worker phase.At the start of each round, the hiring of opportunistic workers is considered because they cost less to complete each task.The Poisson distribution is used to predict the location that the workers are going to visit, and greedily choose the ones with high utility.For participatory workers, the urgent tasks are clustered by employing hierarchical clustering after selecting the tasks from the uncompleted task set.After completing the above steps, the tasks are assigned to participatory workers by extending the Kuhn-Munkres(KM) algorithm.The rest of the uncompleted tasks are non-urgent tasks which are added to the task set for the next round.Experiments are conducted based on a real dataset, Brightkite, and three typical baseline methods are selected for comparison.Experimental results show that the proposed algorithm has better performance in terms of total cost as well as efficiency under the constraint that all tasks are completed.

开放科学(资源服务)标志码(OSID):

0 Introduction

Nowadays, with the proliferation of smart phones, bracelets, and other mobile devices, Mobile Crowd Sensing(MCS)^[1] has been widely used in traffic prediction^[2], air pollution monitoring^[3], and so on.As a critical part of MCS, the allocation of sensing tasks is extensively studied^[4].In the sensing task allocation stage, it needs to choose an appropriate strategy to allocate tasks to workers in order to reduce costs or improve data quality.In this paper, authors classify workers into opportunistic^[5] and participatory workers^[6], and assign dynamic tasks(tasks with different start time and deadlines) in each round.To minimize the total cost under the constraint that all tasks are completed on time, three challenges must be addressed.First, how to accurately predict the path of opportunistic workers based on the past trajectory; Second, how to combine the task allocation of opportunistic workers and participatory workers organically; Third, how to find the appropriate workers to reduce the overall system cost.

To address the above challenges, this paper puts forward a reliable hybrid two-phase task allocation algorithm.Specifically, it focuses on the mixed task allocation problem in which all tasks should be completed before the deadline with a minimum cost.The proposed algorithm considers the opportunistic worker phase and the participatory worker phase in each round.In the former phase, this paper first predicts the probability of workers visiting each location, then introduces the greedy algorithm to choose a group of candidate workers based on geographical entropy.In the later phase, it divides uncompleted tasks into urgent tasks and non-urgent tasks.This paper employs hierarchical clustering^[7] to cluster the urgent tasks and assigns them to participatory workers by extending the Kuhn-Munkres (KM) algorithm in the current round.Non-urgent tasks are added to the task set of next round, waiting for possible opportunistic workers to complete the tasks to save cost.

1 Related work

There has been large numbers of studies on the task allocation problem in MCS, which mainly focus on maximizing task quality or task coverage under the budget constraints^[8].YANG et al.proposed an efficient budgeted informativeness maximization algorithm to recruit workers and realized informativeness maximization under the budget constraint^[9].GONG et al.focused on the scenarios where users and tasks arrived dynamically and designed four online task assignment algorithms to maximize total task quality^[10].GUO et al. proposed a worker selection framework for time-sensitive tasks that seek to minimize the total distance and for delay-tolerant tasks that seek to minimize the total number of workers^[11].

Some studies also focused on selecting appropriate workers to minimize the overall system cost.XIAO et al.introduced a highly reliable multi-task allocation framework that considered the offline and online stage^[12].They combined the reverse auction mechanism with the network ﬂow model to effectively minimize cost and maximize coverage.WANG et al.proposed a sparse mobile crowdsensing paradigm to reduce the number of tasks according to the spatial and temporal correlation among the data, thus lowering total sensing cost^[13].KANG et al. proposed a hitchhiking working style that encourages crowd workers to commit tasks on their way so as to expand space coverage while reducing costs^[14].

Most of the above studies have given incomplete views.Some of them consider spatial tasks but ignore time constraints, while some take deadline into account but neglect the dynamic changes of tasks and workers.In this paper, authors consider the real-time requirements and heterogeneity of sensing tasks with different time of arrival and deadlines.In addition, the active time of opportunistic workers and participatory workers may also be different.

2 Problem model

In this section, the task allocation problem is definesd and formulatesd, in order to minimize the total cost under the constraint of completing all tasks before each task expires.

2.1 Problem definition

In the mobile crowd sensing system, tasks are generally heterogeneous in reality.There may be short-time tasks such as puddle detection, or non-urgent tasks such as data acquisition of old buildings in the city.Moreover, workers in the MCS are diverse.They can be busy employees, students with a fixed route, or part-time workers recruited by the platform.This paper studies how to assign heterogeneous tasks to mixed workers, in order to minimize the cost while ensuring that all tasks are completed.

In this paper, time is divided into numerous rounds, expressed as $ R=\{{R}_{1}, {R}_{2}, \cdots , {R}_{t}\} $, in which $ {R}_{k} $ denotes the $ {k}^{\mathrm{t}\mathrm{h}} $ round.In each round, there exists a variety of sensing tasks $ \mathit{\Gamma} =\{{T}_{1}, {T}_{2}, \cdots , {T}_{w}\} $.Each task $ {T}_{s} $ can be represented as $ {T}_{s}= < {l}_{s}, {\tau }_{s}^{k} > $, in which $ {l}_{s} $ denotes the location of $ {T}_{s} $ consisting of latitude and longitude, and $ {\tau }_{s}^{k} $ represents the remaining survival time of the task from the start time of this round to deadline^[15].Only tasks that are completed before the deadline are actually effective.Sensing tasks are posted only at the beginning of each round, but their deadline can be any time during the round.

For workers in the MCS model, this paper divides them into two categories based on their historical activity patterns.One is opportunistic workers, denoted as $ O=\{{o}_{1}, {o}_{2}, \cdots , {o}_{m}\} $.These workers have a relatively fixed path of movement each day and prefer to complete tasks along the way.For an opportunistic worker $ {o}_{i}(1\le i\le m) $, there may be a record of his/her trajectories in each past round.Besides, there is no upper limit on the number of tasks they can complete.The other category is participatory workers, denoted as $ P=\{{p}_{1}, {p}_{2}, \cdots , {p}_{n}\} $.These workers usually have variable routes, but they are willing to take a detour at any time to complete a certain number of tasks.Here authors assume that each participatory worker completes an upper limit of $ \eta $ tasks in each round.For a participatory worker $ {p}_{j}(1\le j\le n) $, his/her position at the beginning of the current round, and the speed $ {v}_{j} $ at which he/she can move toward the tasks can be gotten.

With the above definition of opportunistic and participatory workers, the round $ {R}_{k} $ can be further divided.Since the active time of opportunistic workers is limited, this paper divides the round $ {R}_{k} $ into two parts: the first part of the round is for opportunistic workers, denoted as $ {R}_{k}^{\mathrm{o}} $, and the remaining time is for participatory workers, denoted as $ {R}_{k}^{\mathrm{p}} $.Furthermore, the location of $ {p}_{j} $ at the beginning of $ {R}_{k}^{\mathrm{p}} $ will be gotten.The whole process of task allocation is shown in Fig.1.

	Download: JPG larger image
Fig. 1 Hybrid two-phase task allocation framework

The objective of this paper is to minimize the total cost^[16].Therefore, the costs of both types of workers need to be considered.For opportunistic workers, this paper denotes their individual cost as $ {I}_{\mathrm{b}\mathrm{a}\mathrm{s}\mathrm{i}\mathrm{c}} $.For participatory workers, a certain amount for their detour must be paid which is $ {I}_{\mathrm{d}\mathrm{i}\mathrm{s}} $ per kilameter.Besides, they also receive a fixed cost for each task completed, which is denoted as $ {I}_{\mathrm{t}\mathrm{a}\mathrm{s}\mathrm{k}} $.On this basis, in round $ {R}_{k} $, if a participatory worker $ {p}_{j} $ moves a total of $ {D}_{j}^{k} $ km to complete $ {q}_{j}^{k} $ tasks, the cost is:

$ {c}_{\mathrm{p}}={D}_{j}^{k}\times {I}_{\mathrm{d}\mathrm{i}\mathrm{s}}+{q}_{j}^{k}\times {I}_{\mathrm{t}\mathrm{a}\mathrm{s}\mathrm{k}} $

(1)

In summary, the total cost of employing workers in round $ {R}_{k} $ is:

$ {C}^{k}=\left|{S}^{k}\right|\times {I}_{\mathrm{b}\mathrm{a}\mathrm{s}\mathrm{i}\mathrm{c}}+\sum\limits_{{p_j} \in {P^k}} {\left( {D_j^k \times {I_{{\rm{dis}}}} + q_j^k \times {I_{{\rm{task}}}}} \right)} $

(2)

where $ {S}^{k} $ represents the set of opportunistic workers and $ \left|{S}^{k}\right| $ represents the number of its elements.$ {P}^{k} $ represents the set of participatory workers in round $ {R}_{k} $.

At the same time, the constraints need to be met that all tasks are completed by the deadline and participatory workers have an upper limit of tasks:

$ \mathrm{m}\mathrm{i}\mathrm{n}\sum\limits_{k=1}^{t}{C}^{k} $

(3)

$ \mathrm{s}.\mathrm{t}.\mathrm{ }\mathrm{C}\mathrm{o}\mathrm{m}\mathrm{p}\mathrm{l}\mathrm{e}\mathrm{t}\mathrm{e}\left({T}_{s}\right), \forall {T}_{s}\in \Gamma $

(4)

$ {n}_{j}^{k}\le \eta , \forall {p}_{j}\in {P}^{k}, \mathrm{ }\forall {R}_{k}\in R $

(5)

2.2 Task allocation model for opportunistic workers

For opportunistic workers, this paper predicts the future location of $ {o}_{i} $ according to the historical data, and gets the probability of reaching each location in the next round.

The probability of o_i arriving at location $ {L}_{s} $ for h times follows an inhomogeneous Poisson process^[17].Hence, the probability can be predicted that o_i will arrive at the $ {L}_{s} $ in the round $ {R}_{k} $:

$ {P}_{s}^{k}\left({o}_{i}\right)=\sum\limits_{h=1}^{\mathrm{\infty }}\frac{{\lambda }_{i, s}^{h}\times {\mathrm{e}}^{-{\lambda }_{i, s}}}{h!}=1-{\mathrm{e}}^{-{\lambda }_{i, s}} $

(6)

where $ {\lambda }_{i, s} $ denotes the average times for $ {o}_{i} $ to reach $ {L}_{s} $.

Based on the geographical entropy^[18], the priority of each task is:

$ W\left({T}_{s}\right)=\frac{1/\left(\mathrm{E}\mathrm{n}\mathrm{t}\mathrm{r}\mathrm{o}\mathrm{p}\mathrm{y}\right({T}_{s})+1)}{\sum\limits_{i=1}^{w}1/\left(\mathrm{E}\mathrm{n}\mathrm{t}\mathrm{r}\mathrm{o}\mathrm{p}\right({T}_{i})+1)} $

(7)

The greater the weight of a location is, the lower the probability that it will be visited by participatory workers, and the higher the expected reward is required by participatory workers to complete tasks nearby.Since the cost of an opportunistic worker is certain, more cost can be saved by allocating the tasks to opportunistic workers.

The utility of each opportunity worker to overall system is defined as:

$ {U}^{k}\left({S}^{k}\right)=\sum\limits_{{T}_{s}\in \Gamma , {o}_{i}\in {S}^{k}}{P}_{s}^{k}\left({o}_{i}\right)\times w\left({T}_{s}\right) $

(8)

In the search of opportunistic workers, authors seek to find an o_i that maximizes the overall increase in utility for which this paper sets a threshold $ \xi $.A gain-based OW selection greedy algorithm is introduced, which is presented in Algorithm1.This paper greedily selects the opportunistic workers with the greatest utility increment until there is no opportunistic worker who can bring a utility increment greater than $ \xi $.The set of selected opportunistic workers is denoted as $ {S}^{k} $.

Algorithm1 Gain-based OW selection greedy algorithm

Input set of opportunistic workers $ {O}^{k} $, set of participatory workers $ {P}^{k} $, set of tasks $ \Gamma $

Output set of candidate opportunistic workers $ {S}^{k} $

Initialize $ {S}^{k}=\mathrm{\varnothing } $, $ \mathrm{g}\mathrm{a}\mathrm{i}\mathrm{n}=\mathrm{\infty } $, $ \mathrm{p}\mathrm{r}\mathrm{e}\mathrm{U}=0 $,

$ \mathrm{C}\mathrm{a}\mathrm{n}\mathrm{d}\mathrm{i}\mathrm{d}\mathrm{a}\mathrm{t}\mathrm{e}\mathrm{W}\mathrm{o}\mathrm{r}\mathrm{k}\mathrm{e}\mathrm{r}=\mathrm{n}\mathrm{u}\mathrm{l}\mathrm{l} $

while $ \mathrm{g}\mathrm{a}\mathrm{i}\mathrm{n} > \mathrm{\xi } $ do

MaxU = 0;

for $ \mathrm{e}\mathrm{a}\mathrm{c}\mathrm{h}{\mathrm{o}}_{\mathrm{i}}\in {\mathrm{O}}^{\mathrm{k}}/{\mathrm{S}}^{\mathrm{k}} $ do

if $ \mathrm{U}\mathrm{t}\mathrm{i}\mathrm{l}\mathrm{i}\mathrm{t}{\mathrm{y}}^{\mathrm{k}}\left({\mathrm{S}}^{\mathrm{k}}\cup \{{\mathrm{o}}_{\mathrm{i}}\right\}) > \mathrm{M}\mathrm{a}\mathrm{x}\mathrm{U} $ then

$ \mathrm{M}\mathrm{a}\mathrm{x}\mathrm{U}=\mathrm{U}\mathrm{t}\mathrm{i}\mathrm{l}\mathrm{i}\mathrm{t}{\mathrm{y}}^{\mathrm{k}}\left({\mathrm{S}}^{\mathrm{k}}\cup \{{\mathrm{o}}_{\mathrm{i}}\right\}) $;

$ \mathrm{C}\mathrm{a}\mathrm{n}\mathrm{d}\mathrm{i}\mathrm{d}\mathrm{a}\mathrm{t}\mathrm{e}\mathrm{W}\mathrm{o}\mathrm{r}\mathrm{k}\mathrm{e}\mathrm{r}={\mathrm{o}}_{\mathrm{i}} $;

$ \mathrm{g}\mathrm{a}\mathrm{i}\mathrm{n}=\mathrm{M}\mathrm{a}\mathrm{x}\mathrm{U}-\mathrm{p}\mathrm{r}\mathrm{e}\mathrm{U} $;

$ \mathrm{p}\mathrm{r}\mathrm{e}\mathrm{U}=\mathrm{M}\mathrm{a}\mathrm{x}\mathrm{U} $;

$ {\mathrm{S}}^{\mathrm{k}}={\mathrm{S}}^{\mathrm{k}}\bigcup \left\{\mathrm{C}\mathrm{a}\mathrm{n}\mathrm{d}\mathrm{i}\mathrm{d}\mathrm{a}\mathrm{t}\mathrm{e}\mathrm{W}\mathrm{o}\mathrm{r}\mathrm{k}\mathrm{e}\mathrm{r}\right\} $;

$ {\mathrm{O}}^{\mathrm{k}}={\mathrm{O}}^{\mathrm{k}}-\left\{\mathrm{C}\mathrm{a}\mathrm{n}\mathrm{d}\mathrm{i}\mathrm{d}\mathrm{a}\mathrm{t}\mathrm{e}\mathrm{W}\mathrm{o}\mathrm{r}\mathrm{k}\mathrm{e}\mathrm{r}\right\} $;

Result: $ {\mathrm{S}}^{\mathrm{k}} $

2.3 Task allocation model for participatory workers

Based on the set of candidate opportunistic workers $ {S}^{k} $, the actual set of uncompleted tasks $ {\Gamma }_{u}^{k} $ at the end of $ {R}_{k}^{\mathrm{o}} $ can be gotten.Among them, whether a task is urgent can be calculated according to the following formula:

$ {J}^{k}\left({T}_{s}\right)=\left\{\begin{array}{l}1, {\tau }_{s}^{k}-\left|{R}_{k}\right|-\left|{R}_{k}^{\mathrm{o}}\right| < \mu \\ 0, {\tau }_{s}^{k}-\left|{R}_{k}\right|-\left|{R}_{k}^{\mathrm{o}}\right|\ge \mu \end{array}\right. $

(9)

Where $ {T}_{s}\in {\Gamma }_{u}^{k} $, $ \mu $ refers to the minimal remaining time for participatory workers to complete the task in the next round.A task will be considered urgent in two cases: 1)The task will be terminated in the current round; 2)The time left for the participatory workers in the next round is less than the minimal remaining time $ \mu $.If $ {J}^{k}\left({T}_{s}\right)=1 $, the task is supposed to be allocated during $ {R}_{k}^{\mathrm{p}} $.Then a candidate task set for participatory workers will be gotten.First, this paper uses the hierarchical clustering algorithm to cluster the tasks so that a participatory worker can complete the tasks with a short distance to save the cost.The cluster task set is denoted as $ \mathrm{C}{\mathrm{T}}^{k} $, where $ \left|\mathrm{c}\mathrm{t}\right|\le \eta , \mathrm{c}\mathrm{t}\in \mathrm{C}{\mathrm{T}}^{k} $.

Then this paper introduces the improved KM algorithm^[19] to achieve task allocation in the $ {R}_{k}^{\mathrm{p}} $ phase.In this algorithm, participatory worker $ {p}_{j} $ is selected as the left node of the bipartite graph, and cluster task $ \mathrm{c}{\mathrm{t}}_{i} $ as the right node.Since only the perfect bipartite graph is fit for KM algorithm, this algorithm introduces a set of virtual task nodes $ \mathrm{V}{\mathrm{\Gamma }}^{k} $.The structure of the bipartite graph is shown in Fig.2, where the weights are set as follows:

$ {W}_{j, i}=\left\{\begin{array}{l}1-\frac{{D}_{j, i}}{\underset{{p}_{j}\in {P}^{k}, \mathrm{c}{\mathrm{t}}_{i}\in \mathrm{C}{\mathrm{\Gamma }}^{k}}{\mathrm{m}\mathrm{a}\mathrm{x}}\left({D}_{j, i}\right)+1}, \frac{D}{{v}_{j}} < \mathrm{m}\mathrm{i}\mathrm{n}({\tau }_{s}^{k}, |{R}_{k}\left|\right)-\left|{R}_{k}^{\mathrm{o}}\right|\\ 0, \frac{D}{{v}_{j}}\ge \mathrm{m}\mathrm{i}\mathrm{n}({\tau }_{s}^{k}, |{R}_{k}\left|\right)-\left|{R}_{k}^{\mathrm{o}}\right|\mathrm{ }\mathrm{o}\mathrm{r}\mathrm{ }\mathrm{c}{\mathrm{t}}_{i}\in \mathrm{V}{\mathrm{\Gamma }}^{k}\end{array}\right. $

(10)

	Download: JPG larger image
Fig. 2 Bipartite graph structure

The initial mark value of the left node is the maximal value of the weights for the edges connected to it.The initial mark value of the right node is 0.

Where $ \mathrm{m}\mathrm{a}\mathrm{t}\mathrm{c}\mathrm{h}\left[\mathrm{c}{\mathrm{t}}_{i}\right]=-1 $, it denotes that the task $ \mathrm{c}{\mathrm{t}}_{i} $ doesn't have a match node or the value of $ \mathrm{m}\mathrm{a}\mathrm{t}\mathrm{c}\mathrm{h}\mathrm{ }\left[\mathrm{c}{\mathrm{t}}_{i}\right] $ is the index of the matched worker.The function $ \mathrm{f}\mathrm{i}\mathrm{n}\mathrm{d}\mathrm{P}\mathrm{a}\mathrm{t}\mathrm{h} $ is to find a next match for the worker and update the values of $ \mathrm{m}\mathrm{a}\mathrm{t}\mathrm{c}\mathrm{h} $.

With the KM-based PW-task match algorithm(as shown in Algorithm2), the result of task allocation can be obtained and its cost can be calculated.Finally, this paper estimates the total cost of both opportunistic workers and participatory workers through adding up their costs during each round.

Algorithm2 KM-based PW-task match algorithm

Input cluster task set $ \mathrm{C}{\mathrm{T}}^{k} $, candidate opportunistic worker set $ {S}^{k} $, participatory worker set $ {P}^{k} $

Output total cost C

Initialize C = 0

for $ \mathrm{k}=1;\mathrm{k}\le \mathrm{t};\mathrm{k}++ $ do

$ \mathrm{C}=\mathrm{C}+\left|{\mathrm{S}}^{\mathrm{k}}\right|\times {\mathrm{I}}_{\mathrm{b}\mathrm{a}\mathrm{s}\mathrm{i}\mathrm{c}} $;

introduce virtual nodes to construct bipartite graph;

initialize $ \mathrm{m}\mathrm{a}\mathrm{t}\mathrm{c}\mathrm{h} $ and bipartite graph;

for $ \mathrm{e}\mathrm{a}\mathrm{c}\mathrm{h}{\mathrm{p}}_{\mathrm{j}}\in {\mathrm{P}}^{\mathrm{k}} $ do

while $ \mathrm{T}\mathrm{r}\mathrm{u}\mathrm{e} $ do

if $ \mathrm{m}\mathrm{a}\mathrm{t}\mathrm{c}\mathrm{h} $$ \left[\mathrm{c}{\mathrm{t}}_{\mathrm{i}}\right]==-1\left|\right| $$ \mathrm{f}\mathrm{i}\mathrm{n}\mathrm{d}\mathrm{P}\mathrm{a}\mathrm{t}\mathrm{h}\left(\mathrm{m}\mathrm{a}\mathrm{t}\mathrm{c}\mathrm{h}\right[\mathrm{c}{\mathrm{t}}_{\mathrm{i}}\left]\right) $ then

Break;

else

update the mark values of conflicted nodes;

for $ \mathrm{e}\mathrm{a}\mathrm{c}\mathrm{h}\mathrm{ }\mathrm{m}\mathrm{a}\mathrm{t}\mathrm{c}\mathrm{h}\mathrm{ }\mathrm{i}\mathrm{n}\mathrm{ }\mathrm{t}\mathrm{h}\mathrm{e}\mathrm{ }\mathrm{b}\mathrm{i}\mathrm{p}\mathrm{a}\mathrm{r}\mathrm{t}\mathrm{i}\mathrm{t}\mathrm{e}\mathrm{ }\mathrm{g}\mathrm{r}\mathrm{a}\mathrm{p}\mathrm{h} $ do

calculate the distance D and the number of tasks q in $ \mathrm{c}{\mathrm{t}}_{\mathrm{i}} $;

$ \mathrm{C}=\mathrm{C}+\mathrm{D}\times {\mathrm{I}}_{\mathrm{d}\mathrm{i}\mathrm{s}}+\mathrm{q}\times {\mathrm{I}}_{\mathrm{t}\mathrm{a}\mathrm{s}\mathrm{k}} $;

Result: C

3 Performance evaluation

In this section, the experimental results and analyses are presented.All experiments are carried out by Java on Windows platform, realized on an Intel^® Core^TM i5-8265U CPU and 8 GB memory laptop computer.

3.1 Baseline methods

In order to verify the performance of the proposed algorithm, this paper selects three typical different baseline methods to carry out comparison experiments.To be specific, the baseline methods this paper selected include the KM algorithm with pure participatory workers (BasicKM), the Greedy-genetic algorithm, and the Hy-greedy algorithm based on the hybrid mode.BasicKM takes the participatory workers and tasks as the node sets, and constructs the inverse function of the distance between them to carry out the maximum weight matching.The Greedy-genetic algorithm^[20] initializes the population by taking the greedy solution of task allocation as the initial solution.Hy-greedy algorithm greedily selects the opportunistic workers whose incremental utility is greater than the threshold, and assigns each remaining task to the nearest participatory worker to deal with.

3.2 Dataset processing

In the experiment, authors use the Brightkite dataset^{[4, 21]}.The dataset includes a total collection of 4 491 143 check-in information of users.This paper selects an area of 60 km×80 km near $ 40 $°N and $ 104 $°W, as shown in Fig.3, in which authors publish sensing tasks and recruit workers.Data of May 2009 in this region is for training, while the following ten days for test.This paper sets 8 a.m.to 5 p.m.the active time for opportunistic workers and 5 p.m.to 7 p.m.for participatory workers.Authors assume that for each task, there must exist a participatory worker who can get to the task in 2 h.During each round, this paper randomly generates sensing tasks within the area, and sets the deadline for tasks on the basis of standardized normal distribution.When workers reach a task within 1 km, it considers that the task has been completed.It selects workers with sufficient data in the test set as opportunistic workers.The speed of participatory workers is randomly classified into 7 km/h, 20 km/h and 40 km/h.Since the worker data set is relatively sparse, this paper supplements part of the worker information appropriately according to the principle of normal distribution.In addition, it sets $ {I}_{\mathrm{t}\mathrm{a}\mathrm{s}\mathrm{k}}=10, {I}_{\mathrm{b}\mathrm{a}\mathrm{s}\mathrm{i}\mathrm{c}}=20 $ $ \mathrm{a}\mathrm{n}\mathrm{d} $ $ {I}_{\mathrm{d}\mathrm{i}\mathrm{s}}=8 $.

	Download: JPG larger image
Fig. 3 Distribution of workers in the dataset

3.3 Experimental results

This paper uses the real dataset to compare the performance of the proposed algorithm with the baseline methods.The experimental results are shown in Fig.4.

	Download: JPG larger image
Fig. 4 Average cost of tasks vs. threshold for the incremental utility $ \mathit{\xi } $

Fig.4(a) shows the change of the optimal $ \xi $ under different $ {I}_{\mathrm{b}\mathrm{a}\mathrm{s}\mathrm{i}\mathrm{c}} $. It can be seen that the higher the cost of opportunistic workers is, the stricter the selection of opportunistic workers will be, and the larger the corresponding optimal $ \xi $ will be.With the increase of $ \xi $, the number of opportunistic workers decreases gradually, and the influence of $ {I}_{\mathrm{b}\mathrm{a}\mathrm{s}\mathrm{i}\mathrm{c}} $ on the cost decreases, leading to gradual convergence of the curve.Fig.4(b) shows the change in cost caused by $ \xi $ under different task-participant proportions.When $ \xi $ is small, the cost of selecting too many inappropriate opportunistic workers is high, while as $ \xi $ increases, the cost drops rapidly.However, when $ \xi $ is large, its increase will lead to the abandonment of suitable opportunistic workers, and the cost will gradually rise and finally reach the situation of pure participatory workers.

Fig.5 shows the performance of each algorithm under the condition of different numbers of participatory workers and tasks.It is illustrated that the proposed algorithm is much more efficient than Hy-greedy and BasicKM algorithm.And even the greedy-genetic algorithm is at a disadvantage compared to the proposed algorithm.

	Download: JPG larger image
Fig. 5 Average cost of tasks vs. the number of participatory workers

In Fig.6, authors pick out the best-performing greedy-genetic algorithm to conduct the further comparison experiments.It can be seen that the proposed algorithm delivers better performance under various conditions than baseline methods.

	Download: JPG larger image
Fig. 6 The proposed algorithm vs. Greedy-genetic algorithm

4 Conclusion

This paper puts forward a reliable hybrid two-phase task allocation algorithm.In the opportunistic workers phase, it predicts the probability of workers arriving at each location, and uses the greedy algorithm to calculate the utility to choose a group of candidate workers based on the geographical entropy.In the participatory workers phase, it employs hierarchical clustering to cluster the urgent tasks and assigns them to workers through the improved KM algorithm.Experiments based on the real-world dataset show that the proposed algorithm outperforms Hy-greedy, BasicKM and Greedy-genetic algorithms.

In this paper, the accurate geographic information of participants is assumed to be obtained in advance, which might not be achieved for privacy-sensitive participants^[22-23].In future work, authors will conduct further research on how to realize dynamic task allocation based on user privacy protection.

Reference

[1]	ZHAO J, ZHANG X T, LI J, et al. Research review of crowd intelligence 2.0[J]. Computer Engineering, 2019, 45(12): 1-7. (0)
[2]	CHEN Y Y, LÜ P, GUO D K, et al. A survey on task and participant matching in mobile crowd sensing[J]. Journal of Computer Science and Technology, 2018, 33(4): 768-791. DOI:10.1007/s11390-018-1855-y (0)
[3]	XIONG H Y, ZHANG D Q, WANG L Y, et al. EMC 3: energy-effcient data transfer in mobile crowdsensing under full coverage constraint[J]. IEEE Transactions on Mobile Computing, 2015, 14(7): 1355-1368. DOI:10.1109/TMC.2014.2357791 (0)
[4]	GONG W, ZHANG B X, LI C. Task assignment in mobile crowdsensing: present and future directions[J]. IEEE Network, 2018, 32(4): 100-107. DOI:10.1109/MNET.2018.1700331 (0)
[5]	WANG C W, LI C S, QIN C, et al. Maximizing spatial-temporal coverage in mobile crowd-sensing based on public transports with predictable trajectory[J]. International Journal of Distributed Sensor Networks, 2018, 14(8): 1-10. (0)
[6]	YUAN S, ZHOU C R, YANG Z Q, et al. Task allocation based on whale optimization algorithm in crowdsensing systems[J]. Computer Engineering and Design, 2020, 41(7): 2031-2037. (0)
[7]	JOHNSON S C. Hierarchical clustering schemes[J]. Psychometrika, 1967, 32(3): 241-254. DOI:10.1007/BF02289588 (0)
[8]	WANG J T, WANG L Y, WANG Y S, et al. Task allocation in mobile crowd sensing: state-of-the-art and future opportunities[J]. IEEE Internet of Things Journal, 2018, 5(5): 3747-3757. DOI:10.1109/JIOT.2018.2864341 (0)
[9]	YANG S, WU F, TANG S J, et al. Selecting most informative contributors with unknown costs for budgeted crowdsensing [C]//Proceedings of the 24th IEEE/ACM International Symposium on Quality of Service. Washington D. C., USA: IEEE Press, 2016: 1-6. (0)
[10]	GONG W, ZHANG B X, LI C. Location-based online task assignment and path planning for mobile crowdsensing[J]. IEEE Transactions on Vehicular Technology, 2019, 68(2): 1772-1783. DOI:10.1109/TVT.2018.2884318 (0)
[11]	GUO B, LIU Y, WU W L, et al. ActiveCrowd: a framework for optimized multitask allocation in mobile crowdsensing systems[J]. IEEE Transactions on Human-Machine Systems, 2017, 47(3): 392-403. DOI:10.1109/THMS.2016.2599489 (0)
[12]	XIAO J L, LI P, NIE L. A reliable multi-task allocation based on reverse auction for mobile crowdsensing [C]//Proceedings of WASA'20. Berlin, Germany: Springer, 2020: 529-541. (0)
[13]	WANG L Y, ZHANG D Q, WANG Y S, et al. Sparse mobile crowdsensing: challenges and opportunities[J]. IEEE Communications Magazine, 2016, 54(7): 161-167. DOI:10.1109/MCOM.2016.7509395 (0)
[14]	KANG Y R, MIAO X, LIU K B, et al. Quality-aware online task assignment in mobile crowdsourcing[C]//Proceedings of MASS'15. Washington D. C., USA: IEEE Press, 2015: 127-135. (0)
[15]	AKTER S, THU N T, YOON S. Deadline sensitive task assignment in mobile crowd sensing: a greedy approach [C]//Proceedings of ICTC'19. Washington D. C., USA: IEEE Press, 2019: 354-356. (0)
[16]	YU H, MIAO C Y, SHEN Z Q, et al. Quality and budget aware task allocation for spatial crowdsourcing[C]//Proceedings of AAMAS'15. Washington D. C., USA: IEEE Press, 2015: 1689-1690. (0)
[17]	GAO H, LIU C H, TIAN Y, et al. Ensuring high-quality data collection for mobile crowd sensing [C]//Proceedings of WCNC'17. Washington D. C., USA: IEEE Press, 2017: 1-6. (0)
[18]	WANG J T, WANG F, WANG Y S, et al. HyTasker: hybrid task allocation in mobile crowd sensing[J]. IEEE Transactions on Mobile Computing, 2020, 19(3): 598-611. DOI:10.1109/TMC.2019.2898950 (0)
[19]	GAO G J, XIAO M J, ZHAO Z H. Optimal multi-taxi dispatch for mobile taxi-hailing systems[C]//Proceedings of ICPP'16. Washington D. C., USA: IEEE Press, 2016: 294-303. (0)
[20]	XING Q, SUN X M, YUAN C M. Online task allocation strategy for spatial crowd-sourcing based on prediction algorithm[J]. Application Research of Computers, 2020, 37(3): 868-871. (0)
[21]	CHO E, MYERS S A, LESKOVEC J. Friendship and mobility: user movement in location-based social networks [C]//Proceedings of KDD'11. New York, USA: ACM Press, 2011: 1082-1090. (0)
[22]	JIN X, WANG T C, LÜ C M, et al. Privacy preservation algorithm of original data in mobile crowd sensing[J]. Journal of Computer Applications, 2020, 40(11): 3249-3254. (0)
[23]	LI M H, LI Y, FANG L M. ELPPS: an enhanced location privacy preserving scheme in mobile crowd-sensing network based on edge computing[C]//Proceedings of TrustCom'20. Washington D. C., USA: IEEE Press, 2021: 475-482. (0)