Chapter 1

A new information age

Abstract

Social sensing broadly refers to a set of sensing and data collection paradigms where data are collected from humans or from devices on their behalf. In this chapter, we first give an overview of social sensing as an emerging research field and identify the data reliability problem as a fundamental research challenge in this field. This challenge, if successfully addressed, engenders a paradigm shift in social sensing by allowing the development of dependable applications, with guaranteed correctness properties, that rely on the collective observations of untrained, average, and largely unreliable sources. Following the overview, we go over the motivations of social sensing applications and discuss several key challenges and state-of-the-art techniques centered on the data reliability problem. At the end of the chapter, we review the organization of the book, chapter by chapter.

Keywords

Social sensing

New information age

Introduction

Motivation

Challenges

State-of-the-art

Twenty years ago, your best bet to find information was to go to a library, search through a pile of index cards, find your way through rows of shelves, and borrow a book. Ten years ago, your best bet to let your friends know of your latest endeavors was to call them. These days are long gone.

One of the most remarkable advances of the last decade was the advent of an age of real-time broadcast. An author of this book remembers a recent ride in a hot air balloon that he happened to share with a newly-wed couple. They were taking a lot of pictures. When the ride was over, they were met by friends. The first question they asked their friends was: “Have you seen our air pictures on Facebook?” Indeed, information about the experience had already preceded them to the ground.

The age of modern broadcast is enabled by a confluence of three technological advances. First, we now have unprecedented connectivity on the move, apparently even while in a hot air balloon. Second, we have an increasingly rich set of options for broadcasting information. Twitter, Facebook, YouTube, Instagram, Flickr, and Foursquare are just a few examples. Finally, we live in a world of information acquisition (i.e., sensing) devices that we use on a daily basis. Cameras, GPS devices, Fitbits, smart watches, and Internet-connected cars are generating significant amounts of data that we may or may not choose to share.

A direct consequence of the age of broadcast is information overload. In point-to-point communication (such as a phone call), a minute consumed by the initiator corresponds to a minute consumed by the responder. Hence, a balance exists between the collective capacity consumed at sources and the collective capacity consumed at sinks. In contrast, on broadcast channels, for every minute consumed at the broadcast source, hours may be collectively consumed by the community of receivers. For example, if a 1-minute broadcast message is read by 1000 recipients, then 1000 minutes are collectively consumed at the receivers for the one minute spent at the source. This is fine when the number of sources is much smaller than the number of receivers (e.g., think of the number of radio stations compared to the number of listeners). However, in the current age of democratized real-time broadcast, everyone can be a source. A survey in 2014 suggested that more than 1 billion users were on Twitter, more than 1.3 billion on Facebook, and more than 1.5 billion on Google+. On Twitter, your message is visible to all. The underlying paradigm is one of global broadcast. Given that everyone can be a broadcast source, the balance between production and consumption is disrupted. The proliferation of sensing devices further exacerbates the imbalance. Consequently, a gap widens between the capacity of sources to generate information and the capacity of sinks to consume it. This widening gap heralds a new era of services whose main function is to summarize large amounts of broadcast data into a smaller amount of actionable information for receivers.

A key class of summarization services in the new age of real-time broadcast is services that attain situation awareness. Much of the information uploaded on social media constitutes acts of sensing. In other words, sources report observations they made regarding their physical environment and found worthwhile to comment on. We call this phenomenon social sensing. In their raw form, however, these observations are not very useful and generally lack reliability. There are conflicting sentiments, conflicting claims, missing data, purposeful pieces of misinformation, and other noise.

Table 1.1 shows examples of tweets collected from Syria on Twitter in August 2013, in the week following the sudden deaths of thousands of citizens in Ghouta, a suburb near the capital, Damascus. It can be seen that many different claims are posted, some of which are true while others are pure conjecture, rumors, or even intentionally planted misinformation. Hence, the use of social sensing data, such as tweets on Twitter, for enhancing situation awareness must be done with care. A key problem is to extract reliable information from large amounts of generally less reliable social sensing data. Recently, significant advances were made on this topic.

Table 1.1

A Twitter Example

Medecins Sans Frontieres says it treated about 3,600 patients with ‘neurotoxic symptoms’ in Syria, of whom 355 died http://t.co/eHWY77jdS0

Weapons expert says #Syria footage of alleged chemical attack “difficult to fake” http://t.co/zfDMujaCTV

U.N. experts in Syria to visit site of poison gas attack http://t.co/jol8OlFxnf via @reuters #PJNET

Syria Gas Attack: ‘My Eyes Were On Fire’ http://t.co/z76MiHj0Em

Saudis offer Russia secret oil deal if it drops Syria via Telegraph http://t.co/iOutxSiaRs

Long-term nerve damage feared after Syria chemical attack http://t.co/8vw7BiOxQR

Syrian official blames rebels for deadly attack http://t.co/76ncmy4eqb

Assad regime responsible for Syrian chemical attack, says UK government http://t.co/pMZ5z7CsNZ

Syrian Chemical Weapons Attack Carried Out by Rebels, Says UN (UPDATE) http://t.co/lN4CkUePUj #Syria http://t.co/tTorVFUfZF

US forces move closer to Syria as options weighed: WASHINGTON (AP) – U.S. naval forces are moving closer to Sy…http://t.co/F6UAAXLa2M

Putin Orders Massive Strike Against Saudi Arabia If West Attacks Syria http://t.co/SFLJ9ghwbt

400 tonnes of arms sent into #Syria through Turkey to boost Syria rebels after CW attack in Damascus – > http://t.co/KLwESYChCc

UN Syria team departs hotel as Assad denies attack http://t.co/O3SqPoiq0x

Vehicle of UN #Syria #ChemicalWeapons team hit by sniper fire. Team replacing vehicle & then returning to area.

International weapons experts leave Syria, U.S. prepares attack. More http://t.co/4Z62RhQKOE

Military strike on Syria would cause retaliatory attack on Israel, Iran declares http://t.co/M950o5VcgW

Asia markets fall on Syria concerns: Asian stocks fall, extending a global market sell-off sparked by growing …http://t.co/06A9h2xCnJ

Syria Warns of False Flag Chemical Attack!

UK Prime Minister Cameron loses Syria war vote (from AP) http://t.co/UlFF1wY9gx

We formally define social sensing as the act of collecting observations about the physical environment from humans or from devices acting on their behalf. Assessing data reliability is a fundamental research challenge in this field. This challenge, if successfully addressed, may engender a paradigm shift in situation awareness by allowing the development of dependable applications, with guaranteed correctness properties, that rely on the collective observations of untrained, average, and largely unreliable sources. In this chapter, we introduce this problem, go over the underlying technical enablers and motivations of social sensing applications, and discuss several key challenges and state-of-the-art solutions. We then review the organization of the book, chapter by chapter, to offer a reading guide for the remainder of the book.

1.1 Overview

The idea of leveraging the collective wisdom of the crowd has been around for some time [1, 2]. Today, massive amounts of data are continually being collected and shared (e.g., on social networks) by average individuals; these data may be used for a myriad of societal applications, from a global neighborhood watch to reducing transportation delays and improving the efficacy of disaster response. Little is analytically known about data validity in this new sensing paradigm, where sources are noisy, unreliable, erroneous, and largely unknown. This motivates a closer look into recent advances in social sensing, with an emphasis on the key problem faced by application designers; namely, how can reliable information be extracted from data collected from largely unknown and possibly unreliable sources? Novel solutions that leverage techniques from machine learning, information fusion, and data mining have recently made significant progress on this problem and are described in this book.

In situations where the reliability of sources is known, it is easy to compute the probability of correctness of different observations. Among other alternatives, one can use, say, Bayesian analysis to fuse data from sources of different (known) degrees of reliability. The distinguishing challenge in social sensing applications is that the reliability of sources is often unknown. For example, much of the chatter on social networks might come from users who are unknown to the data collection system. Hence, it is hard to assess the reliability of their observations. The same is true of situations where individuals download a smartphone app that allows them to contribute to a social sensing data collection campaign. If anyone is allowed to participate, the pool of sources is unvetted and the reliability of individual observers is generally unknown to the data collector. It is in this context that the problem of ascertaining data reliability becomes challenging. The challenge arises from the fact that one neither knows the sources, nor can immediately verify their claims. What can be rigorously said, in this case, about the correctness of collected data? More specifically, how can one jointly ascertain data correctness and source reliability in social sensing applications? The problem is of importance in many domains and, as such, touches upon several areas of active research.
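To make the known-reliability case concrete, the sketch below (a minimal illustration with hypothetical function and variable names, not a method from this book) applies Bayes' rule to fuse binary reports from sources whose reliabilities are known, assuming sources err independently:

```python
from math import prod

def posterior_true(reports, reliabilities, prior=0.5):
    """Posterior probability that a claim is true, given independent
    sources of KNOWN reliability (probability each report is correct)."""
    # Likelihood of the observed reports if the claim is true vs. false.
    like_true  = prod(r if said_true else 1 - r
                      for said_true, r in zip(reports, reliabilities))
    like_false = prod(1 - r if said_true else r
                      for said_true, r in zip(reports, reliabilities))
    num = like_true * prior
    return num / (num + like_false * (1 - prior))

# Three sources: two reliable ones confirm the claim, one weak one denies it.
p = posterior_true([True, True, False], [0.9, 0.8, 0.6])   # ~ 0.96
```

Note how the two reliable confirmations outweigh the weak denial; this simple computation is only possible because each source's reliability is given, which is precisely what social sensing cannot assume.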

In sensor networks, an important challenge has always been to derive accurate representations of physical state and physical context from possibly unreliable, non-specific, or weak proxies. Often one trades off quantity and quality. While individual sensors may be less reliable, collectively (using the ingenious analysis techniques published in various sensor network venues) they may yield reliable conclusions. Much of the research in that area focused on physical sensors. This includes dedicated devices embedded in their environment, as well as human-centric sensing devices such as cell phones and wearables. Recent research proposed the use of humans as sensors, which raises new challenges. Clearly, humans differ from traditional physical devices in many respects. Importantly for our reliability analysis, they lack a design specification and a reliability standard, making it hard to define a generic noise model for sources. Each human is an individual with different model parameters that predict how good that person’s observations are. Hence, many techniques that estimate the probability of error for sensors do not apply, since they assume the same error model for all sensors.

Humans also exhibit other interesting artifacts not common to physical sensors, such as gossiping. It is usually hard to tell where a particular observation originated. Even if we are able to unambiguously authenticate the source from whom we received an observation, it is hard to tell whether the source made that observation themselves or heard it from another. Hence, the original provenance of observations may remain uncertain. Techniques described in this book offer analytic means to determine source reliability and mitigate uncertainty in data provenance.

Reputation systems are another area of research where source reliability is the issue. The assumption is that, when sources are observed over time, their reliability is eventually uncovered. Social sensing applications, however, often deal with scenarios where a new event requires data collection from sources who have not previously participated in other data collection campaigns, or perhaps have not been “tested” in the unique circumstances of the current event. For example, a hurricane strikes New Jersey. This is a rare event. We do not know how accurate the individuals who fled the event are at describing the damage left behind. No reputation has been accumulated for them in such a scenario. Yet, it would be desirable to leverage their collective observations to deploy help in a more efficient and timely manner. How do we determine which observations to believe?

In cyber-physical systems (CPS) research, an important emphasis has always been on ensuring validity and on proving that systems meet specifications [3-5]. The topics of reliability, predictability, and performance guarantees receive much attention. While past research on CPS addressed the correctness of software systems (even in the presence of unverified code), in today’s data-driven world a key emerging challenge is to ascertain the correctness of data (even in the presence of unverified sources). The challenge is prompted by the need to account for the humans in the loop. Humans are the drivers in transportation systems, the occupants in building energy management systems, the survivors and first responders in disaster response systems, and the patients in medical systems. It makes sense to utilize their input when trying to assess system state. For example, one could get a more accurate account of the current vehicular traffic state, and a more accurate prediction of its future evolution, if driver input were taken into account in some global, real-time, and automated fashion. This assumes that the inputs are reliable, which is not always the case. The reliable social sensing problem, if solved, would enable the development of dependable applications in the domains of transportation, energy, disaster response, and military intelligence, among others, where correctness is guaranteed despite reliance on the collective observations of untrained, average, and largely unreliable sources.

Techniques described in this book are also of relevance to business analytics applications, where one is interested in making sense out of large amounts of unreliable data. These techniques can thus serve applications in social networks, big data, and human-in-the-loop systems, and leverage the proliferation of computing artifacts that interact with or monitor the physical world. The goal of this book is to offer the needed theoretical foundations that exploit advances in social sensing analytics to support emerging data-driven applications. The book also touches on contemporary issues, such as privacy, in a highly interconnected and instrumented world.

1.2 Challenges

Social sensing described in this book broadly refers to three types of data collection: (i) participatory sensing, (ii) opportunistic sensing, and (iii) social data scavenging. In participatory sensing, individuals are explicitly and actively involved in the sensing process, and choose to perform some critical operations (e.g., operating the sensors, performing certain tasks) to meet application requirements. In opportunistic sensing, individuals are passively involved, for example, by pre-authorizing their sensing devices to share information on their behalf whenever contacted by the indicated data collection agent [6]. Social data scavenging refers to a sensing paradigm where individuals remain unaware of the data collection process. An example is where social networks are treated as sensor networks: public data posted on social networks (e.g., Twitter) are searched for relevant items. In social data scavenging, the participants “agree” to the fact that their posts are in the public domain, but they are simply unaware of how the public may actually use their information.

A classical example of a social sensing application is a geo-tagging campaign, where participants report conditions in their environment that need attention (e.g., litter in public parks), tagged by location. More recent examples of social sensing applications include enhancing real-time situation awareness in the aftermath of disasters using online social media, and traffic condition prediction using GPS traces collected from smartphones.

We henceforth call the challenge of jointly ascertaining the correctness of collected data and the reliability of data sources the reliable social sensing problem. In traditional sensing scenarios, such as participatory and opportunistic sensing, data sources are sensors, typically in human possession. In the data scavenging scenario, individuals are modeled as sensors (data sources) who occasionally make observations about the physical world. These observations may be true or false, and hence can be viewed as binary claims.

The term participant (or source) reliability denotes the odds that a source reports a correct observation. Reliability may be impaired by poor sensor quality, lack of sensor calibration, lack of (human) attention to the task, or even an intent to deceive. Data collection is often open to a large population, where it is impossible to screen all participants (or information sources) beforehand. The likelihood that a participant’s measurements are correct is usually unknown a priori. Consequently, it is challenging to ascertain the correctness of the collected data. It is also challenging to ascertain the reliability of each information source without knowing whether their collected data are correct or not. Therefore, the main questions posed in the reliable social sensing problem are (i) whether one can determine, given only the measurements sent and without knowing the reliability of sources, which of the reported observations are true and which are not, and (ii) how reliable each source is, without independent ways to verify the correctness of their measurements. This also requires one to address how reliability is formally quantified.

1.3 State of the Art

Prior research on social sensing can be classified into three main categories (i.e., discount data fusion, trust and reputation systems, and fact-finding techniques) based on whether prior knowledge of source reliability and of claim correctness (or credibility) is available to the application. We discuss the state-of-the-art techniques in these categories in detail below.

1.3.1 Efforts on Discount Fusion

When we have prior knowledge about the reliability of the sources but no prior knowledge of the correctness or credibility of the claims (information), one can filter the noise in the claims via fusion. This is a classic case for the target tracking community, where disparate sensor systems (the sources) generate tracks that must be combined. The tracks are estimates of the kinematic state of the target, and the reliability of the sources is expressed as a state covariance error for the tracks. The expression for the fused state estimate, as a function of the tracks and error covariances of all the sensors, is well known [7-11]. When the sensor tracks are uncorrelated, the fused track can be interpreted as a weighted average of the sensor tracks, where the weights are proportional to the inverse of the error covariances for the sensors. The general expression for correlated tracks is slightly more complicated, but it is reasonable to interpret the track fusion process as discounting the tracks based on their reliability, followed by a combining process. The difficulty in track fusion is determining which tracks from the various sensors should be associated for the fusion process. Techniques for track-to-track association do exist [12, 13], but they rely on understanding the correlation of tracks from different sensors. Unfortunately, it is not known how to determine this correlation when the tracks are formed for each sensor in a distributed manner.
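For the uncorrelated case, the inverse-covariance weighting described above can be sketched in a few lines (a scalar, one-dimensional illustration with hypothetical names; real track fusion operates on state vectors and covariance matrices):

```python
def fuse_tracks(estimates, variances):
    """Fuse uncorrelated track estimates by inverse-variance weighting:
    each track is discounted by its error covariance before combining."""
    weights = [1.0 / v for v in variances]
    fused_var = 1.0 / sum(weights)                # fused error covariance
    fused_est = fused_var * sum(w * x for w, x in zip(weights, estimates))
    return fused_est, fused_var

# Two sensors estimate a target position; the second is twice as noisy,
# so its track is discounted more heavily in the fused estimate.
est, var = fuse_tracks([10.0, 13.0], [1.0, 2.0])   # est = 11.0
```

Note that the fused variance is smaller than that of either individual track: combining discounted sources yields a conclusion more reliable than any single source, which is the quantity-for-quality tradeoff mentioned earlier.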

In the information fusion community, belief theory provides the mechanism to combine evidence from multiple, possibly conflicting, sources [14-16]. The concept of discounting beliefs based upon source reliability before fusion goes back to Shafer [14]. Recently, subjective logic has emerged as a means to reason over conflicting evidence [17]. Subjective opinions are formed from evidence observed by individual sources. When incorporating multiple opinions, the subjective opinions need to be discounted, similarly to Dempster-Shafer theory, before consensus fusion. In essence, this form of discount fusion can be interpreted as a weighted sum of evidence, where the weights are proportional to the source reliabilities. The consensus fusion operation in subjective logic assumes that the evidence used to form the subjective opinions of the sources is independent. Current research is investigating the proper fusion rule when the sources incorporate correlated evidence.
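A minimal sketch of discounting followed by combination, assuming a binary frame of discernment {T, F} and hypothetical function names (full Dempster-Shafer and subjective-logic implementations operate over arbitrary frames):

```python
def discount(m, alpha):
    """Shafer discounting on a binary frame {T, F}: scale belief mass by
    source reliability alpha; the remainder moves to uncertainty (Theta)."""
    mT, mF, mU = m
    return (alpha * mT, alpha * mF, 1 - alpha + alpha * mU)

def dempster(m1, m2):
    """Dempster's rule of combination for two mass assignments on {T, F}."""
    (aT, aF, aU), (bT, bF, bU) = m1, m2
    conflict = aT * bF + aF * bT          # mass assigned to the empty set
    k = 1 - conflict                      # normalization constant
    mT = (aT * bT + aT * bU + aU * bT) / k
    mF = (aF * bF + aF * bU + aU * bF) / k
    return (mT, mF, 1 - mT - mF)

# Two sources both assert the claim, but the second is only 50% reliable.
m1 = discount((1.0, 0.0, 0.0), 0.9)   # ~ (0.9, 0.0, 0.1)
m2 = discount((1.0, 0.0, 0.0), 0.5)   # ~ (0.5, 0.0, 0.5)
fused = dempster(m1, m2)
```

The fused belief in T exceeds that of either discounted source alone, illustrating how corroboration accumulates evidence even from partially reliable sources.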

1.3.2 Efforts on Trust and Reputation Systems

When we have prior knowledge of the correctness or credibility of claims (information) but no prior knowledge of the reliability of sources, a large body of work on trust and reputation systems aims to assess the reliability of sources (e.g., the quality of providers) [18-21]. The basic idea of reputation systems is to let entities rate each other (e.g., after a transaction) or review objects of common interest (e.g., products or dealers), and use the aggregated ratings to derive trust or reputation scores of both sources and objects in the systems [18]. These reputation scores can help other entities decide whether or not to trust a given entity or purchase a certain object [19]. Trust and reputation scores can be obtained from both individual and social perspectives [22, 23]. Individual trust often comes from experiences of direct interaction with transaction partners, while social trust is computed from third-party experiences, which might include both honest and misleading opinions. Different types of reputation systems are used successfully in commercial online applications [24-27]. For example, eBay operates a reputation system based on a homogeneous peer-to-peer model, which allows peers to rate each other after each transaction between them [24, 25]. The Amazon online review system represents another type of reputation system, where different sources offer reviews on products (or brands, companies) they have experienced [26, 27]. Customers are influenced by those reviews (or reputation scores) in making purchase decisions. Various techniques and models have also been developed to detect deceitful behaviors of participants [28, 29] and to identify discriminating attitudes and fraudulent activities [30, 31] in trust and reputation systems, in order to provide reliable service in an open and dynamic environment. Recent work has investigated the consistency of reports to estimate and revise trust scores in reputation systems [32, 33].
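As one concrete illustration of rating aggregation (a sketch in the spirit of Beta-family reputation systems, with hypothetical names; it is not the exact mechanism of eBay, Amazon, or any system cited above), a source's reputation can be taken as the posterior mean of a Beta distribution updated with positive and negative ratings:

```python
def beta_reputation(positive, negative):
    """Expected reliability under a Beta(positive+1, negative+1) posterior:
    start from a uniform prior over reliability and update it with each
    positive or negative rating received by the source."""
    return (positive + 1) / (positive + negative + 2)

# A seller with 48 positive and 2 negative ratings.
score = beta_reputation(48, 2)   # 49/52 ~ 0.94
```

A useful property of this scheme is that an unrated newcomer scores 0.5 (maximal uncertainty) rather than 0 or 1, and a few ratings move the score only gradually, which dampens the effect of isolated misleading reviews.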

1.3.3 Efforts on Fact-Finding

Given no prior knowledge of the reliability of sources or the credibility of their claims (information), there exists substantial work on techniques referred to as fact-finders within the data mining and machine learning communities that jointly compute source reliability and claim credibility. The inspiration for fact-finders can be traced to Google’s PageRank [34]. PageRank iteratively ranks the credibility of pages on the Web by considering the credibility of the pages that link to them. Fact-finders estimate the credibility of claims from the reliability of the sources that make them, and then estimate the reliability of sources based on the credibility of their claims. Hubs and Authorities [35] established a basic fact-finder model, based on linear assumptions, to compute scores for sources and the claims they assert. Yin et al. introduced TruthFinder as an unsupervised fact-finder for trust analysis on a providers-facts network [36]. Other fact-finders enhanced these basic frameworks by incorporating analysis of properties [37-39] or dependencies within claims or sources [40-43]. More recent work introduced new fact-finding algorithms designed to handle background knowledge [44, 45] and multi-valued facts [46], provide semantics for the credibility scores [47], and use slot filling systems for multi-dimensional fact-finding [48]. A comprehensive survey of fact-finders used in the context of trust analysis of information networks can be found in [49].
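The iterative structure shared by many fact-finders can be sketched as follows (a simplified Sums-style iteration in the spirit of Hubs and Authorities, with hypothetical names and claim labels; the published fact-finders cited above differ in their update rules and normalizations):

```python
def fact_finder(claims_by_source, iterations=20):
    """Jointly estimate source trustworthiness and claim credibility with
    no prior knowledge of either: a claim's credibility is the (normalized)
    sum of the trustworthiness of sources asserting it, and a source's
    trustworthiness is the average credibility of its claims."""
    sources = list(claims_by_source)
    claims = {c for cs in claims_by_source.values() for c in cs}
    trust = {s: 1.0 for s in sources}
    cred = {c: 1.0 for c in claims}
    for _ in range(iterations):
        # Update claim credibility from source trustworthiness.
        for c in claims:
            cred[c] = sum(trust[s] for s in sources if c in claims_by_source[s])
        top = max(cred.values())
        for c in claims:
            cred[c] /= top                # normalize to avoid divergence
        # Update source trustworthiness from claim credibility.
        for s in sources:
            trust[s] = sum(cred[c] for c in claims_by_source[s]) / len(claims_by_source[s])
    return trust, cred

# Three sources; "gas_attack" is corroborated, "false_flag" is not.
trust, cred = fact_finder({
    "s1": {"gas_attack", "un_visit"},
    "s2": {"gas_attack"},
    "s3": {"false_flag"},
})
```

After a few iterations, the corroborated claim and the sources asserting it reinforce each other, while the uncorroborated claim and its lone source sink together, which is exactly the mutual-reinforcement principle inherited from PageRank-style link analysis.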

The book reviews in great detail a comprehensive analytical framework that optimally (in the sense of maximum likelihood estimation, MLE) solves the reliable social sensing problem and rigorously analyzes the accuracy of the results, offering correctness guarantees on solid theoretical foundations [50]. It is the notion of quantified correctness guarantees in social sensing that sets the purpose of this book apart from other reviews of data mining, social networks, and sensing literature. The developed techniques can be applied to a new range of reliable social sensing applications, where assurances of data correctness are needed before such data can be used by the application, in order to meet higher level dependability guarantees.

1.4 Organization

The rest of this book familiarizes the reader with recent advances in addressing the reliable social sensing problem. These contributions help establish the foundations of building reliable systems on unreliable data to support social sensing applications with correctness guarantees.

 Chapter 2 reviews emerging social sensing trends and applications. In this chapter, we outline several recent technical trends that herald the era of social sensing. Some early research in participatory and opportunistic sensing is first reviewed. We then describe social data scavenging, a new social sensing data collection paradigm that is motivated by the popularity of online social networks and large information dissemination opportunities. At the end of the chapter, we discuss the prospective future of social sensing.

 Chapter 3 reviews mathematical foundations and basic technologies that we will use in this book. These foundations include basics of information networks, Bayesian analysis, MLE, expectation maximization (EM), as well as bounds and confidence intervals in estimation theory. The chapter concludes with an analytical framework that allows us to put all these foundations together.

 Chapter 4 reviews the state-of-the-art fact-finders with an emphasis on an analytically-founded Bayesian interpretation of the basic fact-finding scheme that is popularly used in the data mining literature to rank both sources and their asserted information based on credibility values. This interpretation enables the calculation of correct probabilities of conclusions resulting from information network analysis. Such probabilities constitute a measure of quality of information (QoI), which can be used to directly quantify participant reliability and measurement correctness in a social sensing context.

 Chapter 5 reviews an MLE scheme that casts the reliable social sensing problem into an EM framework, which can be solved optimally and efficiently. The EM algorithm makes inferences regarding both participant reliability and measurement correctness by observing which observations coincide and which do not. It was shown to be very accurate and to outperform state-of-the-art fact-finders.

 Chapter 6 reviews the work to obtain confidence bounds on the MLE presented in Chapter 5. In particular, we first set up the reliability assurance problem in social sensing and then review real and asymptotic confidence bounds on the participant reliability estimation of the EM scheme. The confidence bounds are computed using the Cramér-Rao lower bound (CRLB) from estimation theory. In addition to the confidence bounds, we present a rigorous sensitivity analysis of such confidence bounds as a function of information network topology. This analysis offers a fundamental understanding of the capabilities and limitations of the MLE model. Finally, we review a real-world case study that evaluates the confidence bounds presented in this chapter.

 Chapter 7 reviews a generalization of the basic MLE model presented in previous chapters to address conflicting observations and non-binary claims. The basic model presented in Chapter 5 assumes only corroborating observations from participants. In this chapter, we review the work that generalizes the basic model to solve more complex problems where conflicting observations exist. This effort was motivated by the fact that observations from different participants in social sensing applications may be mutually contradicting. A real-world case study is presented to evaluate the generalized model in a real social sensing application. Another assumption of the basic model is that claims are binary. We review the work that further generalizes the theory to conflicting observations and non-binary claims.

 Chapter 8 reviews recent work that leverages the understanding of the underlying information dissemination topology to better solve the reliable social sensing problem. What makes this sensing problem formulation different is that, in the case of human participants, not only is the reliability of sources usually unknown but also the original data provenance may be uncertain. Individuals may report observations made by others as their own. Therefore, we review a novel abstraction that models humans as sensors of unknown reliability generating binary measurements of uncertain provenance. This human sensor model considers the impact of information sharing among human participants through social networks on the analytical foundations of reliable social sensing. We also review a real-world case study that evaluates the human sensor model using Twitter as the experimental platform.

 Chapter 9 reviews work that explores physical dependencies between observed variables to improve the estimation accuracy of reliable social sensing. The observed variables reported in social sensing applications that describe the state of the physical world usually have inherent dependencies. We review a cyber-physical approach that exploits the physical constraints to compensate for unknown source reliability. These physical constraints shape the likelihood function that quantifies the odds of the observations at hand. We show that the maximum likelihood estimate obtained by this new approach is substantially more accurate than one that does not take physical constraints into account. At the end of the chapter, we review a real-world case study to showcase the cyber-physical approach in a crowd-sensing application.

 Chapter 10 reviews the work on recursive fact-finding that is designed to address real-time streaming data challenges in social sensing. The original EM scheme is an iterative algorithm that is mainly designed to run on static data sets. However, such computation is not suited for streaming data, because the algorithm would need to be re-run on the whole dataset from scratch every time the dataset is updated. We review a recursive EM algorithm for streaming data that considers incremental data updates and revises past results in view of new data in a recursive way. The recursive EM algorithm was shown to achieve a favorable tradeoff between estimation accuracy and algorithm execution time.

 Chapter 11 points readers to further readings in related areas of social sensing. These areas include estimation theory, data quality, trust analysis, outlier and attack detection, recommender systems, surveys, and opinion polling. The point of this chapter is to help readers better place the work reviewed in this book in the context of the broader literature from different communities. Readers are encouraged to investigate beyond the recommended work, identify new problems, and advance the state of the art in social sensing.

 Chapter 12 concludes the book with a summary of the theories, techniques, and methods presented. It gives readers an opportunity to recap the key problems and solutions of each chapter, as well as the overall picture of the book. At the end of the chapter, we also outline remaining challenges that need to be addressed in future social sensing applications. These directions can potentially serve as topics for future graduate theses, course projects, and publications.

With this book, we hope to bridge several communities. For CPS researchers, we draw a parallel between system reliability and data reliability, and argue for the growing importance of investigating the latter to support crowd-sensing, humans-in-the-loop applications, and the emerging data-driven CPS applications of the foreseeable future; this should encourage collaboration among the CPS, machine learning, and data mining fields. For sensor network researchers, we offer an analogy between sensors and humans who happen to share their observations: the book presents a model of humans as sensors that allows rigorous data fusion theory to be applied while capturing essential properties of human behavior. For individuals involved in business analytics, we offer a range of solutions that extract reliable, actionable information from large amounts of unreliable data. We hope this encourages collaboration among sensor network, social network, and data fusion researchers in both academia and industry. The outcome could be to disseminate knowledge on building new sensing and data analytics systems that effectively combine inputs from humans and machines while offering rigorous reliability guarantees, similar to those obtained from physical (hard) data fusion and signal analysis.


