Publications

Background: Anaphylaxis is a potentially fatal allergic reaction. However, many patients at risk of anaphylaxis who should permanently carry a life-saving epinephrine auto-injector (EAI) do not have one at the moment of allergen exposure. The proximity-based emergency response community (ERC) strategy aims to speed EAI delivery by alerting patient-peers who carry an EAI to respond and hand their EAI to a nearby patient in need.

Objectives: This study had two objectives: (1) to analyze 10,000 anaphylactic events from the European Anaphylaxis Registry (EAR) by elicitor and location in order to determine typical anaphylactic scenarios and (2) to identify patients’ behavioral and spatial factors influencing their response to ERC emergency requests through a scenario-based survey.

Methods: Data were collected and analyzed in two phases: (1) clustering 10,000 EAR records by elicitor and incident location and (2) conducting a two-center, scenario-based survey, in Israel and Germany, of adults and parents of minors with severe allergy who were prescribed an EAI. Each group received a four-part survey that examined the effect of two behavioral constructs (shared identity and diffusion of responsibility) and two spatial factors (emergency time and emergency location), in addition to sociodemographic data. We performed descriptive statistics, linear correlation, analysis of variance, and t tests to identify the factors in patients' decisions to respond to ERC alerts.
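The tests named in the methods (linear correlation, ANOVA, and t tests) can be illustrated with SciPy; the scores below are synthetic random data and the group labels are hypothetical stand-ins, not data from the study:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical Likert-style scores (1-7) for two survey constructs
shared_identity = rng.integers(1, 8, size=120)
willingness = np.clip(shared_identity + rng.normal(0, 1.5, 120), 1, 7)

# Linear correlation between shared identity and willingness to respond
r, p = stats.pearsonr(shared_identity, willingness)

# t test comparing two independent groups (e.g., parents vs. adult patients)
parents = rng.normal(5.5, 1.0, 60)
adults = rng.normal(4.8, 1.2, 60)
t, p_t = stats.ttest_ind(parents, adults)

# One-way ANOVA across three hypothetical emergency-location conditions
home, park, unfamiliar = (rng.normal(m, 1.0, 40) for m in (5.8, 5.2, 4.1))
f, p_f = stats.f_oneway(home, park, unfamiliar)

print(round(r, 2), round(t, 2), round(f, 2))
```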

Results: A total of 53.1% of EAR cases were triggered by food at patients' homes, and 46.9% were triggered by venom at parks. Further, 126 Israeli and 121 German participants completed the survey and met the inclusion criteria. Of the Israeli participants, 80% were parents of minor patients at risk of anaphylaxis due to food allergy; their mean age was 32 years, and 67% were women. The remaining 20% were adult patients with a mean age of 21 years, of whom 48% were female. All 121 German participants were adults, with a mean age of 47 years, and 63% were women; 21% were allergic to food, 75% to venom, and 2% to drugs. The overall willingness to respond to ERC events was high. Shared identity and willingness to respond were positively correlated (r=0.51, P<.001) in the parent group. Parents had a stronger sense of shared identity than adult patients (t243=-9.077, P<.001). The bystander effect decreased the willingness of all patients, except the parent group, to respond (F1,269=28.27, P<.001). An interaction between emergency location and time (F1,473=77.304, P<.001) revealed lower willingness to respond in unfamiliar locations during nighttime.

Conclusions: An ERC allergy app has the potential to improve outcomes in anaphylactic events, but this depends on patient-peers' willingness to respond. Through a two-stage process, our study identified behavioral and spatial factors that could influence the willingness to respond, providing a basis for future research on proximity-based mHealth communities.

Background: Medical emergencies such as anaphylaxis may require immediate use of emergency medication. Because of chronic patients' low adherence (ie, to carrying anti-anaphylactic medication) and the potentially long response time of emergency medical services (EMS), alternative approaches to providing immediate first aid are required. A smartphone-based emergency response community (ERC) was established for patients with allergies to enable members to share their automatic adrenaline injector (AAI) with other patients who do not have their AAI at the onset of anaphylactic symptoms. The community is operated by a national EMS. In the first stage of the trial, children with food allergies and their parents were invited to join.

Objective: This study aimed to identify the factors that influence the willingness to join an ERC for a group of patients at risk of anaphylaxis.

Methods: The willingness to join an ERC was studied from different perspectives: the willingness of children with severe allergies to join an ERC, the willingness of their parents to join an ERC, the willingness of parents to enroll their children in an ERC, and the opinions of parents and children about the minimum age to join an ERC. Several types of independent variables were used: demographics, medical data, adherence, parenting style, and children's autonomy. A convenience sample of children and their parents who attended an annual meeting of a nonprofit organization for patients with food allergies was used.

Results: A total of 96 questionnaires, 73 by parents and 23 by children, were collected. Response rates were approximately 95%. Adherence was high: 22 out of 23 children (96%) and 22 out of 52 parents (42%) had their AAI when asked. Willingness to join the community was high among parents (95%) and among children (78%). Willingness of parents to enroll their children was 49% (36/73). The minimum age to join an ERC was 12.27 years (SD 3.02) in the parents’ opinion and 13.15 years (SD 3.44) in the children’s opinion.

Conclusions: Parents’ willingness to join an ERC was negatively correlated with parents’ age, child’s age, and parents’ adherence. This can be explained by the free-rider effect: parents who carried an AAI for their young child, but had low adherence, wanted to join the ERC to get an additional layer of emergency response. Children’s willingness to join the community was positively correlated with age and negatively correlated with the child’s emotional autonomy. Parents’ willingness to enroll their children in an ERC was positively correlated with child’s age and negatively correlated with parents’ adherence: again, this can be explained by the aforementioned free-rider effect. Parents’ and children’s opinions about the minimum age to join an ERC were negatively correlated with protective parenting style and positively correlated with monitoring parenting style.

Text mining has gained great momentum in recent years, with user-generated content becoming widely available. One key use is comment mining, with much attention given to sentiment analysis and opinion mining. An essential step in comment mining is text pre-processing, in which each linguistic term is assigned a weight that commonly increases with its frequency in the studied text, yet is offset by the frequency of the term in the domain of interest. A common practice is to use the well-known tf-idf formula to compute these weights. This paper reveals the bias that between-participants' discourse introduces into the study of comments in social media, and proposes an adjustment. We find that content extracted from discourse is often highly correlated, resulting in dependency structures between observations in the study and thus introducing a statistical bias. Ignoring this bias can manifest in a non-robust analysis at best and can lead to an entirely wrong conclusion at worst. We propose an adjustment to tf-idf that accounts for this bias. We illustrate the effects of both the bias and the correction with data from seven Facebook fan pages, covering different domains, including news, finance, politics, sport, shopping, and entertainment.
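For reference, the standard tf-idf weighting that the paper adjusts can be sketched from scratch; the toy comments are invented for illustration:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return per-document term weights: tf(t, d) * idf(t).

    tf is the raw count of a term in a document; idf discounts terms
    that appear in many documents of the collection.
    """
    n = len(docs)
    df = Counter()                      # document frequency per term
    tokenized = [doc.lower().split() for doc in docs]
    for tokens in tokenized:
        df.update(set(tokens))
    weights = []
    for tokens in tokenized:
        tf = Counter(tokens)
        weights.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return weights

# Toy comment thread: "great" appears in every comment, so its idf is zero
comments = ["great product", "great price", "great support"]
w = tf_idf(comments)
print(w[0]["great"])     # 0.0 -- a domain-wide term is fully offset
```

The bias the paper describes arises when such comments are replies to one another and therefore correlated, which this independent-document weighting scheme does not account for.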

Academics and policymakers alike are concerned with the potential impact and repercussions of artificial intelligence on our lives and the world we live in. In the light of the inherent chasm between human intelligence and artificial intelligence logics, the inevitable need to integrate human and artificial intelligence into symbiotic forms is particularly challenging to Information Systems researchers and designers. This panel aims to explore meaningful research directions on human-artificial intelligence, which could lead to a better understanding of its impact and better designs. Building on their expertise in design, HCI, AI, and generative systems, the panelists will explore the following challenges:
- What is unique in the combination of human and artificial intelligence compared with systems that are solely based on one or the other?
- Can we and should we insist on a similar range of considerations when studying and designing systems based on human-augmented artificial intelligence as we do when studying and designing systems based solely on human intelligence?
- Can performance improvements expected of human-artificial intelligence, compared with AI, be effectively studied independently of considerations such as control and trust?
The panel will seek to evoke provocative ideas and generative thinking that can initiate research on the relationship between human and artificial intelligence in the IS discipline and perhaps also contribute to the general discourse thereof.

Fatal overdoses are a hallmark of the opioid epidemic that has been devastating communities throughout the United States for decades. Philadelphia has been particularly impacted, with a drug overdose death rate of 46.8 per 100,000 individuals, far surpassing other large cities' rates. Despite city and community efforts, this rate continues to increase, indicating the need for new, more effective approaches to mitigating and combating this issue. Through a human-centered design process, we investigated motivators and barriers to participation in a smartphone-based system that mobilizes community members to administer emergency care to individuals experiencing an overdose. We discuss evidence of the system's feasibility, and how it would benefit from integration with existing community-based efforts.

Drug shortages have been identified as a public health problem in an increasing number of countries. They can negatively impact the quality and efficiency of patient care, as well as contribute to increases in the cost of treatment and the workload of health care providers. Shortages also raise ethical and political issues. The scientific evidence on drug shortages is still scarce, but many lessons can be drawn from cross-country analyses. The objective of this study was to characterize, compare, and evaluate the current systemic measures and legislative and organizational frameworks aimed at preventing or mitigating drug shortages within health care systems across a range of European and Western Asian countries. The study design was retrospective, cross-sectional, descriptive, and observational. Information was gathered through a survey distributed among senior personnel from ministries of health, state medicines agencies, local health authorities, other health or pharmaceutical pricing and reimbursement authorities, health insurance companies, and academic institutions, with knowledge of the pharmaceutical markets in the 28 countries studied. Our study found that formal definitions of drug shortages currently exist in only a few countries. The characteristics of drug shortages, including their assortment, duration, frequency, and dynamics, were found to be variable and sometimes difficult to assess. Numerous information hubs were identified. Providing public access to information on drug shortages to the maximum possible extent is a prerequisite for performing more advanced studies on the problem and identifying solutions. Widespread measures include imposing public service obligations, formally allowing the prescription of unlicensed medicines, and temporarily banning parallel exports. A positive finding of our study was the identification of numerous bottom-up initiatives and organizational frameworks aimed at preventing or mitigating drug shortages.
The experiences and lessons drawn from these initiatives should be carefully evaluated, monitored, and presented to a wider international audience for careful appraisal. To be able to find solutions to the problem of drug shortages, there is an urgent need to develop a set of agreed definitions for drug shortages, as well as methodologies for their evaluation and monitoring.

Smartphone applications to support healthcare are proliferating. A growing and important subset of these apps supports emergency medical intervention, addressing a wide range of illness-related emergencies in order to speed the arrival of relevant treatment. This study focuses on the emergency response characteristics and strategies employed by these apps, resulting in an mHealth Emergency Strategy Index (MESI). While a growing body of knowledge focuses on the usability, safety, and privacy aspects that characterize such apps, studies that map the various emergency intervention strategies and suggest criteria to evaluate their role as emergency agents are limited. We survey an extensive range of mHealth apps designed for emergency response, along with the related assessment literature, and present an index for mobile-based medical emergency intervention apps that can address the assessment needs of future mHealth apps.

Confidential information is all too easily leaked by naive users posting comments. In this paper we introduce DUIL, a system for Detecting Unintentional Information Leakers. The value of DUIL is in its ability to detect those responsible for information leakage that occurs through comments posted on news articles in a public environment, when those articles have withheld material non-public information. DUIL is comprised of several artefacts, each designed to analyse a different aspect of this challenge: the information, the user(s) who posted the information, and the user(s) who may be involved in the dissemination of information. We present a design science analysis of DUIL as an information system artefact comprised of social, information, and technology artefacts. We demonstrate the performance of DUIL on real data crawled from several Facebook news pages spanning two years of news articles.

Mobile emergency response applications involving location-based alerts and physical response of networked members increasingly appear on smartphones to address a variety of emergencies. EMS (Emergency Medical Services) administrators, policy makers, and other decision makers need to determine when such systems present an effective addition to traditional Emergency Medical Services. We developed a software tool, the Emergency Response Community Effectiveness Modeler (ERCEM), that accepts parameters and compares the potential smartphone-initiated Samaritan/member response to traditional EMS response for a specific medical condition in a given geographic area. This study uses EMS data from the National EMS Information System (NEMSIS) and analyzes geographies based on Rural-Urban Commuting Area (RUCA) and Economic Research Service (ERS) urbanicity codes. To demonstrate ERCEM's capabilities, we input a full year of NEMSIS data documenting EMS response incidents across the USA. We conducted three experiments to explore anaphylaxis, hypoglycemia, and opioid overdose events across different population density characteristics, with further permutations to consider a series of potential app adoption levels, Samaritan response behaviors, notification radii, etc. Our model emphasizes how medical condition, prescription adherence levels, community network membership, and population density are key factors in determining the effectiveness of Samaritan-based Emergency Response Communities (ERC). We show how the efficacy of deploying mHealth apps for emergency response by volunteers can be modeled and studied in comparison to EMS. A decision maker can utilize ERCEM to generate a detailed simulation of different emergency response scenarios to assess the efficacy of smartphone-based Samaritan response applications in varying geographic regions for a series of different conditions and treatments.

We address the conflict between citizen engagement through news commenting and censorship needs. News articles often contain forms of censorship to maintain security, with non-identification of individuals a means of information protection. A common practice is replacing a name with a supposedly non-identifying initial to protect the identity of military personnel, witnesses, minors, victims, or suspects who need to be granted anonymity in the public sphere. We seek to understand the characteristics of commenters, including awareness of the potential for social media to circumvent censorship and attitudes towards censorship in news articles. Our study of censored articles collected from online news pages on Facebook presents insights into participant characteristics, including a strong correlation between personal network size and censorship support.

We explore the challenges of participation by members of emergency response communities who share a similar condition and treatment, and are called upon to participate in emergency events experienced by fellow members. Smartphones and location-based social networking technologies present an opportunity to re-engineer certain aspects of emergency medical response. Life-saving prescription medication extended in an emergency by one individual to another occurs on a micro level, anecdotally documented. We illustrate the issues and our approach through the example of an app to support patients prone to anaphylaxis and prescribed to carry epinephrine auto-injectors. We address unique participation challenges in an mHealth environment in which interventions are primarily short-term interactions which require clear and precise decision-making and constant tracking of potential participants in responding to an emergency medical event. The conflicting effects of diffused responsibility and shared identity are identified as key factors in modeling participation.

This study investigates the interplay between online news, reader comments, and social networks to detect and characterize comments leading to the revelation of censored information. Censorship of identity occurs in different contexts; for example, the military censors the identity of personnel, and the judiciary censors the identity of minors and victims. We address three objectives: (a) assess the relevance of identity censorship in the presence of user-generated comments, (b) understand the fashion of censorship circumvention (what people say and how), and (c) determine how comment analysis can aid in identifying decensorship and information leakage through comments. After examining 3,582 comments made on 48 articles containing obfuscated terms, we find that a systematic examination of comments can compromise identity censorship. We identify and categorize information leakage in comments indicative of knowledge of censored information that may result in information decensorship. We show that the majority of censored articles contained at least one comment leading to censorship circumvention.

Yahav, I., Shmueli, G., and Mani, D. (2016) A Tree-Based Approach for Addressing Self-Selection in Impact Studies with Big Data, MIS Quarterly, 40(4), pp 819-848.

In this paper, we introduce a tree-based approach adjusting for observable self-selection bias in intervention studies in management research. In contrast to traditional propensity score (PS) matching methods, including those using classification trees as a subcomponent, our tree-based approach provides a standalone, automated, data-driven methodology that allows for (1) the examination of nascent interventions whose selection is difficult and costly to theoretically specify a priori, (2) detection of heterogeneous intervention effects for different pre-intervention profiles, (3) identification of pre-intervention variables that correlate with the self-selected intervention, and (4) visual presentation of intervention effects that is easy to discern and understand. As such, the tree-based approach is a useful tool for analyzing observational impact studies as well as for post-analysis of experimental data. The tree-based approach is particularly advantageous in the analyses of big data, or data with large sample sizes and a large number of variables. It outperforms PS in terms of computational time, data loss, and automatic capture of nonlinear relationships and heterogeneous interventions. It also requires less user specification and choices than PS, reducing potential data dredging. We discuss the performance of our method in the context of such big data and present results for very large simulated samples with many variables. We illustrate the method and the insights it yields in the context of three impact studies with different study designs: reanalysis of a field study on the effect of training.
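The general tree-based idea can be sketched as follows: grow a classification tree that predicts intervention selection from pre-intervention covariates, then compare treated and untreated outcomes within each leaf, so units are compared against peers with similar selection profiles. This is a minimal sketch on synthetic data, not the authors' exact algorithm:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)
n = 5000

# Pre-intervention covariates; x0 drives self-selection into treatment
X = rng.normal(size=(n, 3))
treated = (X[:, 0] + rng.normal(0, 0.5, n) > 0).astype(int)

# Outcome: baseline depends on x0; the true treatment effect is +2
y = X[:, 0] + 2 * treated + rng.normal(0, 1, n)

# Naive comparison is biased upward because treated units have higher x0
naive = y[treated == 1].mean() - y[treated == 0].mean()

# Tree predicting selection; leaves group units with similar profiles
tree = DecisionTreeClassifier(max_depth=3, min_samples_leaf=200).fit(X, treated)
leaf = tree.apply(X)

# Average within-leaf treated-vs-untreated difference, weighted by leaf size
effects, weights = [], []
for lf in np.unique(leaf):
    m = leaf == lf
    if treated[m].min() == treated[m].max():
        continue  # leaf contains only one group; no comparison possible
    effects.append(y[m & (treated == 1)].mean() - y[m & (treated == 0)].mean())
    weights.append(m.sum())
adjusted = np.average(effects, weights=weights)
print(round(naive, 2), round(adjusted, 2))  # adjusted should be nearer 2
```

Per-leaf effect estimates also make the heterogeneous-effects and profile-identification benefits mentioned above directly visible.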

Achiam, Y., Yahav, I, and Schwartz, D.G. (2016). Why not Scale Free? Simulating Company Ego Networks on Twitter, In Proceedings of the 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), San Francisco, August 2016.

This paper simulates companies' ego networks on Twitter, that is, the number and type of a company's followers. We show from the data that the follower distribution in our setting is neither scale-free nor random; thus common network simulations cannot be used to mimic observed data. We present a novel rate-equations model to capture the complex dynamics of these ego networks. By defining ego networks on dimensions that more accurately characterize microblog networks such as Twitter (quantity, quality, and timeliness), we have been able to generate a simulation model that captures Twitter dynamics better than existing baselines. Following a data analysis that explained the lack of a scale-free distribution, we defined a new ego-network simulation model based on rate equations. Our experiments and simulations show a resulting fit to accepted models, indicating that this new model can be effectively applied to company ego network analysis.

Yahav I., Schwartz D.G., and Shehory O. Correcting Between-Participant Discourse Bias in Comment Classification. Available on SSRN (http://papers.ssrn.com/abstract=2604910)

Text mining and natural language processing have gained great momentum in recent years, with user-generated content becoming widely available. One key use is comment classification, with much attention given to sentiment analysis and opinion mining. An essential step in the process of comment classification is text pre-processing, a step in which each linguistic term is assigned a weight that commonly increases with its frequency in the studied text, yet is offset by the frequency of the term in the domain of interest. A common practice is to use the well-known tf-idf formula to compute these weights.

This paper reveals the bias introduced by between-participants' discourse to the study of comments in social media, and proposes a correction. We find that content extracted from between-participants' discourse is often highly correlated, resulting in dependency structures between observations in the study. Ignoring this bias can manifest in a non-robust analysis at best, and can lead to an entirely wrong conclusion at worst. We propose a statistical correction to tf-idf that accounts for this bias. We illustrate the effects of both the bias and correction with real data from Facebook.

Matilainen, S., Schwartz, D.G., and Zeleznikow, J., (2016) Facebook and the elderly: The Benefits of Social Media Adoption for Aged Care Facility Residents, International Conference on Group Decision & Negotiation, Bellingham, WA, USA.

We explore the emotional effects of implementing Facebook in an aged care facility and evaluate whether computers and Facebook are of any benefit to an elderly person's feeling of social connectedness. Our central hypothesis was that Facebook use would contribute to the participants' well-being and feeling of social connectedness. This preliminary qualitative study took place in a selected Melbourne-based aged care facility. Facebook was accessed through computers and the internet.

Over a four-month period, six residents engaged in supervised learning of how to use computers and Facebook. Findings indicate that older people are able to connect with and learn new technologies with which they are not familiar, and are able to use Facebook through a computer. While high levels of user enjoyment were found, measures of social connectedness as a result of Facebook use were inconclusive. The research concludes that computers in combination with Facebook are a practical approach that can support the needs of independently living residents of aged care facilities when combined with proper teaching and appropriate technology.

Cascavilla, G., Conti, M., Schwartz, D. G., & Yahav, I. (2015). Revealing Censored Information Through Comments and Commenters in Online Social Networks. In Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015 (pp. 675–680). New York, NY, USA: ACM.

In this work we study information leakage through discussions in online social networks. In particular, we focus on articles published by news pages in which a person's name is censored, and examine whether the person is identifiable (de-censored) by analyzing comments and the social network graphs of commenters. As a case study for our proposed methodology, we considered 48 articles (Israeli, military related) with censored content, each followed by a threaded discussion. We qualitatively study the comments and identify those that reveal a connection between the commenter and the censored person; we denote these commenters as "leakers". We found that such comments are present for some 75% of the articles we considered. Finally, leveraging the social network graphs of the leakers, we were able to identify the censored person. We show the viability of our methodology through some illustrative use cases.

Yahav, I., Schwartz, D.G. & Silverman, G. (2014). Detecting Unintentional Information Leakage in Social Media News Comments, Workshop on Social Network Security, IEEE IRI Conference, August, San Francisco.

This paper is concerned with unintentional information leakage (UIL) through social networks, and in particular, Facebook. Organizations often use forms of self-censorship in order to maintain security. Non-identification of individuals, products, or places is seen as a sufficient means of information protection. A prime example is the replacement of a name with a supposedly non-identifying initial. This has traditionally been effective in obfuscating the identity of military personnel, protected witnesses, minors, victims, or suspects who need to be granted a level of protection through anonymity. We challenge the effectiveness of this form of censorship in light of current uses and ongoing developments in social networks, showing that name obfuscation mandated by court or military order can be systematically compromised through the unintentional actions of public social network commenters. We propose a qualitative method for recognition and characterization of UIL, followed by a quantitative study that automatically detects UIL comments.

Schwartz, D.G., Bellou, A., Garcia-Castrillo, L., Muraro, A, & Papadopoulos, N.G., (2014). Towards Chronic Emergency Response Communities for Anaphylaxis, Workshop on Issues and Challenges in Social Computing, IEEE IRI Conference, August, San Francisco.

Smartphones and location-based social networking technologies present an opportunity to re-engineer certain aspects of emergency medical response.  Life-saving prescription medication extended in an emergency by one individual to another occurs on a micro level, anecdotally documented. Anaphylaxis in particular, with a combination of stable prescriptions, narrow medical regimens, and high availability, presents a common basis for community formation. In the context of introducing a system for chronic emergency response communities we present an ecosystem that has the potential to change key aspects of emergency response for certain chronic conditions.

Yahav I., Kenett R., and Bai X. (2014). Data Driven Testing of Open Source Software (OSS). Lecture Notes in Computer Science 8803, pp 309-321.

The increasing adoption of open source software (OSS) components in software systems introduces new quality risks. OSS components are usually developed and maintained by open communities. The fluctuation of community members and structures can result in instability of the software quality. For example, key developers may join or quit the community, and there is no guarantee of consistent quality between different versions, code branches, and development groups. Hence, an investigation is necessary to analyze the correlation between the social aspects of open communities and the quality of the OSS, such as the dynamics in communications and content distribution. The analysis results can be taken as inputs to drive selective testing for effective validation and verification of OSS components. As exhaustive testing is infeasible in most cases due to time and resource limitations, selective testing techniques are usually needed to allocate test resources to the most critical components and features. In this research, the monitored community dynamics are used to measure the dynamics of an OSS component to guide the component's testing activities. The paper promotes an approach to monitor community dynamics continuously, including communications like email and blogs, and repositories of bugs and fixes. It detects patterns in the monitored behavior, such as changes in traffic levels within and across clusters. Even though the correlation between community changes and software failures is not fully proved, some hypotheses are tested by mining the data using intelligent algorithms. The approach flags components with high bug frequency, as reported by the OSS community, so that the adopting organization can allocate more test cases to the flagged components. The paper reports preliminary attempts in this research direction, illustrated with data from XWiki OSS projects.
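The selective-testing allocation described above can be sketched in a few lines: flag components whose community-reported bug frequency is unusually high and direct proportionally more test effort to them. The threshold rule, component names, and counts below are illustrative assumptions, not the paper's method:

```python
from statistics import mean, stdev

# Hypothetical bug-report counts mined from an OSS community tracker
bug_reports = {
    "core": 42, "rendering": 7, "storage": 35, "auth": 5, "search": 11,
}

counts = list(bug_reports.values())
threshold = mean(counts) + stdev(counts)   # simple outlier rule (assumed)

# Components to flag for extra validation and verification
flagged = [c for c, n in bug_reports.items() if n >= threshold]

# Allocate a fixed test budget proportionally to bug frequency
budget = 100
allocation = {c: round(budget * n / sum(counts)) for c, n in bug_reports.items()}
print(flagged, allocation["core"])
```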

In this research commentary we show that the discipline of information systems (IS) has much to learn from the history of the discipline of medicine. We argue that as interest in historical studies of information systems grows, there are important historical lessons to be drawn from disciplines other than IS, with the medical discipline providing fertile ground. Of particular interest are the circumstances that surrounded the practice of the medical craft in the 1800s, circumstances that drove a process of unification and specialization resulting in the modern conceptualization of medical education, research, and practice. In analyzing the history of the field of medicine, with its long-established methods for general practice, specialization, and sub-specialization, we find that it serves as an example of a discipline that has dealt effectively with its initial establishment as a scientific discipline and with the exponential growth of knowledge and ensuing diversity of practice over centuries, and that it has much to say in regard to a number of discipline-wide debates in IS. Our objective is to isolate the key factors that can be observed from the writings of leading medical historians, and to examine those factors from the perspective of the information systems discipline today. Through our analysis we identify the primary factors and structural changes that preceded a modern medical discipline characterized by unification and specialization. We identify these same historic factors within the present-day information systems milieu and discuss the implications of following a unification and specialization strategy for the future of the information disciplines.

Schwartz, D.G. (2014) Enhancing Knowledge Marketplaces through a Theory of Knowledge Measurement, in F.J. Martinez-Lopez (Editor) Handbook of Strategic e-Business Management, Progress in IS Series, Springer-Verlag:Berlin, 735-748.

We discuss the creation of objective measures for the comparison of different types of knowledge repositories (KR) to enhance the linkage between knowledge management and strategic e-business, with a specific focus on knowledge marketplaces. Knowledge repositories proliferate, yet our ability to objectively assess the value and suitability of a given knowledge repository for a given task has largely remained in the realm of trial and error. Knowledge marketplaces can potentially help organizations leverage the wealth of information gathered through e-business activities. The field of knowledge management has grown significantly over the past decade, yet we lack formal methods through which knowledge management resources can be measured. In order to facilitate such measures, and enable more effective use of knowledge marketplaces, we must first deal with comparing the value of different types of knowledge in an organizational setting and how such value is measured in and reflected by knowledge repositories. In this chapter we present the background and definition of the problem, and introduce an approach based on semantic calculus and set theory to create a theory of knowledge measurement to be used in the evaluation of KRs. We then discuss how a theory of knowledge measurement (TKM) can be applied to knowledge marketplaces, improving the linkage between knowledge management and strategic e-business.

Sadan, Z., & Schwartz, D. G. (2012). Social network analysis for cluster-based IP spam reputation. Information Management & Computer Security, 20(4), 281–295.

Purpose – IP reputation systems, which filter e-mail based on the sender's IP address, are located at the perimeter, before messages reach the mail server's anti-spam filters. To increase IP reputation system efficacy and overcome the shortcomings of individual IP-based filtering, recent studies have suggested exploiting the properties of IP clusters, such as those of Autonomous Systems (AS). Cluster-based techniques can enhance accuracy and reduce false negative rates. However, clusters generally contain enormous numbers of IP addresses, which hinders cluster-based systems from reaching their full spam filtering potential. The purpose of this paper is to exploit social network metrics to obtain a more granular, i.e. sub-divided, view of cluster-based reputation, and thus enhance spam filtering accuracy.
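The sub-division idea above can be illustrated with a minimal sketch. The data, the AS number, and the use of /24 prefixes as sub-clusters are all illustrative stand-ins; the paper derives its sub-clusters from social network metrics rather than address prefixes.

```python
from collections import defaultdict

# Sketch of cluster-based IP reputation (illustrative data). Each report is
# (ip, asn, is_spam). A whole-AS score averages over the entire cluster,
# while a sub-divided view scores smaller groups inside the AS — here /24
# prefixes stand in for the social-network-derived sub-clusters.
reports = [
    ("10.0.0.1", 64500, True),  ("10.0.0.2", 64500, True),
    ("10.0.1.7", 64500, False), ("10.0.1.9", 64500, False),
]

def spam_ratio(rows):
    """Fraction of reports in a group that were spam."""
    return sum(1 for *_, spam in rows if spam) / len(rows)

def cluster_scores(rows, key):
    """Group reports by `key` and compute a spam ratio per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return {k: spam_ratio(v) for k, v in groups.items()}

# Whole-AS reputation blurs spammy and clean hosts together...
print(cluster_scores(reports, key=lambda r: r[1]))  # {64500: 0.5}
# ...while per-/24 reputation separates them cleanly.
print(cluster_scores(reports, key=lambda r: r[0].rsplit(".", 1)[0]))
```

The coarse score of 0.5 would penalize the clean hosts in the second prefix; the finer view assigns them a reputation of 0.0 while the spammy prefix scores 1.0.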

 

Sadan, Z., & Schwartz, D. G. (2011). Social network analysis of web links to eliminate false positives in collaborative anti-spam systems. Journal of Network and Computer Applications, 34(5), 1717–1723.

The performance of today’s email anti-spam systems is primarily measured by the percentage of false positives (non-spam messages detected as spam) rather than by the percentage of false negatives (real spam messages left unblocked). One reliable anti-spam technique is the Universal Resource Locator (URL)-based filter, which is utilized by most collaborative signature-based filters. URL-based filters examine URL frequency in incoming email and block bulk email when a predetermined threshold is passed. However, this can cause erroneous blocking of mass distributions of legitimate email. URL-based methods are therefore limited in their ability to prevent false positives, and finding solutions to this problem is critical for anti-spam systems. We present a complementary technique for URL-based filters that uses the betweenness of web-page hostnames to prevent the erroneous blocking of legitimate hosts. The technique was tested on a corpus of 10,000 random domains selected from the URIBL whitelist and blacklist databases. We generated the appropriate linked network for each domain and calculated its betweenness centrality. We found that the betweenness centrality of whitelist domains is significantly higher than that of blacklist domains. The results clearly show that the betweenness centrality metric can be a powerful and effective complementary tool for URL-based anti-spam systems: it can achieve a high level of accuracy in determining legitimate hostnames and thus significantly reduce false positives in these systems.
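The core check described above can be sketched in a few lines with networkx. The toy link graph and the threshold value are illustrative, not the paper's actual corpus or tuning.

```python
# Sketch of the betweenness-centrality whitelisting check described above.
# The graph and threshold are illustrative; the paper built linked networks
# for 10,000 domains drawn from the URIBL white and black lists.
import networkx as nx

def is_likely_legitimate(graph, hostname, threshold=0.05):
    """Flag a hostname as legitimate when its betweenness centrality
    in the web-link graph exceeds a tuned threshold."""
    centrality = nx.betweenness_centrality(graph)
    return centrality.get(hostname, 0.0) > threshold

# Toy link network: a well-linked hub host that many pages route through
# (whitelist-like) versus an isolated host (blacklist-like).
g = nx.Graph()
g.add_edges_from([
    ("news.example", "hub.example"), ("blog.example", "hub.example"),
    ("shop.example", "hub.example"), ("hub.example", "cdn.example"),
    ("spam.example", "throwaway.example"),
])
print(is_likely_legitimate(g, "hub.example"))   # True: high betweenness
print(is_likely_legitimate(g, "spam.example"))  # False: zero betweenness
```

The intuition matches the paper's finding: legitimate hosts sit on many shortest paths in the link network, while throwaway spam domains are structurally peripheral.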

Molov, B., & Schwartz, D.G. (2011). Towards an integrated strategy for intercultural dialogue: Computer-mediated communication and face-to-face. Journal of Intercultural Communication Research, 39(3), 207–224.

This article provides a preliminary framework for advancing intercultural dialogue from the perspectives of political science, social psychology, and anthropology as they relate to computer-mediated communication (CMC) and face-to-face (FTF) contact. It offers a basis for integrating elements that have hitherto not been combined as part of a strategy for intercultural dialogue. In an age of intercivilizational conflict, facilitating sustained dialogue and relationship building between cultures, both in FTF encounters and via CMC, is a vital challenge for both political dialogue and business activity. Combined with general theoretical elements, the focus is placed on the Middle East and includes references to several case studies. Given the need to engage larger segments of the world in communication and dialogue, such as the Islamic periphery, reference is also made to South Asia.

 

Schwartz, D. G. (2010). The Internet in six words or less, Internet Research, 20(4), 389–394.

Purpose – This paper seeks to present six key articles from the archives of Internet Research within a research framework, covering infrastructure, organization, commerce, governance, linking, and interface.

Design/methodology/approach – The six articles are introduced, summarized, and used to focus attention on each of the core areas of research that impacted the growth of the Internet.

Findings – The prism of time is one of the most powerful tools of observation available to scientists and researchers. Palaeontologists think in terms of aeons, archaeologists consider millennia a mere starting-point, and even the biologists, chemists, and physicists have centuries of prior research to consider. Internet Research, the journal, is only 20 years old, and the field only slightly older than that. Yet what decades those have been. Six articles from the early years of Internet Research epitomize much of the innovation, excitement, challenges, and vision that would reshape the world. While tremendous advances in technology have been made in the past 20 years, a number of the original issues and challenges remain unresolved.

Practical implications – The paper serves to frame the historic articles within a broader research context.

Originality/value – The paper provides a conceptual framework for researchers seeking insights into some of the early formative research on the Internet and web.

Jank, W., & Yahav, I. (2010). E-loyalty networks in online auctions. Annals of Applied Statistics, 4(1), 151–178.

Creating a loyal customer base is one of the most important, and at the same time, most difficult tasks a company faces. Creating loyalty online (e-loyalty) is especially difficult since customers can “switch” to a competitor with the click of a mouse. In this paper we investigate e-loyalty in online auctions. Using a unique data set of over 30,000 auctions from one of the main consumer-to-consumer online auction houses, we propose a novel measure of e-loyalty via the associated network of transactions between bidders and sellers. Using a bipartite network of bidder and seller nodes, two nodes are linked when a bidder purchases from a seller and the number of repeat-purchases determines the strength of that link.

We employ ideas from functional principal component analysis to derive, from this network, the loyalty distribution which measures the perceived loyalty of every individual seller, and associated loyalty scores which summarize this distribution in a parsimonious way. We then investigate the effect of loyalty on the outcome of an auction. In doing so, we are confronted with several statistical challenges in that standard statistical models lead to a misrepresentation of the data and a violation of the model assumptions. The reason is that loyalty networks result in an extreme clustering of the data, with few high-volume sellers accounting for most of the individual transactions. We investigate several remedies to the clustering problem and conclude that loyalty networks consist of very distinct segments that can best be understood individually.
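The network construction described above can be sketched briefly. The transaction data, the `loyalty_distribution` helper, and the repeat-purchase score below are illustrative; the paper's actual loyalty scores come from functional principal component analysis over a much larger data set.

```python
from collections import Counter

# Sketch of the bipartite loyalty network described above (toy data; the
# paper analyzed 30,000+ real auctions). Each record is (bidder, seller),
# and the edge weight is the number of purchases on that bidder-seller link.
transactions = [
    ("alice", "s1"), ("alice", "s1"), ("alice", "s1"),  # repeat buyer
    ("bob",   "s1"), ("carol", "s2"), ("carol", "s1"),
    ("dave",  "s2"), ("dave",  "s2"),
]
edge_weight = Counter(transactions)

def loyalty_distribution(seller):
    """Per-seller distribution of purchase counts across that seller's
    bidders; weights above 1 indicate repeat purchases (loyalty)."""
    return sorted(w for (b, s), w in edge_weight.items() if s == seller)

def loyalty_score(seller):
    """A simple summary: fraction of the seller's transactions that are
    repeat purchases (second-or-later buys from the same bidder)."""
    weights = loyalty_distribution(seller)
    total = sum(weights)
    return sum(w - 1 for w in weights) / total if total else 0.0

print(loyalty_distribution("s1"))       # [1, 1, 3]
print(round(loyalty_score("s1"), 2))    # 0.4
```

Even this toy version exhibits the clustering challenge the paper raises: one high-volume, loyalty-heavy seller dominates the transaction counts while other sellers see mostly one-off buyers.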

Sadan, Z., & Schwartz, D.G. (2010). WhiteScript: Using social network analysis parameters to balance between browser usability and malware exposure. Computers & Security, 30(1), 4–12.

Drive-by-download malware exposes internet users to infection of their personal computers, which can occur simply by visiting a website containing malicious content, posing a major threat to the user’s most sensitive information. Popular browsers such as Firefox, Internet Explorer, and Maxthon have extensions that block JavaScript, Flash, and other executable content. Some extensions globally block all dynamic content; in others the user needs to specifically enable the content for each site he or she trusts. Since most web pages today contain dynamic content, disabling it damages user experience and page usability, which prevents many users from installing security extensions. We propose a novel approach, based on Social Network Analysis parameters, that predicts the user trust perspective for the HTML page currently being viewed. Our system examines the reputation of the URL in the browser’s address bar and of each inner HTML URL, and marks the web page as trusted only if all of them have a reputation greater than our predetermined threshold. Each URL’s reputation is calculated from the number and quality of links across the web pointing back to it. The method was examined on a corpus of 44,429 malware domains and on the top 2,000 most popular Alexa sites. Our system managed to enable the dynamic content of 70% of the most popular websites and block 100% of malware web pages, all without any user intervention. Our approach can augment most browser security applications and enhance their effectiveness, thus encouraging more users to install these important applications.
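The trust decision described above reduces to a simple all-or-nothing check. The reputation values, URLs, and threshold below are hypothetical stand-ins for the link-based scores the system computes from the web graph.

```python
# Sketch of the WhiteScript-style trust decision described above.
# Reputations and the threshold are illustrative; the real system derives
# each score from the number and quality of inbound links to the URL.
REPUTATION_THRESHOLD = 0.6

reputation = {
    "https://news.example/": 0.9,
    "https://cdn.example/lib.js": 0.8,
    "https://ads.shady.example/x.js": 0.1,
}

def page_is_trusted(page_url, inner_urls):
    """Enable dynamic content only if the address-bar URL AND every
    embedded resource URL exceed the reputation threshold."""
    urls = [page_url, *inner_urls]
    return all(reputation.get(u, 0.0) > REPUTATION_THRESHOLD for u in urls)

print(page_is_trusted("https://news.example/",
                      ["https://cdn.example/lib.js"]))      # True
print(page_is_trusted("https://news.example/",
                      ["https://ads.shady.example/x.js"]))  # False
```

A single low-reputation embedded script is enough to keep the page untrusted, which is what lets the approach block malicious pages without per-site user decisions.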


SocIntLab

Social Information Technologies for a Better World

© 2014, 2015, 2016, 2017, 2018 

By David G. Schwartz