At the Complex Networks AI Lab at Ben-Gurion University (CNAI-LAB@BGU), we tackle research problems in diverse domains using a combination of methods from graph theory and machine learning.
Complex Networks AI Lab
Complex networks are found in cyber security, social networks, communication networks and the Internet, biological networks, financial networks, text analytics, and more. Scientific programmers working at CNAI-LAB@BGU develop generic software tools and libraries to analyze the structure of networks derived from these problem domains. Graduate research students apply these tools to investigate specific problems in their domain of interest.
SELECTED PROJECTS
The Israel-U.S. Center on Cybersecurity Research and Development for Energy (ICRDE)
ICRDE will research, develop, evaluate and demonstrate new technologies to solve critical challenges pertaining to cybersecurity of energy facilities, taking into consideration all phases of energy production and transmission.
Social Network Honeypot
As part of the research project "Detecting APTs Targeting Network Devices", a social network honeypot platform for detecting attackers during the reconnaissance phase was implemented. Key functionalities of the platform include collecting information from social networks (SNs), generating and monitoring artificial profiles, and wiring profiles within the social network.
Target Oriented Network Intelligence Collection
The Target Oriented Network Intelligence Collection (TONIC) problem is the problem of finding profiles in a social network that contain information about a given target via intelligent crawling. Such profiles are called leads. Two best-first search frameworks are proposed for solving TONIC, and several heuristics are proposed for each framework.
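The best-first idea can be illustrated with a short Python sketch. It is not one of the two TONIC frameworks or their heuristics: the toy `friends` dictionary, the `is_lead` predicate, the `budget` parameter, and the heuristic (rank each candidate by how many known leads list it as a friend) are all assumptions made for this example.

```python
import heapq

def best_first_crawl(friends, is_lead, initial_leads, budget):
    """Acquire up to `budget` profiles in best-first order and return all leads.

    friends:       dict mapping a profile to the profiles it lists as friends
                   (a toy stand-in for querying the social network).
    is_lead:       hypothetical predicate, True if a profile holds information
                   about the target.
    initial_leads: leads known before crawling starts.
    """
    leads = list(initial_leads)
    visited = set(initial_leads)
    score = {}   # candidate -> number of known leads adjacent to it (assumed heuristic)
    heap = []    # max-heap emulated with negated scores

    def expand(lead):
        # Every unvisited friend of a lead becomes (or stays) a candidate.
        for profile in friends.get(lead, ()):
            if profile not in visited:
                score[profile] = score.get(profile, 0) + 1
                heapq.heappush(heap, (-score[profile], profile))

    for lead in initial_leads:
        expand(lead)

    acquired = 0
    while heap and acquired < budget:
        neg_score, profile = heapq.heappop(heap)
        if profile in visited or -neg_score != score[profile]:
            continue  # stale heap entry, a fresher one exists or it was handled
        visited.add(profile)
        acquired += 1
        if is_lead(profile):
            leads.append(profile)
            expand(profile)  # only leads are expanded in this sketch
    return leads
```

In this sketch the priority queue always yields the candidate that is connected to the most known leads, so the limited acquisition budget is spent on the most promising profiles first.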
Fake News and Untrustful Social Accounts
Online social media (OSM) allows people to share news, create content and start trends, potentially influencing large sectors of the population. Unfortunately, organizations and individuals take advantage of OSM in order to gain influence, damage competitors' reputations, or spread political propaganda by financing misinformation campaigns. In many cases, these campaigns are carried out by armies of online untrustful accounts ...
Early Detection Alert and Response to eThreats (eDare)
eDare was a research project performed by Telekom Innovation Laboratories at BGU between July 2005 and July 2008. The goal of the project was to detect and tackle emerging ICT security threats propagating across NSP, ISP and enterprise communication networks.
Network Classification
Network classification is at a nascent stage, holding great potential. The network classification research is facilitated by an abundance of network datasets available today. In this project we develop algorithms for embedding and classification of vertices, subgraphs, and graphs, as well as search for subgraphs matching a given class.
Cyber-biological warfare
Today arbitrary synthetic DNA can be ordered online and delivered within several days. In order to regulate generation of dangerous substances, most synthetic gene providers screen DNA orders. However, a weakness in the DNA screening guidance allows some screening protocols to be circumvented using a generic obfuscation procedure inspired by early malware obfuscation techniques. Furthermore, accessibility and automation of the synthetic gene engineering workflow, combined with insufficient cybersecurity controls, allow malware to interfere with biological processes within the victim's lab ...
SELECTED PUBLICATIONS
References and Links to Papers
Florian Klaus Kaiser, Uriel Dardik, Aviad Elitzur, Polina Zilberman, Nir Daniel, Marcus Wiens, Frank Schultmann, Yuval Elovici, and Rami Puzis
Cyber threat intelligence on past attacks may help with attack reconstruction and the prediction of the course of an ongoing attack by providing deeper understanding of the tools and attack patterns used by attackers. Therefore, cyber security analysts employ threat intelligence, alert correlations, machine learning, and advanced visualizations in order to produce sound attack hypotheses. In this article, we present AttackDB, a multi-level threat knowledge base that combines data from multiple threat intelligence sources to associate high-level ATT&CK techniques with low-level telemetry found in behavioral malware reports. We also present the Attack Hypothesis Generator which relies on knowledge graph traversal algorithms and a variety of link prediction methods to automatically infer ATT&CK techniques from a set of observable artifacts. Results of experiments performed with 53K VirusTotal reports indicate that the proposed algorithms employed by the Attack Hypothesis Generator are able to produce accurate adversarial technique hypotheses with a mean average precision greater than 0.5 and area under the receiver operating characteristic curve of over 0.8 when it is implemented on the basis of AttackDB. The presented toolkit will help analysts to improve the accuracy of attack hypotheses and to automate the attack hypothesis generation process.
Rami Puzis, Dor Farbiash, Oleg Brodt, Yuval Elovici & Dov Greenbaum
Commercial DNA synthesizers sell billions of nucleotides to customers each year, amounting to hundreds of millions of dollars in sales. As DNA synthesis becomes more widespread, concern is mounting that a cyberattack intervening with synthetic DNA orders could lead to the synthesis of nucleic acids encoding parts of pathogenic organisms or harmful proteins and toxins. Reassuringly, such an attack is non-trivial.
David Toubiana, Rami Puzis, Lingling Wen, Noga Sikron, Assylay Kurmanbayeva, Aigerim Soltabayeva, Maria del Mar Rubio Wilhelmi, Nir Sade, Aaron Fait, Moshe Sagi, Eduardo Blumwald and Yuval Elovici
The identification and understanding of metabolic pathways is a key aspect in crop improvement and drug design. The common approach for the detection of metabolic pathways is based on gene annotation and ontology. Here, we demonstrate the detection of metabolic pathways based on quantitative metabolic data by combining correlation-based network analysis with machine-learning techniques. Metabolites of 169 known tomato metabolic pathways (TomatoCyc), 85 non-tomato pathways (MetaCyc), and 85 random sets of metabolites were mapped as subgraphs onto metabolite correlation networks of the tomato pericarp. For each subgraph, a set of 148 network features for each network was computed. The resulting feature vectors were used to train a robust machine-learning model. The validity of the model was tested on unknown pathways (from PlantCyc and MetaCyc), predicting the presence of (i) the β-alanine-degradation I pathway (yet unknown to plants) and (ii) the melibiose-degradation pathway, although melibiose was not part of the networks. In vivo assays validated the presence of the pathways.
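A rough Python sketch of the first two steps described above: building a correlation network from metabolite measurements and computing a feature for a pathway mapped onto it as a subgraph. The correlation threshold, the metabolite names in the usage, and the choice of edge density as the feature are assumptions for illustration; the abstract reports 148 network features and does not specify them here.

```python
from itertools import combinations
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equally long sequences (0.0 if one is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def correlation_network(profiles, threshold):
    """Connect two metabolites when |r| between their profiles reaches the threshold.

    profiles: dict mapping a metabolite name to its measurements across samples.
    Returns a set of undirected edges stored as sorted (a, b) tuples.
    """
    edges = set()
    for a, b in combinations(sorted(profiles), 2):
        if abs(pearson(profiles[a], profiles[b])) >= threshold:
            edges.add((a, b))
    return edges

def subgraph_density(edges, members):
    """Edge density of the subgraph induced by `members` (one illustrative feature)."""
    members = set(members)
    n = len(members)
    if n < 2:
        return 0.0
    inside = sum(1 for a, b in edges if a in members and b in members)
    return inside / (n * (n - 1) / 2)
```

A feature vector for a candidate pathway would collect many such values computed on its induced subgraph, and those vectors would then be passed to a classifier.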
Michael Fire, Lena Tenenboim, Ofrit Lesser, Rami Puzis, Lior Rokach, Yuval Elovici
Online social networking sites have become increasingly popular over the last few years. As a result, new interdisciplinary research directions have emerged in which social network analysis methods are applied to networks containing hundreds of millions of users. Unfortunately, links between individuals may be missing due to imperfect acquirement processes or because they are not yet reflected in the online network (i.e., friends in the real world did not form a virtual connection). Existing link prediction techniques lack the ...
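For readers unfamiliar with link prediction, the following Python sketch scores unconnected pairs with the Jaccard coefficient of their neighbor sets, a classic topology-based baseline. It is shown only as background and is not claimed to be the method of this paper; the function name and the toy graph in the usage are hypothetical.

```python
from itertools import combinations

def jaccard_link_scores(adj):
    """Rank unconnected pairs by the Jaccard coefficient of their neighbor sets.

    adj: dict mapping each user to the set of users it is linked to (undirected).
    Returns a list of ((u, v), score) sorted from most to least likely link.
    """
    scores = {}
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u]:
            continue  # already linked, nothing to predict
        union = adj[u] | adj[v]
        if union:
            scores[(u, v)] = len(adj[u] & adj[v]) / len(union)
    return sorted(scores.items(), key=lambda item: -item[1])
```

Pairs that share a large fraction of their friends receive high scores and are the first candidates for a missing link.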
Shlomi Dolev, Yuval Elovici, Rami Puzis
Betweenness-Centrality measure is often used in social and computer communication networks to estimate the potential monitoring and control capabilities a vertex may have on data flowing in the network. In this article, we define the Routing Betweenness Centrality (RBC) measure that generalizes previously well known Betweenness measures such as the Shortest Path Betweenness, Flow Betweenness, and Traffic Load Centrality by considering network flows created by arbitrary loop-free routing ...
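As a reference point, the Shortest Path Betweenness that RBC generalizes can be computed with Brandes' well-known accumulation scheme. The Python sketch below covers only that special case on unweighted, undirected graphs, with path endpoints excluded; it is not the RBC algorithm from the article, and the function name is an assumption.

```python
from collections import deque

def shortest_path_betweenness(adj):
    """Shortest Path Betweenness of every vertex (Brandes' algorithm).

    adj: dict mapping each vertex to an iterable of its neighbors (undirected).
    Endpoints of a path do not count as lying on it.
    """
    betweenness = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        sigma = {v: 0 for v in adj}
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest vertices back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                betweenness[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in betweenness.items()}
```

On a three-vertex path the middle vertex lies on the only shortest path between the two ends, so it is the only vertex with non-zero betweenness.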
Rami Puzis, Yuval Elovici, Shlomi Dolev
In this paper, we propose a method for rapid computation of group betweenness centrality whose running time (after preprocessing) does not depend on network size. The calculation of group betweenness centrality is computationally demanding and, therefore, it is not suitable for applications that compute the centrality of many groups in order to identify new properties. Our method is based on the concept of path betweenness centrality defined in this paper. We demonstrate how the method can be used to find the most prominent ...
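To make the quantity concrete, the Python sketch below computes group betweenness centrality directly from a common definition: for every pair of vertices outside the group, the fraction of their shortest paths that pass through at least one group member. This is the slow, brute-force computation, not the rapid method proposed in the paper; restricting the sum to pairs outside the group is an assumption of this example, as are the function names.

```python
from collections import deque

def count_shortest_paths(adj, source, blocked=frozenset()):
    """BFS from source that never enters `blocked` vertices.

    Returns (dist, sigma): distances and numbers of shortest paths from source.
    """
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w in blocked:
                continue
            if w not in dist:
                dist[w] = dist[v] + 1
                sigma[w] = 0
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
    return dist, sigma

def group_betweenness(adj, group):
    """Sum, over pairs outside the group, of the share of shortest paths hitting the group."""
    group = frozenset(group)
    outside = [v for v in adj if v not in group]
    total = 0.0
    for i, s in enumerate(outside):
        dist, sigma = count_shortest_paths(adj, s)
        dist_avoid, sigma_avoid = count_shortest_paths(adj, s, blocked=group)
        for t in outside[i + 1:]:
            if t not in dist:
                continue  # s and t are disconnected
            # Shortest paths that avoid the group must keep the original length.
            avoiding = sigma_avoid[t] if dist_avoid.get(t) == dist[t] else 0
            total += 1 - avoiding / sigma[t]
    return total
```

Each call runs two breadth-first searches per outside vertex, which shows why evaluating many groups this way becomes expensive on large networks.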