Data-driven Secure Business Intelligence

Disciplinary Research

Causal effect estimation. The ability to reason autonomously about causality is necessary for machine learning algorithms to be applied successfully in policy making. An important example is clinical decision making, e.g. deciding which treatment to give to a patient in a hospital. In (Johansson 16), we address the core of this problem by proposing an algorithm that estimates the individual effect of a treatment from historical data. We bound the error made by this algorithm and show state-of-the-art empirical results.
Lovász Embedding for ML. The Lovász geometrical embedding of graphs is a powerful tool in combinatorial optimization on graphs. We have developed and exploited it as a tool for machine learning on graph-structured data. (Jethava 12) and (Jethava 13) develop an alternative kernel-based characterization of the Lovász function and show how a fast and highly scalable approximation of the Lovász embedding can be computed and used to solve classical problems on graphs.
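For reference, the textbook definition of the Lovász function in terms of orthonormal representations can be written as follows. The notation is an assumption made here for illustration (a graph G = (V, E), one unit vector u_i per vertex, and a unit "handle" vector c) and is not taken from the cited papers.

```latex
\vartheta(G) \;=\; \min_{c,\,\{u_i\}} \; \max_{i \in V} \; \frac{1}{(c^\top u_i)^2}
\quad \text{subject to} \quad
\|c\| = \|u_i\| = 1, \qquad
u_i^\top u_j = 0 \ \ \text{whenever } i \neq j \text{ and } (i,j) \notin E .
```

The vectors u_i that attain the minimum give one geometric embedding of the graph: non-adjacent vertices are mapped to orthogonal directions.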
Graph Kernels and Similarity Functions. Kernel methods offer a very powerful and flexible framework for classification problems. Using the Lovász embedding, we developed a new graph kernel that, unlike previous graph kernels, captures global properties (Johansson 14). We used graph kernels to develop a fast entity disambiguation method that uses only network properties (Hermansson 13). Sometimes kernels are hard to come by and similarity functions are more natural. We place the Balcan-Blum-Srebro theory of "good" similarity functions in the framework of optimal transport and coupling (Johansson 15a) and show how it can be used to achieve the best results on graph classification.
Deep learning. Deep learning has shown astonishing results in a wide range of tasks, from image classification to speech recognition and natural language processing. Our group is taking an active role in developing novel approaches in this emerging field. We have created several demonstrators that are based partly or entirely on deep learning. A virtual assistant for discussion forum users and customer service systems (Hagstedt and Mogren 16) is currently the core technology for a new start-up called Textual, based in Gothenburg. We have developed a tool for named entity recognition in Swedish biomedical texts that can be trained on non-sensitive source data, yet performs well on real patient records. Our extractive summarization system (Kågebäck 14, Mogren 15a) is deployed at our partner Findwise. We have also developed a system for word-sense induction (Kågebäck 15).
Enumeration methods. We have also developed a novel, purely combinatorial and efficient method for multi-document summarization, tailored to short independent texts (Damaschke 16b). It is based on enumerating the bicliques in the word-sentence occurrence graph, which can be done fast (Damaschke 14b, Damaschke 15a), and picking the most similar pairs of sentences. A graph-editing problem is proposed in (Damaschke 14a) to obtain clearly defined clusters of related words, which may yield more accurate sentence similarity measures. An approach to new event detection in text streams, based on fast enumeration of minimal new subsets of co-occurring words (Damaschke 15b), has been further developed empirically.
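To make the notion of a biclique in the word-sentence occurrence graph concrete, the following is a minimal sketch. It is a naive closure-based enumeration written for illustration, not the fast algorithm of (Damaschke 14b); the function name `maximal_bicliques` and the whitespace tokenisation are assumptions, and the running time can be exponential in the worst case.

```python
def maximal_bicliques(sentences):
    """Enumerate maximal bicliques (sentence indices, shared words).

    Naive illustration: every non-empty intersection of sentence word sets,
    together with all sentences containing it, forms a maximal biclique.
    """
    # Hypothetical tokenisation: lower-case and split on whitespace.
    word_sets = [frozenset(s.lower().split()) for s in sentences]

    closed = set()             # word sets already known to be closed
    frontier = set(word_sets)  # candidate word sets still to process
    while frontier:
        words = frontier.pop()
        if not words or words in closed:
            continue
        closed.add(words)
        # Intersect with every sentence to discover smaller shared word sets.
        for other in word_sets:
            common = words & other
            if common and common not in closed:
                frontier.add(common)

    bicliques = []
    for words in closed:
        # All sentences that contain every word of this shared word set.
        extent = tuple(i for i, ws in enumerate(word_sets) if words <= ws)
        bicliques.append((extent, words))
    # Largest sentence groups first, then alphabetical for a stable order.
    return sorted(bicliques, key=lambda b: (-len(b[0]), sorted(b[1])))


reviews = ["good battery life", "battery life is short", "good screen"]
for sentence_ids, shared_words in maximal_bicliques(reviews):
    print(sentence_ids, sorted(shared_words))
```

Bicliques that span several sentences point at groups of sentences sharing vocabulary, which is the kind of structure a summarizer can exploit when looking for similar sentence pairs.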
Differential Privacy is a recent framework that offers rigorous privacy guarantees while retaining utility. We have used the differential privacy framework to show how graph classification can be done using private versions of graph kernels (Johansson 15b).
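As a small illustration of the kind of guarantee the framework gives, the sketch below shows the standard Laplace mechanism for a counting query. It is a generic textbook construction, not the private graph kernels of (Johansson 15b); the names `laplace_noise` and `private_count` are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release an epsilon-differentially private count of matching records."""
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
ages = [23, 35, 41, 52, 67, 70, 18, 29]
# Smaller epsilon means stronger privacy and a noisier answer.
print(private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng))
```

The trade-off between privacy and utility is controlled by a single parameter: the noise scale is the query's sensitivity divided by epsilon.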
Privacy policies. We have developed a formal framework for specifying privacy policies in social networks (Pardo 14). The framework is based on epistemic logic, or "the logic of knowledge", in which users can specify who is able to know their information. Unlike most current access control models, it can deal with implicit leaks of information.
Data Minimisation. According to several regulations about personal data processing, the collection of such data should be limited to the minimum necessary. However, it is not clearly stated what this means and how to ensure it. We proposed a definition of data minimality and a technique to provide data minimisation for programs (Antignac 16).

Lovász graph embedding

Graph kernel classification

Extractive summarization

Publications

2018

[Mallozzi 2018] - P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione and G. Schneider, “MoVenMo: A Structured Approach to Engineer Reward Functions“. In Proceedings of IEEE International Conference on Robotic Computing (IRC), 2018. To appear.
[Pardo 2018] - R. Pardo, C. Sánchez, and G. Schneider, “Timed Epistemic Knowledge Bases for Social Networks“. In Formal Methods (FM'18), 2018. To appear.
[Antignac 2018] - T. Antignac, R. Scandariato, and G. Schneider, “Privacy Compliance via Model Transformations”. In International IEEE Workshop on Privacy Engineering (IWPE'18), 2018. To appear.
[Pinisetty 2018] - S. Pinisetty, D. Sands, and G. Schneider, “Runtime Verification of Hyperproperties for Deterministic Programs”. In 6th Conference on Formal Methods in Software Engineering (FormaliSE'18). ACM, 2018. To appear.

2017

[Antignac 2017a] - T. Antignac, M. Mukelabai, and G. Schneider, “Specification, Design, and Verification of an Accountability-aware Surveillance Protocol”. In 32nd ACM/SIGAPP Symposium On Applied Computing - Software Verification and Testing track (SAC-SVT'17), pages 1372-1378, 2017.
[Antignac 2017b] - T. Antignac, D. Sands, and G. Schneider, “Data Minimisation: A Language-Based Approach”. In IFIP Information Security & Privacy Conference (IFIP SEC'17), Volume 502, pages 442-456, 2017. Springer Science and Business Media.
[Pardo 2017a] - R. Pardo, M. Balliu, and G. Schneider, “Formalising privacy policies in social networks”. In Journal of Logical and Algebraic Methods in Programming 90, pages 125-157, Aug 2017.
[Pardo 2017b] - R. Pardo and G. Schneider, “Model Checking Social Network Models“. In Proceedings of International Symposium on Games, Automata, Logics, and Formal Verification (GandALF), Volume 256, pages 238-252, 2017. [pdf]

[Picazo 2017] - P. Picazo-Sanchez, R. Pardo and G. Schneider, “Secure Photo Sharing in Social Networks“. In Proceedings of International Conference on ICT Systems Security and Privacy Protection (IFIP SEC), 2017. [pdf]

2016

[Tossou 2016b] - A. Tossou, C. Dimitrakakis and D. Dubhashi, “Thompson Sampling For Stochastic Bandits with Graph Feedback“. In Proceedings of 31st AAAI Conference on Artificial Intelligence (AAAI), 2017.
[Tossou 2016a] - A. Tossou and C. Dimitrakakis, “Achieving privacy in the adversarial multi‐armed bandit”. In Proceedings of 31st AAAI Conference on Artificial Intelligence (AAAI), 2017.
[Mogren 2016] - S. Almgren, S. Pavlov, O. Mogren, “Named Entity Recognition in Swedish Medical Journals with Deep Bidirectional Character-Based LSTMs“. In Proceedings of Fifth Workshop on Building and Evaluating Resources for Biomedical Text Mining (BIOTXTM), 2016.
[Pardo 2016c] - R. Pardo, I. Kellyérová, C. Sánchez and G. Schneider, “Specification of Evolving Privacy Policies for Online Social Networks“. In Proceedings of 23rd International Symposium on Temporal Representation and Reasoning (TIME), IEEE, 2016. [pdf]
[Pardo 2016b] - R. Pardo, C. Colombo, G. J. Pace and G. Schneider, “An Automata-based Approach to Evolving Privacy Policies for Social Networks“. In Proceedings of 16th International Conference on Runtime Verification (RV), Volume 10012, pages 285-301, 2016. [pdf]
[Schneider 2016] - G. J. Pace, R. Pardo and G. Schneider, “On the Runtime Enforcement of Evolving Privacy Policies in Online Social Networks“. In Proceedings of 7th International Symposium on Leveraging Applications of Formal Methods, Verification and Validation (ISOLA), Volume 9953, pages 407-412, 2016. [pdf]
[Hagstedt and Mogren 16] - J. Hagstedt P Suorra, O. Mogren, “Assisting Discussion Forum Users using Deep Recurrent Neural Networks“. In Proceedings of First Workshop on Representation Learning for NLP (RepL4NLP) at ACL, 2016. [pdf]
[Antignac 16] - T. Antignac, R. Scandariato, G. Schneider, “A Privacy-Aware Conceptual Model for Handling Personal Data“. In Proceedings of Leveraging Applications of Formal Methods, Verification and Validation (ISOLA), Volume 9952, pages 942-957, 2016.
[Ebadi 16b] - H. Ebadi and D. Sands, “Featherweight PINQ”. In Journal of Privacy and Confidentiality, IEEE, 2016. [pdf]
[Hallgren 16] - P. Hallgren, M. Ochoa, and A. Sabelfeld, “MaxPace: Speed-Constrained Location Queries”. In Proceedings of the IEEE Conference on Communications and Network Security (CNS), Philadelphia, PA, USA, 2016. [pdf]
[Agadakos 16] - I. Agadakos, P. Hallgren, G. Portokalidis, and A. Sabelfeld, “Location-enhanced Authentication using the IoT“. In Proceedings of the Annual Computer Security Applications Conference (ACSAC), Los Angeles, CA, USA, 2016. [pdf]
[Ebadi 16a] - H. Ebadi, T. Antignac and D. Sands “Sampling and Partitioning for Differential Privacy“. In Proceedings of 14th Annual Conference on Privacy, Security and Trust (PST), 2016. [pdf]
[Dubhashi 16] - D. Dubhashi and S. Lappin “AI Dangers: Real and Imagined“. CACM (to appear).
[Pardo 16a] - R. Pardo, M. Balliu and G. Schneider “Formalising Privacy in Evolving Social Networks“. In Journal of Logical and Algebraic Methods in Programming (JLAMP). To appear. 2016.
[Damaschke 16b] - A. S. Muhammad, P. Damaschke, O. Mogren, “Summarizing online user reviews using bicliques“. In Proceedings of 42nd International Conference on Current Trends in Theory and Practice of Computer Science (SOFSEM), LNCS 9587, pages 569-579, 2016.
[Damaschke 16a] - P. Damaschke, “Sufficient conditions for edit-optimal clusters“. In Information Processing Letters 116, Pages 267-272, 2016.
[Johansson 16] - FD. Johansson, U. Shalit, D. Sontag. “Learning Representations for Counterfactual Inference“. In 33rd International Conference on Machine Learning (ICML), 2016. [pdf]

2015

[van Delft 15] - B. van Delft, S. Hunt, D. Sands “Very Static Enforcement of Dynamic Policies“. In Proceedings of 4th International Conference on Principles of Security and Trust (POST), Pages 32-52, 2015.
[Broberg 15] - N. Broberg, B. van Delft, D. Sands “The Anatomy and Facets of Dynamic Policies“. In Proceedings of 28th IEEE Computer Security Foundations Symposium (CSF), Pages 122-136, 2015.
[Tahmasebi 15] - N. Tahmasebi, L. Borin, G. Capannini, D. Dubhashi, P. Exner, M. Forsberg, G. Gossen, F. D. Johansson, R. Johansson, M. Kågebäck, O. Mogren, P. Nugues, T. Risse, “Visions and open challenges for a knowledge-based culturomics”. In International Journal on Digital Libraries 15(2-4), Pages 169-187, 2015.
[Mogren 15b] - O. Mogren, M. Kågebäck, D. Dubhashi, “Extractive Summarization by Aggregating Multiple Similarities“. In Proceedings of Recent Advances in Natural Language Processing, pages 451-457, 2015.
[Hallgren 15b] - P. Hallgren, M. Ochoa and A. Sabelfeld, “BetterTimes: Privacy-assured Outsourced Multiplications for Additively Homomorphic Encryption on Finite Fields“. In Proceedings of the International Conference on Provable Security (ProvSec), 2015. [pdf]
[Hallgren 15a] - P. Hallgren, M. Ochoa and A. Sabelfeld, “InnerCircle: A Parallelizable Decentralized Privacy-Preserving Location Proximity Protocol“. In Proceedings of the International Conference on Privacy, Security and Trust (PST), 2015. [pdf]
[Pardo 15] - R. Pardo, M. Balliu and G. Schneider, “Privacy in Evolving Social Networks (Extended Abstract)“. In 27th Nordic Workshop on Programming Theory (NWPT), 2015. [pdf]
[Damaschke 15a] - P. Damaschke, “Finding and enumerating large intersections“. In Theoretical Computer Science 580, Pages 75-82, 2015. [pdf]
[Damaschke 15b] - P. Damaschke, “Pairs covered by a sequence of sets“. In Proceedings of 20th International Symposium on Fundamentals of Computation Theory, Gdansk, LNCS 9210, Pages 214-226, FCT, 2015.
[Johansson 15a] - F. Johansson and D. Dubhashi, “Learning with similarity functions on graphs using matchings of geometric embeddings”. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Pages 467-476, 2015. [pdf]
[Kågebäck 15] - M. Kågebäck, F. Johansson, R. Johansson and D. Dubhashi, “Neural context embeddings for automatic discovery of word senses”. In Proceedings of NAACL-HLT, Pages 25-32, 2015. [pdf]
[Hermansson 15] - L. Hermansson, F. Johansson and O. Watanabe, “Generalized Shortest Path Kernel on Graphs”. In Proceedings of the 18th International Conference Discovery Science, 2015
[Mogren 15a] - O. Mogren, M. Kågebäck and D. Dubhashi. “Extractive Summarization by Aggregating Multiple Similarities”. In Recent Advances in Natural Language Processing, 2015
[Ebadi 15] - H. Ebadi, D. Sands and G. Schneider, “Differential Privacy: Now it's Getting Personal“. In Proceedings of the 42nd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM, 2015 [pdf]
[Johansson 15b] - F. Johansson, O. Frost, C. Retzner, and D. Dubhashi, “Classifying large graphs with differential privacy”. In Proceedings of the 12th International Conference on Modeling Decisions for Artificial Intelligence, 2015
[Johansson 15c] - F. Johansson, A. Chattoraj, C. Bhattacharyya and D. Dubhashi, “Weighted Theta Functions and Embeddings with Applications to Max-Cut, Clustering and Summarization”. In Advances in Neural Information Processing Systems 28 (NIPS 2015), 2015

2014

[Damaschke 14b] - P. Damaschke, “Enumerating maximal bicliques in bipartite graphs with favorable degree sequences”. In Information Processing Letters 114.6, Pages 317-321, 2014 [pdf]
[Kågebäck 14] - M. Kågebäck, O. Mogren, N. Tahmasebi and D. Dubhashi, "Extractive Summarization using Continuous Vector Space Models". In Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality at EACL, 2014 [pdf]
[Damaschke 14a] - P. Damaschke, O. Mogren, “Editing simple graphs“. In Journal of Graph Algorithms and Applications 18 (2014), Special issue of WALCOM 2014 (less than 10% of WALCOM submissions were invited) [pdf]
[Johansson 14] - F. Johansson, V. Jethava, D. Dubhashi and C. Bhattacharyya, “Global graph kernels using geometric embeddings”. In Proceedings of the 31st International Conference on Machine Learning, ICML, 2014
[Pardo 14] - R. Pardo and G. Schneider, “A Formal Privacy Policy Framework for Social Networks“. In Proceedings of the 12th International Conference on Software Engineering and Formal Methods (SEFM), 2014. [pdf]
[Hedin 14] - D. Hedin, A. Birgisson, L. Bello and A. Sabelfeld, “JSFlow: Tracking Information Flow in JavaScript and its APIs”. In Proceedings of the ACM Symposium on Applied Computing (SAC), 2014. [pdf]

2013

[Damaschke 13] - P. Damaschke, “Cluster editing with locally bounded modifications revisited”. In Proceedings of the 24th International Workshop on Combinatorial Algorithms IWOCA, Lecture Notes in Computer Science Vol. 8288 (2013), Pages 433-437. 2013 [pdf]
[Hermansson 13] - L. Hermansson, T. Kerola, F. Johansson, V. Jethava, and D. Dubhashi, “Entity disambiguation in anonymized graphs using graph kernels”. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management (CIKM), 2013. [pdf]
[Broberg 13] - N. Broberg, B. van Delft and D. Sands, “Paragon for Practical Programming with Information-Flow Control” In APLAS, volume 8301 of Lecture Notes in Computer Science, Pages 217-232. Springer, 2013 [pdf]
[Jethava 13] - V. Jethava, A. Martinsson, C. Bhattacharyya and D. Dubhashi, “Lovász ϑ function, SVMs and finding dense subgraphs”. Journal of Machine Learning Research 14(1): 3495-3536, 2013 [pdf]
[Johansson 13] - F. Johansson, V. Jethava and D. Dubhashi, “DLOREAN: Dynamic Location-Aware Reconstruction of Multiway Networks”. In Proceedings of the IEEE 13th International Conference on Data Mining Workshops, ICDMW, Pages 1012-1019, 2013 [pdf]

2012

[Jethava 12] - V. Jethava, A. Martinsson, C. Bhattacharyya and D. Dubhashi, “The Lovasz $\theta$ function, SVMs and finding large dense subgraphs”. In Advances in Neural Information Processing Systems, Pages 1160-1168, 2012 [pdf]