Research
2021
- How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference. Amanda M Gentzel, Purva Pruthi, and David Jensen. 2021
Methods that infer causal dependence from observational data are central to many areas of science, including medicine, economics, and the social sciences. A variety of theoretical properties of these methods have been proven, but empirical evaluation remains a challenge, largely due to the lack of observational data sets for which treatment effect is known. We describe and analyze observational sampling from randomized controlled trials (OSRCT), a method for evaluating causal inference methods using data from randomized controlled trials (RCTs). This method can be used to create constructed observational data sets with corresponding unbiased estimates of treatment effect, substantially increasing the number of data sets available for evaluating causal inference methods. We show that, in expectation, OSRCT creates data sets that are equivalent to those produced by randomly sampling from empirical data sets in which all potential outcomes are available. We then perform a large-scale evaluation of seven causal inference methods over 37 data sets, drawn from RCTs, as well as simulators, real-world computational systems, and observational data sets augmented with a synthetic response variable. We find notable performance differences when comparing across data from different sources, demonstrating the importance of using data from a variety of sources when evaluating any causal inference method.
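As an illustration of the sampling idea in this abstract, here is a minimal Python sketch. It is not the authors' implementation. The abstract says only that OSRCT samples from RCT data to build constructed observational data sets; the specific selection rule used here (draw a "preferred" treatment from a covariate-dependent probability and keep a unit only when the trial assigned that treatment) is an assumption, as are all names (`osrct_sample`, `sigmoid_bias`, `make_toy_rct`) and the toy data-generating process.

```python
import math
import random


def osrct_sample(rows, bias_prob, rng):
    """Subsample RCT rows so that treatment depends on covariates.

    rows: list of dicts with a binary "treatment" key plus covariates.
    bias_prob: hypothetical function giving P(preferred treatment = 1 | covariates).
    """
    kept = []
    for row in rows:
        preferred = 1 if rng.random() < bias_prob(row) else 0
        # Keep the unit only if the trial happened to assign the preferred arm.
        if row["treatment"] == preferred:
            kept.append(row)
    return kept


def diff_in_means(rows):
    """Difference in mean outcome between treated and control rows."""
    treated = [r["outcome"] for r in rows if r["treatment"] == 1]
    control = [r["outcome"] for r in rows if r["treatment"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def make_toy_rct(n, rng, effect=2.0):
    """Toy RCT (assumed for illustration): random treatment, outcome depends on x."""
    rows = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        t = rng.randint(0, 1)  # randomized assignment, independent of x
        y = effect * t + x + rng.gauss(0.0, 1.0)
        rows.append({"x": x, "treatment": t, "outcome": y})
    return rows


def sigmoid_bias(row):
    """Hypothetical biasing function: larger x makes treatment more likely."""
    return 1.0 / (1.0 + math.exp(-3.0 * row["x"]))


rng = random.Random(0)
rct = make_toy_rct(4000, rng)
observational = osrct_sample(rct, sigmoid_bias, rng)
print("RCT difference in means:", diff_in_means(rct))
print("Naive difference in means on biased sample:", diff_in_means(observational))
```

In this toy setup the full RCT gives an unbiased estimate of the treatment effect, while the naive estimate on the subsample is confounded by `x`. That gap is what a causal inference method under evaluation would need to correct.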
2020
- Structure Mapping for Transferability of Causal Models. Purva Pruthi, Javier González, Xiaoyu Lu, and Madalina Fiterau. 2020
Human beings learn causal models and constantly use them to transfer knowledge between similar environments. We use this intuition to design a transfer-learning framework that uses object-oriented representations to learn the causal relationships between objects. A learned causal dynamics model can be used to transfer between variants of an environment that have exchangeable perceptual features among objects but the same underlying causal dynamics. We adapt continuous-optimization techniques for structure learning to explicitly learn the causes and effects of actions in an interactive environment, and we transfer to the target domain by categorizing objects based on causal knowledge. We demonstrate the advantages of our approach in a gridworld setting by combining a causal model-based approach with a model-free approach in reinforcement learning.
2015
- How Has Twitter Changed the Event Discussion Scenario? A Spatio-temporal Diffusion Analysis. Purva Pruthi, Anu Yadav, Farheen Abbasi, and Durga Toshniwal. 2015
In the era of traditional print media, information dissemination was one-way and restricted by geographical boundaries, with limited span and reach. With the advent of online social media, the process of information diffusion has changed significantly. It has become the fastest means of communication and has gained wide popularity. Online social networks like Facebook and Twitter have revolutionized interpersonal communication by giving individuals a platform to express themselves at a global level, beyond their immediate geography. Most research in this area has focused on analyzing the general phenomenon of information diffusion. Our aim is to study the diffusion dynamics of specific real-world events discussed on Twitter, with respect to location and time. We categorize the events into broad categories based on the following features: temporal (short or long), geospatial distribution (local or global), information diffusion mechanism (viral or gradual), influence (popular or unpopular), and cause (natural or planned). Temporal analysis shows that the pre-event, during-event, and post-event frequency distributions of tweets differ with the nature of the event. For example, a planned event like "Delhi Elections" is discussed more after its actual occurrence, whereas another planned event like "Obama’s visit to India" is mainly discussed during the visit itself. Through geospatial analysis, we find that some events that would be expected to stay local cross regional boundaries and become a matter of global discussion. We also study the diffusion of the events using the user interaction graph formed by retweet and mention links. We conclude with a three-dimensional analysis of the spatio-temporal diffusion dynamics of real-world events by exploring the relationships among them.