Moshe Cotacallapa

Data Science & Complex Networks
São José dos Campos, São Paulo - Brazil
twitter | instagram | telegram

Hello, I'm Moshe, a data scientist at Climatempo Labs and a fourth-year PhD student at the Laboratory for Applied Computing, National Institute for Space Research, in Brazil, under the supervision of Prof. Marcos G. Quiles, Prof. Elbert E. N. Macau and Prof. Manoel Cardoso. My research focuses on measuring changes in temporal complex networks and on data mining large datasets from sources such as NASA and Google BigQuery. In my free time, I develop side projects, listen to classical music, paint watercolors, play the violin or ocarina, and read the news.

Before starting my PhD in 2015, I received my Master’s degree in Complex Systems Modeling from the University of São Paulo (USP), Brazil, where I developed a master equation model of epidemics in complex networks, under the supervision of Prof. Masayuki O. Hase. Besides my academic career, I also studied Portrait and Human Figure for 4 years at The National Museum of Lima (MALI), in Peru, under the guidance of Prof. Jorge Flores.

- For more details, please look at my curriculum vitae

Research interests:

  • Temporal networks
  • Social network analysis
  • Data mining large datasets

Contact:

Tools I use and recommend:

  • Jupyter Notebook: The perfect environment for scientific software development in Python, R and Julia. Easy to test and share code.
  • Distill Monitor: A very configurable online tool for monitoring and alerting when any change is found on the website you want to track.
  • RocketBolt: Know when and where the recipient opened the email you sent.
  • Mendeley: Save and organize your bibliography. Use the browser extension to pull the data from the article you are visiting.
  • Overleaf: Write your documents online in LaTeX. Edit and share easily with your colleagues. No installation needed.
  • Slides: A simple and powerful online presentation maker. Embed almost anything inside the slides.
  • Pocket: Save any content from the Web to read later.

Publications:

Engagement index for users and conversations in encrypted messages from WhatsApp groups

Moshe Cotacallapa, Didier A Vega-Oliveros

WhatsApp, a very popular cross-platform messaging platform with more than one and a half billion users, is the preferred medium for communication in several countries. One of the interesting features of the platform is called Groups (WG), which is an end-to-end encrypted virtual room created by a group of individuals, where only the group members can send and see the messages. The privacy and security offered by WGs have attracted the attention of families, friends, businesses, and organizations. However, in several cases, these WGs have been used as a powerful tool for spreading dangerous content or fake news. In this work, we propose a new method for measuring the level of engagement of conversations and users in WhatsApp groups, by using temporal interaction networks and without reading the messages. Our framework creates an ensemble of networks that represent the temporal evolution of the conversation every 10 minutes. In this way, we use network measurements to build an Engagement Index (EI) for fractions of the conversations. Our results on data from five real-world WGs indicate that the EI is able to identify different types of conversations and users' behaviors according to each category, as well as anomalies in users' engagement.

arXiv preprint arXiv:1906.08875.

From spatio-temporal data to chronological networks: an application to wildfire analysis

Didier A. Vega-Oliveros, Moshé Cotacallapa, Leonardo N. Ferreira, Marcos Quiles, Liang Zhao, Elbert E. N. Macau, Manoel F. Cardoso

Network theory has established itself as an appropriate tool for complex systems analysis and pattern recognition. In the context of spatiotemporal data analysis, correlation networks are used in the vast majority of works. However, the Pearson correlation coefficient captures only linear relationships and does not correctly capture recurrent events. This missed information is essential for temporal pattern recognition. In this work, we propose a chronological network construction process that is capable of capturing various events. Similar to the previous methods, we divide the area of study into grid cells and represent them by nodes. In our approach, links are established if two consecutive events occur in two different nodes. Our method is computationally efficient, adaptable to different time windows and can be applied to any spatiotemporal data set. As a proof-of-concept, we evaluated the proposed approach by constructing chronological networks from the MODIS dataset for fire events in the Amazon basin. We explore two data analytic approaches: one static and another temporal. The results show some activity patterns on the fire events and a displacement phenomenon over the year. The validity of the analyses in this application indicates that our data modeling approach is very promising for spatio-temporal data mining.

SAC '19 Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing Pages 675-682.
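
The construction step described in the abstract above (grid cells as nodes, a link whenever two consecutive events fall in two different cells) can be sketched in a few lines of Python. This is only a minimal illustration, not the code used in the paper: the function names, the `(timestamp, latitude, longitude)` event format, the `cell_size` parameter, and the choice of directed, weighted links are all assumptions made for the example.

```python
from collections import Counter
from typing import Iterable, Tuple

Event = Tuple[float, float, float]  # (timestamp, latitude, longitude), assumed format
Cell = Tuple[int, int]              # integer index of a grid cell


def grid_cell(lat: float, lon: float, cell_size: float) -> Cell:
    """Map a coordinate to the index of the grid cell that contains it."""
    return (int(lat // cell_size), int(lon // cell_size))


def chronological_edges(events: Iterable[Event], cell_size: float = 1.0) -> Counter:
    """Count links between the cells hit by consecutive events.

    Assumption: links are directed (earlier cell -> later cell) and weighted
    by how many times the transition occurs.
    """
    edges: Counter = Counter()
    previous = None
    # Process events in chronological order.
    for _, lat, lon in sorted(events, key=lambda event: event[0]):
        current = grid_cell(lat, lon, cell_size)
        # Link only when consecutive events fall in two different cells.
        if previous is not None and previous != current:
            edges[(previous, current)] += 1
        previous = current
    return edges


# Hypothetical events, for illustration only.
events = [
    (0, -3.2, -60.1),
    (1, -3.4, -60.3),  # same cell as the previous event: no link
    (2, -5.7, -62.9),
    (3, -3.9, -60.8),
]
edges = chronological_edges(events, cell_size=1.0)
for (source, target), weight in edges.items():
    print(source, "->", target, "weight", weight)
```

The resulting counter can be loaded into any graph library as a weighted edge list. Restricting `events` to a given time window before calling the function gives one network per window.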

Random walk in degree space and the time-dependent Watts-Strogatz model

H. L. Casa Grande, M. Cotacallapa, M. O. Hase

In this work, we propose a scheme that provides an analytical estimate for the time-dependent degree distribution of some networks. This scheme maps the problem into a random walk in degree space, and then we choose the paths that are responsible for the dominant contributions. The method is illustrated on the dynamical versions of the Erdős-Rényi and Watts-Strogatz graphs, which were introduced as static models in the original formulation. We have succeeded in obtaining an analytical form for the dynamical Watts-Strogatz model, which is asymptotically exact for some regimes.

Phys. Rev. E 95, 012321 (2017).

Epidemics in networks: a master equation approach

M. Cotacallapa and M. O. Hase

A problem closely related to epidemiology, where a subgraph of 'infected' links is defined inside a larger network, is investigated. This subgraph is generated from the underlying network by a random variable, which decides whether a link is able to propagate a disease/information. The relaxation timescale of this random variable is examined in both annealed and quenched limits, and the effectiveness of propagation of disease/information is analyzed. The dynamics of the model is governed by a master equation and two types of underlying network are considered: one is scale-free and the other has exponential degree distribution. We have shown that the relaxation timescale of the contagion variable has a major influence on the topology of the subgraph of infected links, which determines the efficiency of spreading of disease/information over the network.

Journal of Physics A: Mathematical and Theoretical 49, 065001 (2016).

Sentiment and Behavior Analysis of One Controversial American Individual on Twitter

J. Eliakin M. de Oliveira, M. Cotacallapa, Wilson Seron, Rafael D. C. dos Santos, Marcos G. Quiles

Social media is a convenient tool for expressing ideas and a powerful means for opinion formation. In this paper, we apply sentiment analysis and machine learning techniques to study a controversial American individual on Twitter, aiming to grasp temporal patterns of opinion changes and the geographical distribution of sentiments (positive, neutral or negative) in the American territory. Specifically, we choose the American TV presenter and candidate for the Republican party nomination, Donald J. Trump. The results acquired aim to elucidate some interesting points about the data, such as: what is the distribution of users considering a match between their sentiment and their relevance? Which clusters can we get from the temporal data of each state? How are sentiments distributed before and after the first two Republican party debates?

Lecture Notes in Computer Science. 1ed.: Springer International Publishing v. 9948, p. 509-518 (2016).