Using machine learning to make interdisciplinary studies tractable - Agency as a case study
Researchers are often faced with novel and complex problems requiring interdisciplinary solutions. However, interdisciplinary research requires integrating previously unrelated concepts across different fields, a task that involves discovering and processing very large quantities of information. Given the nature of the challenge, big data and machine learning tools naturally come to mind as potential solutions. Here, we present an approach that automates the discovery of relevant literature and uses machine intelligence to identify fine-grained semantic relationships embedded within thousands of articles. Specifically, we apply this technique to the case of human agency. In doing so, we aim to fill a critical gap in the broader literature, namely the absence of an account of agency that integrates the sociological and psychological natures of the phenomenon. We programmatically scanned six databases using the keywords ‘human agency’. The automated method mined more than 2,700 full papers across nine disciplines. We then used Latent Dirichlet Allocation (LDA), a Bayesian machine learning technique, to identify 54 topics present in this corpus. Principal component analysis (PCA) was used to project the topics into a low-dimensional semantic space so that the overall structure of the literature could be visualized. Rendering the topics as a network allowed us to locate specific cross-disciplinary relationships in a haystack of literature without having to manually read nearly 3,000 papers. Finally, the trained model was used to quantify intersectionality within each paper, which helped us identify key articles. Our method enables researchers to discover a broad and exhaustive corpus of relevant literature, quickly develop a big-picture understanding from it, and discover deep, interdisciplinary connections. Because it is automated, the approach ameliorates selection biases. It is also sensitive to different conceptualizations of the same word, which makes it particularly suited to processing interdisciplinary literature. The method is topic-neutral and therefore broadly applicable. We have published its codebase on GitHub for the wider community.
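The abstract does not say how intersectionality within a paper is computed, so the following is only an illustrative sketch under an assumed definition: a paper's score is the normalized Shannon entropy of its topic distribution, so a paper that spreads evenly over many topics scores near 1 and a single-topic paper scores near 0. The function names, the paper identifiers, and the four-topic weights are hypothetical. In practice the weights would come from a trained topic model such as the 54-topic LDA model described above.

```python
import math


def topic_entropy(weights):
    """Shannon entropy (in bits) of a topic-weight vector, after normalizing it."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("topic weights must sum to a positive value")
    # Zero-weight topics contribute nothing to the entropy, so skip them.
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def rank_by_topic_mixing(doc_topics):
    """Return (paper_id, score) pairs sorted from most to least mixed.

    The score is entropy divided by its maximum, log2(number of topics),
    so it lies in [0, 1]. This scoring rule is an assumption made for
    illustration, not the measure used in the paper.
    """
    scores = []
    for paper_id, weights in doc_topics.items():
        n_topics = len(weights)
        max_entropy = math.log2(n_topics) if n_topics > 1 else 1.0
        scores.append((paper_id, topic_entropy(weights) / max_entropy))
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical per-paper topic distributions (4 topics for brevity).
doc_topics = {
    "paper_a": [0.97, 0.01, 0.01, 0.01],  # dominated by one topic
    "paper_b": [0.25, 0.25, 0.25, 0.25],  # evenly spread over all topics
    "paper_c": [0.50, 0.50, 0.00, 0.00],  # split between two topics
}
ranking = rank_by_topic_mixing(doc_topics)
```

Under this assumed measure, papers at the top of `ranking` are candidates for bridging several topics and are the ones worth reading first. Other measures, such as the number of topics above a weight threshold, would serve the same purpose.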
Cite this as:
Hemmer, P., Musolino, J., &