Classifying Philosophy: Problems and Strategies
Venue: online. Zoom link; password: workshop
Date: Wednesday, March 31, 2021
- 2:00 pm Barry Smith: Introduction and Welcome
- 2:10 pm Giulio Carducci & Marco Leontino: Semantically Aware Text Categorization for Metadata Annotation
- 2:40 pm Colin F. Allen: Classifying Philosophy – InPhO: By the People and For the People
- 3:10 pm Jaimie Murdock: Comparing Philosophy Categorization Systems
- 3:40 pm Break
- 3:50 pm Christophe Malaterre: Categorizing Philosophy of Science by Means of Topic-Modeling
- 4:20 pm Gloria Sansò & Yusheng Xia: Some Proposals to Improve the PhilPapers Categorization System
- 4:50 pm Joe Ulatowski: Truth Under the Macroscope
- 5:20 pm Louis Chartrand: Explorations into the Language Model Revolution
Semantically Aware Text Categorization for Metadata Annotation
Giulio Carducci & Marco Leontino
We present a system aimed at solving a long-standing and challenging problem: acquiring a classifier to automatically annotate bibliographic records, starting from a huge set of unbalanced and unlabelled data. We illustrate the main features of the dataset, the learning algorithm adopted, and how it was used to discriminate philosophical documents from documents of other disciplines. One strength of our approach lies in the novel combination of a standard learning approach with a semantic one: the results of the acquired classifier are improved by accessing a semantic network containing conceptual information. We illustrate the experimentation by describing the rationale behind the construction of the training and test sets. We then report and discuss the results obtained and conclude by outlining future work.
Classifying Philosophy – InPhO: By the People and For the People
The InPhO project was originally conceived in recognition of the fact that philosophy was never going to attract the kind of resources needed to build the kinds of formal representation schemes exemplified by the Gene Ontology and the Disease Ontology. From the beginning we were interested in crowd-sourcing simple judgements about algorithmically identified relationships among terms in texts, and then using machine reasoning to convert those judgements into a classification hierarchy for philosophical ideas. Our problems on the input processing side were (1) how to automatically scrape philosophical text (in our case the Stanford Encyclopedia of Philosophy) so as to (2) deliver simple questions that could be answered with minimal effort by busy people, (3) how to combine judgements of experts and non-experts alike, and (4) how to incentivize people to do the work. Our problems on the output side included how to represent the results of the processing to (5) machines and (6) people in ways that are useful to both. None of these problems has been fully solved, although we like to think we had some good ideas! In this talk I will discuss the strengths and weaknesses of the approach we took, and I hope to foster a discussion of what exactly the goals should be, and of whether it would be better to clear the ground and start over or reasonable to keep building onto the existing framework, despite the risk of committing the sunk cost fallacy.
Comparing Philosophy Categorization Systems
The InPhO Project has collected user feedback on the relationships between philosophical concepts since 2006. This feedback has been stratified by expertise level and is stored on a per-user basis. As the InPhO taxonomy is populated through formal reasoning methods over this feedback, alternate concept schemes can be generated by changing the input scheme or the reasoning program. I discuss some of the challenges in managing multiple taxonomies, including ontology alignment, namespacing, and representation patterns.
Categorizing Philosophy of Science by Means of Topic-Modeling
Philosophy of science covers a broad array of questions about scientific knowledge, from its epistemic grounding to its value-ladenness, including issues that arise in specific scientific domains and many others. Needless to say, these numerous research questions have also varied in intensity over the years. In this contribution, we propose to apply distant-reading computational approaches to categorize the broad trends in philosophy of science research over the past eight decades. By applying topic-modeling tools to the complete full-text corpus of one major journal of the field, Philosophy of Science, we identify 126 major research topics and map out their diachronic evolution from the journal's launch in 1934 up until 2015. We also show how clustering and rule inference algorithms can help identify topical associative patterns in specific types of articles. We hope to show how these tools, as well as others, can contribute to the broader project of categorizing philosophy.
Explorations into the Language Model Revolution
Even as document classification is increasingly done with the assistance of automatic methods, the introduction of pretrained neural language models promises a paradigm change in NLP, and could thus change our approach to classification. In particular, contextualized word and sentence embedding models trained on multiple NLP tasks, such as BERT and its variants, have enabled new kinds of representations for textual data, which have become a reference for classification tasks. However, digital humanists have yet to catch up to these new technologies and use them to their full extent. In this talk, I will try to evaluate the potential of BERT-based methods for the classification of philosophical papers. I first present BERT in broad strokes, highlighting its strengths and weaknesses in representing semantic and syntactic information from textual data. Then I report on tasks of supervised classification and unsupervised clustering of abstracts of philosophical papers. Finally, I suggest how these methods might enable new ways of organizing philosophical documents.
Truth Under the Macroscope
The Macroscope (macroscope.tech) is a client-server user interface that lets users query and analyse the synchronic contextual structure of words and diachronic word embeddings, uncovering in the former which words are most semantically similar and in the latter which words most often co-occur. The Macroscope draws on over 155 billion words of historical text from 1800 to 2009, and its corpora adjust automatically in accordance with the English Google Ngram Book corpus. In this presentation, I show how to examine ‘truth’ and its cognates using The Macroscope and what such an exploration tells us about the nature of truth, in particular whether truth evolves in a way that favours alethic relativism.
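As a rough illustration of what querying for the most semantically similar words in a given period can involve, the sketch below ranks neighbours of a target word by cosine similarity within period-specific embedding spaces. This is a generic technique and not The Macroscope's own implementation or API. The function names, the years, the placeholder neighbour words, and all vectors are invented for the example and carry no empirical meaning.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(word: str, space: Dict[str, Vector], k: int = 1) -> List[Tuple[str, float]]:
    """Return the k words in `space` most similar to `word`, best first."""
    target = space[word]
    scored = [
        (other, cosine(target, vec))
        for other, vec in space.items()
        if other != word
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy data: one small embedding space per period.
# The vectors are made up purely to exercise the code.
EMBEDDINGS: Dict[int, Dict[str, Vector]] = {
    1850: {
        "truth": [1.0, 0.0],
        "neighbour_a": [0.9, 0.1],
        "neighbour_b": [0.0, 1.0],
    },
    2000: {
        "truth": [0.6, 0.8],
        "neighbour_a": [1.0, 0.0],
        "neighbour_b": [0.5, 0.9],
    },
}

if __name__ == "__main__":
    for year, space in sorted(EMBEDDINGS.items()):
        print(year, nearest("truth", space, k=1))
```

Comparing the ranked neighbours of the same word across periods is one simple way to make a diachronic shift visible; a real analysis would use embeddings trained on the historical corpus and would need the spaces for different periods to be comparable.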
Some Proposals to Improve the PhilPapers Categorization System
Gloria Sansò & Yusheng Xia
PhilPapers has a comprehensive categorization system with a 5-level hierarchy and more than 5000 categories. A categorization system of this size and complexity, however, presents problems. Some of these problems concern how the categories are related to one another. Some categories do not have proper super-categories (there are, for example, categories for ‘aesthetic cognition’ and ‘mathematical cognition’, but not for ‘cognition’ itself). In other cases, categories are not subsumed by the proper super-categories (‘knowledge of emotion’, for example, is subsumed by ‘aspects of emotion’ instead of by ‘knowledge’). Other problems concern the rules that users have to follow in order to classify their philosophical works. According to one of these rules, all philosophical works have to be placed in at least one leaf category. In practice, this means that many of them are placed in the so-called ‘Misc. categories’. We shall argue that there are both theoretical and practical reasons why these categories should be avoided. We note that this goal can be easily reached by using a replacement rule requiring that each work be placed in a leaf category other than Misc. or, where this is impossible, in the immediate super-category.
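The replacement rule described above can be sketched in a few lines, assuming the taxonomy is available as a simple child-to-parent mapping. The category names, the ", Misc" naming convention, and the function names below are hypothetical and chosen only for illustration; they are not taken from the PhilPapers data or API.

```python
from typing import Dict, Optional

# Hypothetical toy taxonomy: each category maps to its immediate
# super-category (None marks a top-level category).
PARENT: Dict[str, Optional[str]] = {
    "Emotions": None,
    "Aspects of Emotion": "Emotions",
    "Knowledge of Emotion": "Aspects of Emotion",
    "Aspects of Emotion, Misc": "Aspects of Emotion",
}


def is_misc(category: str) -> bool:
    """Assumed convention: Misc. categories end in ', Misc'."""
    return category.endswith(", Misc")


def is_leaf(category: str) -> bool:
    """A leaf is a category that is nobody's super-category."""
    return category not in set(PARENT.values())


def place(requested: str) -> str:
    """Apply the replacement rule to a requested placement.

    A non-Misc leaf is kept as is. A Misc. category is replaced by
    its immediate super-category.
    """
    if requested not in PARENT:
        raise KeyError(f"unknown category: {requested}")
    if is_misc(requested):
        parent = PARENT[requested]
        if parent is None:
            raise ValueError(f"{requested} has no super-category")
        return parent
    return requested


if __name__ == "__main__":
    print(place("Knowledge of Emotion"))
    print(place("Aspects of Emotion, Misc"))
```

Under this sketch, a work that would previously have gone into a Misc. category ends up attached directly to the non-leaf super-category, so the Misc. categories themselves are no longer needed.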
Barry Smith is a well-known contributor to both theoretical and applied ontology.
Giulio Carducci works as an R&D software engineer at Synapta, a company based in Turin which operates in the field of public procurement. Previously, he won a research grant for the REPOSUM project from the University of Turin, where he studied a corpus of metadata about PhD theses using several data analysis and natural language processing techniques.
Marco Leontino has recently been involved in the PRiSMHA project from the University of Turin, where he contributed to the development of an annotation platform about historical events. Previously, he won a research grant for the REPOSUM project from the University of Turin, where he studied a corpus of metadata about PhD theses.
Colin F. Allen is Distinguished Professor at the University of Pittsburgh. His main areas of research are neuroscience and the philosophical foundations of cognitive science. He is the director of the Internet Philosophy Ontology (InPhO) project, which has received many research grants for its work in computational humanities.
Jaimie Murdock is a Senior Member of the Technical Staff at Sandia National Laboratories. He works on digital history, and he is particularly interested in intellectual development. For ten years, he has been the lead developer of the InPhO Project.
Christophe Malaterre is Professor at the Université du Québec à Montréal. His main area of research is philosophy of science with a special focus on astrobiology and the origins of life; he is also interested in digital humanities.
Louis Chartrand is a postdoctoral researcher at the University of Pittsburgh and an associate researcher at the Université du Québec à Montréal. He is an expert in concept mining and corpora, currently working on the Geography of Philosophy Project.
Joe Ulatowski is Senior Lecturer in Philosophy and Director of the Experimental Philosophy Research Group at the University of Waikato, in New Zealand. His research focuses on facts, and on the nature and value of truth.
Gloria Sansò is a PhD student at the University at Buffalo and a member of LabOnt - Center for Ontology. Her main area of research is social ontology.
Yusheng Xia is a Master’s student at the University at Buffalo. He is currently working on metaphysics and digital approaches to the humanities.