JSO thesis adventure

We review recurrent questions about groups whose meaning has shifted over time (see Table [table:1]{reference-type="ref" reference="table:1"}). We show that many answers to those vexing questions were determined by authors' ontological stance on group realism, i.e., the idea that group states are somehow irreducible to individual interactions. We introduce four different periods in which the debates took various forms: early social sciences, social network analysis, organizational sciences (and collective action theory), and the evolutionary dynamics of group-level features.

When asking themselves 'What are groups (Q1)?', early social scientists and anthropologists offered a variety of perspectives---some focusing on the nature of group membership, others on the persistence and function of social structures. Their debates set the stage for the ontological and modeling challenges that we face today. As we will see, on one side stand the group realists, concerned with the supervening forces that act upon individuals; on the other, the methodological individualists, who reject the idea that groups possess enough ontological reality to be studied independently.

Charles Cooley (1909) famously distinguished between primary groups---small, tightly knit collectives based on intimate, enduring ties---and secondary groups, which are larger, more impersonal, and goal-oriented [@cooley_social_1909]. This distinction later informed structuralist views that differentiate between core and peripheral members within groups [@homans_human_1950; @freeman_sociological_1992]. Around the same time, William Graham Sumner (1906) emphasized the emotional dimension of group identity, arguing that "we-groups" (in-groups) shape loyalty, patriotism, and favoritism toward others [@sumner_folkways_1906]. These early thinkers thus approached Q2---what kinds of groups exist---by highlighting the structural, phenomenological, and affective differences across group types.

In contrast to Simmel's focus on relational structure, Durkheim, Sumner, and Spencer offered a more explicitly emergentist view of social phenomena. For Émile Durkheim, society is not reducible to individual actions but is instead made up of "social facts"---external forces that shape individual behavior [@durkheim_rules_1895]. These facts emerge from the "composition of individualities with particular consciousness" and give rise to a collective mind. In Durkheim's view, emergence is tied to functionalism: individual roles contribute to maintaining system-level stability. Sumner shared a similar perspective. In Folkways (1906), he argued that group norms and customs emerge from countless repeated actions over time rather than deliberate intention. These norms, once formed, constrain behavior and persist independently of any one person's will. As he put it, folkways are not the product of "human purpose and wit", but arise "unconsciously" through repeated "petty acts" [@sumner_folkways_1906 p.4]. Herbert Spencer further extended this logic by applying evolutionary analogies to social life. He likened societies to organisms, with individuals playing specialized roles akin to organs in a body [@spencer_principles_1896]. This organism-like analogy (Q5) offered an explanation for how social systems can persist despite the turnover of individuals (Q7): it is the continuity of roles and functions that sustains the collective, not the specific people who occupy them. Like Durkheim, Sumner viewed these emergent forces as structuring individual behavior in ways that could not be reduced to individual intentions.

In contrast, the rise of methodological individualism (MI) challenged these emergentist views. MI holds that explanations in the social sciences should begin with individuals---their beliefs, preferences, and actions---rather than abstract group-level entities. Joseph Schumpeter (1909) famously argued that "only individuals can feel wants", making them the fundamental unit of analysis in economics [@schumpeter_concept_1909]. While Schumpeter did not formalize rational choice theory himself, his focus on individual agency and utility laid important philosophical groundwork for later developments in rational choice models, where individuals are treated as having consistent preferences over outcomes, from which market and social phenomena can emerge. Similarly, one of the key figures in modern sociology, Max Weber (1913, 1922), insisted that even large bureaucracies are the result of individual rule-following and decision-making, not emergent collective minds [@weber_categories_1913; @weber_economy_1922]. In this perspective, society does not function like an organism; its structure arises from decentralized rule-based behavior rather than system-level goals. In this way, MI promotes a nominalist ontology: groups are epiphenomena, not actors, social forces, or entities in their own right.

In the debate between group realists and MI, we note the presence of Georg Simmel (1898), who pioneered a structural approach that sought to unify group behavior through patterns of interaction. He introduced the idea that motifs---social configurations involving more than two individuals---are essential to understanding group dynamics (Q9). In particular, triads (three-person groups) exhibit dynamics that dyads cannot, offering "lower intensity, higher stability" and allowing changes in consensus or mediation of conflict [@simmel_persistence_1898]. Simmel emphasized that the persistence of groups (Q4) depends not just on who belongs to a group but on the structure of their interactions. He also introduced the concept of duality---the tension between individual identity and group membership---which affects both cohesion and fragmentation. His work thus connected psychological experiences with higher-order interaction patterns, providing early insight into how group boundaries emerge and stabilize (Q6). Although Simmel did not take a strong stance on the existence of groups, we will see that structuralism later favored patterns of individual interactions over those of groups, a subtle shift that left group interactions secondary to individual ones.

The evolution of groups in schema form, adapted from . On the left, Homans (1950) depicts social exchanges as part of the group, without representing social structures. Wellman (1989) aims to summarize multiple interacting social structures, representing intimate and nonintimate ties and types of relationships: kin (immediate and extended), friends, coworkers, and neighbors. Feld (1981) provides a dual perspective, with foci represented as the intersection of varying social groups.

The work of George C. Homans on small-group research marked a turning point in the study of social dynamics, providing a "playbook" for individual-based sociological studies (CITE). In The Human Group (1950), he focused on interpersonal interactions within small groups, measuring observable behaviors---conversation length, participation, initiation, sentiment---and interpreting them through behaviorist principles [@homans_human_1950]. Drawing on case studies from factory workers, gangs, families, and field observations, he proposed several "elementary propositions" about social exchanges, including the idea that individuals repeat behaviors that are rewarded (the "Success Proposition") and that the value of rewards diminishes over time (he calls it the "Deprivation--Satiation Proposition"). Strongly influenced by behaviorist psychology, which focuses on observable behaviors rather than hidden mental (or group) states, Homans sought to explain social cohesion (Q4) and socialization processes (Q17) through reductionist mechanisms grounded in individual behavior and reinforcement.

While Homans focused on face-to-face interaction, Jacob Moreno (1934) extended this logic to group structure through sociograms---early visualizations of social networks based on sociometric tests [@moreno_who_1934]. In school settings, for example, children were asked to select classmates they preferred to sit with, revealing patterns of ties, cliques, exclusions, and triangles. Although Moreno emphasized psychological and behavioral principles like Homans, he conceptualized these patterns as forming an emergent group structure and embraced an organism-like view of social systems (Q5). He also argued that groups grow toward structured organization "just as the individual organism grows toward maturity", and introduced the metaphor of "social organs" to describe how attraction and repulsion shape group cohesion over time. Although Moreno is remembered for his sociograms in social network analysis, he occupies a position between group functionalism and structuralism, gradually being embraced by methodological individualists.

Whereas small-group research often focused on internal conformity and local dynamics [@festinger_social_1950], other researchers began to study influence as a network-level phenomenon. In Personal Influence (1955), Katz and Lazarsfeld proposed a "two-step flow of communication" model: media messages reach opinion leaders first, who then disseminate them to their peers through interpersonal ties [@katz_personal_1955]. This reframed the question of influence as a function of social structure rather than mass communication. Their notion of "group" shifted from bounded entities to neighborhoods of influence embedded in larger networks---groups not as cohesive wholes, but as relational contexts acting on focal individuals.

Simmel's ideas on duality---the tension between individuals and their group affiliations---resonated strongly with later developments in social network analysis (SNA), particularly the use of dual-mode or bipartite networks [@breiger_duality_1974; @feld_focused_1981; @mcpherson_hypernetwork_1982; @moody_structural_2003]. In these models, one set of nodes represents individuals, and a second set represents affiliations such as clubs, classrooms, shared events, or co-authorships. Edges link individuals to the groups they belong to, allowing researchers to study both the structure of affiliations and the indirect ties they create between people. A common strategy involves projecting bipartite networks into one-mode networks---connecting individuals through shared memberships---which shapes network-level properties such as density, transitivity, and clustering. Building on Simmel's notion of overlapping social circles, Feld (1981) highlighted how intersecting affiliations structure social opportunities, emphasizing the trade-off between intimate ties that offer support and peripheral ties that provide access to novel contexts [@feld_focused_1981].
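The projection step described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the procedure of any cited study; the membership data, the function names `project_onto_people` and `density`, and the choice to weight ties by the number of shared groups are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical affiliation data: person -> set of groups (clubs, labs, events).
memberships = {
    "ana": {"chess_club", "lab_A"},
    "ben": {"chess_club"},
    "cam": {"lab_A", "choir"},
    "dee": {"choir"},
}

def project_onto_people(memberships):
    """Return {frozenset({p, q}): number of groups p and q share}."""
    # Invert the person -> groups map into group -> members.
    members_of = defaultdict(set)
    for person, groups in memberships.items():
        for group in groups:
            members_of[group].add(person)
    # Every pair of co-members gets a tie, weighted by shared memberships.
    weights = defaultdict(int)
    for people in members_of.values():
        for p, q in combinations(sorted(people), 2):
            weights[frozenset((p, q))] += 1
    return dict(weights)

def density(n_nodes, n_edges):
    """Fraction of possible undirected ties that are present."""
    return 2 * n_edges / (n_nodes * (n_nodes - 1))

projection = project_onto_people(memberships)
projected_density = density(len(memberships), len(projection))
```

Because a group with k members becomes a clique of k(k-1)/2 ties in the projection, large affiliations mechanically raise density and clustering, which is one reason the projected network should be read with the original two-mode structure in mind.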

These affiliation-based approaches allow SNA to study group-structured data while avoiding strong ontological commitments to groups as real entities. Groups are typically treated as metadata---attributes of individuals---rather than as entities with causal powers of their own (Q1) [@wellman_structural_1988]. This nominalist framing has steered empirical work toward groups with well-defined membership lists (Q8)---such as voluntary associations, workplaces, and classrooms---while sidelining more abstract questions about group boundaries (Q9) or group-level persistence and transformation over time (Q12--13). Although SNA does not explicitly identify with methodological individualism, it shares its underlying assumptions: individuals remain the ontological primitives, and groups are inferred from patterns of interaction rather than modeled as independent units.

Despite its emphasis on social structure, SNA generally rejected the functionalist and organismic views associated with group realism. It aligned more closely with MI by explaining macro-level outcomes in terms of local ties and individual positions. The rise of MI paralleled the geopolitical shift of the scientific center of gravity to the United States. Influential European émigrés---such as Schumpeter, von Mises, and Hayek---helped popularize a vision of science grounded in individual agency, free markets, and limited state intervention [@oreskes_big_2023]. Ironically, both Spencer's social Darwinism and Austrian economics ultimately converged in justifying social hierarchies as the natural result of competition---whether biological or economic. Yet, group realism kept coming back in different forms. In the following sections, we explore how inter-firm competition, the challenge of cooperation among unrelated individuals, and theories of collective action have renewed interest in groups as units of analysis in their own right.

In Chapter [chapter:interface]{reference-type="ref" reference="chapter:interface"}, we discuss how group interactions are increasingly modeled as distinct from pairwise interactions in models of HONs. However, while these models capture the structural differences of multi-way interactions, they typically stop short of adopting a stronger ontology of groups. In contrast, by introducing group-level states independent of individual states---as we do in Chapter [chapter:coevo]{reference-type="ref" reference="chapter:coevo"}---we can more effectively model the co-evolution of institutions and individuals. This allows us to capture novel dynamics, such as the call for action effect in contagion processes, where higher infection rates do not necessarily result in larger outbreaks once institutional responses are taken into account.

The rise of organizational sciences and the new institutional economics

As MI and social network analysis gained prominence, both tended to sideline groups as fundamental units of analysis---focusing instead on individuals and their relational ties. However, beginning in the 1930s, foundational work in organizational science began to reassert the importance of formal organizations as distinctive social structures. These were not just loose collections of individuals, but coordinated entities characterized by "cooperation among men that is conscious, deliberate, purposeful" [@barnard_functions_1938]. By the mid-20th century, corporations such as IBM, General Motors, General Electric, and Procter & Gamble had become dominant economic actors, each developing their own internal hierarchies, rules, and goals. Rejecting traditional economic assumptions, a new wave of scholars emphasized that many organizational behaviors---such as managing transaction costs or setting internal coordination mechanisms---could not be explained solely by aggregating individual choices [@coase_nature_1937; @simon_concept_1964; @williamson_transaction_1998]. As March and Simon (1958) put it: "High specificity of structure and coordination within organizations---as contrasted with the diffuse and variable relations among organizations and among unorganized individuals---marks off the individual organization as a sociological unit comparable in significance to the individual organism in biology" [@march_organizations_1958]. In this way, the organism-like analogy (Q5) reemerged---not through the lens of social cohesion or collective consciousness, but as a rational-structural metaphor for explaining how formal organizations achieve internal coherence [@scott_organizations_2007].

Organizational science broadly concerns how groups of people, under varying governance structures---rules, roles, norms---coordinate collective action to achieve shared goals. Within economics, institutional theorists explore how institutions shape economic performance and social costs [@north_institutions_1990; @williamson_transaction_1998]. One key approach, transaction cost economics, explains the emergence of firms as organized groups that reduce the costs of negotiating, monitoring, and enforcing contracts [@coase_nature_1937]. Firms are defined by their boundaries, within which coordination and ownership are facilitated (Q11).

Building on this, Nelson and Winter (1973) introduced an evolutionary perspective on institutional change, viewing firms as competing units within a market environment (Q13) [@nelson_toward_1973]. Douglass North (1990) added a critical note: not all institutional changes are optimal, and modeling must include feedback loops between human perception and institutional structure [@north_institutions_1990]. Rather than rejecting functionalism entirely, these scholars called for more principled evolutionary models of institutional persistence and adaptation.

A second perspective understands institutions as formal and informal rule systems that enable coordination and cooperation by reducing uncertainty in social interactions [@bowles_cooperative_2011; @smaldino_cultural_2014]. Drawing on political science and rational choice theory, this view sees institutions as providing structure to otherwise self-interested agents [@hobbes_leviathan_1651]. In the 1960s, this perspective influenced policy-making around resource governance, exemplified in Hardin's famous "tragedy of the commons" [@hardin_tragedy_1968]. Without centralized control or privatization, shared resources would be depleted by self-serving behavior. Elinor Ostrom challenged this logic by showing that under certain conditions, groups can successfully self-organize to manage common-pool resources---without top-down coercion [@ostrom_covenants_1992]. Her work showed that institutions do not simply constrain behavior; they are also shaped by and embedded within group-level dynamics.

Between these macro- and micro-perspectives lies a growing body of work on group effectiveness within organizations (Q18). Often referred to as "team science", this field began in management and organizational psychology but now spans complexity science and even the science of science [@guimera_team_2005]. The central idea is that effective organizations recognize groups---not individuals---as the key units of coordination and problem-solving [@leavitt_suppose_1974; @hackman_design_1987; @katzenbach_wisdom_1992]. Group performance (or synergy) is studied in work teams, sports teams, and research collaborations, with attention to how diversity of expertise and communication patterns shape outcomes [@page_diversity_2019; @almaatouq_collective_2020; @mukherjee_prior_2019]. A recurring theme is the role of socialization (Q17): how new members internalize norms, adopt roles, and align their behaviors with organizational culture [@ashforth_social_1989]. This echoes the early work in social psychology on group identification and intergroup behavior [@tolman_identification_1943].

Together, organizational science and institutional theory offer a new framework for group realism to thrive. They suggest that groups---especially formal organizations---are not reducible to interpersonal ties or individual decisions. They are bounded, persistent, and shaped by internal governance and external selection pressures. They also define boundaries in a way that makes modeling easier: one can simply use workgroups and teams within organizations, as they are well-defined. Arguably, though, this answer remains ontologically unsatisfactory, even if it is useful.

Institutional performance and the evolutionary dynamics of group-level features

In one sense, most would agree that groups can form, grow, dissolve, and occasionally split into daughter groups (Q12). Even Schumpeter, despite popularizing methodological individualism, was deeply interested in group dynamics. He is well-known for his work on the life cycle of firms, arguing that capitalism is driven by a process of "creative destruction," in which established organizations are periodically replaced by newer and more innovative ones [@schumpeter_capitalism_1942]. Yet, controversy resurfaces when such processes---birth, death, reproduction---are thought to generate selection pressure at the group level, thereby influencing the evolution of individual traits (Q5).

As originally formulated by Wynne-Edwards (1962), group selection proposed that cooperative groups could outcompete less cooperative ones, allowing altruistic traits to spread even if they were individually costly [@wynne-edwards_animal_1962]. His application to bird populations suggested that groups practicing reproductive restraint would persist longer than those that overpopulated and collapsed. This was not a return to the organism-like metaphor (where individuals are organs in a body), but a more radical claim: that individuals might adopt costly behaviors simply because they benefit the group [@wilson_structured_1979]. This shifted the explanatory unit away from the gene, but not without resistance. Unlike genes or individuals, groups lack a clear physical boundary---which makes them harder to study and harder to accept as legitimate units of selection.

Today, the emergence of cooperation is largely attributed to mechanisms like kin selection and positive assortment---especially limited interaction ranges that maintain altruistic traits among related or similar individuals [@trivers_evolution_1971; @hamilton_genetical_1964; @smaldino_cultural_2014]. Group selection, at least in its biological form, fell out of favor partly because groups are rarely isolated enough to preserve between-group variation while maintaining low variance within groups [@mcelreath_mathematical_2007]. But cultural evolution offered a potential solution. Unlike genes, cultural transmission mechanisms allow for both the stabilization and selective amplification of group-level traits.

In this context, cultural group selection (CGS)[^1] has gained traction as a framework for explaining uniquely human forms of cooperation [@boyd_group_1990; @richerson_cultural_2016; @wilson_multilevel_2023]. CGS posits that culture can maintain high variation between groups and low variation within groups, which are necessary conditions for selection to act at the group level. The first step is human cultural psychology itself: we are adapted for learning from others, especially through mechanisms such as conformity, prestige bias, and norm enforcement. These mechanisms promote within-group cohesion while preserving intergroup diversity---supporting the emergence of stable group-level phenotypes, scaffolded by shared expectations, norms, and roles.
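The condition of high variation between groups and low variation within groups can be made concrete with a variance decomposition. The sketch below is an illustration under stated assumptions, not a method taken from the cited works: it splits the total variance of a trait into size-weighted within-group and between-group parts (the law of total variance) and reports the between-group share, a quantity similar in spirit to the fixation index used in population genetics. The function name `variance_partition` and the toy trait values are hypothetical.

```python
from statistics import fmean, pvariance

def variance_partition(groups):
    """Split total trait variance into within- and between-group parts.

    groups: list of lists of trait values, one inner list per group.
    Returns (within, between, between_share).
    """
    all_values = [x for g in groups for x in g]
    n = len(all_values)
    grand_mean = fmean(all_values)
    # Size-weighted average of each group's internal variance.
    within = sum(len(g) * pvariance(g) for g in groups) / n
    # Size-weighted variance of group means around the grand mean.
    between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups) / n
    total = within + between
    share = between / total if total > 0 else 0.0
    return within, between, share

# Toy trait: 1.0 = follows a cooperative norm, 0.0 = does not.
well_mixed = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
conformist = [[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]]

mixed_share = variance_partition(well_mixed)[2]
sorted_share = variance_partition(conformist)[2]
```

In the well-mixed case all variation sits within groups, so there is nothing for group-level selection to act on; in the fully conformist case all variation sits between groups.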

The second step is intergroup competition: cultural traits that help a group to thrive and reproduce---whether through warfare, economic success, resource management, or prestige-biased imitation---can spread, even if they are individually costly. Darwin famously suggested in The Descent of Man (1871) that tribes with more altruistic and patriotic members would prevail over less cohesive ones [@darwin_descent_1871; @bowles_did_2009]. More recently, it has been shown that stateless warfare cultures do lead to the expansion of highly cooperative groups, at the expense of groups that are more peaceful in their ways of life [@turchin_war_2013; @turchin_warfare_2010; @bowles_did_2009]. Anthropological studies show that in a warfare context, cultural boundaries are stronger, with the appropriate symbolic markers making the difference between life and death for individuals [@turchin_ultrasociety_2016]. Even in a modern context, war is known to amplify conformity, with group members being more inclined to punish those who deviate from group-beneficial norms [@henrich_weirdest_2020]. While early theories centered on tribes and ethnolinguistic groups with symbolic identity markers (Q8, Q11), more recent work extends these ideas to voluntary organizations such as churches, monasteries, universities, and firms [@henrich_weirdest_2020]. In this view, intergroup competition---whether violent or symbolic---is central (Q16). What matters is that group-level traits shape retention, imitation, and success at the group level.

This insight was paralleled by the rise of institutional and organizational theory in the 1970s. Nelson and Winter extended Schumpeter's theory of innovation to argue that firms exhibit evolutionary dynamics: they vary in routines, strategies, and capabilities and are subject to selective pressures based on performance [@nelson_neoclassical_1974; @nelson_schumpeterian_1982]. They replaced the optimizing rational actor with bounded rationality---where firms adopt heuristics under incomplete information, often locking into suboptimal yet stable trajectories. A classic example is the QWERTY keyboard, a design maintained not by efficiency but by historical constraints and institutional inertia.

Douglass North advanced this perspective by showing how institutions---formal and informal rules---persist through "the learning embodied in individuals, groups, and societies" and are transmitted across generations [@north_rise_1973]. Though he did not initially adopt this view, North came to see institutions as cumulative structures that outlast individuals, shaping incentives and reducing uncertainty in social interactions [@north_institutions_1990]. While he avoided making explicit ontological claims about group realism, his work---alongside that of Nelson and Winter---helped shift attention from individual behavior to the evolution of group-level capabilities. Like Boyd and Richerson, Campbell, and others, they sought to explain how institutional performance could be subject to evolutionary pressures, without presupposing that any observed collective behavior is necessarily adaptive.

Together, CGS and institutional theory imply that group-level traits can evolve, persist, and exert causal influence independently of individuals. For modelers, this raises several conceptual and practical questions: Are group-level traits reducible to individual strategies (Q18)? What boundaries are required for selection to act at the group level (Q11)? How do cultural and institutional inheritance mechanisms differ from genetic ones? And how do individual learning dynamics interact with group performance and evolution? These questions demand explicit attention to cross-level dynamics---how individual decisions scale up into group-level outcomes, and how group-level norms or structures feed back into individual incentives. We address these modeling challenges in the next section, beginning with the idea that some group traits are not just emergent but irreducible.

Working with groups

Modeling group interactions is challenging because of their polysemous social ontology. They involve distinct ways of belonging---and different understandings of what it means to be part of a group. Consider the following statements:

::: spacing 1.10

1. Your group taps maple trees and extracts sap around February every year.
2. Your group rioted in the city when your favorite hockey player was suspended.
3. Your group of workers, united under a common banner, demands minimum wages from the state for all its members.
4. Your group was 'stabbed in the back' when your country achieved full sovereignty.

:::

Each of the above group interactions has a different scale with varying implications. The first two examples are driven by concrete collective action problems, while the latter two rely more heavily on mutual understanding and shared narratives. Crucially, they all require different kinds of institutions---formal and informal rules that bind individual behavior to varying degrees, and in doing so, reshape the nature of group interaction itself. Before diving into our typology and sorting those into dimensions grounded in network science, we briefly review how researchers have engaged with the empirical study of groups. We identify three main approaches that researchers use to study groups, along with relevant results derived from each (see Fig. 1.2).

The study of well-defined groups {#the-study-of-well-defined-groups .unnumbered}

First, in the study of firms and teams, researchers tend to leave out the membership question (Q8) and focus on what is given; as in small group research, teams are thought to be well-defined groups engaged in specific tasks [@bavelas_communication_1950; @hackman_design_1987], while firms are based on membership. In organizational and information science, there is a consensus that small teams (number of coauthors, working groups, focus groups, core software developers) perform best with fewer than ten people; above this number, they are considered 'large' groups, which requires addressing coordination and communication challenges [@isaac_group_1988; @hall_science_2018; @national_research_council_enhancing_2015].

At the other end of the spectrum, we have unions and firms. In the third example, the union claimed to represent 200,000 workers, based on the number of union cards (Fig. [1.2]d); the events took place in 1972, when union density in the province was around 35%. Already in 1958, researchers were investigating the firm size distribution, showing that it is heavy-tailed [@bonini_decision_1958]. Axtell (2001) provides a snapshot of Zipf's law of firm sizes, with the largest firm having more than one million employees (see Fig. [1.2]d) [@axtell_zipf_2001]. Since Axtell published his work, firms such as Walmart have reached 2.1 million employees. That is still roughly a million fewer than the membership of the largest union, the National Education Association, with its 3 million members.
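To make the Zipf's law claim concrete, the following sketch estimates the rank-size slope on a log-log scale, which is close to -1 when sizes follow Zipf's law. It is a rough diagnostic under stated assumptions and not the estimation procedure used in the cited work: the data are synthetic and idealized (the r-th largest firm has size 1,000,000 / r), and the function name `rank_size_slope` is hypothetical.

```python
import math

def rank_size_slope(sizes):
    """OLS slope of log(size) on log(rank); about -1 under Zipf's law."""
    ordered = sorted(sizes, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(size) for size in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic, idealized firm sizes (not empirical data).
synthetic_sizes = [1_000_000 / r for r in range(1, 501)]
slope = rank_size_slope(synthetic_sizes)
```

On real, noisy data a least-squares fit to the rank-size plot is known to be biased, so it is better treated as a first look than as an estimate of the tail exponent; likelihood-based estimators are generally preferred for that purpose.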

Scale of groups I

Yet, as a hierarchical organization, unions can be said to resemble "tribes" in anthropology, sharing a collective identity and forming alliances in larger conflicts. In our example, the union was divided into three main "clans," bound by shared norms and institutions, acting as independent political units. This scale of organization was crucial in tribal warfare, influenced by territorial resources. Although higher-level institutions are present, small-scale cooperation continues to operate at the local level, with a total of 1,701 local unions averaging around 155 members each (similar to Fig. [1.2]a, [1.3]f, and to the camp sizes shown in Fig. [1.2]b). At this scale, models suggest that monitoring free-riders---individuals who benefit from collective payoffs without bearing individual costs---becomes increasingly difficult, as mechanisms of direct and indirect reciprocity are harder to enforce [@boyd_evolution_1988; @ostrom_covenants_1992]. Yet, a review of historical and archaeological evidence suggests that this level of cooperation among unrelated individuals has been typical at least since the late Pleistocene and Holocene [@boyd_largescale_2022; @casari_group_2018].
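The free-rider incentive at this scale can be illustrated with a textbook linear public goods game. This is a minimal sketch, not one of the models in the cited works: each member either keeps an endowment or contributes it to a common pot, the pot is multiplied and shared equally, and the group size of 155 is borrowed from the text while the multiplier and endowment are hypothetical values.

```python
def payoffs(contributions, multiplier, endowment=1.0):
    """Payoff of each member in a linear public goods game.

    Each member keeps what was not contributed and receives an equal
    share of the multiplied common pot.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]

GROUP_SIZE = 155   # average local union size mentioned in the text
MULTIPLIER = 3.0   # hypothetical return on pooled contributions

everyone_contributes = payoffs([1.0] * GROUP_SIZE, MULTIPLIER)
one_free_rider = payoffs([0.0] + [1.0] * (GROUP_SIZE - 1), MULTIPLIER)
nobody_contributes = payoffs([0.0] * GROUP_SIZE, MULTIPLIER)

free_rider_payoff = one_free_rider[0]
contributor_payoff = one_free_rider[1]
```

Each unit contributed returns only `MULTIPLIER / GROUP_SIZE` to the contributor, so in this sketch withholding always pays individually even though universal contribution leaves everyone better off than universal defection. The individual return shrinks as the group grows, which is one way to read the claim that reciprocity becomes harder to sustain at this scale.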

The study of collective action problems

We find some groups that do not fit neatly into the categories of small teams or large-scale cooperation. First, research teams in academia or the smallest groups that cooperate for subsistence in the D-PLACE database show a non-negligible proportion of teams with more than ten members, though they do not approach the empirical mean of the common-pool resources (CPR) database (see Fig. 1.3a). Second, there are evidently papers with more than 10 coauthors, with the largest having over 5,000 (a practice called hyper-authorship). This is more akin to the size of large movie crews, which, to date, have a slightly lower maximum (3,310 people were credited for Iron Man 3). As with movie crews, larger coauthor teams exhibit differentiated roles, which we discuss next in the context of group-level features. When this is the case, group boundaries are derived from theoretical frameworks, one of the most well-known being that of CPR.

In her pioneering work on the governance of CPR (Q15), Elinor Ostrom is well-known for publishing an extensive codebook defining all aspects of the enterprise [@ostrom_cpr_1989]. In it, she and her collaborators mention that, to be considered part of a "well-defined" group (Q11), individuals, and not their households (only the person who can extract the CPR), must have the ability to appropriate relevant resources. Hence, the boundaries of the group are established by a more or less formal agreement among group members, which here is defined by the CPR theoretical framework. In Fig. [1.2]a, we show the result of 86 case studies by Ostrom, revealing that the average size of well-defined groups participating in a CPR is around 154 individuals. If we consider the broader definition of the actual group size of individuals who were somewhat involved in the CPR, the mean increases to about 233 individuals. Anthropologists have shown that this level of cooperation, which includes unrelated individuals, can be traced back as far as 10,000 years, with traditional societies engaged in communal hunts or constructing shared capital facilities [@boyd_largescale_2022].

In Chapter [chapter:groupSkills]{reference-type="ref" reference="chapter:groupSkills"}, we introduce a group-based model of skill emergence in research groups, framed as a public goods problem. We use programming in the humanities as a case study, where groups may face institutional pressure to include computational expertise, creating incentives for individuals to learn programming---even when the personal cost is high. This highlights a dynamic in the science of science: accounting for the tension between group-level pressures and individual-level costs offers new insights into how scientific practices evolve.

The ontology of cultural databases {#the-ontology-of-cultural-databases .unnumbered}

In our fourth example, the ethnolinguistic subgroup is Quebec in 1981, comprising 6.3 million people within a Canada-wide population of approximately 24 million. The story of your group being 'stabbed' refers to what Quebecois call "La Nuit des longs couteaux" (the Night of the Long Knives), known to English Canadians as the "Kitchen Accord". Both names refer to a compromise reached in November 1981 between federal Justice Minister Jean Chrétien and two provincial ministers. By resolving disputes over an amending formula and a Charter of Rights, it paved the way for the patriation of the Canadian Constitution without Quebec's consent. The story is somewhat scale-independent, but if enough people believe in it, it becomes ingrained in society as a powerful myth that can prevent the blending of cultures, stir social protests, and maintain strong boundaries between the subgroup within a territory and the rest of the country.

In discussing the challenges of cultural evolutionary databases, Slingerland et al. (2024) highlight the necessity of committing to a particular group ontology [@slingerland_database_2024]. For example, they note that Seshat: The Global History Databank [@turchin_seshat_2015] (Fig. [1.3]a), by focusing on polities (i.e., politically autonomous groups), loses granularity in tracking the evolution of cultural traits, such as religious beliefs, that may not align neatly with broader political boundaries. In contrast, the Database of Places, Language, Culture, and Environment (D-PLACE) [@kirby_d-place_2016] focuses on ethnolinguistic groups as typically found in traditional societies (Fig. [1.2]c).

Scale of groups II

The Database of Religious History (DRH) [@slingerland_coding_2020] is a recent cultural database that records fine-grained details in order to measure variation in specific group-level traits precisely. In the DRH, cultural traits are presented as a series of domain-driven questions organized into polls (such as the religious institutions poll) that are filled out by experts in the field. As such, the DRH distinguishes between reporting group affiliation and more stringent measures such as the frequency of participation in group activities and the significance of the group to individuals (Fig. [1.2]f). The DRH is also noteworthy in that it invites experts from the humanities to weigh in on particular cultural topics, such as estimating the number of adherents of religious groups. Here are a few excerpts that highlight the ontological challenge of such an undertaking:

[Mithraism (100 CE - 400 CE; part of the Roman Empire region)]{.smallcaps}: "This is a guesstimate, starting from a reasonable guess at the number of Mithraists in c. 200 CE in Ostia (pop. c. 60,000) as c. 600 (c. 15 mithraea in the city's excavated area = half its total area; a mithraeum on average seats 20: 15x2x20=600), then extrapolating to the Roman Empire (pop. c. 60M), but cutting the resulting number to 1/30th---here's where the use of demographics gets really disreputable, but you wanted an estimate, didn't you?---to allow for the fact that Ostia was a city of particular cult concentration. Note 1) the sample region is coextensive with the spread of the cult, i.e. the Roman Empire. 2) The "adherents" were exclusively adult males." -Professor Roger Beck at the University of Toronto in Historical Studies, expert on Mithraism.

"Estimate of the population of the city of Amarna. Unclear whether they were all believers and whether there were believers outside the city of Amarna., Sources:: Barry Kemp: The City of Akhenaten and Nefertiti. 2012, 272." -Professor Thomas Schneider at the University of British Columbia in Egyptology and Near Eastern Studies.

[Northern Irish Protestants (2015-2016)]{.smallcaps}: "As "Protestant" is also used to refer to an ethnic group in Northern Ireland, it is difficult to estimate how many of those who claim to be "Protestant" practice the religion, or even necessarily believe in it's core tenets." -PhD candidate Samuel Ward at the Institute of Cognition and Culture.
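
To make the reasoning in the Mithraism excerpt concrete, the arithmetic can be retraced step by step. The sketch below is only one reading of the quotation: the variable names are ours, the input figures are those given by the expert, and the final number is not stated in the excerpt. It is what follows if the 1/30th cut is applied to a proportional extrapolation from Ostia to the whole Empire.

```python
# Retrace the back-of-the-envelope estimate quoted in the Mithraism excerpt.
# Variable names are illustrative (ours); input figures come from the quotation.

mithraea_excavated = 15       # mithraea in the excavated area of Ostia
area_multiplier = 2           # excavated area is about half of the city
seats_per_mithraeum = 20      # average seating capacity of a mithraeum

# Step 1: adherents in Ostia (the excerpt's 15 x 2 x 20).
ostia_mithraists = mithraea_excavated * area_multiplier * seats_per_mithraeum

ostia_population = 60_000
empire_population = 60_000_000
concentration_discount = 30   # Ostia treated as unusually cult-dense

# Step 2: proportional extrapolation to the Empire (assumed reading).
naive_empire_estimate = ostia_mithraists * empire_population // ostia_population

# Step 3: cut the result to 1/30th, as the expert describes.
discounted_estimate = naive_empire_estimate // concentration_discount

print(ostia_mithraists, naive_empire_estimate, discounted_estimate)
```

Under these assumptions, every step rests on a modeling choice (seating capacity as a proxy for membership, the excavated area as representative, the size of the discount), so the resulting count depends as much on the chosen group ontology as on the data.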

These excerpts show the inherent challenges in studying groups, as human socio-cultural groups are shaped by our interpretation of the interaction between institutional arrangements and individual psychology. It helps to model such groups in terms of inter-ethnic interaction between members of groups that differ in size, prestige, and power, as with the "Kitchen Accord" [@bunce_sustainability_2018]. In that sense, ethnic groups are not so far from unions, but cultural variation structured by ethnic identity runs deeper in that it is shaped in the ontogeny of group members and is marked by a degree of spatial structure [@bunce_ethnicity_2022]. Whereas unions and teams can sometimes use list-based membership as a proxy for group membership, group-level features (norms, beliefs, stories, or institutions) require a more extensive modeling effort and stronger ontologies to study them.

Similarly, to conclude our second example, measuring protest can be as simple as counting people in a march. For instance, the number of participants in the so-called Richard Riot (named after Maurice Richard, taking place in 1955) was estimated to be about 6,000 people. Although it was fairly small compared to medieval protests in Europe, it was significant for its people nonetheless (see Fig. [1.2]{reference-type="ref" reference="fig:ScaleOfGroups"}b). For the HiSCoD database, the authors typically filtered out data from existing databases of social protests [@chambru_introducing_2024]. An alternative would be to commit to the view of protest as having a role within cultural evolution, where its shape and numbers depend on what it accomplished at the societal level.

Groups can be studied using a well-defined, list-based definition. However, without explicit models of group interactions, drawing principled conclusions about their nature remains difficult. In the absence of a clear cultural ontology, group boundaries risk being defined for convenience---leading to conclusions based on what is easiest to observe. We argue that the models from the physics of HONs provide a useful framework for modeling diverse group interactions, but a clearer roadmap is needed to guide this effort.

[^1]: We take cultural group selection and multilevel selection theory, as developed by D.S. Wilson, E. Sober, and others, as formally equivalent here. Both frameworks rely on related formalisms and, importantly, oppose the same methodological individualism seen in biology and the social sciences.