Impact of early engagement on longevity of IETF participation

by Prashant Khare • Friday 02 April 2021

Introduction

What factors influence whether an Internet Engineering Task Force (IETF) participant remains engaged with the IETF community for a long time? Are there early signs that indicate whether a new participant will stay engaged with IETF activities, or is likely to disengage?

In our recent work, we have attempted to answer these questions. Between 2000 and 2004, nearly 4000 participants per year were active across the various mailing lists, and nearly 2000 new participants per year joined them. By the beginning of the next decade (2011 onwards), the number of active participants per year across mailing lists was still close to 4000. While this could reflect an entirely new generation of IETF participants, or a mix of long-standing participants and new ones, the reality is that a significant proportion of the people who joined over the years eventually became inactive or disengaged from the IETF community, at least on the mailing lists.

So we decided to look at the interaction behaviour of new joiners over the years. Without giving too much away, we find that engaging people early in their life span may be the most effective way to keep them engaged. We see a clear difference between the early interaction activity of those who go on to stay for a long time and that of those who leave early.

Methodology

We take over 1100 IETF mailing list archives (all the mailing lists available at the time of download) and, for each user, determine their activity period (also referred to as ‘age’): the time between their first and last email. Note that we identify users with multiple accounts and merge them under unique identifiers, resulting in more than 200,000 identifiers.
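As a rough illustration of the ‘age’ computation, the sketch below derives an activity period from first and last message timestamps. The `(user_id, timestamp)` record layout and the function name are hypothetical, and it assumes that accounts belonging to the same person have already been merged under one identifier.

```python
from datetime import datetime

def activity_periods(messages):
    """Return {user_id: (first_seen, last_seen, age_in_years)}.

    `messages` is an iterable of (user_id, timestamp) pairs. This layout
    is an assumption for illustration, not the format of the archives.
    """
    first_last = {}
    for user_id, sent_at in messages:
        if user_id not in first_last:
            first_last[user_id] = [sent_at, sent_at]
        else:
            span = first_last[user_id]
            span[0] = min(span[0], sent_at)  # earliest email so far
            span[1] = max(span[1], sent_at)  # latest email so far
    return {
        user_id: (first, last, (last - first).days / 365.25)
        for user_id, (first, last) in first_last.items()
    }

# Made-up example data.
messages = [
    ("alice", datetime(2003, 5, 1)),
    ("bob", datetime(2004, 1, 10)),
    ("alice", datetime(2009, 5, 1)),
    ("alice", datetime(2001, 5, 1)),
]
periods = activity_periods(messages)
```

Here `periods["alice"]` spans 2001 to 2009, while `bob`, with a single email, has an age of zero.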

The first thing we notice is that the data is dominated by users who send only one or two emails (around 83% of all users). These one-off emails range from spam to introductions and greetings. To avoid skewing the results, we therefore consider only users who have sent at least three messages. It is important to note that the remaining 17% of accounts contribute over 90% of the total volume of emails (over 2.8 million emails).
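The filter can be sketched as follows. This is a minimal version that assumes one sender identifier per message; the function name and toy data are hypothetical. It also reports the share of users kept and the share of email volume they account for, the two quantities quoted above.

```python
from collections import Counter

def filter_active_senders(sender_per_message, min_messages=3):
    """Keep users with at least `min_messages` emails.

    Returns (kept_users, share_of_users_kept, share_of_volume_kept).
    Assumes the input contains at least one message.
    """
    counts = Counter(sender_per_message)
    kept = {user for user, n in counts.items() if n >= min_messages}
    kept_volume = sum(counts[user] for user in kept)
    return kept, len(kept) / len(counts), kept_volume / len(sender_per_message)

# Toy data: five users, thirteen messages.
senders = ["a"] * 6 + ["b"] * 3 + ["c", "d", "d", "e"]
kept, user_share, volume_share = filter_active_senders(senders)
```

In the toy data, two of five users pass the filter, yet they account for nine of the thirteen messages.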

Analysis

Question: How many years do IETF participants generally remain active, as they join over the years?

To begin with, we are curious about the age (number of years active) that IETF participants tend to reach. Figure 1 shows a kernel density of the final age reached by people joining in each year. The colour bar on the side maps colour intensity to probability: the more intense the colour, the higher the probability. The darker contours at the bottom indicate that a large number of people leave early, while some participants stay for a few more years and some remain active for as long as they could.

Kernel Density: Age acquired by people joining each year (Figure 1)

To understand more about the age categories, we fit Gaussian Mixture Models (GMMs) to reveal possible clusters in the distribution of the age that participants go on to reach. The data records the year in which a person first exchanged email (year born) and the number of years the person remains active (age).
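For readers unfamiliar with mixture models, the sketch below fits a Gaussian mixture by expectation-maximisation in plain Python. It is deliberately simplified: it works on age alone (one dimension) and uses two components on made-up data, whereas the analysis here uses both year born and age. The function name and the quantile-based initialisation are assumptions for illustration, not the method used in this work.

```python
import math

def fit_gmm_1d(xs, k, iters=200):
    """Fit a k-component 1-D Gaussian mixture with plain EM."""
    n = len(xs)
    ordered = sorted(xs)
    # Initialise means at evenly spaced quantiles (an illustrative choice).
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(xs) / n
    spread = max(sum((x - overall_mean) ** 2 for x in xs) / n, 1e-6)
    variances = [spread] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            if total == 0.0:
                resp.append([1.0 / k] * k)  # numerical underflow fallback
            else:
                resp.append([d / total for d in dens])
        # M-step: re-estimate weight, mean and variance per component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj,
                1e-6,  # floor to avoid a collapsing component
            )
    return weights, means, variances

# Made-up ages in years: a group that leaves early and a group that stays.
ages = [0.1, 0.2, 0.3, 0.2, 0.1, 8.0, 9.0, 10.0, 9.5, 8.5]
weights, means, variances = fit_gmm_1d(ages, k=2)
```

On this toy input the two fitted means settle near the averages of the two groups, which is the sense in which a mixture model reveals clusters of ages.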

We fit five Gaussian clusters of maximum age, so that each person belongs to one cluster. Each cluster is then manually analysed, and clusters found to cover similar age ranges are merged. Figure 2 shows, broadly, three categories of maximum age reached, for users joining each year between 2000 and 2013 (the cut-off gives participants enough time to fully exhibit the longevity of their association). The three broad categories identified are:

  • Early leavers: participants who become inactive within 1 year of joining the IETF.
  • Mid-level stayers: participants who remain active for between 1 and 5 years before becoming inactive.
  • Long-term stayers: participants who go on to remain active for 5 or more years.
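Read as thresholds on final age, the three categories can be written as a small helper. This is a sketch: the function name is hypothetical, and the boundary handling (exactly 1 year counts as an early leaver, exactly 5 years as a long-term stayer) is an assumption taken from the figure captions rather than from the cluster boundaries themselves.

```python
def longevity_category(final_age_years):
    """Map a participant's final age (years active) to a category.

    Boundary handling is an assumption: <= 1 year is an early leaver,
    >= 5 years is a long-term stayer, anything in between is mid-level.
    """
    if final_age_years <= 1:
        return "early leaver"
    if final_age_years < 5:
        return "mid-level stayer"
    return "long-term stayer"

# Made-up ages for illustration.
labels = [longevity_category(age) for age in (0.3, 1.0, 2.5, 5.0, 12.0)]
```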

This is an interesting observation since these categories are, broadly, consistent over an observed period of 13 years. A substantial proportion of people leave within a year or so throughout this time period, while some do go on to remain active for 5 or more years.

Participant category    Number of participants
Early leavers           17142
Mid-level stayers       5349
Long-term stayers       4833
Number of participants across categories (Table 1)
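To put the counts in Table 1 in proportion, the short calculation below converts them to percentages of the categorised participants. It uses only the figures from the table; the variable names are arbitrary.

```python
# Counts copied from Table 1.
counts = {
    "Early leavers": 17142,
    "Mid-level stayers": 5349,
    "Long-term stayers": 4833,
}
total = sum(counts.values())
# Percentage of categorised participants in each category, to one decimal.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
```

Of the 27324 categorised participants, roughly 63% are early leavers, about 20% are mid-level stayers, and about 18% are long-term stayers.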

Gaussian Mixture Models: max age acquired over the years (Figure 2)

Now that we have identified how IETF participants cluster regarding the age they go on to acquire, we explore what factors influence the length of their association.

Question: Is early age interaction activity indicative of how long a new participant goes on to remain active?

What is an interaction? A participant either responds to someone’s email on a mailing list, or has their own email responded to by another participant; together, these make up the participant’s interaction activity. We hypothesise that new joiners who engage more with the existing community early on are more likely to stay for longer. This is based on the observation that, after removing the nearly 83% of accounts that posted two or fewer emails, the remaining 17% of accounts contribute over 90% of the total volume of emails in the archives. Thus, the extent to which a new joiner interacts with the active community can influence their ability or motivation to remain engaged. Since early leavers become inactive within 1 year of joining the IETF, we analyse the interaction behaviour of participants in all three categories (above) during the first year of their participation.

To understand whether a new joiner interacts more with young participants (other new joiners) or with participants who have been in the IETF for a long time, we assign each node in their network to one of three categories, mirroring those identified in the GMM analysis:

  • Senior participants: when the age of this participant, at the time of interaction with a new joiner, is 5 or more years.
  • Mid-age participants: when the age of this participant, at the time of interaction with a new joiner, is between 1 and 5 years.
  • Young participants: when the age of this participant, at the time of interaction with a new joiner, is 1 year or less.
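Note that seniority here depends on the moment of the interaction, not on the final age a participant reaches. A minimal sketch of that classification, assuming each participant's first email date is known, is shown below; the function name and the exact boundary handling are assumptions.

```python
from datetime import datetime

def seniority_at(first_email, interaction_time):
    """Classify a network participant by their age at interaction time.

    Assumed boundaries: >= 5 years is senior, <= 1 year is young,
    anything in between is mid-age.
    """
    age_years = (interaction_time - first_email).days / 365.25
    if age_years >= 5:
        return "senior"
    if age_years > 1:
        return "mid-age"
    return "young"

# Made-up dates for illustration.
example_senior = seniority_at(datetime(2000, 1, 1), datetime(2006, 1, 1))
example_mid = seniority_at(datetime(2003, 1, 1), datetime(2005, 1, 1))
example_young = seniority_at(datetime(2005, 1, 1), datetime(2005, 6, 1))
```

The same person can therefore count as young in one interaction and senior in a later one.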

We now have three categories for new joiners, based on the number of years that they will remain in the IETF: early leavers, mid-level stayers, and long-term stayers. We also have three categories for their network: senior participants, mid-age participants, and young participants. To understand the interaction dynamics, we look at two types of interactions:

  • Outgoing interaction (new joiner responds to an email from the network: email sent)
  • Incoming interaction (network responds to an email by new joiner: email received)

Next, for each new joiner, we count how many people from each network category they have an outgoing or incoming interaction with, and how many messages those interactions involve, in the first year of their IETF lifespan. For example, a new joiner, Person A, may have outgoing interactions with 10 participants in the mid-age category, sending 15 messages in total. We do this to see whether joiners from each category show specific behaviour in how they interact with their network, for instance, whether long-term stayers show a pattern of interactions that early leavers do not. We record first-year interactions for new joiners in the years 2000-2013. We stop at 2013 so that participants have enough time to reach whatever age they would go on to reach by the time the data was collected (in 2020), and to avoid bias towards younger participants. Figures 3 and 4 plot incoming and outgoing interactions respectively, between new joiners in the three categories and their corresponding network. We make the following observations:

  • Early leavers engage significantly less with senior or mid-age participants. For instance, early leavers send fewer than 1 email, on average, as outgoing interactions to senior participants. Their incoming interactions from senior participants are much lower still.
  • In their first year, long-term stayers not only take the initiative to interact with the senior community but also receive responses from its senior participants. Long-term stayers send more than 4 emails, on average, as outgoing interactions to senior participants. Their incoming interactions from senior participants are close to 4 on average.
  • Mid-level stayers interact more with senior participants than early leavers do, sending over 2 emails, on average, as outgoing interactions to senior participants.
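The per-joiner tallies behind these observations (number of distinct people and number of messages, per direction and network category, in the first year) can be sketched as below. The record layout, in which each interaction already carries the peer's seniority category and the number of days since the joiner's first email, is hypothetical, as are all names in the example.

```python
from collections import defaultdict

def first_year_tallies(interactions, first_year_days=365):
    """Tally first-year interactions per (joiner, direction, peer category).

    `interactions` is an iterable of tuples
    (joiner, direction, peer, peer_category, days_since_first_email);
    this layout is an assumption for illustration.
    Returns {(joiner, direction, peer_category): (n_people, n_messages)}.
    """
    peers = defaultdict(set)
    message_counts = defaultdict(int)
    for joiner, direction, peer, peer_category, day in interactions:
        if day > first_year_days:
            continue  # only the joiner's first year is considered
        key = (joiner, direction, peer_category)
        peers[key].add(peer)
        message_counts[key] += 1
    return {key: (len(peers[key]), message_counts[key]) for key in peers}

# Made-up interactions for one new joiner.
interactions = [
    ("person_a", "outgoing", "p1", "mid-age", 30),
    ("person_a", "outgoing", "p1", "mid-age", 45),
    ("person_a", "outgoing", "p2", "mid-age", 200),
    ("person_a", "incoming", "p3", "senior", 60),
    ("person_a", "outgoing", "p4", "senior", 400),  # after the first year
]
tallies = first_year_tallies(interactions)
```

Averaging the message counts over all joiners in a longevity category would then give per-category figures of the kind quoted in the observations.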

Conclusion

While the longevity of a person's association with the IETF may have more than one influencing factor, it is an important observation that participants who go on to remain associated with the IETF community for a long time engage more with the senior and mid-age participants of the community in their early years. Getting a response back from the community on the mailing lists can be a motivating factor for a new participant. We also aim to explore how these interactions are reflected in the conversations, and how strongly the context of the conversations plays a role in these interactions, thereby influencing the longevity of a person's association with the IETF.

Additional figures

Incoming interactions, from IETF participants to new joiners (in their first year), by type of new joiner: early leavers (max age <= 1 year), mid-level stayers (max age 1-5 years), and long-term stayers (max age >= 5 years). IETF participants are classified as senior (age >= 5 at the time of interaction with the new joiner), mid-age (age 1-5 at the time of interaction), and young (age <= 1 at the time of interaction) (Figure 3)

Outgoing interactions, from new joiners in their first year, by the longevity of the new joiner (early leavers, max age <= 1 year; mid-level stayers, max age 1-5 years; long-term stayers, max age >= 5 years) and the seniority, at the time of the interaction, of the IETF participants they interact with (senior participants, age >= 5; mid-age participants, age 1-5; young participants, age <= 1) (Figure 4)