
Welcome to Uncovering Hidden Risks, a podcast series focused on identifying the various risks organizations face as they navigate the internal and external requirements they must comply with.
 
We’ll take you on a journey through insider risks to uncover some of the hidden security threats that Microsoft and organizations across the world are facing. We will bring to the surface some best-in-class technology and processes to help you protect your organization and employees from risks from trusted insiders. All in an open discussion with top-notch industry experts!

Learn all about Microsoft 365 Compliance solutions here. Stay up to date by following our Insider Risk blog here.

May 26, 2021

When Professor Kathleen Carley of Carnegie Mellon University agreed to talk with us about network analysis and its impact on insider risks, we scooched our chairs a little closer to our screens and leaned right in.

In this episode of Uncovering Hidden Risks, Liz Willets and Christophe Fiessinger get schooled by Professor Carley about the history of Network Analysis and how social and dynamic networks affect the way that people interact with each other, exchange information and even manage social discord.

0:00

Welcome and recap of Uncovering Hidden Risks #7: Say what you mean!

1:30

Meet our guest: Kathleen Carley, Professor at Carnegie Mellon University; Director of the Center for Computational Analysis of Social and Organizational Systems; and Director of the IDeaS Center for Informed Democracy and Social Cybersecurity

3:00

Setting the story: Understanding Network Analysis and its impact on company silos, insider threats, counterterrorism and social media

5:00

The science of social networks: how formal and informal relationships contribute to the spread of information and insider risks

7:00

The influence of dynamic networks: how locations, people and beliefs impact behavior and shape predictive analytics

13:30

Feelings vs. Facts: Using sentiment analysis to identify positive or negative sentiment in text

19:41

Calming the crowd: How social networks and secondary actors can stave off social unrest

22:00

Building a sentiment model from scratch: understanding the challenges and ethics of identifying offensive language and insider threats

26:00

Getting granular: how to differentiate between more subtle sentiments such as anger, disgust and disappointment

28:15

Staying Relevant: the challenge of building training sets and ML models that stay current with social and language trends

 

Liz Willets:

Well, hi, everyone. Uh, welcome back to our podcast series Uncovering Hidden Risks, um, our podcast where we uncover insights from the latest trends, um, in the news and in research through conversations with some of the experts in the insider risk space. Um, so, my name's Liz Willets, and I'm here with my cohost, Christophe Fiessinger, to dis- just discuss and deep dive on some interesting topics.

            Um, so, Christophe, can you believe we're already on Episode 3? (laughs)

Christophe Fiessinger:

No, and so much to talk about, and I'm just super excited about this episode today and, and our guest.

Liz Willets:

Awesome. Yeah, no. I'm super excited. Um, quickly though, let's recap last week. Um, you know, we spoke with Christian Rudnick. He's from our Data Science, um, and Research team at Microsoft and really got his perspective, uh, a little bit more on the machine learning side of things. Um, so, you know, we talked about all the various signals, languages, um, content types, whether that's image or text, that we're really using ML on to intelligently detect inappropriate communications. You know, we talked about how the keyword and lexicon approach just won't cut it, um, and, and kind of the value of machine learning there. Um, and then, ultimately, you know, just how to get a signal out of all of the noise, um, so super interesting, um, topic.

            And I think today, we're gonna kind of change gears a bit. I'm really excited to have Kathleen Carley here. Uh, she's a professor across many disciplines at Carnegie Mellon University, um, you know, with research focused on network analysis and computational social theory. Um, so, so, welcome, uh, Kathleen. Uh, we're super excited to have you here and, and would love to just hear a little bit about your background and really how you got into this space.

Professor Kathleen Carley:

So, um, hello, Liz and Christophe, and I'm, I'm really thrilled to be here and excited to talk to you. So, I'm a professor at Carnegie Mellon, and I'm also the director there of two different, uh, centers. One is Computational Analysis of Social and Organizational Systems, which is, you know, it brings computer science and social science together to look at everything from terrorism to insider threat to how to design your next organization. And then, I'm also the director of a new center that we just set up called IDeaS for Informed Democracy and Social Cybersecurity, which is all about disinformation, uh, hate speech, and extremism online.

Liz Willets:

Wow.

Professor Kathleen Carley:

Awesome.

Liz Willets:

Sounds like you're (laughs) definitely gonna run the gamut over there (laughs) at, uh, CMU. Um, that's great to hear and definitely would love, um, especially for the listeners and even for my own edification to kinda double-click on that network analysis piece, um, and l- learn a little bit more about what that is and kind of how it's developed over the past, um, couple years.

Professor Kathleen Carley:

So, network analysis is the scientific field that actually started before World War II, and it's all about connecting things. And it's the idea that when you have a set of things, the way they're connected both constrains and enables them and makes different things possible.

            The field first started as what was called social networks. This is long before social media. And, um, people were doing things like watching kindergartners play with each other, and they realized that which kids played with which, which kids bashed each other over the head with their sand shovels, was really informative in telling how the kids would actually do in the various kinds of studies they needed to do. The same kind of thing was applied to hermit crabs and to deer and other kinds of animals to identify pecking orders from those groups, and identify which animals had the best survival rate.

            Today, of course, the field's grown up a lot, and we now, uh, talk about kind of networks-plus. So, we apply network science to everything from, you know, how your company ... Where are the silos in your company? Who should be talking to whom? We also apply it to things like insider threat and look at it there to say, "Ah, well, maybe these two people should be talking, but they're not. That's a potential problem," a, and we apply it to things like counterterrorism. We apply it to social media and so on. So, people now look at really large networks and very high-dimensional networks, what are called meta-networks, such as who's talking to whom, who's talking about what, and how those ideas are connected to each other.

Liz Willets:

Awesome. Yeah, I think, I know Christophe and I, we're very interested around that space and thinking about who should be talking to one another, um, you know, as we think about communication risks in an organization, especially in the (laughs) financial services industry. You've got things, um, that, you know, you're mandated by law to, um, kind of detect for like collusion between two parties whether it's your sales and trading group who just should not be, um, communicating with one another. So, I think that certainly applies, um, to your point earlier around the insider threat space.

Professor Kathleen Carley:

Well, one of the great things in, in, uh, using social networks, especially depending what data you have access to, is you may be able to find informal linkages. So, not just who's, uh, formally connected because they're, like, in an authority relationship, like you report to your boss, but, you know, who you're friends with or who you go to lunch with, you know, all these kinds of informal relationships. And we often find that those are as or more important for affecting, you know, how information goes through a group, how information gets traded, and even for such things as promotion and your health.

Christophe Fiessinger:

And not only to, to add to, uh, what you were saying, Kathleen: the context is usually important to make an informed decision about what's going on in that network.

Professor Kathleen Carley:

And then, cer-

Christophe Fiessinger:

Isn't that how you think about it?

Professor Kathleen Carley:

Yeah, certainly. In fact, the context is very important, and it's also important to realize that one context doesn't, um, capture all of s- somebody's interactions, right? So, for example, when Twitter started, people were trying to predict elections from, uh, interactions on Twitter among people. Well, the problem was not only was not everybody on Twitter, so you didn't have a full social network, but even for people who were on Twitter, that's not the only way they communicated with each other. They might have also gone to the bars together or, or whatever.

Liz Willets:

Um, I was actually kinda reading through some of your research (laughs) as I was prepping for this interview and, um, read, um, some of your research around the difference between social network analysis and dynamic network analysis. And so, as you think about, kind of as we're talking, contexts and, you know, it's not just maybe the social connections, but it's adding in now the organization or the location or someone's beliefs. Um, I'd love if you could just kind of, you know, double-click there for us and tell us a little bit more about that.

Professor Kathleen Carley:

Yeah. So, when, um, when the field started, right, people were really dealing with fairly small groups. And so, it was not unusual to say go into a small, like, startup company, and you would have maybe 20, 25 people. Um, for each one of 'em, you would know who was friends with who and who went to 'em for advice, and that was your data set, right? It was all people, and it was all just one or two types of links. Technically, we call that one-mode data 'cause there's only one type of node, and there's two types of links. So, it's t- ... It's multiplex and one mode.

            Um, but now what's happened, as the field has grown up in some sense, uh, we're dealing with much larger data sets, and you happen to have multiple modes of data. So, you'll have things like people, organizations, locations, beliefs, resources, tasks, et cetera, and when you have all of that, you have multiple modes of data. And in fact, this is great because you need multiple modes of data to be able to do things like predictive analytics. But in addition, you have lots of different kinds of ties. So, I not only have ties between people, I have ties of people to these things like what resources they have, what knowledge they have, and so on. So, it's called bipartite data.

            But then, I also have the connections among those things themselves, like words to words, and because you have all of that high-dimensional data and you have it through time, you now have kind of a dynamic, high-dimensional network. And so, the big difference here is that you've got more data, more kinds of data, and you've got it dynamically. And we even talk about it sometimes as geospatial because sometimes, you even have locations and you have to take into account, uh, both the distance physically as well as the distance socially.
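For readers who want to make the meta-network idea concrete, here is a minimal sketch of storing several typed networks (people to people, people to topics, topic to topic) in one structure. The class and the sample data are our own illustration for this episode page, not a tool or dataset Professor Carley describes.

```python
# Illustrative sketch of a "meta-network": instead of a single
# people-to-people network, we keep several typed networks side by side.
from collections import defaultdict

class MetaNetwork:
    def __init__(self):
        # edges[(type_a, type_b)] maps a node to the set of nodes it links to
        self.edges = defaultdict(lambda: defaultdict(set))

    def add(self, kind, a, b):
        self.edges[kind][a].add(b)

    def neighbors(self, kind, a):
        return self.edges[kind][a]

mn = MetaNetwork()
mn.add(("person", "person"), "Ava", "Ben")       # one mode: who talks to whom
mn.add(("person", "topic"), "Ava", "vaccines")   # bipartite: who talks about what
mn.add(("person", "topic"), "Ben", "budget")
mn.add(("topic", "topic"), "vaccines", "health") # ties among the ideas themselves
```

Adding a timestamp to each edge would turn this into the dynamic, high-dimensional data she describes; the point here is only that nodes and links come in multiple types.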

Christophe Fiessinger:

Interesting. And Kathleen, I, I, I can't resist-

Professor Kathleen Carley:

Mm-hmm (affirmative).

Christophe Fiessinger:

I mean, I got kids and, and, uh, uh, I'm originally from Europe, and the way my k- kids interact with their family members, grandmothers in Europe, is obviously very different than how I did it when I was growing up. So, to your point on all those dimensions, you also see a difference where a person might talk one way on a channel or, uh, an app and talk another way in another app, and then layer that, you know: I would talk differently on a PC where I get a full form, I can be very verbose in my email or whatever, versus my phone, wherever I'm located. Are you seeing some of those patterns as well?

Professor Kathleen Carley:

Absolutely. Yeah, and then they're ... Yeah. And you, you've probably even seen these in your own work lives because, for example, you'll communicate one way on LinkedIn. You'll communicate a different way on Facebook, a different way on Twitter, and a different way in person. So, it also matters what media you're on, and it also matters whether or what kind of others you surround yourself with. I mean, I know people who use different variants of their names on-

Christophe Fiessinger:

Mm-hmm (affirmative).

Professor Kathleen Carley:

... different platforms to signal to themselves, "Oh, when I'm on this one, I don't talk about money," or, "When I'm on this one, I don't talk politics," you know? And so, people not only change how they talk, they change what they talk about, and they change who they talk to.

Christophe Fiessinger:

Yeah. And I think the personas as well. I've seen my younger one who plays, uh, who does a lot of gaming.

Professor Kathleen Carley:

Yep.

Christophe Fiessinger:

Typically, they have their own persona, and, and then obviously, there's a different realm then of, of, of a different network, but they even put a different hat going into that mode of, of talking in the context of a game.

Professor Kathleen Carley:

Well, and for there, it's just doing a game, right? But what we're actually seeing on social media is, you know, you do see adversarial actors-

Christophe Fiessinger:

Uh-huh (affirmative).

Professor Kathleen Carley:

... under fake personas doing things like trying to do phishing expeditions or trying, you know, trying to convince you that they're just one of the other people in the neighborhood-

Christophe Fiessinger:

Yeah.

Professor Kathleen Carley:

... and they really aren't, you know, and try, and trying to suck you into things.

Christophe Fiessinger:

Yeah.

Professor Kathleen Carley:

So, we see a lot of that as well.

Christophe Fiessinger:

Yeah.

Liz Willets:

Grooming.

Christophe Fiessinger:

I guess grooming is also not a new problem but also something that, that's present in those communities or anywhere.

Professor Kathleen Carley:

Yeah.

Liz Willets:

Definitely, and I think what we've seen especially with the pandemic is, yes, you might have these different personas, um, but now, like, your home has become your workplace. And so, how you might have typically behaved, um, you know, when you'd come home at the end of a long day versus now, you're in the context of work. Um, you know, I think we've seen a lot of organizations think about the risks that, that that could pose, um, in addition to all the other, um, you know, (laughs) stresses that people have in their day-to-day lives.

            Um, but I think it's interesting, um, to your point earlier around, you know, having all the context. Um, you know, we're seeing signals come through from Teams, email, Zoom, uh, you know, social media, et cetera, and, uh, um, also detecting for things like repeated bullying, um, behavior. And so, it's not just, uh, a way f- to your point and around using the analytics to predict something, but it's also to say, "Hey, this is a pattern, and, uh, you know, we should probably step in and do something about it."

Professor Kathleen Carley:

Yeah, absolutely. And I think people are becoming more aware of these patterns themselves because they're actually not just seeing their own communication. They're actually seeing their kids' communication or their parents' communication or whatever. And so, they're starting to realize that the people around them may be comm- communicating in ways that impact them, and so there's a variety of new technologies that people are talking about trying to develop to try to help people manage this more collectively.

Liz Willets:

Definitely. And I think, um, you know, another area that I'd love to explore with you is just around sentiment analysis. So, you know, you have all these signals, but, um, how do you know if someone's talking about something positively or negatively, um, and would kind of love to hear if you've done any research in that space?

Professor Kathleen Carley:

Oh, yes, we ... Yeah. I and my group, of course, we do a lot of work on sentiment. So, um, so, sentiment is one of those really tricky things when you're, uh, when you're not there because it depends on how many different modalities you have. Like, if you only have text, it's harder to detect than if you have text plus images, which is still harder than if you also have sound. So, the ... So, it's kind of tricky, and there's new techniques for all of those.

            But let's just think about text for the moment. The way people often try to detect sentiment, and where they started out, was just by, um, counting the number of positive versus negative words. Okay? And that's kinda okay, but it more tells you whether, overall, the message was written in an upbeat or a downbeat kind of way. That's really all it tells you, but people thought that that meant that if there was something they cared about, like, let's say I wanna know if it's about vaccines and are they happy about the vaccines or upset. Well, they would just say, "Here's a message. It has the word vaccine in it. Oh, there's more happy words than sad words, so it must be positive toward vaccines." No. Not even close.

            Because locally, it coulda been, "I'm so happy I don't have to take the vaccine." That woulda come out as overall positive, but it's really negative about the vaccine. So, then, people came up with local approaches. So, then, we work at the local level, but how do I tell the sentiment for a particular word?

            But the thing is, when I make a statement like that, that's out of context still because there could've been this whole dialogue discussion, right? And when we actually then looked at, at, at these kinds of sentences within the context of the discussion, over 50% of the time, we had to change our mind about what the sentiment really was in that particular case and what was really meant, you know?

            And then, there's the issue of sarcasm and humor, which we were terrible at detecting, right?

Liz Willets:

(laughs)

Professor Kathleen Carley:

And so, peep ... And one of the ways people start to detect that is by looking at what's written and then looking for an emoji or emoticon, and if it's at the opposite sentiment of what's written, you go, "Ah, this must be a joke." Okay?

Christophe Fiessinger:

Or just sarcastic again.

Professor Kathleen Carley:

Yeah. So, it goes cra- ... It goes on and on from there, but there's a couple of a ... There's ... That's kind of the classic line. And now, of course, we do all that with machine learning as opposed to just based on just keywords.
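The naive word-counting baseline Professor Carley describes, and the negation pitfall she calls out, fit in a few lines. This is our own toy sketch with made-up word lists, not her group's method or a real lexicon.

```python
# Naive lexicon-based sentiment: score = (# positive words) - (# negative words).
# The word sets below are illustrative assumptions, not a real sentiment lexicon.
POSITIVE = {"happy", "great", "love", "glad"}
NEGATIVE = {"sad", "angry", "hate", "terrible"}

def naive_sentiment(text: str) -> int:
    """Overall upbeat/downbeat score; says nothing about stance toward a topic."""
    words = text.lower().replace(".", " ").replace("!", " ").split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

# Her example: the message scores as upbeat overall...
msg = "I'm so happy I don't have to take the vaccine"
score = naive_sentiment(msg)
# ...yet the stance toward "vaccine" is negative: the happy word is attached
# to AVOIDING the vaccine, which pure word counts cannot see.
```

Here `score` comes out positive, which is exactly the failure mode she describes: local phrasing, and the surrounding discussion, can flip what the counts suggest.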

            But there's two other things that are in the sentiment field that people often forget about. One is, um, these subconscious, almost subliminal cues that are in messages. So, when you write things and use images, your reader will pick up on things in them, and it will cause them to respond with particular emotional reactions.

            So, for example, you've probably gotten an email or a text from someone where it was in all caps, and your, and your initial response is, "Oh, my gosh. They must be mad at me," right? Or, "What did I do wrong now?" It's like, "Oh, okay." But that's a subliminal cue, okay? It's like things like all caps, use of pronouns. There're special words that people use that will evoke emotions in others, so we look for these subliminal cues also.

            And, uh, an emergent field is looking for these in images, like the use of light versus dark images, the use of cute little kitties, right?

Christophe Fiessinger:

Yeah.

Professor Kathleen Carley:

There's a whole bunch of things that people know now make them happy. And then, so, that's another aspect of it.

            And then, the third aspect of it is that, um, sentiment is actually very tied to your social networks. Your emotional state is tied to your social networks. So, the more I can get you excited, either really happy or really angry, the more I can change your social network patterns. So, we can actually look for changes in social network patterns as a way of figuring out something about sentiment as well.

Liz Willets:

Interesting. So, are you saying essentially that through your social networks, it kind of like reinforces or, or strengthen, strengthens your connections with that group that you're identifying yourself with?

Professor Kathleen Carley:

So, I'm saying that, well, it does. It's kind of a cycle because your mind likes to, um, maintain balance, okay? It likes to be emotionally balanced. You don't ... You really don't like to be overly excited in any direction, right? Most people don't. And so, if something's making you very uncomfortable ... If, like, your connection with someone's making you, uh, very uncomfortable, you will either change your opinion to be more like theirs so you're more comfortable, or you will drop your connection with that person. So, your affect, your emotional state, modulates your social networks, and your social networks affect what information and emotions come to you and modulate what emotions you have. So, it's kind of this cycle.

Christophe Fiessinger:

Then-

Professor Kathleen Carley:

And so, we actually can watch this happening in groups where I can form them into ... I can prime groups to be ready to be emotionally triggered simply by building up social network connections among them. And then, I can emotionally trigger them, and the people in them will either get more involved in the group or they'll say, "I'm not really feeling comfortable anymore. I'm gonna leave."

Christophe Fiessinger:

Mm-hmm (affirmative). I'm sure you've got a trove of data to research with COVID or with the recent election in the U.S. that would-

Liz Willets:

(laughs)

Christophe Fiessinger:

... that would prove those theories of the relationship between your social network and h- your, your sentiment, right?

Professor Kathleen Carley:

Yes. Yeah. Yeah.

Christophe Fiessinger:

Well, actually, going back, tying this to, um, to what you were mentioning earlier, Kathleen: sometimes, we say that the conversations at the edges are, um, the highest-risk ones, the ones that are happening on the fringes, you know. And then, you add to that something you mentioned earlier, which is also looking at how you are potentially detecting, like, social unrest and things like that. And because those are at the fringes, it might start very small in a network with very few people, but it could definitely have a network effect very quickly. How do you find those needles, a theory, a pattern, an opinion that didn't exist before?

Professor Kathleen Carley:

So, the short answer is it's really hard, and we're not good at it yet. (laughing)

Christophe Fiessinger:

Okay.

Professor Kathleen Carley:

Um, but there's a couple of techniques. First off, sometimes, you find 'em by luck. You just happened on 'em. Sometimes, you find 'em just through, um, good journalistic forensics, um, but sometimes, we can aid and help that a bit by actually looking for, um, critical secondary actors.

Christophe Fiessinger:

Sure.

Professor Kathleen Carley:

And there's these kinda network metrics for finding these kinda critical secondary actors, and we look for those because those are the kind of actors that could emerge into leaders of these kinds of things. So, it's not quite anomaly detection, but it's kind of like anomaly detection for networks.

Christophe Fiessinger:

Oh. Is it kind of like that secondary actor is potentially a placebo that could flip, and you're trying to detect a change compared to that, that baseline?

Professor Kathleen Carley:

I think that's probably the wrong, the wrong model of it.

Christophe Fiessinger:

Okay.

Professor Kathleen Carley:

Like, uh, a s- secondary actor is often someone who does things like brokering relationships between two big actors, okay?

Christophe Fiessinger:

Ah, okay.

Professor Kathleen Carley:

Yeah.

Christophe Fiessinger:

So, that person would potentially be more of a ... whatever, would be a fire starter and would accelerate that.

Professor Kathleen Carley:

Yeah. Exactly.

Christophe Fiessinger:

Two people having a point of view to suddenly a wildfire is spreading out across the entire network.

Professor Kathleen Carley:

Exactly. Yeah.

Christophe Fiessinger:

Okay, I get it. Thanks.

Liz Willets:

Yeah, but back to your point around some of the challenges with, for example, detecting sarcasm, and is it an emoji? Um, would love to hear your thoughts on just some of the other challenges more generally if you're thinking about building, uh, a, a sentiment model from scratch, um, whether it's for, you know, threats or offensive language, um, or things like burnout and suicide. Um, how do you go about doing that, and how do you go about doing that in an ethical, um, manner?

Professor Kathleen Carley:

Okay. So, um, so, one of the challenges is culture and language because the way we express sentiment differs, even though there's, like, basic emotions that are, that are built in cognitively in our brain. The way we express those is socially, culturally defined.

Christophe Fiessinger:

Mm-hmm (affirmative).

Professor Kathleen Carley:

So, one of the big issues is making sure you understand the culture and the language that's associated with it. So, that's part of it.

            The second, a second, uh, critical thing is the fact that, um, when people express themselves, if you're mainly using online data, um, people can go silent, in which case, you don't have any data. Your data could just be a sample. They could choose to enact one of their personas and be lying.

Christophe Fiessinger:

Yeah.

Professor Kathleen Carley:

So, there's lots of ways in which your data-

Christophe Fiessinger:

Mm-hmm (affirmative).

Professor Kathleen Carley:

... itself could be wrong, okay? And that's another big challenge in the area. So, those, I would say, are, uh, so those are examples of some of the challenges in addition to having to have the whole discussion and having to, you know, be careful what you're looking at sentiment around and so on.

            So, from an ethical perspective, um, I would say that part of this is that when you're collecting data and trying to analyze it and create, like, a model for one of these issues, one of the biggest chall- one of the biggest issues is making sure that you haven't over-focused on a certain class of people, like only focused on young white guys or only focused on, you know, um, aging, uh, Hispanic women. You wanna make sure that you're as much as possible balanced across the different kinds of publics you want to serve. So, that's part of it ... That's one of the challenges, or one of the kind of ethical guidelines and challenges at the same time.

            Um, the other part is if you were actually going to, to intervene, then you'd need to think about intervention from a, you know, what does the community consider appropriate ethically within that community for the way you intervene? And the answer may be very different if you're talking about, you know, intervening with children versus intervening with, uh, young adults versus intervening with people with autism. So, so you need to look at it more from a community perspective. So, those are two I would raise.

Liz Willets:

That's fine. Yeah. I think, um, you know, especially at Microsoft, we are committed to having unbiased, um, training data so that we aren't, you know, discriminating against someone because they have, um, certain characteristics, um, and definitely keep that top of mind, um, as well as, you know, remediation and, and how do you go about it now that you've identified that this person is at risk for whatever, uh, reason? Now, how do you reach out to them and give them the support they might need, or how do you alert, um, you know, someone who, who might need to step in? And so, I think that's been, um, a really interesting challenge that we're digging into on our end as well.

            Um, and I think to the first piece you were talking about just more generally the challenges, um, I know you've done some research around control theory, um, and would love to get your perspective on, you know, especially, uh, in some of these more granular sentiments. Like, how do you differentiate between anger, disgust, disappointment, um, and, and really, um, kind of define exactly what you're looking for in the communications to pull that out?

Professor Kathleen Carley:

Yeah. So, um, basically, we, we start with what are thought to be the basic emotions, the ones that are built in cognitively. So, we would take those ones, and those, you can distinguish fairly reasonably on the basis of the cues I was talking about, and they're kind of big swaths of things. Of course, most of the basic emotions are ones that are kind of more on the negative side, so it's really harder on the positive side, discriminating, you know, happy from ecstatic from mildly amused. It's much harder there 'cause none of those are ba- basic; just happiness is basic, right? All the others are variations of happiness.

            So, we start with the basic emotions and try to discriminate into those categories, and to go further than that, we often find we don't need to. If we need to, um, then really, it's because the context demands that you have to pay attention to a parti- ... So, you're looking for something particular in a particular environment. And so, then, we let the context dictate what the differences are that we're interested in.

            Um, so, for example, if I, if I was doing this for Disney for, you know, people's response to a new ride, for example, that context would dictate that what I really wanna focus on is not just happiness but their satisfaction and pull that out. And so, then, I would actually develop my technology around that, around the, the different people who fell into the different categories, and I might do it first by getting survey data or something like that.

Christophe Fiessinger:

Yeah.

Professor Kathleen Carley:

But, you know, you said something that made me realize that I hadn't mentioned one of the major challenges-

Christophe Fiessinger:

That was good.

Professor Kathleen Carley:

... that, um, people often overlook because we're so in love with machine learning, right? And we so think, "Training sets," right? Well, the trouble is, in a social space, your training sets are yesterday's news.

Christophe Fiessinger:

Yeah.

Professor Kathleen Carley:

They're never up-to-date. They're always, they're always a mess, and a lot of things where you wanna use sentiment and wanna look at behavior of people, you don't have time to build a training set. So, this is an area where we really need new technologies like match functions and things like that, or where you can just get the bare minimum training set and then do some kind of leapfrogging on it.

Christophe Fiessinger:

Yeah. I think it-

Liz Willets:

Yeah.

Christophe Fiessinger:

I think this is, to, to that point. I can relate to that. I think the ... And also, what you were s- saying early on, the key part where you look at demographics or what is that target audience with that pattern you're trying to detect: even for that sp- specific demographic, say you did a good job on day zero. We know language is this constantly evolving function, and, to your point, you know, it was yesterday's data set. Just because you put blood and sweat into a white paper to detect blah for those demographics.

Professor Kathleen Carley:

Yep.

Christophe Fiessinger:

That was great at that point in time, but I'm sure it already changed rapidly because of, of today's availability of social networks and things like that, you know. When I was visiting Europe, my, my nephew and niece speak English from what they've seen on YouTube and Netflix.

Professor Kathleen Carley:

Yeah.

Christophe Fiessinger:

So, it almost feels like language is moving even faster with that, uh, availability of, of all those tools worldwide, which I'm sure is making researchers' jobs even harder, to stay up-to-date.

Professor Kathleen Carley:

Absolutely. (laughs) Yeah. The level of new jargon and new phrases out there, it's crazy.

Liz Willets:

(laughs)

Christophe Fiessinger:

Yeah.

Liz Willets:

And that's not just in English, too, you know?

Professor Kathleen Carley:

That's right. That's right.

Liz Willets:

We were talking last week with Christian around languages (laughs) and, you know, how many languages there are in the world and how you have to kind of build your models to be trained to kind of reason over, uh, you know, double-byte characters and, um, you know-

Professor Kathleen Carley:

Yep.

Liz Willets:

... Japanese and, and Chinese characters. And so, it just (laughs) it's never ending.

Christophe Fiessinger:

Yeah.

Professor Kathleen Carley:

And sometimes, the fact that you have the multiple character sets and multiple languages can be diagnostic, right? So, for, like, when we look at, um, response to say natural disasters in various areas, typically, people, when they communicate online, will communicate in one language with others in the same language. And there'll be a few people who will communicate in multiple languages, but they'll have different groups like, "Here's my English group. Here's my Spanish group." Okay?

            But during a disaster, you'll see, actually see more messages come out where you've got mixed part English, part Spanish in the same message-

Christophe Fiessinger:

Mm-hmm (affirmative).

Professor Kathleen Carley:

... and, and so it can be diagnostic of, "Oh, this is a bilingual community," for example.

Liz Willets:

Interesting.

Christophe Fiessinger:

Interesting.

Liz Willets:

Well, great. I know, um, Kathleen, I have certainly learned a lot and wanna thank you again for, for joining us today. Um, Christophe, I thought that was a great conversation.

Christophe Fiessinger:

Yeah. I, after that, I wish I was a student, and I could join, uh, CMU and be one of your students and write a PhD. It sounds like an infinite number of fascinating topics and, and research topics, so it sounds-

Professor Kathleen Carley:

Well-

Christophe Fiessinger:

... very fascinating.