Wednesday, September 04, 2024

$1.8 million NIH grant to FAU engineering fuels quest to decode human evolution



Cutting-edge tools will help to unravel genetic mechanisms behind disease resistance and defense



Florida Atlantic University

Image: Michael DeGiorgio, Ph.D., PI, associate chair and associate professor, FAU Department of Electrical Engineering and Computer Science, and Department of Biomedical Engineering.

Credit: Florida Atlantic University





Natural selection is an important evolutionary force that enables humans to adapt to new environments and fight disease-causing pathogens. However, the unique footprints of natural selection in our genome can be buried beneath those left by other evolutionary forces. Thus, by leveraging information about multiple evolutionary forces, researchers can identify signatures of natural selection in the human genome, and ultimately determine its role in human adaptation and disease.

Low-cost DNA sequencing has provided researchers with an abundance of genomic data, enabling them to search for evidence of natural selection in different species. However, various nonadaptive factors can sometimes obscure these signals, making it essential to develop sophisticated statistical methods that can account for multiple factors influencing genetic variation.

Michael DeGiorgio, Ph.D., in the College of Engineering and Computer Science at Florida Atlantic University, has received a five-year, $1,874,360 grant from the National Institute of General Medical Sciences (NIGMS) of the United States National Institutes of Health (NIH) to further his research on designing and applying statistical methods to identify regions of the genome affected by natural selection. The project, titled “Identifying Complex Modes of Adaptation from Population-genomic Data,” is supported by an NIH NIGMS Maximizing Investigators’ Research Award for Established Investigators.

This research aims to develop powerful tools for identifying diverse modes of adaptation from genetic data and to better understand the evolutionary mechanisms underlying traits like disease resistance and pathogen defense.

“To truly grasp how human genetic variation has evolved and is distributed, it’s essential to study the evolutionary mechanisms at play,” said Stella Batalama, Ph.D., dean, FAU College of Engineering and Computer Science. “The advent of advanced high-throughput sequencing technologies, along with significant boosts in computational capabilities, has equipped geneticists with powerful new tools. This important grant from the National Institutes of Health will enable our outstanding research team led by professor DeGiorgio to delve deeper into understanding the evolutionary forces that contribute to the diversity observed across human populations.”

DeGiorgio and his research team work on detecting natural selection, which affects the frequency of traits within populations and leaves subtle genetic signals in the DNA sequences of individuals within these populations. Over the past four years, his team has made significant advances in this field, developing some of the first and most powerful model-based methods for unearthing genomic signals of a diverse array of adaptive events through analysis of DNA within and across species. These methods draw on a broad array of statistical and engineering techniques, integrating the strengths of probabilistic, machine learning, and signal processing frameworks.

“Our methods have led to several novel insights,” said DeGiorgio, associate chair and associate professor, FAU Department of Electrical Engineering and Computer Science, and Department of Biomedical Engineering. “For example, we found evidence of convergent positive selection in Europeans and East Asians that may explain differences in insulin response between these populations. We also discovered positive selection in olfactory genes affecting scent and behavior of rats in New York City for navigating harsh and noisy urban environments, and identified balancing selection in venom genes that may play a role in predator-prey interactions in rattlesnakes.”

Recent advancements in AI, especially deep learning, have greatly improved outcome prediction using complex data like genetic information. These algorithms learn from training data and apply this knowledge to new, unseen data. Their strength lies in handling complex features and adapting to various data types. However, they often face challenges when the new data differs from the training data, a problem known as “domain shift.”

“To enhance prediction accuracy, it's crucial to adapt to changing data conditions and refine feature selection and modeling,” said DeGiorgio.

In the coming five years, DeGiorgio plans to advance this research by developing improved statistical, machine learning, and signal processing approaches. These methods will aim to detect complex patterns of adaptation by considering how various evolutionary forces simultaneously shape genetic diversity. Specifically, researchers will focus on creating novel frameworks to identify positive and balancing selection while accounting for genomic, temporal and spatial factors.

DeGiorgio and his research team will work on methods to detect regions with complex patterns of selection from ancient genetic variation, use signal processing techniques to analyze genomic data from images for machine learning models, and develop innovative procedures to address uncertainties in genetic and demographic parameters when training these models.

“With these advanced techniques, researchers can now study adaptation in a wider variety of organisms, from well-researched models to those less frequently examined,” said Javad Hashemi, Ph.D., inaugural chair and professor, FAU Department of Biomedical Engineering, and associate dean for research and professor in the College of Engineering and Computer Science. “This broader focus will not only increase inclusivity in this research but also deepen the understanding of how different species adapt to their environments. By applying these novel methods to diverse organisms – such as primates, rodents, snakes, insects and plants – our researchers will tackle significant evolutionary questions and uncover new insights across a range of biological contexts.”

- FAU -

About FAU’s College of Engineering and Computer Science:

The FAU College of Engineering and Computer Science is internationally recognized for cutting-edge research and education in the areas of computer science and artificial intelligence (AI), computer engineering, electrical engineering, biomedical engineering, civil, environmental and geomatics engineering, mechanical engineering, and ocean engineering. Research conducted by the faculty and their teams exposes students to technology innovations that push the current state of the art of the disciplines. The College's research efforts are supported by the National Science Foundation (NSF), the National Institutes of Health (NIH), the Department of Defense (DOD), the Department of Transportation (DOT), the Department of Education (DOEd), the State of Florida, and industry. The FAU College of Engineering and Computer Science offers degrees with a modern twist that bear specializations in areas of national priority such as AI, cybersecurity, internet-of-things, transportation and supply chain management, and data science. New degree programs include Master of Science in AI (first in Florida), Master of Science and Bachelor in Data Science and Analytics, and the new Professional Master of Science and Ph.D. in computer science for working professionals. For more information about the College, please visit eng.fau.edu.

 

About Florida Atlantic University:
Florida Atlantic University, established in 1961, officially opened its doors in 1964 as the fifth public university in Florida. Today, the University serves more than 30,000 undergraduate and graduate students across six campuses located along the southeast Florida coast. In recent years, the University has doubled its research expenditures and outpaced its peers in student achievement rates. Through the coexistence of access and excellence, FAU embodies an innovative model where traditional achievement gaps vanish. FAU is designated a Hispanic-serving institution, ranked as a top public university by U.S. News & World Report and a High Research Activity institution by the Carnegie Foundation for the Advancement of Teaching. For more information, visit www.fau.edu.

 

How zebrafish map their environment



Spatial orientation mechanisms surprisingly similar to our own


Max-Planck-Gesellschaft

Image: A tracking microscope follows the zebrafish during their natural behaviour.

Credit: Jean-Claude Winkler/MPI for Biological Cybernetics




Researchers are turning to zebrafish to unlock the secrets of place cells, which play a crucial role in forming mental maps of space, social networks, and abstract relationships. Until now, place cells have only been found in mammals and birds, leaving the question of how other species internally represent the external world largely unanswered. A team of researchers at the Max Planck Institute for Biological Cybernetics has now found the first compelling evidence for place cells in the brain of the tiny larval zebrafish.

When we explore an unfamiliar city, we use various cues – landmarks, a sense of how far we have walked in one direction, perhaps a river we cannot cross – to create an internal map of our environment. Deep in the brain, in a structure called the hippocampus, a set of place cells plays a key role in building our internal maps of the external world. These place cells fire when we are at specific locations in space and can self-organize into an array of different mental maps.

That much is known for mammals, including humans, and even for birds. However, the existence of place cells in other species is controversial. A group of researchers at the Max Planck Institute for Biological Cybernetics in Tübingen (Germany), led by Jennifer Li and Drew Robson, has now found the first conclusive evidence for place cells in zebrafish.

Recording the entire brain during natural behaviour

The researchers recorded the brain activity of young zebrafish as they explored their environment. These fish are completely transparent when they are only a few days old, making it possible to look into their tiny brains, which contain only 100,000 cells. One can even make individual active neurons light up using fluorescent calcium indicators, since all neuronal activity is associated with fluctuations in calcium ion concentrations. An earlier key invention of Li and Robson was essential for observing brain activity during navigation: tracking microscopes that move with the freely swimming fish.

Using this experimental design, the team analysed what spatial information is encoded in each neuron in the fish's brain. They identified a population of about 1000 place cells in each fish, most of which only fire when the animal is in a specific location, while a few respond to more than one area. “Collectively, the place cell population encodes spatial information,” explains Jennifer Li. “From the firing patterns of the place cells, we were able to decode the location of each fish over time – with an error of just a few millimetres.”
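As a rough illustration of how a location can be read out from place cell activity, here is a minimal sketch of a population-vector decoder. It is not the method used in the study: the Gaussian place fields, the grid of field centres, and the function names are all assumptions made for illustration.

```python
import math

def firing_rates(position, centres, width=1.0):
    """Idealised Gaussian place fields: each cell fires most near its own centre."""
    x, y = position
    return [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * width ** 2))
            for cx, cy in centres]

def decode_position(rates, centres):
    """Population-vector estimate: rate-weighted average of the field centres."""
    total = sum(rates)
    if total == 0:
        raise ValueError("no active place cells")
    x = sum(r * cx for r, (cx, _) in zip(rates, centres)) / total
    y = sum(r * cy for r, (_, cy) in zip(rates, centres)) / total
    return x, y

# Hypothetical 5 x 5 grid of place field centres (arbitrary length units).
centres = [(float(i), float(j)) for i in range(5) for j in range(5)]
true_position = (2.0, 2.0)
estimate = decode_position(firing_rates(true_position, centres), centres)
error = math.dist(estimate, true_position)
```

In this toy setting the decoding error depends on how densely the place fields tile the arena; near the edges the weighted average is pulled slightly toward the centre because no fields lie outside the grid.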

Strikingly, most of the place cells were located in the telencephalon, an area of the zebrafish’s forebrain, whose precise function has been a source of debate for several decades. “The high concentration of place cells in the telencephalon potentially confirms the longstanding conjecture that this brain region is a functional analogue of the mammalian hippocampus, in miniature,” comments Drew Robson.

A flexible mechanism that integrates different inputs

However, Li and Robson needed additional evidence to conclude that the cells they had identified were indeed an analogue to mammalian place cells. The first feature to be tested was whether place cells use self-motion or external cues. In terms of human experience, a cue such as "I’ve been walking straight ahead at a brisk pace for about a minute" relies on self-motion, whereas "I can see the Eiffel Tower" is an external cue. In a series of experiments, the researchers manipulated both sources of information – taking the fish out of their environment and placing them back, removing landmarks, or rotating the behavioural chamber. They found that the fish integrate both external and self-motion cues to create their internal maps – just like we do.

Not only do the fish appear to refine their spatial representation map as they become more familiar with a new environment, but they can also adapt to change: they use the same neuronal circuits to remember a second environment. When returned to their initial surroundings, they do not have to map them from scratch, but can partially recover the representation map they created previously. Thus, the place cell population exhibits a flexible memory system, a further hallmark of mammalian place cells.

An emerging model organism for a complex neuronal network

The authors of the study plan to use zebrafish as a new model organism to unravel the mysteries of place cells. In addition to their role in creating mental maps of space, these cells are also crucial for forming maps of social networks and abstract relationships, as well as for memory and planning. While mammalian place cells have been intensively investigated since their Nobel Prize-winning discovery more than 50 years ago, scientists still do not fully understand the neural networks that generate place cells or how they support such a wide range of mental functions.

The primary challenge has been the sheer complexity and size of mammalian place cell networks, which make it extremely difficult to study the entire network simultaneously. In contrast, the larval zebrafish brain is one of the smallest biological systems capable of generating place cells. Robson concludes: “Using this new minimal model, future studies can potentially trace all of the inputs to each place cell and create detailed models for how place cells acquire all their unique properties.”

Image: A behavioral chamber under the tracking microscope.

Credit: Jean-Claude Winkler/MPI for Biological Cybernetics

 

New study reveals relationship between HIV risk factors for LGBTQ+ youth



A new study has uncovered empirical evidence that shows the importance of taking a holistic approach to addressing HIV risk factors



University of Connecticut




A new study has uncovered empirical evidence that shows what researchers have long suspected about HIV risk – that having multiple risk factors is much worse than having only one.

Pablo Kokay Valente, assistant professor of allied health sciences in the College of Agriculture, Health and Natural Resources (CAHNR) led this study in collaboration with Ryan Watson, associate professor, and Lisa Eaton, professor, both in the Department of Human Development and Family Sciences. The study was recently published in the American Journal of Public Health.

Factors like poverty, depression, anxiety, substance use, alcohol use, other mental health diagnoses, and sexual victimization make people more likely to engage in risky behaviors like having sex without a condom, having multiple sexual partners, and not using PrEP (Pre-Exposure Prophylaxis).

For a long time, researchers looked at each of these factors independently. But recently there has been a push to consider their interactions.

Most papers examining these factors have found a linear relationship. In other words, having two factors at play, such as depression and alcohol use, is about twice as bad as having one, such as depression alone.

This new paper demonstrates an exponential relationship instead. This is something that scientists have theorized for years without much empirical evidence to support the hypothesis until now.
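To make the distinction concrete, the following sketch contrasts an additive (linear) model with a multiplicative (exponential) one. The per-factor increment and the per-factor ratio are made-up numbers chosen for illustration, not estimates from the study, and the function names are hypothetical.

```python
def linear_excess(n_factors, per_factor=1.0):
    """Additive model: every factor adds the same fixed amount of excess risk."""
    return per_factor * n_factors

def exponential_excess(n_factors, ratio=2.0):
    """Multiplicative model: every factor multiplies total risk by `ratio`.

    Returned as excess over having no factors (ratio ** 0 == 1).
    """
    return ratio ** n_factors - 1.0

# Excess risk for one, two and three co-occurring factors under each model.
table = [(n, linear_excess(n), exponential_excess(n)) for n in range(1, 4)]
for n, lin, exp in table:
    print(f"{n} factor(s): linear={lin:.1f}  exponential={exp:.1f}")
```

Under the additive model, each extra factor raises the excess by the same amount. Under the multiplicative model, the step from two factors to three is larger than the step from one to two, which is the pattern described as synergistic.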

“Most studies haven’t been able to demonstrate this kind of synergistic relationship between the syndemic factors,” Valente says. “And that’s what we did. We’ve shown that having two factors is much worse than having one and having three factors is much, much worse than having two factors.”

The researchers used data from a survey of LGBTQ+ youth. LGBTQ+ people have historically been, and continue to be, at greater risk of contracting HIV.

“A major strength of this study was the use of our national sample of LGBTQ+ youth, many of whom reported intersections of multiple marginalized social positions,” Watson says. “Data collected with so many young LGBTQ+ youth give us a unique view into the complexities and nuances of the lived experiences of today’s LGBTQ+ teens.”

One unexpected finding is that the more factors an individual had, the more likely they were to be exposed to PrEP – a medication that can prevent the contraction of HIV even if you come into contact with the virus – and get information about the drug and its benefits.

“People who are exposed to these factors in combination, they have much more risk and they are probably more connected and more aware of what’s out there in terms of PrEP,” Valente says.

This understanding changes the way researchers think about interventions for people at risk of contracting HIV.

“Better understanding syndemic conditions in one of the most vulnerable youth populations — sexual and gender diverse adolescents — has been much needed, and this study contributes to the growing body of research by using a large national sample of LGBTQ+ youth,” Watson says.

If the linear model were accurate, it would not matter which HIV risk factors an intervention addressed, since each would, in theory, have the same impact. But an exponential relationship demonstrates the need for interventions that tackle multiple risk factors at once to provide a substantial benefit.

“If they are linear, the implication is that whatever you address, there is some benefit to that,” Valente says. “Showing that it’s synergistic, it calls for interventions that address more than one of these things. Addressing two of these factors would have more of an impact than addressing things individually.”

For example, therapy-based interventions that address stigma surrounding HIV may also reduce substance use among participants, since that is a common coping strategy for stigmatization, and improve their overall mental health.

“They’re all deeply related,” Valente says. “So, I think dismantling several of them at a time will be very important.”

Valente is now applying the methods that uncovered the exponential relationship among LGBTQ+ people to a dataset from hospitals that provide care for people living with HIV.

 

 

Rein tension may affect horse behavior




University of Helsinki

Image: Rein tension can be measured with a sensor attached between the bit and the rein.

Credit: Nina Mäki-Kihniä




In a pilot study carried out at the University of Helsinki, high rein tension was found to be associated with trotters opening their mouths, which indicates pain or discomfort in the mouth.

Rein tension denotes the force employed by the rider or driver through the reins. It can be measured with a sensor attached between the bit and the rein.

“Our group has previously investigated mouth injuries in trotters, and we found that moderate and severe injuries were associated with certain bit types. It is possible that drivers ended up using those bit types due to problems experienced with lighter rein cues. This is why we wanted to explore rein tension,” says researcher and veterinarian Kati Tuomola from the Faculty of Veterinary Medicine, University of Helsinki.

Eight horses and their drivers were recruited for the study. While the drivers drove their horses on a racetrack at walk and trot, the researchers measured rein tension and recorded video of the horses’ behaviour from a moving vehicle. Each horse was equipped with a regular single-jointed bit. Subsequently, one researcher coded the horses’ behaviour from the videos in accordance with a predetermined catalogue of behaviours. The coder was unaware of any findings associated with rein tension and mouth injuries among these horses.

None of the horses had mouth injuries before the driving. After the drive, three horses had moderate bruises in their mouths.  Their median rein tension was numerically higher (approximately 3.5 kg per rein) than that in horses without injuries (approximately 2 kg per rein), and they showed longer periods of rushed walk.  

The median rein tension for a single rein varied between 0.5 kg and 3.7 kg, with the highest tension varying between 11 kg and 24 kg. According to the researchers, these rein tensions can be considered rather high, as prior studies have shown that horses avoid tension exceeding 0.6–1 kg. To investigate behavioural differences during different rein tensions, five 30-second periods were visually selected from the rein tension graphs of all horses, representing samples of tension ranging from low to high. During low rein tension, the horses mainly walked and mostly kept their mouths closed. During periods of higher rein tension, the horses mainly trotted either slowly or quickly, keeping their mouths open for longer periods of time.

“Horse trainers should monitor the horse’s mouth behaviour, arousal state and ability to walk calmly, and adjust the training accordingly. The horse keeping its mouth widely or repeatedly open may indicate evasive behaviour, meaning discomfort or pain in the mouth. In addition, rushed walking may indicate high arousal, which in turn can increase the risk of mouth injuries,” says Tuomola, the article’s lead author.

Mathematicians model a puzzling breakdown in cooperative behaviour




University of British Columbia

Image: Mutualism two-layer lattice. A model developed by evolutionary mathematicians in Canada and Europe shows that as cooperation becomes easier, it can unexpectedly break down. The researchers at the University of British Columbia and Hungarian Research Network used computational spatial models to arrange individuals from the two species on separate lattices facing one another.

Credit: Christoph Hauert and György Szabó



Darwin was puzzled by cooperation in nature—it ran directly against natural selection and the notion of survival of the fittest. But over the past decades, evolutionary mathematicians have used game theory to better understand why mutual cooperation persists when evolution should favour self-serving cheaters. 
 
At a basic level, cooperation flourishes when the costs of cooperation are low or the benefits large. When cooperation becomes too costly, it disappears, at least in the realm of pure mathematics. Symbiotic relationships between species, like those between pollinators and plants, are more complex but follow similar patterns.

But new modelling published today in PNAS Nexus adds a wrinkle to that theory, indicating that cooperative behaviour between species may break down in situations where, theoretically at least, it should flourish. 

“As we began to improve the conditions for cooperation in our model, the frequency of mutually beneficial behaviour in both species increases, as expected,” says Dr. Christoph Hauert, a mathematician at the University of British Columbia who studies evolutionary dynamics. 

“But as the frequency of cooperation in our simulation gets higher—closer to 50 per cent—suddenly there's a split. More cooperators pool in one species and fewer in the other—and this asymmetry continues to get stronger as the conditions for cooperation get more benign.”

While this ‘symmetry breaking of cooperation’ between two populations has been modelled by mathematicians before, this is the first model that enables individuals in each group to interact and join forces in a more natural way. 

Dr. Hauert and colleague Dr. György Szabó from the Hungarian Research Network used computational spatial models to arrange individuals from the two species on separate lattices facing one another. This enables cooperators to form clusters and reduce their exposure to (and exploitation by) cheaters by more frequently interacting with other cooperators.

“Because we chose symmetric interactions, the level of cooperation is the same in both populations,” says Dr. Hauert. “Clusters can still form and protect cooperators but now they need to be synchronized across lattices because that’s where the interactions occur.”
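The two-lattice arrangement can be sketched in a few lines. What follows is only a toy version under stated assumptions, not the published model: interactions are taken to be a donation game against the facing site and its four neighbours on the other lattice, and the update rule (copy a same-lattice neighbour that earned a strictly higher payoff) is a simplification chosen for brevity. The benefit, cost, lattice size and all function names are hypothetical.

```python
import random

def neighbourhood(i, j, size):
    """Site (i, j) plus its four nearest neighbours, with periodic boundaries."""
    return [(i, j), ((i + 1) % size, j), ((i - 1) % size, j),
            (i, (j + 1) % size), (i, (j - 1) % size)]

def payoff(own, other, i, j, benefit, cost):
    """Donation-game payoff of site (i, j) against the facing neighbourhood.

    Strategies are 1 (cooperate) or 0 (defect). A cooperator pays `cost` per
    interaction; each cooperating partner on the other lattice gives `benefit`.
    """
    partners = neighbourhood(i, j, len(own))
    received = benefit * sum(other[x][y] for x, y in partners)
    paid = cost * own[i][j] * len(partners)
    return received - paid

def imitation_step(own, other, benefit, cost, rng):
    """A random site copies a random neighbour in its OWN lattice if that
    neighbour earned a strictly higher payoff (assumed update rule)."""
    size = len(own)
    i, j = rng.randrange(size), rng.randrange(size)
    x, y = rng.choice(neighbourhood(i, j, size)[1:])
    if payoff(own, other, x, y, benefit, cost) > payoff(own, other, i, j, benefit, cost):
        own[i][j] = own[x][y]

rng = random.Random(0)
size = 10
species_a = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
species_b = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
for _ in range(10_000):
    imitation_step(species_a, species_b, benefit=3.0, cost=1.0, rng=rng)
    imitation_step(species_b, species_a, benefit=3.0, cost=1.0, rng=rng)
fraction_a = sum(map(sum, species_a)) / size ** 2
fraction_b = sum(map(sum, species_b)) / size ** 2
```

The point of the sketch is the structure: payoffs come only from the other lattice, while strategies spread only within a lattice, so clusters of cooperators are useful only if they line up across the two layers. Comparing `fraction_a` and `fraction_b` across runs and parameter values is one simple way to look for asymmetry between the species.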

"The odd symmetry breaking in cooperation shows parallels to phase transitions in magnetic materials and highlights the success of approaches developed in statistical and solid state physics,” says Dr. Szabó. 

"At the same time the model sheds light on spikes in dramatic changes in behaviour that can significantly affect the interactions in complex living systems." 

The research was supported by the Natural Sciences and Engineering Research Council of Canada.

Mutualisms: cooperation between species
A model developed by evolutionary mathematicians in Canada and Europe shows that as cooperation becomes easier, it can unexpectedly break down. Watch a simulation of spatial interactions of cooperators and defectors for each species under different scenarios. 

Machine learning technique predicts likely accounting fraud across supply chains



Tsinghua University Press

Multi-relational graph representation learning for financial statement fraud detection 

Overview of the FraudGCN approach. The researchers constructed three types of ‘sub-graphs’ depending on the type of relationships between companies: with accounting firms; along supply chains; and throughout an industry. The training direction of the machine learning model is depicted by red arrows. Grey circles (‘nodes’) represent fraudulent firms and white circles represent normal firms.

Credit: Big Data Mining and Analytics, Tsinghua University Press




As the perpetrators of accounting fraud become ever more sophisticated in their techniques, fraud detection needs to step up its game. Thankfully, a group of researchers have devised a new machine learning ‘detective’ that is able to analyze not just fraud at a single firm, but predict likely fraud across whole supply chains and industries.

 

A paper describing the team’s approach was published in the journal Big Data Mining and Analytics on August 28.

 

Financial statement fraud, more commonly known as accounting fraud, may be a less frequent form of corporate fraud, but it is by far the costliest crime in the world. Perhaps the most famous cases of white-collar crime are, at base, accounting fraud: an enterprise manipulates the figures on its financial statements or other valuation data to make itself appear more profitable than it is.

The collapse of US energy firm Enron, at the time the largest bankruptcy in US history, came from its cooking of the books in collusion with its accounting firm. In 2008, Lehman Brothers declared bankruptcy due to insolvency, having concealed approximately $50 billion in debt through balance sheet fraud. In the late 2000s, American investment advisor Bernie Madoff was found to have cheated clients out of a whopping $65 billion.

It is not only investors who are hurt by financial statement fraud. Hundreds of thousands of jobs can be lost and communities devastated, and in the most extreme cases the knock-on effects can threaten the stability of national economies.

Despite the threat that such fraud poses, it remains very hard for authorities to catch. Red flags such as a sudden surge in a company’s performance just before the end of a reporting period, or soaring sales growth while competing firms’ sales remain sluggish, could turn out to be just the result of good luck or a superior product. And so for decades forensic auditors have used statistical analysis to spot manipulation.

But such efforts are enormously labor intensive and require trawling through huge volumes of data. As a result, authorities tend to depend upon random audits, but this means that most firms most of the time go unchecked.

 

“Making matters even worse, in recent years, fraudsters have become increasingly sophisticated in the techniques they deploy,” said Chenxu Wang, lead author of the paper and an associate professor with the School of Software Engineering and the Key Lab of Intelligent Networks and Network Security at Xi’an Jiaotong University. “It’s an unending, mathematical arms race between the authorities and the fraudsters.”

 

“What is needed is an effective and accurate algorithm to automatically identify accounting fraud, and leave the days of random auditing behind,” said Mengqin Wang, also of Xi’an Jiaotong University.

 

A number of mathematicians and computer scientists specializing in the topic have achieved some success in this regard by the use of machine learning. But up to now, this approach has only been applied to individual firms.

 

“This overlooks the often-intricate relationships between different firms that may also offer up indicators of fraud,” said Yi Long, another team member, of the Shenzhen Finance Institute at the Chinese University of Hong Kong, Shenzhen. “An accounting firm that colludes in financial statement fraud with one company has an increased likelihood of engaging in fraudulent activities with other companies.”

And it is not only between accounting firms and their clients that fraudulent relationships propagate. Accounting fraud practices can spread up and down supply chains, or be perpetuated horizontally across industries.

But incorporating data beyond a single firm means a commensurate increase in computational expense. Moreover, existing machine-learning approaches suffer from a severe imbalance in the samples used to train the model to classify firms as fraudulent, because normal, non-fraudulent samples significantly outnumber actual fraud cases. This imbalance can lead to biased models that prioritize the majority “class,” the non-fraudulent cases, making it difficult to accurately detect fraudulent activities.
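
One common, generic remedy for this kind of imbalance is to weight the rare class more heavily in the training loss. The sketch below shows a class-weighted binary cross-entropy in plain Python. The article does not say how FraudGCN itself handles imbalance, so the weighting scheme, the function name and the sample values are assumptions for illustration only.

```python
import math

def weighted_bce(labels, probs, pos_weight):
    """Mean binary cross-entropy with fraud (label 1) errors scaled by pos_weight."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, 1e-12), 1 - 1e-12)  # clamp to avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(labels)

# Hypothetical batch: 1 fraudulent firm among 10, scored by a model that
# gives every firm a 5% fraud probability (it effectively always says "normal").
labels = [1] + [0] * 9
probs = [0.05] * 10

# Simple heuristic weight (an assumption): ratio of normal to fraudulent samples.
pos_weight = labels.count(0) / labels.count(1)

plain = weighted_bce(labels, probs, pos_weight=1.0)
weighted = weighted_bce(labels, probs, pos_weight=pos_weight)
```

Under the weighted loss, missing the single fraudulent firm costs far more than under the plain loss, so a model can no longer score well simply by labelling everything as normal.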

 

To overcome all of these challenges, the research team developed a machine-learning technique combined with mathematical methods taken from the realm of graph theory.

 

The cutting-edge artificial intelligence financial-fraud ‘detective’ they devised involves a “graph,” a structure that mathematically represents the connections or relations (described as “edges”) between different companies, individuals and products (described as “nodes”). Multi-relational graphs allow for multiple types of edges, so they can represent diverse relationships between nodes and capture more of the complexity of the connections among them.
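
As a rough illustration of the data structure described above, a multi-relational graph can be stored as one adjacency map per relation type. The class, the relation names and the firm names below are hypothetical; they mirror the three relationship types mentioned in the figure caption but are not taken from the paper's code.

```python
from collections import defaultdict

class MultiRelationalGraph:
    """Nodes connected by edges that each carry a relation type."""

    def __init__(self):
        # relation name -> node -> set of neighboring nodes
        self.adj = defaultdict(lambda: defaultdict(set))

    def add_edge(self, relation, u, v):
        """Add an undirected edge of the given relation type."""
        self.adj[relation][u].add(v)
        self.adj[relation][v].add(u)

    def neighbors(self, node, relation):
        """Neighbors of `node` under one relation type only."""
        return sorted(self.adj.get(relation, {}).get(node, set()))

    def relations_between(self, u, v):
        """All relation types that link u and v."""
        return sorted(r for r, nodes in self.adj.items() if v in nodes.get(u, set()))

# Hypothetical firms and relations, for illustration only.
g = MultiRelationalGraph()
g.add_edge("same_auditor", "FirmA", "FirmB")
g.add_edge("supply_chain", "FirmA", "FirmC")
g.add_edge("same_industry", "FirmA", "FirmC")
g.add_edge("same_industry", "FirmB", "FirmC")

industry_peers = g.neighbors("FirmA", "same_industry")
links = g.relations_between("FirmA", "FirmC")
```

Keeping each relation in its own adjacency map lets the same pair of firms be linked in more than one way, here both as supply-chain partners and as industry peers.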

 

The detective itself, called “FraudGCN,” is a graph convolutional network, or GCN, a type of neural network designed to operate on graph-structured data. Unlike traditional neural networks, which operate on grid-like data such as images, GCNs work directly on nodes and the edges that connect them.
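
To make the idea of a graph convolution concrete, here is a minimal sketch of one simplified layer: each node averages its own features with those of its neighbors, then applies a linear map and a ReLU. This is a textbook-style simplification, not FraudGCN's actual architecture, and the features, weights and node names are made up.

```python
def gcn_layer(features, neighbors, weight):
    """One simplified graph-convolution step.

    features:  node -> feature vector (list of floats)
    neighbors: node -> list of adjacent nodes
    weight:    matrix as a list of rows, shape in_dim x out_dim
    """
    out = {}
    for node, own in features.items():
        group = [own] + [features[n] for n in neighbors.get(node, [])]
        # Mean of the node's own features and its neighbors' features.
        mean = [sum(col) / len(group) for col in zip(*group)]
        # Linear transform (one output per weight column) followed by ReLU.
        out[node] = [
            max(0.0, sum(m * w for m, w in zip(mean, column)))
            for column in zip(*weight)
        ]
    return out

# Hypothetical three-node path graph A - B - C with 2-dimensional features.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
weight = [[1.0, -1.0], [1.0, 1.0]]

hidden = gcn_layer(features, neighbors, weight)
```

Stacking several such layers lets information travel further across the graph, since each layer mixes in features from one more step of neighbors.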

 

FraudGCN constructs a multi-relational graph representing industry connections, supply chain links, and shared accounting-firm auditing practices, and in doing so captures the rich information arising from these relationships, in particular details found in specific ‘neighborhoods’ of nodes in the graph. By aggregating such information, FraudGCN not only improves the identification of patterns that indicate likely existing fraud, but also predicts where fraud is likely to arise.

Finally, unlike previous efforts at machine-learning-assisted fraud detection, FraudGCN can handle the addition of new nodes without the model needing to be retrained, enhancing its adaptability and scalability.
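
One way a model can accommodate new nodes without retraining is to compute each node's representation from its neighborhood using parameters that are already fixed. The sketch below embeds a newly added firm from per-relation neighbor averages and frozen weights. The scalar per-relation weights, the function names and all numbers are simplifying assumptions, not a description of FraudGCN's internals.

```python
def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def embed(node_features, neighbor_features_by_relation, relation_weights):
    """Embed one node from its own features and per-relation neighbor means.

    relation_weights: relation -> scalar weight, assumed already learned and frozen.
    """
    embedding = list(node_features)
    for relation, neighbor_features in neighbor_features_by_relation.items():
        if not neighbor_features:
            continue  # the node has no neighbors under this relation
        avg = mean(neighbor_features)
        w = relation_weights[relation]
        embedding = [e + w * a for e, a in zip(embedding, avg)]
    return embedding

# Frozen weights from a hypothetical earlier training run.
relation_weights = {"same_auditor": 0.5, "supply_chain": 1.0, "same_industry": 0.25}

# A firm that joins the graph after training, with features of its existing neighbors.
new_firm = embed(
    [0.2, 0.1],
    {
        "same_auditor": [[1.0, 0.0], [0.0, 1.0]],
        "supply_chain": [[0.4, 0.8]],
        "same_industry": [],
    },
    relation_weights,
)
```

Because the function depends only on the new firm's features, its neighbors and the frozen weights, adding the firm does not require any parameter to be re-estimated.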

 

The team trialled FraudGCN on a real-world dataset from Chinese listed companies to assess its performance, and found that it beats state-of-the-art approaches by between 3.15% and 3.86%.

 

Moving forward, the team hope to develop their approach to be able to deal with medium-sized enterprises, not just larger ones.

 


About Big Data Mining and Analytics

Big Data Mining and Analytics (Published by Tsinghua University Press) discovers hidden patterns, correlations, insights and knowledge through mining and analyzing large amounts of data obtained from various applications. It addresses the most innovative developments, research issues and solutions in big data research and their applications. Big Data Mining and Analytics is indexed and abstracted in ESCI, EI, Scopus, DBLP Computer Science, Google Scholar, INSPEC, CSCD, DOAJ, CNKI, etc.

About SciOpen 

SciOpen is an open access resource of scientific and technical content published by Tsinghua University Press and its publishing partners. SciOpen provides end-to-end services across manuscript submission, peer review, content hosting, analytics, identity management, and expert advice to ensure each journal’s development. By digitalizing the publishing process, SciOpen widens the reach, deepens the impact, and accelerates the exchange of ideas.