A schematic individual-level data representation for the life2vec model. (A) We organize socioeconomic and health data from the Danish national registers from 1 January 2008 until 31 December 2015 into a single chronologically ordered life-sequence. Each database entry becomes an event in the sequence, where an event has associated positional and contextual data. The contextual data include variables associated with the entry (e.g., industry, city, income, job type). The positional data include the person’s age (expressed in full years) and absolute position (number of days since 1 January 2008). The raw life-sequence is then passed to the model described in panel (B). The model consists of multiple stacked encoders. The first encoder combines contextual and positional information to produce a contextual representation of each life event. The following encoders output deep contextual representations of each life event (considering the overall content of the life-sequence). The final encoder layer fuses the representations of life events to produce the representation of a life-sequence. The decoder uses the latter to make predictions. Credit: G. Savcisens et al.

Artificial intelligence developed to model written language can be utilized to predict events in people’s lives. A research project from DTU, University of Copenhagen, ITU, and Northeastern University in the US shows that if you use large amounts of data about people’s lives and train so-called ‘transformer models’, which (like ChatGPT) are used to process language, they can systematically organize the data and predict what will happen in a person’s life and even estimate the time of death.

In a new scientific article, ‘Using Sequences of Life-events to Predict Human Lives’, published in Nature Computational Science, researchers have analyzed health data and attachment to the labour market for 6 million Danes in a model dubbed life2vec. After the model has been trained in an initial phase, i.e., learned the patterns in the data, it has been shown to outperform other advanced neural networks (see fact box) and predict outcomes such as personality and time of death with high accuracy.

“We used the model to address the fundamental question: to what extent can we predict events in your future based on conditions and events in your past? Scientifically, what is exciting for us is not so much the prediction itself, but the aspects of data that enable the model to provide such precise answers,” says Sune Lehmann, professor at DTU and corresponding author of the paper.

Predictions of time of death

The predictions from life2vec are answers to general questions such as ‘death within four years?’. When the researchers analyze the model’s responses, the results are consistent with existing findings within the social sciences; for example, all things being equal, individuals in a leadership position or with a high income are more likely to survive, while being male, skilled or having a mental diagnosis is associated with a higher risk of dying. life2vec encodes the data in a large system of vectors, a mathematical structure that organizes the different data. The model decides where to place data on the time of birth, schooling, education, salary, housing and health.

“What’s exciting is to consider human life as a long sequence of events, similar to how a sentence in a language consists of a series of words. This is usually the type of task for which transformer models in AI are used, but in our experiments we use them to analyze what we call life sequences, i.e., events that have happened in human life,” says Sune Lehmann.
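To make the idea of a life sequence concrete, here is a minimal Python sketch of how register entries might be turned into a chronologically ordered sequence of events. It follows the figure caption only loosely: each event carries an age in full years, an absolute position counted in days since 1 January 2008, and a set of contextual tokens. The names `LifeEvent`, `full_years`, and `build_life_sequence`, as well as the token labels, are hypothetical and are not taken from the researchers' code.

```python
from dataclasses import dataclass
from datetime import date

# Start of the observation window, as stated in the figure caption.
EPOCH = date(2008, 1, 1)


@dataclass
class LifeEvent:
    age: int            # positional: age in full years at the time of the event
    abs_position: int   # positional: days since 1 January 2008
    tokens: list[str]   # contextual: hypothetical labels such as industry or job type


def full_years(birth: date, on: date) -> int:
    """Age in completed years on a given date."""
    years = on.year - birth.year
    if (on.month, on.day) < (birth.month, birth.day):
        years -= 1  # birthday has not yet occurred this year
    return years


def build_life_sequence(birth: date, records: list[tuple[date, list[str]]]) -> list[LifeEvent]:
    """Turn (date, tokens) register entries into a chronologically ordered sequence."""
    events = [
        LifeEvent(full_years(birth, day), (day - EPOCH).days, list(tokens))
        for day, tokens in records
    ]
    return sorted(events, key=lambda event: event.abs_position)


# Illustrative, made-up entries for one person.
sequence = build_life_sequence(
    date(1980, 6, 1),
    [
        (date(2010, 3, 15), ["JOB_TYPE_MANAGER", "INCOME_HIGH"]),
        (date(2008, 1, 2), ["HEALTH_VISIT", "URGENCY_LOW"]),
    ],
)
```

In this sketch, the sequence plays the role of a sentence and each event plays the role of a word. A transformer would then consume the ordered events, which is a step this example does not attempt to reproduce.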

Raising ethical questions

The researchers behind the article point out that ethical questions surround the life2vec model, such as protecting sensitive data, privacy, and the role of bias in data. These challenges must be understood more deeply before the model can be used, for example, to assess an individual’s risk of contracting a disease or other preventable life events.

“The model opens up important positive and negative perspectives to discuss and address politically. Similar technologies for predicting life events and human behaviour are already used today inside tech companies that, for example, track our behaviour on social networks, profile us extremely accurately, and use these profiles to predict our behaviour and influence us. This discussion needs to be part of the democratic conversation so that we consider where technology is taking us and whether this is a development we want,” says Sune Lehmann.

According to the researchers, the next step would be to incorporate other types of information, such as text and images or information about our social connections. This use of data opens up a whole new interaction between social and health sciences.

FACTS:

The research project

The research project ‘Using Sequences of Life-events to Predict Human Lives’ is based on labour market data and data from the National Patient Registry (LPR) and Statistics Denmark. The dataset includes all 6 million Danes and contains information on income, salary, stipend, job type, industry, social benefits, etc. The health dataset includes records of visits to healthcare professionals or hospitals, diagnosis, patient type and degree of urgency. The dataset spans from 2008 to 2020, but in several analyses, researchers focus on the 2008-2016 period and an age-restricted subset of individuals.

Transformer model

A transformer model is a deep learning architecture used in AI to learn language and other tasks. The models can be trained to understand and generate language. The transformer model is designed to be faster and more efficient than previous models and is often used to train large language models on large datasets.

Neural networks

A neural network is a computer model inspired by the brain and nervous system of humans and animals. There are many different types of neural networks (e.g. transformer models). Like the brain, a neural network is made up of artificial neurons. These neurons are connected and can send signals to each other. Each neuron receives input from other neurons and then calculates an output passed on to other neurons. A neural network can learn to solve tasks by training on large amounts of data. Neural networks rely on training data to learn and improve their accuracy over time. But once these learning algorithms are fine-tuned for accuracy, they are potent tools in computer science and artificial intelligence that allow us to classify and group data at high speed. One of the most well-known neural networks is Google’s search algorithm. Read more: Neural network – Wikipedia.
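The description of a neuron above can be illustrated with a minimal Python sketch. It shows one common formulation, assumed here for illustration: a neuron computes a weighted sum of its inputs plus a bias and passes the result through a sigmoid function. The function names, weights, and inputs are made up and have no connection to life2vec.

```python
import math


def neuron(inputs: list[float], weights: list[float], bias: float) -> float:
    """One artificial neuron: weighted sum of inputs, then a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def tiny_network(inputs: list[float]) -> float:
    """Two hidden neurons feed their outputs into a single output neuron."""
    hidden = [
        neuron(inputs, [0.5, -0.5], 0.0),  # illustrative weights only
        neuron(inputs, [-1.0, 1.0], 0.1),
    ]
    return neuron(hidden, [1.0, 1.0], -1.0)


output = tiny_network([1.0, 2.0])
```

Training a network amounts to adjusting the weights and biases so that outputs like this one move closer to the desired answers in the training data.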


Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.
