Journalistic Portfolio Analysis

Duration

12 months

Status

Completed

MTC Team

Dr. Saikishore Kalloori, Piriyakorn Piriyatamwong, Dr. Fabio Zünd, Prof. Ryan Cotterell

Collaborators

Dr. Cristina Kadar (NZZ), Dr. Tatyana Ruzsics (NZZ), Dr. Dominic Herzog (TX), Milena Djordjevic (TX), Florian Notter (SRF)


Website UI

Online news portals have become a popular way for users to read news, e.g. by visiting the websites of their favorite media publishers. However, the high number of articles published online every day makes it challenging for a single individual to keep track of an event, or a series of related events, from an unbiased and truly informative point of view. News clustering is a powerful and widely used technique for organizing news articles into smaller, manageable information blocks. Additionally, clustering news articles by topic and predicting the stance or sentiment of sentences within each article can enhance the user's reading experience and help journalists understand the biases around the articles for each topic. Moreover, automatic stance and sentiment detection on each news article (at sentence or paragraph level) can help to curate produced content, improve user engagement metrics and users’ inclination to register or subscribe, or even open up new opportunities in advertising.

Goal

There is tremendous growth in the volume of news content produced by various publishers. Users often find themselves having to navigate multiple publisher websites to gain a comprehensive understanding of the positive and negative aspects of news events or topics. In this project, our objective is the automatic categorization of articles, which is a crucial step towards generating portfolio summaries and monitoring articles within specific categories.

Our approach involves identifying topics for a given set of articles, and then using these topics to predict the stance of each article at the sentence or paragraph level. We also developed an end-user prototype for a defined use case.

Outcome

Our work aims to simplify this process by grouping articles about similar events or topics and extracting relevant arguments. The work is divided into two tasks: (a) automatically clustering daily incoming articles and (b) discovering arguments within those articles. We propose a contrastive learning method combined with text augmentation methods to cluster the daily influx of news articles from any online news portal. Additionally, we demonstrate the capabilities of LLMs, such as ChatGPT, to generate titles for each clustered topic.
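To make task (a) more concrete, the following is a minimal sketch of assigning incoming articles to clusters one at a time. It is not the project's method: the `embed` function is a toy bag-of-words stand-in for the contrastively trained encoder, the greedy single-pass assignment is an assumed clustering strategy, and the `threshold` value and example headlines are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a learned encoder)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_articles(articles, threshold=0.3):
    """Greedy single-pass clustering (assumed strategy, hypothetical threshold).

    Each article joins the most similar existing cluster if the similarity
    reaches the threshold; otherwise it opens a new cluster.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [indices]}
    for idx, text in enumerate(articles):
        vec = embed(text)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [idx]})
        else:
            best["centroid"].update(vec)  # summed counts act as an unnormalized centroid
            best["members"].append(idx)
    return [c["members"] for c in clusters]


# Made-up headlines, for illustration only.
articles = [
    "Parliament votes on the new climate law",
    "Climate law passes after parliament votes",
    "Local football club wins the cup final",
    "Football fans celebrate the cup final win",
]
groups = cluster_articles(articles)
print(groups)
```

With a learned encoder, only `embed` would change. The single-pass assignment is used here because it lets each day's articles be added without re-clustering earlier ones.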

Furthermore, for each topic we extract questions directly from the source article text using both rule-based and probability-based methods. Using question-sentence pairs labeled by GPT, we train a classifier to predict the pro/con label. We have developed an application that integrates the clustering model and the argument prediction model. Given a database of news articles from multiple sources, the application clusters the articles into specific topics and extracts relevant questions for each topic. It then uses these questions to predict arguments for and against the topics within each news article.
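The flow from article text to questions to pro/con arguments can be sketched as follows. Everything here is a simplified assumption rather than the project's implementation: the only extraction rule is "the sentence ends with a question mark", and `predict_label` is a hypothetical keyword placeholder standing in for the classifier trained on GPT-labeled question-sentence pairs. The cue words and the example text are invented.

```python
import re
from dataclasses import dataclass


@dataclass
class Argument:
    question: str
    sentence: str
    label: str  # "pro" or "con"


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_questions(text):
    """Rule-based extraction (assumed rule): keep sentences ending in '?'."""
    return [s for s in split_sentences(text) if s.endswith("?")]


# Hypothetical cue words; a trained classifier would replace this lookup.
PRO_CUES = {"benefit", "support", "improve"}
CON_CUES = {"risk", "oppose", "harm"}


def predict_label(question, sentence):
    """Placeholder for the pro/con classifier.

    The question is ignored by this toy rule; a trained model would
    condition on the full question-sentence pair. Returns None when
    the sentence carries no clear cue.
    """
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    pro, con = len(words & PRO_CUES), len(words & CON_CUES)
    if pro == con:
        return None
    return "pro" if pro > con else "con"


def extract_arguments(text):
    """Pair every extracted question with every labeled statement."""
    questions = extract_questions(text)
    statements = [s for s in split_sentences(text) if not s.endswith("?")]
    arguments = []
    for question in questions:
        for sentence in statements:
            label = predict_label(question, sentence)
            if label is not None:
                arguments.append(Argument(question, sentence, label))
    return arguments


# Invented example text, for illustration only.
text = (
    "Should the city ban cars downtown? "
    "Supporters say the ban would improve air quality. "
    "Critics warn that it could harm local shops. "
    "The vote is next week."
)
arguments = extract_arguments(text)
for arg in arguments:
    print(arg.label, "|", arg.sentence)
```

Keeping the classifier behind a single function that takes the question and the sentence means the placeholder can be swapped for a trained model without changing the rest of the pipeline.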

The system, as demonstrated in the video below, presents users with the arguments extracted from news articles, along with their corresponding questions. Users also have access to the original full article to ensure a coherent and satisfying user experience.

2024 Media Technology Center

Public Open-Source Repository

Journalistic Portfolio Analysis

View of the Journalistic Portfolio Analysis layout

An LLM-based library for extracting debate arguments from news articles. Find the repository on GitHub.


Additional Project Resources

Resources for Industry Partners

Additional project resources for our industry partners are only available to registered users and can be found here.

Project Demo Applications

Project demos are hosted on our project demo dashboard. They are accessible by registered users only.
