The main goal of this project is to analyze media coverage of companies listed on the PSI-20 index. Using data analysis, machine learning, and visualization techniques, the project aims to extract relevant insights about public perception and sentiment surrounding these companies.
The specific objectives and features of the project include:
Sentiment Analysis
Assess the sentiment in news articles and media coverage to determine whether keywords associated with PSI-20 companies have a positive or negative impact on public perception.
Named Entity Recognition (NER)
Extract and identify key entities (such as company names, people, places, etc.) in media content to understand how these entities relate to PSI-20 companies. This helps reveal the context and nature of associations, showing how different entities influence the perception of each company.
Media Coverage Analysis
Monitor the frequency and volume of media mentions for each PSI-20 company, highlighting trends over time and identifying patterns that may indicate shifts in public perception or market impact.
Data Collection and Analysis
Extract archived media content about PSI-20 companies from arquivo.pt, the Portuguese web archive, then explore and analyze it in Jupyter notebooks.
Data Visualization
Create static and interactive visualizations to present insights such as sentiment trends, media mentions over time, and relationships between companies and key entities. These visualizations make complex data easier to interpret.
Web Application
Develop a web application using Flask to interactively present key insights and visualizations. The app allows users to explore the data, view sentiment trends, and gain insights into media coverage in an intuitive way.
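To make the data collection step more concrete, the following is a minimal sketch of how a full-text query against arquivo.pt might be built. It is not the project's actual code. The endpoint, the parameter names (`q`, `from`, `to`, `maxItems`), and the `response_items` key reflect my understanding of the Arquivo.pt text search API and should be checked against its documentation. The function names and the query string "Empresa Alpha" are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint of the Arquivo.pt full-text search API; verify before use.
TEXTSEARCH_ENDPOINT = "https://arquivo.pt/textsearch"


def build_search_url(query, start_year, end_year, max_items=50):
    """Build a search URL covering whole calendar years (hypothetical helper)."""
    params = {
        "q": query,
        # Timestamps in YYYYMMDDHHMMSS form (assumed format).
        "from": f"{start_year}0101000000",
        "to": f"{end_year}1231235959",
        "maxItems": max_items,
    }
    return f"{TEXTSEARCH_ENDPOINT}?{urlencode(params)}"


def fetch_results(url, timeout=30):
    """Download and decode one page of results.

    Assumes the JSON payload lists matches under "response_items".
    """
    with urlopen(url, timeout=timeout) as response:
        payload = json.load(response)
    return payload.get("response_items", [])


# Placeholder company name, for illustration only; no request is sent here.
example_url = build_search_url("Empresa Alpha", 2010, 2010)
```

Keeping URL construction separate from the network call makes the query logic easy to check in a notebook before any request is sent.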
By integrating data collection, analysis, machine learning, and visualization, this project aims to turn media coverage into actionable insights that support decision-making by investors, analysts, and other stakeholders.
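As one small illustration of the analysis side, the media coverage monitoring described above could start from a simple count of mentions per company per month. This is a sketch under stated assumptions: the record fields (`published`, `text`), the alias map, and the company names "Alpha" and "Beta" are hypothetical placeholders, and the sample articles are made up.

```python
from collections import Counter
from datetime import date

# Hypothetical alias map: lowercase strings that count as a mention.
COMPANY_ALIASES = {
    "Alpha": ["empresa alpha"],
    "Beta": ["banco beta"],
}


def count_monthly_mentions(articles, aliases=COMPANY_ALIASES):
    """Count articles mentioning each company, keyed by (company, 'YYYY-MM')."""
    counts = Counter()
    for article in articles:
        text = article["text"].lower()
        month = article["published"].strftime("%Y-%m")
        for company, names in aliases.items():
            # Plain substring match; an article counts at most once per company.
            if any(name in text for name in names):
                counts[(company, month)] += 1
    return counts


# Made-up sample records, for illustration only.
sample_articles = [
    {"published": date(2020, 1, 15), "text": "Empresa Alpha anuncia investimento."},
    {"published": date(2020, 1, 20), "text": "Banco Beta e Empresa Alpha reagem."},
    {"published": date(2020, 2, 3), "text": "Resultados do Banco Beta."},
]

monthly_mentions = count_monthly_mentions(sample_articles)
```

Substring matching is deliberately naive and can produce false positives for short names. In practice the entities produced by the NER step would be a more reliable basis for counting.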