Speaker
Description
Corporate financial statements provide a comprehensive summary of a company’s annual performance, but they also carry writing biases, shaped both by the author and by the historical moment in which they are produced. Sentiment analysis can help uncover these biases by classifying the tone of the text on a scale from positive to negative. Neural networks make this possible, from LSTMs to more advanced Transformer-based models.
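To make the idea of a positive-to-negative scale concrete, here is a minimal sketch of how a classifier's raw outputs could be collapsed into a single tone score. It assumes a three-class model (negative, neutral, positive) that emits one logit per class; the label order, the `tone_score` helper, and the scoring rule (probability of positive minus probability of negative) are illustrative assumptions, not the method used in this research.

```python
import math

# Assumed label order of a hypothetical three-class sentiment classifier.
LABELS = ("negative", "neutral", "positive")


def softmax(logits):
    """Turn raw logits into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def tone_score(logits):
    """Map class logits to a score in [-1, 1]: P(positive) - P(negative)."""
    probs = dict(zip(LABELS, softmax(logits)))
    return probs["positive"] - probs["negative"]


if __name__ == "__main__":
    # Made-up logits for a single sentence, purely for illustration.
    print(tone_score([0.2, 1.0, 2.5]))
```

A score near 1 would indicate a clearly positive sentence, a score near -1 a clearly negative one, and values around 0 either a neutral or an ambiguous tone.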
Despite being standardized, financial statements also present structural biases. Information of interest is not always easy to locate, and even keyword searches often fall short, because these documents cover several related topics, which makes the text inherently complex. The goal of this research is to fine-tune an open-source language model (LM) on a database of financial documents built from scratch, in order to build a Retrieval-Augmented Generation (RAG) pipeline. By submitting questions (queries) to the model, it's possible to identify specific topics within the documents and generate coherent, context-aware answers.
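The retrieval half of a RAG pipeline can be sketched in a few lines. The example below is a toy illustration under stated assumptions: a bag-of-words vector stands in for a learned embedding, the passages are invented placeholder sentences, and the helper names (`embed`, `retrieve`, `build_prompt`) are hypothetical. The generation step is left out; in a real pipeline the assembled prompt would be passed to the fine-tuned LM.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: word counts (a stand-in for a learned dense vector)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=2):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_prompt(query, passages, k=2):
    """Assemble the retrieved context and the question into one prompt."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


# Invented placeholder passages, not taken from any real filing.
PASSAGES = [
    "Net revenue increased compared with the prior year.",
    "The board approved a dividend per share.",
    "Climate risk is discussed in the sustainability section.",
]

if __name__ == "__main__":
    print(build_prompt("What dividend did the board approve?", PASSAGES, k=1))
```

Retrieving by similarity rather than by exact keyword match is what lets a query surface the relevant passage even when the topic is spread across several sections of a document.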