Transparency

Status: Optimization

Introduction

How often do you go shopping, whether physically or online, find the product you are looking for and buy it without a second thought beyond your immediate desire for it? My guess would be most of the time. But what if you found out that the company behind it relied on forced or underpaid labor, employed children or destroyed a priceless natural landscape to gather the materials? You would probably think twice before buying it. But then again, would you do the research to gather that information?

Breakdown

This is a complex situation, so as a starting point let us dissect this quote taken from A Brief History of the Masses (Jonsson 43):

“At the very moment when the third estate constitutes itself as the voice of the nation, a new distinction thus emerges. On the one hand, the most dynamic elements of the bourgeoisie step forward as universal citizens acting in the interest of the people, the plebs, or le menu peuple, who because of their poverty and ignorance can't conceivably participate in the political decisions that are made in their name.”

This passage relates to democracy in the making during the French Revolution. Let us remember that these events were set in motion to empower the people and that, in a way, the bourgeoisie used the raw power of the masses to place itself in power, while the people joined willingly because their living conditions were getting worse. Here we can draw an obvious parallel with the “gilets jaunes” movement occurring at the moment in France. In the modern case, there are people working up to three part-time jobs (Donada) and facing a 100% increase in energy costs (“Retour sur l'augmentation du prix de différentes énergies”). Around the world, living conditions and opportunities have changed, but everyone is too busy to dig into it. It is understandable: when you finish your long hours, all you want is to sit down and forget everything by playing video games or watching series, videos or memes. You certainly do not want to go into the complexity surrounding the cheap product you buy, even if it helps destroy life.

The other part we need to address is participation “in the political decisions”. Again, the quote describes the creation of modern democracy, and it raises questions about the democracy in place nowadays. Just before Covid, people in multiple countries were protesting against their elected governments over corruption (Perkins). Returning to France, the state merged the publicly owned Gaz de France with the private group Suez (Archimède), hence the fast rise in prices, apparently not in the interest of its people.

Thesis

So what can we do when we lose all confidence in the people we democratically elected, since they represent the interests of private corporations anyway? On this point I agree with Elise Van Beneden, president of Anticor, an association that fights corruption and promotes ethics, who said when questioned about boycotts: “Je trouve que le pouvoir d’achat est une arme qui est complètement sous-estimée” (I find that purchasing power is a completely underestimated weapon) (“France : Corruption à tous les étages ? Elise Van Beneden [ EN DIRECT ]”). Here it is important to take a step back and explain why purchasing power matters so much. We live in a capitalist economy: “an economic and political system in which a country's trade and industry are controlled by private owners for profit, rather than by the state” (Oxford Languages dictionary). We also have multiple examples of the state helping private interests before its citizens; the 2008 real estate crash is a great demonstration, as governments delivered free bailouts to the banks (Mckay). Since the government steps in when big corporations make a mistake that impacts millions, the problem seems impossible to take on. But there is a caveat: corporations still rely on the people for a stable flow of money. They have to answer to the laws of supply and demand, and this is where our purchasing power comes into play. Another way to use our leverage, as witnessed with GameStop, is to take on the stock market en masse (“GameStop: Were The Short Sellers Routed? Does It Matter? (Beware The ‘Gamma’)”).

In any case, if we can help people make more informed decisions when they shop, we can have a real impact and bring about environmental, economic and social change. But the solution needs to be extremely accessible and non-threatening while still being informative. This is where my background in art and my studies in data science and programming come into play.

The idea is to make a browser application that creates an art piece whenever it sees a product on the screen. The piece could be small at first and expand once the user places their cursor over the product. Once activated, the piece is separated into three distinct sections (economic, sociological and ecological), all related to the company's actions as reported on the web. Each section has five subdivisions made up of multiple texts, and their position matters: the closer a text is to the center, the worse the activity; the further away, the better. The size of each text is set by its weight, meaning how bad or how good the reported activity is.
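To make the placement rule concrete, here is a minimal sketch of how one text could be mapped to a position and a size. It is not the prototype's code: the function name place_text, the score scale from -1 (worst) to +1 (best), the separate weight parameter and the pixel values are all assumptions made for illustration.

```python
import math

SECTIONS = ("economic", "sociological", "ecological")
RINGS = 5  # five subdivisions per section


def place_text(section, score, weight, max_radius=100.0):
    """Map one text to layout values (hypothetical helper).

    score  -- assumed sentiment in [-1, 1]; -1 is worst, +1 is best
    weight -- assumed non-negative importance of the text
    """
    if section not in SECTIONS:
        raise ValueError(f"unknown section: {section}")
    score = max(-1.0, min(1.0, score))

    # Each section owns one third of the circle; aim at the middle of its wedge.
    wedge = 2 * math.pi / len(SECTIONS)
    angle = (SECTIONS.index(section) + 0.5) * wedge

    # Ring 0 sits at the center (worst), ring 4 at the outer edge (best).
    ring = min(RINGS - 1, int((score + 1.0) / 2.0 * RINGS))
    radius = (ring + 0.5) * max_radius / RINGS

    # Heavier texts are drawn larger; the square root keeps sizes readable.
    size = 4.0 + 10.0 * math.sqrt(max(weight, 0.0))

    return {
        "ring": ring,
        "x": radius * math.cos(angle),
        "y": radius * math.sin(angle),
        "size": size,
    }
```

With these assumptions, a very negative text lands in the innermost ring of its section and a very positive one in the outermost ring, while the weight only changes how large the shape is drawn.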

Technical

At the moment, the prototype uses Google to inform the user about a company's actions in the social, economic and ecological sectors. For now the algorithm is very simple and only gives a caricature of the state of each sector. In effect, it takes the texts from the top 20 sites related to each sector and calculates whether they are positive or negative. The algorithm then generates an informative visual in which each text takes a form whose dimensions are defined by how positive or negative it is and whose position is defined by its sector. Very simple but effective.

Currently, the visuals are satisfactory, as I am focusing on developing the guts and the brain of the code. The first exploration gathers 60 web pages and visualizes them individually. Google allows my search function up to 2 seconds of web page download time, which corresponds to over a hundred results. To narrow these down, the project uses machine learning (ML) (Zafra) to classify the texts extracted from Google. The model is a random forest, selected over five other models that were all trained on web-scraped text. Once the texts are selected, cleaned and classified, they are scanned for their level of positivity or negativity, which is then used to place and size each shape.

Machine Learning

Regarding the machine learning process: to offer an application that is fast, easy, accessible and up to date, the keyword is speed, and the model would have to be trained constantly. As far as my algorithm goes, the current application takes around five minutes to generate the data, which is far too long, because it visits all the sites. It would be useful for it to learn how to manage its sources. Which sites are most relevant for the economic section? Probably "Investopedia" or "Yahoo Stock", while on the social side it would be relevant to look at comments about the products and at Reddit forums on working conditions.
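One simple way to avoid visiting every site would be to rank search results by a list of preferred domains per section before downloading anything. The sketch below illustrates that idea only; the PREFERRED table, the domain names chosen for "Yahoo Stock" and Reddit, and the function prioritise are assumptions, not part of the current application.

```python
from urllib.parse import urlparse

# Hypothetical preferred domains per section, most trusted first.
PREFERRED = {
    "economic": ("investopedia.com", "finance.yahoo.com"),
    "sociological": ("reddit.com",),
    "ecological": (),
}


def prioritise(urls, section, limit=20):
    """Return at most `limit` URLs, preferred domains first."""
    preferred = PREFERRED.get(section, ())

    def rank(url):
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]
        for position, domain in enumerate(preferred):
            if host == domain or host.endswith("." + domain):
                return position
        # Unknown domains come after every preferred one.
        return len(preferred)

    # sorted() is stable, so the search engine's order is kept within a rank.
    return sorted(urls, key=rank)[:limit]
```

Only the URLs that survive this cut would then be downloaded, which trades some coverage for speed. A learned ranking could later replace the hand-written table.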

Here is the result of the trained model on the data. It is not very good, as 71% is not enough for a useful model.

Next comes the analysis of the content of these sites, hence natural language processing (NLP). It is essential to improve the classification of texts along several criteria; one of these is the ability to link texts with a similar subject or content and ultimately compare them. Another aspect concerns the content of the texts: before making sure that the algorithm understands what is written, we can optimize the word banks that help classify the texts. First, each section should have a pair of word banks, one positive and one negative, directly related to a theme such as the environment. In addition, words cannot all have the same value, since some carry much more weight than others, so each word should be assigned a value. Further along in the process, AI could be used to assess the value of a word by considering its context in the text, but also its social context at the time of publication (when the text was written). For example, the words liberty, equality and fraternity did not evoke the same thing in 1789 as they do in 2021. There are still many more considerations to elaborate in order to obtain better results.
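The weighted word banks described above could look like the following sketch for the environment theme. The words, their weights and the function score_text are invented for illustration; a real word bank would be much larger and its values would need to be tuned.

```python
import re

# Illustrative word banks for the environment theme.
# Every word and weight here is a made-up example.
POSITIVE = {"renewable": 1.0, "recycled": 0.8, "reforestation": 1.5}
NEGATIVE = {"pollution": 1.0, "deforestation": 1.5, "spill": 2.0}


def score_text(text, positive=POSITIVE, negative=NEGATIVE):
    """Return a score in [-1, 1]: negative means bad, positive means good."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(positive.get(word, 0.0) for word in words)
    neg = sum(negative.get(word, 0.0) for word in words)
    total = pos + neg
    if total == 0:
        # No word from either bank was found: treat the text as neutral.
        return 0.0
    return (pos - neg) / total
```

Because each word carries its own value, a single heavy word such as "spill" outweighs a lighter positive one such as "recycled". This simple count ignores context and negation ("no pollution"), which is exactly the limit that context-aware weighting would address.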

Improvement:

Because the dataset was gathered online and labelled only by search subject, the ML model is prone to mistakes. This can be resolved by improving the dataset, either by extracting text only from key websites or by using a pre-classified dataset. Furthermore, with properly classified data, subsections can be created and abstractive summarization can be applied, offering the possibility to read about the various texts found. As for the trigger, there is an image classification model trained on 32 logos that still needs to be integrated into this project.

The graphs are principal component analyses that visualise the random forest models' performance. As seen in the top graphs, when the model is trained with subsections, the output is convoluted. The issue is not solved by having fewer categories, which is to say that web scraping to gather unclassified data is not the best approach.

Conclusion:

There is still more to achieve before this project is optimal, but once it is ready to go on the web as a Google extension, it could offer a way to put pressure on companies to show more respect for their employees and the environment and to be generous with their profits.

This is the first Python interpretation of the Processing design made in 2019. In this case, grey represents all the texts that have not been classified with more than 60% accuracy. Training further on better data could offer a solution.
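The grey fallback can be sketched as a simple threshold on the classifier's output. This sketch reads the 60% figure as the probability the model gives to its best class, which is an assumption, and the colour names and the function colour_for are hypothetical.

```python
GREY = "grey"

# Hypothetical colours for the three sections.
SECTION_COLOURS = {
    "economic": "blue",
    "sociological": "orange",
    "ecological": "green",
}


def colour_for(probabilities, threshold=0.60):
    """Pick a colour from a {section: probability} mapping.

    Texts whose best class does not exceed the threshold are drawn in grey.
    """
    label, confidence = max(probabilities.items(), key=lambda item: item[1])
    if confidence <= threshold:
        return GREY
    return SECTION_COLOURS.get(label, GREY)
```

For example, a text scored 0.7 economic, 0.2 sociological and 0.1 ecological would be drawn in the economic colour, while a text whose best class only reaches 0.4 would stay grey.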

Bibliography:

Archimède, Samy. “‘LES PRÉDATEURS’, OU COMMENT LA FUSION GDF-SUEZ A ENRICHI DES MILLIARDAIRES.” Le Journal, 05 12 2018, https://journal.ccas.fr/les-predateurs-ou-comment-la-fusion-gdf-suez-a-enrichi-des-milliardaires/.
Donada, Emma. “La baisse du chômage cache-t-elle une hausse de la précarité ?” Libération, 15 08 2019, https://www.liberation.fr/checknews/2019/08/15/la-baisse-du-chomage-cache-t-elle-une-hausse-de-la-precarite_1745266/.
“France : Corruption à tous les étages ? Elise Van Beneden [ EN DIRECT ].” YouTube, Thinkerview, 2021, https://www.youtube.com/watch?v=rrTH31VbdXI&t=3933s&ab_channel=Thinkerview. Accessed 05 04 2021.
“GameStop: Were The Short Sellers Routed? Does It Matter? (Beware The ‘Gamma’).” Forbes, 19 03 2021, https://www.forbes.com/sites/georgecalhoun/2021/03/19/gamestop-were-the-short-sellers-routed-does-it-matter-beware-the-gamma/?sh=1419f7084dae. Accessed 08 04 2021.
Jonsson, Stefan. A Brief History of the Masses: Three Revolutions. Columbia University Press, 2008.
Mckay, Adam, director. The Big Short. Paramount Pictures, 2015.
Perkins, John. Confessions of an Economic Hit Man. Berrett-Koehler Publishers.
“Retour sur l'augmentation du prix de différentes énergies.” libow, 2015, https://www.izi-by-edf-renov.fr/blog/augmentation-prix-energies.
Zafra, Miguel Fernandez. “Latest News Classifier.” GitHub, https://github.com/miguelfzafra/Latest-News-Classifier#latest-news-classifier. Accessed 02 04 2021.
Zafra, Miguel Fernandez. “Text Classification in Python.” Towards Data Science, https://towardsdatascience.com/text-classification-in-python-dd95d264c802. Accessed 02 04 2021.