Creating NSFW AI involves difficulties that span technical, ethical, and financial territory. Building an AI to identify inappropriate content generally requires large training datasets, often millions of labeled images and text samples. Collecting these datasets can be extremely time-consuming and expensive; some companies pay over $10M per year in database subscriptions just to keep their data current. And the complexity does not end with acquiring a dataset: models have to distinguish subtle categories of content, such as satire versus genuine harm, which only sophisticated algorithms combined in an effective neural network can handle. A simplified sketch of what such training looks like appears below.
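To make that concrete, here is a minimal, illustrative sketch of training a binary "safe vs. NSFW" text classifier. The file name, column names, and model choice are assumptions for the example, not a description of any real production pipeline, which would typically use far larger datasets and deep neural networks.

```python
# Minimal sketch: training a binary "safe vs. NSFW" text classifier.
# The CSV path, column names, and model choice are illustrative assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical labeled dataset: one text sample per row, label 1 = NSFW, 0 = safe.
df = pd.read_csv("labeled_moderation_data.csv")  # assumed columns: "text", "label"
X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["label"], test_size=0.2, random_state=42
)

# TF-IDF features feeding a logistic regression classifier.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), max_features=50_000),
    LogisticRegression(max_iter=1000),
)
model.fit(X_train, y_train)
print("Held-out accuracy:", model.score(X_test, y_test))
```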
As detailed in Facebook's 2021 report, even cutting-edge models still show error rates of around 10% in content moderation, producing both false positives and false negatives. Beyond raw accuracy, balancing ethical considerations during model development is critical but very difficult. Developers need to eliminate bias during training, or at least try to, especially when dealing with subjective content such as film and, worse still, culturally sensitive material. AI expert Timnit Gebru has noted that AI can replicate the human biases reflected in the data it was trained on, often to the disadvantage of people of color; this becomes particularly thorny with NSFW content, which is rife with ethical consequences and cultural nuance.
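To see why "around 10% error" is not one number but two kinds of mistakes, a team would typically split errors into false positives (safe content wrongly flagged) and false negatives (harmful content missed). A small sketch, where the label arrays are invented examples:

```python
# Sketch: splitting a moderation model's error rate into false positives
# and false negatives. The label arrays here are made-up examples.
from sklearn.metrics import confusion_matrix

y_true = [0, 0, 1, 1, 0, 1, 0, 1, 0, 0]   # ground truth: 1 = NSFW, 0 = safe
y_pred = [0, 1, 1, 0, 0, 1, 0, 1, 0, 0]   # model decisions

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
false_positive_rate = fp / (fp + tn)  # safe content wrongly flagged
false_negative_rate = fn / (fn + tp)  # harmful content missed
print(f"FPR: {false_positive_rate:.1%}, FNR: {false_negative_rate:.1%}")
```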
Resource allocation is another key factor. Training these models requires significant computational resources; running GPUs 24 hours a day for weeks can push energy costs past $50K per model iteration, as the rough calculation below illustrates. Tech giants like Google and OpenAI can likely absorb these costs without blinking, but smaller companies are less able to compete. In addition, integrating NSFW AI into existing systems is not easy: API integration, regular updates, and extra rounds of testing all add to the time and money required.
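As a back-of-the-envelope illustration of how round-the-clock GPU time turns into an energy bill, consider the arithmetic below. Every figure (GPU count, power draw, training duration, datacenter overhead, electricity rate) is an assumption chosen for the example, not a measured number.

```python
# Rough energy-cost estimate for one training iteration running GPUs around the clock.
# All inputs are illustrative assumptions.
num_gpus = 2000          # accelerators running in parallel
watts_per_gpu = 400      # approximate draw per GPU under load
weeks = 3                # continuous training time
pue = 1.3                # datacenter overhead (cooling, networking)
rate_per_kwh = 0.12      # USD per kilowatt-hour

hours = weeks * 7 * 24
energy_kwh = num_gpus * watts_per_gpu * hours / 1000 * pue
energy_cost = energy_kwh * rate_per_kwh
print(f"Energy: {energy_kwh:,.0f} kWh, cost: ${energy_cost:,.0f}")
# Under these assumptions the energy bill alone lands above $60K per iteration.
```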
In the real world, Twitter's 2022 incident in which innocuous tweets were flagged as unsafe illustrates how hard it is to strike this balance between model behavior and user expectations. Mistakes like these erode user trust, which can be critically detrimental to platform engagement. Over time, teams reflect on these failures and continuously refine their models, sometimes through months of testing cycles, retraining, and tweaking. This iterative method, combined with the fast pace of change in online content, creates a constant scalability problem.
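One common form that "tweaking" takes is adjusting the decision threshold so that fewer innocuous items get flagged, at the cost of letting more harmful items slip through. A small sketch of such a sweep; the scores, labels, and threshold values are invented for illustration:

```python
# Sketch: sweeping a moderation model's decision threshold to trade
# false positives (innocuous content flagged) against false negatives.
import numpy as np

y_true = np.array([0, 0, 0, 1, 1, 0, 1, 0, 1, 0])                        # 1 = NSFW
scores = np.array([0.1, 0.65, 0.3, 0.9, 0.6, 0.2, 0.8, 0.4, 0.55, 0.35])  # model confidence

for threshold in (0.5, 0.6, 0.7):
    y_pred = (scores >= threshold).astype(int)
    fp = int(((y_pred == 1) & (y_true == 0)).sum())
    fn = int(((y_pred == 0) & (y_true == 1)).sum())
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
# Raising the threshold reduces false flags but misses more harmful content.
```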
So the obvious question is: can developers walk this tight line between efficacy, ethics, and affordability? Some in the industry argue the answer lies in collaborative development. Today, companies keep too much AI code close to the vest, and results suffer because they refuse, or simply have no incentive, to share the data that would improve model accuracy; open source efforts can remedy part of this.
The development journey of NSFW AI is worth understanding for anyone exploring how nsfw ai fits into content moderation strategies. You can discover more at nsfwai.